From collinw at gmail.com Fri Sep 1 04:52:02 2006
From: collinw at gmail.com (Collin Winter)
Date: Thu, 31 Aug 2006 21:52:02 -0500
Subject: [Python-Dev] A test suite for unittest
Message-ID: <43aa6ff70608311952jd4cbb8ena17594458d480e8e@mail.gmail.com>

I've just uploaded a trio of unittest-related patches:

#1550272 (http://python.org/sf/1550272) is a test suite for the
mission-critical parts of unittest.

#1550273 (http://python.org/sf/1550273) fixes 6 issues uncovered while
writing the test suite. Several other items that I raised earlier
(http://mail.python.org/pipermail/python-dev/2006-August/068378.html)
were judged to be either non-issues or behaviours that, while
suboptimal, people have come to rely on.

#1550263 (http://python.org/sf/1550263) follows up on an earlier patch
I submitted for unittest's docs. This new patch corrects and clarifies
numerous sections of the module's documentation.

I'd appreciate it if these changes could make it into 2.5-final or at
least 2.5.1.

What follows is a list of the issues fixed in patch #1550273:

1) TestLoader.loadTestsFromName() failed to return a suite when
resolving a name to a callable that returns a TestCase instance.

2) Fix a bug in both TestSuite.addTest() and TestSuite.addTests()
concerning a lack of input checking on the input test
case(s)/suite(s).

3) Fix a bug in both TestLoader.loadTestsFromName() and
TestLoader.loadTestsFromNames() that had ValueError being raised
instead of TypeError. The problem occurred when the given name resolved
to a callable and the callable returned something of the wrong type.

4) When a name resolves to a method on a TestCase subclass,
TestLoader.loadTestsFromName() did not return a suite as promised.

5) TestLoader.loadTestsFromName() would raise a ValueError (rather
than a TypeError) if a name resolved to an invalid object. This has
been fixed so that a TypeError is raised.

6) TestResult.shouldStop was being initialised to 0 in TestResult.__init__.
Since this attribute is always used in a boolean context, it's better to use the False spelling. Thanks, Collin Winter From fdrake at acm.org Fri Sep 1 06:02:59 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri, 1 Sep 2006 00:02:59 -0400 Subject: [Python-Dev] A test suite for unittest In-Reply-To: <43aa6ff70608311952jd4cbb8ena17594458d480e8e@mail.gmail.com> References: <43aa6ff70608311952jd4cbb8ena17594458d480e8e@mail.gmail.com> Message-ID: <200609010003.00418.fdrake@acm.org> On Thursday 31 August 2006 22:52, Collin Winter wrote: > I've just uploaded a trio of unittest-related patches: Thanks, Collin! > #1550272 (http://python.org/sf/1550272) is a test suite for the > mission-critical parts of unittest. > > #1550273 (http://python.org/sf/1550273) fixes 6 issues uncovered while > writing the test suite. Several other items that I raised earlier > (http://mail.python.org/pipermail/python-dev/2006-August/068378.html) > were judged to be either non-issues or behaviours that, while > suboptimal, people have come to rely on. I'm hesitant to commit even tests at this point (the release candidate has already been released, and there's no plan for a second). I've not reviewed the patches. > #1550263 (http://python.org/sf/1550263) follows up on an earlier patch > I submitted for unittest's docs. This new patch corrects and clarifies > numerous sections of the module's documentation. Anthony did approve documentation changes for 2.5, so I've committed this for 2.5 and on the trunk (2.6). These should be considered for 2.4.4 as well. (The other two may be appropriate as well.) -Fred -- Fred L. Drake, Jr. 
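
A short illustration of items 1, 3 and 4 from Collin's list may help here. This is only a sketch run against the unittest module that ships with present-day Python 3, on the assumption that it reflects the behaviour the patch describes; it is not taken from patch #1550273. The names Demo, make_case, make_garbage and demo_module are made up for the example.

```python
import types
import unittest


class Demo(unittest.TestCase):
    def test_ok(self):
        self.assertTrue(True)


def make_case():
    # A callable that returns a single TestCase instance (item 1).
    return Demo("test_ok")


def make_garbage():
    # A callable that returns something that is not a test (item 3).
    return 42


# Hypothetical stand-in module, so names can be resolved without
# importing anything from disk.
demo_module = types.ModuleType("demo_module")
demo_module.Demo = Demo
demo_module.make_case = make_case
demo_module.make_garbage = make_garbage

loader = unittest.TestLoader()

# Item 4: a name that resolves to a test method yields a suite.
from_method = loader.loadTestsFromName("Demo.test_ok", demo_module)

# Item 1: a callable returning a TestCase instance also yields a suite.
from_callable = loader.loadTestsFromName("make_case", demo_module)

# Item 3: a callable returning the wrong type is reported as a TypeError.
try:
    loader.loadTestsFromName("make_garbage", demo_module)
except TypeError as exc:
    wrong_type_error = exc
else:
    wrong_type_error = None
```

In both successful cases the caller gets a TestSuite back and never has to special-case a bare TestCase, which is the point of items 1 and 4.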
From anthony at interlink.com.au Fri Sep 1 06:35:19 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 1 Sep 2006 14:35:19 +1000 Subject: [Python-Dev] A test suite for unittest In-Reply-To: <43aa6ff70608311952jd4cbb8ena17594458d480e8e@mail.gmail.com> References: <43aa6ff70608311952jd4cbb8ena17594458d480e8e@mail.gmail.com> Message-ID: <200609011435.21060.anthony@interlink.com.au> At this point, I'd say the documentation patches should go in - the other patches are probably appropriate for 2.5.1. I only want to accept critical patches between now and 2.5 final. Thanks for the patches (and particularly for the unittest! woooooo!) Anthony From fredrik at pythonware.com Fri Sep 1 10:08:18 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 1 Sep 2006 10:08:18 +0200 Subject: [Python-Dev] That library reference, yet again References: <8233478f0608311255o7058a1feo55c710e7eb8e6b6c@mail.gmail.com> Message-ID: "Johann C. Rocholl" wrote: > What is the status of http://effbot.org/lib/ ? > > I think it's a step in the right direction. Is it still in progress? the pushback from the powers-that-be was massive, so we're currently working "under the radar", using alternative deployment approaches (see pytut.infogami.com and friends). From jimjjewett at gmail.com Fri Sep 1 15:31:36 2006 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri, 1 Sep 2006 09:31:36 -0400 Subject: [Python-Dev] Fwd: [Python-checkins] r51674 - python/trunk/Misc/Vim/vimrc In-Reply-To: <20060831224237.B872C1E4002@bag.python.org> References: <20060831224237.B872C1E4002@bag.python.org> Message-ID: This 8 vs 4 is getting cruftier and cruftier. (And does it deal properly with existing code that already has four spaces because it was written recently?) "Tim" regularly fixes whitespace already, with little damage. Would it make sense to do a one-time cutover on the 2.6 trunk? How about the bugfix branches? 
If it is ever going to happen, then immediately after a release, before unfreezing, is probably the best time. -jJ ---------- Forwarded message ---------- From: brett.cannon Date: Aug 31, 2006 6:42 PM Subject: [Python-checkins] r51674 - python/trunk/Misc/Vim/vimrc To: python-checkins at python.org Author: brett.cannon Date: Fri Sep 1 00:42:37 2006 New Revision: 51674 Modified: python/trunk/Misc/Vim/vimrc Log: Have pre-existing C files use 8 spaces indents (to match old PEP 7 style), but have all new files use 4 spaces (to match current PEP 7 style). Modified: python/trunk/Misc/Vim/vimrc ============================================================================== --- python/trunk/Misc/Vim/vimrc (original) +++ python/trunk/Misc/Vim/vimrc Fri Sep 1 00:42:37 2006 @@ -19,9 +19,10 @@ " Number of spaces to use for an indent. " This will affect Ctrl-T and 'autoindent'. " Python: 4 spaces -" C: 4 spaces +" C: 8 spaces (pre-existing files) or 4 spaces (new files) au BufRead,BufNewFile *.py,*pyw set shiftwidth=4 -au BufRead,BufNewFile *.c,*.h set shiftwidth=4 +au BufRead *.c,*.h set shiftwidth=8 +au BufNewFile *.c,*.h set shiftwidth=4 " Number of spaces that a pre-existing tab is equal to. " For the amount of space used for a new tab use shiftwidth. _______________________________________________ Python-checkins mailing list Python-checkins at python.org http://mail.python.org/mailman/listinfo/python-checkins From guido at python.org Fri Sep 1 17:02:37 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 1 Sep 2006 08:02:37 -0700 Subject: [Python-Dev] Fwd: [Python-checkins] r51674 - python/trunk/Misc/Vim/vimrc In-Reply-To: References: <20060831224237.B872C1E4002@bag.python.org> Message-ID: For 2.x we really don't want to reformat all code. I even think it's questionable to use 4 spaces for new files since it will mean problems for editors switching between files. For 3.0 we really do. 
But as long as 2.x and 3.0 aren't too far apart I'd rather not reformat everything because it would break all merge capabilities. --Guido On 9/1/06, Jim Jewett wrote: > This 8 vs 4 is getting cruftier and cruftier. (And does it deal > properly with existing code that already has four spaces because it > was written recently?) > > "Tim" regularly fixes whitespace already, with little damage. > > Would it make sense to do a one-time cutover on the 2.6 trunk? > How about the bugfix branches? > > If it is ever going to happen, then immediately after a release, > before unfreezing, is probably the best time. > > -jJ > > ---------- Forwarded message ---------- > From: brett.cannon > Date: Aug 31, 2006 6:42 PM > Subject: [Python-checkins] r51674 - python/trunk/Misc/Vim/vimrc > To: python-checkins at python.org > > > Author: brett.cannon > Date: Fri Sep 1 00:42:37 2006 > New Revision: 51674 > > Modified: > python/trunk/Misc/Vim/vimrc > Log: > Have pre-existing C files use 8 spaces indents (to match old PEP 7 style), but > have all new files use 4 spaces (to match current PEP 7 style). > > > Modified: python/trunk/Misc/Vim/vimrc > ============================================================================== > --- python/trunk/Misc/Vim/vimrc (original) > +++ python/trunk/Misc/Vim/vimrc Fri Sep 1 00:42:37 2006 > @@ -19,9 +19,10 @@ > " Number of spaces to use for an indent. > " This will affect Ctrl-T and 'autoindent'. > " Python: 4 spaces > -" C: 4 spaces > +" C: 8 spaces (pre-existing files) or 4 spaces (new files) > au BufRead,BufNewFile *.py,*pyw set shiftwidth=4 > -au BufRead,BufNewFile *.c,*.h set shiftwidth=4 > +au BufRead *.c,*.h set shiftwidth=8 > +au BufNewFile *.c,*.h set shiftwidth=4 > > " Number of spaces that a pre-existing tab is equal to. > " For the amount of space used for a new tab use shiftwidth. 
> _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From rhettinger at ewtllc.com Fri Sep 1 19:56:17 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Fri, 01 Sep 2006 10:56:17 -0700 Subject: [Python-Dev] Py2.5 issue: decimal context manager misimplemented, misdesigned, and misdocumented In-Reply-To: <44F6D12C.4040808@gmail.com> References: <44F4D9D2.2040804@ewtllc.com> <44F6B524.6060504@gmail.com> <44F6D12C.4040808@gmail.com> Message-ID: <44F87441.7060203@ewtllc.com> >>> The right way to do it was presented in PEP343. The implementation >>> was correct and the API was simple. >> >> >> >> Raymond's persuaded me that he's right on the API part at the very >> least. The current API was a mechanical replacement of the initial >> __context__ based API with a normal method, whereas I should have >> reverted back to the module-level localcontext() function from PEP343 >> and thrown the method on Context objects away entirely. >> >> I can fix it on the trunk (and add those missing tests!), but I'll >> need Anthony and/or Neal's permission to backport it and remove the >> get_manager() method from Python 2.5 before we get stuck with it >> forever. > > > > I committed this fix as 51664 on the trunk (although the docstrings > are still example free because doctest doesn't understand __future__ > statements). > Thanks for getting this done. 
Please make the following changes:

* rename ContextManager to _ContextManager and remove it from the __all__ listing
* move the copy() step from localcontext() to _ContextManager()
* make the trivial updates to the whatsnew25 example

Once those nits are fixed, I recommend this patch be backported to the
Py2.5 release.

Raymond

From rhettinger at ewtllc.com Sat Sep 2 01:47:21 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Fri, 01 Sep 2006 16:47:21 -0700
Subject: [Python-Dev] Problem withthe API for str.rpartition()
Message-ID: <44F8C689.6050804@ewtllc.com>

Currently, both the partition() and rpartition() methods return a (head,
sep, tail) tuple and the only difference between the two is whether the
partition element search starts from the beginning or end of the
string. When no separator is found, both methods return the string S
and two empty strings so that 'a'.partition('x') == 'a'.rpartition('x')
== ('a', '', '').

For rpartition() the notion of head and tail is backwards -- you
repeatedly search the tail, not the head. The distinction is vital
because the use cases for rpartition() are a mirror image of those for
partition(). Accordingly, rpartition()'s result should be interpreted
as (tail, sep, head) and the partition-not-found endcase needs to change
so that 'a'.rpartition('x') == ('', '', 'a').

The test invariant should be:
    For every s and p: s.partition(p) == s[::-1].rpartition(p)[::-1]

The following code demonstrates why the current choice is problematic:

    line = 'a.b.c.d'
    while line:
        field, sep, line = line.partition('.')
        print field

    line = 'a.b.c.d'
    while line:
        line, sep, field = line.rpartition('.')
        print field

The second fragment never terminates.

Since this is a critical API flaw rather than an implementation bug, I
think it should get fixed right away rather than waiting for Py2.5.1.
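
The termination argument above can be checked with a small sketch. It is written for present-day Python 3, where str.rpartition() returns ('', '', s) on a miss, as proposed here. The helper rpartition_rc1 only emulates the ('a', '', '') result described in the message, and fields_from_right, mirror_holds and the max_steps cap are illustrative names, not part of any library. The mirror check reverses each element as well as the tuple, which is a reformulation assumed here so that the invariant also holds for multi-character fields and separators.

```python
def rpartition_rc1(s, sep):
    """Emulate the behaviour described above: (s, '', '') when sep is absent."""
    head, found, tail = s.rpartition(sep)
    return (head, found, tail) if found else (s, "", "")


def fields_from_right(text, sep, rpartition, max_steps=10):
    """Peel fields off the right-hand end, giving up after max_steps rounds."""
    fields = []
    steps = 0
    while text and steps < max_steps:
        text, _, field = rpartition(text, sep)
        fields.append(field)
        steps += 1
    return fields, text


def mirror_holds(s, sep):
    """Check that partition() and rpartition() are mirror images of each other."""
    mirrored = s[::-1].rpartition(sep[::-1])
    return s.partition(sep) == tuple(part[::-1] for part in mirrored)[::-1]


# With ('', '', s) on a miss the loop consumes the whole string.
fixed_fields, fixed_rest = fields_from_right("a.b.c.d", ".", str.rpartition)

# With (s, '', '') on a miss the loop stalls on 'a' until the cap stops it.
stuck_fields, stuck_rest = fields_from_right("a.b.c.d", ".", rpartition_rc1)
```

Without the max_steps cap the second call would spin forever on the remaining 'a', which is the non-terminating loop shown in the second fragment.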
Raymond

From guido at python.org Sat Sep 2 02:04:12 2006
From: guido at python.org (Guido van Rossum)
Date: Fri, 1 Sep 2006 17:04:12 -0700
Subject: [Python-Dev] Problem withthe API for str.rpartition()
In-Reply-To: <44F8C689.6050804@ewtllc.com>
References: <44F8C689.6050804@ewtllc.com>
Message-ID:

+1

On 9/1/06, Raymond Hettinger wrote:
> Currently, both the partition() and rpartition() methods return a (head,
> sep, tail) tuple and the only difference between the two is whether the
> partition element search starts from the beginning or end of the
> string. When no separator is found, both methods return the string S
> and two empty strings so that 'a'.partition('x') == 'a'.rpartition('x')
> == ('a', '', '').
>
> For rpartition() the notion of head and tail is backwards -- you
> repeatedly search the tail, not the head. The distinction is vital
> because the use cases for rpartition() are a mirror image of those for
> partition(). Accordingly, rpartition()'s result should be interpreted
> as (tail, sep, head) and the partition-not-found endcase needs to change
> so that 'a'.rpartition('x') == ('', '', 'a').
>
> The test invariant should be:
>     For every s and p: s.partition(p) == s[::-1].rpartition(p)[::-1]
>
> The following code demonstrates why the current choice is problematic:
>
>     line = 'a.b.c.d'
>     while line:
>         field, sep, line = line.partition('.')
>         print field
>
>     line = 'a.b.c.d'
>     while line:
>         line, sep, field = line.rpartition('.')
>         print field
>
> The second fragment never terminates.
>
> Since this is a critical API flaw rather than an implementation bug, I
> think it should get fixed right away rather than waiting for Py2.5.1.
> > > > Raymond > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From kbk at shore.net Sat Sep 2 03:28:44 2006 From: kbk at shore.net (Kurt B. Kaiser) Date: Fri, 1 Sep 2006 21:28:44 -0400 (EDT) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200609020128.k821SicT001270@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 412 open ( +5) / 3397 closed ( +4) / 3809 total ( +9) Bugs : 900 open (+12) / 6149 closed ( +4) / 7049 total (+16) RFE : 233 open ( +1) / 236 closed ( +0) / 469 total ( +1) New / Reopened Patches ______________________ set literals (2006-08-28) CLOSED http://python.org/sf/1547796 opened by Georg Brandl "for x in setliteral" peepholer optimization (2006-08-28) CLOSED http://python.org/sf/1548082 opened by Georg Brandl set comprehensions (2006-08-29) http://python.org/sf/1548388 opened by Georg Brandl Fix for structmember conversion issues (2006-08-29) http://python.org/sf/1549049 opened by Roger Upole Implementation of PEP 3102 Keyword Only Argument (2006-08-31) http://python.org/sf/1549670 opened by Jiwon Seo Add a test suite for test_unittest (2006-08-31) http://python.org/sf/1550272 opened by Collin Winter Fix numerous bugs in unittest (2006-08-31) http://python.org/sf/1550273 opened by Collin Winter Ellipsis literal "..." 
(2006-09-01) http://python.org/sf/1550786 opened by Georg Brandl make exec a function (2006-09-01) http://python.org/sf/1550800 opened by Georg Brandl Patches Closed ______________ Allow os.listdir to accept file names longer than MAX_PATH (2006-04-26) http://python.org/sf/1477350 closed by rupole set literals (2006-08-28) http://python.org/sf/1547796 closed by gbrandl pybench.py error reporting broken for bad -s filename (2006-08-25) http://python.org/sf/1546372 closed by lemburg "if x in setliteral" peepholer optimization (2006-08-28) http://python.org/sf/1548082 closed by gvanrossum New / Reopened Bugs ___________________ Typo in Language Reference Section 3.2 Class Instances (2006-08-28) http://python.org/sf/1547931 opened by whesse_at_clarkson curses module segfaults on invalid tparm arguments (2006-08-28) http://python.org/sf/1548092 opened by Marien Zwart Add 'find' method to sequence types (2006-08-28) http://python.org/sf/1548178 opened by kovan Recursion limit exceeded in the match function (2006-08-29) CLOSED http://python.org/sf/1548252 opened by wojtekwu sgmllib.sgmlparser is not thread safe (2006-08-28) http://python.org/sf/1548288 opened by Andres Riancho whichdb too dumb (2006-08-28) http://python.org/sf/1548332 opened by Curtis Doty filterwarnings('error') has no effect (2006-08-29) http://python.org/sf/1548371 opened by Roger Upole C modules reloaded on certain failed imports (2006-08-29) http://python.org/sf/1548687 opened by Josiah Carlson shlex (or perhaps cStringIO) and unicode strings (2006-08-29) http://python.org/sf/1548891 opened by Erwin S. 
Andreasen bug in classlevel variabels (2006-08-30) CLOSED http://python.org/sf/1549499 opened by Thomas Dybdahl Ahle Pdb parser bug (2006-08-30) http://python.org/sf/1549574 opened by Alexander Belopolsky urlparse return exchanged values (2006-08-30) CLOSED http://python.org/sf/1549589 opened by Oscar Acena Enhance and correct unittest's docs (redux) (2006-08-31) http://python.org/sf/1550263 reopened by fdrake Enhance and correct unittest's docs (redux) (2006-08-31) http://python.org/sf/1550263 opened by Collin Winter inspect module and class startlineno (2006-09-01) http://python.org/sf/1550524 opened by Ali Gholami Rudi SWIG wrappers incompatible with 2.5c1 (2006-09-01) http://python.org/sf/1550559 opened by Andrew Gregory itertools.tee raises SystemError (2006-09-01) http://python.org/sf/1550714 opened by Alexander Belopolsky itertools.tee raises SystemError (2006-09-01) CLOSED http://python.org/sf/1550761 opened by Alexander Belopolsky Bugs Closed ___________ x!=y and [x]=[y] (!) (2006-08-22) http://python.org/sf/1544762 closed by rhettinger Recursion limit exceeded in the match function (2006-08-29) http://python.org/sf/1548252 closed by gbrandl bug in classlevel variabels (2006-08-30) http://python.org/sf/1549499 closed by gbrandl urlparse return exchanged values (2006-08-30) http://python.org/sf/1549589 closed by gbrandl Enhance and correct unittest's docs (redux) (2006-08-31) http://python.org/sf/1550263 closed by fdrake itertools.tee raises SystemError (2006-09-01) http://python.org/sf/1550761 deleted by belopolsky From ncoghlan at gmail.com Sat Sep 2 06:47:30 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 02 Sep 2006 14:47:30 +1000 Subject: [Python-Dev] Py2.5 issue: decimal context manager misimplemented, misdesigned, and misdocumented In-Reply-To: <44F715DC.1090001@ewtllc.com> References: <44F4D9D2.2040804@ewtllc.com> <44F6B524.6060504@gmail.com> <44F715DC.1090001@ewtllc.com> Message-ID: <44F90CE2.2050200@gmail.com> Raymond Hettinger wrote: > 
Please go ahead and get the patch together for localcontext(). This > should be an easy sell: > > * simple bugs can be fixed in Py2.5.1 but API mistakes are forever. * > currently, all of the docs, docstrings, and whatsnew are incorrect. > * the solution has already been worked-out in PEP343 -- it's nothing new. > * nothing else, anywhere depends on this code -- it is as safe a change > as we could hope for. > > Neal is tough, but he's not heartless ;-) I backported the changes and assigned the patch to Neal: http://www.python.org/sf/1550886 Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From gjcarneiro at gmail.com Sat Sep 2 14:10:04 2006 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Sat, 2 Sep 2006 13:10:04 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions Message-ID: We have to resort to timeouts in pygtk in order to catch unix signals in threaded mode. The reason is this. We call gtk_main() (mainloop function) which blocks forever. Suppose there are threads in the program; then any thread can receive a signal (e.g. SIGINT). Python catches the signal, but doesn't do anything; it simply sets a flag in a global structure and calls Py_AddPendingCall(), and I guess it expects someone to call Py_MakePendingCalls(). However, the main thread is blocked calling a C function and has no way of being notified it needs to give control back to python to handle the signal. Hence, we use a 100ms timeout for polling. Unfortunately, timeouts needlessly consume CPU time and drain laptop batteries. According to [1], all python needs to do to avoid this problem is block all signals in all but the main thread; then we can guarantee signal handlers are always called from the main thread, and pygtk doesn't need a timeout. 
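
The "block all signals in all but the main thread" idea can be sketched at the Python level. This is only an illustration under stated assumptions: it uses signal.pthread_sigmask(), which exists in Python 3.3 and later on Unix and was not available in 2.5, and it is not the C-level change being proposed for the interpreter. The names start_thread_with_signals_blocked and report_mask are made up for the example.

```python
import signal
import threading


def start_thread_with_signals_blocked(target, signals=(signal.SIGINT,)):
    """Start a thread that inherits a mask with the given signals blocked."""
    # A new thread inherits the signal mask of the thread that creates it,
    # so block first, start the thread, then restore the creator's mask.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, signals)
    try:
        thread = threading.Thread(target=target)
        thread.start()
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return thread


seen = {}


def report_mask():
    # Blocking an empty set changes nothing and returns the current mask.
    seen["worker_mask"] = signal.pthread_sigmask(signal.SIG_BLOCK, [])


mask_before = signal.pthread_sigmask(signal.SIG_BLOCK, [])
worker = start_thread_with_signals_blocked(report_mask)
worker.join()
mask_after = signal.pthread_sigmask(signal.SIG_BLOCK, [])
```

Blocking in the creating thread before start(), instead of inside the worker, avoids a window in which the new thread is running but has not yet masked the signal. Whether the kernel then reliably delivers to the unmasked thread is a platform question that this sketch does not settle.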
Another alternative would be to add a new API like Py_AddPendingCallNotification, which would let python notify extensions that new pending calls exist and need to be processed. But I would really prefer the first alternative, as it could be fixed within python 2.5; no need to wait for 2.6. Please, let's make Python ready for the enterprise! [2] [1] https://bugzilla.redhat.com/bugzilla/process_bug.cgi#c3 [2] http://perkypants.org/blog/2006/09/02/rfte-python/ From nmm1 at cus.cam.ac.uk Sat Sep 2 15:02:43 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Sat, 02 Sep 2006 14:02:43 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: Your message of "Sat, 02 Sep 2006 13:10:04 BST." Message-ID: "Gustavo Carneiro" wrote: > > We have to resort to timeouts in pygtk in order to catch unix signals > in threaded mode. A common defect of modern designs - TCP/IP is particularly objectionable in this respect, but that battle was lost and won over two decades ago :-( > The reason is this. We call gtk_main() (mainloop function) which > blocks forever. Suppose there are threads in the program; then any > thread can receive a signal (e.g. SIGINT). Python catches the signal, > but doesn't do anything; it simply sets a flag in a global structure > and calls Py_AddPendingCall(), and I guess it expects someone to call > Py_MakePendingCalls(). However, the main thread is blocked calling a > C function and has no way of being notified it needs to give control > back to python to handle the signal. Hence, we use a 100ms timeout > for polling. Unfortunately, timeouts needlessly consume CPU time and > drain laptop batteries. Yup. > According to [1], all python needs to do to avoid this problem is > block all signals in all but the main thread; then we can guarantee > signal handlers are always called from the main thread, and pygtk > doesn't need a timeout. 
1) That page is password protected, so I can't see what it says, and am disinclined to register myself to yet another such site. 2) No way, Jose, anyway. The POSIX signal handling model was broken beyond redemption, even before threading was added, and the combination is evil almost beyond belief. That procedure is good practice, yes, but that is NOT all that you have to do - it may be all that you CAN do, but that is not the same. Come back MVS (or even VMS) - all is forgiven! That is only partly a joke. > Another alternative would be to add a new API like > Py_AddPendingCallNotification, which would let python notify > extensions that new pending calls exist and need to be processed. Nope. Sorry, but you can't solve a broken design by adding interfaces. > But I would really prefer the first alternative, as it could be > fixed within python 2.5; no need to wait for 2.6. It clearly should be done, assuming that Python's model is that it doesn't want to get involved with subthread signalling (and I really, but REALLY, recommend not doing so). The best that can be done is to say that all signal handling is the business of the main thread and that, when the system bypasses that, all bets are off. > Please, let's make Python ready for the enterprise! [2] Given that no Unix variant or Microsoft system is, isn't that rather an unreasonable demand? I am probably one of the last half-dozen people still employed in a technical capacity who has implemented run-time systems that supported user-level signal handling with threads/asynchronicity and allowing for signals received while in system calls. It would be possible to modify/extend POSIX or Microsoft designs to support this, but currently they don't make it possible. There is NOTHING that Python can do but to minimise the chaos. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. 
Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From jjl at pobox.com Sat Sep 2 17:01:52 2006 From: jjl at pobox.com (John J Lee) Date: Sat, 2 Sep 2006 15:01:52 +0000 (UTC) Subject: [Python-Dev] Py2.5 issue: decimal context manager misimplemented, misdesigned, and misdocumented In-Reply-To: <44F6D12C.4040808@gmail.com> References: <44F4D9D2.2040804@ewtllc.com> <44F6B524.6060504@gmail.com> <44F6D12C.4040808@gmail.com> Message-ID: On Thu, 31 Aug 2006, Nick Coghlan wrote: [...] > I committed this fix as 51664 on the trunk (although the docstrings are still > example free because doctest doesn't understand __future__ statements). [...] Assuming doctest doesn't try to parse the Python code when SKIP is specified, I guess this would solve that little problem: http://docs.python.org/dev/lib/doctest-options.html """ SKIP When specified, do not run the example at all. This can be useful in contexts where doctest examples serve as both documentation and test cases, and an example should be included for documentation purposes, but should not be checked. E.g., the example's output might be random; or the example might depend on resources which would be unavailable to the test driver. The SKIP flag can also be used for temporarily "commenting out" examples. ... Changed in version 2.5: Constant SKIP was added. """ John From ncoghlan at gmail.com Sat Sep 2 17:27:03 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 03 Sep 2006 01:27:03 +1000 Subject: [Python-Dev] Py2.5 issue: decimal context manager misimplemented, misdesigned, and misdocumented In-Reply-To: References: <44F4D9D2.2040804@ewtllc.com> <44F6B524.6060504@gmail.com> <44F6D12C.4040808@gmail.com> Message-ID: <44F9A2C7.5060803@gmail.com> John J Lee wrote: > On Thu, 31 Aug 2006, Nick Coghlan wrote: > [...] >> I committed this fix as 51664 on the trunk (although the docstrings are still >> example free because doctest doesn't understand __future__ statements). > [...] 
>
> Assuming doctest doesn't try to parse the Python code when SKIP is
> specified, I guess this would solve that little problem:
>
> http://docs.python.org/dev/lib/doctest-options.html
>
> """
> SKIP

A quick experiment suggests that using SKIP will solve the problem - fixing
that can wait until 2.5.1 though. The localcontext() docstring does actually
contain an example - it just isn't in a form that doctest will try to execute.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org

From alan.mcintyre at gmail.com Sat Sep 2 18:31:54 2006
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Sat, 02 Sep 2006 12:31:54 -0400
Subject: [Python-Dev] Windows build slave down until Tuesday-ish
Message-ID: <44F9B1FA.4010305@gmail.com>

The "x86 XP trunk" build slave will be down for a bit longer,
unfortunately. Tropical storm Ernesto got in the way of my DSL
installation - I don't have a new install date yet, but I'm assuming
it's going to be Tuesday or later.

Alan

From gjcarneiro at gmail.com Sat Sep 2 18:39:51 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Sat, 2 Sep 2006 17:39:51 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To:
References:
Message-ID:

On 9/2/06, Nick Maclaren wrote:
> > According to [1], all python needs to do to avoid this problem is
> > block all signals in all but the main thread; then we can guarantee
> > signal handlers are always called from the main thread, and pygtk
> > doesn't need a timeout.
>
> 1) That page is password protected, so I can't see what it says, and
> am disinclined to register myself to yet another such site.

Oh, sorry, here's the comment:

(comment by Arjan van de Ven):
| afaik the kernel only sends signals to threads that don't have them blocked.
| If python doesn't want anyone but the main thread to get signals, it should just | block signals on all but the main thread and then by nature, all signals will go | to the main thread.... > 2) No way, Jose, anyway. The POSIX signal handling model was broken > beyond redemption, even before threading was added, and the combination > is evil almost beyond belief. That procedure is good practice, yes, > but that is NOT all that you have to do - it may be all that you CAN > do, but that is not the same. > > Nope. Sorry, but you can't solve a broken design by adding interfaces. Well, Python has a broken design too; it postpones tasks and expects to magically regain control in order to finish the job. That often doesn't happen! > > > But I would really prefer the first alternative, as it could be > > fixed within python 2.5; no need to wait for 2.6. > > It clearly should be done, assuming that Python's model is that it > doesn't want to get involved with subthread signalling (and I really, > but REALLY, recommend not doing so). The best that can be done is to > say that all signal handling is the business of the main thread and > that, when the system bypasses that, all bets are off. Python is halfway there; it assumes signals are to be handled in the main thread. However, it _catches_ them in any thread, sets a flag, and just waits for the next opportunity when it runs again in the main thread. It is precisely this "split handling" of signals that is failing now. Anyway, attached a patch that should fix the problem in posix threads systems, in case anyone wants to review. Cheers. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: pythreads.diff
Type: text/x-patch
Size: 1030 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20060902/b91250df/attachment.bin

From raymond.hettinger at verizon.net Sat Sep 2 19:11:58 2006
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat, 02 Sep 2006 10:11:58 -0700
Subject: [Python-Dev] Py2.5 issue: decimal context manager misimplemented, misdesigned, and misdocumented
References: <44F4D9D2.2040804@ewtllc.com> <44F6B524.6060504@gmail.com> <44F715DC.1090001@ewtllc.com> <44F90CE2.2050200@gmail.com>
Message-ID: <006901c6ceb2$e692f8d0$4c00000a@RaymondLaptop1>

[Neal]
> Please review the patch and make a comment. I did a diff between HEAD
> and 2.4 and am fine with this going in once you are happy.

I fixed a couple of documentation nits in rev 51688.
The patch is ready-to-go.
Nick, please go ahead and backport.

Raymond

From nmm1 at cus.cam.ac.uk Sat Sep 2 20:41:59 2006
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Sat, 02 Sep 2006 19:41:59 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: Your message of "Sat, 02 Sep 2006 17:39:51 BST."
Message-ID:

"Gustavo Carneiro" wrote:
>
> Oh, sorry, here's the comment:
>
> (comment by Arjan van de Ven):
> | afaik the kernel only sends signals to threads that don't have them blocked.
> | If python doesn't want anyone but the main thread to get signals, it should just
> | block signals on all but the main thread and then by nature, all signals will go
> | to the main thread....

Well, THAT'S wrong, I am afraid! Things ain't that simple :-(
Yes, POSIX implies that things work that way, but there are so many
get-out clauses and problems with trying to implement that specification
that such behaviour can't be relied on.

> Well, Python has a broken design too; it postpones tasks and expects
> to magically regain control in order to finish the job. That often
> doesn't happen!

Very true.
And that is another problem with POSIX :-( > Python is halfway there; it assumes signals are to be handled in the > main thread. However, it _catches_ them in any thread, sets a flag, > and just waits for the next opportunity when it runs again in the main > thread. It is precisely this "split handling" of signals that is > failing now. I agree that is not how to do it, but that code should not be removed. Despite best attempts, there may well be circumstances under which signals are received in a subthread, despite all attempts of the program to ensure that the main thread gets them. > Anyway, attached a patch that should fix the problem in posix > threads systems, in case anyone wants to review. Not "fix" - "improve" :-) I haven't looked at it, but I agree that what you have said is the way to proceed. The best solution is to enable the main thread for all relevant signals, disable all subthreads, but to not rely on any of that working in all cases. It won't help with the problem where merely receiving a signal causes chaos, or where blocking them does so, but there is nothing that Python can do about that, in general. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From anthony at interlink.com.au Sun Sep 3 05:58:40 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Sun, 3 Sep 2006 13:58:40 +1000 Subject: [Python-Dev] Py2.5 issue: decimal context manager misimplemented, misdesigned, and misdocumented In-Reply-To: <006901c6ceb2$e692f8d0$4c00000a@RaymondLaptop1> References: <44F4D9D2.2040804@ewtllc.com> <006901c6ceb2$e692f8d0$4c00000a@RaymondLaptop1> Message-ID: <200609031358.42774.anthony@interlink.com.au> On Sunday 03 September 2006 03:11, Raymond Hettinger wrote: > [Neal] > > > Please review the patch and make a comment. 
I did a diff between HEAD > > and 2.4 and am fine with this going in once you are happy. > > I fixed a couple of documentation nits in rev 51688. > The patch is ready-to-go. > Nick, please go ahead and backport. I think this is suitable for 2.5. I'm thinking, though, that we need a second release candidate, given the number of changes since rc1. -- Anthony Baxter It's never too late to have a happy childhood. From aahz at pythoncraft.com Sun Sep 3 06:06:27 2006 From: aahz at pythoncraft.com (Aahz) Date: Sat, 2 Sep 2006 21:06:27 -0700 Subject: [Python-Dev] Py2.5 issue: decimal context manager misimplemented, misdesigned, and misdocumented In-Reply-To: <200609031358.42774.anthony@interlink.com.au> References: <44F4D9D2.2040804@ewtllc.com> <006901c6ceb2$e692f8d0$4c00000a@RaymondLaptop1> <200609031358.42774.anthony@interlink.com.au> Message-ID: <20060903040627.GA21743@panix.com> On Sun, Sep 03, 2006, Anthony Baxter wrote: > > I think this is suitable for 2.5. I'm thinking, though, that we need > a second release candidate, given the number of changes since rc1. +1 -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ I support the RKAB From fdrake at acm.org Sun Sep 3 07:01:50 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Sun, 3 Sep 2006 01:01:50 -0400 Subject: [Python-Dev] Py2.5 issue: decimal context manager misimplemented, misdesigned, and misdocumented In-Reply-To: <200609031358.42774.anthony@interlink.com.au> References: <44F4D9D2.2040804@ewtllc.com> <006901c6ceb2$e692f8d0$4c00000a@RaymondLaptop1> <200609031358.42774.anthony@interlink.com.au> Message-ID: <200609030101.51129.fdrake@acm.org> On Saturday 02 September 2006 23:58, Anthony Baxter wrote: > I think this is suitable for 2.5. I'm thinking, though, that we need a > second release candidate, given the number of changes since rc1. +1 -Fred -- Fred L. Drake, Jr. 
From chrism at plope.com Mon Sep 4 04:36:23 2006 From: chrism at plope.com (Chris McDonough) Date: Sun, 3 Sep 2006 22:36:23 -0400 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: Message-ID: Would adding an API for sigprocmask help here? (Although it has been tried before -- http://mail.python.org/ pipermail/python-dev/2003-February/033016.html and died in the womb due to threading-related issues -- http://mail.mems-exchange.org/ durusmail/quixote-users/1248/) - C On Sep 2, 2006, at 8:10 AM, Gustavo Carneiro wrote: > We have to resort to timeouts in pygtk in order to catch unix signals > in threaded mode. > The reason is this. We call gtk_main() (mainloop function) which > blocks forever. Suppose there are threads in the program; then any > thread can receive a signal (e.g. SIGINT). Python catches the signal, > but doesn't do anything; it simply sets a flag in a global structure > and calls Py_AddPendingCall(), and I guess it expects someone to call > Py_MakePendingCalls(). However, the main thread is blocked calling a > C function and has no way of being notified it needs to give control > back to python to handle the signal. Hence, we use a 100ms timeout > for polling. Unfortunately, timeouts needlessly consume CPU time and > drain laptop batteries. > > According to [1], all python needs to do to avoid this problem is > block all signals in all but the main thread; then we can guarantee > signal handlers are always called from the main thread, and pygtk > doesn't need a timeout. > > Another alternative would be to add a new API like > Py_AddPendingCallNotification, which would let python notify > extensions that new pending calls exist and need to be processed. > > But I would really prefer the first alternative, as it could be > fixed within python 2.5; no need to wait for 2.6. > > Please, let's make Python ready for the enterprise! 
[2]
>
> [1] https://bugzilla.redhat.com/bugzilla/process_bug.cgi#c3
> [2] http://perkypants.org/blog/2006/09/02/rfte-python/
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40plope.com
>

From anthony at interlink.com.au Mon Sep 4 09:19:39 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Mon, 4 Sep 2006 17:19:39 +1000
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To:
References:
Message-ID: <200609041719.41488.anthony@interlink.com.au>

On Saturday 02 September 2006 22:10, Gustavo Carneiro wrote:
> According to [1], all python needs to do to avoid this problem is
> block all signals in all but the main thread; then we can guarantee
> signal handlers are always called from the main thread, and pygtk
> doesn't need a timeout.

> But I would really prefer the first alternative, as it could be
> fixed within python 2.5; no need to wait for 2.6.

Assuming "the first alternative" is the "just block all signals in all but the main thread" option, there is absolutely no chance of this going into 2.5.

Signals and threads combined are a complete *nightmare* of platform-specific behaviour. I'm -1000 on trying to change this code now, _after_ the first release candidate. To say that "that path lies madness" is like saying "Pacific Ocean large, wet, full of fish".

--
Anthony Baxter
It's never too late to have a happy childhood.
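
[Editorial illustration, not part of the original thread.] The "block all signals in all but the main thread" idea quoted above can be sketched at the Python level as follows. This is only a minimal sketch under stated assumptions: it uses signal.pthread_sigmask, which exists only in much later Python 3 releases on POSIX systems and not in the 2.5 series under discussion, and the names MAIN_THREAD_SIGNALS and start_masked_thread are hypothetical helpers invented for this example. It says nothing about how the attached pythreads.diff works.

```python
import signal
import threading

# Signals the main thread wants to keep for itself (an illustrative choice).
MAIN_THREAD_SIGNALS = {signal.SIGINT, signal.SIGTERM}


def start_masked_thread(target, *args):
    """Start target in a new thread that has MAIN_THREAD_SIGNALS blocked.

    Hypothetical helper: a new thread inherits the signal mask of the
    thread that creates it, so the signals are blocked around start()
    and the caller's own mask is restored afterwards.
    """
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, MAIN_THREAD_SIGNALS)
    try:
        thread = threading.Thread(target=target, args=args)
        thread.start()
    finally:
        # Put the creating thread's mask back exactly as it was.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return thread
```

A worker started this way should not be chosen to receive SIGINT or SIGTERM, at least on platforms that follow the POSIX delivery rules; whether that can be relied on everywhere is exactly what the thread disputes.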
From rasky at develer.com Mon Sep 4 12:29:51 2006 From: rasky at develer.com (Giovanni Bajo) Date: Mon, 4 Sep 2006 12:29:51 +0200 Subject: [Python-Dev] Error while building 2.5rc1 pythoncore_pgo on VC8 References: <8dd9fd0608310336q45d2d3d3re203e871c7b384b8@mail.gmail.com> <8dd9fd0608310446o6008240x8bfa852b41595eab@mail.gmail.com> Message-ID: <01ca01c6d00d$0dd14a90$b803030a@trilan> Fredrik Lundh wrote: >> That error mentioned in that post was in "pythoncore" module. >> My error is while compiling "pythoncore_pgo" module. > > iirc, that's a partially experimental alternative build for playing > with performance guided optimizations. are you sure you need > that module ? Oh yes, it's a 30% improvement in pystone, for free. -- Giovanni Bajo From mwh at python.net Mon Sep 4 15:30:41 2006 From: mwh at python.net (Michael Hudson) Date: Mon, 04 Sep 2006 14:30:41 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: (Gustavo Carneiro's message of "Sat, 2 Sep 2006 13:10:04 +0100") References: Message-ID: <2mpseboj26.fsf@starship.python.net> "Gustavo Carneiro" writes: > According to [1], all python needs to do to avoid this problem is > block all signals in all but the main thread; Argh, no: then people who call system() from non-main threads end up running subprocesses with all signals masked, which breaks other things in very mysterious ways. Been there... No time to read the rest of the post, maybe in a few days... Cheers, mwh -- Arrrrgh, the braindamage! It's not unlike the massively non-brilliant decision to use the period in abbreviations as well as a sentence terminator. Had these people no imagination at _all_? 
-- Erik Naggum, comp.lang.lisp

From gjcarneiro at gmail.com Mon Sep 4 15:48:54 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Mon, 4 Sep 2006 13:48:54 +0000
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <2mpseboj26.fsf@starship.python.net>
References: <2mpseboj26.fsf@starship.python.net>
Message-ID:

On 9/4/06, Michael Hudson wrote:
> "Gustavo Carneiro" writes:
>
> > According to [1], all python needs to do to avoid this problem is
> > block all signals in all but the main thread;
>
> Argh, no: then people who call system() from non-main threads end up
> running subprocesses with all signals masked, which breaks other
> things in very mysterious ways. Been there...

That's a very good point; I wasn't aware that child processes inherited the signals mask from their parent processes.

> No time to read the rest of the post, maybe in a few days...

Don't worry. From the feedback received so far it seems that any proposed solution has to wait for Python 2.6 :-(

I am now thinking of something along these lines:

    typedef void (*PyPendingCallNotify)(void *user_data);
    PyAPI_FUNC(void) Py_AddPendingCallNotify(PyPendingCallNotify callback,
                                             void *user_data);
    PyAPI_FUNC(void) Py_RemovePendingCallNotify(PyPendingCallNotify callback,
                                                void *user_data);

Regards.

From nmm1 at cus.cam.ac.uk Mon Sep 4 16:05:56 2006
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Mon, 04 Sep 2006 15:05:56 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
Message-ID:

"Gustavo Carneiro" wrote:
>
> That's a very good point; I wasn't aware that child processes
> inherited the signals mask from their parent processes.

That's one of the few places where POSIX does describe what happens. Well, usually. You really don't want to know what happens when you call something revolting, like csh or a setuid program.
This particular mess is why I had to write my own nohup - the new POSIX interfaces broke the existing one, and it remains broken today on almost all systems. > I am now thinking of something along these lines: > typedef void (*PyPendingCallNotify)(void *user_data); > PyAPI_FUNC(void) Py_AddPendingCallNotify(PyPendingCallNotify callback, > void *user_data); > PyAPI_FUNC(void) Py_RemovePendingCallNotify(PyPendingCallNotify > callback, void *user_data); Why would that help? The problems are semantic, not syntactic. Anthony Baxter isn't exaggerating the problem, despite what you may think from his posting. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From nmm1 at cus.cam.ac.uk Mon Sep 4 16:07:17 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Mon, 04 Sep 2006 15:07:17 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions Message-ID: Chris McDonough wrote: > > Would adding an API for sigprocmask help here? No. sigprocmask is a large part of the problem. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From anthony at interlink.com.au Mon Sep 4 16:22:22 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Tue, 5 Sep 2006 00:22:22 +1000 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: Message-ID: <200609050022.23944.anthony@interlink.com.au> On Tuesday 05 September 2006 00:05, Nick Maclaren wrote: > Anthony Baxter isn't exaggerating the problem, despite what you may > think from his posting. If the SF bugtracker had a better search interface, you could see why I have such a bleak view of this area of Python. 
What's there now *mostly* works (I exclude freakshows like certain versions of HP/UX, AIX, SCO and the like). It took a hell of a lot of effort to get it to this point. threads + signals == tears. Anthony From gjcarneiro at gmail.com Mon Sep 4 16:52:36 2006 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Mon, 4 Sep 2006 14:52:36 +0000 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: Message-ID: On 9/4/06, Nick Maclaren wrote: > "Gustavo Carneiro" wrote: > > I am now thinking of something along these lines: > > typedef void (*PyPendingCallNotify)(void *user_data); > > PyAPI_FUNC(void) Py_AddPendingCallNotify(PyPendingCallNotify callback, > > void *user_data); > > PyAPI_FUNC(void) Py_RemovePendingCallNotify(PyPendingCallNotify > > callback, void *user_data); > > Why would that help? The problems are semantic, not syntactic. > > Anthony Baxter isn't exaggerating the problem, despite what you may > think from his posting. You guys are tough customers to please. I am just trying to solve a problem here, not create a new one; you have to believe me. OK, let's review what we know about current python, signals, and threads: 1. Python launches threads without touching sigprocmask; 2. Python installs signal handlers for all signals; 3. Signals can be delivered to any thread, let's assume (because of point #1 and not others not mentioned) that we have no control over which threads receive which signals, might as well be random for all we know; 4. Python signal handlers do almost nothing: just sets a flag, and calls Py_AddPendingCall, to postpone the job of handling a signal until a "safer" time. 5. The function Py_MakePendingCalls() should eventually get called at a "safer" time by user or python code. 6. It follows that until Py_MakePendingCalls() is called, the signal will not be handled at all! Now, back to explaining the problem. 1. 
In PyGTK we have a gobject.MainLoop.run() method, which blocks essentially forever in a poll() system call, and only wakes if/when it has to process timeout or IO event;
2. When we only have one thread, we can guarantee that e.g. SIGINT will always be caught by the thread running the g_main_loop_run(), so we know poll() will be interrupted and an EINTR will be generated, giving us control temporarily back to check for python signals;
3. When we have multiple threads, we cannot make this assumption, so instead we install a timeout to periodically check for signals.

We want to get rid of timeouts. Now my idea: add a Python API to say:
"dear Python, please call me when you start having pending calls, even if from a signal handler context, ok?"

From that point on, signals will get handled by Python, python calls PyGTK, PyGTK calls a special API to safely wake up the main loop even from a thread or signal handler, then main loop checks for signal by calling PyErr_CheckSignals(), it is handled by Python, and the process lives happily ever after, or die trying.

I sincerely hope my explanation was satisfactory this time.

Best regards.

PS: there's a "funny" comment in Py_AddPendingCall that suggests it is not very safe against reentrancy problems:

    /* XXX Begin critical section */
    /* XXX If you want this to be safe against nested
       XXX asynchronous calls, you'll have to work harder! */

Are signal handlers guaranteed to not be interrupted by another signal, at least? What about threads?

From anthony at interlink.com.au Mon Sep 4 17:30:11 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Tue, 5 Sep 2006 01:30:11 +1000
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To:
References:
Message-ID: <200609050130.13189.anthony@interlink.com.au>

On Tuesday 05 September 2006 00:52, Gustavo Carneiro wrote:
> 3.
Signals can be delivered to any thread, let's assume (because > of point #1 and not others not mentioned) that we have no control over > which threads receive which signals, might as well be random for all > we know; Note that some Unix variants only deliver signals to the main thread (or so the manpages allege, anyway). Anthony From exarkun at divmod.com Mon Sep 4 17:56:00 2006 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Mon, 4 Sep 2006 11:56:00 -0400 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: Message-ID: <20060904155600.1717.605687145.divmod.quotient.38950@ohm> On Mon, 04 Sep 2006 15:05:56 +0100, Nick Maclaren wrote: >"Gustavo Carneiro" wrote: >> >> That's a very good point; I wasn't aware that child processes >> inherited the signals mask from their parent processes. > >That's one of the few places where POSIX does describe what happens. >Well, usually. You really don't want to know what happens when you >call something revolting, like csh or a setuid program. This >particular mess is why I had to write my own nohup - the new POSIX >interfaces broke the existing one, and it remains broken today on >almost all systems. > >> I am now thinking of something along these lines: >> typedef void (*PyPendingCallNotify)(void *user_data); >> PyAPI_FUNC(void) Py_AddPendingCallNotify(PyPendingCallNotify callback, >> void *user_data); >> PyAPI_FUNC(void) Py_RemovePendingCallNotify(PyPendingCallNotify >> callback, void *user_data); > >Why would that help? The problems are semantic, not syntactic. > >Anthony Baxter isn't exaggerating the problem, despite what you may >think from his posting. > Dealing with threads and signals is certainly hairy. However, that barely has anything to do with what Gustavo is talking about. By the time Gustavo's proposed API springs into action, the threads already exist and the signal is already being handled by one. So, let's forget about threads and signals for a moment. 
The problem to be solved is that one piece of code wants to communicate a piece of information to another piece of code. The first piece of code is in Python itself. The second piece of code could be from any third-party library, and Python has no way of knowing about it - now. Gustavo is suggesting adding a registration API so that these third-party libraries can tell Python that they exist and are interested in this piece of information. Simple, no? PyGTK would presumably implement its pending call callback by writing a byte to a pipe which it is also passing to poll(). This lets them handle signals in a very timely manner without constantly waking up from poll() to see if Python wants to do any work. This is far from a new idea - it's basically the bog standard way of handling this situation. It strikes me as a very useful API to add to Python (although at this point in the 2.5 release process, not to 2.5, sorry Gustavo). Jean-Paul From david.nospam.hopwood at blueyonder.co.uk Mon Sep 4 18:19:27 2006 From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood) Date: Mon, 04 Sep 2006 17:19:27 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: Message-ID: <44FC520F.3070307@blueyonder.co.uk> Gustavo Carneiro wrote: > OK, let's review what we know about current python, signals, and threads: > > 1. Python launches threads without touching sigprocmask; > 2. Python installs signal handlers for all signals; > 3. Signals can be delivered to any thread, let's assume (because > of point #1 and not others not mentioned) that we have no control over > which threads receive which signals, might as well be random for all > we know; > 4. Python signal handlers do almost nothing: just sets a flag, > and calls Py_AddPendingCall, to postpone the job of handling a signal > until a "safer" time. > 5. The function Py_MakePendingCalls() should eventually get > called at a "safer" time by user or python code. > 6. 
It follows that until Py_MakePendingCalls() is called, the
> signal will not be handled at all!
>
> Now, back to explaining the problem.
>
> 1. In PyGTK we have a gobject.MainLoop.run() method, which blocks
> essentially forever in a poll() system call, and only wakes if/when it
> has to process timeout or IO event;
> 2. When we only have one thread, we can guarantee that e.g.
> SIGINT will always be caught by the thread running the
> g_main_loop_run(), so we know poll() will be interrupted and an EINTR
> will be generated, giving us control temporarily back to check for
> python signals;
> 3. When we have multiple threads, we cannot make this assumption,
> so instead we install a timeout to periodically check for signals.
>
> We want to get rid of timeouts. Now my idea: add a Python API to say:
> "dear Python, please call me when you start having pending calls,
> even if from a signal handler context, ok?"

What can be safely done from a signal handler context is *very* limited. Calling back arbitrary Python code is certainly not safe.

Reliable asynchronous interruption of arbitrary code is a difficult problem, but POSIX and POSIX implementations botch it particularly badly. I don't know how to implement what you want here, but I'd endorse the comments of Nick Maclaren and Anthony Baxter against making precipitate changes.

--
David Hopwood

From nmm1 at cus.cam.ac.uk Mon Sep 4 18:24:27 2006
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Mon, 04 Sep 2006 17:24:27 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: Your message of "Mon, 04 Sep 2006 14:52:36 -0000."
Message-ID:

"Gustavo Carneiro" wrote:
>
> You guys are tough customers to please. I am just trying to solve a
> problem here, not create a new one; you have to believe me.

Oh, I believe you.

Look at it this way. You are trying to resolve the problem that your farm is littered with cluster bombs, and your cows keep blowing their legs off.
Your solution is effectively saying "well, let's travel around and pick them all up then". > We want to get rid of timeouts. Now my idea: add a Python API to say: > "dear Python, please call me when you start having pending calls, > even if from a signal handler context, ok?" Yes, I know. I have been there and done that, both academically and (observing, as a consultant) to the vendor. And that was on a system that was a damn sight better engineered than any of the main ones that Python runs on today. I have attempted to do much EASIER tasks under both Unix and (earlier) versions of Microsoft Windows, and failed dismally because the system wasn't up to it. > From that point on, signals will get handled by Python, python calls > PyGTK, PyGTK calls a special API to safely wake up the main loop even > from a thread or signal handler, then main loop checks for signal by > calling PyErr_CheckSignals(), it is handled by Python, and the process > lives happily ever after, or die trying. The first thing that will happen to that beautiful theory when it goes out into Unix County or Microsoft City is that a gang of ugly facts will find it and beat it into a pulp. > I sincerely hope my explanation was satisfactory this time. Oh, it was last time. It isn't that that is the problem. > Are signal handlers guaranteed to not be interrupted by another > signal, at least? What about threads? No and no. In theory, what POSIX says about blocking threads should be reliable; in my experience, it almost is, except under precisely the circumstances that you most want it to work. Look, I am agreeing that your basic design is right. What I am saying is that (a) you cannot make delivery reliable and abolish timeouts and (b) that it is such a revoltingly system-dependent mess that I would much rather Python didn't fiddle with it. Do you know how signalling is misimplemented at the hardware level? 
And that it is possible for a handler to be called with any of its critical pointers (INCLUDING the global code and data pointers) in undefined states? Do you know how to program round that sort of thing? I can answer "yes" to all three - for my sins, which must be many and grievous, for that to be the case :-( Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From david.nospam.hopwood at blueyonder.co.uk Mon Sep 4 18:24:56 2006 From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood) Date: Mon, 04 Sep 2006 17:24:56 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: <20060904155600.1717.605687145.divmod.quotient.38950@ohm> References: <20060904155600.1717.605687145.divmod.quotient.38950@ohm> Message-ID: <44FC5358.70806@blueyonder.co.uk> Jean-Paul Calderone wrote: > PyGTK would presumably implement its pending call callback by writing a > byte to a pipe which it is also passing to poll(). But doing that in a signal handler context invokes undefined behaviour according to POSIX. -- David Hopwood From exarkun at divmod.com Mon Sep 4 18:46:22 2006 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Mon, 4 Sep 2006 12:46:22 -0400 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: <44FC5358.70806@blueyonder.co.uk> Message-ID: <20060904164622.1717.895455315.divmod.quotient.38999@ohm> On Mon, 04 Sep 2006 17:24:56 +0100, David Hopwood wrote: >Jean-Paul Calderone wrote: >> PyGTK would presumably implement its pending call callback by writing a >> byte to a pipe which it is also passing to poll(). > >But doing that in a signal handler context invokes undefined behaviour >according to POSIX. write(2) is explicitly listed as async-signal safe in IEEE Std 1003.1, 2004. Was this changed in a later edition? Otherwise, I don't understand what you mean by this. 
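
[Editorial illustration, not part of the original thread.] The "write a byte to a pipe that is also passed to poll()" technique described above can be sketched in Python. This is only a sketch under stated assumptions: it relies on signal.set_wakeup_fd, which is available in later Python releases but not in the 2.5 series discussed here, it assumes a modern Python 3 on a POSIX system, and install_wakeup_pipe and wait_for_signal are hypothetical names made up for this example rather than PyGTK or GLib APIs.

```python
import os
import select
import signal


def install_wakeup_pipe():
    """Create a pipe and ask the interpreter to write to it when a signal arrives."""
    read_fd, write_fd = os.pipe()
    # Both ends are non-blocking so neither the low-level handler nor the
    # reader can stall on a full or empty pipe.
    os.set_blocking(read_fd, False)
    os.set_blocking(write_fd, False)
    # set_wakeup_fd has to be called from the main thread.
    signal.set_wakeup_fd(write_fd)
    return read_fd, write_fd


def wait_for_signal(read_fd, timeout_ms=None):
    """Sleep in poll() until a wakeup byte arrives; return the bytes read."""
    poller = select.poll()
    poller.register(read_fd, select.POLLIN)
    if not poller.poll(timeout_ms):
        return b""  # timed out with nothing pending
    return os.read(read_fd, 64)
```

With this arrangement the loop sleeping in poll() wakes because the pipe becomes readable, not because a periodic timeout fired. The wakeup byte is only written for signals that have a Python-level handler installed with signal.signal().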
Jean-Paul

From nmm1 at cus.cam.ac.uk Mon Sep 4 19:18:41 2006
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Mon, 04 Sep 2006 18:18:41 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
Message-ID:

Jean-Paul Calderone wrote:
> On Mon, 04 Sep 2006 17:24:56 +0100, David Hopwood wrote:
> >Jean-Paul Calderone wrote:
> >> PyGTK would presumably implement its pending call callback by writing a
> >> byte to a pipe which it is also passing to poll().
> >
> >But doing that in a signal handler context invokes undefined behaviour
> >according to POSIX.
>
> write(2) is explicitly listed as async-signal safe in IEEE Std 1003.1, 2004.
> Was this changed in a later edition? Otherwise, I don't understand what you
> mean by this.

Try looking at the C90 or C99 standard, for a start :-(

NOTHING may safely be done in a real signal handler, except possibly setting a value of type static volatile sig_atomic_t. And even that can be problematic. And note that POSIX defers to C on what the C language defines. So, even if the function is async-signal-safe, the code that calls it can't be!

POSIX's lists are complete fantasy, anyway. Look at the one that defines thread-safety, and then try to get your mind around what exit being thread-safe actually implies (especially with regard to atexit functions).

Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From david.nospam.hopwood at blueyonder.co.uk Mon Sep 4 19:24:38 2006 From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood) Date: Mon, 04 Sep 2006 18:24:38 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: <20060904164622.1717.895455315.divmod.quotient.38999@ohm> References: <20060904164622.1717.895455315.divmod.quotient.38999@ohm> Message-ID: <44FC6156.3000708@blueyonder.co.uk> Jean-Paul Calderone wrote: > On Mon, 04 Sep 2006 17:24:56 +0100, David Hopwood wrote: > >>Jean-Paul Calderone wrote: >> >>>PyGTK would presumably implement its pending call callback by writing a >>>byte to a pipe which it is also passing to poll(). >> >>But doing that in a signal handler context invokes undefined behaviour >>according to POSIX. > > write(2) is explicitly listed as async-signal safe in IEEE Std 1003.1, 2004. I stand corrected. I must have misremembered this. -- David Hopwood From exarkun at divmod.com Mon Sep 4 19:55:41 2006 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Mon, 4 Sep 2006 13:55:41 -0400 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: Message-ID: <20060904175541.1717.1728502156.divmod.quotient.39053@ohm> On Mon, 04 Sep 2006 18:18:41 +0100, Nick Maclaren wrote: >Jean-Paul Calderone wrote: >> On Mon, 04 Sep 2006 17:24:56 +0100, >> David Hopwood > der.co.uk> wrote: >> >Jean-Paul Calderone wrote: >> >> PyGTK would presumably implement its pending call callback by writing a >> >> byte to a pipe which it is also passing to poll(). >> > >> >But doing that in a signal handler context invokes undefined behaviour >> >according to POSIX. >> >> write(2) is explicitly listed as async-signal safe in IEEE Std 1003.1, 2004. >> Was this changed in a later edition? Otherwise, I don't understand what you >> mean by this. 
> >Try looking at the C90 or C99 standard, for a start :-( > >NOTHING may safely be done in a real signal handler, except possibly >setting a value of type static volatile sig_atomic_t. And even that >can be problematic. And note that POSIX defers to C on what the C >languages defines. So, even if the function is async-signal-safe, >the code that calls it can't be! > >POSIX's lists are complete fantasy, anyway. Look at the one that >defines thread-safety, and then try to get your mind around what >exit being thread-safe actually implies (especially with regard to >atexit functions). > Thanks for expounding. Given that it is basically impossible to do anything useful in a signal handler according to the relevant standards (does Python's current signal handler even avoid relying on undefined behavior?), how would you suggest addressing this issue? It seems to me that it is actually possible to do useful things in a signal handler, so long as one accepts that doing so is relying on platform specific behavior. How hard would it be to implement this for the platforms Python supports, rather than for a hypothetical standards-exact platform? Jean-Paul From nmm1 at cus.cam.ac.uk Mon Sep 4 20:44:30 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Mon, 04 Sep 2006 19:44:30 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: Your message of "Mon, 04 Sep 2006 13:55:41 EDT." <20060904175541.1717.1728502156.divmod.quotient.39053@ohm> Message-ID: Jean-Paul Calderone wrote: > > Thanks for expounding. Given that it is basically impossible to do > anything useful in a signal handler according to the relevant standards > (does Python's current signal handler even avoid relying on undefined > behavior?), how would you suggest addressing this issue? Much as you are doing, and I described, but the first step would be to find out what 'most' Python people need for signal handling in threaded programs. 
This is because there is an unavoidable conflict between portability/reliability and functionality. I would definitely block all signals in threads, except for those that are likely to be generated ON the thread (SIGFPE etc.) It is a very good idea not to touch the handling of several of those, because doing so can cause chaos. I would have at least two 'standard' handlers, one of which would simply set a flag and return, and the other of which would abort. Now, NEITHER is a very useful specification, but providing ANY information is risky, which is why it is critical to know what people need. I would not TRUST the blocking of signals, so would set up handlers even when I blocked them, and would do the minimum fiddling in the main thread compatible with decent functionality. I would provide a call to test if the signal flag was set, and another to test and clear it. This would be callable ONLY from the main thread, and that would be checked. It is possible to do better, but that starts needing serious research. > It seems to me that it is actually possible to do useful things in a > signal handler, so long as one accepts that doing so is relying on > platform specific behavior. Unfortunately, that is wrong. That was true under MVS and VMS, but in Unix and Microsoft systems, the problem is that the behaviour is both platform and circumstance-dependent. What you can do reliably depends mostly on what is going on at the time. For example, on many Unix and Microsoft platforms, signals received while you are in the middle of certain functions or system calls, or certain particular signals (often SIGFPE), call the C handler with a bad set of global pointers or similar. I believe that this is one of reasons (perhaps the main one) that some such failures so often cause debuggers to be unable to find the stack pointer. 
I have tracked a few of those down, and have occasionally identified the cause (and even got it fixed!), but it is a murderous task, and I know of few other people who have ever succeeded. > How hard would it be to implement this for the platforms Python supports, > rather than for a hypothetical standards-exact platform? I have seen this effect on OSF/1, IRIX, Solaris, Linux and versions of Microsoft Windows. I have never used a modern BSD, haven't used HP-UX since release 9, and haven't used Microsoft systems seriously in years (though I did hang my new laptop in its GUI fairly easily). As I say, this isn't so much a platform issue as a circumstance one. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From andreas.raab at gmx.de Mon Sep 4 23:36:19 2006 From: andreas.raab at gmx.de (Andreas Raab) Date: Mon, 04 Sep 2006 14:36:19 -0700 Subject: [Python-Dev] Cross-platform math functions? Message-ID: <44FC9C53.5060304@gmx.de> Hi - I'm curious if there is any interest in the Python community to achieve better cross-platform math behavior. A quick test[1] shows a non-surprising difference between the platform implementations. Question: Is there any interest in changing the behavior to produce identical results across platforms (for example by utilizing fdlibm [2])? 
Since I have need for a set of cross-platform math functions I'll probably start with a math-compatible fdlibm module (unless somebody has done that already ;-) Cheers, - Andreas [1] Using Python 2.4: >>> import math >>> math.cos(1.0e32) WinXP: -0.39929634612021897 LinuxX86: -0.49093671143542561 [2] http://www.netlib.org/fdlibm/ From gjcarneiro at gmail.com Tue Sep 5 01:31:06 2006 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Tue, 5 Sep 2006 00:31:06 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: <20060904175541.1717.1728502156.divmod.quotient.39053@ohm> Message-ID: In GLib we have a child watch notification feature that relies on the following signal handler:

    static void
    g_child_watch_signal_handler (int signum)
    {
      child_watch_count ++;

      if (child_watch_init_state == CHILD_WATCH_INITIALIZED_THREADED)
        {
          write (child_watch_wake_up_pipe[1], "B", 1);
        }
      else
        {
          /* We count on the signal interrupting the poll in the same thread. */
        }
    }

Now, we've had this API for a long time already (at least 2.5 years). I'm pretty sure it works well enough on most *nix systems. Even if it works 99% of the time, it's way better than *failing* *100%* of the time, which is what happens now with Python. All I ask is an API to add a callback that Python signal handlers call, from signal context. That much I'm sure is safe. What happens from there on will be out of Python's hands, so Python purist^H^H^H^H^H^H developers cannot be blamed for anything that happens next. You can laugh at PyGTK and GLib all you want for having "unsafe signal handling", I don't care. Regards. On 9/4/06, Nick Maclaren wrote: > Jean-Paul Calderone wrote: > > > > Thanks for expounding. Given that it is basically impossible to do > > anything useful in a signal handler according to the relevant standards > > (does Python's current signal handler even avoid relying on undefined > > behavior?), how would you suggest addressing this issue?
> > Much as you are doing, and I described, but the first step would be > to find out what 'most' Python people need for signal handling in > threaded programs. This is because there is an unavoidable conflict > between portability/reliability and functionality. > > I would definitely block all signals in threads, except for those that > are likely to be generated ON the thread (SIGFPE etc.) It is a very > good idea not to touch the handling of several of those, because doing > so can cause chaos. > > I would have at least two 'standard' handlers, one of which would simply > set a flag and return, and the other of which would abort. Now, NEITHER > is a very useful specification, but providing ANY information is risky, > which is why it is critical to know what people need. > > I would not TRUST the blocking of signals, so would set up handlers even > when I blocked them, and would do the minimum fiddling in the main > thread compatible with decent functionality. > > I would provide a call to test if the signal flag was set, and another > to test and clear it. This would be callable ONLY from the main thread, > and that would be checked. > > It is possible to do better, but that starts needing serious research. > > > It seems to me that it is actually possible to do useful things in a > > signal handler, so long as one accepts that doing so is relying on > > platform specific behavior. > > Unfortunately, that is wrong. That was true under MVS and VMS, but > in Unix and Microsoft systems, the problem is that the behaviour is > both platform and circumstance-dependent. What you can do reliably > depends mostly on what is going on at the time. > > For example, on many Unix and Microsoft platforms, signals received > while you are in the middle of certain functions or system calls, or > certain particular signals (often SIGFPE), call the C handler with a > bad set of global pointers or similar. 
I believe that this is one of > reasons (perhaps the main one) that some such failures so often cause > debuggers to be unable to find the stack pointer. > > I have tracked a few of those down, and have occasionally identified > the cause (and even got it fixed!), but it is a murderous task, and > I know of few other people who have ever succeeded. > > > How hard would it be to implement this for the platforms Python supports, > > rather than for a hypothetical standards-exact platform? > > I have seen this effect on OSF/1, IRIX, Solaris, Linux and versions > of Microsoft Windows. I have never used a modern BSD, haven't used > HP-UX since release 9, and haven't used Microsoft systems seriously > in years (though I did hang my new laptop in its GUI fairly easily). > > As I say, this isn't so much a platform issue as a circumstance one. > > > Regards, > Nick Maclaren, > University of Cambridge Computing Service, > New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. > Email: nmm1 at cam.ac.uk > Tel.: +44 1223 334761 Fax: +44 1223 334679 > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/gjcarneiro%40gmail.com > From tim.peters at gmail.com Tue Sep 5 01:06:50 2006 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 4 Sep 2006 19:06:50 -0400 Subject: [Python-Dev] Cross-platform math functions? In-Reply-To: <44FC9C53.5060304@gmx.de> References: <44FC9C53.5060304@gmx.de> Message-ID: <1f7befae0609041606m13c5c24bm59ce229b27f32e9d@mail.gmail.com> [Andreas Raab] > I'm curious if there is any interest in the Python community to achieve > better cross-platform math behavior. A quick test[1] shows a > non-surprising difference between the platform implementations. 
> Question: Is there any interest in changing the behavior to produce > identical results across platforms (for example by utilizing fdlibm > [2])? Since I have need for a set of cross-platform math functions I'll > probably start with a math-compatible fdlibm module (unless somebody has > done that already ;-) Package a Python wrapper and see how popular it becomes. Some reasons against trying to standardize on fdlibm were explained here: http://mail.python.org/pipermail/python-list/2005-July/290164.html Bottom line is I suspect that when it comes to bit-for-bit reproducibility, fewer people care about that x-platform than care about it x-language on the box they use. Nothing wrong with different modules for people with different desires. From tim.peters at gmail.com Tue Sep 5 04:25:01 2006 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 4 Sep 2006 22:25:01 -0400 Subject: [Python-Dev] gcc 4.2 exposes signed integer overflows In-Reply-To: <200608301242.28648.anthony@interlink.com.au> References: <20060826190600.0E75911002B@bromo.msbb.uc.edu> <20060829201022.GA22579@code0.codespeak.net> <1f7befae0608291557l5b04a8f6wd1371e62a5c9c69c@mail.gmail.com> <200608301242.28648.anthony@interlink.com.au> Message-ID: <1f7befae0609041925h61c184f1m8716951740b00b39@mail.gmail.com> [Tim Peters] >> Speaking of which, I saw no feedback on the proposed patch in >> >> http://mail.python.org/pipermail/python-dev/2006-August/068502.html >> >> so I'll just check that in tomorrow. [Anthony Baxter] > This should also be backported to release24-maint and release23-maint. Let me > know if you can't do the backport... Done in rev 51711 on the 2.5 branch. Done in rev 51715 on the 2.4 branch. Done in rev 51716 on the trunk, although in the LONG_MIN way (which is less obscure, but a more "radical" code change). I don't care about the 2.3 branch, so leaving that to someone who does. Merge rev 51711 from the 2.5 branch. It will generate a conflict on Misc/NEWS. 
Easiest to revert Misc/NEWS then and just copy/paste the little blurb from 2.5 news at the appropriate place: """ - Overflow checking code in integer division ran afoul of new gcc optimizations. Changed to be more standard-conforming. """ From rhamph at gmail.com Tue Sep 5 05:28:37 2006 From: rhamph at gmail.com (Adam Olsen) Date: Mon, 4 Sep 2006 21:28:37 -0600 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: Message-ID: On 9/4/06, Nick Maclaren wrote: > Jean-Paul Calderone wrote: > > On Mon, 04 Sep 2006 17:24:56 +0100, > > David Hopwood > der.co.uk> wrote: > > >Jean-Paul Calderone wrote: > > >> PyGTK would presumably implement its pending call callback by writing a > > >> byte to a pipe which it is also passing to poll(). > > > > > >But doing that in a signal handler context invokes undefined behaviour > > >according to POSIX. > > > > write(2) is explicitly listed as async-signal safe in IEEE Std 1003.1, 2004. > > Was this changed in a later edition? Otherwise, I don't understand what you > > mean by this. > > Try looking at the C90 or C99 standard, for a start :-( > > NOTHING may safely be done in a real signal handler, except possibly > setting a value of type static volatile sig_atomic_t. And even that > can be problematic. And note that POSIX defers to C on what the C > languages defines. So, even if the function is async-signal-safe, > the code that calls it can't be! I don't believe that is true. It says (or at least SUSv3 says) that: """ 3.26 Async-Signal-Safe Function A function that may be invoked, without restriction, from signal-catching functions. No function is async-signal-safe unless explicitly described as such.""" Sure, it doesn't give me a warm-fuzzy feeling of knowing why it works, but we can expect that it magically does. My understanding is that threading in general is the same way... Of course that doesn't preclude bugs in the various implementations, but those trump the standards anyway.
-- Adam Olsen, aka Rhamphoryncus From rhamph at gmail.com Tue Sep 5 05:41:13 2006 From: rhamph at gmail.com (Adam Olsen) Date: Mon, 4 Sep 2006 21:41:13 -0600 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: <20060904175541.1717.1728502156.divmod.quotient.39053@ohm> Message-ID: On 9/4/06, Gustavo Carneiro wrote: > Now, we've had this API for a long time already (at least 2.5 > years). I'm pretty sure it works well enough on most *nix systems. > Event if it works 99% of the times, it's way better than *failing* > *100%* of the times, which is what happens now with Python. Failing 99% of the time is as bad as failing 100% of the time, if your goal is to eliminate the short timeout on poll(). 1% is quite a lot, and it would probably have an annoying tendency to trigger repeatedly when the user does certain things (not reproducible by you of course). That said, I do hope we can get 100%, or at least enough nines that we can increase the timeout significantly. -- Adam Olsen, aka Rhamphoryncus From nnorwitz at gmail.com Tue Sep 5 06:12:43 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Mon, 4 Sep 2006 21:12:43 -0700 Subject: [Python-Dev] [Python-checkins] TRUNK IS UNFROZEN, available for 2.6 work if you are so inclined In-Reply-To: References: <200608180023.14037.anthony@interlink.com.au> Message-ID: On 8/18/06, Georg Brandl wrote: > > I'd like to commit this. It fixes bug 1542051. > > Index: Objects/exceptions.c ... Georg, Did you still want to fix this? I don't remember anything happening with it. I don't see where _PyObject_GC_TRACK is called, so I'm not sure why _PyObject_GC_UNTRACK is necessary. You should probably add the patch to the bug report and we can discuss there. 
n From nnorwitz at gmail.com Tue Sep 5 06:14:34 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Mon, 4 Sep 2006 21:14:34 -0700 Subject: [Python-Dev] no remaining issues blocking 2.5 release In-Reply-To: <20060815164114.GB23991@niemeyer.net> References: <20060815164114.GB23991@niemeyer.net> Message-ID: Gustavo, Did you still want this addressed? Anthony and I made some comments on the bug/patch, but nothing has been updated. n -- On 8/15/06, Gustavo Niemeyer wrote: > > If you have issues, respond ASAP! The release candidate is planned to > > be cut this Thursday/Friday. There are only a few more days before > > code freeze. A branch will be made when the release candidate is cut. > > I'd like to see problem #1531862 fixed. The bug is clear and the > fix should be trivial. I can commit a fix tonight, if the subprocess > module author/maintainer is unavailable to check it out. > > -- > Gustavo Niemeyer > http://niemeyer.net > From nnorwitz at gmail.com Tue Sep 5 06:24:16 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Mon, 4 Sep 2006 21:24:16 -0700 Subject: [Python-Dev] 2.5 status Message-ID: There are 3 bugs currently listed in PEP 356 as blocking: http://python.org/sf/1551432 - __unicode__ breaks on exception classes http://python.org/sf/1550938 - improper exception w/relative import http://python.org/sf/1541697 - sgmllib regexp bug causes hang Does anyone want to fix the sgmllib issue? If not, we should revert this week before c2 is cut. I'm hoping that we will have *no changes* in 2.5 final from c2. Should there be any bugs/patches added to or removed from the list? The buildbots are currently humming along, but I believe all 3 versions (2.4, 2.5, and 2.6) are fine. Test out 2.5c1+ and report all bugs! n From andreas.raab at gmx.de Tue Sep 5 07:03:11 2006 From: andreas.raab at gmx.de (Andreas Raab) Date: Mon, 04 Sep 2006 22:03:11 -0700 Subject: [Python-Dev] Cross-platform math functions?
In-Reply-To: <1f7befae0609041606m13c5c24bm59ce229b27f32e9d@mail.gmail.com> References: <44FC9C53.5060304@gmx.de> <1f7befae0609041606m13c5c24bm59ce229b27f32e9d@mail.gmail.com> Message-ID: <44FD050F.20901@gmx.de> Tim Peters wrote: > Package a Python wrapper and see how popular it becomes. Some reasons > against trying to standardize on fdlibm were explained here: > > http://mail.python.org/pipermail/python-list/2005-July/290164.html Thanks, these are good points. About speed, do you have any good benchmarks available? In my experience fdlibm is quite reasonable for speed in the context of use by dynamic languages (i.e., counting allocation overheads, lookup and send performance etc) but since I'm not a Python expert I'd appreciate some help with realistic benchmarks. > Bottom line is I suspect that when it comes to bit-for-bit > reproducibility, fewer people care about that x-platform than care > about it x-language on the box they use. Nothing wrong with different > modules for people with different desires. Agreed. Thus my question if someone had already done this ;-) Cheers, - Andreas From nmm1 at cus.cam.ac.uk Tue Sep 5 10:51:43 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Tue, 05 Sep 2006 09:51:43 +0100 Subject: [Python-Dev] Cross-platform math functions? Message-ID: Andreas Raab wrote: > > I'm curious if there is any interest in the Python community to achieve > better cross-platform math behavior. A quick test[1] shows a > non-surprising difference between the platform implementations. > Question: Is there any interest in changing the behavior to produce > identical results across platforms (for example by utilizing fdlibm > [2])? 
Since I have need for a set of cross-platform math functions I'll > probably start with a math-compatible fdlibm module (unless somebody has > done that already ;-) > > [1] Using Python 2.4: > >>> import math > >>> math.cos(1.0e32) > > WinXP: -0.39929634612021897 > LinuxX86: -0.49093671143542561 Well, I hope not, but I am afraid that there is :-( The word "better" is emotive and inaccurate. Such calculations are numerically meaningless, and merely encourage the confusion between consistency and correctness. There is a strong sense in which giving random results between -1 and 1 would be better. Now, I am not saying that you don't have a requirement for consistency but I am saying that confusing it with correctness (as has been fostered by IEEE 754, Java etc.) is harmful. One of the great advantages of the wide variety of arithmetics available in the 1970s is that numerical testing was easier and more reliable - if you got wildly different results on two platforms, you got a strong pointer to numerical problems. That viewpoint is regarded as heresy nowadays, but used not to be! Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From nmm1 at cus.cam.ac.uk Tue Sep 5 11:07:12 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Tue, 05 Sep 2006 10:07:12 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions Message-ID: "Adam Olsen" wrote: > On 9/4/06, Gustavo Carneiro wrote: > > > Now, we've had this API for a long time already (at least 2.5 > > years). I'm pretty sure it works well enough on most *nix systems. > > Event if it works 99% of the times, it's way better than *failing* > > *100%* of the times, which is what happens now with Python. > > Failing 99% of the time is as bad as failing 100% of the time, if your > goal is to eliminate the short timeout on poll(). 
1% is quite a lot, > and it would probably have an annoying tendency to trigger repeatedly > when the user does certain things (not reproducible by you of course). That can make it a lot WORSE than repeated failure. At least with hard failures, you have some hope of tracking them down in a reasonable time. The problem with exception handling code that goes off very rarely, under non-reproducible circumstances, is that it is almost untestable and that bugs in it are positive nightmares. I have been inflicted with quite a large number in my time, and have a fairly good success rate, but the number of people who know the tricks is decreasing. Consider the (real) case where an unpredictable process on a large server (64 CPUs) was failing about twice a week (detectably), with no indication of how many failures were giving wrong answers. We replaced dozens of DIMMs, took days of down time and got nowhere; it then went hard (i.e. one failure a day). After a week's total down time, with me spending 100% of my time on it and the vendor allocating an expert at high priority, we cracked it. We were very lucky to find it so fast. I could give you other examples that were/are there years and decades later, because the pain threshold never got high enough to dedicate the time (and the VERY few people with experience). I know of at least one such problem in generic TCP/IP (i.e. on Linux, IRIX, AIX and possibly Solaris) that has been there for decades and causes occasional failure in most networked applications/protocols. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From andreas.raab at gmx.de Tue Sep 5 11:17:25 2006 From: andreas.raab at gmx.de (Andreas Raab) Date: Tue, 05 Sep 2006 02:17:25 -0700 Subject: [Python-Dev] Cross-platform math functions?
In-Reply-To: References: Message-ID: <44FD40A5.8090406@gmx.de> Nick Maclaren wrote: > The word "better" is emotive and inaccurate. Such calculations are > numerically meaningless, and merely encourage the confusion between > consistency and correctness. There is a strong sense in which giving > random results between -1 and 1 would be better. I did, of course, mean more consistent (and yes, random consistent results would be "better" by this definition and indeed I would prefer that over inconsistent but more accurate results ;-) Cheers, - Andreas From gjcarneiro at gmail.com Tue Sep 5 15:44:14 2006 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Tue, 5 Sep 2006 14:44:14 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: <20060904175541.1717.1728502156.divmod.quotient.39053@ohm> Message-ID: On 9/5/06, Adam Olsen wrote: > On 9/4/06, Gustavo Carneiro wrote: > > Now, we've had this API for a long time already (at least 2.5 > > years). I'm pretty sure it works well enough on most *nix systems. > > Event if it works 99% of the times, it's way better than *failing* > > *100%* of the times, which is what happens now with Python. > > Failing 99% of the time is as bad as failing 100% of the time, if your > goal is to eliminate the short timeout on poll(). 1% is quite a lot, > and it would probably have an annoying tendency to trigger repeatedly > when the user does certain things (not reproducible by you of course). > > That said, I do hope we can get 100%, or at least enough nines that we > can increase the timeout significantly. Anyway, I was speaking hypothetically. I'm pretty sure writing to a pipe is async signal safe. It is the oldest trick in the book, everyone uses it. I don't have to see a written signed contract to know that it works. 
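A minimal sketch of the pipe-writing technique described above, in Python rather than C, assuming a Unix platform and a current Python 3. It uses `signal.set_wakeup_fd()` from the standard library so that the interpreter's own C-level handler writes one byte (the signal number) to the pipe. The helper names `install_wakeup_pipe` and `wait_for_signal` are made up for illustration, and the sketch says nothing about whether the technique is reliable on every platform, which is the point being argued in this thread.

```python
import os
import select
import signal


def install_wakeup_pipe(signum):
    """Arrange for a byte to be written to a pipe whenever signum arrives.

    Hypothetical helper: returns the read end of the pipe.
    Must be called from the main thread (a requirement of the signal module).
    """
    read_fd, write_fd = os.pipe()
    # Both ends are non-blocking so a full pipe can never stall the handler.
    os.set_blocking(read_fd, False)
    os.set_blocking(write_fd, False)
    # Install a do-nothing Python handler; without it the default action
    # for the signal (usually terminating the process) would still apply.
    signal.signal(signum, lambda received, frame: None)
    # Ask the interpreter's C-level handler to write the signal number here.
    signal.set_wakeup_fd(write_fd)
    return read_fd


def wait_for_signal(read_fd, timeout):
    """Wait in select() until the pipe is readable; return the bytes read."""
    ready, _, _ = select.select([read_fd], [], [], timeout)
    return os.read(read_fd, 64) if ready else b""
```

A caller would pass the returned descriptor to its own poll() or select() loop, for example `fd = install_wakeup_pipe(signal.SIGUSR1)` followed by `wait_for_signal(fd, 5.0)`, instead of waking up on a short timeout.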
Here's a list of web sites google found me that talk about this problem: This one describes the pipe writing technique: http://www.cocoadev.com/index.pl?SignalSafety This one presents a list of "The only routines that POSIX guarantees to be Async-Signal-Safe": http://docsun.cites.uiuc.edu/sun_docs/C/solaris_9/SUNWdev/MTP/p40.html#GEN-95948 Also here: http://www.cs.usyd.edu.au/cgi-bin/man.cgi?section=5&topic=attributes This is all the evidence that I need. And again I reiterate that whether or not async safety can be achieved in practice for all platforms is not Python's problem. Although I believe writing to a pipe is 100% reliable for most platforms. Even if it is not, any mission critical application relying on signals for correct behaviour should be rewritten to use unix sockets instead; end of argument. From nmm1 at cus.cam.ac.uk Tue Sep 5 15:53:45 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Tue, 05 Sep 2006 14:53:45 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions Message-ID: "Gustavo Carneiro" wrote: > > Anyway, I was speaking hypothetically. I'm pretty sure writing to a > pipe is async signal safe. It is the oldest trick in the book, > everyone uses it. I don't have to see a written signed contract to > know that it works. Ah. Well, I can assure you that it's not the oldest trick in the book, and not everyone uses it. > This is all the evidence that I need. And again I reiterate that > whether or not async safety can be achieved in practice for all > platforms is not Python's problem. I wish you the joy of trying to report a case where it doesn't work to a large vendor and get them to accept that it is a bug. > Although I believe writing to a > pipe is 100% reliable for most platforms. Even if it is not, any > mission critical application relying on signals for correct behaviour > should be rewritten to use unix sockets instead; end of argument. Er, no. 
There are lots of circumstances where that isn't feasible, such as wanting to close down an application cleanly when the scheduler sends it a SIGXCPU. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From gustavo at niemeyer.net Tue Sep 5 17:28:33 2006 From: gustavo at niemeyer.net (Gustavo Niemeyer) Date: Tue, 5 Sep 2006 12:28:33 -0300 Subject: [Python-Dev] no remaining issues blocking 2.5 release In-Reply-To: References: <20060815164114.GB23991@niemeyer.net> Message-ID: <20060905152833.GA12378@niemeyer.net> > Did you still want this addressed? Anthony and I made some comments > on the bug/patch, but nothing has been updated. I was waiting because I got unassigned from the bug, so I thought the maintainer was stepping up. I'll commit a fix for it today. Thanks for pinging me, -- Gustavo Niemeyer http://niemeyer.net From jimjjewett at gmail.com Tue Sep 5 18:08:19 2006 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 5 Sep 2006 12:08:19 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: References: Message-ID: Reversing the order of the return tuple will break the alignment with split/rsplit. Why not just change which of the three strings holds the remainder in the not-found case? In rc1, "d".rpartition(".") --> ('d', '', '') If that changes to "d".rpartition(".") --> ('', '', 'd') then (1) the loop will terminate (2) rpartition will be more parallel to partition (and split), (3) people who used rpartition without looping to termination (and therefore didn't catch the problem) will still be able to use their existing working code. (4) the existing docstring would remain correct, though it could still be improved. (It says "returns S and two empty strings", but doesn't specify the order.) 
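A small sketch of the not-found behaviour proposed above, as it works in released Python versions (2.5 final and later): `partition` leaves the whole string in the first slot, `rpartition` leaves it in the last. The helper name `last_component` is hypothetical and only there to show the consequence for calling code.

```python
def last_component(text, sep="."):
    """Return the part of text after the last sep (hypothetical helper).

    When sep is absent, rpartition() returns ('', '', text), so the
    last slot still holds the whole string and no special case is needed.
    """
    _, _, last = text.rpartition(sep)
    return last


print("d".partition("."))        # ('d', '', '')
print("d".rpartition("."))       # ('', '', 'd')
print("a.b.c".rpartition("."))   # ('a.b', '.', 'c')
print(last_component("a.b.c"))   # c
print(last_component("d"))       # d
```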
-jJ From rhettinger at ewtllc.com Tue Sep 5 18:13:49 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Tue, 05 Sep 2006 09:13:49 -0700 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: References: Message-ID: <44FDA23D.2060602@ewtllc.com> Jim Jewett wrote: > >Why not just change which of the three strings holds the remainder in >the not-found case? > > That was the only change submitted. Are you happy with what was checked-in? Raymond From jdahlin at async.com.br Tue Sep 5 18:18:20 2006 From: jdahlin at async.com.br (Johan Dahlin) Date: Tue, 05 Sep 2006 13:18:20 -0300 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: Message-ID: <44FDA34C.6030605@async.com.br> Nick Maclaren wrote: > "Gustavo Carneiro" wrote: >> Anyway, I was speaking hypothetically. I'm pretty sure writing to a >> pipe is async signal safe. It is the oldest trick in the book, >> everyone uses it. I don't have to see a written signed contract to >> know that it works. > > Ah. Well, I can assure you that it's not the oldest trick in the book, > and not everyone uses it. > >> This is all the evidence that I need. And again I reiterate that >> whether or not async safety can be achieved in practice for all >> platforms is not Python's problem. > > I wish you the joy of trying to report a case where it doesn't work > to a large vendor and get them to accept that it is a bug. Are you saying that we should let less commonly used platforms dictate features and functionality for the popular ones? I mean, who uses HP/UX, SCO and [insert your favorite flavor] as a modern desktop system where this particular bug makes a difference? Can't this just be enabled for platforms where it's known to work and let Python as it currently is for the users of these legacy systems ? 
Johan From jimjjewett at gmail.com Tue Sep 5 18:47:26 2006 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 5 Sep 2006 12:47:26 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FDA23D.2060602@ewtllc.com> References: <44FDA23D.2060602@ewtllc.com> Message-ID: > Jim Jewett wrote: > >Why not just change which of the three strings holds the remainder in > >the not-found case? On 9/5/06, Raymond Hettinger wrote: > That was the only change submitted. > Are you happy with what was checked-in? This change looks wrong: PyDoc_STRVAR(rpartition__doc__, -"S.rpartition(sep) -> (head, sep, tail)\n\ +"S.rpartition(sep) -> (tail, sep, head)\n\ It looks like the code itself does the right thing, but I wasn't quite confident of that. -jJ From rhettinger at ewtllc.com Tue Sep 5 19:10:47 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Tue, 05 Sep 2006 10:10:47 -0700 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: References: <44FDA23D.2060602@ewtllc.com> Message-ID: <44FDAF97.3050502@ewtllc.com> > > This change looks wrong: > > PyDoc_STRVAR(rpartition__doc__, > -"S.rpartition(sep) -> (head, sep, tail)\n\ > +"S.rpartition(sep) -> (tail, sep, head)\n\ > > It looks like the code itself does the right thing, but I wasn't quite > confident of that. > It is correct. There may be some confusion in terminology. Head and tail do not mean left-side or right-side. Instead, they refer to the "small part chopped-off" and "the rest that is still choppable". Think of head and tail in the sense of car and cdr. A post-condition invariant for both str.partition() and str.rpartition() is: assert sep not in head For non-looping cases, users will likely to use different variable names when they unpack the tuple: left, middle, right = s.rpartition(p) But when they perform multiple partitions, the "tail" or "rest" terminology is more appropriate for the part of the string that may still contain separators. 
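The looping use case mentioned above can be sketched as follows. This is only an illustration with a made-up function name (`pieces_from_right`), not code from the patch; it relies on the released behaviour where a failed `rpartition` returns `('', '', s)`, and it checks that the piece chopped off in each step never contains the separator.

```python
def pieces_from_right(s, sep):
    """Yield the sep-delimited pieces of s, rightmost first.

    Hypothetical helper built on repeated str.rpartition() calls.
    """
    rest = s
    while True:
        rest, found, piece = rest.rpartition(sep)
        # The chopped-off piece never contains the separator.
        assert sep not in piece
        yield piece
        if not found:
            # Separator absent: piece was the final remainder, so stop.
            break


print(list(pieces_from_right("a.b.c", ".")))   # ['c', 'b', 'a']
print(list(pieces_from_right("abc", ".")))     # ['abc']
```

Naming the unpacked values `rest` and `piece` here sidesteps the head/tail question entirely, which is roughly what the later messages in the thread argue for.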
Raymond From mcherm at mcherm.com Tue Sep 5 19:24:46 2006 From: mcherm at mcherm.com (Michael Chermside) Date: Tue, 05 Sep 2006 10:24:46 -0700 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() Message-ID: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> Jim Jewett writes: > This change [in docs] looks wrong: > > PyDoc_STRVAR(rpartition__doc__, > -"S.rpartition(sep) -> (head, sep, tail)\n\ > +"S.rpartition(sep) -> (tail, sep, head)\n\ Raymond Hettinger replies: > It is correct. There may be some confusion in terminology. Head > and tail do not mean left-side or right-side. Instead, they refer to > the "small part chopped-off" and "the rest that is still choppable". > Think of head and tail in the sense of car and cdr. It is incorrect. The purpose of documentation is to explain things to users, and documentation which fails to achieve this is not "correct". The level of confusion generated by using "head" to refer to the last part of the string and "tail" to refer to the beginning is quite significant. How about something like this: S.partition(sep) -> (head, sep, tail) S.rpartition(sep) -> (tail, sep, rest) Perhaps someone else can find something clearer than my suggestion, but in my own head, the terms "head" and "tail" are tightly bound with the idea of beginning and end (respectively) rather than with the idea of "small part chopped off" and "big part that is still choppable".
-- Michael Chermside From barry at python.org Tue Sep 5 19:26:15 2006 From: barry at python.org (Barry Warsaw) Date: Tue, 5 Sep 2006 13:26:15 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FDAF97.3050502@ewtllc.com> References: <44FDA23D.2060602@ewtllc.com> <44FDAF97.3050502@ewtllc.com> Message-ID: <2BFAE30C-5B69-416A-AD76-7C5AD7D53DC7@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sep 5, 2006, at 1:10 PM, Raymond Hettinger wrote: >> This change looks wrong: >> >> PyDoc_STRVAR(rpartition__doc__, >> -"S.rpartition(sep) -> (head, sep, tail)\n\ >> +"S.rpartition(sep) -> (tail, sep, head)\n\ >> >> It looks like the code itself does the right thing, but I wasn't >> quite >> confident of that. >> > It is correct. There may be some confusion in terminology. Head and > tail do not mean left-side or right-side. Instead, they refer to the > "small part chopped-off" and "the rest that is still choppable". Think > of head and tail in the sense of car and cdr. > > A post-condition invariant for both str.partition() and > str.rpartition() is: > > assert sep not in head > > For non-looping cases, users will likely to use different variable > names > when they unpack the tuple: > > left, middle, right = s.rpartition(p) > > But when they perform multiple partitions, the "tail" or "rest" > terminology is more appropriate for the part of the string that may > still contain separators. ISTM this is just begging for newbie (and maybe not-so-newbie) confusion. Why not just document both as returning (left, sep, right) which seems the most obvious description of what the methods return? 
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRP2zPHEjvBPtnXfVAQKpvQP/X1Vg9G4gZLl9R7/fnevmfeszTbqVk1Bq V7aXYm5pTFiD27cKV2e7MKZPifob6Pg8NPjsvAh6jZU5Uj0BUQhIwgDXZpcivsTM MykyPz8oVpSLRhu5xfYU1IZjbogoKfPQ04FkqWgtM2QUqKjiLcvwzPnzLNLVxx9r v2LplvrqJyc= =Tckf -----END PGP SIGNATURE----- From rhettinger at ewtllc.com Tue Sep 5 19:46:01 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Tue, 05 Sep 2006 10:46:01 -0700 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <2BFAE30C-5B69-416A-AD76-7C5AD7D53DC7@python.org> References: <44FDA23D.2060602@ewtllc.com> <44FDAF97.3050502@ewtllc.com> <2BFAE30C-5B69-416A-AD76-7C5AD7D53DC7@python.org> Message-ID: <44FDB7D9.5040108@ewtllc.com> > ISTM this is just begging for newbie (and maybe not-so-newbie) > confusion. Why not just document both as returning (left, sep, > right) which seems the most obvious description of what the methods > return? I'm fine with that (though it's a little sad that we think the rather basic concepts of head and tail are beyond the grasp of typical pythonistas). Changing to left/sep/right will certainly disambiguate questions about the ordering of the return tuple. OTOH, there is some small loss in that the head/tail terminology is highly suggestive of how to use the function when making successive partitions. Raymond From fdrake at acm.org Tue Sep 5 19:51:49 2006 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue, 5 Sep 2006 13:51:49 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> Message-ID: <200609051351.50494.fdrake@acm.org> On Tuesday 05 September 2006 13:24, Michael Chermside wrote: > How about something like this: > > S.partition(sep) -> (head, sep, tail) > S.rpartition(sep) -> (tail, sep, rest) I think I prefer: S.partition(sep) -> (head, sep, rest) S.rpartition(sep) -> (tail, sep, rest) Here, "rest" is always used for "what remains"; head/tail are somewhat more clear here I think. -Fred -- Fred L. Drake, Jr. From barry at python.org Tue Sep 5 19:52:45 2006 From: barry at python.org (Barry Warsaw) Date: Tue, 5 Sep 2006 13:52:45 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FDB7D9.5040108@ewtllc.com> References: <44FDA23D.2060602@ewtllc.com> <44FDAF97.3050502@ewtllc.com> <2BFAE30C-5B69-416A-AD76-7C5AD7D53DC7@python.org> <44FDB7D9.5040108@ewtllc.com> Message-ID: <76BC85F2-2184-476C-8059-A1944BBDD194@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sep 5, 2006, at 1:46 PM, Raymond Hettinger wrote: >> ISTM this is just begging for newbie (and maybe not-so-newbie) >> confusion. Why not just document both as returning (left, sep, >> right) which seems the most obvious description of what the >> methods return? > > > I'm fine with that (though it's a little sad that we think the > rather basic concepts of head and tail are beyond the grasp of > typical pythonistas). > > Changing to left/sep/right will certainly disambiguate questions > about the ordering of the return tuple. OTOH, there is some small > loss in that the head/tail terminology is highly suggestive of how > to use the function when making succesive partitions. Personally, I'd rather the docstring be clear and concise rather than suggestive of use cases. 
IMO, the latter would be better served as an example in the LaTeX documentation. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRP25cXEjvBPtnXfVAQJ4EwQAuKnVxtyabdtAv/Eu9CcZ8EkcwCJYOoAT DmgMWeml861Sn4qN6NV1vMKbXljxiKqoSBgbKdpU+FRb6TeNiCisuWA0Q9xoOfsj Jyvy3XN54WXCUBNBnfsfUROPqxjiNGnKxYUzx2a+pjkeSSSZxDzbuplU+2ijB6w4 HJWIT4JLldA= =u6iU -----END PGP SIGNATURE----- From fdrake at acm.org Tue Sep 5 19:55:17 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue, 5 Sep 2006 13:55:17 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FDB7D9.5040108@ewtllc.com> References: <2BFAE30C-5B69-416A-AD76-7C5AD7D53DC7@python.org> <44FDB7D9.5040108@ewtllc.com> Message-ID: <200609051355.18117.fdrake@acm.org> On Tuesday 05 September 2006 13:46, Raymond Hettinger wrote: > Changing to left/sep/right will certainly disambiguate questions about left/right is definitely not helpful. It's also ambiguous in the case of .rpartition(), where left and right in the input and result are different. > the ordering of the return tuple. OTOH, there is some small loss in > that the head/tail terminology is highly suggestive of how to use the > function when making successive partitions. See my previous note in this thread for another suggestion. -Fred -- Fred L. Drake, Jr. From jimjjewett at gmail.com Tue Sep 5 20:02:31 2006 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 5 Sep 2006 14:02:31 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <200609051351.50494.fdrake@acm.org> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> Message-ID: On 9/5/06, Fred L. Drake, Jr. wrote: > S.partition(sep) -> (head, sep, rest) > S.rpartition(sep) -> (tail, sep, rest) > Here, "rest" is always used for "what remains"; head/tail are somewhat more > clear here I think.
Then shouldn't rpartition be S.rpartition(sep) -> (rest, sep, tail) Another possibility is data (for head/tail) and unparsed (for rest). S.partition(sep) -> (data, sep, unparsed) S.rpartition(sep) -> (unparsed, sep, data) I'm not sure which is worse -- (1) distinguishing between tail and rest (2) using (overly generic) jargon like unparsed and data. Whatever the final decision, it would probably be best to add an example to the docstring. "a.b.c".rpartition(".") -> ("a.b", ".", "c") -jJ From rhettinger at ewtllc.com Tue Sep 5 20:06:19 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Tue, 05 Sep 2006 11:06:19 -0700 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> Message-ID: <44FDBC9B.6050406@ewtllc.com> > > Then shouldn't rpartition be S.rpartition(sep) -> (rest, sep, tail) Gads, the cure is worse than the disease. car and cdr are starting to look pretty good ;-) Raymond From fdrake at acm.org Tue Sep 5 20:10:33 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue, 5 Sep 2006 14:10:33 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> Message-ID: <200609051410.34201.fdrake@acm.org> On Tuesday 05 September 2006 14:02, Jim Jewett wrote: > Then shouldn't rpartition be S.rpartition(sep) -> (rest, sep, tail) Whichever matches reality, sure. I've lost track of the rpartition() result order. --sigh-- > Another possibility is data (for head/tail) and unparsed (for rest). > > S.partition(sep) -> (data, sep, unparsed) > S.rpartition(sep) -> (unparsed, sep, data) It's all data, so I think that's too contrived. > I'm not sure which is worse -- > (1) distinguishing between tail and rest > (2) using (overly generic) jargon like unparsed and data. 
I don't see the distinction between tail and rest as problematic. But I've not used lisp for a long time. > Whatever the final decision, it would probably be best to add an > example to the docstring. "a.b.c".rpartition(".") -> ("a.b", ".", > "c") Agreed. -Fred -- Fred L. Drake, Jr. From barry at python.org Tue Sep 5 20:12:16 2006 From: barry at python.org (Barry Warsaw) Date: Tue, 5 Sep 2006 14:12:16 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FDBC9B.6050406@ewtllc.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDBC9B.6050406@ewtllc.com> Message-ID: <6B0AEAE3-A77E-4CE3-956E-14CF31F26FD8@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sep 5, 2006, at 2:06 PM, Raymond Hettinger wrote: >> Then shouldn't rpartition be S.rpartition(sep) -> (rest, sep, tail) > > Gads, the cure is worse than the disease. > > car and cdr are starting to look pretty good ;-) LOL, the lisper in me likes that too, but I don't think it'll work. :) Fred's disagreement notwithstanding, I still like (left, sep, right), but another alternative comes to mind after actually reading the docstring for rpartition : (before, sep, after). Now, that's not ambiguous is it? Seems to work for both partition and rpartition. 
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRP2+AHEjvBPtnXfVAQLiPAP+N80jHkoT5VNTtX1h2cqD4pONz+j2maCI QXDBoODucxLDPrig8FJ3c6IcT+Uapifu8Rrvd7Vm8gSPMUsMqAgAqhqNDbXTkHVH xLk31en2k2fdiCQKQyKJSjE1R1CaFCezByV29FK3fWvqrrxObISRnsxf/wXB6Czu pOUNSA9LLKo= =g+iz -----END PGP SIGNATURE----- From Scott.Daniels at Acm.Org Tue Sep 5 20:16:56 2006 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Tue, 05 Sep 2006 11:16:56 -0700 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: <44FDA34C.6030605@async.com.br> References: <44FDA34C.6030605@async.com.br> Message-ID: Johan Dahlin wrote: > Nick Maclaren wrote: >> "Gustavo Carneiro" wrote: >>> .... I'm pretty sure writing to a pipe is async signal safe. It is the >>> oldest trick in the book, everyone uses it. I ... know that it works. >> Ah. Well, I can assure you that it's not the oldest trick in the book, >> and not everyone uses it. > ... > Can't this just be enabled for platforms where it's known to work and let > Python as it currently is for the users of these legacy systems ? Ah, but that _is_ the current state of affairs. .5 :-) -- Scott David Daniels Scott.Daniels at Acm.Org From jjl at pobox.com Tue Sep 5 20:22:11 2006 From: jjl at pobox.com (John J Lee) Date: Tue, 5 Sep 2006 19:22:11 +0100 (GMT Standard Time) Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <200609051351.50494.fdrake@acm.org> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> Message-ID: On Tue, 5 Sep 2006, Fred L. Drake, Jr. 
wrote: > On Tuesday 05 September 2006 13:24, Michael Chermside wrote: > > How about something like this: > > > > S.partition(sep) -> (head, sep, tail) > > S.rpartition(sep) -> (tail, sep, rest) > > I think I prefer: > > S.partition(sep) -> (head, sep, rest) > S.rpartition(sep) -> (tail, sep, rest) > > Here, "rest" is always used for "what remains"; head/tail are somewhat more > clear here I think. But isn't rest in the wrong place there, for rpartition: that's not the string that you might typically call .rpartition() on a second time. How about: S.partition(sep) -> (left, sep, rest) S.rpartition(sep) -> (rest, sep, right) John From brett at python.org Tue Sep 5 20:25:53 2006 From: brett at python.org (Brett Cannon) Date: Tue, 5 Sep 2006 11:25:53 -0700 Subject: [Python-Dev] 2.5 status In-Reply-To: References: Message-ID: On 9/4/06, Neal Norwitz wrote: > > There are 3 bugs currently listed in PEP 356 as blocking: > http://python.org/sf/1551432 - __unicode__ breaks on exception > classes I replied on the bug report, but might as well comment here. The problem with this bug is that BaseException now defines a __unicode__() method in its PyMethodDef. That intercepts the unicode() call on the class and it complains it was not handed an instance. I guess the only way to fix this is to toss out the __unicode__() method and change the tp_str function to return Unicode as needed (unless someone else has a better idea). Or the bug can be closed as Won't Fix. > http://python.org/sf/1550938 - improper exception w/relative import > http://python.org/sf/1541697 - sgmllib regexp bug causes hang > > Does anyone want to fix the sgmllib issue? If not, we should revert > this week before c2 is cut. I'm hoping that we will have *no changes* > in 2.5 final from c2. Should there be any bugs/patches added to or > removed from the list? > > The buildbots are currently humming along, but I believe all 3 > versions (2.4, 2.5, and 2.6) are fine. > > Test out 2.5c1+ and report all bugs!
> > n From seojiwon at gmail.com Tue Sep 5 20:33:59 2006 From: seojiwon at gmail.com (Jiwon Seo) Date: Tue, 5 Sep 2006 11:33:59 -0700 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <6B0AEAE3-A77E-4CE3-956E-14CF31F26FD8@python.org> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDBC9B.6050406@ewtllc.com> <6B0AEAE3-A77E-4CE3-956E-14CF31F26FD8@python.org> Message-ID: On 9/5/06, Barry Warsaw wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On Sep 5, 2006, at 2:06 PM, Raymond Hettinger wrote: > > >> Then shouldn't rpartition be S.rpartition(sep) -> (rest, sep, tail) > > > > Gads, the cure is worse than the disease. > > > > car and cdr are starting to look pretty good ;-) > > LOL, the lisper in me likes that too, but I don't think it'll work. :) > but when it comes to cadr, cddr, cdar... ;^) I personally prefer (left, sep, right ) since it's most clear and there are many Python programmers whose first language is not English. > Fred's disagreement notwithstanding, I still like (left, sep, right), > but another alternative comes to mind after actually reading the > docstring for rpartition : (before, sep, after). Now, that's > not ambiguous is it? Seems to work for both partition and rpartition.
> > - -Barry > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.5 (Darwin) > > iQCVAwUBRP2+AHEjvBPtnXfVAQLiPAP+N80jHkoT5VNTtX1h2cqD4pONz+j2maCI > QXDBoODucxLDPrig8FJ3c6IcT+Uapifu8Rrvd7Vm8gSPMUsMqAgAqhqNDbXTkHVH > xLk31en2k2fdiCQKQyKJSjE1R1CaFCezByV29FK3fWvqrrxObISRnsxf/wXB6Czu > pOUNSA9LLKo= > =g+iz > -----END PGP SIGNATURE----- > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/seojiwon%40gmail.com > From rhettinger at ewtllc.com Tue Sep 5 20:32:46 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Tue, 05 Sep 2006 11:32:46 -0700 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> Message-ID: <44FDC2CE.1040902@ewtllc.com> Jim Jewett wrote: > > Another possibility is data (for head/tail) and unparsed (for rest). > > S.partition(sep) -> (data, sep, unparsed) > S.rpartition(sep) -> (unparsed, sep, data) This communicates very little about the ordering of the return tuple. Beware of overly general terms like "data" that provide no hints about the semantics of the method. The one good part that the terms are consistent between partition and rpartition so that the invariant can be stated: assert sep not in datum I recommend we just leave the existing head/tail wording and add an example which will make the meaning instantly clear: 'www.python.org'.rpartition('.') --> ('www.python', '.', 'org') Also, remember that this discussion is being held in abstract. An actual user of rpartition() is already thinking in terms of parsing from the end of the string. Another thought is that strings don't really have a left and right. They have a beginning and end. The left/right or top/bottom distinction is culture specific. 
Raymond BTW, if someone chops your ankles, does it matter which way you're facing to decide whether it was your feet or your head that had been cut-off? From rrr at ronadam.com Tue Sep 5 20:35:40 2006 From: rrr at ronadam.com (Ron Adam) Date: Tue, 05 Sep 2006 13:35:40 -0500 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> Message-ID: <44FDC37C.80304@ronadam.com> Michael Chermside wrote: > Jim Jewett writes: >> This change [in docs] looks wrong: >> >> PyDoc_STRVAR(rpartition__doc__, >> -"S.rpartition(sep) -> (head, sep, tail)\n\ >> +"S.rpartition(sep) -> (tail, sep, head)\n\ > > Raymond Hettinger replies: >> It is correct. There may be some confusion in terminology. Head >> and tail do not mean left-side or right-side. Instead, they refer to >> the "small part chopped-off" and "the rest that is still choppable". >> Think of head and tail in the sense of car and cdr. > > > It is incorrect. The purpose of documentation is to explain > things to users, and documentation which fails to achieve this > is not "correct". The level of confusion generated by using "head" > to refer to the last part of the string and "tail" to refer to > the beginning, is quite significant. > > How about something like this: > > S.partition(sep) -> (head, sep, tail) > S.rpartition(sep) -> (tail, sep, rest) This isn't immediately clear to me what I will get. s.partition(sep) -> (left, sep, right) s.rpartition(sep) -> (left, sep, right) Would be clearer, along with an explanation of what left, and right are. I hope this discussion is only about the words used and the documentation and not about the actual order of what is received. I would expect both the following should be true, and it is the current behavior. 
''.join(s.partition(sep)) -> s ''.join(s.rpartition(sep)) -> s > Perhaps someone else can find something clearer than my suggestion, > but in my own head, the terms "head" and "tail" are tightly bound > with the idea of beginning and end (respectively) rather than with > the idea of "small part chopped off" and "big part that is still > choppable". Maybe this? partition(...) S.partition(sep) -> (left, sep, right) Partition a string at the first occurrence of sep from the left into a tuple of left, sep, and right parts. Returns (S, '', '') if sep is not found in S. rpartition(...) S.rpartition(sep) -> (left, sep, right) Partition a string at the first occurrence of sep from the right into a tuple of left, sep, and right parts. Returns ('', '', S) if sep is not found in S. I feel the terms head and tail, rest etc... should be used in examples where their meaning will be clear by the context they are used in. But not in the definition where their meanings are not obvious. Cheers, Ron From rrr at ronadam.com Tue Sep 5 20:44:40 2006 From: rrr at ronadam.com (Ron Adam) Date: Tue, 05 Sep 2006 13:44:40 -0500 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FDC2C5.2080709@ronadam.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <44FDC2C5.2080709@ronadam.com> Message-ID: <44FDC598.2000106@ronadam.com> Ron Adam wrote: Correcting myself... > I hope this discussion is only about the words used and the > documentation and not about the actual order of what is received. I > would expect both the following should be true, and it is the current > behavior. > > ''.join(s.partition(sep)) -> s > ''.join(s.rpartition(sep)) -> s >>> 'abcd'.partition('x') ('abcd', '', '') >>> 'abcd'.rpartition('x') ('abcd', '', '') >>> Ok, I see Raymond's point, they are not what I expected. Although the above is still true, the returned value for the not found condition is inconsistent.
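The round-trip property stated above can be checked directly. This is only a sketch, assuming a current Python 3 interpreter, where the not-found result of rpartition() differs from the session pasted above; the helper name round_trips is made up for illustration.

```python
def round_trips(s, sep):
    """Return True if joining the pieces of both partitions rebuilds s."""
    return "".join(s.partition(sep)) == s and "".join(s.rpartition(sep)) == s

# Separator present several times, absent, empty input, and only separators.
checks = [round_trips(s, ".") for s in ("a.b.c", "abcd", "", "..")]

# Not-found results in a current interpreter: the whole string ends up
# at the end each method scans toward.
missing_left = "abcd".partition("x")    # ('abcd', '', '')
missing_right = "abcd".rpartition("x")  # ('', '', 'abcd')

print(checks, missing_left, missing_right)
```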
_Ron From g.brandl at gmx.net Tue Sep 5 20:49:01 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 05 Sep 2006 20:49:01 +0200 Subject: [Python-Dev] 2.5 status In-Reply-To: References: Message-ID: Brett Cannon wrote: > > > On 9/4/06, *Neal Norwitz* > wrote: > > There are 3 bugs currently listed in PEP 356 as blocking: > http://python.org/sf/1551432 - __unicode__ breaks on > exception classes > > > I replied on the bug report, but might as well comment here. > > The problem with this bug is that BaseException now defines a > __unicode__() method in its PyMethodDef. That intercepts the unicode() > call on the class and it complains it was not handed an instance. I > guess the only way to fix this is to toss out the __unicode__() method > and change the tp_str function to return Unicode as needed (unless > someone else has a better idea). Or the bug can be closed as Won't Fix. Throwing out the __unicode__ method is fine with me -- exceptions didn't have one before the NeedForSpeed rewrite, so there would be no loss in functionality. Georg From barry at python.org Tue Sep 5 20:51:13 2006 From: barry at python.org (Barry Warsaw) Date: Tue, 5 Sep 2006 14:51:13 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FDC2CE.1040902@ewtllc.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sep 5, 2006, at 2:32 PM, Raymond Hettinger wrote: > Another thought is that strings don't really have a left and right. > They have a beginning and end. The left/right or top/bottom > distinction > is culture specific. For the target of the method, this is true, but it's not true for the results which is what we're talking about describing here. 'left' is whatever is to the left of the separator and 'right' is whatever is to the right of the separator. Seems obvious to me. 
I believe (left, sep, right) will be the clearest description for all users, with little chance of confusion. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRP3HIXEjvBPtnXfVAQIx5wP+MPF5tk4moX4jH0yhGvR6gKcGBusyN152 redIr0xiNqECfrIHkc756UDLn3HhB2WdEjR9pn06RzmbgePMPcGP19cjZdHGwjFK 3e4Qg8zW3cL0iCnybL4AEaoZksuHGwJpZbId9HF60GFqYdjNTKEMNIVRI7jTE9pP zbBO6Sscnl0= =HB4k -----END PGP SIGNATURE----- From rrr at ronadam.com Tue Sep 5 20:58:30 2006 From: rrr at ronadam.com (Ron Adam) Date: Tue, 05 Sep 2006 13:58:30 -0500 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FDC2CE.1040902@ewtllc.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> Message-ID: <44FDC8D6.5090002@ronadam.com> Raymond Hettinger wrote: > Another thought is that strings don't really have a left and right. > They have a beginning and end. The left/right or top/bottom distinction > is culture specific. Well, it should have been epartition() and not rpartition() in that case. ;-) Is python ever edited in languages that don't use left to right lines? From rhettinger at ewtllc.com Tue Sep 5 21:06:03 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Tue, 05 Sep 2006 12:06:03 -0700 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FDC37C.80304@ronadam.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <44FDC37C.80304@ronadam.com> Message-ID: <44FDCA9B.60101@ewtllc.com> Ron Adam wrote: >I hope this discussion is only about the words used and the >documentation and not about the actual order of what is received. I >would expect both the following should be true, and it is the current >behavior. > > ''.join(s.partition(sep)) -> s > ''.join(s.rpartition(sep)) -> s > > > Right. The only thing in question is wording for the documentation. 
The viable options on the table are: * Leave the current wording and add a clarifying example. * Switch to left/sep/right and add a clarifying example. The former tells you which part can still contain a separator and suggests how to use the tool when successive partitions are needed. The latter makes the left/right ordering clear and tells you nothing about which part can still have the separators in it. That has some import because the use cases for rpartition() all involve strings with multiple separators --if there were only one, you would just use partition(). BTW, the last check-in fixed the return value for the sep-not-found case, so that now: 'a'.partition('x') --> ('a', '', '') 'a'.rpartition('x') --> ('', '', 'a') This was necessary so that looping/recursion would work and so that rpartition() acts as a mirror-image of partition(). Raymond From tim.peters at gmail.com Tue Sep 5 21:07:43 2006 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 5 Sep 2006 15:07:43 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> Message-ID: <1f7befae0609051207t479b8711g4ff3b719e46ca17@mail.gmail.com> upto, sep, rest in whatever order they apply. I think of a partition-like function as starting at some position and matching "up to" the first occurrence of the separator (be that left or right or diagonally, "up to" is relative to the search direction), and leaving "the rest" alone.
The docs should match that, since my mental model is correct ;-) From brett at python.org Tue Sep 5 21:19:52 2006 From: brett at python.org (Brett Cannon) Date: Tue, 5 Sep 2006 12:19:52 -0700 Subject: [Python-Dev] 2.5 status In-Reply-To: References: Message-ID: On 9/5/06, Georg Brandl wrote: > > Brett Cannon wrote: > > > > > > On 9/4/06, *Neal Norwitz* > > wrote: > > > > There are 3 bugs currently listed in PEP 356 as blocking: > > http://python.org/sf/1551432 - __unicode__ breaks on > > exception classes > > > > > > I replied on the bug report, but might as well comment here. > > > > The problem with this bug is that BaseException now defines a > > __unicode__() method in its PyMethodDef. That intercepts the unicode() > > call on the class and it complains it was not handed an instance. I > > guess the only way to fix this is to toss out the __unicode__() method > > and change the tp_str function to return Unicode as needed (unless > > someone else has a better idea). Or the bug can be closed as Won't Fix. > > Throwing out the __unicode__ method is fine with me -- exceptions didn't > have one before the NeedForSpeed rewrite, so there would be no loss in > functionality. If this step is done and the tp_str function is not changed to return Unicode as needed, PEP 352 will need to be updated. -Brett From mal at egenix.com Tue Sep 5 21:33:54 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 05 Sep 2006 21:33:54 +0200 Subject: [Python-Dev] 2.5 status In-Reply-To: References: Message-ID: <44FDD122.3000809@egenix.com> Brett Cannon wrote: > On 9/4/06, Neal Norwitz wrote: >> >> There are 3 bugs currently listed in PEP 356 as blocking: >> http://python.org/sf/1551432 - __unicode__ breaks on exception >> classes > > > I replied on the bug report, but might as well comment here.
> > The problem with this bug is that BaseException now defines a __unicode__() > method in its PyMethodDef. That intercepts the unicode() call on the class > and it complains it was not handed an instance. I guess the only way to > fix this is to toss out the __unicode__() method and change the tp_str function > to return Unicode as needed (unless someone else has a better idea). Or > the bug can be closed as Won't Fix. The proper fix would be to introduce a tp_unicode slot and let this decide what to do, ie. call .__unicode__() methods on instances and use the .__name__ on classes. I think this would be the right way to go for Python 2.6. For Python 2.5, just dropping this .__unicode__ method on exceptions is probably the right thing to do. The reason why the PyObject_Unicode() function tries to be smart here is that we don't have a tp_unicode slot (to complement tp_str). It's obvious that this is not perfect, but only a work-around. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 05 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From brett at python.org Tue Sep 5 21:41:49 2006 From: brett at python.org (Brett Cannon) Date: Tue, 5 Sep 2006 12:41:49 -0700 Subject: [Python-Dev] 2.5 status In-Reply-To: <44FDD122.3000809@egenix.com> References: <44FDD122.3000809@egenix.com> Message-ID: On 9/5/06, M.-A. Lemburg wrote: > > Brett Cannon wrote: > > On 9/4/06, Neal Norwitz wrote: > >> > >> There are 3 bugs currently listed in PEP 356 as blocking: > >> http://python.org/sf/1551432 - __unicode__ breaks on exception > >> classes > > > > > > I replied on the bug report, but might as well comment here. 
> > > > The problem with this bug is that BaseException now defines a > __unicode__() > > method in its PyMethodDef. That intercepts the unicode() call on the > class > > and it complains it was not handed an instance. I guess the only way to > > fix this is to toss out the __unicode__() method and change the tp_str > function > > to return Unicode as needed (unless someone else has a better idea). Or > > the bug can be closed as Won't Fix. > > The proper fix would be to introduce a tp_unicode slot and let > this decide what to do, ie. call .__unicode__() methods on instances > and use the .__name__ on classes. That was my bug reaction and what I said on the bug report. Kind of surprised one doesn't already exist. I think this would be the right way to go for Python 2.6. For > Python 2.5, just dropping this .__unicode__ method on exceptions > is probably the right thing to do. Neal, do you want to rip it out or should I? -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060905/f0862cc8/attachment.htm From p.f.moore at gmail.com Tue Sep 5 21:41:58 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 5 Sep 2006 20:41:58 +0100 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <1f7befae0609051207t479b8711g4ff3b719e46ca17@mail.gmail.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> <1f7befae0609051207t479b8711g4ff3b719e46ca17@mail.gmail.com> Message-ID: <79990c6b0609051241x35bfd75fia7a9d8bb095e1019@mail.gmail.com> On 9/5/06, Tim Peters wrote: > upto, sep, rest > > in whatever order they apply. I think of a partition-like function as > starting at some position and matching "up to" the first occurence of > the separator (be that left or right or diagonally, "up to" is > relative to the search direction), and leaving "the rest" alone. 
The > docs should match that, since my mental model is correct ;-) +1 Paul From nmm1 at cus.cam.ac.uk Tue Sep 5 21:44:50 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Tue, 05 Sep 2006 20:44:50 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions Message-ID: Johan Dahlin wrote: > > Are you saying that we should let less commonly used platforms dictate > features and functionality for the popular ones? > I mean, who uses HP/UX, SCO and [insert your favorite flavor] as a modern > desktop system where this particular bug makes a difference? You haven't been following the thread. As I posted, this problem occurs to a greater or lesser degree on all platforms. This will be my last posting on the topic, but I shall try to explain. The first problem is in the hardware and operating system. A signal interrupts the thread, and passes control to a handler with a very partial environment and (usually) information on the environment when it was interrupted. If it interrupted the thread in the middle of a system call or other library routine that uses non-Python conventions, the registers and other state may be weird. There ARE solutions to this, but they are unbelievably foul, and even Linux on x86 has had trouble with this. And, on return, everything has to be reversed entirely transparently! It is VERY common for there to be bugs in the C run-time system and not rare for there to be ones in the kernel (that area of Linux has been rewritten MANY times, for this reason). In many cases, the run-time system simply doesn't pretend to handle interrupts in arbitrary code (which is where the C undefined behaviour is used by vendors). The second problem is that what you can do depends both on what you were doing and how your 'primitive' is implemented. For example, if you call something that takes out even a very short term lock or uses a spin loop to emulate an atomic operation, you had better not use it if you interrupted code that was doing the same.
Your thread may hang, crash or otherwise go bananas. Can you guarantee that even write is free of such things? No, and certainly not if you are using a debugger, a profiling library or even tracing system calls. I have often used programs that crashed as soon as I did one of those :-( Related to this is that it is EXTREMELY hard to write synchronisation primitives (mutexes etc.) that are interrupt-safe - MUCH harder than to write thread-safe ones - and few people are even aware of the issues. There was a thread on some Linux kernel mailing list about this, and even the kernel developers were having headaches thinking about the issues. Even if write is atomic, there are gotchas. What if the interrupted code is doing something to that file at the time? Are you SURE that an unexpected operation on it (in the same thread) won't cause the library function of program to get confused? And can you be sure that the write will terminate fast enough to not cause time-critical code to fail? And have you studied the exact semantics of blocking on pipes? They are truly horrible. So this is NOT a matter of platform X is safe and platform Y isn't. Even Linux x86 isn't entirely safe - or wasn't, the last time I heard. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. 
Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From rhettinger at ewtllc.com Tue Sep 5 22:13:02 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Tue, 05 Sep 2006 13:13:02 -0700 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <1f7befae0609051207t479b8711g4ff3b719e46ca17@mail.gmail.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> <1f7befae0609051207t479b8711g4ff3b719e46ca17@mail.gmail.com> Message-ID: <44FDDA4E.2080506@ewtllc.com> Tim Peters wrote: > upto, sep, rest > >in whatever order they apply. > In the rpartition case, that would be (rest, sep, upto) which seems a bit cryptic. We need some choice of words that clearly mean: * the chopped-off snippet (guaranteed to not contain the separator) * the separator if found * the unchopped remainer of the string (which may contain a separator). Of course, if a clear example is added, the choice of words becomes much less important. Raymond From barry at python.org Tue Sep 5 22:17:20 2006 From: barry at python.org (Barry Warsaw) Date: Tue, 5 Sep 2006 16:17:20 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FDDA4E.2080506@ewtllc.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> <1f7befae0609051207t479b8711g4ff3b719e46ca17@mail.gmail.com> <44FDDA4E.2080506@ewtllc.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sep 5, 2006, at 4:13 PM, Raymond Hettinger wrote: > Tim Peters wrote: > >> upto, sep, rest >> >> in whatever order they apply. >> > In the rpartition case, that would be (rest, sep, upto) which seems a > bit cryptic. 
> > We need some choice of words that clearly mean: > * the chopped-off snippet (guaranteed to not contain the separator) > * the separator if found > * the unchopped remainer of the string (which may contain a > separator). > > Of course, if a clear example is added, the choice of words becomes > much > less important. Ideally too, the terminology (and order) for partition and rpartition would be the same. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRP3bVXEjvBPtnXfVAQJSKwP9Ev3MPzum3kp4hNDJZyBmEShzPvL2WQv2 VThbxZX1MDfeDXupNwF22bFA5gF/9vZp3nToUqyAbOaPSd93hJSHOdeWdAhR2BdT EICkzBTGCtVkbqu3Ep1N/jb9GJUvgkgNAWtRZVuTWQtJc6AanV9ssTcF6F7ipc6p zgSWeAc0a3E= =W7LV -----END PGP SIGNATURE----- From jimjjewett at gmail.com Tue Sep 5 22:43:20 2006 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 5 Sep 2006 16:43:20 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> Message-ID: I think I finally figured out where Raymond is coming from. For Raymond, "head" is where he started processing -- for rpartition, this is the .endswith part. For me, "head" is the start of the data structure -- always the .startswith part. We won't resolve that with anything suggesting a sequential order; we need something that makes it clear which part is the large leftover. S.partition(sep) -> (record, sep, remains) S.rpartition(sep) -> (remains, sep, record) I do like the plural (or collective) sound of "remains". I have no solid reasoning for "record" vs "rec" vs "onerec". I would welcome a word that did not suggest it would have further internal structure. 
-jJ From barry at python.org Tue Sep 5 22:55:44 2006 From: barry at python.org (Barry Warsaw) Date: Tue, 5 Sep 2006 16:55:44 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> Message-ID: <118F763E-6B49-4AC2-91CB-961F14D504A0@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sep 5, 2006, at 4:43 PM, Jim Jewett wrote: > I think I finally figured out where Raymond is coming from. > > For Raymond, "head" is where he started processing -- for rpartition, > this is the .endswith part. > > For me, "head" is the start of the data structure -- always the > .startswith part. > > We won't resolve that with anything suggesting a sequential order; we > need something that makes it clear which part is the large leftover. See, for me, it's all about the results of the operation, not how the results are (supposedly) used. The way I think about it is that I've got some string and I'm looking for some split point within that string. That split point is clearly the "middle" (but "sep" works too) and everything to the right of that split point gets returned in "right" while everything to the left gets returned in "left". I'm less concerned with repeated splits because I probably have as many existing cases where I'm looking for the first split point as where I'm looking repeatedly for split points (think RFC 2822 header splitting -- partition will be awesome for this). The bias with these terms is clearly the English left-to-right order. Actually, that brings up an interesting question: what would happen if you called rpartition on a unicode string representing Hebrew, Arabic, or other RTL language? Do partition and rpartition suddenly switch directions? If not, then I think left-sep-right are fine. If so, then yeah, we probably need something else. 
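As a rough illustration of the header-splitting use case mentioned above, here is a minimal sketch using the left/sep/right naming. The function name split_header and the sample header line are assumptions made for the example; they do not come from any actual mail library.

```python
def split_header(line):
    """Split 'Name: value' at the first colon; later colons stay in the value."""
    left, sep, right = line.partition(":")
    if not sep:
        # No colon found: partition returns (line, '', '').
        raise ValueError("not a header line: %r" % (line,))
    return left, right.strip()


name, value = split_header("Subject: Re: partition: naming")
print(name)   # Subject
print(value)  # Re: partition: naming
```

Only the first split point matters here, so the value may itself contain the separator without any special handling.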
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRP3kUHEjvBPtnXfVAQJd6wP+OBtRR22O0A+s/uHF3ACgWhrdZJdEnzEW qimKEWmDCUuK7CFIUsJKteoNNSHjIBgZIMMdnsymgI7CPgPNuB6CUAp8KFFeYvMy PVpMIqNFOFXGUVYf4VA7ED9S7QbbDzHJv32kUUZvbuTniYK9DVMi0O7GStsv1Kg6 insyP+W1EcU= =4aar -----END PGP SIGNATURE----- From pje at telecommunity.com Tue Sep 5 23:07:17 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 05 Sep 2006 17:07:17 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <118F763E-6B49-4AC2-91CB-961F14D504A0@python.org> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> Message-ID: <5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com> At 04:55 PM 9/5/2006 -0400, Barry Warsaw wrote: >On Sep 5, 2006, at 4:43 PM, Jim Jewett wrote: > > > I think I finally figured out where Raymond is coming from. > > > > For Raymond, "head" is where he started processing -- for rpartition, > > this is the .endswith part. > > > > For me, "head" is the start of the data structure -- always the > > .startswith part. > > > > We won't resolve that with anything suggesting a sequential order; we > > need something that makes it clear which part is the large leftover. > >See, for me, it's all about the results of the operation, not how the >results are (supposedly) used. The way I think about it is that I've >got some string and I'm looking for some split point within that >string. That split point is clearly the "middle" (but "sep" works >too) and everything to the right of that split point gets returned in >"right" while everything to the left gets returned in "left". +1 for left/sep/right for both operations. It's easier to remember a visual correlation (left,sep,right) than it is to try and think about an abstraction in which the order of results has something to do with what direction I found the separator in. 
If I'm repeating from right to left, then of course the "left" is the part I'll want to repeat on. From rhettinger at ewtllc.com Tue Sep 5 23:16:53 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Tue, 05 Sep 2006 14:16:53 -0700 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> <5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com> Message-ID: <44FDE945.7080801@ewtllc.com> An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060905/9475cb5e/attachment.html From gjcarneiro at gmail.com Wed Sep 6 02:21:11 2006 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Wed, 6 Sep 2006 01:21:11 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: Message-ID: On 9/5/06, Nick Maclaren wrote: [...] > Even if write is atomic, there are gotchas. What if the interrupted > code is doing something to that file at the time? Are you SURE that > an unexpected operation on it (in the same thread) won't cause the > library function of program to get confused? Yes, I'm sure. The technique is based on writing any arbitrary byte onto a well known pipe. Any byte will do. All it matters is that we trick the kernel into realizing there is data to read on the other end of the pipe, so that it can wake up the poll() syscall waiting on it. Only signal handlers ever write to this file descriptor. If one signal handler interrupts another one, it's ok; all it takes is that at least one of them succeeds, and the data itself is irrelevant. Only the mainloop ever reads from the pipe. > And can you be sure that the write will terminate fast enough to not cause time-critical code to fail? Time critical code should block signals. Or should use a real-time OS. 
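The wakeup technique described above can be sketched roughly as follows. This is a Python-level stand-in only: the discussion concerns a C-level handler, and the names wake_main_loop, wait_for_wakeup and install are hypothetical. It also assumes a Python version that provides os.set_blocking.

```python
import os
import select
import signal

# A well-known pipe: handlers write to one end, the main loop watches the other.
read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)
os.set_blocking(write_fd, False)


def wake_main_loop(signum, frame):
    """Signal handler: write one arbitrary byte; its value is irrelevant."""
    try:
        os.write(write_fd, b"\0")
    except BlockingIOError:
        # Pipe is full, so a wakeup is already pending; dropping this byte is fine.
        pass


def wait_for_wakeup(timeout):
    """Wait until the pipe is readable; return True if a wakeup arrived."""
    readable, _, _ = select.select([read_fd], [], [], timeout)
    if not readable:
        return False
    try:
        os.read(read_fd, 4096)  # drain; the data itself carries no meaning
    except BlockingIOError:
        pass
    return True


def install(signum):
    """Register the handler for one signal (must be called from the main thread)."""
    signal.signal(signum, wake_main_loop)
```

If one handler invocation interrupts another, several bytes may end up in the pipe, but the reader drains them all at once, so only the readable state matters.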
> And have you studied the exact semantics of blocking > on pipes? They are truly horrible. The pipe is changed to async mode; never blocks. We don't care about any data being transferred at all, only the state on the file descriptor changing. > So this is NOT a matter of platform X is safe and platform Y isn't. > Even Linux x86 isn't entirely safe - or wasn't, the last time I heard. We can't prove write() is async safe, but you can't prove it isn't either. From all I know, write() doesn't use malloc(); it only loads a few registers and calls some interrupt (or syscall in amd64). It is plausible that it is perfectly async safe. And that's completely beside the point. We only ask python to call a function of ours every time it handles a signal. You are criticizing the way pygtk or glib will handle the notification, but we are here to discuss how will Python just give us a small hand in solving the signals problem. These are different problem domains. We don't ask Python developers to endorse any particular way of solving our problem. But since Python already snatches away our beloved signals, especially SIGINT, it should at least be courteous enough to give us just a notification when signals happen. There is _no_ other way. From david.nospam.hopwood at blueyonder.co.uk Wed Sep 6 03:08:03 2006 From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood) Date: Wed, 06 Sep 2006 02:08:03 +0100 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <118F763E-6B49-4AC2-91CB-961F14D504A0@python.org> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> <118F763E-6B49-4AC2-91CB-961F14D504A0@python.org> Message-ID: <44FE1F73.7020206@blueyonder.co.uk> Barry Warsaw wrote: > The bias with these terms is clearly the English left-to-right > order. 
Actually, that brings up an interesting question: what would > happen if you called rpartition on a unicode string representing > Hebrew, Arabic, or other RTL language? Do partition and rpartition > suddenly switch directions? What happens is that rpartition searches the string backwards in logical order (i.e. left to right as the text is written, assuming it only contains Hebrew or Arabic letters, and not numbers or a mixture of scripts). But this is not "switching directions"; it's still searching backwards. You really don't want to think of bidirectional text in terms of presentation, when you're doing processing that should be independent of presentation. > If not, then I think left-sep-right are fine. If so, then yeah, we > probably need something else. +1 for (upto, sep, rest) -- and I think it should be in that order for both partition and rpartition. -- David Hopwood From pje at telecommunity.com Wed Sep 6 03:14:18 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 05 Sep 2006 21:14:18 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FE1F73.7020206@blueyonder.co.uk> References: <118F763E-6B49-4AC2-91CB-961F14D504A0@python.org> <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> <118F763E-6B49-4AC2-91CB-961F14D504A0@python.org> Message-ID: <5.1.1.6.0.20060905211030.026352c8@sparrow.telecommunity.com> At 02:08 AM 9/6/2006 +0100, David Hopwood wrote: >Barry Warsaw wrote: > > The bias with these terms is clearly the English left-to-right > > order. Actually, that brings up an interesting question: what would > > happen if you called rpartition on a unicode string representing > > Hebrew, Arabic, or other RTL language? Do partition and rpartition > > suddenly switch directions? > >What happens is that rpartition searches the string backwards in logical >order (i.e. 
left to right as the text is written, assuming it only contains >Hebrew or Arabic letters, and not numbers or a mixture of scripts). But this >is not "switching directions"; it's still searching backwards. You really >don't want to think of bidirectional text in terms of presentation, when >you're doing processing that should be independent of presentation. > > > If not, then I think left-sep-right are fine. If so, then yeah, we > > probably need something else. > >+1 for (upto, sep, rest) -- and I think it should be in that order for >both partition and rpartition. It appears the problem is that one group of people thinks in terms of the order of the string, and the other in terms of the order of processing. Both groups agree that both partition and rpartition should be "in the same order" -- but we disagree about what that means. :) Me, I want left/sep/right because I'm in the "string order" camp, and you want upto/sep/rest because you're in the "processing order" camp. From fperez.net at gmail.com Wed Sep 6 06:56:04 2006 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 05 Sep 2006 22:56:04 -0600 Subject: [Python-Dev] inspect.py very slow under 2.5 Message-ID: Hi all, I know that the 2.5 release is extremely close, so this will probably be 2.5.1 material. I discussed it briefly with Guido at scipy'06, and he asked for some profile-based info, which I've only now had time to gather. I hope this will be of some use, as I think the problem is rather serious. For context: I am the IPython lead developer (http://ipython.scipy.org), and ipython is used as the base shell for several interactive environments, one of which is the mathematics system SAGE (http://modular.math.washington.edu/sage). It was the SAGE lead who first ran into this problem while testing SAGE with 2.5. The issue is the following: ipython provides several exception reporting modes which give a lot more information than python's default tracebacks. 
In order to generate this info, it makes extensive use of the inspect module. The module in ipython responsible for these fancy tracebacks is: http://projects.scipy.org/ipython/ipython/browser/ipython/trunk/IPython/ultraTB.py which is an enhanced port of Ka Ping-Yee's old cgitb module. Under 2.5, the generation of one of these detailed tracebacks is /extremely/ expensive, and the cost goes up very quickly the more modules have been imported into the current session. While in a new ipython session the slowdown is not crippling, under SAGE (which starts with a lot of loaded modules) it is bad enough to make the system nearly unusable. I'm attaching a little script which can be run to show the problem, but you need IPython to be installed to run it. If any of you run ubuntu, fedora, suse or almost any other major linux distro, it's already available via the usual channels. In case you don't want to (or can't) run the attached code, here's a summary of what I see on my machine (ubuntu dapper). 
Using ipython under python 2.4.3, I get:

         2268 function calls (2225 primitive calls) in 0.020 CPU seconds

   Ordered by: call count
   List reduced from 127 to 32 due to restriction <0.25>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      305    0.000    0.000    0.000    0.000 :0(append)
  259/253    0.010    0.000    0.010    0.000 :0(len)
      177    0.000    0.000    0.000    0.000 :0(isinstance)
       90    0.000    0.000    0.000    0.000 :0(match)
       68    0.000    0.000    0.000    0.000 ultraTB.py:539(tokeneater)
       68    0.000    0.000    0.000    0.000 tokenize.py:16 (generate_tokens)
       61    0.000    0.000    0.000    0.000 :0(span)
       57    0.000    0.000    0.000    0.000 sre_parse.py:130(__getitem__)
       56    0.000    0.000    0.000    0.000 string.py:220(lower)

etc, while running the same script under ipython/python2.5 and no other changes gives:

         230370 function calls (229754 primitive calls) in 3.340 CPU seconds

   Ordered by: call count
   List reduced from 83 to 21 due to restriction <0.25>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    55003    0.420    0.000    0.420    0.000 :0(startswith)
    45026    0.264    0.000    0.264    0.000 :0(endswith)
    20013    0.148    0.000    0.148    0.000 :0(append)
    12138    0.180    0.000    0.660    0.000 posixpath.py:156(islink)
    12138    0.192    0.000    0.192    0.000 :0(lstat)
    12138    0.180    0.000    0.288    0.000 stat.py:60(S_ISLNK)
    12138    0.108    0.000    0.108    0.000 stat.py:29(S_IFMT)
    11838    0.680    0.000    1.244    0.000 posixpath.py:56(join)
     4837    0.052    0.000    0.052    0.000 :0(len)
     4362    0.028    0.000    0.028    0.000 :0(split)
     4362    0.048    0.000    0.100    0.000 posixpath.py:47(isabs)
     3598    0.036    0.000    0.056    0.000 string.py:218(lower)
     3598    0.020    0.000    0.020    0.000 :0(lower)
     2815    0.032    0.000    0.032    0.000 :0(isinstance)
     2809    0.028    0.000    0.028    0.000 :0(join)
     2808    0.264    0.000    0.520    0.000 posixpath.py:374(normpath)
     2632    0.040    0.000    0.068    0.000 inspect.py:35(ismodule)
     2143    0.016    0.000    0.016    0.000 :0(hasattr)
     1884    0.028    0.000    0.444    0.000 posixpath.py:401(abspath)
     1557    0.016    0.000    0.016    0.000 :0(range)
     1078    0.008    0.000    0.044    0.000 inspect.py:342(getfile)

These enormous numbers of calls are the origin of the slowdown, and the
more modules have been imported, the worse it gets. I haven't had time to dive deep into inspect.py to try and fix this, but I figured it would be best to at least report it now. As far as IPython and its user projects is concerned, I'll probably hack things to overwrite inspect.py from 2.4 over the 2.5 version in the exception reporter, because the current code is simply unusable for detailed tracebacks. It would be great if this could be fixed in the trunk at some point. I'll be happy to provide further feedback or put this information elsewhere. Guido suggested initially posting here, but if you prefer it on the SF tracker (even as incomplete as this report is) I'll be glad to do so. Regards, f -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: traceback_timings.py Url: http://mail.python.org/pipermail/python-dev/attachments/20060905/fb0ac8bf/attachment.asc From steve at holdenweb.com Wed Sep 6 10:14:20 2006 From: steve at holdenweb.com (Steve Holden) Date: Wed, 06 Sep 2006 09:14:20 +0100 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FDE945.7080801@ewtllc.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> <5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com> <44FDE945.7080801@ewtllc.com> Message-ID: Raymond Hettinger wrote: [...] > That's fine with me. I accept there will always be someone who stands > on their head [...] You'd have to be some kind of contortionist to stand on your head. 
willfully-misunderstanding-ly y'rs - steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From ncoghlan at gmail.com Wed Sep 6 10:21:54 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 06 Sep 2006 18:21:54 +1000 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> <5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com> Message-ID: <44FE8522.5020703@gmail.com> Phillip J. Eby wrote: > At 04:55 PM 9/5/2006 -0400, Barry Warsaw wrote: >> On Sep 5, 2006, at 4:43 PM, Jim Jewett wrote: >> >>> I think I finally figured out where Raymond is coming from. >>> >>> For Raymond, "head" is where he started processing -- for rpartition, >>> this is the .endswith part. >>> >>> For me, "head" is the start of the data structure -- always the >>> .startswith part. >>> >>> We won't resolve that with anything suggesting a sequential order; we >>> need something that makes it clear which part is the large leftover. >> See, for me, it's all about the results of the operation, not how the >> results are (supposedly) used. The way I think about it is that I've >> got some string and I'm looking for some split point within that >> string. That split point is clearly the "middle" (but "sep" works >> too) and everything to the right of that split point gets returned in >> "right" while everything to the left gets returned in "left". > > +1 for left/sep/right for both operations. It's easier to remember a > visual correlation (left,sep,right) than it is to try and think about an > abstraction in which the order of results has something to do with what > direction I found the separator in. -1. 
The string docs are already lousy with left/right terminology that is flatout wrong when dealing with a script that is displayed with a right-to-left or vertical orientation*. In reality, strings are processed such that index 0 is the first character and index -1 is the last character, regardless of script orientation, but you could be forgiven for not realising that after reading the current string docs. Let's not make that particular problem any worse. I don't see anything wrong with Raymond's 'head, sep, tail' and 'tail, sep, head' terminology (although noting the common postcondition 'sep not in head' in the docstrings might be useful). However, if we're going to use the same result tuple for both, then I'd prefer 'before, sep, after', with the partition() postcondition being 'sep not in before' and the rpartition() postcondition being 'sep not in after'. Those terms are accurate regardless of script orientation. Either way, I suggest putting the postcondition in the docstring to make the difference between the two methods explicit. Regards, Nick. * I acknowledge that Python *code* is almost certainly going to be edited in a left-to-right text editor, because it's an English-based programming language. But the strings that string methods like partition() and rpartition() are used with are quite likely to be coming from or written to a or user interface that uses a native script orientation. 
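The two postconditions suggested above can be checked directly. The following is a minimal sketch; the helper name partition_both_ways and the sample string are chosen only for the example, and the result names follow the 'before, sep, after' proposal.

```python
def partition_both_ways(text, sep):
    """Return the partition() and rpartition() results, checking each postcondition."""
    before, found, after = text.partition(sep)
    assert sep not in before                  # partition(): first occurrence wins
    assert before + found + after == text     # nothing is lost

    r_before, r_found, r_after = text.rpartition(sep)
    assert sep not in r_after                 # rpartition(): last occurrence wins
    assert r_before + r_found + r_after == text

    return (before, found, after), (r_before, r_found, r_after)


first, last = partition_both_ways("a.b.c", ".")
print(first)  # ('a', '.', 'b.c')
print(last)   # ('a.b', '.', 'c')
```

Both checks refer only to string order (index 0 first, index -1 last), so they hold regardless of how the script is displayed.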
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From steve at holdenweb.com Wed Sep 6 10:32:19 2006 From: steve at holdenweb.com (Steve Holden) Date: Wed, 06 Sep 2006 09:32:19 +0100 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FE8522.5020703@gmail.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> <5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com> <44FE8522.5020703@gmail.com> Message-ID: Nick Coghlan wrote: > Phillip J. Eby wrote: > >>At 04:55 PM 9/5/2006 -0400, Barry Warsaw wrote: >> >>>On Sep 5, 2006, at 4:43 PM, Jim Jewett wrote: >>> >>> >>>>I think I finally figured out where Raymond is coming from. >>>> >>>>For Raymond, "head" is where he started processing -- for rpartition, >>>>this is the .endswith part. >>>> >>>>For me, "head" is the start of the data structure -- always the >>>>.startswith part. >>>> >>>>We won't resolve that with anything suggesting a sequential order; we >>>>need something that makes it clear which part is the large leftover. >>> >>>See, for me, it's all about the results of the operation, not how the >>>results are (supposedly) used. The way I think about it is that I've >>>got some string and I'm looking for some split point within that >>>string. That split point is clearly the "middle" (but "sep" works >>>too) and everything to the right of that split point gets returned in >>>"right" while everything to the left gets returned in "left". >> >>+1 for left/sep/right for both operations. It's easier to remember a >>visual correlation (left,sep,right) than it is to try and think about an >>abstraction in which the order of results has something to do with what >>direction I found the separator in. > > > -1. 
The string docs are already lousy with left/right terminology that is > flatout wrong when dealing with a script that is displayed with a > right-to-left or vertical orientation*. In reality, strings are processed such > that index 0 is the first character and index -1 is the last character, > regardless of script orientation, but you could be forgiven for not realising > that after reading the current string docs. Let's not make that particular > problem any worse. > > I don't see anything wrong with Raymond's 'head, sep, tail' and 'tail, sep, > head' terminology (although noting the common postcondition 'sep not in head' > in the docstrings might be useful). > > However, if we're going to use the same result tuple for both, then I'd prefer > 'before, sep, after', with the partition() postcondition being 'sep not in > before' and the rpartition() postcondition being 'sep not in after'. Those > terms are accurate regardless of script orientation. > > Either way, I suggest putting the postcondition in the docstring to make the > difference between the two methods explicit. > > Regards, > Nick. > > * I acknowledge that Python *code* is almost certainly going to be edited in a > left-to-right text editor, because it's an English-based programming language. > But the strings that string methods like partition() and rpartition() are used > with are quite likely to be coming from or written to a or user interface that > uses a native script orientation. > Perhaps we should be thinking "beginning" and "end" here, though it seems as though it won't be possible to find a terminology that will be intuitively obvious to everyone. 
regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From g.brandl at gmx.net Wed Sep 6 10:39:07 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 06 Sep 2006 10:39:07 +0200 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> <5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com> <44FE8522.5020703@gmail.com> Message-ID: Steve Holden wrote: >> * I acknowledge that Python *code* is almost certainly going to be edited in a >> left-to-right text editor, because it's an English-based programming language. >> But the strings that string methods like partition() and rpartition() are used >> with are quite likely to be coming from or written to a or user interface that >> uses a native script orientation. >> > Perhaps we should be thinking "beginning" and "end" here, though it > seems as though it won't be possible to find a terminology that will be > intuitively obvious to everyone. Which is why an example is absolutely necessary and will make things clear for everyone. Georg From ralf at brainbot.com Wed Sep 6 12:14:09 2006 From: ralf at brainbot.com (Ralf Schmitt) Date: Wed, 06 Sep 2006 12:14:09 +0200 Subject: [Python-Dev] inspect.py very slow under 2.5 In-Reply-To: References: Message-ID: <44FE9F71.3090903@brainbot.com> Fernando Perez wrote: > > These enormous numbers of calls are the origin of the slowdown, and the more > modules have been imported, the worse it gets.

--- /exp/lib/python2.5/inspect.py 2006-08-28 11:53:36.000000000 +0200
+++ inspect.py 2006-09-06 12:10:45.000000000 +0200
@@ -444,7 +444,8 @@
     in the file and the line number indexes a line in that list.  An IOError
     is raised if the source code cannot be retrieved."""
     file = getsourcefile(object) or getfile(object)
-    module = getmodule(object)
+    #module = getmodule(object)
+    module = None
     if module:
         lines = linecache.getlines(file, module.__dict__)
     else:

The problem seems to originate from the module=getmodule(object) in findsource. If I comment out that code (or rather do a module=None), things seem to be back to normal. (linecache.getlines has been called with a None module in python 2.4's inspect.py). - Ralf From mwh at python.net Wed Sep 6 12:34:23 2006 From: mwh at python.net (Michael Hudson) Date: Wed, 06 Sep 2006 11:34:23 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: (Gustavo Carneiro's message of "Mon, 4 Sep 2006 14:52:36 +0000") References: Message-ID: <2m8xkxnv0w.fsf@starship.python.net> "Gustavo Carneiro" writes: > On 9/4/06, Nick Maclaren wrote: >> "Gustavo Carneiro" wrote: >> > I am now thinking of something along these lines: >> > typedef void (*PyPendingCallNotify)(void *user_data); >> > PyAPI_FUNC(void) Py_AddPendingCallNotify(PyPendingCallNotify callback, >> > void *user_data); >> > PyAPI_FUNC(void) Py_RemovePendingCallNotify(PyPendingCallNotify >> > callback, void *user_data); >> >> Why would that help? The problems are semantic, not syntactic. >> >> Anthony Baxter isn't exaggerating the problem, despite what you may >> think from his posting. > > You guys are tough customers to please. Yes. > I am just trying to solve a problem here, not create a new one; you > have to believe me. We believe you, but you are stirring the ashes of old problems. > 1. In PyGTK we have a gobject.MainLoop.run() method, which blocks > essentially forever in a poll() system call, and only wakes if/when it > has to process timeout or IO event; > 2. When we only have one thread, we can guarantee that e.g. 
> SIGINT will always be caught by the thread running the > g_main_loop_run(), so we know poll() will be interrupted and a EINTR > will be generated, giving us control temporarily back to check for > python signals; > 3. When we have multiple thread, we cannot make this assumption, > so instead we install a timeout to periodically check for signals. > > We want to get rid of timeouts. Now my idea: add a Python API to say: > "dear Python, please call me when you start having pending calls, > even if from a signal handler context, ok?" This seems a reasonable proposal. But it's totally a Python 2.6 thing, so how about taking a deep breath, working on a patch and submitting it when it's ready? Having to wake a process up a few times a second is ugly and annoying, sure, but it is not a release delaying problem. Cheers, mwh -- It is never worth a first class man's time to express a majority opinion. By definition, there are plenty of others to do that. -- G. H. Hardy From ncoghlan at gmail.com Wed Sep 6 12:54:45 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 06 Sep 2006 20:54:45 +1000 Subject: [Python-Dev] inspect.py very slow under 2.5 In-Reply-To: <44FE9F71.3090903@brainbot.com> References: <44FE9F71.3090903@brainbot.com> Message-ID: <44FEA8F5.1000700@gmail.com> Ralf Schmitt wrote: > The problem seems to originate from the module=getmodule(object) in > findsource. If I outcomment that code (or rather do a module=None), > things seem to be back as normal. (linecache.getlines has been called > with a None module in python 2.4's inspect.py). It looks like the problem is the call to getabspath() in getmodule(). This happens every time, even if the file name is already in the modulesbyfile cache. This calls os.path.abspath() and os.path.normpath() every time that inspect.findsource() is called. That can be fixed by having findsource() pass the filename argument to getmodule(), and adding a check of the modulesbyfile cache *before* the call to getabspath(). 
Can you try this patch and see if you get 2.4 level performance back on Fernando's test?: http://www.python.org/sf/1553314 (Assigned to Neal in the hopes of making 2.5rc2) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ralf at brainbot.com Wed Sep 6 13:22:45 2006 From: ralf at brainbot.com (Ralf Schmitt) Date: Wed, 06 Sep 2006 13:22:45 +0200 Subject: [Python-Dev] inspect.py very slow under 2.5 In-Reply-To: <44FEA8F5.1000700@gmail.com> References: <44FE9F71.3090903@brainbot.com> <44FEA8F5.1000700@gmail.com> Message-ID: <44FEAF85.1000107@brainbot.com> Nick Coghlan wrote: > > It looks like the problem is the call to getabspath() in getmodule(). This > happens every time, even if the file name is already in the modulesbyfile > cache. This calls os.path.abspath() and os.path.normpath() every time that > inspect.findsource() is called. > > That can be fixed by having findsource() pass the filename argument to > getmodule(), and adding a check of the modulesbyfile cache *before* the call > to getabspath(). > > Can you try this patch and see if you get 2.4 level performance back on > Fernando's test?: no. this doesn't work. getmodule always iterates over sys.modules.values() and only returns None afterwards. One would have to cache the bad file value, or only inspect new/changed modules from sys.modules. > > http://www.python.org/sf/1553314 > > (Assigned to Neal in the hopes of making 2.5rc2) > > Cheers, > Nick. > From g.brandl at gmx.net Wed Sep 6 14:41:19 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 06 Sep 2006 14:41:19 +0200 Subject: [Python-Dev] Exception message for invalid with statement usage Message-ID: Current trunk: >>> with 1: ... print "1" ... Traceback (most recent call last): File "", line 1, in AttributeError: 'int' object has no attribute '__exit__' Isn't that a bit crude? 
For "for i in 1" there's a better error message, so why shouldn't the above give a TypeError: 'int' object is not a context manager ? Georg From ncoghlan at gmail.com Wed Sep 6 15:06:33 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 06 Sep 2006 23:06:33 +1000 Subject: [Python-Dev] inspect.py very slow under 2.5 In-Reply-To: <44FEAF85.1000107@brainbot.com> References: <44FE9F71.3090903@brainbot.com> <44FEA8F5.1000700@gmail.com> <44FEAF85.1000107@brainbot.com> Message-ID: <44FEC7D9.80500@gmail.com> Ralf Schmitt wrote: > Nick Coghlan wrote: >> >> It looks like the problem is the call to getabspath() in getmodule(). >> This happens every time, even if the file name is already in the >> modulesbyfile cache. This calls os.path.abspath() and >> os.path.normpath() every time that inspect.findsource() is called. >> >> That can be fixed by having findsource() pass the filename argument to >> getmodule(), and adding a check of the modulesbyfile cache *before* >> the call to getabspath(). >> >> Can you try this patch and see if you get 2.4 level performance back >> on Fernando's test?: > > no. this doesn't work. getmodule always iterates over > sys.modules.values() and only returns None afterwards. > One would have to cache the bad file value, or only inspect new/changed > modules from sys.modules. Good point. I modified the patch so it does the latter (it only calls getabspath() again for a module if the value of module.__file__ changes). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Wed Sep 6 15:11:31 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 06 Sep 2006 23:11:31 +1000 Subject: [Python-Dev] Exception message for invalid with statement usage In-Reply-To: References: Message-ID: <44FEC903.7060303@gmail.com> Georg Brandl wrote: > Current trunk: > >>>> with 1: > ... print "1" > ... 
> Traceback (most recent call last): > File "", line 1, in > AttributeError: 'int' object has no attribute '__exit__' > > Isn't that a bit crude? For "for i in 1" there's a better > error message, so why shouldn't the above give a > TypeError: 'int' object is not a context manager The for loop has a nice error message because it starts with its own opcode, but the with statement translates pretty much to the code in PEP 343. There's a special opcode at the end to help with unwinding the stack, but at the start it's just normal attribute retrieval opcodes for __enter__ and __exit__. >>> def f(): ... with 1: ... pass ... >>> dis.dis(f) 2 0 LOAD_CONST 1 (1) 3 DUP_TOP 4 LOAD_ATTR 0 (__exit__) 7 STORE_FAST 0 (_[1]) 10 LOAD_ATTR 1 (__enter__) 13 CALL_FUNCTION 0 16 POP_TOP 17 SETUP_FINALLY 4 (to 24) 3 20 POP_BLOCK 21 LOAD_CONST 0 (None) >> 24 LOAD_FAST 0 (_[1]) 27 DELETE_FAST 0 (_[1]) 30 WITH_CLEANUP 31 END_FINALLY 32 LOAD_CONST 0 (None) 35 RETURN_VALUE Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ralf at brainbot.com Wed Sep 6 16:53:30 2006 From: ralf at brainbot.com (Ralf Schmitt) Date: Wed, 06 Sep 2006 16:53:30 +0200 Subject: [Python-Dev] inspect.py very slow under 2.5 In-Reply-To: <44FEC7D9.80500@gmail.com> References: <44FE9F71.3090903@brainbot.com> <44FEA8F5.1000700@gmail.com> <44FEAF85.1000107@brainbot.com> <44FEC7D9.80500@gmail.com> Message-ID: <44FEE0EA.7000303@brainbot.com> Nick Coghlan wrote: > Ralf Schmitt wrote: >> Nick Coghlan wrote: >>> It looks like the problem is the call to getabspath() in getmodule(). >>> This happens every time, even if the file name is already in the >>> modulesbyfile cache. This calls os.path.abspath() and >>> os.path.normpath() every time that inspect.findsource() is called. 
>>> >>> That can be fixed by having findsource() pass the filename argument to >>> getmodule(), and adding a check of the modulesbyfile cache *before* >>> the call to getabspath(). >>> >>> Can you try this patch and see if you get 2.4 level performance back >>> on Fernando's test?: >> no. this doesn't work. getmodule always iterates over >> sys.modules.values() and only returns None afterwards. >> One would have to cache the bad file value, or only inspect new/changed >> modules from sys.modules. > > Good point. I modified the patch so it does the latter (it only calls > getabspath() again for a module if the value of module.__file__ changes). with _filesbymodname[modname] = file changed to _filesbymodname[modname] = f it seems to work ok. diff -r d41ffd2faa28 inspect.py --- a/inspect.py Wed Sep 06 13:01:12 2006 +0200 +++ b/inspect.py Wed Sep 06 16:52:39 2006 +0200 @@ -403,6 +403,7 @@ def getabsfile(object, _filename=None): return os.path.normcase(os.path.abspath(_filename)) modulesbyfile = {} +_filesbymodname = {} def getmodule(object, _filename=None): """Return the module an object was defined in, or None if not found.""" @@ -410,17 +411,23 @@ def getmodule(object, _filename=None): return object if hasattr(object, '__module__'): return sys.modules.get(object.__module__) + if _filename is not None and _filename in modulesbyfile: + return sys.modules.get(modulesbyfile[_filename]) try: file = getabsfile(object, _filename) except TypeError: return None if file in modulesbyfile: return sys.modules.get(modulesbyfile[file]) - for module in sys.modules.values(): + for modname, module in sys.modules.iteritems(): if ismodule(module) and hasattr(module, '__file__'): + f = module.__file__ + if f == _filesbymodname.get(modname, None): + continue + _filesbymodname[modname] = f f = getabsfile(module) modulesbyfile[f] = modulesbyfile[ - os.path.realpath(f)] = module.__name__ + os.path.realpath(f)] = modname if file in modulesbyfile: return sys.modules.get(modulesbyfile[file]) 
main = sys.modules['__main__'] @@ -444,7 +451,7 @@ def findsource(object): in the file and the line number indexes a line in that list. An IOError is raised if the source code cannot be retrieved.""" file = getsourcefile(object) or getfile(object) - module = getmodule(object) + module = getmodule(object, file) if module: lines = linecache.getlines(file, module.__dict__) else: From guido at python.org Wed Sep 6 17:46:21 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 6 Sep 2006 08:46:21 -0700 Subject: [Python-Dev] Exception message for invalid with statement usage In-Reply-To: References: Message-ID: IMO it's fine. The only time you'll see this in reality is when someone passed you the wrong type of object by mistake, and then the type mentioned in the message is plenty help to debug it. Anyone with even a slight understanding of 'with' knows it involves '__exit__', and the linenumber should be a big fat hint, too. On 9/6/06, Georg Brandl wrote: > Current trunk: > > >>> with 1: > ... print "1" > ... > Traceback (most recent call last): > File "", line 1, in > AttributeError: 'int' object has no attribute '__exit__' > > Isn't that a bit crude? For "for i in 1" there's a better > error message, so why shouldn't the above give a > TypeError: 'int' object is not a context manager > > ? > > Georg > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.peters at gmail.com Wed Sep 6 22:44:31 2006 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 6 Sep 2006 16:44:31 -0400 Subject: [Python-Dev] Cross-platform math functions? 
In-Reply-To: <44FD050F.20901@gmx.de> References: <44FC9C53.5060304@gmx.de> <1f7befae0609041606m13c5c24bm59ce229b27f32e9d@mail.gmail.com> <44FD050F.20901@gmx.de> Message-ID: <1f7befae0609061344x61b1ae87vdd523fceb32a12d7@mail.gmail.com> [Tim Peters] >> Package a Python wrapper and see how popular it becomes. Some reasons >> against trying to standardize on fdlibm were explained here: >> >> http://mail.python.org/pipermail/python-list/2005-July/290164.html [Andreas Raab] > Thanks, these are good points. About speed, do you have any good > benchmarks available? Certainly not for "typical Python use" -- doubt such a benchmark exists. Some people use sqrt once in a blue moon, others make heavy use of many libm functions over millions & millions of floats, and in some apps extremely heavy use is made where speed is everything and accuracy doesn't much matter at all (e.g., gross plotting). I'd ask on numeric Python lists, and (e.g.) people working with visualization. > In my experience fdlibm is quite reasonable for speed in the context of use > by dynamic languages (i.e., counting allocation overheads, lookup and send > performance etc) "Reasonable" for which purpose(s), specifically? Some people would certainly care about a 5% slowdown, while most others wouldn't, but one thing to avoid is pissing off the people who use a thing the most ;-) > but since I'm not a Python expert I'd appreciate some help with realistic > benchmarks. As above, python-dev isn't a likely place to look for such answers. > ... > Agreed. Thus my question if someone had already done this ;-) Not that I know of, although my understanding (which may be wrong) is that glibc's current math functions started as a copy of fdlibm. 
From gustavo at niemeyer.net Thu Sep 7 01:24:23 2006 From: gustavo at niemeyer.net (Gustavo Niemeyer) Date: Wed, 6 Sep 2006 20:24:23 -0300 Subject: [Python-Dev] buildbot breakage Message-ID: <20060906232422.GA8620@niemeyer.net> Some buildbots will fail because they got revision r51793, and it has a change I made to fix a problem in the subprocess module. Please do not rollback any changes. I'm handling the issue. Also notice that there's no broken code there. The problem is that the issue in subprocess is related to stdout/stderr handling, and I'm having trouble making buildbot happy while keeping the new tests in place. I apologise for any inconvenience this may cause. -- Gustavo Niemeyer http://niemeyer.net From gustavo at niemeyer.net Thu Sep 7 01:45:50 2006 From: gustavo at niemeyer.net (Gustavo Niemeyer) Date: Wed, 6 Sep 2006 20:45:50 -0300 Subject: [Python-Dev] buildbot breakage In-Reply-To: <20060906232422.GA8620@niemeyer.net> References: <20060906232422.GA8620@niemeyer.net> Message-ID: <20060906234550.GA9265@niemeyer.net> > Some buildbots will fail because they got revision r51793, and it > has a change I made to fix a problem in the subprocess module. I've removed the offending test in r51794 and buildbots should be happy again. One of the ways of exploring the issue reported is using sys.stdout as the stdout keyword, such as: subprocess.call([...], stdout=sys.stdout) it breaks because it ends up closing one of the standard descriptors of the subprocess. Unfortunately we can't test it that way because buildbot uses a StringIO in sys.stdout. I kept the test which uses stdout=1, and removed the one expecting sys.stdout to be a "normal" file. 
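To illustrate why a sys.stdout-based test cannot run when sys.stdout has been replaced by a StringIO: subprocess needs a real file descriptor for the child's stdout, and a StringIO has none. The sketch below is written for current Python 3 rather than 2.5, and `has_real_fd` and `run_child` are hypothetical helpers, not part of subprocess or of the removed test.

```python
import io
import subprocess
import sys


def has_real_fd(stream):
    """Return True if stream is backed by an OS-level file descriptor."""
    try:
        stream.fileno()
    except (AttributeError, OSError, ValueError):
        # io.StringIO raises io.UnsupportedOperation, which is both an
        # OSError and a ValueError; other fakes may lack fileno() entirely.
        return False
    return True


def run_child(stdout):
    """Run a trivial child process with the given stdout; return its exit code."""
    cmd = [sys.executable, "-c", "print('hello from child')"]
    return subprocess.call(cmd, stdout=stdout)
```

Calling run_child(1) passes the descriptor number directly, so it does not depend on what object sys.stdout happens to be; has_real_fd(io.StringIO()) shows why the sys.stdout variant is not portable to a harness that captures output in memory.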
Sorry for the trouble, -- Gustavo Niemeyer http://niemeyer.net From python-dev at zesty.ca Thu Sep 7 05:38:07 2006 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Wed, 6 Sep 2006 22:38:07 -0500 (CDT) Subject: [Python-Dev] new security doc using object-capabilities In-Reply-To: References: Message-ID: Hi Brett, Here are some comments on your proposal. Sorry this took so long. I apologize if any of these comments are out of date (but also look forward to your answers to some of the questions, as they'll help me understand some more of the details of your proposal). Thanks! > Introduction > /////////////////////////////////////// [...] > Throughout this document several terms are going to be used. A > "sandboxed interpreter" is one where the built-in namespace is not the > same as that of an interpreter whose built-ins were unaltered, which > is called an "unprotected interpreter". Is this a definition or an implementation choice? As in, are you defining "sandboxed" to mean "with altered built-ins" or just "restricted in some way", and does the above mean to imply that altering the built-ins is what triggers other kinds of restrictions (as it did in Python's old restricted execution mode)? > A "bare interpreter" is one where the built-in namespace has been > stripped down the bare minimum needed to run any form of basic Python > program. This means that all atomic types (i.e., syntactically > supported types), ``object``, and the exceptions provided by the > ``exceptions`` module are considered in the built-in namespace. There > have also been no imports executed in the interpreter. Is a "bare interpreter" just one example of a sandboxed interpreter, or are all sandboxed interpreters in your design initially bare (i.e. "sandboxed" = "bare" + zero or more granted authorities)? > The "security domain" is the boundary at which security is cared > about. For this dicussion, it is the interpreter. 
It might be clearer to say (if i understand correctly) "Each interpreter
is a separate security domain." Many interpreters can run within a
single operating system process, right? Could you say a bit about what
sort of concurrency model you have in mind? How would this interact (if
at all) with use of the existing threading functionality?

> The "powerbox" is the thing that possesses the ultimate power in the
> system. In our case it is the Python process.

This could also be the application process, right?

> Rationale
> ///////////////////////////////////////
[...]
> For instance, think of an application that supports a plug-in system
> with Python as the language used for writing plug-ins. You do not
> want to have to examine every plug-in you download to make sure that
> it does not alter your filesystem if you can help it. With a proper
> security model and implementation in place this hindrance of having
> to examine all code you execute should be alleviated.

I'm glad to have this use case set out early in the document, so the
reader can keep it in mind as an example while reading about the model.

> Approaches to Security
> ///////////////////////////////////////
>
> There are essentially two types of security: who-I-am
> (permissions-based) security and what-I-have (authority-based)
> security.

As Mark Miller mentioned in another message, your descriptions of
"who-I-am" security and "what-I-have" security make sense, but they
don't correspond to "permission" vs. "authority". They correspond to
"identity-based" vs. "authority-based" security.

> Difficulties in Python for Object-Capabilities
> //////////////////////////////////////////////
[...]
> Three key requirements for providing a proper perimeter defence are
> private namespaces, immutable shared state across domains, and
> unforgeable references.

Nice summary.

> Problem of No Private Namespace
> ===============================
[...]
> The Python language has no such thing as a private namespace.
Don't local scopes count as private namespaces? It seems clear that
they aren't designed with the intention of being exposed, unlike other
namespaces in Python.

> It also makes providing security at the object level using
> object-capabilities non-existent in pure Python code.

I don't think this is necessarily the case. No Python code i've ever
seen expects to be able to invade the local scopes of other functions,
so you could use them as private namespaces. There are two ways i've
seen to invade local scopes:

    (a) Use gc.get_referents to get back from a cell object
        to its contents.

    (b) Compare the cell object to another cell object, thereby
        causing __eq__ to be invoked to compare the contents of
        the cells.

So you could protect local scopes by prohibiting these or by simply
turning off access to func_closure. It's clear that hardly any code
depends on these introspection features, so it would be reasonable to
turn them off in a sandboxed interpreter. (It seems you would have to
turn off some introspection features anyway in order to have reliable
import guards.)

> Problem of Mutable Shared State
> ===============================
[...]
> Regardless, sharing of state that can be influenced by another
> interpreter is not safe for object-capabilities.

Yup.

> Threat Model
> ///////////////////////////////////////

Good to see this specified here. I like the way you've broken this
down.

> * An interpreter cannot gain abilities the Python process possesses
>   without explicitly being given those abilities.

It would be good to enumerate which abilities you're referring to in
this item. For example, a bare interpreter should be able to allocate
memory and call most of the built-in functions, but should not be able
to open network connections.

> * An interpreter cannot influence another interpreter directly at the
>   Python level without explicitly allowing it.

You mean, without some other entity explicitly allowing it, right?
What would that other entity be -- presumably the interpreter that spawned both of these sub-interpreters? > * An interpreter cannot use operating system resources without being > explicitly given those resources. Okay. > * A bare Python interpreter is always trusted. What does "trusted" mean in the above? > * Python bytecode is always distrusted. > * Pure Python source code is always safe on its own. It would be helpful to clarify "safe" here. I assume by "safe" you mean that the Python source code can express whatever it wants, including potentially dangerous activities, but when run in a bare or sandboxed interpreter it cannot have harmful effects. But then in what sense does the "safety" have to do with the Python source code rather than the restrictions on the interpreter? Would it be correct to say: + We want to guarantee that Python source code cannot violate the restrictions in a restricted or bare interpreter. + We do not prevent arbitrary Python bytecode from violating these restrictions, and assume that it can. > + Malicious abilities are derived from C extension modules, > built-in modules, and unsafe types implemented in C, not from > pure Python source. By "malicious" do you just mean "anything that isn't accessible to a bare interpreter"? > * A sub-interpreter started by another interpreter does not inherit > any state. Do you envision a tree of interpreters and sub-interpreters? Can the levels of spawning get arbitrarily deep? If i am visualizing your model correctly, maybe it would be useful to introduce the term "parent", where each interpreter has as its parent either the Python process or another interpreter. Then you could say that each interpreter acquires authority only by explicit granting from its parent. Then i have another question: can an interpreter acquire authorities only when it is started, or can it acquire them while it is running, and how? 
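One way to picture the "parent" relationship suggested above, as a toy model only. `Interp`, `spawn` and the authority names are invented for illustration and are not part of the proposal or of any Python API. Authorities here are fixed when the child is created, which is just one possible answer to the question of when they can be acquired.

```python
class Interp:
    """Toy model of one interpreter in a parent/child tree (hypothetical)."""

    def __init__(self, name, authorities, parent=None):
        self.name = name
        self.parent = parent
        self.authorities = frozenset(authorities)

    def spawn(self, name, grant=()):
        """Create a child that holds only what this interpreter grants it."""
        grant = frozenset(grant)
        missing = grant - self.authorities
        if missing:
            # A parent cannot hand out an authority it does not hold itself,
            # so authority can never escalate going down the tree.
            raise PermissionError(
                "%s cannot grant %s" % (self.name, sorted(missing)))
        return Interp(name, grant, parent=self)


# The Python process is the root of the tree.
process = Interp("process", {"files", "network", "import_c"})
plugin_host = process.spawn("plugin_host", grant={"files"})
plugin = plugin_host.spawn("plugin")  # bare: no authorities at all
```

In this model the levels of spawning can be arbitrarily deep, and each interpreter's authorities are always a subset of its parent's.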
> Implementation > /////////////////////////////////////// > > Guiding Principles > ======================== > > To begin, the Python process garners all power as the powerbox. It is > up to the process to initially hand out access to resources and > abilities to interpreters. This might take the form of an interpreter > with all abilities granted (i.e., a standard interpreter as launched > when you execute Python), which then creates sub-interpreters with > sandboxed abilities. Another alternative is only creating > interpreters with sandboxed abilities (i.e., Python being embedded in > an application that only uses sandboxed interpreters). This sounds like part of your design to me. It might help to have this earlier in the document (maybe even with an example diagram of a tree of interpreters). > All security measures should never have to ask who an interpreter is. > This means that what abilities an interpreter has should not be stored > at the interpreter level when the security can use a proxy to protect > a resource. This means that while supporting a memory cap can > have a per-interpreter setting that is checked (because access to the > operating system's memory allocator is not supported at the program > level), protecting files and imports should not such a per-interpreter > protection at such a low level (because those can have extension > module proxies to provide the security). It might be good to declare two categories of resources -- those protected by object hiding and those protected by a per-interpreter setting -- and make lists. > Backwards-compatibility will not be a hindrance upon the design or > implementation of the security model. Because the security model will > inherently remove resources and abilities that existing code expects, > it is not reasonable to expect existing code to work in a sandboxed > interpreter. You might qualify the last statement a bit. For example, a Python implementation of a pure algorithm (e.g. 
string processing, data compression, etc.) would still work in a sandboxed interpreter. > Keeping Python "pythonic" is required for all design decisions. As Lawrence Oluyede also mentioned, it would be helpful to say a little more about what "pythonic" means. > Restricting what is in the built-in namespace and the safe-guarding > the interpreter (which includes safe-guarding the built-in types) is > where security will come from. Sounds good. > Abilities of a Standard Sandboxed Interpreter > ============================================= > [...] > * You cannot open any files directly. > * Importation > + You can import any pure Python module. > + You cannot import any Python bytecode module. > + You cannot import any C extension module. > + You cannot import any built-in module. > * You cannot find out any information about the operating system you > are running on. > * Only safe built-ins are provided. This looks reasonable. This is probably a good place to itemize exactly which built-ins are considered safe. > Imports > ------- > > A proxy for protecting imports will be provided. This is done by > setting the ``__import__()`` function in the built-in namespace of the > sandboxed interpreter to a proxied version of the function. > > The planned proxy will take in a passed-in function to use for the > import and a whitelist of C extension modules and built-in modules to > allow importation of. Presumably these are passed in to the proxy's constructor. > If an import would lead to loading an extension > or built-in module, it is checked against the whitelist and allowed > to be imported based on that list. All .pyc and .pyo file will not > be imported. All .py files will be imported. I'm unclear about this. Is the whitelist a list of module names only, or of filenames with extensions? Does the normal path-searching process take place or can it be restricted in some way? 
Would it simplify the security analysis to have the whitelist be a dictionary that maps module names to absolute pathnames? If both the .py and .pyc are present, the normal import would find the .pyc file; would the import proxy reject such an import or ignore it and recompile the .py instead? > It must be warned that importing any C extension module is dangerous. Right. > Implementing Import in Python > +++++++++++++++++++++++++++++ > > To help facilitate in the exposure of more of what importation > requires (and thus make implementing a proxy easier), the import > machinery should be rewritten in Python. This seems like a good idea. Can you identify which minimum essential pieces of the import machinery have to be written in C? > Sanitizing Built-In Types > ------------------------- [...] > Constructors > ++++++++++++ > > Almost all of Python's built-in types > contain a constructor that allows code to create a new instance of a > type as long as you have the type itself. Unfortunately this does not > work in an object-capabilities system without either providing a proxy > to the constructor or just turning it off. The existence of the constructor isn't (by itself) the problem. The problem is that both of the following are true: (a) From any object you can get its type object. (b) Using any type object you can construct a new instance. So, you can control this either by hiding the type object, separating the constructor from the type, or disabling the constructor. > Types whose constructors are considered dangerous are: > > * ``file`` > + Will definitely use the ``open()`` built-in. > * code objects > * XXX sockets? > * XXX type? > * XXX Looks good so far. Not sure i see what's dangerous about 'type'. > Filesystem Information > ++++++++++++++++++++++ > > When running code in a sandboxed interpreter, POLA suggests that you > do not want to expose information about your environment on top of > protecting its use. 
This means that filesystem paths typically should > not be exposed. Unfortunately, Python exposes file paths all over the > place: > > * Modules > + ``__file__`` attribute > * Code objects > + ``co_filename`` attribute > * Packages > + ``__path__`` attribute > * XXX > > XXX how to expose safely? It seems that in most cases, a single Python object is associated with a single pathname. If that's true in general, one solution would be to provide an introspection function named 'getpath' or something similar that would get the path associated with any object. This function might go in a module containing all the introspection functions, so imports of that module could be easily restricted. > Mutable Shared State > ++++++++++++++++++++ > > Because built-in types are shared between interpreters, they cannot > expose any mutable shared state. Unfortunately, as it stands, some > do. Below is a list of types that share some form of dangerous state, > how they share it, and how to fix the problem: > > * ``object`` > + ``__subclasses__()`` function > - Remove the function; never seen used in real-world code. > * XXX Okay, more to work out here. :) > Perimeter Defences Between a Created Interpreter and Its Creator > ---------------------------------------------------------------- > > The plan is to allow interpreters to instantiate sandboxed > interpreters safely. By using the creating interpreter's abilities to > provide abilities to the created interpreter, you make sure there is > no escalation in abilities. Good. > * ``__del__`` created in sandboxed interpreter but object is cleaned > up in unprotected interpreter. How do you envision the launching of a sandboxed interpreter to look? Could you sketch out some rough code examples? 
Were you thinking of something like: sys.spawn(code, dict) code: a string containing Python source code dict: the global namespace in which to run the code If you allow the parent interpreter to pass mutable objects into the child interpreter, then the parent and child can already communicate via the object, so '__del__' is a moot issue. Do you want to prevent all communication between parent and child? It's not obvious to me why that would be necessary. > * Using frames to walk the frame stack back to another interpreter. Could you just disable introspection of the frame stack? > Making the ``sys`` Module Safe > ------------------------------ [...] > This means that the ``sys`` module needs to have its safe information > separated out from the unsafe settings. Yes. > XXX separate modules, ``sys.settings`` and ``sys.info``, or strip > ``sys`` to settings and put info somewhere else? Or provide a method > that will create a faked sys module that has the safe values copied > into it? I think the last suggestion above would lead to confusion. The two groups should have two distinct names and it should be clear which attribute goes with which group. > Protecting I/O > ++++++++++++++ > > The ``print`` keyword and the built-ins ``raw_input()`` and > ``input()`` use the values stored in ``sys.stdout`` and ``sys.stdin``. > By exposing these attributes to the creating interpreter, one can set > them to safe objects, such as instances of ``StringIO``. Sounds good. > Safe Networking > --------------- > > XXX proxy on socket module, modify open() to be the constructor, etc. Lots more to think about here. :) > Protecting Memory Usage > ----------------------- > > To protect memory, low-level hooks into the memory allocator for > Python is needed. By hooking into the C API for memory allocation and > deallocation a very rough running count of used memory can kept. 
This > can be used to prevent sandboxed interpreters from using so much > memory that it impacts the overall performance of the system. Preventing denial-of-service is in general quite difficult, but i applaud the attempt. I agree with your decision to separate this work from the rest of the security model. -- ?!ng From nnorwitz at gmail.com Thu Sep 7 09:28:39 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Thu, 7 Sep 2006 00:28:39 -0700 Subject: [Python-Dev] 2.5 status In-Reply-To: References: <44FDD122.3000809@egenix.com> Message-ID: On 9/5/06, Brett Cannon wrote: > > > [MAL] > > The proper fix would be to introduce a tp_unicode slot and let > > this decide what to do, ie. call .__unicode__() methods on instances > > and use the .__name__ on classes. > > That was my bug reaction and what I said on the bug report. Kind of > surprised one doesn't already exist. > > > I think this would be the right way to go for Python 2.6. For > > Python 2.5, just dropping this .__unicode__ method on exceptions > > is probably the right thing to do. > > Neal, do you want to rip it out or should I? Is removing __unicode__ backwards compatible with 2.4 for both instances and exception classes? Does everyone agree this is the proper approach? I'm not familiar with this code. Brett, if everyone agrees (ie, remains silent), please fix this and add tests and a NEWS entry. Everyone should be looking for incompatibilities with previous versions. Exceptions are new and deserve special attention. Lots of the internals of strings (8-bit and unicode) and the struct module changed and should be tested thoroughly. I'm sure there are a bunch of other things I'm not remembering. The compiler is also an obvious target to verify your code still works. We're stuck with anything that makes it into 2.5, so now is the time to fix these problems. 
n From ronaldoussoren at mac.com Thu Sep 7 11:17:37 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Thu, 7 Sep 2006 11:17:37 +0200 Subject: [Python-Dev] 2.5 status In-Reply-To: References: Message-ID: On 5-sep-2006, at 6:24, Neal Norwitz wrote: > There are 3 bugs currently listed in PEP 356 as blocking: > http://python.org/sf/1551432 - __unicode__ breaks on > exception classes > http://python.org/sf/1550938 - improper exception w/ > relative import > http://python.org/sf/1541697 - sgmllib regexp bug causes hang > > Does anyone want to fix the sgmlib issue? If not, we should revert > this week before c2 is cut. I'm hoping that we will have *no changes* > in 2.5 final from c2. Should there be any bugs/patches added to or > removed from the list? > > The buildbots are currently humming along, but I believe all 3 > versions (2.4, 2.5, and 2.6) are fine. > > Test out 2.5c1+ and report all bugs! I have another bug that I'd like to fix: Mac/ReadMe contains an error: it claims that you can build the frameworkinstall into a temporary directory and then move it into place, but that isn't actually true. The erroneous paragraph is this: Note that there are no references to the actual locations in the code or resource files, so you are free to move things around afterwards. For example, you could use --enable-framework=/tmp/newversion/Library/ Frameworks and use /tmp/newversion as the basis for an installer or something. My proposed fix is to drop this paragraph. There is no bugreport for this yet, I got notified of this issue in a private e-mail. Ronald From nnorwitz at gmail.com Thu Sep 7 11:19:35 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Thu, 7 Sep 2006 02:19:35 -0700 Subject: [Python-Dev] 2.5 status In-Reply-To: References: Message-ID: Doc patches are fine, please fix. 
n -- On 9/7/06, Ronald Oussoren wrote: > > On 5-sep-2006, at 6:24, Neal Norwitz wrote: > > > There are 3 bugs currently listed in PEP 356 as blocking: > > http://python.org/sf/1551432 - __unicode__ breaks on > > exception classes > > http://python.org/sf/1550938 - improper exception w/ > > relative import > > http://python.org/sf/1541697 - sgmllib regexp bug causes hang > > > > Does anyone want to fix the sgmlib issue? If not, we should revert > > this week before c2 is cut. I'm hoping that we will have *no changes* > > in 2.5 final from c2. Should there be any bugs/patches added to or > > removed from the list? > > > > The buildbots are currently humming along, but I believe all 3 > > versions (2.4, 2.5, and 2.6) are fine. > > > > Test out 2.5c1+ and report all bugs! > > I have another bug that I'd like to fix: Mac/ReadMe contains an > error: it claims that you can build the frameworkinstall into a > temporary directory and then move it into place, but that isn't > actually true. The erroneous paragraph is this: > > Note that there are no references to the actual locations in the > code or > resource files, so you are free to move things around afterwards. > For example, > you could use --enable-framework=/tmp/newversion/Library/ > Frameworks and use > /tmp/newversion as the basis for an installer or something. > > My proposed fix is to drop this paragraph. There is no bugreport for > this yet, I got notified of this issue in a private e-mail. > > Ronald > From ncoghlan at gmail.com Thu Sep 7 12:59:01 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 07 Sep 2006 20:59:01 +1000 Subject: [Python-Dev] inspect.py very slow under 2.5 In-Reply-To: <44FEE0EA.7000303@brainbot.com> References: <44FE9F71.3090903@brainbot.com> <44FEA8F5.1000700@gmail.com> <44FEAF85.1000107@brainbot.com> <44FEC7D9.80500@gmail.com> <44FEE0EA.7000303@brainbot.com> Message-ID: <44FFFB75.3030903@gmail.com> Ralf Schmitt wrote: > Nick Coghlan wrote: >> Good point. 
I modified the patch so it does the latter (it only calls >> getabspath() again for a module if the value of module.__file__ changes). > > with _filesbymodname[modname] = file changed to _filesbymodname[modname] > = f > it seems to work ok. I checked the inspect module unit tests and discovered the test for this function was only covering one of the half dozen possible execution paths. I've updated the patch on SF, and committed the fix (including PJE's and Neal's comments) to the trunk. I'll backport it tomorrow night (assuming I don't hear any objections in the meantime :). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From murman at gmail.com Thu Sep 7 15:37:41 2006 From: murman at gmail.com (Michael Urman) Date: Thu, 7 Sep 2006 08:37:41 -0500 Subject: [Python-Dev] Change in file() behavior in 2.5 Message-ID: Hi folks, Between 2.4 and 2.5 the behavior of file or open with the mode 'wU' has changed. In 2.4 it silently works. In 2.5 it raises a ValueError. I can't find any more discussion on it in python-dev than tangential mentions in this thread: http://mail.python.org/pipermail/python-dev/2006-June/065939.html It is (buried) in NEWS. First I found: Bug #1462152: file() now checks more thoroughly for invalid mode strings and removes a possible "U" before passing the mode to the C library function. Which seems to imply different behavior than the actual entry: bug #967182: disallow opening files with 'wU' or 'aU' as specified by PEP 278. I don't see anything in PEP 278 about a timeline, and wanted to make sure that transitioning directly from working to raising an error was a desired change. This actually caught a bug in an application I work with, which used an explicit 'wU', that will currently stop working when people upgrade Python but not our application.
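For anyone auditing their own code before upgrading, here is a minimal sketch of the rule the second NEWS entry describes: 'U' combined with a writing mode is rejected. `check_mode` is a hypothetical helper written for illustration, not the actual file() code, and the error text is invented.

```python
def check_mode(mode):
    """Reject universal-newline mode combined with a writing mode.

    Hypothetical helper: it mimics the rule from the NEWS entry
    ('wU' and 'aU' are disallowed), not the real file() implementation.
    """
    writes = any(flag in mode for flag in "wa+")
    if "U" in mode and writes:
        # Assumed wording; the interpreter's own message may differ.
        raise ValueError("universal newline mode cannot be combined "
                         "with a writing mode: %r" % (mode,))
    return mode
```

Running every literal mode string in an application through a check like this would flag an explicit 'wU' before users hit the ValueError.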
Thanks, Michael -- Michael Urman http://www.tortall.net/mu/blog From mwh at python.net Thu Sep 7 16:15:35 2006 From: mwh at python.net (Michael Hudson) Date: Thu, 07 Sep 2006 15:15:35 +0100 Subject: [Python-Dev] Change in file() behavior in 2.5 In-Reply-To: (Michael Urman's message of "Thu, 7 Sep 2006 08:37:41 -0500") References: Message-ID: <2m4pvjoj94.fsf@starship.python.net> "Michael Urman" writes: > Hi folks, > > Between 2.4 and 2.5 the behavior of file or open with the mode 'wU' > has changed. In 2.4 it silently works. in 2.5 it raises a ValueError. > I can't find any more discussion on it in python-dev than tangential > mentions in this thread: > http://mail.python.org/pipermail/python-dev/2006-June/065939.html > > It is (buried) in NEWS. First I found: > Bug #1462152: file() now checks more thoroughly for invalid mode > strings and removes a possible "U" before passing the mode to the > C library function. > Which seems to imply different behavior than the actual entry: > bug #967182: disallow opening files with 'wU' or 'aU' as specified by PEP > 278. > > I don't see anything in pep278 about a timeline, and wanted to make > sure that transitioning directly from working to raising an error was > a desired change. That it was silently ignored was never intentional; it was a bug and it was fixed. I don't think having a release with deprecation warnings and so on is worth it. > This actually caught a bug in an application I work with, which used > an explicit 'wU', that will currently stop working when people > upgrade Python but not our application. I would hope they wouldn't do that without careful testing anyway. Cheers, mwh -- No. In fact, my eyeballs fell out just from reading this question, so it's a good thing I can touch-type. 
-- John Baez, sci.physics.research From fperez.net at gmail.com Thu Sep 7 17:31:20 2006 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 07 Sep 2006 09:31:20 -0600 Subject: [Python-Dev] inspect.py very slow under 2.5 References: <44FE9F71.3090903@brainbot.com> <44FEA8F5.1000700@gmail.com> <44FEAF85.1000107@brainbot.com> <44FEC7D9.80500@gmail.com> <44FEE0EA.7000303@brainbot.com> <44FFFB75.3030903@gmail.com> Message-ID: Nick Coghlan wrote: > I've updated the patch on SF, and committed the fix (including PJE's and > Neal's comments) to the trunk. > > I'll backport it tomorrow night (assuming I don't hear any objections in the > meantime :). I just wanted to thank you all for taking the time to work on this, even with my 11-th hour report. Greatly appreciated, really. Looking forward to 2.5! f From grig.gheorghiu at gmail.com Thu Sep 7 17:34:17 2006 From: grig.gheorghiu at gmail.com (Grig Gheorghiu) Date: Thu, 7 Sep 2006 08:34:17 -0700 Subject: [Python-Dev] 'with' bites Twisted Message-ID: <3f09d5a00609070834m35694c34u5af582dff3aa5bb4@mail.gmail.com> When the pybot buildslave for Twisted is trying to run the Twisted test suite via 'trial', it gets an exception:

Traceback (most recent call last):
  File "/tmp/Twisted/bin/trial", line 23, in <module>
    from twisted.scripts.trial import run
  File "/tmp/Twisted/twisted/scripts/trial.py", line 10, in <module>
    from twisted.application import app
  File "/tmp/Twisted/twisted/application/app.py", line 10, in <module>
    from twisted.application import service
  File "/tmp/Twisted/twisted/application/service.py", line 20, in <module>
    from twisted.python import components
  File "/tmp/Twisted/twisted/python/components.py", line 37, in <module>
    from zope.interface.adapter import AdapterRegistry
  File "/tmp/python-buildbot/local/lib/python2.6/site-packages/zope/interface/adapter.py", line 201
    for with, objects in v.iteritems():
           ^
SyntaxError: invalid syntax

So the culprit in this case is really zope.interface.
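The failure is easy to reproduce in isolation. This is a minimal sketch, assuming an interpreter where `with` is a reserved word (as on the trunk in the log above); the `compiles` helper and the renamed variable `with_` are illustrative, not taken from zope.interface.

```python
import keyword

# The offending line from the traceback, with .items() so the same
# text also makes sense on current interpreters.
SOURCE = "for with, objects in v.items():\n    pass\n"

def compiles(source):
    """Return True if source compiles, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True

# 'with' is reserved, so it cannot be used as a loop variable.
is_reserved = keyword.iskeyword("with")
broken = compiles(SOURCE)
# Renaming the variable (hypothetical fix) makes the line valid again.
fixed = compiles(SOURCE.replace("with", "with_"))
```

Checking identifiers against `keyword.iskeyword` on the target version is a cheap way for a project to find this kind of breakage before a buildbot does.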
The full log is here: http://www.python.org/dev/buildbot/community/all/x86%20RedHat%209%20trunk/builds/97/step-shell/0 Grig -- http://agiletesting.blogspot.com From exarkun at divmod.com Thu Sep 7 18:06:13 2006 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Thu, 7 Sep 2006 12:06:13 -0400 Subject: [Python-Dev] [Twisted-Python] Newbie question In-Reply-To: <972ec5bd0609070841j6cd2a600o4c6db5567dffd790@mail.gmail.com> Message-ID: <20060907160613.1717.1053187541.divmod.quotient.42002@ohm> On Thu, 7 Sep 2006 11:41:48 -0400, Timothy Fitz wrote: >On 9/5/06, Jean-Paul Calderone wrote: >>You cannot stop the reactor and then start it again. > >Why don't the reactors throw if this happens? This question comes up >almost once a month. > One could just as easily ask why no one bothers to read mailing list archives to see if their question has been answered before. No one will ever know, it is just one of the mysteries of the universe. Jean-Paul From aahz at pythoncraft.com Thu Sep 7 18:22:17 2006 From: aahz at pythoncraft.com (Aahz) Date: Thu, 7 Sep 2006 09:22:17 -0700 Subject: [Python-Dev] [Twisted-Python] Newbie question In-Reply-To: <20060907160613.1717.1053187541.divmod.quotient.42002@ohm> References: <972ec5bd0609070841j6cd2a600o4c6db5567dffd790@mail.gmail.com> <20060907160613.1717.1053187541.divmod.quotient.42002@ohm> Message-ID: <20060907162217.GA17623@panix.com> On Thu, Sep 07, 2006, Jean-Paul Calderone wrote: > On Thu, 7 Sep 2006 11:41:48 -0400, Timothy Fitz wrote: >>On 9/5/06, Jean-Paul Calderone wrote: >>> >>>You cannot stop the reactor and then start it again. >> >>Why don't the reactors throw if this happens? This question comes up >>almost once a month. > > One could just as easily ask why no one bothers to read mailing list > archives to see if their question has been answered before.
> > No one will ever know, it is just one of the mysteries of the universe. One could also ask why this got x-posted to python-dev... -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ I support the RKAB From skip at pobox.com Thu Sep 7 18:31:42 2006 From: skip at pobox.com (skip at pobox.com) Date: Thu, 7 Sep 2006 11:31:42 -0500 Subject: [Python-Dev] [Twisted-Python] Newbie question In-Reply-To: <20060907160613.1717.1053187541.divmod.quotient.42002@ohm> References: <972ec5bd0609070841j6cd2a600o4c6db5567dffd790@mail.gmail.com> <20060907160613.1717.1053187541.divmod.quotient.42002@ohm> Message-ID: <17664.18798.756868.339094@montanaro.dyndns.org> Jean-Paul> One could just as easily ask why no one bothers to read Jean-Paul> mailing list archives to see if their question has been Jean-Paul> answered before. Jean-Paul> No one will ever know, it is just one of the mysteries of the Jean-Paul> universe. +1 QOTF... Skip From exarkun at divmod.com Thu Sep 7 18:36:00 2006 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Thu, 7 Sep 2006 12:36:00 -0400 Subject: [Python-Dev] [Twisted-Python] Newbie question In-Reply-To: <20060907162217.GA17623@panix.com> Message-ID: <20060907163600.1717.1300037898.divmod.quotient.42020@ohm> Sorry, brainfart. Jean-Paul From kristjan at ccpgames.com Thu Sep 7 18:56:15 2006 From: kristjan at ccpgames.com (Kristján V. Jónsson) Date: Thu, 7 Sep 2006 16:56:15 -0000 Subject: [Python-Dev] Unicode Imports Message-ID: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> Hello All. I just added patch 1552880 to sourceforge. It is a patch for 2.6 (and 2.5) which allows unicode paths in sys.path and uses the unicode file api on windows. This is tried and tested on 2.5, and backported to 2.3 and is currently running on clients in china and elsewhere. It is minimally intrusive to the importing mechanism, at the cost of some string conversion overhead (to utf8 and then back to unicode).
Cheers, Kristján From skip at pobox.com Thu Sep 7 19:23:39 2006 From: skip at pobox.com (skip at pobox.com) Date: Thu, 7 Sep 2006 12:23:39 -0500 Subject: [Python-Dev] [Twisted-Python] Newbie question In-Reply-To: <20060907163600.1717.1300037898.divmod.quotient.42020@ohm> References: <20060907162217.GA17623@panix.com> <20060907163600.1717.1300037898.divmod.quotient.42020@ohm> Message-ID: <17664.21915.553226.875941@montanaro.dyndns.org> Jean-Paul> Sorry, brainfart. But still... QOTF ;-) S From amk at amk.ca Thu Sep 7 19:39:00 2006 From: amk at amk.ca (A.M. Kuchling) Date: Thu, 7 Sep 2006 13:39:00 -0400 Subject: [Python-Dev] Arlington sprints to occur monthly Message-ID: <20060907173900.GA4691@rogue.amk.ca> Jeffrey Elkner has arranged things so that the 1-day Python sprints in Arlington VA will now be happening every month. Future sprints will be on September 23rd, October 21st, November 18th, and December 16th. See http://wiki.python.org/moin/ArlingtonSprint for directions and to sign up. --amk From brett at python.org Thu Sep 7 19:39:20 2006 From: brett at python.org (Brett Cannon) Date: Thu, 7 Sep 2006 10:39:20 -0700 Subject: [Python-Dev] 2.5 status In-Reply-To: References: <44FDD122.3000809@egenix.com> Message-ID: On 9/7/06, Neal Norwitz wrote: > > On 9/5/06, Brett Cannon wrote: > > > > > [MAL] > > > The proper fix would be to introduce a tp_unicode slot and let > > > this decide what to do, ie. call .__unicode__() methods on instances > > > and use the .__name__ on classes. > > > > That was my bug reaction and what I said on the bug report. Kind of > > surprised one doesn't already exist. > > > > > I think this would be the right way to go for Python 2.6. For > > > Python 2.5, just dropping this .__unicode__ method on exceptions > > > is probably the right thing to do.
> > > > Neal, do you want to rip it out or should I? > > Is removing __unicode__ backwards compatible with 2.4 for both > instances and exception classes? Should be. There was no proper __unicode__() originally so that's why this whole problem came up in the first place. > Does everyone agree this is the proper approach? I'm not familiar > with this code. I am not terribly anymore either since Georg and Richard rewrote the whole thing. =) > Brett, if everyone agrees (ie, remains silent), > please fix this and add tests and a NEWS entry. OK. -Brett From anthony at interlink.com.au Thu Sep 7 19:53:03 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 8 Sep 2006 03:53:03 +1000 Subject: [Python-Dev] Unicode Imports In-Reply-To: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> Message-ID: <200609080353.07502.anthony@interlink.com.au> On Friday 08 September 2006 02:56, Kristján V. Jónsson wrote: > Hello All. > I just added patch 1552880 to sourceforge. It is a patch for 2.6 (and 2.5) > which allows unicode paths in sys.path and uses the unicode file api on > windows. This is tried and tested on 2.5, and backported to 2.3 and is > currently running on clients in china and elsewhere. It is minimally > intrusive to the importing mechanism, at the cost of some string conversion > overhead (to utf8 and then back to unicode). As this can't be considered a bugfix (that I can see), I'd be against it being checked into 2.5.
From brett at python.org Thu Sep 7 20:26:53 2006 From: brett at python.org (Brett Cannon) Date: Thu, 7 Sep 2006 11:26:53 -0700 Subject: [Python-Dev] new security doc using object-capabilities In-Reply-To: References: Message-ID: On 9/6/06, Ka-Ping Yee wrote: > > Hi Brett, > > Here are some comments on your proposal. Sorry this took so long. > I apologize if any of these comments are out of date (but also look > forward to your answers to some of the questions, as they'll help > me understand some more of the details of your proposal). Thanks! I think they are slightly outdated. The latest version of the doc is in the bcannon-objcap branch and is named securing_python.txt ( http://svn.python.org/view/python/branches/bcannon-objcap/securing_python.txt ). > Introduction > > /////////////////////////////////////// > [...] > > Throughout this document several terms are going to be used. A > > "sandboxed interpreter" is one where the built-in namespace is not the > > same as that of an interpreter whose built-ins were unaltered, which > > is called an "unprotected interpreter". > > Is this a definition or an implementation choice? As in, are you > defining "sandboxed" to mean "with altered built-ins" or just > "restricted in some way", and does the above mean to imply that > altering the built-ins is what triggers other kinds of restrictions > (as it did in Python's old restricted execution mode)? There is no "triggering" of other restrictions. This is an implementation choice. "Sandboxed" means "with altered built-ins". > A "bare interpreter" is one where the built-in namespace has been > > stripped down the bare minimum needed to run any form of basic Python > > program. This means that all atomic types (i.e., syntactically > > supported types), ``object``, and the exceptions provided by the > > ``exceptions`` module are considered in the built-in namespace. There > > have also been no imports executed in the interpreter. 
> > Is a "bare interpreter" just one example of a sandboxed interpreter, > or are all sandboxed interpreters in your design initially bare (i.e. > "sandboxed" = "bare" + zero or more granted authorities)? You build up from a bare interpreter by adding in authorities (e.g., providing a wrapped version of open()) to reach the level of security you want. > The "security domain" is the boundary at which security is cared > > about. For this dicussion, it is the interpreter. > > It might be clearer to say (if i understand correctly) "Each interpreter > is a separate security domain." > > Many interpreters can run within a single operating system process, > right? Yes. Could you say a bit about what sort of concurrency model you > have in mind? None specifically. Each new interpreter automatically runs in its own Python thread, so they have essentially the same concurrency as using the 'thread' module. How would this interact (if at all) with use of the > existing threading functionality? See above. > The "powerbox" is the thing that possesses the ultimate power in the > > system. In our case it is the Python process. > > This could also be the application process, right? If Python is embedded, yes. > Rationale > > /////////////////////////////////////// > [...] > > For instance, think of an application that supports a plug-in system > > with Python as the language used for writing plug-ins. You do not > > want to have to examine every plug-in you download to make sure that > > it does not alter your filesystem if you can help it. With a proper > > security model and implementation in place this hinderance of having > > to examine all code you execute should be alleviated. > > I'm glad to have this use case set out early in the document, so the > reader can keep it in mind as an example while reading about the model. 
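As a rough picture of the "wrapped version of open()" mentioned earlier in this reply, here is a minimal sketch of an authority that only grants read access below one directory. `make_restricted_open` is a hypothetical name and the policy is an assumption for illustration; it is not code from securing_python.txt. Note also that a pure Python closure like this is not a security boundary by itself, which is part of what the rest of the thread is about.

```python
import os

def make_restricted_open(root):
    """Return an open()-like callable limited to reading files under root.

    Illustrative only: the parent decides which directory to grant, and
    the child is handed this callable instead of the real open().
    """
    root = os.path.realpath(root)

    def restricted_open(path, mode="r"):
        # Refuse any mode that could modify the filesystem.
        if any(flag in mode for flag in "wax+"):
            raise PermissionError("write access was not granted")
        full = os.path.realpath(os.path.join(root, path))
        # Refuse paths that resolve outside the granted directory.
        if os.path.commonpath([root, full]) != root:
            raise PermissionError("path escapes the granted directory")
        return open(full, mode)

    return restricted_open
```

The creating interpreter would place the returned callable in the sandboxed interpreter's built-in namespace under the name `open`, so the child never holds a reference to the unrestricted function.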
> > > Approaches to Security > > /////////////////////////////////////// > > > > There are essentially two types of security: who-I-am > > (permissions-based) security and what-I-have (authority-based) > > security. > > As Mark Miller mentioned in another message, your descriptions of > "who-I-am" security and "what-I-have" security make sense, but > they don't correspond to "permission" vs. "authority". They > correspond to "identity-based" vs. "authority-based" security. Right. This was fixed the day Mark and Alan Karp made the comment. > > Difficulties in Python for Object-Capabilities > > ////////////////////////////////////////////// > [...] > > Three key requirements for providing a proper perimeter defence are > > private namespaces, immutable shared state across domains, and > > unforgeable references. > > Nice summary. > > > Problem of No Private Namespace > > =============================== > [...] > > The Python language has no such thing as a private namespace. > > Don't local scopes count as private namespaces? It seems clear > that they aren't designed with the intention of being exposed, > unlike other namespaces in Python. Sort of. But you can still get access to them if you have an execution frame and they are not persistent. Generators are worse since they store their execution frame with the generator itself, completely exposing the local namespace. > > It also makes providing security at the object level using > > object-capabilities non-existent in pure Python code. > > I don't think this is necessarily the case. No Python code i've > ever seen expects to be able to invade the local scopes of other > functions, so you could use them as private namespaces. There > are two ways i've seen to invade local scopes: > > (a) Use gc.get_referents to get back from a cell object > to its contents. > > (b) Compare the cell object to another cell object, thereby > causing __eq__ to be invoked to compare the contents of > the cells.
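The leaks discussed above can be shown in a few lines. This is a minimal sketch using current attribute names (`__closure__` is the modern spelling of `func_closure`); the function and variable names are made up for the example.

```python
import gc

def make_counter():
    secret = 41  # a local that the closure is meant to keep private

    def bump():
        return secret + 1

    return bump

def gen():
    hidden = "local state"  # a local that lives in the generator's frame
    yield 1

bump = make_counter()
cell = bump.__closure__[0]

# Route (a) from the list above: walk from the cell back to its contents.
via_gc = gc.get_referents(cell)

# The cell also exposes its contents directly.
leaked = cell.cell_contents

# Generators carry their execution frame, locals included.
g = gen()
next(g)
frame_locals = dict(g.gi_frame.f_locals)
```

Any sandbox that wants to treat local scopes as private would have to block each of these routes, which is why turning off the relevant introspection comes up in the reply.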
Or the execution frame which is exposed directly on generators. But regardless, the comment was meant to apply to Python as it stands, not that it couldn't possibly be tweaked somehow. > So you could protect local scopes by prohibiting these or by > simply turning off access to func_closure. It's clear that hardly > any code depends on these introspection features, so it would be > reasonable to turn them off in a sandboxed interpreter. (It seems > you would have to turn off some introspection features anyway in > order to have reliable import guards.) Maybe this can be changed in the future, but this is more than I need at the moment so I am not going to go down that path right now. But I added a quick mention of this. > > Problem of Mutable Shared State > > =============================== > [...] > > Regardless, sharing of state that can be influenced by another > > interpreter is not safe for object-capabilities. > > Yup. > > > Threat Model > > /////////////////////////////////////// > > Good to see this specified here. I like the way you've broken this > down. The current version has more details per point than the one you read. > > * An interpreter cannot gain abilities the Python process possesses > > without explicitly being given those abilities. > > It would be good to enumerate which abilities you're referring to in > this item. For example, a bare interpreter should be able to allocate > memory and call most of the built-in functions, but should not be able > to open network connections. > > > * An interpreter cannot influence another interpreter directly at the > > Python level without explicitly allowing it. > > You mean, without some other entity explicitly allowing it, right? Yep. > What would that other entity be -- presumably the interpreter that > spawned both of these sub-interpreters? Sure. You could stick something in the built-in namespace of the sub-interpreter to use for communicating.
> * An interpreter cannot use operating system resources without being > > explicitly given those resources. > > Okay. > > > * A bare Python interpreter is always trusted. > > What does "trusted" mean in the above? It means that if Python source code can execute within a bare interpreter it is considered safe code. This is covered in the new version of the doc. > * Python bytecode is always distrusted. > > * Pure Python source code is always safe on its own. > > It would be helpful to clarify "safe" here. I assume by "safe" you > mean that the Python source code can express whatever it wants, > including potentially dangerous activities, but when run in a bare > or sandboxed interpreter it cannot have harmful effects. But then > in what sense does the "safety" have to do with the Python source code > rather than the restrictions on the interpreter? > > Would it be correct to say: > + We want to guarantee that Python source code cannot violate > the restrictions in a restricted or bare interpreter. > + We do not prevent arbitrary Python bytecode from violating > these restrictions, and assume that it can. > + Malicious abilities are derived from C extension modules, > > built-in modules, and unsafe types implemented in C, not from > > pure Python source. > > By "malicious" do you just mean "anything that isn't accessible to > a bare interpreter"? Anything that could harm the system or interpreter. > * A sub-interpreter started by another interpreter does not inherit > > any state. > > Do you envision a tree of interpreters and sub-interpreters? Can the > levels of spawning get arbitrarily deep? Yes and yes. If i am visualizing your model correctly, maybe it would be useful to > introduce the term "parent", where each interpreter has as its parent > either the Python process or another interpreter. Then you could say > that each interpreter acquires authority only by explicit granting from > its parent. 
You could, although there is no hierarchy at the implementation level. But it works in terms of who has a reference to whom and who gives each interpreter their authority. > Then i have another question: can an interpreter acquire > authorities only when it is started, or can it acquire them while it is > running, and how? Well, whatever you want to do through the built-in namespace. So if you pass in a mutable object like a dict and add stuff to it on the fly, I don't see why you couldn't give new authorities on the fly. > > Implementation > > /////////////////////////////////////// > > > > Guiding Principles > > ======================== > > > > To begin, the Python process garners all power as the powerbox. It is > > up to the process to initially hand out access to resources and > > abilities to interpreters. This might take the form of an interpreter > > with all abilities granted (i.e., a standard interpreter as launched > > when you execute Python), which then creates sub-interpreters with > > sandboxed abilities. Another alternative is only creating > > interpreters with sandboxed abilities (i.e., Python being embedded in > > an application that only uses sandboxed interpreters). > > This sounds like part of your design to me. It might help to have > this earlier in the document (maybe even with an example diagram of a > tree of interpreters). Made Guiding Principles its own section and split off the bottom part of the section and put it under Implementation. > > All security measures should never have to ask who an interpreter is. > > This means that what abilities an interpreter has should not be stored > > at the interpreter level when the security can use a proxy to protect > > a resource.
This means that while supporting a memory cap can > > have a per-interpreter setting that is checked (because access to the > > operating system's memory allocator is not supported at the program > > level), protecting files and imports should not have such a per-interpreter > > protection at such a low level (because those can have extension > > module proxies to provide the security). > > It might be good to declare two categories of resources -- those > protected by object hiding and those protected by a per-interpreter > setting -- and make lists. That is rather unknown since I am constantly finding stuff that is global to the process compared to the interpreter, so making the list seems premature. > > Backwards-compatibility will not be a hindrance upon the design or > > implementation of the security model. Because the security model will > > inherently remove resources and abilities that existing code expects, > > it is not reasonable to expect existing code to work in a sandboxed > > interpreter. > > You might qualify the last statement a bit. For example, a Python > implementation of a pure algorithm (e.g. string processing, data > compression, etc.) would still work in a sandboxed interpreter. I tossed in "all" to clarify. > > Keeping Python "pythonic" is required for all design decisions. > > As Lawrence Oluyede also mentioned, it would be helpful to say a > little more about what "pythonic" means. Done in the current version. > > Restricting what is in the built-in namespace and the safe-guarding > > the interpreter (which includes safe-guarding the built-in types) is > > where security will come from. > > Sounds good. > > > Abilities of a Standard Sandboxed Interpreter > > ============================================= > > > [...] > > * You cannot open any files directly. > > * Importation > > + You can import any pure Python module. > > + You cannot import any Python bytecode module. > > + You cannot import any C extension module.
> > + You cannot import any built-in module. > > * You cannot find out any information about the operating system you > > are running on. > > * Only safe built-ins are provided. > > This looks reasonable. This is probably a good place to itemize > exactly which built-ins are considered safe. > > > Imports > > ------- > > > > A proxy for protecting imports will be provided. This is done by > > setting the ``__import__()`` function in the built-in namespace of the > > sandboxed interpreter to a proxied version of the function. > > > > The planned proxy will take in a passed-in function to use for the > > import and a whitelist of C extension modules and built-in modules to > > allow importation of. > > Presumably these are passed in to the proxy's constructor. Current plan is to expose the built-in namespace, imported modules, and sys module dict when creating an Interpreter instance. > > If an import would lead to loading an extension > > or built-in module, it is checked against the whitelist and allowed > > to be imported based on that list. All .pyc and .pyo files will not > > be imported. All .py files will be imported. > > I'm unclear about this. Is the whitelist a list of module names only, > or of filenames with extensions? Have not decided, but probably module name. > Does the normal path-searching process > take place or can it be restricted in some way? Have not decided. > Would it simplify the > security analysis to have the whitelist be a dictionary that maps module > names to absolute pathnames? Don't know. Protecting imports is the last thing I am going to implement since it is the trickiest. > If both the .py and .pyc are present, the normal import would find the > .pyc file; would the import proxy reject such an import or ignore it > and recompile the .py instead? Something along those lines. > > It must be warned that importing any C extension module is dangerous. > > Right.
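The import proxy described above might look roughly like this. It is a minimal sketch under simplifying assumptions: the whitelist holds top-level module names and applies to every absolute import, whereas the proposal only whitelists extension and built-in modules and lets pure .py modules through. `make_import_guard` is a hypothetical name, and swapping `__builtins__` in `exec` only illustrates the mechanism; it is not a real sandbox.

```python
import builtins

def make_import_guard(real_import, whitelist):
    """Return an __import__ replacement that consults a whitelist.

    real_import is the function that actually performs the import;
    whitelist is an iterable of allowed top-level module names.
    """
    allowed = frozenset(whitelist)

    def guarded_import(name, globals=None, locals=None, fromlist=(), level=0):
        top_level = name.partition(".")[0]
        # Only absolute imports are checked in this sketch.
        if level == 0 and top_level not in allowed:
            raise ImportError("import of %r is not whitelisted" % (name,))
        return real_import(name, globals, locals, fromlist, level)

    return guarded_import

# Build a built-in namespace whose __import__ is the guarded version.
guard = make_import_guard(builtins.__import__, {"math"})
sandbox_builtins = dict(vars(builtins), __import__=guard)

namespace = {"__builtins__": sandbox_builtins}
exec("import math\nresult = math.sqrt(16)", namespace)
```

Passing the real import function in as an argument mirrors the "passed-in function to use for the import" wording, so the guard itself never needs ambient access to the import machinery.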
> > > Implementing Import in Python > > +++++++++++++++++++++++++++++ > > > > To help facilitate in the exposure of more of what importation > > requires (and thus make implementing a proxy easier), the import > > machinery should be rewritten in Python. > > This seems like a good idea. Can you identify which minimum essential > pieces of the import machinery have to be written in C? Loading of C extensions, stating files, reading files, etc. Pretty much that requires help from the OS. > Sanitizing Built-In Types > > ------------------------- > [...] > > Constructors > > ++++++++++++ > > > > Almost all of Python's built-in types > > contain a constructor that allows code to create a new instance of a > > type as long as you have the type itself. Unfortunately this does not > > work in an object-capabilities system without either providing a proxy > > to the constructor or just turning it off. > > The existence of the constructor isn't (by itself) the problem. > The problem is that both of the following are true: > > (a) From any object you can get its type object. > (b) Using any type object you can construct a new instance. > > So, you can control this either by hiding the type object, separating > the constructor from the type, or disabling the constructor. I separated the constructor or initializer (tp_new or tp_init) into a factory function. > Types whose constructors are considered dangerous are: > > > > * ``file`` > > + Will definitely use the ``open()`` built-in. > > * code objects > > * XXX sockets? > > * XXX type? > > * XXX > > Looks good so far. Not sure i see what's dangerous about 'type'. That's why it has the question mark. =) > Filesystem Information > > ++++++++++++++++++++++ > > > > When running code in a sandboxed interpreter, POLA suggests that you > > do not want to expose information about your environment on top of > > protecting its use. This means that filesystem paths typically should > > not be exposed. 
Unfortunately, Python exposes file paths all over the > > place: > > > > * Modules > > + ``__file__`` attribute > > * Code objects > > + ``co_filename`` attribute > > * Packages > > + ``__path__`` attribute > > * XXX > > > > XXX how to expose safely? > > It seems that in most cases, a single Python object is associated with > a single pathname. If that's true in general, one solution would be > to provide an introspection function named 'getpath' or something > similar that would get the path associated with any object. This > function might go in a module containing all the introspection functions, > so imports of that module could be easily restricted. That is the current thinking. > Mutable Shared State > > ++++++++++++++++++++ > > > > Because built-in types are shared between interpreters, they cannot > > expose any mutable shared state. Unfortunately, as it stands, some > > do. Below is a list of types that share some form of dangerous state, > > how they share it, and how to fix the problem: > > > > * ``object`` > > + ``__subclasses__()`` function > > - Remove the function; never seen used in real-world code. > > * XXX > > Okay, more to work out here. :) Possibly. I might have to wait until I am much closer to being done to discover more places where mutable shared state is exposed in a bare interpreter because I have not been able to think of anymore. > Perimeter Defences Between a Created Interpreter and Its Creator > > ---------------------------------------------------------------- > > > > The plan is to allow interpreters to instantiate sandboxed > > interpreters safely. By using the creating interpreter's abilities to > > provide abilities to the created interpreter, you make sure there is > > no escalation in abilities. > > Good. > > > * ``__del__`` created in sandboxed interpreter but object is cleaned > > up in unprotected interpreter. > > How do you envision the launching of a sandboxed interpreter to look? 
> Could you sketch out some rough code examples?

>>> interp = interpreter.Interpreter()
>>> interp.builtins['open'] = wrapped_open()
>>> interp.sys_dict['path'] = []
>>> interp.exec("2 + 3")

> Were you thinking of
> something like:
>
> sys.spawn(code, dict)
> code: a string containing Python source code
> dict: the global namespace in which to run the code
>
> If you allow the parent interpreter to pass mutable objects into the
> child interpreter, then the parent and child can already communicate
> via the object, so '__del__' is a moot issue. Do you want to prevent
> all communication between parent and child? It's not obvious to me
> why that would be necessary.

No, I don't since there should be a secure way to allow that. The __del__ worry came up from Guido pointing out you might be able to screw with it. But if you pass in something implemented in C you should be okay.

> > * Using frames to walk the frame stack back to another interpreter.
>
> Could you just disable introspection of the frame stack?

If you don't allow importing of 'sys' then yes, and that is planned. I just wanted to make sure I didn't forget this needs to be protected. I do need to check what a generator's frame exposes, though.

> > Making the ``sys`` Module Safe
> > ------------------------------
> [...]
> > This means that the ``sys`` module needs to have its safe information
> > separated out from the unsafe settings.
>
> Yes.
>
> > XXX separate modules, ``sys.settings`` and ``sys.info``, or strip
> > ``sys`` to settings and put info somewhere else? Or provide a method
> > that will create a faked sys module that has the safe values copied
> > into it?
>
> I think the last suggestion above would lead to confusion. The two
> groups should have two distinct names and it should be clear which
> attribute goes with which group.

This is also more complicated by the fact that some things are for the entire process while others are per interpreter. Might have to separate things out even more.
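To make the "faked sys module" option from the quoted XXX note concrete, here is a rough sketch in present-day Python 3. The helper name ``make_fake_sys`` and the choice of attributes in ``SAFE_SYS_ATTRS`` are assumptions for illustration only; the thread does not settle which attributes count as safe, and it does not say this option was chosen.

```python
import sys
import types

# Hypothetical list of "safe" attributes; the thread never fixes one.
SAFE_SYS_ATTRS = ("hexversion", "maxunicode", "version_info")


def make_fake_sys(safe_attrs=SAFE_SYS_ATTRS):
    """Build a stand-in module named 'sys' holding copies of safe values."""
    fake = types.ModuleType("sys")
    for attr in safe_attrs:
        # Copy the value at creation time; later changes to the real
        # sys module are not reflected in the stand-in.
        setattr(fake, attr, getattr(sys, attr))
    return fake


fake_sys = make_fake_sys()
```

Anything left off the list, such as ``sys.path`` or ``sys.modules``, is simply absent from the stand-in, which is also the source of the confusion raised above: code sees a module called ``sys`` that lacks attributes it expects.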
> > Protecting I/O
> > ++++++++++++++
> >
> > The ``print`` keyword and the built-ins ``raw_input()`` and
> > ``input()`` use the values stored in ``sys.stdout`` and ``sys.stdin``.
> > By exposing these attributes to the creating interpreter, one can set
> > them to safe objects, such as instances of ``StringIO``.
>
> Sounds good.
>
> > Safe Networking
> > ---------------
> >
> > XXX proxy on socket module, modify open() to be the constructor, etc.
>
> Lots more to think about here. :)

Oh yeah. =)

> > Protecting Memory Usage
> > -----------------------
> >
> > To protect memory, low-level hooks into the memory allocator for
> > Python is needed. By hooking into the C API for memory allocation and
> > deallocation a very rough running count of used memory can kept. This
> > can be used to prevent sandboxed interpreters from using so much
> > memory that it impacts the overall performance of the system.
>
> Preventing denial-of-service is in general quite difficult, but i
> applaud the attempt. I agree with your decision to separate this

The memory tracking has a proof-of-concept done in the bcannon-sandboxing branch. Not perfect, but it does show how one could go about accounting for every byte of data in terms of what it is basically used for.

-Brett
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060907/be9d8493/attachment.html

From steve at holdenweb.com Fri Sep 8 10:24:03 2006
From: steve at holdenweb.com (Steve Holden)
Date: Fri, 08 Sep 2006 09:24:03 +0100
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <200609080353.07502.anthony@interlink.com.au>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au>
Message-ID:

Anthony Baxter wrote:
> On Friday 08 September 2006 02:56, Kristján V. Jónsson wrote:
>
>>Hello All.
>>I just added patch 1552880 to sourceforge.
It is a patch for 2.6 (and 2.5) >>which allows unicode paths in sys.path and uses the unicode file api on >>windows. This is tried and tested on 2.5, and backported to 2.3 and is >>currently running on clients in china and esewhere. It is minimally >>intrusive to the inporting mechanism, at the cost of some string conversion >>overhead (to utf8 and then back to unicode). > > > As this can't be considered a bugfix (that I can see), I'd be against it being > checked into 2.5. > Are you suggesting that Python's inability to correctly handle Unicode path elements isn't a bug? Or simply that this inability isn't currently described in a bug report on Sourceforge? I agree it's a relatively large patch for a release candidate but if prudence suggests deferring it, it should be a *definite* for 2.5.1 and subsequent releases. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From anthony at interlink.com.au Fri Sep 8 10:58:28 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 8 Sep 2006 18:58:28 +1000 Subject: [Python-Dev] Unicode Imports In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> Message-ID: <200609081858.32277.anthony@interlink.com.au> On Friday 08 September 2006 18:24, Steve Holden wrote: > > As this can't be considered a bugfix (that I can see), I'd be against it > > being checked into 2.5. > > Are you suggesting that Python's inability to correctly handle Unicode > path elements isn't a bug? Or simply that this inability isn't currently > described in a bug report on Sourceforge? I'm suggesting that adding the ability to handle unicode paths is a *new* *feature*. 
If people actually want to see 2.5 final ever released, they're going to have to accept that "oh, but just this _one_ _more_ _thing_" is not going to fly. We're _well_ past beta1, where new features should have been added. At this point, we have to cut another release candidate. This is far too much to add during the release candidate stage. > I agree it's a relatively large patch for a release candidate but if > prudence suggests deferring it, it should be a *definite* for 2.5.1 and > subsequent releases. Possibly. I remain unconvinced. -- Anthony Baxter It's never too late to have a happy childhood. From steve at holdenweb.com Fri Sep 8 11:19:08 2006 From: steve at holdenweb.com (Steve Holden) Date: Fri, 08 Sep 2006 10:19:08 +0100 Subject: [Python-Dev] Unicode Imports In-Reply-To: <200609081858.32277.anthony@interlink.com.au> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> <200609081858.32277.anthony@interlink.com.au> Message-ID: Anthony Baxter wrote: > On Friday 08 September 2006 18:24, Steve Holden wrote: > >>>As this can't be considered a bugfix (that I can see), I'd be against it >>>being checked into 2.5. >> >>Are you suggesting that Python's inability to correctly handle Unicode >>path elements isn't a bug? Or simply that this inability isn't currently >>described in a bug report on Sourceforge? > > I'm suggesting that adding the ability to handle unicode paths is a *new* > *feature*. > That's certainly true. > If people actually want to see 2.5 final ever released, they're going to have > to accept that "oh, but just this _one_ _more_ _thing_" is not going to fly. > > We're _well_ past beta1, where new features should have been added. At this > point, we have to cut another release candidate. This is far too much to add > during the release candidate stage. > Right. 
I couldn't argue for putting this in to 2.5 - it would certainly represent unwarranted feature creep at the rc2 stage. > >>I agree it's a relatively large patch for a release candidate but if >>prudence suggests deferring it, it should be a *definite* for 2.5.1 and >>subsequent releases. > > > Possibly. I remain unconvinced. > But it *is* a desirable, albeit new, feature, so I'm surprised that you don't appear to perceive it as such for a downstream release. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From ncoghlan at gmail.com Fri Sep 8 11:56:27 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 08 Sep 2006 19:56:27 +1000 Subject: [Python-Dev] Unicode Imports In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> <200609081858.32277.anthony@interlink.com.au> Message-ID: <45013E4B.4050802@gmail.com> Steve Holden wrote: > Anthony Baxter wrote: >> On Friday 08 September 2006 18:24, Steve Holden wrote: >>> I agree it's a relatively large patch for a release candidate but if >>> prudence suggests deferring it, it should be a *definite* for 2.5.1 and >>> subsequent releases. >> >> Possibly. I remain unconvinced. >> > > But it *is* a desirable, albeit new, feature, so I'm surprised that you > don't appear to perceive it as such for a downstream release. And unlike 2.2's True/False problem, it is an *environmental* feature, rather than a programmatic one. So while it's a new feature, it would merely mean that 2.5.1 works correctly in more environments than 2.5. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From anthony at interlink.com.au Fri Sep 8 11:48:51 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 8 Sep 2006 19:48:51 +1000 Subject: [Python-Dev] Unicode Imports In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609081858.32277.anthony@interlink.com.au> Message-ID: <200609081948.55218.anthony@interlink.com.au> On Friday 08 September 2006 19:19, Steve Holden wrote: > But it *is* a desirable, albeit new, feature, so I'm surprised that you > don't appear to perceive it as such for a downstream release. Point releases (2.x.1 and suchlike) are absolutely not for new features. They're for bugfixes, only. It's possible that this could be considered a bugfix, but as I said right now I'm dubious. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From steve at holdenweb.com Fri Sep 8 12:28:27 2006 From: steve at holdenweb.com (Steve Holden) Date: Fri, 08 Sep 2006 11:28:27 +0100 Subject: [Python-Dev] Unicode Imports In-Reply-To: <200609081948.55218.anthony@interlink.com.au> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609081858.32277.anthony@interlink.com.au> <200609081948.55218.anthony@interlink.com.au> Message-ID: <450145CB.3070601@holdenweb.com> Anthony Baxter wrote: > On Friday 08 September 2006 19:19, Steve Holden wrote: > >>But it *is* a desirable, albeit new, feature, so I'm surprised that you >>don't appear to perceive it as such for a downstream release. > > > Point releases (2.x.1 and suchlike) are absolutely not for new features. > They're for bugfixes, only. It's possible that this could be considered a > bugfix, but as I said right now I'm dubious. > OK, in that case I'm going to argue that the current behaviour is buggy. 
I suppose your point is that, assuming the patch is correct (and it seems the authors are relying on it for production purposes in tens of thousands of installations), it doesn't change the behaviour of the interpreter in existing cases, and therefore it is providing a new feature. I don't regard this as the provision of a new feature but as the removal of an unnecessary restriction (which I would prefer to call a bug). If it was *documented* somewhere that Unicode paths aren't legal I would find your arguments more convincing. As things stand new Python users would, IMHO, be within their rights to assume that arbitrary directories could be added to the path without breakage. Ultimately, your call, I guess. Would it help if I added "inability to import from Unicode directories" as a bug? Or would you prefer to change the documentation to state that some directories can't be used as path elements <0.3 wink>? regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From guido at python.org Fri Sep 8 18:29:16 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 8 Sep 2006 09:29:16 -0700 Subject: [Python-Dev] Unicode Imports In-Reply-To: <450145CB.3070601@holdenweb.com> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609081858.32277.anthony@interlink.com.au> <200609081948.55218.anthony@interlink.com.au> <450145CB.3070601@holdenweb.com> Message-ID: On 9/8/06, Steve Holden wrote: > Anthony Baxter wrote: > > On Friday 08 September 2006 19:19, Steve Holden wrote: > > > >>But it *is* a desirable, albeit new, feature, so I'm surprised that you > >>don't appear to perceive it as such for a downstream release. > > > > > > Point releases (2.x.1 and suchlike) are absolutely not for new features. > > They're for bugfixes, only. 
It's possible that this could be considered a > > bugfix, but as I said right now I'm dubious. > > > OK, in that case I'm going to argue that the current behaviour is buggy. > > I suppose your point is that, assuming the patch is correct (and it > seems the authors are relying on it for production purposes in tens of > thousands of installations), it doesn't change the behaviour of the > interpreter in existing cases, and therefore it is providing a new feature. > > I don't regard this as the provision of a new feature but as the removal > of an unnecessary restriction (which I would prefer to call a bug). If > it was *documented* somewhere that Unicode paths aren't legal I would > find your arguments more convincing. As things stand new Python users > would, IMHO, be within their rights to assume that arbitrary directories > could be added to the path without breakage. > > Ultimately, your call, I guess. Would it help if I added "inability to > import from Unicode directories" as a bug? Or would you prefer to change > the documentation to state that some directories can't be used as path > elements <0.3 wink>? We've all heard the arguments for both sides enough times I think. IMO it's the call of the release managers. Board members ought to trust the release managers and not apply undue pressure. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Fri Sep 8 18:41:44 2006 From: skip at pobox.com (skip at pobox.com) Date: Fri, 8 Sep 2006 11:41:44 -0500 Subject: [Python-Dev] Unicode Imports In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609081858.32277.anthony@interlink.com.au> <200609081948.55218.anthony@interlink.com.au> <450145CB.3070601@holdenweb.com> Message-ID: <17665.40264.242710.426290@montanaro.dyndns.org> Guido> IMO it's the call of the release managers. Board members ought to Guido> trust the release managers and not apply undue pressure. Indeed. 
Let's not go whacking people with boards. The Perl people would just laugh at us... Skip From rasky at develer.com Fri Sep 8 20:51:46 2006 From: rasky at develer.com (Giovanni Bajo) Date: Fri, 8 Sep 2006 20:51:46 +0200 Subject: [Python-Dev] Unicode Imports References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc><200609081858.32277.anthony@interlink.com.au><200609081948.55218.anthony@interlink.com.au><450145CB.3070601@holdenweb.com> Message-ID: <010301c6d377$d5df7bc0$46ba2997@bagio> Guido van Rossum wrote: > IMO it's the call of the release managers. Board members ought to > trust the release managers and not apply undue pressure. +1, but I would love to see a more formal definition of what a "bugfix" is, which would reduce the ambiguous cases, and thus reduce the number of times the release managers are called to pronounce. Other projects, for instance, describe point releases as "open for regression fixes only", which means that a patch, to be eligible for a point release, must fix a regression (something which used to work before, and doesn't anymore). Regressions are important because they affect people wanting to upgrade Python. If something never worked before (like this unicode path thingie), surely existing Python users are not affected by the bug (or they have already workarounds in place), so that NOT having the bug fixed in a point release is not a problem. Anyway, I'm not pushing for this specific policy (even if I like it): I'm just suggesting Release Managers to more formally define what should and what should not go in a point release. 
Giovanni Bajo

From rhettinger at ewtllc.com Fri Sep 8 21:00:50 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Fri, 08 Sep 2006 12:00:50 -0700
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <010301c6d377$d5df7bc0$46ba2997@bagio>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc><200609081858.32277.anthony@interlink.com.au><200609081948.55218.anthony@interlink.com.au><450145CB.3070601@holdenweb.com> <010301c6d377$d5df7bc0$46ba2997@bagio>
Message-ID: <4501BDE2.6020306@ewtllc.com>

Giovanni Bajo wrote:
>
>+1, but I would love to see a more formal definition of what a "bugfix" is,
>which would reduce the ambiguous cases, and thus reduce the number of times the
>release managers are called to pronounce.
>
>

Sorry, that is just a pipe-dream. To some degree, all bug-fixes are new features in that there is some behavioral difference: something will now work that wouldn't work before. While some cases are clear-cut (such as API changes), the ones that are interesting will defy definition and need a human judgment call as to whether a given change will help more than it hurts. The RMs are also strongly biased against extensive patches that haven't had a chance to go through a beta-cycle -- they don't want their releases mucked-up.

Raymond

From mal at egenix.com Fri Sep 8 21:12:33 2006
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 08 Sep 2006 21:12:33 +0200
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>
Message-ID: <4501C0A1.4010600@egenix.com>

Kristján V. Jónsson wrote:
> Hello All.
> I just added patch 1552880 to sourceforge. It is a patch for 2.6 (and 2.5) which allows unicode paths in sys.path and uses the unicode file api on windows.
> This is tried and tested on 2.5, and backported to 2.3 and is currently running on clients in china and esewhere.
It is minimally intrusive to the inporting mechanism, at the cost of some string conversion overhead (to utf8 and then back to unicode). +1 on adding it to Python 2.6. -0 for Python 2.5.x: Applications/modules written for Python 2.4 and 2.5 won't be expecting Unicode strings in sys.path with all the consequences that go with it, so this is a true change in semantics, not just a nice to have additional feature or "bug" fix. OTOH, those applications will just break in a different place with the patch applied :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 08 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From martin at v.loewis.de Fri Sep 8 22:51:09 2006 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Fri, 08 Sep 2006 22:51:09 +0200 Subject: [Python-Dev] Unicode Imports In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> Message-ID: <4501D7BD.1020006@v.loewis.de> Steve Holden schrieb: >> As this can't be considered a bugfix (that I can see), I'd be against it being >> checked into 2.5. >> > Are you suggesting that Python's inability to correctly handle Unicode > path elements isn't a bug? Not sure whether Anthony suggests it, but I do. > Or simply that this inability isn't currently > described in a bug report on Sourceforge? No: sys.path is specified (originally) as containing a list of byte strings; it was extended to also support path importers (or whatever that PEP calls them). It was never extended to support Unicode strings. 
That other PEP e > I agree it's a relatively large patch for a release candidate but if > prudence suggests deferring it, it should be a *definite* for 2.5.1 and > subsequent releases. I'm not so sure it should. It *is* a new feature: it makes applications possible which aren't possible today, and the documentation does not ever suggest that these applications should have been possible. In fact, it is common knowledge that this currently isn't supported. Regards, Martin From martin at v.loewis.de Fri Sep 8 22:52:26 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 08 Sep 2006 22:52:26 +0200 Subject: [Python-Dev] Unicode Imports In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> <200609081858.32277.anthony@interlink.com.au> Message-ID: <4501D80A.5050008@v.loewis.de> Steve Holden schrieb: >>> I agree it's a relatively large patch for a release candidate but if >>> prudence suggests deferring it, it should be a *definite* for 2.5.1 and >>> subsequent releases. >> >> Possibly. I remain unconvinced. >> > > But it *is* a desirable, albeit new, feature, so I'm surprised that you > don't appear to perceive it as such for a downstream release. Because 2.5.1 shouldn't include any new features. If it is a new feature (which it is), it should go into 2.6. 
Regards, Martin From martin at v.loewis.de Fri Sep 8 22:54:43 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 08 Sep 2006 22:54:43 +0200 Subject: [Python-Dev] Unicode Imports In-Reply-To: <45013E4B.4050802@gmail.com> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> <200609081858.32277.anthony@interlink.com.au> <45013E4B.4050802@gmail.com> Message-ID: <4501D893.4090504@v.loewis.de> Nick Coghlan schrieb: >> But it *is* a desirable, albeit new, feature, so I'm surprised that you >> don't appear to perceive it as such for a downstream release. > > And unlike 2.2's True/False problem, it is an *environmental* feature, rather > than a programmatic one. Not sure what you mean by that; if you mean "thus existing applications cannot break": this is not true. In fact, it seems that some applications are extremely susceptible to the types of objects on sys.path. Some applications apparently know exactly what you can and cannot find on sys.path; changing that might break them. Regards, Martin From martin at v.loewis.de Fri Sep 8 22:56:48 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 08 Sep 2006 22:56:48 +0200 Subject: [Python-Dev] Unicode Imports In-Reply-To: <450145CB.3070601@holdenweb.com> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609081858.32277.anthony@interlink.com.au> <200609081948.55218.anthony@interlink.com.au> <450145CB.3070601@holdenweb.com> Message-ID: <4501D910.8020805@v.loewis.de> Steve Holden schrieb: > I don't regard this as the provision of a new feature but as the removal > of an unnecessary restriction (which I would prefer to call a bug). You got the definition of "bug" wrong. Primarily, a bug is a deviation from the specification. Extending the domain of an argument to an existing function is a new feature. 
Regards,
Martin

From martin at v.loewis.de Fri Sep 8 22:59:57 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 08 Sep 2006 22:59:57 +0200
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <010301c6d377$d5df7bc0$46ba2997@bagio>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc><200609081858.32277.anthony@interlink.com.au><200609081948.55218.anthony@interlink.com.au><450145CB.3070601@holdenweb.com> <010301c6d377$d5df7bc0$46ba2997@bagio>
Message-ID: <4501D9CD.80301@v.loewis.de>

Giovanni Bajo wrote:
> +1, but I would love to see a more formal definition of what a "bugfix" is,
> which would reduce the ambiguous cases, and thus reduce the number of times the
> release managers are called to pronounce.
>
> Other projects, for instance, describe point releases as "open for regression
> fixes only", which means that a patch, to be eligible for a point release, must
> fix a regression (something which used to work before, and doesn't anymore).

In Python, the tradition has accepted bug fixes beyond that. For example, fixing a memory leak would also count as a bug fix.

In general, I think a "bug" is a deviation from the specification (it might be necessary to interpret the specification first to find out whether the implementation deviates). A bug fix is then a behavior change so that the new behavior follows the specification, or a specification change so that it correctly describes the behavior.

Regards,
Martin

From misa at redhat.com Sat Sep 9 00:06:05 2006
From: misa at redhat.com (Mihai Ibanescu)
Date: Fri, 8 Sep 2006 18:06:05 -0400
Subject: [Python-Dev] Py_BuildValue and decref
Message-ID: <20060908220605.GF990@abulafia.devel.redhat.com>

Hi,

Looking at:

http://docs.python.org/api/arg-parsing.html

The description for "O" is:

"O" (object) [PyObject *]
Store a Python object (without any conversion) in a C object pointer. The C program thus receives the actual object that was passed.
The object's reference count is not increased. The pointer stored is not NULL.

There is no description of what happens when Py_BuildValue fails. Will it decref the python object passed in? Will it not?

Looking at tupleobject.h:

/*
Another generally useful object type is a tuple of object pointers.
For Python, this is an immutable type. C code can change the tuple items
(but not their number), and even use tuples are general-purpose arrays of
object references, but in general only brand new tuples should be mutated,
not ones that might already have been exposed to Python code.

*** WARNING *** PyTuple_SetItem does not increment the new item's reference
count, but does decrement the reference count of the item it replaces,
if not nil. It does *decrement* the reference count if it is *not*
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
inserted in the tuple. Similarly, PyTuple_GetItem does not increment the
returned item's reference count.
*/

So, if the call to PyTuple_SetItem fails, the value passed in is lost. Should I expect the same thing with Py_BuildValue?

Looking at how other modules deal with this, I picked typeobject.c:

    result = Py_BuildValue("[O]", (PyObject *)type);
    if (result == NULL) {
        Py_DECREF(to_merge);
        return NULL;
    }

so no attempt to DECREF type in the error case.

Further down...

    if (n) {
        state = Py_BuildValue("(NO)", state, slots);
        if (state == NULL)
            goto end;
    }

and further down:

    end:
        Py_XDECREF(cls);
        Py_XDECREF(args);
        Py_XDECREF(args2);
        Py_XDECREF(slots);
        Py_XDECREF(state);
        Py_XDECREF(names);
        Py_XDECREF(listitems);
        Py_XDECREF(dictitems);
        Py_XDECREF(copy_reg);
        Py_XDECREF(newobj);
        return res;

so it will attempt to DECREF the (non-NULL) slots in the error case.

It's probably not a big issue since if Py_BuildValue fails, you have bigger issues than memory leaks, but it seems inconsistent to me. Can someone that knows the internal implementation clarify one way over the other?

Thanks!
Misa From barry at barrys-emacs.org Sat Sep 9 00:18:49 2006 From: barry at barrys-emacs.org (Barry Scott) Date: Fri, 8 Sep 2006 23:18:49 +0100 Subject: [Python-Dev] What windows tool chain do I need for python 2.5 extensions? Message-ID: <52A09F3B-0D3B-46E3-B7E5-02DC0D3BB170@barrys-emacs.org> I have the tool chains to build extensions against your binary python 2.2, 2.3 and 2.4 on windows. What are the tool chain requirements for building extensions against python 2.5 on windows? Barry From barry at python.org Sat Sep 9 00:27:08 2006 From: barry at python.org (Barry Warsaw) Date: Fri, 8 Sep 2006 18:27:08 -0400 Subject: [Python-Dev] Py_BuildValue and decref In-Reply-To: <20060908220605.GF990@abulafia.devel.redhat.com> References: <20060908220605.GF990@abulafia.devel.redhat.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sep 8, 2006, at 6:06 PM, Mihai Ibanescu wrote: > There is no description of what happens when Py_BuildValue fails. > Will it > decref the python object passed in? Will it not? I just want to point out that the C API documentation is pretty silent about the refcounting side-effects in error conditions (and often in success conditions too) of most Python functions. For example, what is the refcounting side-effects of PyDict_SetItem() on val? What about if that function fails? Has val been incref'd or not? What about the side-effects on any value the new one replaces, both in success and failure? The C API documentation has improved in documenting the refcount behavior for return values of many of the functions, but the only reliable way to know what some other side-effects are is to read the code. After I perfect my human cloning techniques, I'll be assigning one of my minions to fix this situation (I'll bet my clean-the-kitty- litter-and-stalk-er-keep-tabs-on-Britney clone would love to take a break for a few weeks to work on this). 
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRQHuQnEjvBPtnXfVAQJfFAP9GHIRhiVc7lkzwEkPtJgqNsrN8edQcKh3 l4edSlDD7JoJrIaOElqyIaEKcJSkjpKfJt6qdA1qIt8LD9x4pGvdxpxgodGVYfFo VGPwm+pU9SH6JJIZcCOOf9bJbEmR9iqZKceAJMGgJvZjBnTnoVSyf52254q3JJGR b9glwqbddi0= =3iWf -----END PGP SIGNATURE----- From misa at redhat.com Sat Sep 9 00:35:58 2006 From: misa at redhat.com (Mihai Ibanescu) Date: Fri, 8 Sep 2006 18:35:58 -0400 Subject: [Python-Dev] Py_BuildValue and decref In-Reply-To: References: <20060908220605.GF990@abulafia.devel.redhat.com> Message-ID: <20060908223558.GG990@abulafia.devel.redhat.com> On Fri, Sep 08, 2006 at 06:27:08PM -0400, Barry Warsaw wrote: > > On Sep 8, 2006, at 6:06 PM, Mihai Ibanescu wrote: > > >There is no description of what happens when Py_BuildValue fails. > >Will it > >decref the python object passed in? Will it not? > > I just want to point out that the C API documentation is pretty > silent about the refcounting side-effects in error conditions (and > often in success conditions too) of most Python functions. For > example, what is the refcounting side-effects of PyDict_SetItem() on > val? What about if that function fails? Has val been incref'd or > not? What about the side-effects on any value the new one replaces, > both in success and failure? In this particular case, it doesn't decref it (or so I read the code). 
Relevant code is in do_mkvalue from Python/modsupport.c case 'N': case 'S': case 'O': if (**p_format == '&') { typedef PyObject *(*converter)(void *); converter func = va_arg(*p_va, converter); void *arg = va_arg(*p_va, void *); ++*p_format; return (*func)(arg); } else { PyObject *v; v = va_arg(*p_va, PyObject *); if (v != NULL) { if (*(*p_format - 1) != 'N') Py_INCREF(v); } else if (!PyErr_Occurred()) /* If a NULL was passed * because a call that should * have constructed a value * failed, that's OK, and we * pass the error on; but if * no error occurred it's not * clear that the caller knew * what she was doing. */ PyErr_SetString(PyExc_SystemError, "NULL object passed to Py_BuildValue"); return v; } Barry, where can I ship you my cloning machine? :-) Misa From jcarlson at uci.edu Sat Sep 9 00:48:59 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 08 Sep 2006 15:48:59 -0700 Subject: [Python-Dev] What windows tool chain do I need for python 2.5 extensions? In-Reply-To: <52A09F3B-0D3B-46E3-B7E5-02DC0D3BB170@barrys-emacs.org> References: <52A09F3B-0D3B-46E3-B7E5-02DC0D3BB170@barrys-emacs.org> Message-ID: <20060908154754.F8DF.JCARLSON@uci.edu> Barry Scott wrote: > > I have the tool chains to build extensions against your binary python > 2.2, 2.3 and 2.4 on windows. > > What are the tool chain requirements for building extensions against > python 2.5 on windows? The compiler requirements for 2.5 on Windows is the same as 2.4 . - Josiah From kbk at shore.net Sat Sep 9 03:35:24 2006 From: kbk at shore.net (Kurt B. 
Kaiser) Date: Fri, 8 Sep 2006 21:35:24 -0400 (EDT) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200609090135.k891ZOcT003051@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 413 open ( +1) / 3407 closed (+10) / 3820 total (+11) Bugs : 897 open ( -3) / 6167 closed (+18) / 7064 total (+15) RFE : 234 open ( +1) / 238 closed ( +2) / 472 total ( +3) New / Reopened Patches ______________________ Fix decimal context management for 2.5 (2006-09-02) CLOSED http://python.org/sf/1550886 opened by Nick Coghlan Fix for rpartition() end-case (2006-09-03) CLOSED http://python.org/sf/1551339 opened by Raymond Hettinger Updated spec file for 2.5 release. (2006-09-03) CLOSED http://python.org/sf/1551340 opened by Sean Reifschneider unparse.py decorator support (2006-09-04) http://python.org/sf/1552024 opened by Adal Chiriliuc eval docstring typo (2006-09-04) CLOSED http://python.org/sf/1552093 opened by Ori Avtalion Fix error checks and leaks in setobject.c (2006-09-05) CLOSED http://python.org/sf/1552731 reopened by gbrandl Fix error checks and leaks in setobject.c (2006-09-05) CLOSED http://python.org/sf/1552731 opened by Raymond Hettinger Unicode Imports (2006-09-05) http://python.org/sf/1552880 opened by Kristj?n Valur Fix inspect.py 2.5 slowdown (2006-09-06) CLOSED http://python.org/sf/1553314 opened by Nick Coghlan locale.getdefaultlocale() bug when _locale is missing (2006-09-06) http://python.org/sf/1553427 opened by STINNER Victor UserDict New Style (2006-09-09) http://python.org/sf/1555097 opened by Indy Performance enhancements. (2006-09-09) http://python.org/sf/1555098 opened by Indy Patches Closed ______________ Fix decimal context management for 2.5 (2006-09-02) http://python.org/sf/1550886 closed by ncoghlan Fix for rpartition() end-case (2006-09-02) http://python.org/sf/1551339 closed by nnorwitz Updated spec file for 2.5 release. 
(2006-09-02) http://python.org/sf/1551340 closed by nnorwitz eval docstring typo (2006-09-04) http://python.org/sf/1552093 closed by nnorwitz crash in dict_equal (2006-08-24) http://python.org/sf/1546288 closed by nnorwitz Patches for OpenBSD 4.0 (2006-08-15) http://python.org/sf/1540470 closed by nnorwitz Fix error checks and leaks in setobject.c (2006-09-05) http://python.org/sf/1552731 closed by rhettinger Fix error checks and leaks in setobject.c (2006-09-05) http://python.org/sf/1552731 closed by gbrandl make exec a function (2006-09-01) http://python.org/sf/1550800 closed by gbrandl Ellipsis literal "..." (2006-09-01) http://python.org/sf/1550786 closed by gbrandl Fix inspect.py 2.5 slowdown (2006-09-06) http://python.org/sf/1553314 closed by ncoghlan New / Reopened Bugs ___________________ from . import bug (2006-09-02) CLOSED http://python.org/sf/1550938 opened by ganges master random.choice(setinstance) fails (2006-09-02) CLOSED http://python.org/sf/1551113 opened by Alan Build of 2.4.3 on fedora core 5 fails to find asm/msr.h (2006-09-02) http://python.org/sf/1551238 opened by George R. Goffe tiny bug in win32_urandom (2006-09-03) CLOSED http://python.org/sf/1551427 opened by Rocco Matano __unicode__ breaks for exception class objects (2006-09-03) http://python.org/sf/1551432 opened by Marcin 'Qrczak' Kowalczyk Wrong link to unicode database (2006-09-03) CLOSED http://python.org/sf/1551669 opened by Yevgen Muntyan unpack list of singleton tuples not unpacking (2006-07-11) CLOSED http://python.org/sf/1520864 reopened by gbrandl UnixCCompiler runtime_library_dir uses -R instead of -Wl,-R (2006-09-04) CLOSED http://python.org/sf/1552304 opened by TFKyle PEP 290 <-> normal docu... 
(2006-09-05) CLOSED http://python.org/sf/1552618 opened by Jens Diemer Python polls unecessarily every 0.1 when interactive (2006-09-05) http://python.org/sf/1552726 opened by Richard Boulton Python polls unnecessarily every 0.1 second when interactive (2006-09-05) http://python.org/sf/1552726 reopened by akuchling subprocess.Popen(cmd, stdout=sys.stdout) fails (2006-07-31) CLOSED http://python.org/sf/1531862 reopened by nnorwitz ConfigParser converts option names to lower case on set() (2006-09-05) CLOSED http://python.org/sf/1552892 opened by daniel Pythonw doesn't get rebuilt if version number changes (2006-09-05) http://python.org/sf/1552935 opened by Jack Jansen python 2.5 install can't find tcl/tk in /usr/lib64 (2006-09-06) http://python.org/sf/1553166 opened by David Strozzi logging.handlers.RotatingFileHandler - inconsistent mode (2006-09-06) http://python.org/sf/1553496 opened by Walker Hale datetime.datetime.now() mangles tzinfo (2006-09-06) http://python.org/sf/1553577 opened by Skip Montanaro Class instance apparently not destructed when expected (2006-09-06) http://python.org/sf/1553819 opened by Peter Donis PyOS_InputHook() and related API funcs. not documented (2006-09-07) http://python.org/sf/1554133 opened by A.M. Kuchling Bugs Closed ___________ itertools.tee raises SystemError (2006-09-01) http://python.org/sf/1550714 closed by nnorwitz Typo in Language Reference Section 3.2 Class Instances (2006-08-28) http://python.org/sf/1547931 closed by nnorwitz from . 
import bug (2006-09-02) http://python.org/sf/1550938 closed by gbrandl tiny bug in win32_urandom (2006-09-03) http://python.org/sf/1551427 closed by gbrandl sgmllib.sgmlparser is not thread safe (2006-08-29) http://python.org/sf/1548288 closed by gbrandl test_anydbm segmentation fault (2006-08-21) http://python.org/sf/1544106 closed by greg Wrong link to unicode database (2006-09-03) http://python.org/sf/1551669 closed by gbrandl unpack list of singleton tuples not unpacking (2006-07-11) http://python.org/sf/1520864 closed by nnorwitz UnixCCompiler runtime_library_dir uses -R instead of -Wl,-R (2006-09-04) http://python.org/sf/1552304 closed by tfkyle gcc trunk (4.2) exposes a signed integer overflows (2006-08-23) http://python.org/sf/1545668 closed by nnorwitz Exceptions don't call _PyObject_GC_UNTRACK(self) (2006-08-17) http://python.org/sf/1542051 closed by gbrandl PEP 290 <-> normal docu... (2006-09-05) http://python.org/sf/1552618 closed by gbrandl SimpleXMLRpcServer still uses sys.exc_value and sys.exc_type (2006-07-19) http://python.org/sf/1525469 closed by akuchling unbalanced parentheses from command line crash pdb (2006-07-22) http://python.org/sf/1526834 closed by akuchling Python polls unnecessarily every 0.1 second when interactive (2006-09-05) http://python.org/sf/1552726 closed by akuchling subprocess.Popen(cmd, stdout=sys.stdout) fails (2006-07-31) http://python.org/sf/1531862 closed by niemeyer subprocess.Popen(cmd, stdout=sys.stdout) fails (2006-07-31) http://python.org/sf/1531862 closed by niemeyer ConfigParser converts option names to lower case on set() (2006-09-05) http://python.org/sf/1552892 closed by gbrandl SWIG wrappers incompatible with 2.5c1 (2006-09-01) http://python.org/sf/1550559 closed by gbrandl Building Python 2.4.3 on Solaris 9/10 with Sun Studio 11 (2006-05-28) http://python.org/sf/1496561 closed by andyfloe Curses module doesn't install on Solaris 2.8 (2005-10-12) http://python.org/sf/1324799 closed by akuchling New / Reopened 
RFE __________________ Add traceback.print_full_exception() (2006-09-06) http://python.org/sf/1553375 opened by Michael Hoffman Print full exceptions as they occur in logging (2006-09-06) http://python.org/sf/1553380 opened by Michael Hoffman RFE Closed __________ random.choice(setinstance) fails (2006-09-02) http://python.org/sf/1551113 closed by rhettinger Add 'find' method to sequence types (2006-08-28) http://python.org/sf/1548178 closed by gbrandl From jan-python at maka.demon.nl Sat Sep 9 04:07:02 2006 From: jan-python at maka.demon.nl (Jan Kanis) Date: Sat, 09 Sep 2006 04:07:02 +0200 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: <1f7befae0607152047u43993a15ue5180b990f9a530f@mail.gmail.com> References: <09f901c6a72c$495f2690$12472597@bagio> <20060714112137.GA891@Andrew-iBook2.local> <44B8A90C.6070309@v.loewis.de> <1f7befae0607152047u43993a15ue5180b990f9a530f@mail.gmail.com> Message-ID: At the risk of waking up a thread that was already declared dead, perhaps this is useful. So, what happens is that Python's signal handler sets a flag and registers a callback. Then the main thread should check the flag and make the callback to actually do something with the signal. However, the main thread is blocked in GTK and can't check the flag. Nick Maclaren wrote: ...lots of reasons why you can't do anything reliably from within a signal handler... As far as I understand it, what could work is this: -PyGTK registers a callback. -Python's signal handler does not change at all. -All threads that run in the Python interpreter occasionally check the flag which the signal handler sets, like the main thread does nowadays. If it is set, the thread calls PyGTK's callback. It does not do anything else with the signal. -PyGTK's callback wakes up the main thread, which actually handles the signal just like it does now. PyGTK's callback could be called from any thread, but it would be called in a normal context, not in a signal handler.
As the signal handler does not change, the risk of breaking anything or causing chaos is as large/small as it is under the current scheme. However, PyGTKs problem does get solved, as long as there is _a_ thread that returns to the interpreter within some timeframe. It seems plausible that this will happen. From rhamph at gmail.com Sat Sep 9 06:52:42 2006 From: rhamph at gmail.com (Adam Olsen) Date: Fri, 8 Sep 2006 22:52:42 -0600 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: <09f901c6a72c$495f2690$12472597@bagio> <20060714112137.GA891@Andrew-iBook2.local> <44B8A90C.6070309@v.loewis.de> <1f7befae0607152047u43993a15ue5180b990f9a530f@mail.gmail.com> Message-ID: On 9/8/06, Jan Kanis wrote: > At the risk of waking up a thread that was already declared dead, but > perhaps this is usefull. I don't think we should let this die, at least not yet. Nick seems to be arguing that ANY signal handler is prone to random crashes or corruption (due to bugs). However, we already have a signal handler, so we should already be exposed to the random crashes/corruption. If we're going to rely on signal handling being correct then I think we should also rely on write() being correct. Note that I'm not suggesting an API that allows arbitrary signal handlers, but rather one that calls write() on an array of prepared file descriptors (ignoring errors). Ensuring modifications to that array are atomic would be tricky, but I think it would be doable if we use a read-copy-update approach (with two alternating signal handler functions). Not sure how to ensure there's no currently running signal handlers in another thread though. Maybe have to rip the atomic read/write stuff out of the Linux sources to ensure it's *always* defined behavior. Looking into the existing signalmodule.c, I see no attempts to ensure atomic access to the Handlers data structure. Is the current code broken, at least on non-x86 platforms? 
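The write()-on-a-prepared-descriptor idea can be sketched roughly as follows. This is only an illustration of the wake-up pattern, assuming a POSIX system and a current Python 3: the handler here is an ordinary Python-level handler (not the C-level handler in signalmodule.c that the proposal is about), and the names read_fd, write_fd, wake_up and wait_for_signal are made up for the example.

```python
import os
import select
import signal

# Prepared ahead of time, outside any handler: a pipe whose read end
# the blocked event loop would be watching.
read_fd, write_fd = os.pipe()
os.set_blocking(write_fd, False)  # the handler must never block


def wake_up(signum, frame):
    """Do nothing except write one byte to the prepared descriptor."""
    try:
        os.write(write_fd, b"\0")
    except OSError:
        pass  # errors (e.g. a full pipe) are deliberately ignored


signal.signal(signal.SIGUSR1, wake_up)


def wait_for_signal(timeout):
    """Stand-in for a main loop that blocks until the pipe is readable."""
    ready, _, _ = select.select([read_fd], [], [], timeout)
    if ready:
        os.read(read_fd, 1)  # drain the wake-up byte
        return True
    return False


# Send ourselves a signal, then show that the "main loop" wakes up.
os.kill(os.getpid(), signal.SIGUSR1)
woke = wait_for_signal(5.0)
```

The point of the design is that the handler touches only a descriptor set up in advance, so the blocked main loop can treat signal arrival like any other I/O event.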
-- Adam Olsen, aka Rhamphoryncus From rhamph at gmail.com Sat Sep 9 06:59:48 2006 From: rhamph at gmail.com (Adam Olsen) Date: Fri, 8 Sep 2006 22:59:48 -0600 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: <09f901c6a72c$495f2690$12472597@bagio> <20060714112137.GA891@Andrew-iBook2.local> <44B8A90C.6070309@v.loewis.de> <1f7befae0607152047u43993a15ue5180b990f9a530f@mail.gmail.com> Message-ID: On 9/8/06, Adam Olsen wrote: > Ensuring modifications to that array are atomic would be tricky, but I > think it would be doable if we use a read-copy-update approach (with > two alternating signal handler functions). Not sure how to ensure > there's no currently running signal handlers in another thread though. > Maybe have to rip the atomic read/write stuff out of the Linux > sources to ensure it's *always* defined behavior. Doh, except that's exactly what sig_atomic_t is for. Ah well, can't win them all. -- Adam Olsen, aka Rhamphoryncus From ncoghlan at gmail.com Sat Sep 9 07:55:56 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 09 Sep 2006 15:55:56 +1000 Subject: [Python-Dev] Unicode Imports In-Reply-To: <4501D7BD.1020006@v.loewis.de> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> <4501D7BD.1020006@v.loewis.de> Message-ID: <4502576C.3060604@gmail.com> Martin v. Löwis wrote: > Steve Holden schrieb: >> Or simply that this inability isn't currently >> described in a bug report on Sourceforge? > > No: sys.path is specified (originally) as containing a list of byte > strings; it was extended to also support path importers (or whatever > that PEP calls them). It was never extended to support Unicode strings. That other PEP being PEP 302. That said, Unicode strings *are* permitted on sys.path - the import system will automatically encode them to an 8-bit string using the default filesystem encoding as part of the import process.
This works fine on Unix systems that use UTF-8 encoded strings to handle Unicode paths at the C API level, but is screwed on Windows because the default mbcs filesystem encoding can't handle the full range of possible Unicode path names (such as the Chinese directories that originally gave Kristján grief). To get Unicode path names to work on Windows, you have to use the Windows-specific wide character API instead of the normal C API, and the import machinery doesn't do that. So this is taking something that *already works properly on POSIX systems* and making it work on Windows as well. >> I agree it's a relatively large patch for a release candidate but if >> prudence suggests deferring it, it should be a *definite* for 2.5.1 and >> subsequent releases. > > I'm not so sure it should. It *is* a new feature: it makes applications > possible which aren't possible today, and the documentation does not > ever suggest that these applications should have been possible. In fact, > it is common knowledge that this currently isn't supported. It should already work fine on POSIX filesystems that use the default filesystem encoding for path names. As far as I am aware, it is only Windows where it doesn't work. Cheers, Nick.
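The encoding limitation described above can be sketched in a few lines. This is a rough illustration only, written for a current Python 3 and assuming that cp1252 stands in for the "mbcs" code page of a Western-locale Windows machine; the path and the helper name encodable are invented for the example.

```python
def encodable(path, encoding):
    """Return True if the path survives encoding to an 8-bit string."""
    try:
        path.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# A hypothetical sys.path entry containing Chinese characters.
chinese_dir = "C:\\\u4e2d\u6587\\lib"

# A UTF-8 filesystem encoding (typical on POSIX) can represent it.
utf8_ok = encodable(chinese_dir, "utf-8")

# A single-byte Windows code page (assumed here: cp1252) cannot.
cp1252_ok = encodable(chinese_dir, "cp1252")
```

With these assumptions utf8_ok comes out true and cp1252_ok false, which is the reason the import machinery would need the wide character API on Windows rather than an encode step.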
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From martin at v.loewis.de Sat Sep 9 09:23:32 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 09 Sep 2006 09:23:32 +0200 Subject: [Python-Dev] Unicode Imports In-Reply-To: <4502576C.3060604@gmail.com> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> <4501D7BD.1020006@v.loewis.de> <4502576C.3060604@gmail.com> Message-ID: <45026BF4.5080108@v.loewis.de> Nick Coghlan schrieb: > So this is taking something that *already works properly on POSIX > systems* and making it work on Windows as well. I doubt it does without side effects. For example, an application that would go through sys.path, and encode everything with sys.getfilesystemencoding() currently works, but will break if the patch is applied and non-mbcs strings are put on sys.path. Also, what will be the effect on __file__? What value will it have if the module originates from a sys.path entry that is a non-mbcs unicode string? I haven't tested the patch, but it looks like __file__ becomes a unicode string on Windows, and remains a byte string encoded with the file system encoding elsewhere. That's also a change in behavior. Regards, Martin From brett at python.org Sat Sep 9 09:23:54 2006 From: brett at python.org (Brett Cannon) Date: Sat, 9 Sep 2006 00:23:54 -0700 Subject: [Python-Dev] 2.5 status In-Reply-To: References: <44FDD122.3000809@egenix.com> Message-ID: On 9/7/06, Neal Norwitz wrote: > > On 9/5/06, Brett Cannon wrote: > > > > > [MAL] > > > The proper fix would be to introduce a tp_unicode slot and let > > > this decide what to do, ie. call .__unicode__() methods on instances > > > and use the .__name__ on classes. > > > > That was my bug reaction and what I said on the bug report. Kind of > > surprised one doesn't already exist. 
> > > > > I think this would be the right way to go for Python 2.6. For > > > Python 2.5, just dropping this .__unicode__ method on exceptions > > > is probably the right thing to do. > > > > Neal, do you want to rip it out or should I? > > Is removing __unicode__ backwards compatible with 2.4 for both > instances and exception classes? > > Does everyone agree this is the proper approach? I'm not familiar > with this code. Brett, if everyone agrees (ie, remains silent), > please fix this and add tests and a NEWS entry. Done. Even updated PEP 356 for you while I was at it. =) -Brett From nmm1 at cus.cam.ac.uk Sat Sep 9 12:56:06 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Sat, 09 Sep 2006 11:56:06 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions Message-ID: I was hoping to have stopped, but here are a few comments. I agree with Jan Kanis. That is the way to tackle this one. "Adam Olsen" wrote: > > I don't think we should let this die, at least not yet. Nick seems to > be arguing that ANY signal handler is prone to random crashes or > corruption (due to bugs). However, we already have a signal handler, > so we should already be exposed to the random crashes/corruption. No. I am afraid that is a common myth and often catastrophic mistake. In this sort of area, NEVER assume that even apparently unrelated changes won't cause 'working' code to misbehave. Yes, Python is already exposed, but it would be easy to turn a very rare failure into a more common one. What I was actually arguing for was defensive programming. > If we're going to rely on signal handling being correct then I think > we should also rely on write() being correct.
Note that I'm not > suggesting an API that allows arbitrary signal handlers, but rather > one that calls write() on an array of prepared file descriptors > (ignoring errors). For your interpretation of 'correct'. The cause of this chaos is that the C and POSIX standards are inconsistent, even internally, and they are wildly incompatible. So, even if things 'work' today, don't bet on the next release of your favourite system behaving the same way. It wouldn't matter if there was a de facto standard (i.e. a consensus), but there isn't. > Ensuring modifications to that array are atomic would be tricky, but I > think it would be doable if we use a read-copy-update approach (with > two alternating signal handler functions). Not sure how to ensure > there's no currently running signal handlers in another thread though. > Maybe have to rip the atomic read/write stuff out of the Linux > sources to ensure it's *always* defined behavior. Yes. But even that wouldn't solve the problem, as that code is very gcc-specific. > Looking into the existing signalmodule.c, I see no attempts to ensure > atomic access to the Handlers data structure. Is the current code > broken, at least on non-x86 platforms? Well, at a quick glance at the actual handler (the riskiest bit): 1) It doesn't check the signal range - bad practice, as systems do sometimes generate wayward numbers. 2) Handlers[sig_num].tripped = 1; is formally undefined, but actually pretty safe. If that breaks, nothing much will work. It would be better to make the int sig_atomic_t, as you say. 3) is_tripped++; and Py_AddPendingCall(checksignals_witharg, NULL); will work only because the handler ignores all signals in subthreads (which is definitely NOT right, as the comments say). Despite the implication, the code of Py_AddPendingCall is NOT safe against simultaneous registration. It is just plain broken, I am afraid. 
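The kind of simultaneous-registration failure meant here can be shown with a toy model. This is not the CPython source: it merely assumes a fixed-size queue protected by nothing more than a check-then-set "busy" flag, and every name in it (add_pending_call, run_in_lockstep, the queue dict) is hypothetical. Each generator yields at the points where another thread could be scheduled, so the unlucky interleaving can be replayed deterministically.

```python
def add_pending_call(queue, name):
    """One registration, yielding wherever another thread could run."""
    if queue["busy"]:
        return  # a real caller would report failure here
    yield "checked busy"
    queue["busy"] = True
    i = queue["last"]
    yield "read index"
    queue["slots"][i] = name
    queue["last"] = i + 1
    queue["busy"] = False
    yield "stored"


def run_in_lockstep(*callers):
    """Advance every caller one step at a time, in almost perfect step."""
    callers = list(callers)
    while callers:
        for caller in list(callers):
            try:
                next(caller)
            except StopIteration:
                callers.remove(caller)


queue = {"busy": False, "last": 0, "slots": [None] * 4}
run_in_lockstep(add_pending_call(queue, "A"), add_pending_call(queue, "B"))

# Both callers saw busy == False and both read index 0, so the second
# store overwrote the first: slots is ['B', None, None, None], last is 1.
print(queue["slots"], queue["last"])
```

Two registrations went in and only one entry survived, with neither caller being told anything went wrong.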
The note starting "Darn" should be a LOT stronger :-) [ For example, think of two threads calling the function at exactly the same time, in almost perfect step. Oops. ] I can't honestly promise to put any time into this in the foreseeable future, but will try (sometime). If anyone wants to tackle this, please ask me for comments/help/etc. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From gjcarneiro at gmail.com Sat Sep 9 12:59:23 2006 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Sat, 9 Sep 2006 11:59:23 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: <09f901c6a72c$495f2690$12472597@bagio> <20060714112137.GA891@Andrew-iBook2.local> <44B8A90C.6070309@v.loewis.de> <1f7befae0607152047u43993a15ue5180b990f9a530f@mail.gmail.com> Message-ID: On 9/9/06, Jan Kanis wrote: > At the risk of waking up a thread that was already declared dead, > perhaps this is useful. > > So, what happens is that Python's signal handler sets a flag and registers a > callback. Then the main thread should check the flag and make the callback > to actually do something with the signal. However, the main thread is > blocked in GTK and can't check the flag. > > Nick Maclaren wrote: > ...lots of reasons why you can't do anything reliably from within a signal > handler... > > As far as I understand it, what could work is this: > -PyGTK registers a callback. > -Python's signal handler does not change at all. > -All threads that run in the Python interpreter occasionally check the > flag which the signal handler sets, like the main thread does nowadays. If > it is set, the thread calls PyGTK's callback. It does not do anything else > with the signal. > -PyGTK's callback wakes up the main thread, which actually handles the > signal just like it does now.
> > PyGTK's callback could be called from any thread, but it would be called in > a normal context, not in a signal handler. As the signal handler does not > change, the risk of breaking anything or causing chaos is as large/small > as it is under the current scheme. > However, PyGTK's problem does get > solved, as long as there is _a_ thread that returns to the interpreter > within some timeframe. It seems plausible that this will happen. No, it is not plausible at all. For instance, the GnomeVFS library usually has a pool of threads, not doing anything, waiting for some VFS task. It is likely that a signal will be delivered to one of these threads, which know nothing about Python, and sit idle most of the time. Regards. From gjcarneiro at gmail.com Sat Sep 9 13:11:19 2006 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Sat, 9 Sep 2006 12:11:19 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: <09f901c6a72c$495f2690$12472597@bagio> <20060714112137.GA891@Andrew-iBook2.local> <44B8A90C.6070309@v.loewis.de> <1f7befae0607152047u43993a15ue5180b990f9a530f@mail.gmail.com> Message-ID: On 9/9/06, Adam Olsen wrote: > On 9/8/06, Adam Olsen wrote: > > Ensuring modifications to that array are atomic would be tricky, but I > > think it would be doable if we use a read-copy-update approach (with > > two alternating signal handler functions). Not sure how to ensure > > there's no currently running signal handlers in another thread though. > > Maybe have to rip the atomic read/write stuff out of the Linux > > sources to ensure it's *always* defined behavior. > > Doh, except that's exactly what sig_atomic_t is for. Ah well, can't > win them all. From the glibc manual: """ To avoid uncertainty about interrupting access to a variable, you can use a particular data type for which access is always atomic: sig_atomic_t.
Reading and writing this data type is guaranteed to happen in a single instruction, so there's no way for a handler to run "in the middle" of an access. """ So, no, this is certainly not the same as linux kernel atomic operations, which allow you to do more interesting stuff like, test-and-clear, or decrement-and-test atomically. glib has those too, and so does mozilla's NSPR, but only on a few architectures does it do it without using mutexes. for instance, i686 onwards don't require mutexes, only special instructions, but i386 requires mutexes. And we all know mutexes in signal handlers cause deadlocks :-( And, yes, Py_AddPendingCall and Py_MakePendingCalls are most certainly not async safe! Just look at the source code of Py_MakePendingCalls and you'll see an interesting comment... Therefore, discussions about signal safety in whatever new API we may add to Python should be taken with a grain of salt. Regards. From gjcarneiro at gmail.com Sat Sep 9 13:38:03 2006 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Sat, 9 Sep 2006 12:38:03 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: Message-ID: On 9/9/06, Nick Maclaren wrote: > I was hoping to have stopped, but here are a few comments. > > I agree with Jan Kanis. That is the way to tackle this one. Alas, it doesn't work in practice, as I already replied. [...] > Despite the implication, the code of Py_AddPendingCall is NOT safe > against simultaneous registration. It is just plain broken, I am > afraid. The note starting "Darn" should be a LOT stronger :-) Considering that this code has existed for a very long time, and that it isn't really safe, should we even bother to try to make signals 100% reliable? I remember about a security-related module (bastion?) 
that first claimed to allow execution of malicious code while protecting the system; later, they figured out it wasn't really safe, and couldn't be safe, so the documentation was simply changed to state not to use that module if you need real security. I see the same problem here. Python signal handling isn't _really_ 100% reliable. And it would be very hard to make Py_AddPendingCall / Py_MakePendingCalls completely reliable. But let's think for a moment. Do we really _need_ to make Python unix signal handling 100% reliable? What are the uses for signals? I can only understand a couple of uses: handling of SIGINT for generating KeyboardInterrupt [1], and handling of fatal errors like SIGSEGV in order to show a crash dialog and bug reporting tool. The second use case doesn't demand 100% reliability. The second use case is currently being handled also in recent Ubuntu Linux through /proc/sys/kernel/crashdump-helper. Other notable uses that I see of signals are sending SIGUSR1 or SIGHUP to a daemon to make it reload its configuration. But any competent programmer already knows how to make the program use local sockets instead. [1] Although ideally Python wouldn't even have KeyboardInterrupt and just die on Ctrl-C like any normal program. From steve at holdenweb.com Sat Sep 9 14:33:24 2006 From: steve at holdenweb.com (Steve Holden) Date: Sat, 09 Sep 2006 13:33:24 +0100 Subject: [Python-Dev] Unicode Imports In-Reply-To: <45026BF4.5080108@v.loewis.de> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> <4501D7BD.1020006@v.loewis.de> <4502576C.3060604@gmail.com> <45026BF4.5080108@v.loewis.de> Message-ID: <4502B494.3080509@holdenweb.com> Martin v. Löwis wrote: > Nick Coghlan schrieb: > >>So this is taking something that *already works properly on POSIX >>systems* and making it work on Windows as well. > > > I doubt it does without side effects.
For example, an application that > would go through sys.path, and encode everything with > sys.getfilesystemencoding() currently works, but will break if the patch > is applied and non-mbcs strings are put on sys.path. > > Also, what will be the effect on __file__? What value will it have > if the module originates from a sys.path entry that is a non-mbcs > unicode string? I haven't tested the patch, but it looks like > __file__ becomes a unicode string on Windows, and remains a byte > string encoded with the file system encoding elsewhere. That's also > a change in behavior. > Just to summarise my feeling having read the words of those more familiar with the issues than me: it looks like this should be a 2.6 enhancement if it's included at all. I'd like to see it go in, but there do seem to be problems ensuring consistent behaviour across inconsistent platforms. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From jan-python at maka.demon.nl Sat Sep 9 16:06:20 2006 From: jan-python at maka.demon.nl (Jan Kanis) Date: Sat, 09 Sep 2006 16:06:20 +0200 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: <09f901c6a72c$495f2690$12472597@bagio> <20060714112137.GA891@Andrew-iBook2.local> <44B8A90C.6070309@v.loewis.de> <1f7befae0607152047u43993a15ue5180b990f9a530f@mail.gmail.com> Message-ID: On Sat, 09 Sep 2006 12:59:23 +0200, Gustavo Carneiro wrote: > On 9/9/06, Jan Kanis wrote: >> However, PyGTKs problem does get >> solved, as long as there is _a_ thread that returns to the interpreter >> within some timeframe. It seems plausible that this will happen. > > No, it is not plausible at all. For instance, the GnomeVFS library > usually has a pool of thread, not doing anything, waiting for some VFS > task. It is likely that a signal will be delivered to one of these > threads, which know nothing about Python, and sit idle most of the > time. > > Regards. Well, perhaps it isn't plausible in all cases.
However, it is dependent on the libraries you're using and debuggable, which broken signal handlers apparently aren't. The approach would work if you don't use libraries that block threads, and if those that do co-operate with the interpreter. Open source libraries can be made to co-operate, and if you don't have the source and a library doesn't work correctly, all bets are off anyway. But having the signal handler itself write to a pipe seems to be a cleaner solution, if it can work reliably enough for some value of 'reliable'. Jan From david.nospam.hopwood at blueyonder.co.uk Sat Sep 9 17:26:03 2006 From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood) Date: Sat, 09 Sep 2006 16:26:03 +0100 Subject: [Python-Dev] Unicode Imports In-Reply-To: <45026BF4.5080108@v.loewis.de> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> <4501D7BD.1020006@v.loewis.de> <4502576C.3060604@gmail.com> <45026BF4.5080108@v.loewis.de> Message-ID: <4502DD0B.2090903@blueyonder.co.uk> Martin v. Löwis wrote: > Nick Coghlan schrieb: > >>So this is taking something that *already works properly on POSIX >>systems* and making it work on Windows as well. > > I doubt it does without side effects. For example, an application that > would go through sys.path, and encode everything with > sys.getfilesystemencoding() currently works, but will break if the patch > is applied and non-mbcs strings are put on sys.path. Huh? It won't break on any path for which it is not already broken. You seem to be saying "Paths with non-mbcs strings shouldn't work on Windows, because they haven't worked in the past."
-- David Hopwood From martin at v.loewis.de Sat Sep 9 17:34:19 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 09 Sep 2006 17:34:19 +0200 Subject: [Python-Dev] Unicode Imports In-Reply-To: <4502DD0B.2090903@blueyonder.co.uk> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> <4501D7BD.1020006@v.loewis.de> <4502576C.3060604@gmail.com> <45026BF4.5080108@v.loewis.de> <4502DD0B.2090903@blueyonder.co.uk> Message-ID: <4502DEFB.5030904@v.loewis.de> David Hopwood schrieb: >> I doubt it does without side effects. For example, an application that >> would go through sys.path, and encode everything with >> sys.getfilesystemencoding() currently works, but will break if the patch >> is applied and non-mbcs strings are put on sys.path. > > Huh? It won't break on any path for which it is not already broken. > > You seem to be saying "Paths with non-mbcs strings shouldn't work on Windows, > because they haven't worked in the past." That's not what I'm saying. I'm saying that it shouldn't work in 2.5.x, because it didn't in 2.5.0. Changing it in 2.6 is fine, along with the incompatibilities it causes. Regards, Martin From ncoghlan at gmail.com Sat Sep 9 19:05:36 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 10 Sep 2006 03:05:36 +1000 Subject: [Python-Dev] Unicode Imports In-Reply-To: <4502DD0B.2090903@blueyonder.co.uk> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> <4501D7BD.1020006@v.loewis.de> <4502576C.3060604@gmail.com> <45026BF4.5080108@v.loewis.de> <4502DD0B.2090903@blueyonder.co.uk> Message-ID: <4502F460.5040308@gmail.com> David Hopwood wrote: > Martin v. L?wis wrote: >> Nick Coghlan schrieb: >> >>> So this is taking something that *already works properly on POSIX >>> systems* and making it work on Windows as well. >> I doubt it does without side effects. 
For example, an application that >> would go through sys.path, and encode everything with >> sys.getfilesystemencoding() currently works, but will break if the patch >> is applied and non-mbcs strings are put on sys.path. > > Huh? It won't break on any path for which it is not already broken. > > You seem to be saying "Paths with non-mbcs strings shouldn't work on Windows, > because they haven't worked in the past."

I think MvL is looking at it from the point of view of consumers of the list of strings in sys.path, such as PEP 302 importer and loader objects, and tools like module_finder. Currently, the list of values in sys.path is limited to:

1. 8-bit strings
2. Unicode strings containing only characters which can be encoded using the default file system encoding

For PEP 302 loaders, it is currently correct for them to take the 8-bit string they receive and do "path.decode(sys.getfilesystemencoding())".

Kristján's patch works nicely for his application because he doesn't have to worry about compatibility with existing loaders and utilities. The core doesn't have that luxury. We *might* be able to find a backwards compatible way to do it that could be put into 2.5.x, but that is effort that could more profitably be spent elsewhere, particularly since the state of the import system in Py3k will be for it to be based entirely on Unicode (as GvR pointed out last time this topic came up [1]). Cheers, Nick.
http://mail.python.org/pipermail/python-dev/2006-June/066225.html -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From martin at v.loewis.de Sat Sep 9 19:42:17 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 09 Sep 2006 19:42:17 +0200 Subject: [Python-Dev] Unicode Imports In-Reply-To: <4502F460.5040308@gmail.com> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> <4501D7BD.1020006@v.loewis.de> <4502576C.3060604@gmail.com> <45026BF4.5080108@v.loewis.de> <4502DD0B.2090903@blueyonder.co.uk> <4502F460.5040308@gmail.com> Message-ID: <4502FCF9.2090403@v.loewis.de> Nick Coghlan schrieb: > I think MvL is looking at it from the point of view of consumers of the list > of strings in sys.path, such as PEP 302 importer and loader objects, and tools > like module_finder. Currently, the list of values in sys.path is limited to: That, and all kinds of inspection tools. For example, when __file__ of a module object changes to be a Unicode string (which it does under the proposed patch), then these tools break. They currently don't break in that way because putting arbitrary Unicode strings on sys.path doesn't work in the first place. Regards, Martin From martin at v.loewis.de Sat Sep 9 20:10:19 2006 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 09 Sep 2006 20:10:19 +0200 Subject: [Python-Dev] Interest in a Python 2.3.6? In-Reply-To: <05D21426-7FF5-4AEF-B757-DF0BEC5D0D74@python.org> References: <05D21426-7FF5-4AEF-B757-DF0BEC5D0D74@python.org> Message-ID: <4503038B.8060507@v.loewis.de> Barry Warsaw schrieb: > Thoughts? I don't want to waste my time if nobody thinks a 2.3.6 would > be useful, but I'm happy to do it if there's community support. 
I'll > also need the usual help with Windows installers and documentation updates. I personally would consider it a waste of time. Since it wouldn't waste *my* time, I'm -0 :-) I think everybody has arranged with whatever quirks Python 2.3 has. Distributors of Python 2.3 have added whatever patches they think are absolutely necessary. Making another release could cause confusion; at worst, it may cause people to special-case people for 2.3.6 in case the release contains some incompatible change that affects existing applications. Regards, Martin From david.nospam.hopwood at blueyonder.co.uk Sat Sep 9 20:52:48 2006 From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood) Date: Sat, 09 Sep 2006 19:52:48 +0100 Subject: [Python-Dev] Unicode Imports In-Reply-To: <4502F460.5040308@gmail.com> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> <4501D7BD.1020006@v.loewis.de> <4502576C.3060604@gmail.com> <45026BF4.5080108@v.loewis.de> <4502DD0B.2090903@blueyonder.co.uk> <4502F460.5040308@gmail.com> Message-ID: <45030D80.9080105@blueyonder.co.uk> Nick Coghlan wrote: > David Hopwood wrote: >> Martin v. L?wis wrote: >>> Nick Coghlan schrieb: >>> >>>> So this is taking something that *already works properly on POSIX >>>> systems* and making it work on Windows as well. >>> >>> I doubt it does without side effects. For example, an application that >>> would go through sys.path, and encode everything with >>> sys.getfilesystemencoding() currently works, but will break if the patch >>> is applied and non-mbcs strings are put on sys.path. >> >> Huh? It won't break on any path for which it is not already broken. >> >> You seem to be saying "Paths with non-mbcs strings shouldn't work on >> Windows, because they haven't worked in the past." 
> > I think MvL is looking at it from the point of view of consumers of the > list of strings in sys.path, such as PEP 302 importer and loader > objects, and tools like module_finder. Currently, the list of values in > sys.path is limited to: > > 1. 8-bit strings > 2. Unicode strings containing only characters which can be encoded using > the default file system encoding On Windows, file system pathnames can contain arbitrary Unicode characters (well, almost). Despite the existence of "ANSI" filesystem APIs, and regardless of what 'sys.getfilesystemencoding()' returns, the underlying file system encoding for NTFS and FAT filesystems is UTF-16LE. Thus, either: - the fact that sys.getfilesystemencoding() returns a non-Unicode encoding on Windows is a bug, or - any program that relies on sys.getfilesystemencoding() being able to encode arbitrary Windows pathnames has a bug. We need to decide which of these is the case. -- David Hopwood From martin at v.loewis.de Sat Sep 9 21:16:45 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 09 Sep 2006 21:16:45 +0200 Subject: [Python-Dev] Unicode Imports In-Reply-To: <45030D80.9080105@blueyonder.co.uk> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> <4501D7BD.1020006@v.loewis.de> <4502576C.3060604@gmail.com> <45026BF4.5080108@v.loewis.de> <4502DD0B.2090903@blueyonder.co.uk> <4502F460.5040308@gmail.com> <45030D80.9080105@blueyonder.co.uk> Message-ID: <4503131D.20806@v.loewis.de> David Hopwood schrieb: > On Windows, file system pathnames can contain arbitrary Unicode characters > (well, almost). Despite the existence of "ANSI" filesystem APIs, and > regardless of what 'sys.getfilesystemencoding()' returns, the underlying > file system encoding for NTFS and FAT filesystems is UTF-16LE. 
> > Thus, either: > - the fact that sys.getfilesystemencoding() returns a non-Unicode encoding > on Windows is a bug, or > - any program that relies on sys.getfilesystemencoding() being able to > encode arbitrary Windows pathnames has a bug. > > We need to decide which of these is the case. There is a third option: - the operating system has a bug It is actually this option that rules out the other two. sys.getfilesystemencoding() returns "mbcs" on Windows, which means CP_ACP. The file system encoding is an encoding that converts a file name into a byte string. Unfortunately, on Windows, there are file names which cannot be converted into a byte string in a standard manner. This is an operating system bug (or mis-design; they should have chosen UTF-8 as the byte encoding of file names, instead of making it depend on the system locale, but they of course did so for backwards compatibility with Windows 3.1 and 9x). As a side note: every encoding in Python is a Unicode encoding; so there aren't any "non-Unicode encodings". Programs that rely on sys.getfilesystemencoding() being able to represent arbitrary file names on Windows might have a bug; programs that rely on sys.getfilesystemencoding() being able to encode all elements of sys.path do not (at least not for Python 2.5 and earlier). Regards, Martin From barry at python.org Sat Sep 9 22:41:04 2006 From: barry at python.org (Barry Warsaw) Date: Sat, 9 Sep 2006 16:41:04 -0400 Subject: [Python-Dev] Interest in a Python 2.3.6? In-Reply-To: <4503038B.8060507@v.loewis.de> References: <05D21426-7FF5-4AEF-B757-DF0BEC5D0D74@python.org> <4503038B.8060507@v.loewis.de> Message-ID: <204A1476-028D-4D75-98C0-BECEA3509C39@python.org> On Sep 9, 2006, at 2:10 PM, Martin v. Löwis wrote: > Barry Warsaw schrieb: >> Thoughts? I don't want to waste my time if nobody thinks a 2.3.6 >> would >> be useful, but I'm happy to do it if there's community support.
I'll >> also need the usual help with Windows installers and documentation >> updates. > > I personally would consider it a waste of time. Since it wouldn't > waste > *my* time, I'm -0 :-) > > I think everybody has arranged with whatever quirks Python 2.3 has. > Distributors of Python 2.3 have added whatever patches they think are > absolutely necessary. Making another release could cause confusion; > at worst, it may cause people to special-case people for 2.3.6 in > case the release contains some incompatible change that affects > existing applications. Well, there certainly hasn't been an overwhelming chorus of support for the idea, so I think I'll waste my time elsewhere ;). Consider the offer withdrawn. -Barry From david.nospam.hopwood at blueyonder.co.uk Sat Sep 9 23:22:10 2006 From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood) Date: Sat, 09 Sep 2006 22:22:10 +0100 Subject: [Python-Dev] Unicode Imports In-Reply-To: <4503131D.20806@v.loewis.de> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> <4501D7BD.1020006@v.loewis.de> <4502576C.3060604@gmail.com> <45026BF4.5080108@v.loewis.de> <4502DD0B.2090903@blueyonder.co.uk> <4502F460.5040308@gmail.com> <45030D80.9080105@blueyonder.co.uk> <4503131D.20806@v.loewis.de> Message-ID: <45033082.8080209@blueyonder.co.uk> Martin v. L?wis wrote: > David Hopwood schrieb: > >>On Windows, file system pathnames can contain arbitrary Unicode characters >>(well, almost). Despite the existence of "ANSI" filesystem APIs, and >>regardless of what 'sys.getfilesystemencoding()' returns, the underlying >>file system encoding for NTFS and FAT filesystems is UTF-16LE. >> >>Thus, either: >> - the fact that sys.getfilesystemencoding() returns a non-Unicode encoding >> on Windows is a bug, or >> - any program that relies on sys.getfilesystemencoding() being able to >> encode arbitrary Windows pathnames has a bug. 
>> >>We need to decide which of these is the case. > > There is a third option: > - the operating system has a bug This behaviour is by design. If it is a bug, then it is a "won't ever fix -- no way, no how" bug, that Python must accommodate if it is to properly support Unicode on Windows. > It is actually this option that rules out the other two. > sys.getfilesystemencoding() returns "mbcs" on Windows, which means > CP_ACP. The file system encoding is an encoding that converts a > file name into a byte string. Unfortunately, on Windows, there are > file names which cannot be converted into a byte string in a standard > manner. This is an operating system bug (or mis-design; they should > have chosen UTF-8 as the byte encoding of file names, instead of > making it depend on the system locale, but they of course did so > for backwards compatibility with Windows 3.1 and 9x). Although UTF-8 was invented (in September 1992) technically before the release of the first version of NT supporting NTFS (NT 3.1 in July 1993), it had not been invented before the decision to use Unicode in NTFS, or in Windows NT's file APIs, had been made. (I believe OS/2 HPFS had not supported Unicode, even though NTFS was otherwise almost identical to it.) At that time, the decision to use Unicode at all was quite forward-looking; the final version of Unicode 1.0 had only been published in June 1992 (although it had been approved earlier; see ). UTF-8 was only officially added to the Unicode standard in an appendix of Unicode 2.0 (published July 1996), and only given essentially equal status to UTF-16 and UTF-32 in Unicode 3.0 (September 1999). > As a side note: every encoding in Python is a Unicode encoding; > so there aren't any "non-Unicode encodings". It was clear from context that I meant "encoding capable of representing all Unicode characters".
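
As a rough illustration of what "capable of representing all Unicode characters" means in practice, here is a minimal sketch that is not taken from the thread. It is written for a current Python 3 rather than 2.5, and it assumes `cp1252` as a stand-in for a typical Windows "ANSI" code page, since the `mbcs` codec is only available on Windows. The helper `can_encode` and the file name are hypothetical.

```python
def can_encode(text, encoding):
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Hypothetical module file name containing Japanese characters.
name = "\u65e5\u672c\u8a9e_module.py"

# cp1252 stands in for a locale-dependent code page (an assumption);
# the UTF encodings can represent any Unicode character.
for encoding in ("cp1252", "utf-8", "utf-16-le"):
    print(encoding, can_encode(name, encoding))
```

A code page such as `cp1252` covers only a few hundred characters, so a consumer that encodes path entries with a locale-dependent code page can fail on a name that the file system itself stores without trouble.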
> Programs that rely on sys.getfilesystemencoding() being able to > represent arbitrary file names on Windows might have a bug; > programs that rely on sys.getfilesystemencoding() being able > to encode all elements of sys.path do not (at least not for > Python 2.5 and earlier). Elements of sys.path can be Unicode strings in Python 2.5, and should be pathnames supported by the underlying OS. Where is it documented that there is any further restriction on them? And why should there be any further restriction on them? -- David Hopwood From martin at v.loewis.de Sat Sep 9 23:55:20 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 09 Sep 2006 23:55:20 +0200 Subject: [Python-Dev] Unicode Imports In-Reply-To: <45033082.8080209@blueyonder.co.uk> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> <4501D7BD.1020006@v.loewis.de> <4502576C.3060604@gmail.com> <45026BF4.5080108@v.loewis.de> <4502DD0B.2090903@blueyonder.co.uk> <4502F460.5040308@gmail.com> <45030D80.9080105@blueyonder.co.uk> <4503131D.20806@v.loewis.de> <45033082.8080209@blueyonder.co.uk> Message-ID: <45033848.2020307@v.loewis.de> David Hopwood schrieb: > Elements of sys.path can be Unicode strings in Python 2.5, and should be > pathnames supported by the underlying OS. Where is it documented that there > is any further restriction on them? And why should there be any further > restriction on them? It's not documented in that detail; if people think it should be documented more thoroughly, that should be done (contributions are welcome). Changing the import machinery to deal with Unicode strings differently cannot be done for Python 2.5, though: it cannot be done for 2.5.0 as the release candidate has already been published, and there is no acceptable patch available at this moment. It cannot be added to 2.5.x as it may reasonably break existing applications. 
Regards, Martin From jcarlson at uci.edu Sun Sep 10 00:23:50 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat, 09 Sep 2006 15:23:50 -0700 Subject: [Python-Dev] Python 2.4.4 was: Interest in a Python 2.3.6? In-Reply-To: <204A1476-028D-4D75-98C0-BECEA3509C39@python.org> References: <4503038B.8060507@v.loewis.de> <204A1476-028D-4D75-98C0-BECEA3509C39@python.org> Message-ID: <20060909151653.F8E7.JCARLSON@uci.edu> Barry Warsaw wrote: > Well, there certainly hasn't been an overwhelming chorus of support > for the idea, so I think I'll waste my time elsewhere ;). Consider > the offer withdrawn. I hope someone tries to fix one of the two bugs I listed that were problems for 2.3 and 2.4 in 2.4.4: http://www.python.org/sf/780714 http://www.python.org/sf/1548687 The former involves stack allocation errors in subthreads that exists even in 2.5, which may not be fixable in Windows, and very likely is not fixable on linux. The latter is fixable on all platforms. - Josiah From ncoghlan at gmail.com Sun Sep 10 04:24:38 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 10 Sep 2006 12:24:38 +1000 Subject: [Python-Dev] Unicode Imports In-Reply-To: <45033082.8080209@blueyonder.co.uk> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> <4501D7BD.1020006@v.loewis.de> <4502576C.3060604@gmail.com> <45026BF4.5080108@v.loewis.de> <4502DD0B.2090903@blueyonder.co.uk> <4502F460.5040308@gmail.com> <45030D80.9080105@blueyonder.co.uk> <4503131D.20806@v.loewis.de> <45033082.8080209@blueyonder.co.uk> Message-ID: <45037766.8030202@gmail.com> David Hopwood wrote: > Martin v. L?wis wrote: >> Programs that rely on sys.getfilesystemencoding() being able to >> represent arbitrary file names on Windows might have a bug; >> programs that rely on sys.getfilesystemencoding() being able >> to encode all elements of sys.path do not (at least not for >> Python 2.5 and earlier). 
> > Elements of sys.path can be Unicode strings in Python 2.5, and should be > pathnames supported by the underlying OS. Where is it documented that there > is any further restriction on them? And why should there be any further > restriction on them? There's no suggestion that this limitation shouldn't be fixed - merely that fixing it is likely to break some applications which rely on sys.path for importing or introspection purposes. A 2.5.x maintenance release typically shouldn't break anything that worked correctly on 2.5.0, hence fixing this becomes a project for either 2.6 or 3.0. To put it another way: fixing this is likely to require changes to more than just the interpreter core. It will also potentially require changes to all applications which currently expect to be able to use 's.encode(sys.getfilesystemencoding())' to convert any Unicode path entry or __file__ attribute to an 8-bit string. Doing that qualifies as correcting a language design error or limitation, but it would require a real stretch of the definition to qualify as a bug fix. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From greg.ewing at canterbury.ac.nz Sun Sep 10 09:35:53 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 10 Sep 2006 19:35:53 +1200 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: <09f901c6a72c$495f2690$12472597@bagio> <20060714112137.GA891@Andrew-iBook2.local> <44B8A90C.6070309@v.loewis.de> <1f7befae0607152047u43993a15ue5180b990f9a530f@mail.gmail.com> Message-ID: <4503C059.3070308@canterbury.ac.nz> Jan Kanis wrote: > However, PyGTKs problem does get > solved, as long as there is _a_ thread that returns to the interpreter > within some timeframe. It seems plausible that this will happen. 
I don't see that this makes the situation much better, as it just shifts the need for polling to another thread. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/greg.ewing%40canterbury.ac.nz From greg.ewing at canterbury.ac.nz Sun Sep 10 09:35:59 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 10 Sep 2006 19:35:59 +1200 Subject: [Python-Dev] Py_BuildValue and decref In-Reply-To: References: <20060908220605.GF990@abulafia.devel.redhat.com> Message-ID: <4503C05F.70508@canterbury.ac.nz> Barry Warsaw wrote: > I just want to point out that the C API documentation is pretty > silent about the refcounting side-effects in error conditions (and > often in success conditions too) of most Python functions. For > example, what is the refcounting side-effects of PyDict_SetItem() on > val? What about if that function fails? Has val been incref'd or > not? What about the side-effects on any value the new one replaces, > both in success and failure? The usual principle is that the refcounting behaviour is (or should be) independent of whether the function succeeded or failed. In the absence of any statement to the contrary in the docs, you should be able to assume that. The words used to describe the refcount behaviour of some functions can be rather confusing, but it always boils down to one of two cases: either the function "borrows" a reference (and does its own incref if needed, the caller doesn't need to care) or it "steals" a reference (so the caller is always responsible for doing an incref if needed before calling). What that rather convoluted comment about PyTuple_SetItem is trying to say is just that it *always* steals a reference, regardless of whether it succeeds or fails. I expect the same is true of Py_BuildValue. 
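
The "borrowing" case can be observed from Python itself. The following is a small sketch, not from the thread, and it relies on CPython-specific behaviour: `sys.getrefcount` exposes an implementation detail, so only the differences between readings are meaningful. It looks at what storing into a dict (the operation behind `PyDict_SetItem()`) does to the value's reference count; the "stealing" behaviour of `PyTuple_SetItem` is only visible from C and is not shown.

```python
import sys

val = object()
other = object()

# Absolute numbers are an implementation detail; compare differences only.
baseline = sys.getrefcount(val)

d = {}
d["key"] = val            # the dict takes its own reference to val
after_insert = sys.getrefcount(val)

d["key"] = other          # the dict drops its reference to the replaced value
after_replace = sys.getrefcount(val)

print("change after insert: ", after_insert - baseline)
print("change after replace:", after_replace - baseline)
```

The caller's own reference (the name `val`) is untouched throughout, which is the sense in which the caller "doesn't need to care".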
-- Greg From rhamph at gmail.com Mon Sep 11 06:32:43 2006 From: rhamph at gmail.com (Adam Olsen) Date: Sun, 10 Sep 2006 22:32:43 -0600 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: Message-ID: On 9/9/06, Nick Maclaren wrote: > I can't honestly promise to put any time into this in the foreseeable > future, but will try (sometime). If anyone wants to tackle this, > please ask me for comments/help/etc. It took me a while to realize just what was wrong with my proposal, but I did, and it led me to a new proposal. I'd appreciate it if you could point out any holes in it. First though, for the benefit of those reading, I'll try to explain the (multiple!) reasons why mine fails. First, sig_atomic_t essentially promises that the compiler will behave atomically and the CPU it's run on will behave locally atomic. It does not claim to make writes visible to other CPUs in an atomic way, and thus you could have different bytes show up at different times. The x86 architecture uses a very simple scheme and won't do this (unless the compiler itself does), but other architectures will. Second, the start of a write call may be delayed a very long time. This means that a fd may not be written to for hours until after the signal started. We can't release any fd's used for such a purpose, or else risk random writing to them if they get reused later. Third, it doesn't resolve the existing problems. If I'm going to fix signals I should fix ALL of signals. :) Now on to my new proposal. I do still use write(). If you can't accept that I think we should rip signals out entirely, just let them kill the process. Not a reliable feature of any OS. We create a single pipe and use it for all signals. We never release it, instead letting the OS do it when the process gets cleaned up. We write the signal number to it as a byte (assuming there's at most 256 unique signals).
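
The scheme described so far can be sketched at the Python level. This is only an illustration under stated assumptions, not code from the thread: the proposal concerns the interpreter's C-level signal handler, whereas the sketch uses an ordinary Python handler, assumes a POSIX system and a current Python 3, and all function names (`make_signal_pipe`, `make_handler`, `drain`) are made up for the example.

```python
import os
import signal


def make_signal_pipe():
    """Create the single, never-closed pipe; both ends are non-blocking."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)
    os.set_blocking(write_fd, False)
    return read_fd, write_fd


def make_handler(write_fd):
    """Build a handler that records each signal as one byte in the pipe."""
    def handler(signum, frame):
        try:
            os.write(write_fd, bytes([signum]))
        except BlockingIOError:
            # The pipe buffer is full, so this notification is dropped.
            pass
    return handler


def drain(read_fd):
    """Return the signal numbers currently queued in the pipe, oldest first."""
    pending = []
    while True:
        try:
            chunk = os.read(read_fd, 64)
        except BlockingIOError:
            break          # nothing (more) to read right now
        if not chunk:
            break          # write end closed; not expected in this scheme
        pending.extend(chunk)
    return pending


# Possible wiring (commented out; installing handlers needs the main thread):
#   read_fd, write_fd = make_signal_pipe()
#   signal.signal(signal.SIGINT, make_handler(write_fd))
#   ... poll or select on read_fd, then call drain(read_fd) ...
```

Because the read end is an ordinary file descriptor, an event loop could wait on it alongside its other descriptors and call `drain()` when it becomes readable.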
This much would allow a GUI's poll loop to wake up when there is a signal, and give control back to the python main loop, which could then read off the signals and queue up their handler functions. The only problem is when there is no GUI poll loop. We don't want python to have to poll the fd, we'd rather it just check a variable. Is it possible to set/clear a flag in a sufficiently portable (reentrant-safe, non-blocking, thread-safe) fashion? -- Adam Olsen, aka Rhamphoryncus From nnorwitz at gmail.com Mon Sep 11 06:54:49 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Sun, 10 Sep 2006 21:54:49 -0700 Subject: [Python-Dev] 2.5c2 Message-ID: PEP 356 http://www.python.org/dev/peps/pep-0356/ has 2.5c2 scheduled for Sept 12. I checked in a fix for the last blocking 2.5 issue (revert sgml infinite loop bug). There are no blocking issues that I know of (the PEP is up to date). I expect Anthony will call for a freeze real soon now. It would be awesome if there were no more changes from now until for 2.5 final! (Changing the trunk or 2.4 branches are fine, updating doc for 2.5 is also fine). I will be running valgrind over 2.5, but don't expect anything to show up since the last run was pretty recent. Coverity has no outstanding issues and Klocwork results are pretty clean. It's not clear if the remaining warnings from Klocwork are real issues or not. Keep doing a bunch of testing so we don't have any surprises in 2.5. n PS Scary as it sounds, I hope to have an HP-UX buildbot up and running real soon now. After 2.5 is out, I will fix the issues with the cygwin bot (ie, upgrade cygwin) and get the HP-UX bot running. From nnorwitz at gmail.com Mon Sep 11 10:34:21 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Mon, 11 Sep 2006 01:34:21 -0700 Subject: [Python-Dev] _PyGILState_NoteThreadState should be static or not? 
Message-ID: Michael, In Python/pystate.c, you made this checkin: """ r39044 | mwh | 2005-06-20 12:52:57 -0400 (Mon, 20 Jun 2005) | 8 lines Fix bug: [ 1163563 ] Sub threads execute in restricted mode basically by fixing bug 1010677 in a non-broken way. """ _PyGILState_NoteThreadState is declared as static on line 54, but the definition on line 508 is not static. (HP's cc is complaining.) I don't see this referenced in any header file, it seems like this should be static? $ grep _PyGILState_NoteThreadState */*.[ch] Python/pystate.c:static void _PyGILState_NoteThreadState(PyThreadState* tstate); Python/pystate.c: _PyGILState_NoteThreadState(tstate); Python/pystate.c: _PyGILState_NoteThreadState(t); Python/pystate.c:_PyGILState_NoteThreadState(PyThreadState* tstate) n From mwh at python.net Mon Sep 11 10:40:23 2006 From: mwh at python.net (Michael Hudson) Date: Mon, 11 Sep 2006 09:40:23 +0100 Subject: [Python-Dev] _PyGILState_NoteThreadState should be static or not? In-Reply-To: References: Message-ID: On 11 Sep 2006, at 09:34, Neal Norwitz wrote: > Michael, > > In Python/pystate.c, you made this checkin: > > """ > r39044 | mwh | 2005-06-20 12:52:57 -0400 (Mon, 20 Jun 2005) | 8 lines > > Fix bug: [ 1163563 ] Sub threads execute in restricted mode > basically by fixing bug 1010677 in a non-broken way. > """ > > _PyGILState_NoteThreadState is declared as static on line 54, but the > definition on line 508 is not static. (HP's cc is complaining.) I > don't see this referenced in any header file, it seems like this > should be static? It seems very likely, yes. I think at one point (in my working copy) there was a call in Modules/threadmodule.c, which may partially account for my confusion. Seems we have lots of HP users tracking SVN HEAD, then...
Cheers, mwh From anthony at interlink.com.au Mon Sep 11 13:58:13 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Mon, 11 Sep 2006 21:58:13 +1000 Subject: [Python-Dev] BRANCH FREEZE: release25-maint, 00:00UTC 12 September 2006 Message-ID: <200609112158.19000.anthony@interlink.com.au> Ok, I haven't heard back from Martin, but I'm going to hope he's OK with tomorrow as a release date for 2.5rc2. If he's not, we'll try for the day after. In any case, I'm going to declare the release25-maint branch FROZEN as at 00:00 UTC on the 12th. That's about 12 hours from now. This is for 2.5rc2. Once this is out, I'd like to see as close to zero changes as possible for the next week or so until 2.5 final is released. My god, it's getting so close... Anthony -- Anthony Baxter It's never too late to have a happy childhood. From gjcarneiro at gmail.com Mon Sep 11 16:16:44 2006 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Mon, 11 Sep 2006 15:16:44 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: Message-ID: On 9/11/06, Adam Olsen wrote: > Now on to my new proposal. I do still use write(). If you can't > accept that I think we should rip signals out entirely, just let them > kill the process. Not a reliable feature of any OS. > > We create a single pipe and use it for all signals. We never release > it, instead letting the OS do it when the process gets cleaned up. We > write the signal number to it as a byte (assuming there's at most 256 > unique signals). > > This much would allow a GUI's poll loop to wake up when there is a > signal, and give control back to the python main loop, which could > then read off the signals and queue up their handler functions. I like this approach. Not only we would get a poll-able file descriptor to notify a GUI main loop when signals arrive, we'd also avoid the lack of async safety in Py_AddPendingCall / Py_MakePendingCalls which affects _current_ Python code. 
Note that the file descriptor of the read end of the pipe has to become a public Python API so that 3rd party extensions may poll it. This is crucial. > > The only problem is when there is no GUI poll loop. We don't want > python to have to poll the fd, we'd rather it just check a variable. > Is it possible to set/clear a flag in a sufficiently portable > (reentrant-safe, non-blocking, thread-safe) fashion? It's simple. That pipe file descriptor has to be changed to non-blocking mode in both ends of the pipe, obviously, with fcntl. Then, to find out whether a signal happened or not we modify PyErr_CheckSignals() to try to read from the pipe. If it reads bytes from the pipe, we process the corresponding python signal handlers or raise KeyboardInterrupt. If the read() syscall returns zero bytes read, we know no signal was delivered and move on. The only potential problem left is that, by changing the pipe file descriptor to non-blocking mode we can only write as many bytes to it without reading from the other side as the pipe buffer allows. If a large number of signals arrive very quickly, that buffer may fill and we lose signals. But I think the default buffer should be more than enough. And normally programs don't receive lots of signals in a small time window. If it happens we may lose signals, but that's very rare, and who cares anyway. Regards. From eric+python-dev at trueblade.com Mon Sep 11 20:31:45 2006 From: eric+python-dev at trueblade.com (Eric V. Smith) Date: Mon, 11 Sep 2006 14:31:45 -0400 Subject: [Python-Dev] datetime's strftime implementation: by design or bug Message-ID: <4505AB91.6030908@trueblade.com> [I hope this belongs on python-dev, since it's about the design of something. But if not, let me know and I'll post to c.l.py.] I'm willing to file a bug report and patch on this, but I'd like to know if it's by design or not. 
In datetimemodule.c, the function wrap_strftime() insists that the length of a format string be <= 127 chars, by forcing the length into a char. This seems like a bug to me. wrap_strftime() calls time's strftime(), which doesn't have this limitation because it uses size_t. >>> import datetime >>> datetime.datetime.now().strftime('x'*128) Traceback (most recent call last): File "", line 1, in ? MemoryError >>> import datetime >>> datetime.datetime.now().strftime('x'*256) in wrap_strftime totalnew=1 Traceback (most recent call last): File "", line 1, in SystemError: Objects/stringobject.c:4077: bad argument to internal function >>> import time >>> time.strftime('x'*128) 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' But before I write this up, I'd like to know if anyone knows if this is by design or not. This is reproducible on Windows 2.4.3, and Linux 2.3.3 and 2.5c1. Thanks. Eric. From tim.peters at gmail.com Mon Sep 11 22:06:20 2006 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 11 Sep 2006 16:06:20 -0400 Subject: [Python-Dev] datetime's strftime implementation: by design or bug In-Reply-To: <4505AB91.6030908@trueblade.com> References: <4505AB91.6030908@trueblade.com> Message-ID: <1f7befae0609111306u587e2db9j51b58a15719797a1@mail.gmail.com> [Eric V. Smith] > [I hope this belongs on python-dev, since it's about the design of > something. But if not, let me know and I'll post to c.l.py.] > > I'm willing to file a bug report and patch on this, but I'd like to know > if it's by design or not. > > In datetimemodule.c, the function wrap_strftime() insists that the > length of a format string be <= 127 chars, by forcing the length into a > char. This seems like a bug to me. wrap_strftime() calls time's > strftime(), which doesn't have this limitation because it uses size_t. 
Yawn ;-) I'm very surprised the code doesn't verify that the format
size fits in a C char, but there's nothing deep about the assumption.
I expect it would work fine to just change the declarations of
`totalnew` and `usednew` from `char` to `Py_ssize_t` (for 2.5.1 and
2.6; to something else for 2.4.4 (I don't recall which C type
PyString_Size returned then -- probably `int`)), and /also/ change
the resize-and-overflow check. The current:

    int bigger = totalnew << 1;
    if ((bigger >> 1) != totalnew) { /* overflow */
        PyErr_NoMemory();
        goto Done;
    }

doesn't actually make sense even if it's certain that sizeof(int) is
strictly larger than sizeof(totalnew) (which C guarantees for type
`char`, but is plain false on some boxes if changed to Py_ssize_t).
Someone must have been on heavy drugs when writing that endlessly
tedious wrapper ;-)

> ...

From anthony at interlink.com.au Tue Sep 12 02:54:30 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Tue, 12 Sep 2006 10:54:30 +1000
Subject: [Python-Dev] datetime's strftime implementation: by design or bug
In-Reply-To: <4505AB91.6030908@trueblade.com>
References: <4505AB91.6030908@trueblade.com>
Message-ID: <200609121054.35576.anthony@interlink.com.au>

Please log a bug - this is probably something suitable for fixing in 2.5.1. At
the very least, if it's going to be limited to 127 characters, it should
check that and raise a more suitable exception.
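
A quick way to exercise the limit discussed in this thread is to feed both strftime entry points literal format strings on either side of the 127-character boundary. What follows is only a sketch: the helper name `long_format_ok` and the chosen lengths are made up for illustration, and it assumes an interpreter in which the `char`-sized counters have already been widened.

```python
import datetime
import time


def long_format_ok(length):
    """Return True if both strftime paths echo back a literal format."""
    fmt = "x" * length  # no % directives, so output should equal input
    moment = datetime.datetime(2006, 9, 11, 14, 31, 45)
    via_datetime = moment.strftime(fmt)
    via_time = time.strftime(fmt, moment.timetuple())
    return via_datetime == fmt and via_time == fmt


for n in (127, 128, 256, 1000):
    print(n, long_format_ok(n))
```

On an affected build the datetime call would be expected to fail for 128 and above, as in the report, while the time call would not.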
From misa at redhat.com Tue Sep 12 03:18:12 2006 From: misa at redhat.com (Mihai Ibanescu) Date: Mon, 11 Sep 2006 21:18:12 -0400 Subject: [Python-Dev] Py_BuildValue and decref In-Reply-To: <4503C05F.70508@canterbury.ac.nz> References: <20060908220605.GF990@abulafia.devel.redhat.com> <4503C05F.70508@canterbury.ac.nz> Message-ID: <20060912011812.GB14187@abulafia.devel.redhat.com> On Sun, Sep 10, 2006 at 07:35:59PM +1200, Greg Ewing wrote: > Barry Warsaw wrote: > > I just want to point out that the C API documentation is pretty > > silent about the refcounting side-effects in error conditions (and > > often in success conditions too) of most Python functions. For > > example, what is the refcounting side-effects of PyDict_SetItem() on > > val? What about if that function fails? Has val been incref'd or > > not? What about the side-effects on any value the new one replaces, > > both in success and failure? > > The usual principle is that the refcounting behaviour > is (or should be) independent of whether the function > succeeded or failed. In the absence of any statement > to the contrary in the docs, you should be able to > assume that. > > The words used to describe the refcount behaviour of > some functions can be rather confusing, but it always > boils down to one of two cases: either the function > "borrows" a reference (and does its own incref if > needed, the caller doesn't need to care) or it "steals" > a reference (so the caller is always responsible for > doing an incref if needed before calling). > > What that rather convoluted comment about PyTuple_SetItem > is trying to say is just that it *always* steals a reference, > regardless of whether it succeeds or fails. I expect the > same is true of Py_BuildValue. Given that it doesn't seem to be the case, and my quick look at the code indicates that even internally python is inconsistent, should I file a low-severity bug so we don't lose track of this? 
Misa From rhamph at gmail.com Tue Sep 12 06:05:38 2006 From: rhamph at gmail.com (Adam Olsen) Date: Mon, 11 Sep 2006 22:05:38 -0600 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: Message-ID: On 9/11/06, Gustavo Carneiro wrote: > On 9/11/06, Adam Olsen wrote: > > This much would allow a GUI's poll loop to wake up when there is a > > signal, and give control back to the python main loop, which could > > then read off the signals and queue up their handler functions. > > I like this approach. Not only we would get a poll-able file > descriptor to notify a GUI main loop when signals arrive, we'd also > avoid the lack of async safety in Py_AddPendingCall / > Py_MakePendingCalls which affects _current_ Python code. > > Note that the file descriptor of the read end of the pipe has to > become a public Python API so that 3rd party extensions may poll it. > This is crucial. Yeah, so long as Python still does the actual reading. > > The only problem is when there is no GUI poll loop. We don't want > > python to have to poll the fd, we'd rather it just check a variable. > > Is it possible to set/clear a flag in a sufficiently portable > > (reentrant-safe, non-blocking, thread-safe) fashion? > > It's simple. That pipe file descriptor has to be changed to > non-blocking mode in both ends of the pipe, obviously, with fcntl. > Then, to find out whether a signal happened or not we modify > PyErr_CheckSignals() to try to read from the pipe. If it reads bytes > from the pipe, we process the corresponding python signal handlers or > raise KeyboardInterrupt. If the read() syscall returns zero bytes > read, we know no signal was delivered and move on. Aye, but my point was that a syscall is costly, and we'd like to avoid it if possible. We'll probably have to benchmark it though, to find out if it's worth the hassle. 
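
One way to get a first feel for that cost is a micro-benchmark. The sketch below is Python rather than C, so the absolute numbers say little about a C-level signal check; `poll_pipe`, `poll_flag` and `pending` are names invented for this illustration. Note also that reading an empty non-blocking pipe does not report zero bytes: in C, read() returns -1 with errno set to EAGAIN (zero means the write end was closed), and Python surfaces that as BlockingIOError.

```python
import os
import timeit

read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)  # plays the role of fcntl(..., O_NONBLOCK)

pending = False  # hypothetical stand-in for a "signal arrived" flag


def poll_pipe():
    """Always try to read one byte; an empty pipe raises BlockingIOError."""
    try:
        return os.read(read_fd, 1)
    except BlockingIOError:
        return b""


def poll_flag():
    """Touch the pipe only when the flag says something is waiting."""
    if pending:
        return poll_pipe()
    return b""


runs = 10000
print("read every time:", timeit.timeit(poll_pipe, number=runs))
print("flag check only:", timeit.timeit(poll_flag, number=runs))
```

The gap between the two timings gives a rough idea of what a flag in front of the read() saves when no signal is pending.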
> The only potential problem left is that, by changing the pipe file > descriptor to non-blocking mode we can only write as many bytes to it > without reading from the other side as the pipe buffer allows. If a > large number of signals arrive very quickly, that buffer may fill and > we lose signals. But I think the default buffer should be more than > enough. And normally programs don't receive lots of signals in a > small time window. If it happens we may lose signals, but that's very > rare, and who cares anyway. Indeed, we need to document very clearly that: * Signals may be dropped if there is a burst * Signals may be delayed for a very long time, and if you replace a previous handler your new handler may get signals intended for the old handler -- Adam Olsen, aka Rhamphoryncus From greg.ewing at canterbury.ac.nz Tue Sep 12 06:08:07 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 12 Sep 2006 16:08:07 +1200 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: Message-ID: <450632A7.40504@canterbury.ac.nz> Gustavo Carneiro wrote: > The only potential problem left is that, by changing the pipe file > descriptor to non-blocking mode we can only write as many bytes to it > without reading from the other side as the pipe buffer allows. If a > large number of signals arrive very quickly, that buffer may fill and > we lose signals. That might be an argument for *not* trying to communicate the signal number by the value written to the pipe, but keep a separate set of signal-pending flags, and just use the pipe as a way of indicating that *something* has happened. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiem! | Christchurch, New Zealand | (I'm not a morning person.) 
| greg.ewing at canterbury.ac.nz +--------------------------------------+

From greg.ewing at canterbury.ac.nz Tue Sep 12 06:33:40 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 12 Sep 2006 16:33:40 +1200
Subject: [Python-Dev] Py_BuildValue and decref
In-Reply-To: <20060912011812.GB14187@abulafia.devel.redhat.com>
References: <20060908220605.GF990@abulafia.devel.redhat.com> <4503C05F.70508@canterbury.ac.nz> <20060912011812.GB14187@abulafia.devel.redhat.com>
Message-ID: <450638A4.6020903@canterbury.ac.nz>

Mihai Ibanescu wrote:

> Given that it doesn't seem to be the case, and my quick look at the code
> indicates that even internally python is inconsistent, should I file a
> low-severity bug so we don't lose track of this?

I'd say so, yes. A function whose refcount behaviour differs when it
fails is awkward to use safely at best, impossible at worst (if there's
no way of finding out what needs to be decrefed in order to clean up
properly).

--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | Carpe post meridiem!                 |
Christchurch, New Zealand          | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz      +--------------------------------------+

From rhamph at gmail.com Tue Sep 12 07:05:00 2006
From: rhamph at gmail.com (Adam Olsen)
Date: Mon, 11 Sep 2006 23:05:00 -0600
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <450632A7.40504@canterbury.ac.nz>
References: <450632A7.40504@canterbury.ac.nz>
Message-ID:

On 9/11/06, Greg Ewing wrote:
> Gustavo Carneiro wrote:
> > The only potential problem left is that, by changing the pipe file
> > descriptor to non-blocking mode we can only write as many bytes to it
> > without reading from the other side as the pipe buffer allows. If a
> > large number of signals arrive very quickly, that buffer may fill and
> > we lose signals.
>
> That might be an argument for *not* trying to
> communicate the signal number by the value
> written to the pipe, but keep a separate set
> of signal-pending flags, and just use the pipe
> as a way of indicating that *something* has
> happened.

That brings you back to how you access the flags variable. At best it
is very difficult, requiring unique assembly code for every supported
platform. At worst, some platforms may not have any way to do it from
an interrupt context.

A possible alternative is to keep a set of flags for every thread, but
that requires that the threads poll their variable regularly, and
possibly a wake-up pipe for each thread.

--
Adam Olsen, aka Rhamphoryncus

From greg.ewing at canterbury.ac.nz Tue Sep 12 08:35:41 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 12 Sep 2006 18:35:41 +1200
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To:
References: <450632A7.40504@canterbury.ac.nz>
Message-ID: <4506553D.1020307@canterbury.ac.nz>

Adam Olsen wrote:

> That brings you back to how you access the flags variable.

The existing signal handler sets a flag, doesn't it?
So it couldn't be any more broken than the current
implementation.

If we get too paranoid about this, we'll just end
up deciding that signals can't be used for anything,
at all, ever. That doesn't seem very helpful,
although technically I suppose it would solve
the problem. :-)

My own conclusion from all this is that if you
can't rely on writing to a variable in one part
of your program and reading it back in another,
then computer architectures have become far
too clever for their own good.
:-( -- Greg From rhamph at gmail.com Tue Sep 12 08:59:58 2006 From: rhamph at gmail.com (Adam Olsen) Date: Tue, 12 Sep 2006 00:59:58 -0600 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: <4506553D.1020307@canterbury.ac.nz> References: <450632A7.40504@canterbury.ac.nz> <4506553D.1020307@canterbury.ac.nz> Message-ID: On 9/12/06, Greg Ewing wrote: > Adam Olsen wrote: > > > That brings you back to how you access the flags variable. > > The existing signal handler sets a flag, doesn't it? > So it couldn't be any more broken than the current > implementation. > > If we get too paranoid about this, we'll just end > up deciding that signals can't be used for anything, > at all, ever. That doesn't seem very helpful, > although techically I suppose it would solve > the problem. :-) > > My own conclusion from all this is that if you > can't rely on writing to a variable in one part > of your program and reading it back in another, > then computer architectures have become far > too clever for their own good. :-( They've been that way for a long, long time. The irony is that x86 is immensely stupid in this regard, and as a result most programmers remain unaware of it. Other architectures have much more interesting read/write and cache reordering semantics, and the code is certainly broken there. C leaves it undefined with good reason. My previous mention of using a *single* flag may survive corruption simply because we can tolerate false positives. Signal handlers would write 0xFFFFFFFF, the poll loop would check if *any* bit is set. If so, write 0x0, read off the fd, then loop around and check it again. If the start of the read() acts as a write-barrier it SHOULD guarantee we don't miss any positive writes. Hmm, if that works we should be able to generalize it for all the other flags too. Something to think about anyway... 
-- Adam Olsen, aka Rhamphoryncus From gjcarneiro at gmail.com Tue Sep 12 19:15:48 2006 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Tue, 12 Sep 2006 18:15:48 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: <450632A7.40504@canterbury.ac.nz> <4506553D.1020307@canterbury.ac.nz> Message-ID: On 9/12/06, Adam Olsen wrote: > On 9/12/06, Greg Ewing wrote: > > Adam Olsen wrote: > > > > > That brings you back to how you access the flags variable. > > > > The existing signal handler sets a flag, doesn't it? > > So it couldn't be any more broken than the current > > implementation. > > > > If we get too paranoid about this, we'll just end > > up deciding that signals can't be used for anything, > > at all, ever. That doesn't seem very helpful, > > although techically I suppose it would solve > > the problem. :-) > > > > My own conclusion from all this is that if you > > can't rely on writing to a variable in one part > > of your program and reading it back in another, > > then computer architectures have become far > > too clever for their own good. :-( > > They've been that way for a long, long time. The irony is that x86 is > immensely stupid in this regard, and as a result most programmers > remain unaware of it. > > Other architectures have much more interesting read/write and cache > reordering semantics, and the code is certainly broken there. C > leaves it undefined with good reason. > > My previous mention of using a *single* flag may survive corruption > simply because we can tolerate false positives. Signal handlers would > write 0xFFFFFFFF, the poll loop would check if *any* bit is set. If > so, write 0x0, read off the fd, then loop around and check it again. > If the start of the read() acts as a write-barrier it SHOULD guarantee > we don't miss any positive writes. Why write 0xFFFFFFFF? Why can't the variable be of a "volatile char" type? 
Assuming sizeof(char) == 1, please don't tell me
architecture XPTO will write the value 4 bits at a time! :P

I see your point of using a flag to avoid the read() syscall most of
the time. Slightly more complex, but possibly worth it.

I was going to describe a possible race condition, then wrote the
code below to help explain it, modified it slightly, and now I think
the race is gone. In any case, the code might be helpful to check if
we are in sync. Let me know if you spot any race condition I missed.


static volatile char signal_flag;
static int signal_pipe_r, signal_pipe_w;

PyErr_CheckSignals()
{
    if (signal_flag) {
        char signum;
        signal_flag = 0;
        while (read(signal_pipe_r, &signum, 1) == 1)
            process_signal(signum);
    }
}

static void
signal_handler(int signum)
{
    char signum_c = signum;
    signal_flag = 1;
    write(signal_pipe_w, &signum_c, 1);
}

From jcarlson at uci.edu Tue Sep 12 19:37:54 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 12 Sep 2006 10:37:54 -0700
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To:
References: <4506553D.1020307@canterbury.ac.nz>
Message-ID: <20060912090921.F918.JCARLSON@uci.edu>

"Adam Olsen" wrote:
>
> On 9/12/06, Greg Ewing wrote:
> > Adam Olsen wrote:
> >
> > > That brings you back to how you access the flags variable.
> >
> > The existing signal handler sets a flag, doesn't it?
> > So it couldn't be any more broken than the current
> > implementation.
[snip]
> My previous mention of using a *single* flag may survive corruption
> simply because we can tolerate false positives. Signal handlers would
> write 0xFFFFFFFF, the poll loop would check if *any* bit is set. If
> so, write 0x0, read off the fd, then loop around and check it again.
> If the start of the read() acts as a write-barrier it SHOULD guarantee
> we don't miss any positive writes.
[snip]

I've been lurking on this thread for a while, but I'm thinking that just
a single file handle with a poll/read (if the poll succeeds) would be
fine.
So what if you miss a signal if there is a burst of signal activity? If users want a *good* IPC mechanism, then they can use any one of the known-good IPC mechanisms defined for their platform (mmap, named pipes, unnamed pipes, sockets (unix domain, udp, tcp), etc.), not an IPC mechanism that has historically (at least in Python) been generally unreliable. Also, I wouldn't be surprised if the majority of signals are from the set: SIGHUP, SIGTERM, SIGKILL, none of which should be coming in at a high rate. - Josiah From eric+python-dev at trueblade.com Tue Sep 12 21:40:07 2006 From: eric+python-dev at trueblade.com (Eric V. Smith) Date: Tue, 12 Sep 2006 15:40:07 -0400 Subject: [Python-Dev] datetime's strftime implementation: by design or bug In-Reply-To: <200609121054.35576.anthony@interlink.com.au> References: <4505AB91.6030908@trueblade.com> <200609121054.35576.anthony@interlink.com.au> Message-ID: <45070D17.1090302@trueblade.com> Anthony Baxter wrote: > Please log a bug - this is probably something suitable for fixing in 2.5.1. At > the very least, if it's going to be limited to 127 characters, it should > check that and raise a more suitable exception. [First time sent from wrong address, sorry if this is a dupe.] Done. The patch is at http://python.org/sf/1557390. From rhamph at gmail.com Tue Sep 12 22:53:49 2006 From: rhamph at gmail.com (Adam Olsen) Date: Tue, 12 Sep 2006 14:53:49 -0600 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: <450632A7.40504@canterbury.ac.nz> <4506553D.1020307@canterbury.ac.nz> Message-ID: On 9/12/06, Gustavo Carneiro wrote: > On 9/12/06, Adam Olsen wrote: > > My previous mention of using a *single* flag may survive corruption > > simply because we can tolerate false positives. Signal handlers would > > write 0xFFFFFFFF, the poll loop would check if *any* bit is set. If > > so, write 0x0, read off the fd, then loop around and check it again. 
> > If the start of the read() acts as a write-barrier it SHOULD guarantee > > we don't miss any positive writes. > > Why write 0xFFFFFFFF? Why can't the variable be of a "volatile > char" type? Assuming sizeof(char) == 1, please don't tell me > architecture XPTO will write the value 4 bits at a time! :P Nope. It'll write 32 bits, then break that up into 8 bits :) Although, at the moment I can't fathom what harm that would cause... For the record, all volatile does is prevent compiler reordering across sequence points. Interestingly, it seems "volatile sig_atomic_t" is the correct way to declare a variable for (single-threaded) signal handling. Odd that volatile didn't show up in any of the previous documentation I read.. > I see your point of using a flag to avoid the read() syscall most of > the time. Slightly more complex, but possibly worth it. > > I was going to describe a possible race condition, then wrote the > code below to help explain it, modified it slightly, and now I think > the race is gone. In any case, the code might be helpful to check if > we are in sync. Let me know if you spot any race condition I missed. > > > static volatile char signal_flag; > static int signal_pipe_r, signal_pipe_w; > > PyErr_CheckSignals() > { > if (signal_flag) { > char signum; > signal_flag = 0; > while (read(signal_pipe_r, &signum, 1) == 1) > process_signal(signum); > } > } I'd prefer this to be a "while (signal_flag)" instead, although it should technically work either way. > static void > signal_handler(int signum) > { > char signum_c = signum; > signal_flag = 1; > write(signal_pipe_w, &signum_c, 1); > } This is wrong. PyErr_CheckSignals could check and clear signal_flag before you reach the write() call. "signal_flag = 1" should come after. 
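
The ordering point is easy to lose in the quoting, so here is a small Python sketch of the same flag-plus-pipe pattern with the write placed before the flag assignment. It is an illustration only, not the C patch under discussion: the names mirror the C sketch but are otherwise hypothetical, delivery is simulated by calling the handler directly, and Python-level handlers do not run in true asynchronous context, so this shows the control flow rather than proving anything about memory ordering.

```python
import os
import signal

read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)
os.set_blocking(write_fd, False)

signal_flag = False  # stand-in for the C-level flag
handled = []         # signal numbers seen by the "main loop"


def signal_handler(signum, frame):
    """Queue the signal number first, raise the flag second."""
    global signal_flag
    try:
        os.write(write_fd, bytes([signum]))
    except BlockingIOError:
        pass  # pipe full: this signal is dropped, as the thread accepts
    signal_flag = True


def check_signals():
    """Drain the pipe whenever the flag is set."""
    global signal_flag
    while signal_flag:
        signal_flag = False
        while True:
            try:
                data = os.read(read_fd, 1)
            except BlockingIOError:
                break  # pipe is empty
            if not data:
                break  # write end was closed
            handled.append(data[0])


# Simulate two deliveries by calling the handler directly.
signal_handler(signal.SIGINT, None)
signal_handler(signal.SIGTERM, None)
check_signals()
print(handled)
```

In a real program the handler would be installed with signal.signal(); the standard library's signal.set_wakeup_fd() offers a related file-descriptor wakeup.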
--
Adam Olsen, aka Rhamphoryncus

From martin at v.loewis.de Tue Sep 12 23:38:39 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 12 Sep 2006 23:38:39 +0200
Subject: [Python-Dev] Subversion 1.4
Message-ID: <450728DF.1020104@v.loewis.de>

As many of you probably know: Subversion 1.4 has been released.
It is safe to upgrade to this version, even if the repository server
(for us svn.python.org) stays at an older version: they can
interoperate just fine.

There is one major pitfall: Subversion 1.4 changes the format
of the working copy file structure (.svn/format goes from 4 to 8).
This new format is more efficient: for a Python checkout, it saves
about 15MiB (out of 125 MiB). Also, several operations (e.g.
svn status) are faster. Subversion performs a silent upgrade of
the existing working copy on the first operation (I believe on the
first modifying operation).

However, this new format is not compatible with older clients;
you need 1.4 clients to access an upgraded working copy. So
if you use the same working copy with different clients (e.g.
command line and TortoiseSVN, or from different systems through
NFS), you either need to upgrade all clients, or else you
should stay away from 1.4.

Alternatively, you can have different checkouts for 1.3 and 1.4
clients, of course.

Just in case you didn't know.

Regards,
Martin

From rhamph at gmail.com Wed Sep 13 01:03:54 2006
From: rhamph at gmail.com (Adam Olsen)
Date: Tue, 12 Sep 2006 17:03:54 -0600
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To:
References: <450632A7.40504@canterbury.ac.nz> <4506553D.1020307@canterbury.ac.nz>
Message-ID:

On 9/12/06, Gustavo Carneiro wrote:
> On 9/12/06, Adam Olsen wrote:
> > My previous mention of using a *single* flag may survive corruption
> > simply because we can tolerate false positives. Signal handlers would
> > write 0xFFFFFFFF, the poll loop would check if *any* bit is set.
If
> > so, write 0x0, read off the fd, then loop around and check it again.
> > If the start of the read() acts as a write-barrier it SHOULD guarantee
> > we don't miss any positive writes.

> PyErr_CheckSignals()
> {
>     if (signal_flag) {
>         char signum;
>         signal_flag = 0;
>         while (read(signal_pipe_r, &signum, 1) == 1)
>             process_signal(signum);
>     }
> }

The more I think about this the less I like relying on read() imposing
a hardware write barrier. Unless somebody can say otherwise, I think
we'd be better off putting dummy
PyThread_acquire_lock/PyThread_release_lock calls in there.

--
Adam Olsen, aka Rhamphoryncus

From martin at v.loewis.de Wed Sep 13 05:36:34 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 13 Sep 2006 05:36:34 +0200
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To:
References:
Message-ID: <45077CC2.9070601@v.loewis.de>

Nick Maclaren wrote:
>> (comment by Arjan van de Ven):
>> | afaik the kernel only sends signals to threads that don't have them blocked.
>> | If python doesn't want anyone but the main thread to get signals, it should just
>> | block signals on all but the main thread and then by nature, all signals will go
>> | to the main thread....
>
> Well, THAT'S wrong, I am afraid! Things ain't that simple :-(
>
> Yes, POSIX implies that things work that way, but there are so many
> get-out clauses and problems with trying to implement that specification
> that such behaviour can't be relied on.

Can you please give one example for each (one get-out clause,
and one problem with trying to implement that).

I fail to see why it isn't desirable to make all signals occur
in the main thread, on systems where this is possible.
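
For readers who want to poke at the per-thread masking idea, here is a minimal sketch using signal.pthread_sigmask(), which is available on Unix-like systems only. It shows that a mask set inside a worker thread applies to that thread alone; it says nothing about the get-out clauses alluded to in the quoted text. The names `BLOCKED`, `masks` and `worker` are made up for the example.

```python
import signal
import threading

BLOCKED = {signal.SIGUSR1, signal.SIGTERM}
masks = {}


def worker():
    # pthread_sigmask() changes the mask of the calling thread only.
    signal.pthread_sigmask(signal.SIG_BLOCK, BLOCKED)
    # Blocking an empty set is a way to read the current mask back.
    masks["worker"] = signal.pthread_sigmask(signal.SIG_BLOCK, [])


masks["main_before"] = signal.pthread_sigmask(signal.SIG_BLOCK, [])
thread = threading.Thread(target=worker)
thread.start()
thread.join()
masks["main_after"] = signal.pthread_sigmask(signal.SIG_BLOCK, [])

print("worker blocks:", sorted(masks["worker"]))
print("main unchanged:", masks["main_after"] == masks["main_before"])
```

A thread inherits its creator's mask at start, which is why the main thread's mask is read both before and after the worker runs.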
Regards,
Martin

From martin at v.loewis.de Wed Sep 13 05:38:08 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 13 Sep 2006 05:38:08 +0200
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <2mpseboj26.fsf@starship.python.net>
References: <2mpseboj26.fsf@starship.python.net>
Message-ID: <45077D20.5070400@v.loewis.de>

Michael Hudson wrote:
>> According to [1], all python needs to do to avoid this problem is
>> block all signals in all but the main thread;
>
> Argh, no: then people who call system() from non-main threads end up
> running subprocesses with all signals masked, which breaks other
> things in very mysterious ways. Been there...

Python should register a pthread_atfork handler then, which clears
the signal mask. Would that not work?

Regards,
Martin

From mwh at python.net Wed Sep 13 10:14:42 2006
From: mwh at python.net (Michael Hudson)
Date: Wed, 13 Sep 2006 09:14:42 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <45077D20.5070400@v.loewis.de> (Martin v. Löwis's message of "Wed, 13 Sep 2006 05:38:08 +0200")
References: <2mpseboj26.fsf@starship.python.net> <45077D20.5070400@v.loewis.de>
Message-ID: <2mlkookwst.fsf@starship.python.net>

"Martin v. Löwis" writes:

> Michael Hudson wrote:
>>> According to [1], all python needs to do to avoid this problem is
>>> block all signals in all but the main thread;
>>
>> Argh, no: then people who call system() from non-main threads end up
>> running subprocesses with all signals masked, which breaks other
>> things in very mysterious ways. Been there...
>
> Python should register a pthread_atfork handler then, which clears
> the signal mask. Would that not work?
Not for system() at least: http://mail.python.org/pipermail/python-dev/2003-December/041303.html Cheers, mwh -- ROOSTA: Ever since you arrived on this planet last night you've been going round telling people that you're Zaphod Beeblebrox, but that they're not to tell anyone else. -- The Hitch-Hikers Guide to the Galaxy, Episode 7 From anthony at interlink.com.au Wed Sep 13 13:57:40 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed, 13 Sep 2006 21:57:40 +1000 Subject: [Python-Dev] RELEASED Python 2.5 (release candidate 2) Message-ID: <200609132157.44342.anthony@interlink.com.au> On behalf of the Python development team and the Python community, I'm happy to announce the second RELEASE CANDIDATE of Python 2.5. After the first release candidate a number of new bugfixes have been applied to the Python 2.5 code. In the interests of making 2.5 the best release possible, we've decided to put out a second (and hopefully last) release candidate. We plan for a 2.5 final in a week's time. This is not yet the final release - it is not suitable for production use. It is being released to solicit feedback and hopefully expose bugs, as well as allowing you to determine how changes in 2.5 might impact you. As a release candidate, this is one of your last chances to test the new code in 2.5 before the final release. *Please* try this release out and let us know about any problems you find. In particular, note that changes to improve Python's support of 64 bit systems might require authors of C extensions to change their code. More information (as well as source distributions and Windows and Universal Mac OSX installers) are available from the 2.5 website: http://www.python.org/2.5/ As of this release, Python 2.5 is now in *feature freeze*. Unless absolutely necessary, no functionality changes will be made between now and the final release of Python 2.5. The new features in Python 2.5 are described in Andrew Kuchling's What's New In Python 2.5. 
It's available from the 2.5 web page. The language features added
include conditional expressions, the with statement, the merge
of try/except and try/finally into try/except/finally, enhancements
to generators to produce a coroutine kind of functionality, and
a brand new AST-based compiler implementation.

New modules added include hashlib, ElementTree, sqlite3, wsgiref,
uuid and ctypes. In addition, a new profiling module "cProfile" was
added.

Enjoy this new release,
Anthony

Anthony Baxter
anthony at python.org
Python Release Manager
(on behalf of the entire python-dev team)

From gjcarneiro at gmail.com Wed Sep 13 15:17:03 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Wed, 13 Sep 2006 14:17:03 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To:
References: <450632A7.40504@canterbury.ac.nz> <4506553D.1020307@canterbury.ac.nz>
Message-ID:

On 9/12/06, Adam Olsen wrote:
> On 9/12/06, Gustavo Carneiro wrote:
> > On 9/12/06, Adam Olsen wrote:
> > > My previous mention of using a *single* flag may survive corruption
> > > simply because we can tolerate false positives. Signal handlers would
> > > write 0xFFFFFFFF, the poll loop would check if *any* bit is set. If
> > > so, write 0x0, read off the fd, then loop around and check it again.
> > > If the start of the read() acts as a write-barrier it SHOULD guarantee
> > > we don't miss any positive writes.
> >
> > Why write 0xFFFFFFFF? Why can't the variable be of a "volatile
> > char" type? Assuming sizeof(char) == 1, please don't tell me
> > architecture XPTO will write the value 4 bits at a time! :P
>
> Nope. It'll write 32 bits, then break that up into 8 bits :)
> Although, at the moment I can't fathom what harm that would cause...

Hmm... it means that to write those 8 bits the processor / compiler
may need to 1. read 32 bits from memory to a register, 2. modify 8
bits of the register, 3. write back those 32 bits.
Shouldn't affect our case, but maybe it's better to use sig_atomic_t in any case.

> For the record, all volatile does is prevent compiler reordering
> across sequence points.

It makes the compiler aware the value may change any time, outside the current context/function, so it doesn't assume a constant value and always re-reads it from memory instead of assuming a value from a register is correct.

> > static volatile char signal_flag;
> > static int signal_pipe_r, signal_pipe_w;
> >
> > PyErr_CheckSignals()
> > {
> >     if (signal_flag) {
> >         char signum;
> >         signal_flag = 0;
> >         while (read(signal_pipe_r, &signum, 1) == 1)
> >             process_signal(signum);
> >     }
> > }
>
> I'd prefer this to be a "while (signal_flag)" instead, although it
> should technically work either way.

I guess we can use while instead of if.

> >
> > static void
> > signal_handler(int signum)
> > {
> >     char signum_c = signum;
> >     signal_flag = 1;
> >     write(signal_pipe_w, &signum_c, 1);
> > }
>
> This is wrong. PyErr_CheckSignals could check and clear signal_flag
> before you reach the write() call. "signal_flag = 1" should come
> after.

Yes, good catch.

I don't understand the memory barrier concern in your other email. I know little on the subject, but from what I could find out memory barriers are used to avoid reordering of multiple read and write operations. However, in this case we have a single value at stake, there's nothing to reorder.

Except perhaps that "signal_flag = 0" could be delayed... If it is delayed until after the while (read (...)...) loop below we could get in trouble. I see your point now... :|

But I think that a system call has to act as memory barrier, forcefully, because the CPU has to jump into kernelspace, a completely different context, it _has_ to flush pending memory operations sooner or later.
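The ordering argued for above can be sketched in Python. This is only an illustration written for a current Python 3 on a POSIX system, not the CPython implementation: the handler writes to the pipe first and sets the flag last, and the checker clears the flag before draining the pipe. The names `signal_handler`, `check_signals` and the module-level flag are hypothetical, the handler is called directly instead of being installed with `signal.signal`, and a non-blocking read end is an assumption made so that the drain loop terminates.

```python
import os

# Hypothetical stand-ins for the C variables discussed in this thread.
signal_pipe_r, signal_pipe_w = os.pipe()
os.set_blocking(signal_pipe_r, False)  # assumption: draining must never block
signal_flag = False


def signal_handler(signum):
    """Record a signal: write the byte first, raise the flag last."""
    global signal_flag
    os.write(signal_pipe_w, bytes([signum]))
    signal_flag = True


def check_signals():
    """Return pending signal numbers, clearing the flag before draining."""
    global signal_flag
    seen = []
    while signal_flag:
        signal_flag = False
        while True:
            try:
                data = os.read(signal_pipe_r, 1)
            except BlockingIOError:
                break  # pipe is empty for now
            if not data:
                break  # write end was closed
            seen.append(data[0])
    return seen


# Simulate two deliveries by calling the handler directly.
signal_handler(2)
signal_handler(15)
pending = check_signals()
print(pending)
```

If the flag were set before the write, a checker running in between could clear the flag, find the pipe empty, and leave the byte unnoticed until the next signal arrives.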
Round two:

static volatile sig_atomic_t signal_flag;
static int signal_pipe_r, signal_pipe_w;

PyErr_CheckSignals()
{
    while (signal_flag) {
        char signum;
        signal_flag = 0;
        while (read(signal_pipe_r, &signum, 1) == 1)
            process_signal(signum);
    }
}

static void
signal_handler(int signum)
{
    char signum_c = signum;
    write(signal_pipe_w, &signum_c, 1);
    signal_flag = 1;
}

--
Gustavo J. A. M. Carneiro
"The universe is always one step beyond logic."

From skip at pobox.com Wed Sep 13 19:46:39 2006
From: skip at pobox.com (skip at pobox.com)
Date: Wed, 13 Sep 2006 12:46:39 -0500
Subject: [Python-Dev] Maybe we should have a C++ extension for testing...
Message-ID: <17672.17407.88122.884957@montanaro.dyndns.org>

Building Python with C and then linking in extensions written in or wrapped with C++ can present problems, at least in some situations. I don't know if it's kosher to build that way, but folks do. We're bumping into such problems at work using Solaris 10 and Python 2.4 (building matplotlib, which is largely written in C++), and it appears others have similar problems:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6395191
http://mail.python.org/pipermail/patches/2005-June/017820.html
http://mail.python.org/pipermail/python-bugs-list/2005-November/030900.html

I attached a comment to the third item yesterday (even though it was closed).

One of our C++ gurus (that's definitely not me!) patched the Python source to include at the top of Python.h. That seems to have solved our problems, but seems to be a symptomatic fix. I got to thinking, should we

a) encourage people to compile Python with a C++ compiler if most/all of their extensions are written in C++ anyway (does that even work if one or more extensions are written in C?), or

b) should the standard distribution maybe include a toy extension written in C++ whose sole purpose is to test for cross-language problems?

Either/or/neither/something else?
Skip

From dinov at exchange.microsoft.com Wed Sep 13 20:05:32 2006
From: dinov at exchange.microsoft.com (Dino Viehland)
Date: Wed, 13 Sep 2006 11:05:32 -0700
Subject: [Python-Dev] .pyc file has different result for value "1.79769313486232e+308" than .py file
Message-ID: <7AD436E4270DD54A94238001769C22273E9618F9D8@DF-GRTDANE-MSG.exchange.corp.microsoft.com>

We've noticed a strange occurrence on Python 2.4.3 w/ the floating point value 1.79769313486232e+308 and how it interacts w/ a .pyc. Given x.py:

def foo():
    print str(1.79769313486232e+308)
    print str(1.79769313486232e+308) == "1.#INF"

The 1st time you run this you get the correct value, but if you reload the module after a .pyc is created then you get different results (and the generated byte code appears to have changed).

Python 2.4.3 (#69, Mar 29 2006, 17:35:34) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import x
>>> import dis
>>> dis.dis(x.foo)
  2           0 LOAD_GLOBAL              0 (str)
              3 LOAD_CONST               1 (1.#INF)
              6 CALL_FUNCTION            1
              9 PRINT_ITEM
             10 PRINT_NEWLINE

  3          11 LOAD_GLOBAL              0 (str)
             14 LOAD_CONST               1 (1.#INF)
             17 CALL_FUNCTION            1
             20 LOAD_CONST               2 ('1.#INF')
             23 COMPARE_OP               2 (==)
             26 PRINT_ITEM
             27 PRINT_NEWLINE
             28 LOAD_CONST               0 (None)
             31 RETURN_VALUE
>>> reload(x)
>>> dis.dis(x.foo)
  2           0 LOAD_GLOBAL              0 (str)
              3 LOAD_CONST               1 (1.0)
              6 CALL_FUNCTION            1
              9 PRINT_ITEM
             10 PRINT_NEWLINE

  3          11 LOAD_GLOBAL              0 (str)
             14 LOAD_CONST               1 (1.0)
             17 CALL_FUNCTION            1
             20 LOAD_CONST               2 ('1.#INF')
             23 COMPARE_OP               2 (==)
             26 PRINT_ITEM
             27 PRINT_NEWLINE
             28 LOAD_CONST               0 (None)
             31 RETURN_VALUE
>>> ^Z

From tim.peters at gmail.com Wed Sep 13 20:39:26 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 13 Sep 2006 14:39:26 -0400
Subject: [Python-Dev] .pyc file has different result for value "1.79769313486232e+308" than .py file
In-Reply-To: <7AD436E4270DD54A94238001769C22273E9618F9D8@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
References:
<7AD436E4270DD54A94238001769C22273E9618F9D8@DF-GRTDANE-MSG.exchange.corp.microsoft.com> Message-ID: <1f7befae0609131139v375c9492g6c464e164aa74f9b@mail.gmail.com> [Dino Viehland] > We've noticed a strange occurance on Python 2.4.3 w/ the floating point > value 1.79769313486232e+308 and how it interacts w/ a .pyc. Given x.py: > > def foo(): > print str(1.79769313486232e+308) > print str(1.79769313486232e+308) == "1.#INF" > > > The 1st time you run this you get the correct value, but if you reload the module > after a .pyc is created then you get different results (and the generated byte code > appears to have changed). > ... Exhaustively explained in this recent thread: http://mail.python.org/pipermail/python-list/2006-August/355986.html From dinov at exchange.microsoft.com Thu Sep 14 00:04:46 2006 From: dinov at exchange.microsoft.com (Dino Viehland) Date: Wed, 13 Sep 2006 15:04:46 -0700 Subject: [Python-Dev] .pyc file has different result for value "1.79769313486232e+308" than .py file In-Reply-To: <1f7befae0609131139v375c9492g6c464e164aa74f9b@mail.gmail.com> References: <7AD436E4270DD54A94238001769C22273E9618F9D8@DF-GRTDANE-MSG.exchange.corp.microsoft.com> <1f7befae0609131139v375c9492g6c464e164aa74f9b@mail.gmail.com> Message-ID: <7AD436E4270DD54A94238001769C22273E9618FB88@DF-GRTDANE-MSG.exchange.corp.microsoft.com> Thanks for the link - it's a good explanation. FYI I've opened a bug against the VC++ team to fix their round tripping on floating point values (doesn't sound like it'll make the next release, but hopefully it'll make it someday). 
-----Original Message----- From: Tim Peters [mailto:tim.peters at gmail.com] Sent: Wednesday, September 13, 2006 11:39 AM To: Dino Viehland Cc: python-dev at python.org; Haibo Luo Subject: Re: [Python-Dev] .pyc file has different result for value "1.79769313486232e+308" than .py file [Dino Viehland] > We've noticed a strange occurance on Python 2.4.3 w/ the floating > point value 1.79769313486232e+308 and how it interacts w/ a .pyc. Given x.py: > > def foo(): > print str(1.79769313486232e+308) > print str(1.79769313486232e+308) == "1.#INF" > > > The 1st time you run this you get the correct value, but if you reload > the module after a .pyc is created then you get different results (and > the generated byte code appears to have changed). > ... Exhaustively explained in this recent thread: http://mail.python.org/pipermail/python-list/2006-August/355986.html From tim.peters at gmail.com Thu Sep 14 00:29:44 2006 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 13 Sep 2006 18:29:44 -0400 Subject: [Python-Dev] .pyc file has different result for value "1.79769313486232e+308" than .py file In-Reply-To: <7AD436E4270DD54A94238001769C22273E9618FB88@DF-GRTDANE-MSG.exchange.corp.microsoft.com> References: <7AD436E4270DD54A94238001769C22273E9618F9D8@DF-GRTDANE-MSG.exchange.corp.microsoft.com> <1f7befae0609131139v375c9492g6c464e164aa74f9b@mail.gmail.com> <7AD436E4270DD54A94238001769C22273E9618FB88@DF-GRTDANE-MSG.exchange.corp.microsoft.com> Message-ID: <1f7befae0609131529t353b9986t1e185afe8dc61a28@mail.gmail.com> [Dino Viehland] > FYI I've opened a bug against the VC++ team to fix their round tripping on floating > point values (doesn't sound like it'll make the next release, but hopefully it'll make it > someday). Cool! That would be helpful to many languages implemented in C/C++ relying on the platform {float, double}<->string library routines. 
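To make the round trip concrete, here is a small sketch written for a current Python 3, not the 2.4.3 build discussed in this thread. It shows that the literal from the report is too large for a double and becomes infinity, and that a parser which keeps only the leading decimal prefix of a string would read "1.#INF" as 1.0, matching the constants in the two disassemblies. The helper `atof_like` is hypothetical and only approximates the prefix-parsing behaviour assumed of the C library routine; the `struct` round trip is included for contrast and is not a claim about how any marshal format stores floats.

```python
import math
import re
import struct

# The literal from the report exceeds the largest finite double,
# so it evaluates to infinity.
x = 1.79769313486232e+308
print(math.isinf(x))


def atof_like(text):
    """Parse the longest leading decimal prefix and ignore the rest.

    Hypothetical stand-in for a C-style parser that stops at the
    first character it does not understand.
    """
    match = re.match(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?", text)
    return float(match.group(0)) if match else 0.0


def binary_round_trip(value):
    """Pack a float as an 8-byte IEEE 754 double and unpack it again."""
    return struct.unpack("<d", struct.pack("<d", value))[0]


# Text round trip through a platform-specific spelling of infinity.
print(atof_like("1.#INF"))

# A binary round trip does not depend on any string spelling.
print(binary_round_trip(x) == x)
```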
Note that the latest revision of the C standard ("C99") specifies strings for infinities and NaNs that conforming implementations must accept (for example, "inf"). It would be nice to accept those too, for portability; "most" Python platforms already do. In fact, this is the primary reason people running on, e.g., Linux, resist upgrading to Windows ;-)

From anthony at interlink.com.au Thu Sep 14 02:58:08 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Thu, 14 Sep 2006 10:58:08 +1000
Subject: [Python-Dev] release is done, but release25-maint branch remains near-frozen
Message-ID: <200609141058.13625.anthony@interlink.com.au>

Ok - we're looking at a final release in 7 days' time. I really, really, really don't want to have to cut an rc3, so unless it's a seriously critical brown-paper-bag bug, let's hold off on the checkins. Documentation, I don't mind so much - particularly any formatting errors.

--
Anthony Baxter
It's never too late to have a happy childhood.

From nnorwitz at gmail.com Thu Sep 14 09:48:57 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 14 Sep 2006 00:48:57 -0700
Subject: [Python-Dev] fun threading problem
Message-ID:

On everyone's favorite platform (HP-UX), the following code consistently fails:

###
from thread import start_new_thread, allocate_lock
from time import sleep

def bootstrap():
    from os import fork ; fork()
    allocate_lock().acquire()

start_new_thread(bootstrap, ())

sleep(.1)
###

The error is:

Fatal Python error: Invalid thread state for this thread

This code was whittled down from test_socketserver which fails in the same way. It doesn't matter what value is passed to sleep as long as it's greater than 0. I also tried changing the sleep to a while 1: pass and the same problem occurred. So there isn't a huge interaction of APIs, only: fork, allocate_lock.acquire and start_new_thread.

HP-UX seems to be more sensitive to various threading issues.
In Modules/_testcapimodule.c, I had to make this modification:

Index: Modules/_testcapimodule.c
===================================================================
--- Modules/_testcapimodule.c   (revision 51875)
+++ Modules/_testcapimodule.c   (working copy)
@@ -665,6 +665,9 @@
        PyThread_acquire_lock(thread_done, 1);  /* wait for thread to finish */
        Py_END_ALLOW_THREADS

+       /* Release lock we acquired above.  This is required on HP-UX. */
+       PyThread_release_lock(thread_done);
+
        PyThread_free_lock(thread_done);
        Py_RETURN_NONE;
 }

Without that patch, there would be this error:

sem_destroy: Device busy
sem_init: Device busy
Fatal Python error: UNREF invalid object
ABORT instruction (core dumped)

Anyone have any ideas?

n

From aahz at pythoncraft.com Thu Sep 14 16:31:14 2006
From: aahz at pythoncraft.com (Aahz)
Date: Thu, 14 Sep 2006 07:31:14 -0700
Subject: [Python-Dev] fun threading problem
In-Reply-To:
References:
Message-ID: <20060914143114.GB20596@panix.com>

On Thu, Sep 14, 2006, Neal Norwitz wrote:
>
> On everyone's favorite platform (HP-UX), the following code
> consistently fails:

Which exact HP-UX? I remember from my ancient days that each HP-UX version completely changes the way threading works -- dunno whether that's still true.

--
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"LL YR VWL R BLNG T S"  -- www.nancybuttons.com

From kbk at shore.net Fri Sep 15 04:59:08 2006
From: kbk at shore.net (Kurt B.
Kaiser) Date: Thu, 14 Sep 2006 22:59:08 -0400 (EDT) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200609150259.k8F2x8cT031149@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 416 open ( +3) / 3408 closed ( +1) / 3824 total ( +4) Bugs : 898 open ( +1) / 6180 closed (+13) / 7078 total (+14) RFE : 234 open ( +0) / 238 closed ( +0) / 472 total ( +0) New / Reopened Patches ______________________ email parser incorrectly breaks headers with a CRLF at 8192 (2006-09-10) http://python.org/sf/1555570 opened by Tony Meyer datetime's strftime limits strings to 127 chars (2006-09-12) http://python.org/sf/1557390 opened by Eric V. Smith Add RLIMIT_SBSIZE to resource module (2006-09-12) http://python.org/sf/1557515 opened by Eric Huss missing imports ctypes in documentation examples (2006-09-13) http://python.org/sf/1557890 opened by Daniele Varrazzo Patches Closed ______________ UserDict New Style (2006-09-08) http://python.org/sf/1555097 closed by rhettinger New / Reopened Bugs ___________________ Bug in the match function (2006-09-09) CLOSED http://python.org/sf/1555496 opened by wojtekwu Please include pliblist for all plattforms (2006-09-09) http://python.org/sf/1555501 opened by Guido Guenther sgmllib should allow angle brackets in quoted values (2006-06-11) http://python.org/sf/1504333 reopened by nnorwitz Move fpectl elsewhere in library reference (2006-09-11) http://python.org/sf/1556261 opened by Michael Hoffman datetime's strftime limits strings to 127 chars (2006-09-11) http://python.org/sf/1556784 opened by Eric V. Smith datetime's strftime limits strings to 127 chars (2006-09-12) CLOSED http://python.org/sf/1557037 opened by Eric V. 
Smith typo in encoding name in email package (2006-09-12) http://python.org/sf/1556895 opened by Guillaume Rousse 2.5c1 Core dump during 64-bit make on Solaris 9 Sparc (2006-09-12) http://python.org/sf/1557490 opened by Tony Bigbee xlc 6 does not like bufferobject.c line22 (2006-09-13) http://python.org/sf/1557983 opened by prueba uno apache2 - mod_python - python2.4 core dump (2006-09-14) CLOSED http://python.org/sf/1558223 opened by ThurnerRupert Tru64 make install failure (2006-09-14) http://python.org/sf/1558802 opened by Ralf W. Grosse-Kunstleve 2.5c2 macosx installer aborts during "GUI Applications" (2006-09-14) http://python.org/sf/1558983 opened by Evan Bugs Closed ___________ datetime.datetime.now() mangles tzinfo (2006-09-06) http://python.org/sf/1553577 closed by nnorwitz __unicode__ breaks for exception class objects (2006-09-03) http://python.org/sf/1551432 closed by bcannon Bug in the match function (2006-09-09) http://python.org/sf/1555496 closed by tim_one Recently introduced sgmllib regexp bug hangs Python (2006-08-16) http://python.org/sf/1541697 closed by nnorwitz logging.handlers.RotatingFileHandler - inconsistent mode (2006-09-06) http://python.org/sf/1553496 closed by vsajip datetime's strftime limits strings to 127 chars (2006-09-12) http://python.org/sf/1557037 closed by ericvsmith apache2 - mod_python - python2.4 core dump (2006-09-13) http://python.org/sf/1558223 closed by nnorwitz From nnorwitz at gmail.com Fri Sep 15 07:51:54 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Thu, 14 Sep 2006 22:51:54 -0700 Subject: [Python-Dev] fun threading problem In-Reply-To: <20060914143114.GB20596@panix.com> References: <20060914143114.GB20596@panix.com> Message-ID: On 9/14/06, Aahz wrote: > On Thu, Sep 14, 2006, Neal Norwitz wrote: > > > > On everyones favorite platform (HP-UX), the following code > > consistently fails: > > Which exact HP-UX? 
I remember from my ancient days that each HP-UX > version completely changes the way threading works -- dunno whether > that's still true. HP-UX 11i v2 on PA-RISC td191 on http://www.testdrive.hp.com/current.shtml From sanxiyn at gmail.com Wed Sep 13 09:46:05 2006 From: sanxiyn at gmail.com (Sanghyeon Seo) Date: Wed, 13 Sep 2006 16:46:05 +0900 Subject: [Python-Dev] IronPython and AST branch Message-ID: <5b0248170609130046w4e5bd012s63ecf46cbcfb8d2b@mail.gmail.com> CPython 2.5, which will be released Real Soon Now, is the first version to ship with new "AST branch", which have been in development for a long time. AST branch uses ASDL, Abstract Syntax Description Language http://asdl.sourceforge.net/ to describe Abstract Syntax Tree data structure used by CPython compiler. In theory this is language independant, and the same file could be used to generate C# source files. Having the same AST for Python implementations will be good for applications and libraries using Python implementations's internal parsers and compilers. Currently, those using CPython parser module or compiler package can't be easily ported to IronPython. What do you think? -- Seo Sanghyeon From dan.eloff at gmail.com Wed Sep 13 23:14:46 2006 From: dan.eloff at gmail.com (Dan Eloff) Date: Wed, 13 Sep 2006 16:14:46 -0500 Subject: [Python-Dev] Thank you all Message-ID: <4817b6fc0609131414j60c50400r13df42e2c6abf38e@mail.gmail.com> I was just browsing what's new in Python 2.5 at http://docs.python.org/dev/whatsnew/ As I was reading I found myself thinking how almost every improvement made a programming task I commonly bump into a little easier. Take the with statement, or the new partition method for strings, or the defaultdict (which I think was previously available, but I only now realized what it does), or the unified try/except/finally, or the conditional expression, etc Then I remembered my reaction was much like that when python 2.4 was released, and before that when Python 2.3 was released. 
Every time a new version of python rolls around, my life gets a little easier. I just want to say thank you, very much, from the bottom of my heart, to everyone here who chooses to spend some of their free time working on improving Python. Whether it be fixing bugs, writing documentation, optimizing things, or adding new/updating modules or features, I want you all to know I really appreciate your efforts. Your hard work has long ago made Python into my favourite programming language, and the gap only continues to grow. I think most people here and on comp.lang.python feel the same way. It's just too often that people (me) will find the 1% of things that aren't quite right and will focus on that, rather than look at the 99% of things that are done very well. So now, while I'm thinking about it, I want to take the opportunity to say thank you for the 99% of Python that all of you have done such a good job on. -Dan From martin at v.loewis.de Sat Sep 16 08:39:07 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 16 Sep 2006 08:39:07 +0200 Subject: [Python-Dev] Thank you all In-Reply-To: <4817b6fc0609131414j60c50400r13df42e2c6abf38e@mail.gmail.com> References: <4817b6fc0609131414j60c50400r13df42e2c6abf38e@mail.gmail.com> Message-ID: <450B9C0B.4070906@v.loewis.de> Dan Eloff schrieb: > I just want to say thank you, very much, from the bottom of my heart, > to everyone here who chooses to spend some of their free time working > on improving Python. Hi Dan, I can't really speak for all the other contributors (but maybe in this case I can): Thanks for the kind words. While we know in principle that many users appreciate our work, it is heartening to actually hear (or read) the praise. 
Regards, Martin From arigo at tunes.org Sat Sep 16 13:11:11 2006 From: arigo at tunes.org (Armin Rigo) Date: Sat, 16 Sep 2006 13:11:11 +0200 Subject: [Python-Dev] Before 2.5 - More signed integer overflows Message-ID: <20060916111111.GA27757@code0.codespeak.net> Hi all, There are more cases of signed integer overflows in the CPython source code base... That's on a 64-bits machine: [GCC 4.1.2 20060715 (prerelease) (Debian 4.1.1-9)] on linux2 abs(-sys.maxint-1) == -sys.maxint-1 I'd expect the same breakage everywhere when GCC 4.2 is used. Note that the above is Python 2.4.4c0 - apparently Python 2.3 compiled with GCC 4.1.2 works, although that doesn't make much sense to me because intobject.c didn't change here - 2.3, 2.4, 2.5, trunk are all the same. Both tested Pythons are Debian packages, not in-house compiled. Humpf! Looks like one person or two need to do a quick last-minute review of all places trying to deal with -sys.maxint-1, and replace them all with the "official" fix from Tim [SF 1545668]. A bientot, Armin From fabiofz at gmail.com Sat Sep 16 18:49:01 2006 From: fabiofz at gmail.com (Fabio Zadrozny) Date: Sat, 16 Sep 2006 13:49:01 -0300 Subject: [Python-Dev] Grammar change in classdef Message-ID: I've been porting the grammar for pydev to version 2.5 and I've seen that you can now declare a class in the format: class B():pass (without the testlist) -- from the grammar: classdef: 'class' NAME ['(' [testlist] ')'] ':' suite I think that this change should be presented at http://docs.python.org/dev/whatsnew/whatsnew25.html I'm saying that because I've only stumbled upon it by accident -- and I wasn't able to find any explanation on the reason or semantics of the change... 
Thanks,

Fabio

From l.oluyede at gmail.com Sat Sep 16 18:58:13 2006
From: l.oluyede at gmail.com (Lawrence Oluyede)
Date: Sat, 16 Sep 2006 18:58:13 +0200
Subject: [Python-Dev] Grammar change in classdef
In-Reply-To:
References:
Message-ID: <9eebf5740609160958n6fa7ae80hd7d7c737166e6367@mail.gmail.com>

> I think that this change should be presented at
> http://docs.python.org/dev/whatsnew/whatsnew25.html

It's already listed there: http://docs.python.org/dev/whatsnew/other-lang.html

--
Lawrence
http://www.oluyede.org/blog

From martin at v.loewis.de Sat Sep 16 19:22:34 2006
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 16 Sep 2006 19:22:34 +0200
Subject: [Python-Dev] Testsuite fails on Windows if a space is in the path
Message-ID: <450C32DA.9030601@v.loewis.de>

The test suite currently (2.5) has two failures on Windows if Python is installed into a directory with a space in it (such as "Program Files"). The failing tests are test_popen and test_cmd_line.

The test_cmd_line failure is shallow: the test fails to properly quote sys.executable when passing it to os.popen. I propose to fix this in Python 2.5.1; see #1559413

test_popen is more tricky. This code has always failed AFAICT, except that the test itself is a recent addition. The test tries to pass the following command to os.popen

"c:\Program Files\python25\python.exe" -c "import sys;print sys.version"

For some reason, os.popen doesn't directly start Python as a new process, but instead invokes

cmd.exe /c "c:\Program Files\python25\python.exe" -c "import sys;print sys.version"

Can somebody remember what the reason is to invoke cmd.exe (or COMSPEC) in os.popen?

In any case, cmd.exe fails to execute this, claiming that c:\Program is not a valid executable. It would run

cmd.exe /c "c:\Program Files\python25\python.exe"

just fine, so apparently, the problem is with arguments that have multiple pairs of quotes.
I found, through experimentation, that it *will* accept cmd.exe /c ""c:\Program Files\python25\python.exe" -c "import sys;print sys.version"" (i.e. doubling the quotes at the beginning and the end). I'm not quite sure what algorithm cmd.exe uses for parsing, but it appears that adding a pair of quotes works in all cases (at least those I could think of). See # 1559298 Here are my questions: 1. Should this be fixed before the final release of Python 2.5? 2. If not, should it be fixed in Python 2.5.1? I'd say not: there is a potential of breaking existing applications. Applications might be aware of this mess, and deliberately add a pair of quotes already. If popen then adds yet another pair of quotes, cmd.exe will again fail. 3. If not, should this be fixed in 2.6 in the way I propose in the patch (i.e. add quotes around the command line)? Or can anybody propose a different fix? 4. How should we deal with different values of COMSPEC? Should this patch only apply for cmd.exe, or should we assume that other shells are quirk-compatible with cmd.exe in this respect (or that people stopped setting COMSPEC, anyway)? Any comments appreciated, Martin From ncoghlan at gmail.com Sat Sep 16 19:28:48 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 17 Sep 2006 03:28:48 +1000 Subject: [Python-Dev] Grammar change in classdef In-Reply-To: References: Message-ID: <450C3450.4010401@gmail.com> Fabio Zadrozny wrote: > I've been porting the grammar for pydev to version 2.5 and I've seen > that you can now declare a class in the format: class B():pass > (without the testlist) > > -- from the grammar: classdef: 'class' NAME ['(' [testlist] ')'] ':' suite > > I think that this change should be presented at > http://docs.python.org/dev/whatsnew/whatsnew25.html > > I'm saying that because I've only stumbled upon it by accident -- and > I wasn't able to find any explanation on the reason or semantics of > the change... 
Lawrence already noted that this is already covered by the What's New document (semantically, it's identical to omitting the parentheses entirely). As for the reason: it makes it possible to use the same style for classes without bases as is used for functions without arguments. Prior to this change, there was a sharp break in the class syntax, such that if you got rid of the last base class you had to get rid of the parentheses as well. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From exarkun at divmod.com Sat Sep 16 19:38:06 2006 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Sat, 16 Sep 2006 13:38:06 -0400 Subject: [Python-Dev] Testsuite fails on Windows if a space is in the path In-Reply-To: <450C32DA.9030601@v.loewis.de> Message-ID: <20060916173806.1717.491882186.divmod.quotient.51076@ohm> On Sat, 16 Sep 2006 19:22:34 +0200, "\"Martin v. L?wis\"" wrote: >The test suite currently (2.5) has two failures on Windows >if Python is installed into a directory with a space in it >(such as "Program Files"). The failing tests are test_popen >and test_cmd_line. > >The test_cmd_line failure is shallow: the test fails to properly >quote sys.executable when passing it to os.popen. I propose to >fix this in Python 2.5.1; see #1559413 > >test_popen is more tricky. This code has always failed AFAICT, >except that the test itself is a recent addition. The test tries >to pass the following command to os.popen > >"c:\Program Files\python25\python.exe" -c "import sys;print sys.version" > >For some reason, os.popen invokes doesn't directly start Python as >a new process, but install invokes > >cmd.exe /c "c:\Program Files\python25\python.exe" -c "import sys;print >sys.version" > >Can somebody remember what the reason is to invoke cmd.exe (or COMSPEC) >in os.popen? 
I would guess it was done to force cmd.exe-style argument parsing in the subprocess, which is optional on Win32. > >In any case, cmd.exe fails to execute this, claiming that c:\Program >is not a valid executable. It would run > >cmd.exe /c "c:\Program Files\python25\python.exe" > >just fine, so apparently, the problem is with argument that have >multiple pairs of quotes. I found, through experimentation, that it >*will* accept > >cmd.exe /c ""c:\Program Files\python25\python.exe" -c "import sys;print >sys.version"" > >(i.e. doubling the quotes at the beginning and the end). I'm not quite >sure what algorithm cmd.exe uses for parsing, but it appears that >adding a pair of quotes works in all cases (at least those I could think >of). See # 1559298 You can find the quoting/dequoting rules used by cmd.exe documented on msdn: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclang/html/_pluslang_Parsing_C.2b2b_.Command.2d.Line_Arguments.asp Interpreting them is something of a challenge (my favorite part is how the examples imply that the final argument is automatically uppercased ;) Here is an attempted implementation of the quoting rules: http://twistedmatrix.com/trac/browser/trunk/twisted/python/win32.py#L41 Whether or not it is correct is probably a matter of discussion. If you find a more generally correct solution, I would certainly like to know about it. Jean-Paul From brett at python.org Sat Sep 16 20:33:56 2006 From: brett at python.org (Brett Cannon) Date: Sat, 16 Sep 2006 11:33:56 -0700 Subject: [Python-Dev] IronPython and AST branch In-Reply-To: <5b0248170609130046w4e5bd012s63ecf46cbcfb8d2b@mail.gmail.com> References: <5b0248170609130046w4e5bd012s63ecf46cbcfb8d2b@mail.gmail.com> Message-ID: On 9/13/06, Sanghyeon Seo wrote: > > CPython 2.5, which will be released Real Soon Now, is the first > version to ship with new "AST branch", which have been in development > for a long time. 
>
> AST branch uses ASDL, Abstract Syntax Description Language
> http://asdl.sourceforge.net/ to describe Abstract Syntax Tree data
> structure used by CPython compiler. In theory this is language
> independent, and the same file could be used to generate C# source
> files.

It would be nice, but see below.

> Having the same AST for Python implementations will be good for
> applications and libraries using Python implementations' internal
> parsers and compilers. Currently, those using CPython parser module or
> compiler package can't be easily ported to IronPython.
>
> What do you think?

I have talked to Jim Hugunin about this very topic at the last PyCon. He pointed out that IronPython was started before he knew about the AST branch so that's why he didn't use it. Plus, by the time he did know, it was too late to switch right then and there.

As for making the AST branch itself more of a standard, I have talked to Jeremy Hylton about that and he didn't like the idea, at least for now. The reason for keeping it "experimental" in terms of exposure at the Python level is that we do not want to lock ourselves down to some AST spec that we end up changing in the future. It's the same reasoning behind not officially documenting the marshal format; we want the flexibility.

How best to resolve all of this, I don't know. I completely understand not wanting to lock ourselves down to an AST too soon. Might need to wait a little while after the AST has been out in the wild to see what the user response is and then make a decision.

-Brett

From steve at holdenweb.com Sat Sep 16 20:41:42 2006
From: steve at holdenweb.com (Steve Holden)
Date: Sat, 16 Sep 2006 14:41:42 -0400
Subject: [Python-Dev] Testsuite fails on Windows if a space is in the path
In-Reply-To: <450C32DA.9030601@v.loewis.de>
References: <450C32DA.9030601@v.loewis.de>
Message-ID:

Martin v. Löwis wrote:
> The test suite currently (2.5) has two failures on Windows
> if Python is installed into a directory with a space in it
> (such as "Program Files"). The failing tests are test_popen
> and test_cmd_line.
>
> The test_cmd_line failure is shallow: the test fails to properly
> quote sys.executable when passing it to os.popen. I propose to
> fix this in Python 2.5.1; see #1559413
>
> test_popen is more tricky. This code has always failed AFAICT,
> except that the test itself is a recent addition. The test tries
> to pass the following command to os.popen
>
> "c:\Program Files\python25\python.exe" -c "import sys;print sys.version"
>
> For some reason, os.popen doesn't directly start Python as
> a new process, but instead invokes
>
> cmd.exe /c "c:\Program Files\python25\python.exe" -c "import sys;print
> sys.version"
>
> Can somebody remember what the reason is to invoke cmd.exe (or COMSPEC)
> in os.popen?
>
> In any case, cmd.exe fails to execute this, claiming that c:\Program
> is not a valid executable. It would run
>
> cmd.exe /c "c:\Program Files\python25\python.exe"
>
> just fine, so apparently, the problem is with arguments that have
> multiple pairs of quotes. I found, through experimentation, that it
> *will* accept
>
> cmd.exe /c ""c:\Program Files\python25\python.exe" -c "import sys;print
> sys.version""
>
> (i.e. doubling the quotes at the beginning and the end). I'm not quite
> sure what algorithm cmd.exe uses for parsing, but it appears that
> adding a pair of quotes works in all cases (at least those I could think
> of).
See #1559298 > > Here are my questions: > 1. Should this be fixed before the final release of Python 2.5? > 2. If not, should it be fixed in Python 2.5.1? I'd say not: there > is a potential of breaking existing applications. Applications > might be aware of this mess, and deliberately add a pair of > quotes already. If popen then adds yet another pair of quotes, > cmd.exe will again fail. > 3. If not, should this be fixed in 2.6 in the way I propose in > the patch (i.e. add quotes around the command line)? > Or can anybody propose a different fix? > 4. How should we deal with different values of COMSPEC? Should > this patch only apply for cmd.exe, or should we assume that > other shells are quirk-compatible with cmd.exe in this > respect (or that people stopped setting COMSPEC, anyway)? > > Any comments appreciated, > 1. Because this is almost certainly Windows version-dependent I would suggest that you definitely hold off trying to fix it for 2.5 - it would almost certainly make another RC necessary, and even that wouldn't guarantee the required testing (I sense that Windows versions get rather less pre-release testing than others). 2. I agree with your opinion: anyone for whom this is an important issue has almost certainly addressed it with their own (version-dependent) workarounds that will break with the change. 3/4. Tricky. I don't think it would be wise to assume quirk-compatibility across all Windows command processors. On balance I suspect we should just alter the documentation to note that quirks in the underlying platform may result in unexpected behavior on quoted arguments, perhaps with an example or two.
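As a minimal sketch of the kind of example such a documentation note could carry, the snippet below only builds the doubled-quote command string that Martin reports cmd.exe accepting. The name wrap_for_cmd is a hypothetical helper, not part of os.popen, and nothing is run through a shell here; whether the extra pair of quotes is right for a given Windows version or COMSPEC value is exactly the open question in this thread.

```python
def wrap_for_cmd(command_line):
    """Add one outer pair of double quotes around an already-quoted command.

    Hypothetical helper for illustration: it builds the string only and
    makes no claim to cover every cmd.exe quoting case.
    """
    return '"' + command_line + '"'


# The command from the failing test, with its own inner quotes intact.
inner = r'"c:\Program Files\python25\python.exe" -c "import sys;print sys.version"'
wrapped = wrap_for_cmd(inner)

# This is the form reported to work after "cmd.exe /c".
print("cmd.exe /c " + wrapped)
```

An application that already adds its own outer quotes would end up with three at each end if a library applied the same wrapping again, which is the breakage risk raised in question 2.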
regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From tim.peters at gmail.com Sat Sep 16 21:49:52 2006 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 16 Sep 2006 15:49:52 -0400 Subject: [Python-Dev] Testsuite fails on Windows if a space is in the path In-Reply-To: <450C32DA.9030601@v.loewis.de> References: <450C32DA.9030601@v.loewis.de> Message-ID: <1f7befae0609161249u751e9a8oe651b1ca81be1879@mail.gmail.com> [Martin v. Löwis] > ... > Can somebody remember what the reason is to invoke cmd.exe (or COMSPEC) > in os.popen? Absolutely necessary, as any number of shell gimmicks can be used in the passed string, same as on non-Windows boxes; e.g., >>> import os >>> os.environ['STR'] = 'SSL' >>> p = os.popen("findstr %STR% *.py | sort") >>> print p.read() build_ssl.py: print " None of these versions appear suitable for building OpenSSL" build_ssl.py: print "Could not find an SSL directory in '%s'" % (sources,) build_ssl.py: print "Found an SSL directory at '%s'" % (best_name,) build_ssl.py: # Look for SSL 2 levels up from pcbuild - ie, same place zlib etc all live. ... That illustrates envar substitution and setting up a pipe in the passed string, and people certainly do things like that. These are the MS docs for cmd.exe's inscrutable quoting rules after /C: """ If /C or /K is specified, then the remainder of the command line after the switch is processed as a command line, where the following logic is used to process quote (") characters: 1.
If all of the following conditions are met, then quote characters on the command line are preserved: - no /S switch - exactly two quote characters - no special characters between the two quote characters, where special is one of: &<>()@^| - there are one or more whitespace characters between the two quote characters - the string between the two quote characters is the name of an executable file. 2. Otherwise, old behavior is to see if the first character is a quote character and if so, strip the leading character and remove the last quote character on the command line, preserving any text after the last quote character. """ Your cmd.exe /c "c:\Program Files\python25\python.exe" example fits clause #1 above. cmd.exe /c "c:\Program Files\python25\python.exe" -c "import sys;print sys.version" fails the "exactly two quote characters" part of #1, so falls into #2, and after stripping the first and last quotes leaves the senseless: cmd.exe /c c:\Program Files\python25\python.exe" -c "import sys;print sys.version > (i.e. doubling the quotes at the beginning and the end) [works] And that follows from the above, although not for a reason any sane person would guess :-( I personally wouldn't change anything here for 2.5. It's a minefield, and people who care a lot already have their own workarounds in place, which we'd risk breaking. It remains a minefield for newbies, but we're really just passing on cmd.exe's behaviors. People are well-advised to accept the installer's default directory. From talin at acm.org Sat Sep 16 22:32:55 2006 From: talin at acm.org (Talin) Date: Sat, 16 Sep 2006 13:32:55 -0700 Subject: [Python-Dev] Grammar change in classdef In-Reply-To: <450C3450.4010401@gmail.com> References: <450C3450.4010401@gmail.com> Message-ID: <450C5F77.30107@acm.org> Nick Coghlan wrote: > As for the reason: it makes it possible to use the same style for classes > without bases as is used for functions without arguments.
Prior to this > change, there was a sharp break in the class syntax, such that if you got rid > of the last base class you had to get rid of the parentheses as well. Is the result a new-style or classic-style class? It would be nice if using the empty parens forced a new-style class... -- Talin From gjcarneiro at gmail.com Sat Sep 16 22:54:13 2006 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Sat, 16 Sep 2006 21:54:13 +0100 Subject: [Python-Dev] Grammar change in classdef In-Reply-To: <450C5F77.30107@acm.org> References: <450C3450.4010401@gmail.com> <450C5F77.30107@acm.org> Message-ID: On 9/16/06, Talin wrote: > Nick Coghlan wrote: > > As for the reason: it makes it possible to use the same style for classes > > without bases as is used for functions without arguments. Prior to this > > change, there was a sharp break in the class syntax, such that if you got rid > > of the last base class you had to get rid of the parentheses as well. > > Is the result a new-style or classic-style class? It would be nice if > using the empty parens forced a new-style class... That was my first thought as well. Unfortunately a quick test shows that class Foo(): creates an old style class instead :( -- Gustavo J. A. M. Carneiro "The universe is always one step beyond logic." From fabiofz at gmail.com Sat Sep 16 23:48:59 2006 From: fabiofz at gmail.com (Fabio Zadrozny) Date: Sat, 16 Sep 2006 18:48:59 -0300 Subject: [Python-Dev] Grammar change in classdef In-Reply-To: <9eebf5740609160958n6fa7ae80hd7d7c737166e6367@mail.gmail.com> References: <9eebf5740609160958n6fa7ae80hd7d7c737166e6367@mail.gmail.com> Message-ID: On 9/16/06, Lawrence Oluyede wrote: > > I think that this change should be presented at > > http://docs.python.org/dev/whatsnew/whatsnew25.html > > It's already listed there: http://docs.python.org/dev/whatsnew/other-lang.html > Thanks... also, I don't know if the empty yield statement is mentioned too (I couldn't find it either). 
Cheers, Fabio From l.oluyede at gmail.com Sat Sep 16 23:57:08 2006 From: l.oluyede at gmail.com (Lawrence Oluyede) Date: Sat, 16 Sep 2006 23:57:08 +0200 Subject: [Python-Dev] Grammar change in classdef In-Reply-To: References: <450C3450.4010401@gmail.com> <450C5F77.30107@acm.org> Message-ID: <9eebf5740609161457m2de173c2w82a068bdd0d8225a@mail.gmail.com> > That was my first thought as well. Unfortunately a quick test shows > that class Foo(): creates an old style class instead :( I think that's because until it'll be safe to break things we will stick with classic by default... -- Lawrence http://www.oluyede.org/blog From talin at acm.org Sun Sep 17 00:12:25 2006 From: talin at acm.org (Talin) Date: Sat, 16 Sep 2006 15:12:25 -0700 Subject: [Python-Dev] Grammar change in classdef In-Reply-To: <9eebf5740609161457m2de173c2w82a068bdd0d8225a@mail.gmail.com> References: <450C3450.4010401@gmail.com> <450C5F77.30107@acm.org> <9eebf5740609161457m2de173c2w82a068bdd0d8225a@mail.gmail.com> Message-ID: <450C76C9.5060201@acm.org> Lawrence Oluyede wrote: >> That was my first thought as well. Unfortunately a quick test shows >> that class Foo(): creates an old style class instead :( > > I think that's because until it'll be safe to break things we will > stick with classic by default... But in this case nothing will be broken, since the () syntax was formerly not allowed, so it won't appear in any existing code. So it would have been a good opportunity to shift over to increased usage of new-style classes without breaking anything. Thus, 'class Foo:' would create a classic class, but 'class Foo():' would create a new-style class. However, once it's released as 2.5 that will no longer be the case, as people might start to use () to indicate a classic class. Oh well.
-- Talin From greg.ewing at canterbury.ac.nz Sun Sep 17 01:22:53 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 17 Sep 2006 11:22:53 +1200 Subject: [Python-Dev] Grammar change in classdef In-Reply-To: <450C5F77.30107@acm.org> References: <450C3450.4010401@gmail.com> <450C5F77.30107@acm.org> Message-ID: <450C874D.9080302@canterbury.ac.nz> Talin wrote: > Is the result a new-style or classic-style class? It would be nice if > using the empty parens forced a new-style class... No, it wouldn't, IMO. Too subtle a clue. Best to just wait for Py3k when all classes will be new-style. -- Greg From brett at python.org Sun Sep 17 02:45:57 2006 From: brett at python.org (Brett Cannon) Date: Sat, 16 Sep 2006 17:45:57 -0700 Subject: [Python-Dev] Grammar change in classdef In-Reply-To: <450C76C9.5060201@acm.org> References: <450C3450.4010401@gmail.com> <450C5F77.30107@acm.org> <9eebf5740609161457m2de173c2w82a068bdd0d8225a@mail.gmail.com> <450C76C9.5060201@acm.org> Message-ID: On 9/16/06, Talin wrote: > > Lawrence Oluyede wrote: > >> That was my first thought as well. Unfortunately a quick test shows > >> that class Foo(): creates an old style class instead :( > > > > I think that's because until it'll be safe to break things we will > > stick with classic by default... > > But in this case nothing will be broken, since the () syntax was > formerly not allowed, so it won't appear in any existing code. So it > would have been a good opportunity to shift over to increased usage > new-style classes without breaking anything. > > Thus, 'class Foo:' would create a classic class, but 'class Foo():' > would create a new-style class. > > However, once it's released as 2.5 that will no longer be the case, as > people might start to use () to indicate a classic class. Oh well. We didn't want there to suddenly be a way to make a new-style class that didn't explicitly subclass 'object'. -Brett
From ncoghlan at gmail.com Sun Sep 17 11:00:13 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 17 Sep 2006 19:00:13 +1000 Subject: [Python-Dev] Grammar change in classdef In-Reply-To: <450C5F77.30107@acm.org> References: <450C3450.4010401@gmail.com> <450C5F77.30107@acm.org> Message-ID: <450D0E9D.5010001@gmail.com> Talin wrote: > Nick Coghlan wrote: >> As for the reason: it makes it possible to use the same style for classes >> without bases as is used for functions without arguments. Prior to this >> change, there was a sharp break in the class syntax, such that if you got rid >> of the last base class you had to get rid of the parentheses as well. > > Is the result a new-style or classic-style class? It would be nice if > using the empty parens forced a new-style class... This was considered & rejected by Guido as too subtle a distinction. So you still need to set __metaclass__=type (or inherit from such a class) to get a new-style class. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Sun Sep 17 11:01:25 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 17 Sep 2006 19:01:25 +1000 Subject: [Python-Dev] Grammar change in classdef In-Reply-To: References: <9eebf5740609160958n6fa7ae80hd7d7c737166e6367@mail.gmail.com> Message-ID: <450D0EE5.3020806@gmail.com> Fabio Zadrozny wrote: > On 9/16/06, Lawrence Oluyede wrote: >>> I think that this change should be presented at >>> http://docs.python.org/dev/whatsnew/whatsnew25.html >> It's already listed there: http://docs.python.org/dev/whatsnew/other-lang.html >> > > Thanks... also, I don't know if the empty yield statement is mentioned > too (I couldn't find it either). It's part of the PEP 342 changes.
However, I don't believe AMK mentioned that part explicitly in the What's New. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Sun Sep 17 11:40:41 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 17 Sep 2006 19:40:41 +1000 Subject: [Python-Dev] IronPython and AST branch In-Reply-To: References: <5b0248170609130046w4e5bd012s63ecf46cbcfb8d2b@mail.gmail.com> Message-ID: <450D1819.2080803@gmail.com> Brett Cannon wrote: > As for making the AST branch itself more of a standard, I have talked to > Jeremy Hylton about that and he didn't like the idea, at least for now. > The reasons for keeping it as "experimental" in terms of exposure at the > Python level is that we do not want to lock ourselves down to some AST > spec that we end up changing in the future. It's the same reasoning > behind not officially documenting the marshal format; we want the > flexibility. > > How best to resolve all of this, I don't know. I completely understand > not wanting to lock ourselves down to an AST too soon. Might need to > wait a little while after the AST has been out in the wild to see what > the user response is and then make a decision. One of the biggest issues I have with the current AST is that I don't believe it really gets the "slice" and "extended slice" terminology correct (it uses 'extended slice' to refer to multi-dimensional indexing, but the normal meaning of that phrase is to refer to the use of a step argument for a slice [1]) Cheers, Nick. 
[1] http://www.python.org/doc/2.3.5/whatsnew/section-slices.html -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From howarth at bromo.msbb.uc.edu Sun Sep 17 14:51:38 2006 From: howarth at bromo.msbb.uc.edu (Jack Howarth) Date: Sun, 17 Sep 2006 08:51:38 -0400 (EDT) Subject: [Python-Dev] python, lipo and the future? Message-ID: <20060917125138.C4142110010@bromo.msbb.uc.edu> I am curious if there are any plans to support the functionality provided by lipo on MacOS X to create a python release that could operate at either 32-bit or 64-bit on Darwin ppc and Darwin intel? My understanding was that the linux developers are very interested in lipo as well as an approach to avoid the difficulty of maintaining separate lib directories for 32 and 64-bit libraries. Thanks in advance for any insights on this issue. Jack From ronaldoussoren at mac.com Sun Sep 17 18:15:22 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 17 Sep 2006 18:15:22 +0200 Subject: [Python-Dev] python, lipo and the future? In-Reply-To: <20060917125138.C4142110010@bromo.msbb.uc.edu> References: <20060917125138.C4142110010@bromo.msbb.uc.edu> Message-ID: <09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com> On Sep 17, 2006, at 2:51 PM, Jack Howarth wrote: > I am curious if there are any plans to support > the functionality provided by lipo on MacOS X to > create a python release that could operate at either > 32-bit or 64-bit on Darwin ppc and Darwin intel? We already support universal binaries for PPC and x86, adding PPC64 and x86-64 to the mix should be relatively straightforward, but it isn't a complete no-brainer. One problem is that python's configure script detects the sizes of various types and those values will be different on 32-bit and 64-bit flavours.
Another problem is that Tiger's 64-bit support is pretty limited, basically just the Unix APIs, which means you cannot have a 4-way universal python interpreter without upsetting anyone with a 64-bit machine :-). > My > understanding was that the linux developers are very > interested in lipo as well as an approach to avoid > the difficulty of maintaining separate lib directories > for 32 and 64-bit libraries. Thanks in advance for > any insights on this issue. OSX uses the MachO binary format which natively supports fat binaries, I don't know if ELF (the linux binary format) supports fat binaries. Ronald From martin at v.loewis.de Sun Sep 17 18:53:04 2006 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sun, 17 Sep 2006 18:53:04 +0200 Subject: [Python-Dev] python, lipo and the future? In-Reply-To: <09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com> References: <20060917125138.C4142110010@bromo.msbb.uc.edu> <09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com> Message-ID: <450D7D70.5080505@v.loewis.de> Ronald Oussoren schrieb: > One problem is that python's configure script detects the sizes of > various types and those values will be different on 32-bit and 64-bit > flavours. FWIW, the PC build "solves" this problem by providing a hand-crafted pyconfig.h file, instead of using an autoconf-generated one. That could work for OSX as well, although it is tedious to keep the hand-crafted file up-to-date. For the PC, this isn't really a problem, since Windows doesn't suddenly grow new features, at least not those that configure checks for. So forking pyconfig.h once and then customizing it for universal binaries might work. Another approach would be to override architecture-specific defines.
For example, a block #ifdef __APPLE__ #include "pyosx.h" #endif could be added to the end of pyconfig.h, and then pyosx.h would have #undef SIZEOF_LONG #if defined(__i386__) || defined(__ppc__) #define SIZEOF_LONG 4 #elif defined(__amd64__) || defined(__ppc64__) #define SIZEOF_LONG 8 #else #error unsupported architecture #endif Out of curiosity: how do the current universal binaries deal with this issue? Regards, Martin From jcarlson at uci.edu Sun Sep 17 20:03:31 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 17 Sep 2006 11:03:31 -0700 Subject: [Python-Dev] python, lipo and the future? In-Reply-To: <450D7D70.5080505@v.loewis.de> References: <09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com> <450D7D70.5080505@v.loewis.de> Message-ID: <20060917105909.F9A7.JCARLSON@uci.edu> "Martin v. Löwis" wrote: > Out of curiosity: how do the current universal binaries deal with this > issue? If I remember correctly, usually you do two completely independent compile runs (optionally on the same machine with different configure or macro definitions), then use a packager provided by Apple to merge the results for each binary/so to be distributed. Each additional platform would just be a new compile run. - Josiah From ronaldoussoren at mac.com Sun Sep 17 20:31:38 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 17 Sep 2006 20:31:38 +0200 Subject: [Python-Dev] python, lipo and the future? In-Reply-To: <450D7D70.5080505@v.loewis.de> References: <20060917125138.C4142110010@bromo.msbb.uc.edu> <09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com> <450D7D70.5080505@v.loewis.de> Message-ID: <5F30D695-B932-493B-9834-6FB3D9A9BAF0@mac.com> On Sep 17, 2006, at 6:53 PM, Martin v. Löwis wrote: > Ronald Oussoren schrieb: >> One problem is that python's configure script detects the sizes of >> various types and those values will be different on 32-bit and 64-bit >> flavours.
> > FWIW, the PC build "solves" this problem by providing a hand-crafted > pyconfig.h file, instead of using an autoconf-generated one. > That could work for OSX as well, although it is tedious to keep > the hand-crafted file up-to-date. > > For the PC, this isn't really a problem, since Windows doesn't > suddenly > grow new features, at least not those that configure checks for. So > forking pyconfig.h once and then customizing it for universal binaries > might work. > > Another approach would be to override architecture-specific defines. > For example, a block > > #ifdef __APPLE__ > #include "pyosx.h" > #endif That's what I had started on before Bob came up with the endianness check that is in pyconfig.h.in at the moment. I'd prefer to do this instead of manually maintaining a fork of pyconfig.h; my guess is that it is a lot less likely that pyconfig.h will grow new size-related macros than new feature-related ones. One possible issue here is that distutils has an API for fetching definitions from pyconfig.h, code that uses this to detect architecture features could cause problems. > > could be added to the end of pyconfig.h, and then pyosx.h would have > > #undef SIZEOF_LONG > > #if defined(__i386__) || defined(__ppc__) > #define SIZEOF_LONG 4 > #elif defined(__amd64__) || defined(__ppc64__) > #define SIZEOF_LONG 8 > #else > #error unsupported architecture > #endif > > Out of curiosity: how do the current universal binaries deal with this > issue? The sizes of basic types are the same on PPC32 and x86 which helps a lot. The byteorder is different, but we can use GCC feature checks there. The relevant bit of pyconfig.h.in: #ifdef __BIG_ENDIAN__ #define WORDS_BIGENDIAN 1 #else #ifndef __LITTLE_ENDIAN__ #undef WORDS_BIGENDIAN #endif #endif Users of pyconfig.h will see the correct definition of WORDS_BIGENDIAN regardless of the architecture that was used to create the file.
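For illustration only, the same two properties discussed above (byte order and the size of C types) can also be inspected from a running interpreter. The sketch below assumes nothing beyond the standard sys and struct modules; describe_build is a hypothetical name, not an existing API. On a universal build, each architecture slice would be expected to report its own values.

```python
import struct
import sys


def describe_build():
    """Report byte order and a few native C type sizes for this interpreter."""
    return {
        "byteorder": sys.byteorder,              # 'little' or 'big'
        "sizeof_long": struct.calcsize("l"),     # native size of C long
        "sizeof_pointer": struct.calcsize("P"),  # native size of void *
    }


info = describe_build()
print(info)
```

Code that needs these values at run time can ask the interpreter this way instead of parsing pyconfig.h, which sidesteps the distutils concern for simple cases.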
One of the announced features of osx 10.5 is 64-bit support throughout the system and I definitely want to see if we can get 4-way universal support on such systems. As I don't have a system that is capable of running 64-bit code I'm not going to worry too much about this right now :-) Ronald From martin at v.loewis.de Sun Sep 17 20:35:34 2006 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sun, 17 Sep 2006 20:35:34 +0200 Subject: [Python-Dev] python, lipo and the future? In-Reply-To: <20060917105909.F9A7.JCARLSON@uci.edu> References: <09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com> <450D7D70.5080505@v.loewis.de> <20060917105909.F9A7.JCARLSON@uci.edu> Message-ID: <450D9576.5070700@v.loewis.de> Josiah Carlson schrieb: > "Martin v. Löwis" wrote: >> Out of curiosity: how do the current universal binaries deal with this >> issue? > > If I remember correctly, usually you do two completely independent > compile runs (optionally on the same machine with different configure or > macro definitions), then use a packager provided by Apple to merge the > results for each binary/so to be distributed. Each additional platform > would just be a new compile run. It's true that the compiler is invoked twice, however, I very much doubt that configure is run twice. Doing so would cause the Makefile being regenerated, and the build starting from scratch. It would find the object files from the previous run, and either all overwrite them, or leave them in place. The gcc driver on OSX allows to invoke cc1/as two times, and then combines the resulting object files into a single one (not sure whether or not by invoking lipo).
Regards, Martin From fabiofz at gmail.com Sun Sep 17 20:38:42 2006 From: fabiofz at gmail.com (Fabio Zadrozny) Date: Sun, 17 Sep 2006 15:38:42 -0300 Subject: [Python-Dev] New relative import issue Message-ID: I've been playing with the new features and there's one thing about the new relative import that I find a little strange and I'm not sure this was intended... When you do a from . import xxx, it will always fail if you're in a top-level module, and when executing any module, the directory of the module will automatically go into the pythonpath, thus making all the relative imports in that structure fail. E.g.: /foo/bar/imp1.py <-- has a "from . import imp2" /foo/bar/imp2.py if I now put a test-case (or any other module I'd like as the main module) at: /foo/bar/mytest.py if it imports imp1, it will always fail. The solutions I see would be: - only use the pythonpath actually defined by the user (and don't put the current directory in the pythonpath) - make relative imports work even if they reach some directory in the pythonpath (making it work as an absolute import that would only search the current directory structure) Or is this actually a bug? (I'm with python 2.5 rc2) I took another look at http://docs.python.org/dev/whatsnew/pep-328.html and the example shows: pkg/ pkg/__init__.py pkg/main.py pkg/string.py with the main.py doing a "from . import string", which is what I was trying to accomplish... Cheers, Fabio From ronaldoussoren at mac.com Sun Sep 17 20:50:08 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 17 Sep 2006 20:50:08 +0200 Subject: [Python-Dev] python, lipo and the future? In-Reply-To: <20060917105909.F9A7.JCARLSON@uci.edu> References: <09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com> <450D7D70.5080505@v.loewis.de> <20060917105909.F9A7.JCARLSON@uci.edu> Message-ID: <73A4A5EE-76EE-41EF-8514-6EB8F828BE2B@mac.com> On Sep 17, 2006, at 8:03 PM, Josiah Carlson wrote: > > "Martin v.
Löwis" wrote: >> Out of curiosity: how do the current universal binaries deal with >> this >> issue? > > If I remember correctly, usually you do two completely independent > compile runs (optionally on the same machine with different > configure or > macro definitions), then use a packager provided by Apple to merge the > results for each binary/so to be distributed. Each additional platform > would just be a new compile run. That's the hard way to do things; if you don't mind spending some time checking the code you try to compile, you can usually tweak header files and use '-arch ppc -arch i386' to build a universal binary in one go. This is a lot more convenient when building universal binaries and is what's used to build Python as a universal binary. Ronald From bob at redivi.com Sun Sep 17 20:46:41 2006 From: bob at redivi.com (Bob Ippolito) Date: Sun, 17 Sep 2006 11:46:41 -0700 Subject: [Python-Dev] python, lipo and the future? In-Reply-To: <450D9576.5070700@v.loewis.de> References: <09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com> <450D7D70.5080505@v.loewis.de> <20060917105909.F9A7.JCARLSON@uci.edu> <450D9576.5070700@v.loewis.de> Message-ID: <6a36e7290609171146k5b90b13v78d31f013f36d282@mail.gmail.com> On 9/17/06, "Martin v. Löwis" wrote: > Josiah Carlson schrieb: > > "Martin v. Löwis" wrote: > >> Out of curiosity: how do the current universal binaries deal with this > >> issue? > > > > If I remember correctly, usually you do two completely independent > > compile runs (optionally on the same machine with different configure or > > macro definitions), then use a packager provided by Apple to merge the > > results for each binary/so to be distributed.
Each additional platform > > would just be a new compile run. Sometimes this is done, but usually people just use CC="cc -arch i386 -arch ppc". Most of the time that Just Works, unless the source depends on autoconf gunk for endianness related issues. > It's true that the compiler is invoked twice, however, I very much doubt > that configure is run twice. Doing so would cause the Makefile being > regenerated, and the build starting from scratch. It would find the > object files from the previous run, and either all overwrite them, or > leave them in place. > > The gcc driver on OSX allows to invoke cc1/as two times, and then > combines the resulting object files into a single one (not sure whether > or not by invoking lipo). > That's exactly what it does. The gcc frontend ensures that cc1/as is invoked exactly as many times as there are -arch flags, and the result is lipo'ed together. This also means that you get to see a copy of all warnings and errors for each -arch flag. -bob From ronaldoussoren at mac.com Sun Sep 17 20:53:03 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 17 Sep 2006 20:53:03 +0200 Subject: [Python-Dev] python, lipo and the future? In-Reply-To: <450D9576.5070700@v.loewis.de> References: <09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com> <450D7D70.5080505@v.loewis.de> <20060917105909.F9A7.JCARLSON@uci.edu> <450D9576.5070700@v.loewis.de> Message-ID: <4AC70E8C-85DF-4BF3-9B4B-C19B3D1118CC@mac.com> On Sep 17, 2006, at 8:35 PM, Martin v. Löwis wrote: > Josiah Carlson schrieb: >> "Martin v. Löwis" wrote: >>> Out of curiosity: how do the current universal binaries deal with >>> this >>> issue? >> >> If I remember correctly, usually you do two completely independent >> compile runs (optionally on the same machine with different >> configure or >> macro definitions), then use a packager provided by Apple to merge the >> results for each binary/so to be distributed. Each additional >> platform >> would just be a new compile run.
> > It's true that the compiler is invoked twice, however, I very much > doubt > that configure is run twice. Doing so would cause the Makefile being > regenerated, and the build starting from scratch. It would find the > object files from the previous run, and either all overwrite them, or > leave them in place. > > The gcc driver on OSX allows to invoke cc1/as two times, and then > combines the resulting object files into a single one (not sure > whether > or not by invoking lipo). IIRC the gcc driver calls lipo when multiple -arch flags are present in the command line. This is very convenient, especially when combined with distutils. Universal builds of Python will automatically build universal extensions as well, without major patches to distutils. Ronald From martin at v.loewis.de Sun Sep 17 20:56:18 2006 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sun, 17 Sep 2006 20:56:18 +0200 Subject: [Python-Dev] python, lipo and the future? In-Reply-To: <5F30D695-B932-493B-9834-6FB3D9A9BAF0@mac.com> References: <20060917125138.C4142110010@bromo.msbb.uc.edu> <09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com> <450D7D70.5080505@v.loewis.de> <5F30D695-B932-493B-9834-6FB3D9A9BAF0@mac.com> Message-ID: <450D9A52.6010209@v.loewis.de> Ronald Oussoren schrieb: > The sizes of basic types are the same on PPC32 and x86 which helps a > lot. Ah, right. This was the missing piece of the puzzle. > The byteorder is different, but we can use GCC feature checks > there. The relevant bit of pyconfig.h.in: > > #ifdef __BIG_ENDIAN__ > #define WORDS_BIGENDIAN 1 > #else > #ifndef __LITTLE_ENDIAN__ > #undef WORDS_BIGENDIAN > #endif > #endif Yes, I remember this change very well.
> One of the announced features of osx 10.5 is 64-bit support throughout > the system and I definitely want to see if we can get 4-way universal > support on such systems. As I don't have a system that is capable of > running 64-bit code I'm not going to worry too much about this right > now :-) Isn't this a size issue, also? There might be very few users of a 64-bit binary (fewer even on PPC64 than on AMD64). In addition: how does the system choose whether to create a 32-bit or a 64-bit process if the python binary is fat? Regards, Martin P.S.: for distutils, I think adding special cases for retrieving pyconfig.h items would be necessary. In addition, I think Python should expose some of these in the image, e.g. as sys.platform_config.SIZEOF_INT. From ronaldoussoren at mac.com Sun Sep 17 21:11:20 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 17 Sep 2006 21:11:20 +0200 Subject: [Python-Dev] python, lipo and the future? In-Reply-To: <450D9A52.6010209@v.loewis.de> References: <20060917125138.C4142110010@bromo.msbb.uc.edu> <09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com> <450D7D70.5080505@v.loewis.de> <5F30D695-B932-493B-9834-6FB3D9A9BAF0@mac.com> <450D9A52.6010209@v.loewis.de> Message-ID: On Sep 17, 2006, at 8:56 PM, Martin v. Löwis wrote: > >> One of the announced features of osx 10.5 is 64-bit support >> throughout >> the system and I definitely want to see if we can get 4-way universal >> support on such systems. As I don't have a system that is capable of >> running 64-bit code I'm not going to worry too much about this right >> now :-) > > Isn't this a size issue, also? There might be very few users of a > 64-bit > binary (fewer even on PPC64 than on AMD64). On Tiger it's primarily a usability issue: 64-bit binaries can't use most of the system API's because only the unix API (libSystem) is 64-bit at the moment.
The size of the python installer would grow significantly for a 4-way
universal distribution; it would be almost twice as large as the current
distribution ("almost" because only binaries would grow in size, python
source files and data files wouldn't).

>
> In addition: how does the system choose whether to create a 32-bit
> or a 64-bit process if the python binary is fat?

It should take the best fit: on 32-bit processors it picks the 32-bit
version and on 64-bit processors it picks the 64-bit one. This probably
means that we'll have to ship multiple versions of the python
executable, otherwise Tiger (10.4) users would end up with an
interpreter that cannot use OSX-specific APIs.

Ronald

From Jack.Jansen at cwi.nl Sun Sep 17 21:29:52 2006
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Sun, 17 Sep 2006 21:29:52 +0200
Subject: [Python-Dev] python, lipo and the future?
In-Reply-To:
References: <20060917125138.C4142110010@bromo.msbb.uc.edu> <09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com> <450D7D70.5080505@v.loewis.de> <5F30D695-B932-493B-9834-6FB3D9A9BAF0@mac.com> <450D9A52.6010209@v.loewis.de>
Message-ID: <415D3487-C340-4A29-A9C1-032B1E6BB058@cwi.nl>

Just wondering: is it a good idea in the first place to create a
universal 32/64 bit Python on MacOSX?

On MacOS you don't pay a penalty or anything for running in 32-bit mode
on any current hardware, so the choice of whether to use 32 or 64 bits
really depends on the application. A single Python interpreter that can
run in both 32 and 64 bit mode would possibly make this more difficult
rather than easier.
I think I'd prefer a situation where we have python32 and python64 (with both being ppc/ intel fat) and python being a symlink to either, at the end-users' discretion. For extension modules it's different, though: there it would be nice to be able to have a single module that could load into any Python (32/64 bit, Intel/PPC) on any applicable MacOSX version. -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From howarth at bromo.msbb.uc.edu Sun Sep 17 21:37:14 2006 From: howarth at bromo.msbb.uc.edu (Jack Howarth) Date: Sun, 17 Sep 2006 15:37:14 -0400 (EDT) Subject: [Python-Dev] python, lipo and the future? Message-ID: <20060917193714.BCA73110010@bromo.msbb.uc.edu> Martin, I believe if you use the Xcode project management the Universal binary creation is automated. Currently they support the i386/ppc binaries but once Leopard comes out you will see i386/x86_64/ppc/ppc64 binaries for shared libraries. Jack From ronaldoussoren at mac.com Sun Sep 17 21:52:21 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 17 Sep 2006 21:52:21 +0200 Subject: [Python-Dev] python, lipo and the future? In-Reply-To: <415D3487-C340-4A29-A9C1-032B1E6BB058@cwi.nl> References: <20060917125138.C4142110010@bromo.msbb.uc.edu> <09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com> <450D7D70.5080505@v.loewis.de> <5F30D695-B932-493B-9834-6FB3D9A9BAF0@mac.com> <450D9A52.6010209@v.loewis.de> <415D3487-C340-4A29-A9C1-032B1E6BB058@cwi.nl> Message-ID: <4F28D5B1-B0FC-49D8-9839-1E1338188FA9@mac.com> On Sep 17, 2006, at 9:29 PM, Jack Jansen wrote: > Just wondering: is it a good idea in the first place to create a > universal 32/64 bit Python on MacOSX? > > On MacOS you don't pay a penalty or anything for running in 32-bit > mode on any current hardware, so the choice of whether to use 32 or > 64 bits really depends on the application. 
A single Python
> interpreter that can run in both 32 and 64 bit mode would possibly
> make this more difficult rather than easier. I think I'd prefer a
> situation where we have python32 and python64 (with both being ppc/
> intel fat) and python being a symlink to either, at the end-users'
> discretion.
>
> For extension modules it's different, though: there it would be nice
> to be able to have a single module that could load into any Python
> (32/64 bit, Intel/PPC) on any applicable MacOSX version.

A 4-way universal python framework could be useful, but I agree that the
python executable shouldn't be 64-bit. I'm not too happy about a symlink
that selects which version you get to use; wouldn't 'python' (32-bit)
and 'python-64' (64-bit) be just as good? That way the user doesn't have
to set up anything and it helps to reinforce the message that 64-bit
isn't necessarily better than 32-bit.

Having a 4-way universal framework would IMO be preferable over two
separate python installs; that would just increase the confusion. There
are too many python distributions for the mac anyway.

A major stumbling block for a 4-way universal installation is the
availability of binary packages for (popular) 3rd party packages. This
is not really relevant for python-dev, but I'd prefer not having 64-bit
support in the default installer over a 64-bit capable installation
where it is very hard to get popular packages to work.

BTW, several sites on the interweb claim that x86-64 runs faster than
plain x86 due to a larger register set. All my machines are 32-bit so I
can't check if this is relevant for Python (let alone Python on OSX).
Ronald

From ronaldoussoren at mac.com Sun Sep 17 21:56:21 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Sun, 17 Sep 2006 21:56:21 +0200
Subject: [Python-Dev] python, lipo and the future?
In-Reply-To: <20060917193714.BCA73110010@bromo.msbb.uc.edu>
References: <20060917193714.BCA73110010@bromo.msbb.uc.edu>
Message-ID: <90D1893C-568B-4499-A054-CB4E63B5FFD6@mac.com>

On Sep 17, 2006, at 9:37 PM, Jack Howarth wrote:

> Martin,
> I believe if you use the Xcode project management the
> Universal binary creation is automated. Currently they
> support the i386/ppc binaries but once Leopard comes
> out you will see i386/x86_64/ppc/ppc64 binaries for
> shared libraries.

That's not really relevant for python: python is built using makefiles,
not using an Xcode project (and I'd like to keep it that way).

BTW, Xcode 2.4 can already build 4-way universal binaries, and Tiger
supports 64-bit unix programs.
On my system file /usr/lib/ libSystem.B.dylib (the unix/C library) says: $ file /usr/lib/libSystem.B.dylib /usr/lib/libSystem.B.dylib: Mach-O universal binary with 3 architectures /usr/lib/libSystem.B.dylib (for architecture ppc64): Mach-O 64-bit dynamically linked shared library ppc64 /usr/lib/libSystem.B.dylib (for architecture i386): Mach-O dynamically linked shared library i386 /usr/lib/libSystem.B.dylib (for architecture ppc): Mach-O dynamically linked shared library ppc On the new Mac Pro's and probably the Core2 based iMac's as well libSystem also contains a x86-64 version. Ronald > Jack -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2157 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20060917/8c68135e/attachment.bin From martin at v.loewis.de Sun Sep 17 22:27:42 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 17 Sep 2006 22:27:42 +0200 Subject: [Python-Dev] python, lipo and the future? In-Reply-To: <415D3487-C340-4A29-A9C1-032B1E6BB058@cwi.nl> References: <20060917125138.C4142110010@bromo.msbb.uc.edu> <09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com> <450D7D70.5080505@v.loewis.de> <5F30D695-B932-493B-9834-6FB3D9A9BAF0@mac.com> <450D9A52.6010209@v.loewis.de> <415D3487-C340-4A29-A9C1-032B1E6BB058@cwi.nl> Message-ID: <450DAFBE.2000704@v.loewis.de> Jack Jansen schrieb: > Just wondering: is it a good idea in the first place to create a > universal 32/64 bit Python on MacOSX? I wonder about the same thing. > For extension modules it's different, though: there it would be nice > to be able to have a single module that could load into any Python > (32/64 bit, Intel/PPC) on any applicable MacOSX version. 
That seems to suggest that the standard distribution should indeed provide a four-times fat binary, at least for libpython: AFAIU, to build extension modules that way, all target architectures must be supported in all necessary libraries on the build machine (somebody will surely correct me if that's wrong). Regards, Martin From martin at v.loewis.de Sun Sep 17 22:34:49 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 17 Sep 2006 22:34:49 +0200 Subject: [Python-Dev] python, lipo and the future? In-Reply-To: <4F28D5B1-B0FC-49D8-9839-1E1338188FA9@mac.com> References: <20060917125138.C4142110010@bromo.msbb.uc.edu> <09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com> <450D7D70.5080505@v.loewis.de> <5F30D695-B932-493B-9834-6FB3D9A9BAF0@mac.com> <450D9A52.6010209@v.loewis.de> <415D3487-C340-4A29-A9C1-032B1E6BB058@cwi.nl> <4F28D5B1-B0FC-49D8-9839-1E1338188FA9@mac.com> Message-ID: <450DB169.5060608@v.loewis.de> Ronald Oussoren schrieb: > BTW. several sites on the interweb claim that x86-64 runs faster than > plain x86 due to a larger register set. All my machines are 32-bit so I > can't check if this is relevant for Python (let alone Python on OSX). That is plausible. OTOH, the AMD64 binaries will often require twice as much main memory, as all pointers double their size, and the Python implementation (or most OO languages, for that matter) is full of pointers. So it will be more efficient only until it starts swapping. (there is also a negative effect of larger pointers on the processor cache; the impact of this effect is hard to estimate). 
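[Editorial aside, not part of the original message.] The pointer-size effect described above can be observed from Python itself. This is a rough sketch that assumes a current CPython 3 interpreter (`sys.getsizeof` reports CPython-specific numbers, and other implementations may not support it); the helper name `list_slot_cost` is invented for illustration. It measures how many bytes one extra list slot costs, which on CPython is one pointer.

```python
import struct
import sys

# Size of a C pointer (void *) in the running interpreter: 4 on a
# 32-bit build, 8 on a 64-bit build.
POINTER_SIZE = struct.calcsize("P")


def list_slot_cost(n=1000):
    """Return the per-element storage cost of a list, in bytes.

    Compares a list preallocated with n slots against an empty list.
    The referenced object (None) is shared, so only the slots count.
    """
    return (sys.getsizeof([None] * n) - sys.getsizeof([])) / n


if __name__ == "__main__":
    print("pointer size:", POINTER_SIZE)
    print("bytes per list slot:", list_slot_cost())
```

Run under a 32-bit and a 64-bit build of the same interpreter, the per-slot figure should double along with the pointer size; how much a whole program grows depends on how pointer-heavy its data is.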
Regards, Martin From anthony at interlink.com.au Mon Sep 18 06:35:52 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Mon, 18 Sep 2006 14:35:52 +1000 Subject: [Python-Dev] Before 2.5 - More signed integer overflows In-Reply-To: <20060916111111.GA27757@code0.codespeak.net> References: <20060916111111.GA27757@code0.codespeak.net> Message-ID: <200609181435.59238.anthony@interlink.com.au> On Saturday 16 September 2006 21:11, Armin Rigo wrote: > Hi all, > > There are more cases of signed integer overflows in the CPython source > code base... > > That's on a 64-bits machine: > > [GCC 4.1.2 20060715 (prerelease) (Debian 4.1.1-9)] on linux2 > abs(-sys.maxint-1) == -sys.maxint-1 > Humpf! Looks like one person or two need to do a quick last-minute > review of all places trying to deal with -sys.maxint-1, and replace them > all with the "official" fix from Tim [SF 1545668]. Ick. We're now less than 24 hours from the scheduled release date for 2.5 final. There seems to be a couple of approaches here: 1. Someone (it won't be me, I'm flat out with work and paperwriting today) reviews the code and fixes it 2. We leave it for a 2.5.1. I'm expecting (based on the number of bugs found and fixed during the release cycle) that we'll probably need a 2.5.1 in about 3 months. 3. We delay the release until it's fixed. I'm strongly leaning towards (2) at this point. (1) would probably require another release candidate, while (3) would result in another release candidate and massive amount of sobbing from a lot of people (including me). -- Anthony Baxter It's never too late to have a happy childhood. From anthony at interlink.com.au Mon Sep 18 06:40:43 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Mon, 18 Sep 2006 14:40:43 +1000 Subject: [Python-Dev] BRANCH FREEZE/IMMINENT RELEASE: Python 2.5 (final). 2006-09-19, 00:00UTC Message-ID: <200609181440.46961.anthony@interlink.com.au> Ok, time to bring down the hammer. 
The release25-maint branch is absolutely frozen to everyone but the release team from 00:00UTC, Tuesday 19th September. That's just under 20 hours from now. This is for Python 2.5 FINAL, so anyone who breaks this release will make me very, very sad. Based on the last few releases, I'd expect the release process to take around 18 hours (timezones are a swine). Anthony -- Anthony Baxter It's never too late to have a happy childhood. From tim.peters at gmail.com Mon Sep 18 06:58:31 2006 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 18 Sep 2006 00:58:31 -0400 Subject: [Python-Dev] Before 2.5 - More signed integer overflows In-Reply-To: <200609181435.59238.anthony@interlink.com.au> References: <20060916111111.GA27757@code0.codespeak.net> <200609181435.59238.anthony@interlink.com.au> Message-ID: <1f7befae0609172158q7d6add0dwf411380b5591c782@mail.gmail.com> [Armin Rigo] >> There are more cases of signed integer overflows in the CPython source >> code base... >> >> That's on a 64-bits machine: >> >> [GCC 4.1.2 20060715 (prerelease) (Debian 4.1.1-9)] on linux2 >> abs(-sys.maxint-1) == -sys.maxint-1 >< >> Humpf! Looks like one person or two need to do a quick last-minute >> review of all places trying to deal with -sys.maxint-1, and replace them >> all with the "official" fix from Tim [SF 1545668]. [Anthony Baxter] > Ick. We're now less than 24 hours from the scheduled release date for 2.5 > final. There seems to be a couple of approaches here: > > 1. Someone (it won't be me, I'm flat out with work and paperwriting today) > reviews the code and fixes it > 2. We leave it for a 2.5.1. I'm expecting (based on the number of bugs found > and fixed during the release cycle) that we'll probably need a 2.5.1 in about > 3 months. > 3. We delay the release until it's fixed. > > I'm strongly leaning towards (2) at this point. 
(1) would probably require > another release candidate, while (3) would result in another release > candidate and massive amount of sobbing from a lot of people (including me). I ignored this since I don't have a box where problems are visible (& nobody responded to my request to check my last flying-blind "fix" on a box where it mattered). Given that these are weird, unlikely-in-real-life endcase bugs specific to a single compiler, #2 is the natural choice. BTW, did anyone try compiling Python with -fwrapv on a box where it matters? I doubt that Python's speed is affected one way or the other, and if adding wrapv makes the problems go away, that would be an easy last-second workaround for all possible such problems (which of course could get fixed "for real" for 2.5.1, provided someone cares enough to dig into it). From martin at v.loewis.de Mon Sep 18 08:26:07 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 18 Sep 2006 08:26:07 +0200 Subject: [Python-Dev] Before 2.5 - More signed integer overflows In-Reply-To: <1f7befae0609172158q7d6add0dwf411380b5591c782@mail.gmail.com> References: <20060916111111.GA27757@code0.codespeak.net> <200609181435.59238.anthony@interlink.com.au> <1f7befae0609172158q7d6add0dwf411380b5591c782@mail.gmail.com> Message-ID: <450E3BFF.3000700@v.loewis.de> > BTW, did anyone try compiling Python with -fwrapv on a box where it > matters? I doubt that Python's speed is affected one way or the > other, and if adding wrapv makes the problems go away, that would be > an easy last-second workaround for all possible such problems (which > of course could get fixed "for real" for 2.5.1, provided someone cares > enough to dig into it). It's not so easy to add this option: configure needs to be taught to check whether the option is supported first; to test it, you ideally need an installation where it is supported, and one where it isn't. 
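[Editorial aside, not part of the original message.] For readers following the -fwrapv discussion, here is a small Python model of 64-bit two's-complement arithmetic. It is a sketch under stated assumptions, not code from CPython: the 64-bit width and the helper names (`wrapping_neg`, `is_most_negative`) are chosen for illustration. It shows what wrapping negation does to the most negative value, and how a comparison carried out in unsigned arithmetic can single that value out without relying on signed overflow.

```python
BITS = 64                      # assumed width of a C long on the 64-bit box
MASK = (1 << BITS) - 1
LONG_MIN = -(1 << (BITS - 1))  # models -sys.maxint-1
LONG_MAX = (1 << (BITS - 1)) - 1


def to_unsigned(value):
    """Reinterpret a signed value as an unsigned BITS-wide integer."""
    return value & MASK


def to_signed(value):
    """Reinterpret the low BITS bits of value as a signed integer."""
    value &= MASK
    return value - (1 << BITS) if value >> (BITS - 1) else value


def wrapping_neg(a):
    """Negate with wraparound, as -fwrapv makes signed negation behave."""
    return to_signed(-a)


def is_most_negative(a):
    """Detect LONG_MIN using only unsigned arithmetic.

    A negative value whose unsigned form equals its own unsigned
    negation must be 2**(BITS-1), i.e. LONG_MIN.
    """
    ua = to_unsigned(a)
    return a < 0 and ua == to_unsigned(0 - ua)


if __name__ == "__main__":
    print(wrapping_neg(LONG_MIN) == LONG_MIN)
    print(is_most_negative(LONG_MIN), is_most_negative(-1))
```

Under wrapping semantics the negated value stays negative, which is what a check of the form `a < 0 && x < 0` expects. Without that guarantee a compiler is free to assume the overflow never happens, which is one way such a check can end up optimised away.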
I've added a note to README indicating that GCC 4.2 shouldn't be used to compile Python. I don't consider this a terrible limitation, especially since GCC 4.2 isn't released, yet. OTOH, I get the same problem that Armin gets (abs(-sys.maxint-1) is negative) also on a 32-bit system, with Debian's gcc 4.1.2 (which also isn't released, yet), so it appears that the problem is already with gcc 4.1. On my system, adding -fwrapv indeed solves the problem (tested for abs()). So I added this to the README also. Regards, Martin From nnorwitz at gmail.com Mon Sep 18 08:29:56 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Sun, 17 Sep 2006 23:29:56 -0700 Subject: [Python-Dev] Before 2.5 - More signed integer overflows In-Reply-To: <450E3BFF.3000700@v.loewis.de> References: <20060916111111.GA27757@code0.codespeak.net> <200609181435.59238.anthony@interlink.com.au> <1f7befae0609172158q7d6add0dwf411380b5591c782@mail.gmail.com> <450E3BFF.3000700@v.loewis.de> Message-ID: I also tested the fix (see patch below) for the abs() issue and it seemed to work for 4.1.1 on 64-bit. I'll apply the patch to head and 2.5 and a test after 2.5 is out. I have no idea how to search for these problems. I know that xrange can't display -sys.maxint-1 properly, but I think it works properly. n -- Index: Objects/intobject.c =================================================================== --- Objects/intobject.c (revision 51886) +++ Objects/intobject.c (working copy) @@ -763,7 +763,7 @@ register long a, x; a = v->ob_ival; x = -a; - if (a < 0 && x < 0) { + if (a < 0 && (unsigned long)x == 0-(unsigned long)x) { PyObject *o = PyLong_FromLong(a); if (o != NULL) { PyObject *result = PyNumber_Negative(o); On 9/17/06, "Martin v. L?wis" wrote: > > BTW, did anyone try compiling Python with -fwrapv on a box where it > > matters? 
I doubt that Python's speed is affected one way or the > > other, and if adding wrapv makes the problems go away, that would be > > an easy last-second workaround for all possible such problems (which > > of course could get fixed "for real" for 2.5.1, provided someone cares > > enough to dig into it). > > It's not so easy to add this option: configure needs to be taught to > check whether the option is supported first; to test it, you ideally > need an installation where it is supported, and one where it isn't. > > I've added a note to README indicating that GCC 4.2 shouldn't be > used to compile Python. I don't consider this a terrible limitation, > especially since GCC 4.2 isn't released, yet. > > OTOH, I get the same problem that Armin gets (abs(-sys.maxint-1) > is negative) also on a 32-bit system, with Debian's gcc 4.1.2 > (which also isn't released, yet), so it appears that the problem > is already with gcc 4.1. > > On my system, adding -fwrapv indeed solves the problem > (tested for abs()). So I added this to the README also. > > Regards, > Martin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/nnorwitz%40gmail.com > From martin at v.loewis.de Mon Sep 18 08:56:26 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 18 Sep 2006 08:56:26 +0200 Subject: [Python-Dev] Before 2.5 - More signed integer overflows In-Reply-To: References: <20060916111111.GA27757@code0.codespeak.net> <200609181435.59238.anthony@interlink.com.au> <1f7befae0609172158q7d6add0dwf411380b5591c782@mail.gmail.com> <450E3BFF.3000700@v.loewis.de> Message-ID: <450E431A.8010507@v.loewis.de> Neal Norwitz schrieb: > I also tested the fix (see patch below) for the abs() issue and it > seemed to work for 4.1.1 on 64-bit. I'll apply the patch to head and > 2.5 and a test after 2.5 is out. 
Please also add it to 2.4. > Index: Objects/intobject.c > =================================================================== > --- Objects/intobject.c (revision 51886) > +++ Objects/intobject.c (working copy) > @@ -763,7 +763,7 @@ > register long a, x; > a = v->ob_ival; > x = -a; > - if (a < 0 && x < 0) { > + if (a < 0 && (unsigned long)x == 0-(unsigned long)x) { Hmm. Shouldn't this drop 'x' and use 'a' instead? If a is -sys.maxint-1, -a is already undefined. Regards, Martin P.S. As for finding these problems, I would have hoped that -ftrapv could help - unfortunately, gcc breaks with this option (consumes incredible amounts of memory). From nnorwitz at gmail.com Mon Sep 18 08:59:39 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Sun, 17 Sep 2006 23:59:39 -0700 Subject: [Python-Dev] Before 2.5 - More signed integer overflows In-Reply-To: <450E431A.8010507@v.loewis.de> References: <20060916111111.GA27757@code0.codespeak.net> <200609181435.59238.anthony@interlink.com.au> <1f7befae0609172158q7d6add0dwf411380b5591c782@mail.gmail.com> <450E3BFF.3000700@v.loewis.de> <450E431A.8010507@v.loewis.de> Message-ID: On 9/17/06, "Martin v. L?wis" wrote: > Neal Norwitz schrieb: > > I also tested the fix (see patch below) for the abs() issue and it > > seemed to work for 4.1.1 on 64-bit. I'll apply the patch to head and > > 2.5 and a test after 2.5 is out. > > Please also add it to 2.4. Yes > > > Index: Objects/intobject.c > > =================================================================== > > --- Objects/intobject.c (revision 51886) > > +++ Objects/intobject.c (working copy) > > @@ -763,7 +763,7 @@ > > register long a, x; > > a = v->ob_ival; > > x = -a; > > - if (a < 0 && x < 0) { > > + if (a < 0 && (unsigned long)x == 0-(unsigned long)x) { > > Hmm. Shouldn't this drop 'x' and use 'a' instead? If a is > -sys.maxint-1, -a is already undefined. Yes, probably. I didn't review carefully. > P.S. 
As for finding these problems, I would have hoped that > -ftrapv could help - unfortunately, gcc breaks with this > option (consumes incredible amounts of memory). I'm getting a crash when running test_builtin and test_calendar (at least) with gcc 4.1.1 on amd64. It's happening in pymalloc, though I don't know what the cause is. I thought I tested with gcc 4.1 before, but probably would have been in debug mode. n -- Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 16384 (LWP 22020)] PyObject_Malloc (nbytes=40) at obmalloc.c:746 746 if ((pool->freeblock = *(block **)bp) != NULL) { (gdb) p bp $1 = (block *) 0x2a9558d41800
(gdb) l 741 * Pick up the head block of its free list. 742 */ 743 ++pool->ref.count; 744 bp = pool->freeblock; 745 assert(bp != NULL); 746 if ((pool->freeblock = *(block **)bp) != NULL) { 747 UNLOCK(); 748 return (void *)bp; 749 } 750 /* (gdb) p *pool $2 = {ref = {_padding = 0x1a
, count = 26}, freeblock = 0x2a9558d41800
, nextpool = 0x2a95eac000, prevpool = 0x620210, arenaindex = 0, szidx = 4, nextoffset = 4088, maxnextoffset = 4056} (gdb) p size $3 = 4 From arigo at tunes.org Mon Sep 18 11:13:14 2006 From: arigo at tunes.org (Armin Rigo) Date: Mon, 18 Sep 2006 11:13:14 +0200 Subject: [Python-Dev] New relative import issue In-Reply-To: References: Message-ID: <20060918091314.GA26814@code0.codespeak.net> Hi Fabio, On Sun, Sep 17, 2006 at 03:38:42PM -0300, Fabio Zadrozny wrote: > I've been playing with the new features and there's one thing about > the new relative import that I find a little strange and I'm not sure > this was intended... My (limited) understanding of the motivation for relative imports is that they are only here as a transitional feature. Fully-absolute imports are the official future. Neither relative nor fully-absolute imports address the fact that in any multi-package project I've been involved with, there is some kind of sys.path hackery required (or even custom import hooks). Indeed, there is no clean way from a test module 'foo.bar.test.test_hello' to import 'foo.bar.hello': the top-level directory must first be inserted into sys.path magically. > /foo/bar/imp1.py <-- has a "from . import imp2" > /foo/bar/imp2.py > > if I now put a test-case (or any other module I'd like as the main module) at: > /foo/bar/mytest.py > > if it imports imp1, it will always fail. Indeed: foo/bar/mytest.py must do 'import foo.bar.imp1' or 'from foo.bar import imp1', and then it works (if sys.path was properly hacked first, of course). (I'm not sure, but I think that this not so much a language design decision as a consequence of the complexities of import.c, which is the largest C source file of CPython and steadily growing.) 
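[Editorial aside, not part of the original message.] The layout Fabio describes can be reproduced in a throwaway directory. The sketch below assumes a current Python 3 interpreter, where the failure surfaces as an ImportError (the exact exception and message differed in 2.5); the file `mytest_abs.py` and the helper names are invented for illustration. It runs the main module once as a plain script that does `import imp1`, and once through the full package path.

```python
import os
import subprocess
import sys
import tempfile

# Package layout from the thread, plus a variant that imports through
# the full package path. The __init__.py files make foo and foo.bar
# regular packages.
FILES = {
    "foo/__init__.py": "",
    "foo/bar/__init__.py": "",
    "foo/bar/imp2.py": "VALUE = 42\n",
    "foo/bar/imp1.py": "from . import imp2\nVALUE = imp2.VALUE\n",
    # Main module next to imp1 that imports it by its bare name.
    "foo/bar/mytest.py": "import imp1\nprint(imp1.VALUE)\n",
    # Same test, but importing via the package hierarchy.
    "foo/bar/mytest_abs.py": "from foo.bar import imp1\nprint(imp1.VALUE)\n",
}


def build_tree(root):
    """Write the example package below the directory root."""
    for relative, text in FILES.items():
        path = os.path.join(root, *relative.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as handle:
            handle.write(text)


def run_python(args, cwd):
    """Run the current interpreter with args inside cwd, capturing output."""
    return subprocess.run([sys.executable] + args, cwd=cwd,
                          capture_output=True, text=True)


def demo():
    """Return the results of the script-style and package-style runs."""
    with tempfile.TemporaryDirectory() as root:
        build_tree(root)
        as_script = run_python([os.path.join("foo", "bar", "mytest.py")], root)
        as_module = run_python(["-m", "foo.bar.mytest_abs"], root)
    return as_script, as_module


if __name__ == "__main__":
    script_result, module_result = demo()
    print("script exit status:", script_result.returncode)
    print("module output:", module_result.stdout.strip())
```

In the script-style run, imp1 is loaded as a top-level module, so its relative import has no package to be relative to. In the second run the top-level directory is on sys.path and imp1 is reached as foo.bar.imp1, so the relative import resolves.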
A bientot, Armin From ncoghlan at gmail.com Mon Sep 18 13:25:03 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 18 Sep 2006 21:25:03 +1000 Subject: [Python-Dev] New relative import issue In-Reply-To: References: Message-ID: <450E820F.4000505@gmail.com> Fabio Zadrozny wrote: > I've been playing with the new features and there's one thing about > the new relative import that I find a little strange and I'm not sure > this was intended... > > When you do a from . import xxx, it will always fail if you're in a > top-level module, and when executing any module, the directory of the > module will automatically go into the pythonpath, thus making all the > relative imports in that structure fail. Correct. Relative imports are based on __name__ and don't work properly if __name__ does not properly reflect the module's position in the package hierarchy (usually because the module is the main module, so name is set to '__main__'). This is noted briefly in PEP 328 [1], with the current workarounds explained in more detail in PEP 338 [2]. Cheers, Nick. 
[1] http://www.python.org/dev/peps/pep-0328/#relative-imports-and-name [2] http://www.python.org/dev/peps/pep-0338/#import-statements-and-the-main-module -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From martin at v.loewis.de Mon Sep 18 16:02:29 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 18 Sep 2006 16:02:29 +0200 Subject: [Python-Dev] Before 2.5 - More signed integer overflows In-Reply-To: References: <20060916111111.GA27757@code0.codespeak.net> <200609181435.59238.anthony@interlink.com.au> <1f7befae0609172158q7d6add0dwf411380b5591c782@mail.gmail.com> <450E3BFF.3000700@v.loewis.de> <450E431A.8010507@v.loewis.de> Message-ID: <450EA6F5.4040100@v.loewis.de> Neal Norwitz schrieb: > I'm getting a crash when running test_builtin and test_calendar (at > least) with gcc 4.1.1 on amd64. It's happening in pymalloc, though I > don't know what the cause is. I thought I tested with gcc 4.1 before, > but probably would have been in debug mode. Can't really check right now, but it might be that this is just the limitation that a debug obmalloc doesn't work on 64-bit systems. There is a header at each block with a fixed size of 4 bytes, even though it should be 8 bytes on 64-bit systems. This header is there only in a debug build. Regards, Martin From devik at cdi.cz Mon Sep 18 15:46:02 2006 From: devik at cdi.cz (Martin Devera) Date: Mon, 18 Sep 2006 15:46:02 +0200 Subject: [Python-Dev] deja-vu .. python locking Message-ID: <450EA31A.6060500@cdi.cz> Hello, as someone has written in FAQ, sometimes someone starts a thread about finer grained locking in Python. Ok here is one. I don't want to start a flamewar. I only seek suggestions and constructive critic. I have some ideas whose are new in this context (I believe) and I only wanted to make them public in case someone finds them interesting. Comments are welcome. 
Martin

------------
Round 1, Greg Stein's patch

The patch removes the GIL from version 1.6 and protects list, dict and
other structures with finer grained locking instead. The major slowdown
seems to be in the list and dict structures; dicts are used for object
attributes and these are accessed quite often.

Because (IIUC) the mutex struct is quite heavy, dict and list are locked
via a pool of locks. When you lock such a pooled lock you have to lock
two locks in reality. One locks the pool itself, and the other locks the
pooled lock (the second locking can be omitted in the non-contended case
because locks in the pool are in locked state).

One lock takes about 25 cycles on a UP P4 (mainly the pipeline flush
during the memory barrier) and can be even more expensive (hundreds of
cycles) due to the cacheline moving between CPUs on an SMP machine. The
"global" pool lock is subject to cacheline ping-pong as it will often be
reacquired by competing CPUs.

In mappinglookup there is lookmapping guarded by this locking scheme;
lookmapping itself takes about 20 cycles in the best (and, one hopes,
typical) case plus the compareobj cost (for string keys, say 50..100
cycles?). Thus locking/unlocking the read takes 50..100 cycles and the
operation itself takes 70-120 cycles. One might expect about a 50%
slowdown in the dict read path.

RCU like locking

The solution I have in mind is similar to RCU. In Python we have a
quiescent state: when a thread returns to the main loop of the
interpreter. Let's add an "owner_thread" field to each locked object. It
records the last thread (its id) which called any lockable method on the
object.
Each LOCK operation looks like:

    while (ob->owner_thread != self_thread()) {
        unlock_mutex(thread_mutex[self_thread()])
        // wait for owning thread to go to quiescent state
        lock_mutex(thread_mutex[ob->owner_thread])
        ob->owner_thread = self_thread()
        unlock_mutex(thread_mutex[ob->owner_thread])
        lock_mutex(thread_mutex[self_thread()])
    }

Unlock is not done: we own the object now and can use it without locking
(until we return to the interpreter loop or we call LOCK on another
object). For non-shared objects the only penalty is the
ob->owner_thread != self_thread() condition. Not sure about Windows, but
on recent Linuxes one can use the %gs register as the thread id, so the
compare costs about 3 cycles (and owner_thread should be in the object's
cacheline anyway). In the contended case there is some cache ping-pong
with ob and the mutex, but that is as expected.

Deadlocks

Our object ownership is long: from getting it in LOCK to the next
quiescent state of the thread. Thus when two threads each want to step
on the other's object, they will deadlock. A simple solution is to
extend the set of quiescent states. One is when a thread releases its
thread_mutex in the main loop (and immediately reacquires it).
Additionally it can release it just before it is going to wait on
another thread's mutex, as in LOCK (already in the code above). If you
use LOCK correctly, then while you are LOCKing an object you can't be in
a vulnerable part of ANOTHER object, so other threads can be allowed to
take ownership of your own objects at that time. One may also want to
release one's lock before locking a mutex in the threading package, and
in other places where the GIL is released today. However I admit that I
did no formal proof regarding deadlock; I plan to do it if nobody can
find another flaw in the proposal.

Big reader lock

While the above scheme might work well, it'd impose a performance
penalty for shared dicts which are almost read only (module.__dict__).
For these, similar locking can be used, except that the writer has to
wait until ALL other threads enter a quiescent state (take their locks),
then perform the change and unlock them all. Readers can read without
any locking.

Compatibility with 3rd party modules

I've read this argument on the pydev list. Maybe I'm not understanding
something, but is it so complex for Py_InitModule4 to use an extra flag
in apiver, for example? When at least one non-freethreaded module is
loaded, locking is done in the good old way...

From martin at v.loewis.de Mon Sep 18 16:18:59 2006
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Mon, 18 Sep 2006 16:18:59 +0200
Subject: [Python-Dev] Testsuite fails on Windows if a space is in the path
In-Reply-To: <20060916173806.1717.491882186.divmod.quotient.51076@ohm>
References: <20060916173806.1717.491882186.divmod.quotient.51076@ohm>
Message-ID: <450EAAD3.4010002@v.loewis.de>

Jean-Paul Calderone schrieb:
> You can find the quoting/dequoting rules used by cmd.exe documented on msdn:
>
> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclang/html/_pluslang_Parsing_C.2b2b_.Command.2d.Line_Arguments.asp
>
> Interpreting them is something of a challenge (my favorite part is how the
> examples imply that the final argument is automatically uppercased ;)

That doesn't talk about cmd.exe, does it? It rather looks like the
procedure used to create argc/argv when calling main() in the C run-time
library. If cmd.exe used these rules, the current Python code should be
fine, AFAICT.
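[Editorial aside, not part of the original message.] The C run-time rules mentioned above (how main() gets its argc/argv) are the ones Python's subprocess module follows when it turns an argument list into a single Windows command line. The sketch below uses `subprocess.list2cmdline`, an internal helper that exists in the standard library but is not documented as public API, so treat it as an illustration only; the install path is made up. It shows how a program path containing a space gets quoted under those rules, which says nothing about what cmd.exe does with the quotes afterwards.

```python
import subprocess

# Hypothetical install location with a space in it, plus two arguments.
argv = [r"C:\Program Files\Python25\python.exe", "-c", "print 1"]

# Join the arguments using the MS C run-time quoting rules: arguments
# containing spaces or tabs are wrapped in double quotes.
command_line = subprocess.list2cmdline(argv)

if __name__ == "__main__":
    print(command_line)
```

The quoting is correct for a program that parses its own command line with the C run-time. The trouble discussed in this thread arises one layer earlier, when the string is handed to `%COMSPEC% /c` and cmd.exe applies its own quote-stripping logic.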
Regards, Martin From martin at v.loewis.de Mon Sep 18 16:29:32 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 18 Sep 2006 16:29:32 +0200 Subject: [Python-Dev] Testsuite fails on Windows if a space is in the path In-Reply-To: <1f7befae0609161249u751e9a8oe651b1ca81be1879@mail.gmail.com> References: <450C32DA.9030601@v.loewis.de> <1f7befae0609161249u751e9a8oe651b1ca81be1879@mail.gmail.com> Message-ID: <450EAD4C.5020902@v.loewis.de> Tim Peters schrieb: > These are the MS docs for cmd.exe's inscrutable quoting rules after /C: > > """ > If /C or /K is specified, then the remainder of the command line after > the switch is processed as a command line, where the following logic is > used to process quote (") characters: > > 1. If all of the following conditions are met, then quote characters > on the command line are preserved: I couldn't make sense of the German translation; reading over the English version several times, I think I now understand what it does (not that I truly understand *why* it does that, probably because too many people complained that it would strip off quotes when the program name had a space in it). > I personally wouldn't change anything here for 2.5. It's a minefield, > and people who care a lot already have their own workarounds in place, > which we'd risk breaking. It remains a minefield for newbies, but > we're really just passing on cmd.exe's behaviors. So what do you suggest for 2.6? "Fix" it (i.e. make sure that the target process is invoked with the same command line that is passed to popen)? Or leave it as-is, just documenting the limitations better. It's non-obvious that popen uses %COMSPEC% /c. (Another problem is that the error message from cmd.exe gets discarded; that should get fixed regardless) > People are well-advised to accept the installer's default directory. That's very true, but difficult to communicate. 
Too many people actually complain about that, and some even bring reasonable arguments (such as the ACL in c:\ being too permissive for a software installation). Regards, Martin From martin at v.loewis.de Mon Sep 18 16:46:40 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 18 Sep 2006 16:46:40 +0200 Subject: [Python-Dev] deja-vu .. python locking In-Reply-To: <450EA31A.6060500@cdi.cz> References: <450EA31A.6060500@cdi.cz> Message-ID: <450EB150.90700@v.loewis.de> Martin Devera schrieb: > RCU like locking > Solution I have in mind is similar to RCU. In Python we have quiscent > state - when a thread returns to main loop of interpreter. There might be a terminology problem here. RCU is read-copy-update, right? I fail to see the copy (copy data structure to be modified) and update (replace original pointer with pointer to copy) part. Do this play a role in that scheme? If so, what specific structure is copied for, say, a list or a dict? This confusion makes it very difficult for me to understand your proposal, so I can't comment much on it. If you think it could work, just go ahead and create an implementation. Regards, Martin From devik at cdi.cz Mon Sep 18 17:06:47 2006 From: devik at cdi.cz (Martin Devera) Date: Mon, 18 Sep 2006 17:06:47 +0200 Subject: [Python-Dev] deja-vu .. python locking In-Reply-To: <450EB150.90700@v.loewis.de> References: <450EA31A.6060500@cdi.cz> <450EB150.90700@v.loewis.de> Message-ID: <450EB607.70801@cdi.cz> Martin v. L?wis wrote: > Martin Devera schrieb: >> RCU like locking >> Solution I have in mind is similar to RCU. In Python we have quiscent >> state - when a thread returns to main loop of interpreter. > > There might be a terminology problem here. RCU is read-copy-update, > right? I fail to see the copy (copy data structure to be modified) > and update (replace original pointer with pointer to copy) part. > Do this play a role in that scheme? 
If so, what specific structure
> is copied for, say, a list or a dict?
>
> This confusion makes it very difficult for me to understand your
> proposal, so I can't comment much on it. If you think it could work,
> just go ahead and create an implementation.

That is why I used the word "similar". I see the similarity in the way to achieve a safe "delete" phase of RCU. Probably I selected a bad title for the text. It is because I was reading about the RCU implementation in the Linux kernel and I discovered that the idea of postponing critical code to some safe point in the future might work in the Python interpreter.

So you are right. It is not RCU. It only uses a technique similar to the one RCU uses for freeing the old copy of data.

It is based on the assumption that an object is typically used by a single thread. You must lock it anyway, just in case another thread steps on it. The idea is that each object is "owned" by a thread. The owner can use its objects without locking. If a thread wants to use a foreign object then it has to wait for the owning thread to go to some safe place (out of interpreter, into LOCK of other object..). It is done by a per-thread lock, and it is necessary because the owner does no locking; thus you can be sure that nobody is using the object when the former owner is somewhere out of the object.

Regarding implementation, I wanted to look for some opinions before starting to implement something as big as this patch. Probably someone can look and say, hey it is stupid, you forgot that.... FILL_IN ... ;-)

I hope I explained it better this time, I know my English is not the best. At least worse than my Python :-)

thanks for your time, Martin

From exarkun at divmod.com Mon Sep 18 17:22:07 2006
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Mon, 18 Sep 2006 11:22:07 -0400
Subject: [Python-Dev] deja-vu .. python locking
In-Reply-To: <450EB607.70801@cdi.cz>
Message-ID: <20060918152207.1717.107889227.divmod.quotient.52985@ohm>

On Mon, 18 Sep 2006 17:06:47 +0200, Martin Devera wrote:
>Martin v.
L?wis wrote: >> Martin Devera schrieb: >>> RCU like locking >>> Solution I have in mind is similar to RCU. In Python we have quiscent >>> state - when a thread returns to main loop of interpreter. >> >> There might be a terminology problem here. RCU is read-copy-update, >> right? I fail to see the copy (copy data structure to be modified) >> and update (replace original pointer with pointer to copy) part. >> Do this play a role in that scheme? If so, what specific structure >> is copied for, say, a list or a dict? >> >> This confusion makes it very difficult for me to understand your >> proposal, so I can't comment much on it. If you think it could work, >> just go ahead and create an implementation. > >It is why I used a word "similar". I see the similarity in a way to archieve >safe "delete" phase of RCU. Probably I selected bad title for the text. It >is because I was reading about RCU implementation in Linux kernel and >I discovered that the idea of postponing critical code to some safe point in >future might work in Python interpreter. > >So that you are right. It is not RCU. It only uses similar technique as RCU >uses for free-ing old copy of data. > >It is based on assumption that an object is typicaly used by single thread. Which thread owns builtins? Or module dictionaries? If two threads are running the same function and share no state except their globals, won't they constantly be thrashing on the module dictionary? Likewise, if the same method is running in two different threads, won't they thrash on the class dictionary? 
Jean-Paul From tim.peters at gmail.com Mon Sep 18 17:27:00 2006 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 18 Sep 2006 11:27:00 -0400 Subject: [Python-Dev] Before 2.5 - More signed integer overflows In-Reply-To: <450EA6F5.4040100@v.loewis.de> References: <20060916111111.GA27757@code0.codespeak.net> <200609181435.59238.anthony@interlink.com.au> <1f7befae0609172158q7d6add0dwf411380b5591c782@mail.gmail.com> <450E3BFF.3000700@v.loewis.de> <450E431A.8010507@v.loewis.de> <450EA6F5.4040100@v.loewis.de> Message-ID: <1f7befae0609180827p7ce60142u8c3cd3d9f3c9483@mail.gmail.com> [Neal Norwitz] >> I'm getting a crash when running test_builtin and test_calendar (at >> least) with gcc 4.1.1 on amd64. It's happening in pymalloc, though I >> don't know what the cause is. I thought I tested with gcc 4.1 before, >> but probably would have been in debug mode. Neil, in context it was unclear whether you were using trapv at the time. Were you? [Martin v. L?wis] > Can't really check right now, but it might be that this is just the > limitation that a debug obmalloc doesn't work on 64-bit systems. > There is a header at each block with a fixed size of 4 bytes, even > though it should be 8 bytes on 64-bit systems. This header is there > only in a debug build. Funny then how all the 64-bit buildbots manage to pass running debug builds ;-) As of revs 46637 + 46638 (3-4 months ago), debug-build obmalloc uses sizeof(size_t) bytes for each of its header and trailer debugging fields. Before then, the debug-build obmalloc was "safe" in this respect: if it /needed/ to store more than 4 bytes in a debug bookkeeping field, it assert-failed in a debug build. That would happen if and only if a call to malloc/realloc requested >= 2**32 bytes, so was never provoked by Python's test suite. As of rev 46638, that limitation should have gone away. 
From devik at cdi.cz Mon Sep 18 19:08:16 2006 From: devik at cdi.cz (Martin Devera) Date: Mon, 18 Sep 2006 19:08:16 +0200 Subject: [Python-Dev] deja-vu .. python locking In-Reply-To: <20060918152207.1717.107889227.divmod.quotient.52985@ohm> References: <20060918152207.1717.107889227.divmod.quotient.52985@ohm> Message-ID: <450ED280.8010409@cdi.cz> >> So that you are right. It is not RCU. It only uses similar technique as RCU >> uses for free-ing old copy of data. >> >> It is based on assumption that an object is typicaly used by single thread. > > Which thread owns builtins? Or module dictionaries? If two threads are > running the same function and share no state except their globals, won't > they constantly be thrashing on the module dictionary? Likewise, if the > same method is running in two different threads, won't they thrash on the > class dictionary? As I've written in "Big reader lock" paragraph of the original proposal, these objects could be handled by not blocking in read path and wait for all other threads to "come home" before modifying. The selection between locking mode could be selected either by something like __locking__ or by detecting the mode. From nnorwitz at gmail.com Mon Sep 18 19:27:25 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Mon, 18 Sep 2006 10:27:25 -0700 Subject: [Python-Dev] Before 2.5 - More signed integer overflows In-Reply-To: <1f7befae0609180827p7ce60142u8c3cd3d9f3c9483@mail.gmail.com> References: <20060916111111.GA27757@code0.codespeak.net> <200609181435.59238.anthony@interlink.com.au> <1f7befae0609172158q7d6add0dwf411380b5591c782@mail.gmail.com> <450E3BFF.3000700@v.loewis.de> <450E431A.8010507@v.loewis.de> <450EA6F5.4040100@v.loewis.de> <1f7befae0609180827p7ce60142u8c3cd3d9f3c9483@mail.gmail.com> Message-ID: On 9/18/06, Tim Peters wrote: > [Neal Norwitz] > >> I'm getting a crash when running test_builtin and test_calendar (at > >> least) with gcc 4.1.1 on amd64. 
It's happening in pymalloc, though I > >> don't know what the cause is. I thought I tested with gcc 4.1 before, > >> but probably would have been in debug mode. > > Neil, in context it was unclear whether you were using trapv at the > time. Were you? No trapv, just ./configure --without-pydebug IIRC. I should have sent a msg last night, but was too tired. I got the same crash (I think) with gcc 3.4.4, so it's almost definitely due to an outstanding change, not python's or gcc's fault. n From rasky at develer.com Mon Sep 18 20:45:49 2006 From: rasky at develer.com (Giovanni Bajo) Date: Mon, 18 Sep 2006 20:45:49 +0200 Subject: [Python-Dev] Testsuite fails on Windows if a space is in the path References: <450C32DA.9030601@v.loewis.de><1f7befae0609161249u751e9a8oe651b1ca81be1879@mail.gmail.com> <450EAD4C.5020902@v.loewis.de> Message-ID: <09c401c6db52$a8d1a2b0$b803030a@trilan> Martin v. L?wis wrote: >> People are well-advised to accept the installer's default directory. > > That's very true, but difficult to communicate. Too many people > actually > complain about that, and some even bring reasonable arguments (such > as the ACL in c:\ being too permissive for a software installation). Besides, it won't be allowed in Vista with the default user permissions. -- Giovanni Bajo From pje at telecommunity.com Mon Sep 18 21:45:20 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 18 Sep 2006 15:45:20 -0400 Subject: [Python-Dev] deja-vu .. python locking In-Reply-To: <450ED280.8010409@cdi.cz> References: <20060918152207.1717.107889227.divmod.quotient.52985@ohm> <20060918152207.1717.107889227.divmod.quotient.52985@ohm> Message-ID: <5.1.1.6.0.20060918154011.026fb190@sparrow.telecommunity.com> At 07:08 PM 9/18/2006 +0200, Martin Devera wrote: > >> So that you are right. It is not RCU. It only uses similar technique > as RCU > >> uses for free-ing old copy of data. > >> > >> It is based on assumption that an object is typicaly used by single > thread. 
> > > > Which thread owns builtins? Or module dictionaries? If two threads are > > running the same function and share no state except their globals, won't > > they constantly be thrashing on the module dictionary? Likewise, if the > > same method is running in two different threads, won't they thrash on the > > class dictionary? > >As I've written in "Big reader lock" paragraph of the original proposal, these >objects could be handled by not blocking in read path and wait for all other >threads to "come home" before modifying. Changing an object's reference count is modifying it, and most accesses to get the dictionaries themselves involve refcount changes. Your plan, so far, does not appear to have any solution for reducing this overhead. Module globals aren't so bad, in that you'd only have to lock and refcount when frames are created and destroyed. But access to class dictionaries to obtain methods happens a lot more often, and refcounting is involved there as well. So, I think for your plan to work, you would have to eliminate reference counting, in order to bring the lock overhead down to a manageable level. From martin at v.loewis.de Mon Sep 18 22:21:51 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 18 Sep 2006 22:21:51 +0200 Subject: [Python-Dev] deja-vu .. python locking In-Reply-To: <450EB607.70801@cdi.cz> References: <450EA31A.6060500@cdi.cz> <450EB150.90700@v.loewis.de> <450EB607.70801@cdi.cz> Message-ID: <450EFFDF.1020307@v.loewis.de> Martin Devera schrieb: > It is based on assumption that an object is typicaly used by single > thread. You must lock it anyway just for case if another thread steps > on it. The idea is that each object is "owned" by a thread. Owner can > use its objects without locking. If a thread wants to use foreign > object then it has to wait for owning thread to go to some safe place > (out of interpreter, into LOCK of other object..). 
It is done by
> per-thread lock and it is necessary because owner does no locking,
> thus you can be sure that nobody is using the object when former
> owner is somewhere out of the object.

Ah, I think I understand now. First the minor critique: I believe the locking algorithm isn't thread-safe:

while (ob->owner_thread != self_thread()) {
    unlock_mutex(thread_mutex[self_thread()])
    // wait for owning thread to go to quiescent state
    lock_mutex(thread_mutex[ob->owner_thread])
    ob->owner_thread = self_thread()
    unlock_mutex(thread_mutex[ob->owner_thread])
    lock_mutex(thread_mutex[self_thread()])
}

If two threads are competing for the same object held by a third thread, they may simultaneously enter the while loop, and then simultaneously try to lock the owner_thread. Now, one will win, and own the object. Later, the other will gain the lock, and unconditionally overwrite ownership. This will cause two threads to own the object, which is an error.

The more fundamental critique is: Why? It seems you do this to improve efficiency, (implicitly) claiming that it is more efficient to keep holding the lock, instead of releasing and re-acquiring it each time.

I claim that this doesn't really matter: any reasonable mutex implementation will be "fast" if there is no lock contention. On locking, it will not invoke any system call if the lock is currently not held (but just atomically test-and-set some field of the lock); on unlocking, it will not invoke any system call if the wait list is empty. As you also need to test, there shouldn't be much of a performance difference.
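One possible way to close the race described above, sketched in Python rather than C and under stated assumptions: threads are identified by small integers, thread_mutex is a plain list of threading.Lock objects, and the names Owned, take_ownership, worker and run are made up for this illustration. The only change relative to the loop above is that the waiting thread remembers which owner it waited for and re-checks, while holding that owner's mutex, that the owner has not changed before claiming the object. No fix is given in this thread, so this is not Martin Devera's, and it says nothing about performance.

```python
import threading


class Owned:
    """An object with an owning thread id and a counter that has no lock."""

    def __init__(self, owner):
        self.owner = owner
        self.count = 0


def take_ownership(ob, me, thread_mutex):
    """Make thread `me` the owner of `ob`.

    The caller must hold thread_mutex[me]; it still holds it on return.
    """
    while ob.owner != me:
        owner = ob.owner              # remember whom we are waiting for
        thread_mutex[me].release()    # we are quiescent while we wait
        with thread_mutex[owner]:     # blocks until that owner is quiescent
            if ob.owner == owner:     # re-check: another waiter may have won
                ob.owner = me
        thread_mutex[me].acquire()
        # The loop condition re-tests ownership now that we hold our mutex.


def worker(ob, me, thread_mutex, rounds):
    thread_mutex[me].acquire()
    for _ in range(rounds):
        take_ownership(ob, me, thread_mutex)
        ob.count += 1                 # safe only while we own ob
        thread_mutex[me].release()    # quiescent point ("main loop")
        thread_mutex[me].acquire()
    thread_mutex[me].release()


def run(n_threads=4, rounds=500):
    """Let n_threads fight over one object and return its final counter."""
    thread_mutex = [threading.Lock() for _ in range(n_threads)]
    ob = Owned(owner=0)
    threads = [
        threading.Thread(target=worker, args=(ob, i, thread_mutex, rounds))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ob.count
```

run(4, 500) returns the final counter, which should be 4 * 500 if no two threads ever own the object at once. A thread never holds more than one mutex at a time in this sketch, which is what keeps the waiting itself from deadlocking.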
Regards, Martin From greg.ewing at canterbury.ac.nz Tue Sep 19 05:46:59 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 19 Sep 2006 15:46:59 +1200 Subject: [Python-Dev] New relative import issue In-Reply-To: <20060918091314.GA26814@code0.codespeak.net> References: <20060918091314.GA26814@code0.codespeak.net> Message-ID: <450F6833.60603@canterbury.ac.nz> Armin Rigo wrote: > My (limited) understanding of the motivation for relative imports is > that they are only here as a transitional feature. Fully-absolute > imports are the official future. Guido does seem to have a dislike for relative imports, but I don't really understand why. The usefulness of being able to make a package self-contained and movable to another place in the package hierarchy without hacking it seems self-evident to me. What's happening in Py3k? Will relative imports still exist? > there > is no clean way from a test module 'foo.bar.test.test_hello' to import > 'foo.bar.hello': the top-level directory must first be inserted into > sys.path magically. I've felt for a long time that problems like this wouldn't arise so much if there were a closer connection between the package hierarchy and the file system structure. There really shouldn't be any such thing as sys.path -- the view that any given module has of the package namespace should depend only on where it is, not on the history of how it came to be invoked. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiem! | Christchurch, New Zealand | (I'm not a morning person.) 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From jcarlson at uci.edu Tue Sep 19 06:18:24 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Mon, 18 Sep 2006 21:18:24 -0700 Subject: [Python-Dev] New relative import issue In-Reply-To: <450F6833.60603@canterbury.ac.nz> References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> Message-ID: <20060918210603.07EA.JCARLSON@uci.edu> Greg Ewing wrote: > Armin Rigo wrote: > > there > > is no clean way from a test module 'foo.bar.test.test_hello' to import > > 'foo.bar.hello': the top-level directory must first be inserted into > > sys.path magically. > > I've felt for a long time that problems like this > wouldn't arise so much if there were a closer > connection between the package hierarchy and the > file system structure. There really shouldn't be > any such thing as sys.path -- the view that any > given module has of the package namespace should > depend only on where it is, not on the history of > how it came to be invoked. Wait, wait, wait. If I remember correctly, one of the use-cases cited was for sub-packages of a single larger package to be able to import other sub-packages, via 'from ..subpackage2 import module2'. That is to say, given a package structure like... .../__init__.py .../subpackage1/module1.py .../subpackage1/__init__.py .../subpackage2/module2.py .../subpackage2/__init__.py Running module1.py, with an import line that read: from ..subpackage2 import module2 ... 
would import module2 from subpackage2 Testing this in the beta I have installed tells me: Traceback (most recent call last): File "module1.py", line 1, in from ..subpackage2 import module2 ValueError: Relative importpath too deep While I can understand why this is the case (if one is going to be naming modules relative to __main__ or otherwise, unless one preserves the number of leading '.', giving module2 a __name__ of __main__..subpackage2.module2 or ..subpackage2.module2, naming can be confusing), it does remove a very important feature. Guido suggested I make up a PEP way back in March or so, but I was slowed by actually implementing __main__-relative naming (which is currently incomplete). As it stands, in order to "work around" this particular feature, one would need to write a 'loader' to handle importing and/or main() calling in subpackage1/module1.py . - Josiah From brett at python.org Tue Sep 19 06:15:50 2006 From: brett at python.org (Brett Cannon) Date: Mon, 18 Sep 2006 21:15:50 -0700 Subject: [Python-Dev] New relative import issue In-Reply-To: <450F6833.60603@canterbury.ac.nz> References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> Message-ID: On 9/18/06, Greg Ewing wrote: > > Armin Rigo wrote: > > > My (limited) understanding of the motivation for relative imports is > > that they are only here as a transitional feature. Fully-absolute > > imports are the official future. > > Guido does seem to have a dislike for relative imports, > but I don't really understand why. The usefulness of > being able to make a package self-contained and movable > to another place in the package hierarchy without hacking > it seems self-evident to me. It is more of how relative imports used to be inherent and thus have no clear way to delineate that an import was being done using a relative path compared to an absolute one. What's happening in Py3k? Will relative imports still > exist? Using the dot notation, yes they will exist in Py3K. 
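A minimal sketch of the dot notation on a current Python 3 interpreter, using the package layout from Josiah Carlson's message. The top-level package name demo_relimport_pkg and the helper names are invented for the example, and the package is written to a temporary directory. It shows the case that does work: module1 is imported through its package, so the line "from ..subpackage2 import module2" can be resolved. It does not reproduce the 2.5 beta failure seen when module1.py is run directly as a script.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical package name, chosen to avoid clashing with real modules.
PKG = "demo_relimport_pkg"

FILES = {
    PKG + "/__init__.py": "",
    PKG + "/subpackage1/__init__.py": "",
    PKG + "/subpackage1/module1.py": (
        "from ..subpackage2 import module2\n"
        "VALUE = module2.VALUE\n"
    ),
    PKG + "/subpackage2/__init__.py": "",
    PKG + "/subpackage2/module2.py": "VALUE = 42\n",
}


def build_demo_package(root):
    """Write the package tree described in FILES below `root`."""
    for rel, text in FILES.items():
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write(text)


def import_module1():
    """Import subpackage1.module1 through its package and return it."""
    root = tempfile.mkdtemp()
    build_demo_package(root)
    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        return importlib.import_module(PKG + ".subpackage1.module1")
    finally:
        sys.path.remove(root)
```

Because module1 is loaded under its full dotted name, the two leading dots have a parent package to refer to.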
-Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060918/01e1a07e/attachment.htm From greg.ewing at canterbury.ac.nz Tue Sep 19 06:42:47 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 19 Sep 2006 16:42:47 +1200 Subject: [Python-Dev] deja-vu .. python locking In-Reply-To: <450EB607.70801@cdi.cz> References: <450EA31A.6060500@cdi.cz> <450EB150.90700@v.loewis.de> <450EB607.70801@cdi.cz> Message-ID: <450F7547.2080900@canterbury.ac.nz> Martin Devera wrote: > Regarding implementation, I wanted to look for some opinions before starting to > implement something as big as this patch. Probably someone can look and say, hey > it is stupit, you forgot that.... FILL_IN ... ;-) If I understand correctly, your suggestion for avoiding deadlock relies on the fact that a given thread can really only have one object locked at a time, i.e. after you LOCK an object you can only assume you own it until you LOCK another object or return to some quiescent state. Is this right? If so, the question is whether it's sufficient to be able to lock just one object at a time. Maybe it is, but some more formal consideration of that might be a good idea. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiem! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Sep 19 06:58:41 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 19 Sep 2006 16:58:41 +1200 Subject: [Python-Dev] deja-vu .. 
python locking In-Reply-To: <450ED280.8010409@cdi.cz> References: <20060918152207.1717.107889227.divmod.quotient.52985@ohm> <450ED280.8010409@cdi.cz> Message-ID: <450F7901.9030106@canterbury.ac.nz> Martin Devera wrote: > As I've written in "Big reader lock" paragraph of the original proposal, these > objects could be handled by not blocking in read path But as was just pointed out, because of refcounting, there's really no such thing as read-only access to an object. What *looks* like read-only access at the Python level involves refcount updates just from the act of touching the object. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiem! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Sep 19 07:13:46 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 19 Sep 2006 17:13:46 +1200 Subject: [Python-Dev] deja-vu .. python locking In-Reply-To: <5.1.1.6.0.20060918154011.026fb190@sparrow.telecommunity.com> References: <20060918152207.1717.107889227.divmod.quotient.52985@ohm> <20060918152207.1717.107889227.divmod.quotient.52985@ohm> <5.1.1.6.0.20060918154011.026fb190@sparrow.telecommunity.com> Message-ID: <450F7C8A.7070500@canterbury.ac.nz> Phillip J. Eby wrote: > So, I think for your plan to work, you would have to eliminate reference > counting, in order to bring the lock overhead down to a manageable level. There's a possibility it wouldn't be atrociously bad. Seems like it would only add the 3 instructions or whatever overhead to most refcount operations. How much this would reduce performance depends on what percentage of time is currently used by refcounting. Are there any figures for that? 
A quick way of getting an idea of how much effect it would have might be to change Py_INCREF and Py_DECREF to go through the relevant motions, and see what timings are produced for single-threaded code. It wouldn't be a working implementation, but you'd find out pretty quickly if it were going to be a disaster. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiem! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From devik at cdi.cz Tue Sep 19 09:39:28 2006 From: devik at cdi.cz (Martin Devera) Date: Tue, 19 Sep 2006 09:39:28 +0200 Subject: [Python-Dev] deja-vu .. python locking In-Reply-To: <450F7901.9030106@canterbury.ac.nz> References: <20060918152207.1717.107889227.divmod.quotient.52985@ohm> <450ED280.8010409@cdi.cz> <450F7901.9030106@canterbury.ac.nz> Message-ID: <450F9EB0.3090206@cdi.cz> Greg Ewing wrote: > Martin Devera wrote: > >> As I've written in "Big reader lock" paragraph of the original >> proposal, these >> objects could be handled by not blocking in read path > > But as was just pointed out, because of refcounting, > there's really no such thing as read-only access to > an object. What *looks* like read-only access at the > Python level involves refcount updates just from the > act of touching the object. > Yes I was thinking about atomic inc/dec (locked inc/dec in x86) as used in G.Stein's patch. I have to admit that I haven't measured its performance, I was hoping for decent one. But from http://www.linuxjournal.com/article/6993 it seems that atomic inc is rather expensive too (75ns on 1.8GHz P4) :-( Greg, what change do you have in mind regarding that "3 instruction addition" to refcounting ? thanks, Martin From devik at cdi.cz Tue Sep 19 09:51:18 2006 From: devik at cdi.cz (Martin Devera) Date: Tue, 19 Sep 2006 09:51:18 +0200 Subject: [Python-Dev] deja-vu .. 
python locking
In-Reply-To: <450EFFDF.1020307@v.loewis.de>
References: <450EA31A.6060500@cdi.cz> <450EB150.90700@v.loewis.de> <450EB607.70801@cdi.cz> <450EFFDF.1020307@v.loewis.de>
Message-ID: <450FA176.3090801@cdi.cz>

> Ah, I think I understand now. First the minor critique: I believe
> the locking algorithm isn't thread-safe:
>
> while (ob->owner_thread != self_thread()) {
>     unlock_mutex(thread_mutex[self_thread()])
>     // wait for owning thread to go to quiescent state
>     lock_mutex(thread_mutex[ob->owner_thread])
>     ob->owner_thread = self_thread()
>     unlock_mutex(thread_mutex[ob->owner_thread])
>     lock_mutex(thread_mutex[self_thread()])
> }
>
> If two threads are competing for the same object held by a third
> thread, they may simultaneously enter the while loop, and then
> simultaneously try to lock the owner_thread. Now, one will win,
> and own the object. Later, the other will gain the lock, and
> unconditionally overwrite ownership. This will cause two threads
> to own the objects, which is an error.

oops .. well, it seems a very stupid error on my side. Yes, you are absolutely right, I'll have to rethink it. I hope it is possible to do it in a correct way...

> The more fundamental critique is: Why? It seems you do this
> to improve efficiency, (implicitly) claiming that it is
> more efficient to keep holding the lock, instead of releasing
> and re-acquiring it each time.
>
> I claim that this doesn't really matter: any reasonable
> mutex implementation will be "fast" if there is no lock
> contention. On locking, it will not invoke any system
> call if the lock is currently not held (but just
> atomically test-and-set some field of the lock); on
> unlocking, it will not invoke any system call if
> the wait list is empty. As you also need to test, there
> shouldn't be much of a performance difference.

I measured it. A lock op in futex-based Linux locking is about the same speed as a Windows critical section, and it is about 30 cycles on my P4 1.8GHz in the uncontended case.
As explained in already mentioned http://www.linuxjournal.com/article/6993 it seems due to pipeline flush during cmpxchg insn. And there will be cacheline transfer penalty which is much larger. So that mutex locking will take time comparable with protected code itself (assuming fast code like dict/list read). Single compare will take ten times less. Am I missing something ? thanks, Martin From phd at phd.pp.ru Tue Sep 19 11:47:38 2006 From: phd at phd.pp.ru (Oleg Broytmann) Date: Tue, 19 Sep 2006 13:47:38 +0400 Subject: [Python-Dev] New relative import issue In-Reply-To: <450F6833.60603@canterbury.ac.nz> References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> Message-ID: <20060919094738.GC27707@phd.pp.ru> On Tue, Sep 19, 2006 at 03:46:59PM +1200, Greg Ewing wrote: > There really shouldn't be > any such thing as sys.path -- the view that any > given module has of the package namespace should > depend only on where it is I do not understand this. Can you show an example? Imagine I have two servers, Linux and FreeBSD, and on Linux python is in /usr/bin, home is /home/phd, on BSD these are /usr/local/bin and /usr/home/phd. I have some modules in site-packages and some modules in $HOME/lib/python. How can I move programs from one server to the other without rewriting them (how can I not to put full paths to modules)? I use PYTHONPATH manipulation - its enough to write a shell script that starts daemons once and use it for many years. How can I do this without sys.path?! Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. 
From ncoghlan at gmail.com Tue Sep 19 12:16:59 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 19 Sep 2006 20:16:59 +1000
Subject: [Python-Dev] New relative import issue
In-Reply-To: <20060918210603.07EA.JCARLSON@uci.edu>
References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <20060918210603.07EA.JCARLSON@uci.edu>
Message-ID: <450FC39B.9070200@gmail.com>

Josiah Carlson wrote:
> As it stands, in order to "work around" this particular feature, one
> would need to write a 'loader' to handle importing and/or main() calling
> in subpackage1/module1.py .

Yup. At the moment, you can rely on PEP 328, or on PEP 338, but not both at the same time. This was previously discussed back in June/July with Anthony convincing me that the solution to the current poor interaction shouldn't be rushed [1].

It is, however, pretty trivial to write a runpy.run_module based launcher that will execute your module and use something other than "__name__ == '__main__'" to indicate that the module is the main module. By letting run_module set __name__ normally, relative imports will "just work".

For example:

#mypkg/launch.py
# Runs a script, using the global _launched to indicate whether or not
# the module is the main module
import sys

if "_launched" not in globals():
    _launched = False

if (__name__ == "__main__") or _launched:
    import runpy
    # Run the module specified as the next command line argument
    if len(sys.argv) < 2:
        print >> sys.stderr, "No module specified for execution"
    else:
        del sys.argv[0]  # Make the requested module sys.argv[0]
        runpy.run_module(sys.argv[0],
                         init_globals=dict(_launched=True),
                         alter_sys=True)

Cheers, Nick.
[1] http://mail.python.org/pipermail/python-dev/2006-July/067077.html -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From steve at holdenweb.com Tue Sep 19 14:40:45 2006 From: steve at holdenweb.com (Steve Holden) Date: Tue, 19 Sep 2006 08:40:45 -0400 Subject: [Python-Dev] New relative import issue In-Reply-To: <450F6833.60603@canterbury.ac.nz> References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Armin Rigo wrote: > > >>My (limited) understanding of the motivation for relative imports is >>that they are only here as a transitional feature. Fully-absolute >>imports are the official future. > > > Guido does seem to have a dislike for relative imports, > but I don't really understand why. The usefulness of > being able to make a package self-contained and movable > to another place in the package hierarchy without hacking > it seems self-evident to me. > > What's happening in Py3k? Will relative imports still > exist? > > >>there >>is no clean way from a test module 'foo.bar.test.test_hello' to import >>'foo.bar.hello': the top-level directory must first be inserted into >>sys.path magically. > > > I've felt for a long time that problems like this > wouldn't arise so much if there were a closer > connection between the package hierarchy and the > file system structure. There really shouldn't be > any such thing as sys.path -- the view that any > given module has of the package namespace should > depend only on where it is, not on the history of > how it came to be invoked. > This does, of course, assume that you're importing modules from the filestore, which assumption is no longer valid in the presence of PEP 302 importers. The current initialization code actually looks for os.py as a means of establishing path elements. 
This should really be better integrated with the PEP 302 mechanism: ideally Python should work on systems that don't rely on filestore for import (even though for the foreseeable future all systems will continue to do this). regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From anthony at python.org Tue Sep 19 14:39:48 2006 From: anthony at python.org (Anthony Baxter) Date: Tue, 19 Sep 2006 22:39:48 +1000 Subject: [Python-Dev] RELEASED Python 2.5 (FINAL) Message-ID: <200609192239.57508.anthony@python.org> It's been nearly 20 months since the last major release of Python (2.4), and 5 months since the first alpha release of this cycle, so I'm absolutely thrilled to be able to say: On behalf of the Python development team and the Python community, I'm happy to announce the FINAL release of Python 2.5. This is a *production* release of Python 2.5. Yes, that's right, it's finally here. Python 2.5 is probably the most significant new release of Python since 2.2, way back in the dark ages of 2001. There's been a wide variety of changes and additions, both user-visible and underneath the hood. In addition, we've switched to SVN for development and now use Buildbot to do continuous testing of the Python codebase. Much more information (as well as source distributions and Windows and Universal Mac OSX installers) are available from the 2.5 website: http://www.python.org/2.5/ The new features in Python 2.5 are described in Andrew Kuchling's What's New In Python 2.5. It's available from the 2.5 web page. Amongst the new features of Python 2.5 are conditional expressions, the with statement, the merge of try/except and try/finally into try/except/finally, enhancements to generators to produce coroutine functionality, and a brand new AST-based compiler implementation underneath the hood. 
There's a variety of smaller new features as well.

New to the standard library are hashlib, ElementTree, sqlite3, wsgiref, uuid
and ctypes. As well, a new higher-performance profiling module (cProfile) was
added.

Extra-special thanks on behalf of the entire Python community should go out to
Neal Norwitz, who's done absolutely sterling work in shepherding Python 2.5
through to its final release.

Enjoy this new release, (and Woo-HOO! It's done!)
Anthony

Anthony Baxter
anthony at python.org
Python Release Manager
(on behalf of the entire python-dev team)

From anthony at interlink.com.au Tue Sep 19 16:06:00 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Wed, 20 Sep 2006 00:06:00 +1000
Subject: [Python-Dev] release25-maint branch - please keep frozen for a day or two more.
Message-ID: <200609200006.06585.anthony@interlink.com.au>

Could people please treat the release25-maint branch as frozen for a day or
two, just in case we have to cut an ohmygodnononokillme release?

Thanks,
Anthony
--
Anthony Baxter
It's never too late to have a happy childhood.

From steve at holdenweb.com Tue Sep 19 16:19:30 2006
From: steve at holdenweb.com (Steve Holden)
Date: Tue, 19 Sep 2006 10:19:30 -0400
Subject: [Python-Dev] release25-maint branch - please keep frozen for a day or two more.
In-Reply-To: <200609200006.06585.anthony@interlink.com.au>
References: <200609200006.06585.anthony@interlink.com.au>
Message-ID:

Anthony Baxter wrote:
> Could people please treat the release25-maint branch as frozen for a day or
> two, just in case we have to cut an ohmygodnononokillme release? Thanks,

Otherwise to be known as 2.5.005?
regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From martin at v.loewis.de Tue Sep 19 20:13:29 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 19 Sep 2006 20:13:29 +0200 Subject: [Python-Dev] deja-vu .. python locking In-Reply-To: <450FA176.3090801@cdi.cz> References: <450EA31A.6060500@cdi.cz> <450EB150.90700@v.loewis.de> <450EB607.70801@cdi.cz> <450EFFDF.1020307@v.loewis.de> <450FA176.3090801@cdi.cz> Message-ID: <45103349.3020203@v.loewis.de> Martin Devera schrieb: > I measured it. Lock op in futex based linux locking is of the same > speed as windows critical section and it is about 30 cycles on my > P4 1.8GHz in uncontented case. > As explained in already mentioned http://www.linuxjournal.com/article/6993 > it seems due to pipeline flush during cmpxchg insn. > And there will be cacheline transfer penalty which is much larger. So > that mutex locking will take time comparable with protected code itself > (assuming fast code like dict/list read). > Single compare will take ten times less. > Am I missing something ? I'll have to wait for your revised algorithm, but likely, you will need some kind of memory barrier also, or else it can't work in the multi-processor case. In any case, if to judge whether 30 cycles is few or little, measurements of the alternative approach are necessary. Regards, Martin From michael.walter at gmail.com Tue Sep 19 21:34:56 2006 From: michael.walter at gmail.com (Michael Walter) Date: Tue, 19 Sep 2006 21:34:56 +0200 Subject: [Python-Dev] Download URL typo Message-ID: <877e9a170609191234y6a5f2fa8g8f9e9aecf6bcdab3@mail.gmail.com> Hiho, in case noone didn't notice yet: the "Windows MSI Installer" link at http://www.python.org/download/releases/2.5/ points to Python 2.4! 
Regards,
Michael

From martin at v.loewis.de Tue Sep 19 23:09:38 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 19 Sep 2006 23:09:38 +0200
Subject: [Python-Dev] Download URL typo
In-Reply-To: <877e9a170609191234y6a5f2fa8g8f9e9aecf6bcdab3@mail.gmail.com>
References: <877e9a170609191234y6a5f2fa8g8f9e9aecf6bcdab3@mail.gmail.com>
Message-ID: <45105C92.9030002@v.loewis.de>

Michael Walter wrote:
> in case noone didn't notice yet: the "Windows MSI Installer" link at
> http://www.python.org/download/releases/2.5/ points to Python 2.4!

Why is this a problem? The link is actually correct: The MSI
documentation is the same.

Regards,
Martin

From martin at v.loewis.de Tue Sep 19 23:45:27 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 19 Sep 2006 23:45:27 +0200
Subject: [Python-Dev] Download URL typo
In-Reply-To: <45105C92.9030002@v.loewis.de>
References: <877e9a170609191234y6a5f2fa8g8f9e9aecf6bcdab3@mail.gmail.com>
 <45105C92.9030002@v.loewis.de>
Message-ID: <451064F7.2000200@v.loewis.de>

Martin v. Löwis wrote:
> Michael Walter wrote:
>> in case noone didn't notice yet: the "Windows MSI Installer" link at
>> http://www.python.org/download/releases/2.5/ points to Python 2.4!
>
> Why is this a problem? The link is actually correct: The MSI
> documentation is the same.

I reconsidered. Even though the documentation was nearly correct
(except that one limitation went away long ago), it's probably better
to have the documentation state "2.5" throughout. So I copied it,
changed the version numbers, and changed the links to refer to the
copy.

Regards,
Martin

From greg.ewing at canterbury.ac.nz Wed Sep 20 01:54:08 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 20 Sep 2006 11:54:08 +1200
Subject: [Python-Dev] deja-vu ..
python locking In-Reply-To: <450F9EB0.3090206@cdi.cz> References: <20060918152207.1717.107889227.divmod.quotient.52985@ohm> <450ED280.8010409@cdi.cz> <450F7901.9030106@canterbury.ac.nz> <450F9EB0.3090206@cdi.cz> Message-ID: <45108320.6090109@canterbury.ac.nz> Martin Devera wrote: > Greg, what change do you have in mind regarding that "3 instruction > addition" to refcounting ? I don't have any change in mind. If even an atomic inc is too expensive, it seems there's no hope for us. -- Greg From greg.ewing at canterbury.ac.nz Wed Sep 20 02:06:41 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 20 Sep 2006 12:06:41 +1200 Subject: [Python-Dev] New relative import issue In-Reply-To: References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> Message-ID: <45108611.7090009@canterbury.ac.nz> Steve Holden wrote: > This does, of course, assume that you're importing modules from the > filestore, which assumption is no longer valid in the presence of PEP > 302 importers. Well, you need to allow for a sufficiently abstract notion of "filesystem". I haven't really thought it through in detail. It just seems as though it would be a lot less confusing if you could figure out from static information which module will get imported by a given import statement, instead of having it depend on the history of run-time modifications to sys.path. One such kind of static information is the layout of the filesystem. 
--
Greg

From steve at holdenweb.com Wed Sep 20 03:04:26 2006
From: steve at holdenweb.com (Steve Holden)
Date: Tue, 19 Sep 2006 21:04:26 -0400
Subject: [Python-Dev] New relative import issue
In-Reply-To: <45108611.7090009@canterbury.ac.nz>
References: <20060918091314.GA26814@code0.codespeak.net>
 <450F6833.60603@canterbury.ac.nz>
 <45108611.7090009@canterbury.ac.nz>
Message-ID:

Greg Ewing wrote:
> Steve Holden wrote:
>
>
>>This does, of course, assume that you're importing modules from the
>>filestore, which assumption is no longer valid in the presence of PEP
>>302 importers.
>
>
> Well, you need to allow for a sufficiently abstract
> notion of "filesystem".
>
For some value of "sufficiently" ...

> I haven't really thought it through in detail. It
> just seems as though it would be a lot less confusing
> if you could figure out from static information which
> module will get imported by a given import statement,
> instead of having it depend on the history of run-time
> modifications to sys.path. One such kind of static
> information is the layout of the filesystem.
>
Less confusing, but sadly also less realistic.

I suspect what's really needed is *more* importer behavior rather than
less but, like you, I haven't yet thought it through in detail. All I
*can* tell you is once you start importing modules from a database the
whole import mechanism starts to look a bit under-specified and
over-complicated.
regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From steve at holdenweb.com Wed Sep 20 03:14:12 2006 From: steve at holdenweb.com (Steve Holden) Date: Tue, 19 Sep 2006 21:14:12 -0400 Subject: [Python-Dev] Download URL typo In-Reply-To: <451064F7.2000200@v.loewis.de> References: <877e9a170609191234y6a5f2fa8g8f9e9aecf6bcdab3@mail.gmail.com> <45105C92.9030002@v.loewis.de> <451064F7.2000200@v.loewis.de> Message-ID: Martin v. L?wis wrote: > Martin v. L?wis schrieb: > >>Michael Walter schrieb: >> >>>in case noone didn't notice yet: the "Windows MSI Installer" link at >>>http://www.python.org/download/releases/2.5/ points to Python 2.4! >> >>Why is this a problem? The link is actually correct: The MSI >>documentation is the same. > > > I reconsidered. Even though the documentation was nearly correct > (except that one limitation went away long ago), it's probably better > to have the documentation state "2.5" throughout. So I copied it, > changed the version numbers, and changed the links to refer to the > copy. > As I write the situation is an ugly mess, since the most visible link is just plain wrong. The page http://www.python.org/download/releases/2.5/ has a block at the top right whose last link is "Windows MSI installer". That links to http://www.python.org/download/releases/2.5/msi/ which *also* has a block at the top right whose last link is "Windows MSI installer". Unfortunately that takes you to http://www.python.org/download/releases/2.5/msi/msi by which time you have completely lost contact with any style sheet, and despite the potential infinite regress have still not located the actual installer. The correct link is in-line: http://www.python.org/download/releases/2.5/python-2.5.msi I think the next time we redesign the web production system we should take the release managers' needs into consideration. 
They should have a simple form to fill in, with defaults already provided. As indeed should many other people ... regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From devik at cdi.cz Wed Sep 20 08:00:35 2006 From: devik at cdi.cz (Martin Devera) Date: Wed, 20 Sep 2006 08:00:35 +0200 Subject: [Python-Dev] deja-vu .. python locking In-Reply-To: <45108320.6090109@canterbury.ac.nz> References: <20060918152207.1717.107889227.divmod.quotient.52985@ohm> <450ED280.8010409@cdi.cz> <450F7901.9030106@canterbury.ac.nz> <450F9EB0.3090206@cdi.cz> <45108320.6090109@canterbury.ac.nz> Message-ID: <4510D903.3090002@cdi.cz> Greg Ewing wrote: > Martin Devera wrote: > >> Greg, what change do you have in mind regarding that "3 instruction >> addition" to refcounting ? > > I don't have any change in mind. If even an atomic inc > is too expensive, it seems there's no hope for us. Just from curiosity, would be a big problem removing refcounting and live with garbage collection only ? I'm not sure if some parts of py code depends on exact refcnt behaviour (I guess it should not). Probably not for mainstream, but maybe as compile time option as part of freethreading solution only for those who need it. Even if you can do fast atomic inc/dec, it forces cacheline with refcounter to ping-pong between caches of referencing cpus (for read only class dicts for example) so that you can probably never get good SMP scalability. Consider main memory latency 100ns, then on 8 way 2GHz SMP system where paralel computation within the same py class is going on all cpus. When you manage to do a lot of class references in a loop, say 6400 instructions apart (quite realistic) then at least one CPU each time will block on that inc/dec, so that you lost one cpu in overhead... 
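
The estimate at the end of the message above can be written out as a back-of-envelope model. This is only a sketch under stated assumptions: every update of a shared reference count costs one full memory-latency stall, stalls never overlap, and the instructions-per-cycle (IPC) figure is a free parameter, since the message does not state one. The function name lost_cpus is hypothetical.

```python
def lost_cpus(n_cpus, clock_hz, miss_latency_s, instr_between_refs,
              ipc=1.0, transfers_per_ref=1):
    """Estimate stall time, summed over all CPUs, in units of whole CPUs.

    Simplified model: each CPU does instr_between_refs instructions of
    useful work, then stalls for one cache-line transfer per shared
    refcount update. The result is (stall time / work time) * n_cpus.
    """
    work_time = instr_between_refs / (ipc * clock_hz)   # seconds of useful work
    stall_time = transfers_per_ref * miss_latency_s     # seconds spent waiting
    return n_cpus * stall_time / work_time


if __name__ == "__main__":
    # Figures from the message: 8 CPUs, 2 GHz, 100 ns latency, 6400 instructions.
    for ipc in (1.0, 2.0, 4.0):
        print(ipc, lost_cpus(8, 2e9, 100e-9, 6400, ipc=ipc))
```

With the message's figures the model gives about a quarter of one CPU at one instruction per cycle and about one whole CPU at four instructions per cycle, so the "lost one cpu" conclusion depends on how much work the processors retire between reference count updates.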
From martin at v.loewis.de Wed Sep 20 08:33:38 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 20 Sep 2006 08:33:38 +0200 Subject: [Python-Dev] deja-vu .. python locking In-Reply-To: <4510D903.3090002@cdi.cz> References: <20060918152207.1717.107889227.divmod.quotient.52985@ohm> <450ED280.8010409@cdi.cz> <450F7901.9030106@canterbury.ac.nz> <450F9EB0.3090206@cdi.cz> <45108320.6090109@canterbury.ac.nz> <4510D903.3090002@cdi.cz> Message-ID: <4510E0C2.7060506@v.loewis.de> Martin Devera schrieb: > Just from curiosity, would be a big problem removing refcounting and live > with garbage collection only ? I'm not sure if some parts of py code > depends on exact refcnt behaviour (I guess it should not). Now, this gives a true deja-vu. Python applications often rely on reference counting (in particular, that releasing a file object will immediately close the file), despite the language reference saying that this is not a Python feature, just one of the implementation. In addition, implementing a tracing garbage collection would either be tedious or have additional consequences on semantics: with a conservative GC, some objects may never get collected, with a precise GC, you have to declare GC roots on the C level. Things get more complicated if the GC is also compacting. See the current thread on the py3k list. Regards, Martin From martin at v.loewis.de Wed Sep 20 08:48:18 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 20 Sep 2006 08:48:18 +0200 Subject: [Python-Dev] Download URL typo In-Reply-To: References: <877e9a170609191234y6a5f2fa8g8f9e9aecf6bcdab3@mail.gmail.com> <45105C92.9030002@v.loewis.de> <451064F7.2000200@v.loewis.de> Message-ID: <4510E432.4060402@v.loewis.de> Steve Holden schrieb: > That links to > > http://www.python.org/download/releases/2.5/msi/ > > which *also* has a block at the top right whose last link is "Windows > MSI installer". 
Unfortunately that takes you to > > http://www.python.org/download/releases/2.5/msi/msi I noticed, but my pyramid fu is not good enough to fix it. Should I submit a pyramid/web site bug report? Or can you fix it? Notice that the Highlights page behaves the same way, whereas the License and Bugs pages works correctly. I can't really spot a difference in the sources: the subnav.yml files are identical in all these. Actually, looking more closely, it appears that the "working" pages have a line subnav: !fragment subnav.yml in content.yml; this seems to make a difference. What does that line mean? Regards, Martin From jcarlson at uci.edu Wed Sep 20 09:49:59 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 20 Sep 2006 00:49:59 -0700 Subject: [Python-Dev] deja-vu .. python locking In-Reply-To: <4510D903.3090002@cdi.cz> References: <45108320.6090109@canterbury.ac.nz> <4510D903.3090002@cdi.cz> Message-ID: <20060920003827.0814.JCARLSON@uci.edu> Martin Devera wrote: [snip] > Even if you can do fast atomic inc/dec, it forces cacheline with > refcounter to ping-pong between caches of referencing cpus (for read only > class dicts for example) so that you can probably never get good SMP > scalability. That's ok. Why? Because according to Guido, the GIL isn't going away: http://mail.python.org/pipermail/python-3000/2006-April/001072.html ... so ruminations about refcounting, GC, etc., at least with regards to removing the GIL towards some sort of "free threading" Python, are likely to go nowhere. Unless someone is able to translate the codebase into using such methods, show how it is not (significantly) more difficult to program extensions for, show a mild to moderate slowdown on single processors, and prove actual speedup on multiple processors. But even then it will be a difficult sell, as it would require possibly radical rewrites for all of the hundreds or thousands of CPython extensions currently being developed and maintained. 
- Josiah

From theller at python.net Wed Sep 20 12:02:21 2006
From: theller at python.net (Thomas Heller)
Date: Wed, 20 Sep 2006 12:02:21 +0200
Subject: [Python-Dev] Exceptions and slicing
Message-ID:

Is it an oversight that exception instances no longer support
slicing in Python 2.5?

This code works in 2.4, but no longer in 2.5:

try:
    open("", "r")
except IOError, details:
    print details[:]

Thomas

From brett at python.org Wed Sep 20 20:07:51 2006
From: brett at python.org (Brett Cannon)
Date: Wed, 20 Sep 2006 11:07:51 -0700
Subject: [Python-Dev] Exceptions and slicing
In-Reply-To:
References:
Message-ID:

On 9/20/06, Thomas Heller wrote:
>
> Is it an oversight that exception instances no longer support
> slicing in Python 2.5?
>
> This code works in 2.4, but no longer in 2.5:
>
> try:
>     open("", "r")
> except IOError, details:
>     print details[:]

Technically, yes. There is no entry in the sq_slice field for the
PySequenceMethods struct. Although you can get to the list of arguments by
going through the 'args' attribute if you need a quick fix.

I have a fix in my checkout that I will check into the trunk shortly and
into 25-maint as soon as Anthony unfreezes it.

-Brett

From theller at python.net Wed Sep 20 21:38:39 2006
From: theller at python.net (Thomas Heller)
Date: Wed, 20 Sep 2006 21:38:39 +0200
Subject: [Python-Dev] Exceptions and slicing
In-Reply-To:
References:
Message-ID:

Brett Cannon wrote:
> On 9/20/06, Thomas Heller wrote:
>>
>> Is it an oversight that exception instances no longer support
>> slicing in Python 2.5?
>>
>> This code works in 2.4, but no longer in 2.5:
>>
>> try:
>>     open("", "r")
>> except IOError, details:
>>     print details[:]
>
>
> Technically, yes. There is no entry in the sq_slice field for the
> PySequenceMethods struct.
Although you can get to the list of arguments by > going through the 'args' attribute if you need a quick fix. Well, Nick Coghlan pointed out in private email: >> According to PEP 352 it should have at most been deprecated along with the >> rest of Exception.__getitem__: >> >> "This also means providing a __getitem__ method is unneeded for exceptions and >> thus will be deprecated as well." > I have a fix in my checkout that I will check into the trunk shortly and > into 25-maint as soon as Anthony unfreezes it. I was not aware before I posted that tuple-unpacking of exceptions still works, so this is another possibility: except WindowsError, (errno, message): What I find worse about WindowsError especially is two things: 1. The __str__ of a WindowsError instance hides the 'real' windows error number. So, in 2.4 "print error_instance" would print for example: [Errno 1002] Das Fenster kann die gesendete Nachricht nicht verarbeiten. while in 2.5: [Error 22] Das Fenster kann die gesendete Nachricht nicht verarbeiten. because the new mapping of windows error codes to posix error codes creates EINVAL (22) when no corresponding posix error code exists. 2. How would one write portable exception handling for Python 2.4 and 2.5? I have code like this: try: do something except WindowsError, details: if not details.errno in (TYPE_E_REGISTRYACCESS, TYPE_E_CANTLOADLIBRARY): raise Doesn't work in 2.5 any longer, because I would have to use details.winerror instead of e.errno. The two portale possibilities I found are these, but neither is elegant imo: except WindowsError, (winerrno, message): or except WindowsError, details: winerrno = details[0] And the latter still uses __getitem__ which may go away according to PEP 352. 
Thomas

From martin at v.loewis.de Wed Sep 20 21:58:36 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 20 Sep 2006 21:58:36 +0200
Subject: [Python-Dev] Exceptions and slicing
In-Reply-To:
References:
Message-ID: <45119D6C.2050005@v.loewis.de>

Thomas Heller wrote:
> 1. The __str__ of a WindowsError instance hides the 'real' windows
> error number. So, in 2.4 "print error_instance" would print
> for example:
>
> [Errno 1002] Das Fenster kann die gesendete Nachricht nicht verarbeiten.
>
> while in 2.5:
>
> [Error 22] Das Fenster kann die gesendete Nachricht nicht verarbeiten.

That's a bug. I changed the string deliberately from Errno to Error to
indicate that it is not an errno, but a GetLastError. Can you come up
with a patch?

> 2. How would one write portable exception handling for Python 2.4 and 2.5?
>
> I have code like this:
>
> try:
>     do something
> except WindowsError, details:
>     if not details.errno in (TYPE_E_REGISTRYACCESS, TYPE_E_CANTLOADLIBRARY):
>         raise
>
> Doesn't work in 2.5 any longer, because I would have to use details.winerror
> instead of e.errno.

Portable code should do

def winerror(exc):
    try:
        return exc.winerror
    except AttributeError: #2.4 and earlier
        return exc.errno

and then

try:
    do something
except WindowsError, details:
    if not winerror(details) in (TYPE_E_REGISTRYACCESS,
                                 TYPE_E_CANTLOADLIBRARY):
        raise

Regards,
Martin

From brett at python.org Wed Sep 20 22:04:49 2006
From: brett at python.org (Brett Cannon)
Date: Wed, 20 Sep 2006 13:04:49 -0700
Subject: [Python-Dev] Exceptions and slicing
In-Reply-To:
References:
Message-ID:

On 9/20/06, Thomas Heller wrote:
>
> Brett Cannon wrote:
> > On 9/20/06, Thomas Heller wrote:
> >>
> >> Is it an oversight that exception instances no longer support
> >> slicing in Python 2.5?
> >>
> >> This code works in 2.4, but no longer in 2.5:
> >>
> >> try:
> >>     open("", "r")
> >> except IOError, details:
> >>     print details[:]
> >
> >
> > Technically, yes.
There is no entry in the sq_slice field for the > > PySequenceMethods struct. Although you can get to the list of arguments > by > > going through the 'args' attribute if you need a quick fix. > > Well, Nick Coghlan pointed out in private email: > > >> According to PEP 352 it should have at most been deprecated along with > the > >> rest of Exception.__getitem__: > >> > >> "This also means providing a __getitem__ method is unneeded for > exceptions and > >> thus will be deprecated as well." Right, the deprecation is not scheduled until Python 2.9 for __getitem__ so it was a regression problem (was never a test for it before PEP 352 was written). The fix is now in so your code should work again from a trunk checkout. I will backport when the freeze is raised. -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060920/5e51126c/attachment.html From theller at python.net Wed Sep 20 22:11:58 2006 From: theller at python.net (Thomas Heller) Date: Wed, 20 Sep 2006 22:11:58 +0200 Subject: [Python-Dev] Exceptions and slicing In-Reply-To: <45119D6C.2050005@v.loewis.de> References: <45119D6C.2050005@v.loewis.de> Message-ID: Martin v. L?wis schrieb: > Thomas Heller schrieb: >> 1. The __str__ of a WindowsError instance hides the 'real' windows >> error number. So, in 2.4 "print error_instance" would print >> for example: >> >> [Errno 1002] Das Fenster kann die gesendete Nachricht nicht verarbeiten. >> >> while in 2.5: >> >> [Error 22] Das Fenster kann die gesendete Nachricht nicht verarbeiten. > > That's a bug. I changed the string deliberately from Errno to error to > indicate that it is not an errno, but a GetLastError. Can you come up > with a patch? Yes, but not today. >> 2. How would one write portable exception handling for Python 2.4 and 2.5? 
>> > Portable code should do > > def winerror(exc): > try: > return exc.winerror > except AttributeError: #2.4 and earlier > return exc.errno > > and then > > try: > do something > except WindowsError, details: > if not winerror(details) in (TYPE_E_REGISTRYACCESS, > YPE_E_CANTLOADLIBRARY): > raise Ok (sigh ;-). Thanks, Thomas From kbk at shore.net Thu Sep 21 02:02:59 2006 From: kbk at shore.net (Kurt B. Kaiser) Date: Wed, 20 Sep 2006 20:02:59 -0400 (EDT) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200609210002.k8L02xNN002362@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 419 open ( +3) / 3410 closed ( +2) / 3829 total ( +5) Bugs : 910 open (+12) / 6185 closed ( +5) / 7095 total (+17) RFE : 235 open ( +1) / 238 closed ( +0) / 473 total ( +1) New / Reopened Patches ______________________ Practical ctypes example (2006-09-15) http://python.org/sf/1559219 opened by leppton pyclbr reports different module for Class and Function (2006-09-18) http://python.org/sf/1560617 opened by Peter Otten Exec stacks in python 2.5 (2006-09-18) http://python.org/sf/1560695 opened by Chaza Patches Closed ______________ test_grp.py doesn't skip special NIS entry, fails (2006-06-22) http://python.org/sf/1510987 closed by martineau New / Reopened Bugs ___________________ some section links (previous, up, next) missing last word (2006-09-15) http://python.org/sf/1559142 opened by Tim Smith time.strptime() access non exitant attribute in calendar.py (2006-09-15) CLOSED http://python.org/sf/1559515 opened by betatim shutil.copyfile incomplete on NTFS (2006-09-16) http://python.org/sf/1559684 opened by Roger Upole gcc trunk (4.2) exposes a signed integer overflows (2006-08-24) http://python.org/sf/1545668 reopened by arigo 2.5c2 pythonw does not execute (2006-09-16) http://python.org/sf/1559747 opened by Ron Platten list.sort does nothing when both cmp and key are given (2006-09-16) CLOSED http://python.org/sf/1559818 opened by Marcin 'Qrczak' 
Kowalczyk confusing error msg from random.randint (2006-09-17) http://python.org/sf/1560032 opened by paul rubin Tutorial: incorrect info about package importing and mac (2006-09-17) http://python.org/sf/1560114 opened by C L Better/faster implementation of os.path.split (2006-09-17) CLOSED http://python.org/sf/1560161 opened by Michael Gebetsroither Better/faster implementation of os.path.basename/dirname (2006-09-17) http://python.org/sf/1560179 reopened by gbrandl Better/faster implementation of os.path.basename/dirname (2006-09-17) http://python.org/sf/1560179 opened by Michael Gebetsroither copy() method of dictionaries is not "deep" (2006-09-17) http://python.org/sf/1560327 reopened by gbrandl copy() method of dictionaries is not "deep" (2006-09-17) http://python.org/sf/1560327 opened by daniel hahler python 2.5 fails to build with --as-needed (2006-09-18) http://python.org/sf/1560984 opened by Chaza mac installer profile patch vs. .bash_login (2006-09-19) http://python.org/sf/1561243 opened by Ronald Oussoren -xcode=pic32 option is not supported on Solaris x86 Sun C (2006-09-19) http://python.org/sf/1561333 opened by James Lick Dedent with Italian keyboard (2006-09-20) http://python.org/sf/1562092 opened by neclepsio Fails to install on Fedora Core 5 (2006-09-20) http://python.org/sf/1562171 opened by Mark Summerfield IDLE Hung up after open script by command line... 
(2006-09-20) http://python.org/sf/1562193 opened by Faramir^ uninitialized memory read in parsetok() (2006-09-20) http://python.org/sf/1562308 opened by Luke Moore Bugs Closed ___________ 2.5c2 macosx installer aborts during "GUI Applications" (2006-09-15) http://python.org/sf/1558983 closed by ronaldoussoren time.strptime() access non existant attribute in calendar.py (2006-09-15) http://python.org/sf/1559515 closed by bcannon list.sort does nothing when both cmp and key are given (2006-09-16) http://python.org/sf/1559818 closed by qrczak Better/faster implementation of os.path.split (2006-09-17) http://python.org/sf/1560161 deleted by einsteinmg Better/faster implementation of os.path.basename/dirname (2006-09-17) http://python.org/sf/1560179 deleted by einsteinmg copy() method of dictionaries is not "deep" (2006-09-17) http://python.org/sf/1560327 closed by gbrandl New / Reopened RFE __________________ Exception need structured information associated with them (2006-09-15) http://python.org/sf/1559549 opened by Ned Batchelder String searching performance improvement (2006-09-19) CLOSED http://python.org/sf/1561634 opened by Nick Welch RFE Closed __________ String searching performance improvement (2006-09-19) http://python.org/sf/1561634 deleted by mackstann From guido at python.org Thu Sep 21 04:17:43 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 20 Sep 2006 19:17:43 -0700 Subject: [Python-Dev] New relative import issue In-Reply-To: <45108611.7090009@canterbury.ac.nz> References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <45108611.7090009@canterbury.ac.nz> Message-ID: On 9/19/06, Greg Ewing wrote: > I haven't really thought it through in detail. It > just seems as though it would be a lot less confusing > if you could figure out from static information which > module will get imported by a given import statement, > instead of having it depend on the history of run-time > modifications to sys.path. 
One such kind of static > information is the layout of the filesystem. Eek? If there are two third-party top-level packages A and B, by different third parties, and A depends on B, how should A find B if not via sys.path or something that is sufficiently equivalent as to have the same problems? Surely every site shouldn't be required to install A and B in the same location (or in the same location relative to each other). I sympathize with the problems that exist with the current import mechanism, really, I do. Google feels the pain every day (alas, Google's requirements are a bit unusual, so they alone can't provide much guidance for a solution). But if you combine the various requirements: zip imports, import hooks of various sorts, different permissions for the owners of different packages that must cooperate, versioning issues (Python versions as well as package versions), forwards compatibility, backwards compatibility, ease of development, ease of packaging, ease of installation, supporting the conventions of vastly different platforms, data files mixed in with the source code (sometimes with their own search path), and probably several other requirements that I'm forgetting right now, it's just not an easy problem. 
--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Thu Sep 21 04:20:30 2006
From: guido at python.org (Guido van Rossum)
Date: Wed, 20 Sep 2006 19:20:30 -0700
Subject: [Python-Dev] IronPython and AST branch
In-Reply-To: <450D1819.2080803@gmail.com>
References: <5b0248170609130046w4e5bd012s63ecf46cbcfb8d2b@mail.gmail.com>
 <450D1819.2080803@gmail.com>
Message-ID:

On 9/17/06, Nick Coghlan wrote:
> One of the biggest issues I have with the current AST is that I don't believe
> it really gets the "slice" and "extended slice" terminology correct (it uses
> 'extended slice' to refer to multi-dimensional indexing, but the normal
> meaning of that phrase is to refer to the use of a step argument for a slice [1])

The two were introduced together and were referred to together as
"extended slicing" at the time, so I'm not sure who is confused.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ncoghlan at gmail.com Thu Sep 21 12:22:27 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 21 Sep 2006 20:22:27 +1000
Subject: [Python-Dev] Removing __del__
In-Reply-To:
References: <20060919053609.vp8duwukq7sw4w48@login.werra.lunarpages.com>
 <451123A2.7040701@gmail.com>
Message-ID: <451267E3.60405@gmail.com>

Adding pydev back in, since these seem like reasonable questions to me :)

Jim Jewett wrote:
> On 9/20/06, Nick Coghlan wrote:
>> # Create a class with the same instance attributes
>> # as the original
>> class attr_holder(object):
>>     pass
>> finalizer_arg = attr_holder()
>> finalizer_arg.__dict__ = self.__dict__
>
> Does this really work?

It works for normal user-defined classes at least:

>>> class C1(object):
...     pass
...
>>> class C2(object):
...     pass
...
>>> a = C1()
>>> b = C2()
>>> b.__dict__ = a.__dict__
>>> a.x = 1
>>> b.x
1

> (1) for classes with a dictproxy of some sort, you might get either a
> copy (which isn't updated)

Classes that change the way __dict__ is handled would probably need to define
their own __del_arg__.

> (2) for other classes, self might be added to the dict later

Yeah, that's the strongest argument I know of against having that default
fallback - it can easily lead to a strong reference from sys.finalizers into an
otherwise unreachable cycle. I believe it currently takes two __del__ methods to
prevent a cycle from being collected, whereas in this setup it would only take
one.

OTOH, fixing it would be much easier than it is now (by setting __del_arg__ to
something that holds only the subset of attributes that require finalization).

> and of course, if it isn't added later, then it doesn't have the full
> power of current finalizers -- just the __close__ subset.

True, but most finalizers I've seen don't really *need* the full power of the
current __del__. They only need to get at a couple of their internal members in
order to explicitly release external resources.

And more sophisticated usage is still possible by assigning an appropriate value
to __del_arg__.

Cheers,
Nick.
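
(An aside for anyone who wants to try the default fallback discussed above. This is only a minimal sketch: __del_arg__ is a proposal in this thread, not an existing Python feature, and the helper names make_finalizer_arg and make_subset_arg are hypothetical. It shows that a holder sharing the instance dict also picks up a later reference back to the original object, while a holder built from a chosen subset of attributes does not.)

```python
class AttrHolder(object):
    """Plain object used only to carry attributes for a finalizer."""
    pass


def make_finalizer_arg(obj):
    # Hypothetical helper: share obj's instance dict with a separate holder,
    # in the style of the default fallback quoted above.
    holder = AttrHolder()
    holder.__dict__ = obj.__dict__
    return holder


def make_subset_arg(obj, names):
    # Hypothetical alternative: copy only the attributes that need
    # finalization, so the holder cannot pick up later additions.
    holder = AttrHolder()
    for name in names:
        setattr(holder, name, getattr(obj, name))
    return holder


class Resource(object):
    def __init__(self, handle):
        self.handle = handle


r = Resource("fd-3")
shared = make_finalizer_arg(r)
subset = make_subset_arg(r, ["handle"])

# The object is added to its own instance dict after the holders exist.
r.owner = r

print(shared.__dict__ is r.__dict__)  # the shared holder sees every change
print(hasattr(shared, "owner"))       # ... including the reference back to r
print(hasattr(subset, "owner"))       # the subset holder does not
```

With the shared dict, anything that keeps the holder alive also keeps r alive through holder.owner, which is the cycle problem described in point (2); the subset holder avoids that at the cost of listing the attributes by hand.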
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at iinet.net.au Thu Sep 21 12:24:28 2006 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Thu, 21 Sep 2006 20:24:28 +1000 Subject: [Python-Dev] Removing __del__ In-Reply-To: <451267E3.60405@gmail.com> References: <20060919053609.vp8duwukq7sw4w48@login.werra.lunarpages.com> <451123A2.7040701@gmail.com> <451267E3.60405@gmail.com> Message-ID: <4512685C.3070603@iinet.net.au> Nick Coghlan wrote: > Adding pydev back in, since these seem like reasonable questions to me :) D'oh, that should have been python-3000 not python-dev :( Sorry for the noise, folks. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Thu Sep 21 13:10:24 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 21 Sep 2006 21:10:24 +1000 Subject: [Python-Dev] IronPython and AST branch In-Reply-To: References: <5b0248170609130046w4e5bd012s63ecf46cbcfb8d2b@mail.gmail.com> <450D1819.2080803@gmail.com> Message-ID: <45127320.6060308@gmail.com> Guido van Rossum wrote: > On 9/17/06, Nick Coghlan wrote: >> One of the biggest issues I have with the current AST is that I don't >> believe >> it really gets the "slice" and "extended slice" terminology correct >> (it uses >> 'extended slice' to refer to multi-dimensional indexing, but the normal >> meaning of that phrase is to refer to the use of a step argument for a >> slice [1]) > > The two were introduced together and were referred to together as > "extended slicing" at the time, so I'm not sure who is confused. 
Ah, that would explain it then - I first encountered the phrase 'extended slicing' in the context of the Python 2.3 additions to the builtin types, so I didn't realise it referred to all __getitem__ based non-mapping lookups, rather than just the start:stop:step form of slicing. Given that additional bit of history, I don't think changing the name of the AST node is worth the hassle - I'll just have to recalibrate my brain :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From anthony at interlink.com.au Thu Sep 21 13:12:03 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu, 21 Sep 2006 21:12:03 +1000 Subject: [Python-Dev] release25-maint is UNFROZEN Message-ID: <200609212112.04923.anthony@interlink.com.au> Ok - it's been 48 hours, and I've not seen any brown-paper-bag bugs, so I'm declaring the 2.5 maintenance branch open for business. As specified in PEP-006, this is a maintenance branch only suitable for bug fixes. No functionality changes should be checked in without discussion and agreement on python-dev first. Thanks to everyone for helping make 2.5 happen. It's been a long slog there, but I think we can all be proud of the result. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From g.brandl at gmx.net Thu Sep 21 14:31:24 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 21 Sep 2006 14:31:24 +0200 Subject: [Python-Dev] GCC 4.x incompatibility Message-ID: Is it noted somewhere that building Python with GCC 4.x results in problems such as abs(-sys.maxint-1) being negative? I think this is something users may want to know. Perhaps the "Known Bugs" page at http://www.python.org/download/releases/2.5/bugs/ is the right place to put this info. 
Georg

From arigo at tunes.org Thu Sep 21 14:35:11 2006
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 21 Sep 2006 14:35:11 +0200
Subject: [Python-Dev] release25-maint is UNFROZEN
In-Reply-To: <200609212112.04923.anthony@interlink.com.au>
References: <200609212112.04923.anthony@interlink.com.au>
Message-ID: <20060921123510.GA22457@code0.codespeak.net>

Hi Anthony,

On Thu, Sep 21, 2006 at 09:12:03PM +1000, Anthony Baxter wrote:
> Thanks to everyone for helping make 2.5 happen. It's been a long slog there,
> but I think we can all be proud of the result.

Thanks for the hassle! I've got another bit of it for you, though. The
frozen 2.5 documentation doesn't seem to be available on-line. At
least, the doc links from the release page point to the 'dev' 2.6a0
version, and the URL following the common scheme -
http://www.python.org/doc/2.5/ - doesn't work.

A bientot,

Armin

From steve at holdenweb.com Thu Sep 21 14:47:57 2006
From: steve at holdenweb.com (Steve Holden)
Date: Thu, 21 Sep 2006 08:47:57 -0400
Subject: [Python-Dev] New relative import issue
In-Reply-To:
References: <20060918091314.GA26814@code0.codespeak.net>
 <450F6833.60603@canterbury.ac.nz> <45108611.7090009@canterbury.ac.nz>
Message-ID:

Guido van Rossum wrote:
> On 9/19/06, Greg Ewing wrote:
>
>>I haven't really thought it through in detail. It
>>just seems as though it would be a lot less confusing
>>if you could figure out from static information which
>>module will get imported by a given import statement,
>>instead of having it depend on the history of run-time
>>modifications to sys.path. One such kind of static
>>information is the layout of the filesystem.
>
>
> Eek? If there are two third-party top-level packages A and B, by
> different third parties, and A depends on B, how should A find B if
> not via sys.path or something that is sufficiently equivalent as to
> have the same problems? Surely every site shouldn't be required to
> install A and B in the same location (or in the same location relative
> to each other).
>
> I sympathize with the problems that exist with the current import
> mechanism, really, I do. Google feels the pain every day (alas,
> Google's requirements are a bit unusual, so they alone can't provide
> much guidance for a solution). But if you combine the various
> requirements: zip imports, import hooks of various sorts, different
> permissions for the owners of different packages that must cooperate,
> versioning issues (Python versions as well as package versions),
> forwards compatibility, backwards compatibility, ease of development,
> ease of packaging, ease of installation, supporting the conventions of
> vastly different platforms, data files mixed in with the source code
> (sometimes with their own search path), and probably several other
> requirements that I'm forgetting right now, it's just not an easy
> problem.
>
But you're the BDFL! You mean to tell me there are some problems you
can't solve?!?!?!?!?

shocked-and-amazed-ly y'rs - steve
--
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb       http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden

From rasky at develer.com Thu Sep 21 15:05:38 2006
From: rasky at develer.com (Giovanni Bajo)
Date: Thu, 21 Sep 2006 15:05:38 +0200
Subject: [Python-Dev] New relative import issue
References: <20060918091314.GA26814@code0.codespeak.net>
 <450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru>
Message-ID: <05af01c6dd7e$a2209560$e303030a@trilan>

Oleg Broytmann wrote:

>> There really shouldn't be
>> any such thing as sys.path -- the view that any
>> given module has of the package namespace should
>> depend only on where it is
>
> I do not understand this. Can you show an example? Imagine I have
> two servers, Linux and FreeBSD, and on Linux python is in /usr/bin,
> home is /home/phd, on BSD these are /usr/local/bin and /usr/home/phd.
> I have some modules in site-packages and some modules in
> $HOME/lib/python. How can I move programs from one server to the
> other without rewriting them (how can I not to put full paths to
> modules)? I use PYTHONPATH manipulation - it's enough to write a shell
> script that starts daemons once and use it for many years. How can I
> do this without sys.path?!

My idea (and interpretation of Greg's statement) is that a module/package
should be able to live with either relative imports within itself, or fully
absolute imports. No sys.path *hackery* should ever be necessary to access
modules in sibling namespaces. Either it's an absolute import, or a relative
(internal) import. A sibling import is a symptom of wrong design of the
packages.

This is how I usually design my packages at least. There might be valid use
cases for doing sys.path hackery, but I have yet to find them.
--
Giovanni Bajo

From theller at python.net Thu Sep 21 15:24:52 2006
From: theller at python.net (Thomas Heller)
Date: Thu, 21 Sep 2006 15:24:52 +0200
Subject: [Python-Dev] Small Py3k task: fix modulefinder.py
In-Reply-To:
References:
Message-ID:

Guido van Rossum schrieb:
> Is anyone familiar enough with modulefinder.py to fix its breakage in
> Py3k? It chokes in a nasty way (exceeding the recursion limit) on the
> relative import syntax. I suspect this is also a problem for 2.5, when
> people use that syntax; hence the cross-post. There's no unittest for
> modulefinder.py, but I believe py2exe depends on it (and of course
> freeze.py, but who uses that still?)
>

I'm not (yet) using relative imports in 2.5 or Py3k, but have not been able
to reproduce the recursion limit problem. Can you describe the package
that fails?
Thanks, Thomas From gustavo at niemeyer.net Thu Sep 21 15:42:49 2006 From: gustavo at niemeyer.net (Gustavo Niemeyer) Date: Thu, 21 Sep 2006 10:42:49 -0300 Subject: [Python-Dev] dict.discard Message-ID: <20060921134249.GA9238@niemeyer.net> Hey guys, After trying to use it a few times with no success :-), I'd like to include a new method, dict.discard, mirroring set.discard: >>> print set.discard.__doc__ Remove an element from a set if it is a member. If the element is not a member, do nothing. Comments? -- Gustavo Niemeyer http://niemeyer.net From fdrake at acm.org Thu Sep 21 15:49:25 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu, 21 Sep 2006 09:49:25 -0400 Subject: [Python-Dev] dict.discard In-Reply-To: <20060921134249.GA9238@niemeyer.net> References: <20060921134249.GA9238@niemeyer.net> Message-ID: <200609210949.25822.fdrake@acm.org> On Thursday 21 September 2006 09:42, Gustavo Niemeyer wrote: > After trying to use it a few times with no success :-), I'd like > > to include a new method, dict.discard, mirroring set.discard: > >>> print set.discard.__doc__ > > Remove an element from a set if it is a member. > > If the element is not a member, do nothing. Would the argument be the key, or the pair? I'd guess the key. If so, there's the 2-arg flavor of dict.pop(): >>> d = {} >>> d.pop("key", None) It's not terribly obvious, but does the job without enlarging the dict API. -Fred -- Fred L. Drake, Jr. From gustavo at niemeyer.net Thu Sep 21 16:07:04 2006 From: gustavo at niemeyer.net (Gustavo Niemeyer) Date: Thu, 21 Sep 2006 11:07:04 -0300 Subject: [Python-Dev] dict.discard In-Reply-To: <200609210949.25822.fdrake@acm.org> References: <20060921134249.GA9238@niemeyer.net> <200609210949.25822.fdrake@acm.org> Message-ID: <20060921140704.GA10159@niemeyer.net> > Would the argument be the key, or the pair? I'd guess the key. Right, the key. 
> If so, there's the 2-arg flavor of dict.pop():
>
>     >>> d = {}
>     >>> d.pop("key", None)
>
> It's not terribly obvious, but does the job without enlarging
> the dict API.

Yeah, this looks good. I don't think I've ever used it like this.

--
Gustavo Niemeyer
http://niemeyer.net

From guido at python.org Thu Sep 21 16:22:04 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 21 Sep 2006 07:22:04 -0700
Subject: [Python-Dev] New relative import issue
In-Reply-To: <05af01c6dd7e$a2209560$e303030a@trilan>
References: <20060918091314.GA26814@code0.codespeak.net>
 <450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru>
 <05af01c6dd7e$a2209560$e303030a@trilan>
Message-ID:

On 9/21/06, Giovanni Bajo wrote:
> >> Greg Ewing wrote:
> >> There really shouldn't be
> >> any such thing as sys.path -- the view that any
> >> given module has of the package namespace should
> >> depend only on where it is

> My idea (and interpretation of Greg's statement) is that a module/package
> should be able to live with either relative imports within itself, or fully
> absolute imports. No sys.path *hackery* should ever be necessary to access
> modules in sibling namespaces. Either it's an absolute import, or a relative
> (internal) import. A sibling import is a symptom of wrong design of the
> packages.
>
> This is how I usually design my packages at least. There might be valid use
> cases for doing sys.path hackery, but I have yet to find them.

While I agree with your idea(l), I don't think that's what Greg meant.
He clearly says "sys.path should not exist at all".

I do think it's fair to use sibling imports (using from ..sibling
import module) from inside subpackages of the same package; I couldn't
tell if you were against that or not.

sys.path exists to stitch together the toplevel module/package
namespace from diverse sources.
Import hooks and sys.path hackery exist so that module/package sources
don't have to be restricted to the filesystem (as well as to allow
unbridled experimentation by those so inclined :-).

I think one missing feature is a mechanism whereby you can say "THIS
package (gives top-level package name) lives HERE (gives filesystem
location of package)" without adding the parent of HERE to sys.path
for all module searches. I think Phillip Eby's egg system might
benefit from this.

Another missing feature is a mechanism whereby you can use a
particular file as the main script without adding its directory to the
front of sys.path.

Other missing features have to do with versioning constraints. But
that quickly gets extremely messy.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From mak at trisoft.com.pl Thu Sep 21 18:23:11 2006
From: mak at trisoft.com.pl (Grzegorz Makarewicz)
Date: Thu, 21 Sep 2006 18:23:11 +0200
Subject: [Python-Dev] win32 - results from Lib/test - 2.5 release-maint
Message-ID: <4512BC6F.3090907@trisoft.com.pl>

Hi,

- *.txt files for unicode tests are downloaded from internet - I don't
like this.
- __db.004 isn't removed after tests
- init_types is declared static in python/python-ast.c and can't be
imported from PC/config.c.
- python_d -u regrtest.py -u bsddb -u curses -uall -v = dies after
testInfinitRecursion without any message, just disappears from tasks
and doesn't show anything

mak

From martin at v.loewis.de Thu Sep 21 20:13:12 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 21 Sep 2006 20:13:12 +0200
Subject: [Python-Dev] GCC 4.x incompatibility
In-Reply-To:
References:
Message-ID: <4512D638.3040909@v.loewis.de>

Georg Brandl schrieb:
> Is it noted somewhere that building Python with GCC 4.x results in
> problems such as abs(-sys.maxint-1) being negative?

Yes, it's in the README (although it claims problems only exist with
4.1 and 4.2; 4.0 seems to work fine for me).
> I think this is something users may want to know.

See what I wrote. Users are advised to either not use that compiler,
or add -fwrapv.

Regards,
Martin

From brett at python.org Thu Sep 21 20:19:13 2006
From: brett at python.org (Brett Cannon)
Date: Thu, 21 Sep 2006 11:19:13 -0700
Subject: [Python-Dev] win32 - results from Lib/test - 2.5 release-maint
In-Reply-To: <4512BC6F.3090907@trisoft.com.pl>
References: <4512BC6F.3090907@trisoft.com.pl>
Message-ID:

On 9/21/06, Grzegorz Makarewicz wrote:
>
> Hi,
>
> - *.txt files for unicode tests are downloaded from internet - I don't
> like this.

Then don't use the urlfetch resource when running regrtest.py (which you did
specify when you ran with ``-uall``).

> - __db.004 isn't removed after tests
> - init_types is declared static in python/python-ast.c and can't be
> imported from PC/config.c.
> - python_d -u regrtest.py -u bsddb -u curses -uall -v = dies after
> testInfinitRecursion without any message, just disappears from tasks
> and doesn't show anything

Please file a bug report for each of these issues.

-Brett

From martin at v.loewis.de Thu Sep 21 20:20:55 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 21 Sep 2006 20:20:55 +0200
Subject: [Python-Dev] win32 - results from Lib/test - 2.5 release-maint
In-Reply-To: <4512BC6F.3090907@trisoft.com.pl>
References: <4512BC6F.3090907@trisoft.com.pl>
Message-ID: <4512D807.5060501@v.loewis.de>

Please submit a patch to sf.net/projects/python.

> - *.txt files for unicode tests are downloaded from internet - I don't
> like this.

What files specifically? Could it be that you passed -u urlfetch or -u all?
If so, then just don't.

> - init_types is declared static in python/python-ast.c and can't be
> imported from PC/config.c.

Why is that a problem?
PC/config.c refers to Modules/_typesmodule.c.

Regards,
Martin

From nnorwitz at gmail.com Thu Sep 21 21:17:09 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 21 Sep 2006 12:17:09 -0700
Subject: [Python-Dev] release25-maint is UNFROZEN
In-Reply-To: <20060921123510.GA22457@code0.codespeak.net>
References: <200609212112.04923.anthony@interlink.com.au>
 <20060921123510.GA22457@code0.codespeak.net>
Message-ID:

On 9/21/06, Armin Rigo wrote:
> Hi Anthony,
>
> On Thu, Sep 21, 2006 at 09:12:03PM +1000, Anthony Baxter wrote:
> > Thanks to everyone for helping make 2.5 happen. It's been a long slog there,
> > but I think we can all be proud of the result.
>
> Thanks for the hassle! I've got another bit of it for you, though. The
> frozen 2.5 documentation doesn't seem to be available on-line. At
> least, the doc links from the release page point to the 'dev' 2.6a0
> version, and the URL following the common scheme -
> http://www.python.org/doc/2.5/ - doesn't work.

I got http://docs.python.org/dev/2.5/ working last night. So when the
2.5 docs are updated these pages will reflect that.

http://docs.python.org/ should point to the 2.5 doc too. I looked at
making these changes, but was confused by what needed to be done.

n

From kbk at shore.net Thu Sep 21 22:15:24 2006
From: kbk at shore.net (Kurt B. Kaiser)
Date: Thu, 21 Sep 2006 16:15:24 -0400 (EDT)
Subject: [Python-Dev] Weekly Python Patch/Bug Summary ** REVISED **
Message-ID: <200609212015.k8LKFONN031921@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  420 open ( +4) /  3410 closed ( +2) /  3830 total ( +6)
Bugs    :  915 open (+17) /  6186 closed ( +6) /  7101 total (+23)
RFE     :  235 open ( +1) /   238 closed ( +0) /   473 total ( +1)

New / Reopened Patches
______________________

Practical ctypes example  (2006-09-15)
       http://python.org/sf/1559219  opened by  leppton

test_popen fails on Windows if installed to "Program Files"  (2006-09-15)
       http://python.org/sf/1559298  opened by  Martin v. Löwis

test_cmd_line fails on Windows  (2006-09-15)
       http://python.org/sf/1559413  opened by  Martin v. Löwis

pyclbr reports different module for Class and Function  (2006-09-18)
       http://python.org/sf/1560617  opened by  Peter Otten

Exec stacks in python 2.5  (2006-09-18)
       http://python.org/sf/1560695  opened by  Chaza

Python 2.5 fails with -Wl,--as-needed in LDFLAGS  (2006-09-21)
       http://python.org/sf/1562825  opened by  Chaza

Patches Closed
______________

test_grp.py doesn't skip special NIS entry, fails  (2006-06-22)
       http://python.org/sf/1510987  closed by  martineau

Add RLIMIT_SBSIZE to resource module  (2006-09-13)
       http://python.org/sf/1557515  closed by  loewis

New / Reopened Bugs
___________________

some section links (previous, up, next) missing last word  (2006-09-15)
       http://python.org/sf/1559142  opened by  Tim Smith

time.strptime() access non exitant attribute in calendar.py  (2006-09-15)
CLOSED http://python.org/sf/1559515  opened by  betatim

shutil.copyfile incomplete on NTFS  (2006-09-16)
       http://python.org/sf/1559684  opened by  Roger Upole

gcc trunk (4.2) exposes a signed integer overflows  (2006-08-24)
       http://python.org/sf/1545668  reopened by  arigo

2.5c2 pythonw does not execute  (2006-09-16)
CLOSED http://python.org/sf/1559747  opened by  Ron Platten

list.sort does nothing when both cmp and key are given  (2006-09-16)
CLOSED http://python.org/sf/1559818  opened by  Marcin 'Qrczak' Kowalczyk

confusing error msg from random.randint  (2006-09-17)
       http://python.org/sf/1560032  opened by  paul rubin

Tutorial: incorrect info about package importing and mac  (2006-09-17)
       http://python.org/sf/1560114  opened by  C L

Better/faster implementation of os.path.split  (2006-09-17)
CLOSED http://python.org/sf/1560161  opened by  Michael Gebetsroither

Better/faster implementation of os.path.basename/dirname  (2006-09-17)
       http://python.org/sf/1560179  reopened by  gbrandl

Better/faster implementation of os.path.basename/dirname  (2006-09-17)
       http://python.org/sf/1560179  opened by  Michael Gebetsroither

copy() method of dictionaries is not "deep"  (2006-09-17)
       http://python.org/sf/1560327  reopened by  gbrandl

copy() method of dictionaries is not "deep"  (2006-09-17)
       http://python.org/sf/1560327  opened by  daniel hahler

strftime('%z') behaving differently with/without time arg.  (2006-09-18)
       http://python.org/sf/1560794  opened by  Knut Aksel Røysland

python 2.5 fails to build with --as-needed  (2006-09-18)
       http://python.org/sf/1560984  opened by  Chaza

mac installer profile patch vs. .bash_login  (2006-09-19)
       http://python.org/sf/1561243  opened by  Ronald Oussoren

-xcode=pic32 option is not supported on Solaris x86 Sun C  (2006-09-19)
       http://python.org/sf/1561333  opened by  James Lick

Dedent with Italian keyboard  (2006-09-20)
       http://python.org/sf/1562092  opened by  neclepsio

Fails to install on Fedora Core 5  (2006-09-20)
       http://python.org/sf/1562171  opened by  Mark Summerfield

IDLE Hung up after open script by command line...  (2006-09-20)
       http://python.org/sf/1562193  opened by  Faramir^

uninitialized memory read in parsetok()  (2006-09-20)
       http://python.org/sf/1562308  opened by  Luke Moore

asyncore.dispatcher.set_reuse_addr not documented.
(2006-09-20) http://python.org/sf/1562583 opened by Noah Spurrier Spurious tab/space warning (2006-09-21) http://python.org/sf/1562716 opened by torhu Spurious Tab/space error (2006-09-21) http://python.org/sf/1562719 opened by torhu decimal module borks thread (2006-09-21) http://python.org/sf/1562822 opened by Jaster MacPython ignores user-installed Tcl/Tk (2006-09-21) http://python.org/sf/1563046 opened by Russell Owen code.InteractiveConsole() and closed sys.stdout (2006-09-21) http://python.org/sf/1563079 opened by Skip Montanaro Bugs Closed ___________ 2.5c2 macosx installer aborts during "GUI Applications" (2006-09-15) http://python.org/sf/1558983 closed by ronaldoussoren time.strptime() access non existant attribute in calendar.py (2006-09-15) http://python.org/sf/1559515 closed by bcannon 2.5c2 pythonw does not execute (2006-09-16) http://python.org/sf/1559747 closed by loewis list.sort does nothing when both cmp and key are given (2006-09-16) http://python.org/sf/1559818 closed by qrczak Better/faster implementation of os.path.split (2006-09-17) http://python.org/sf/1560161 deleted by einsteinmg Better/faster implementation of os.path.basename/dirname (2006-09-17) http://python.org/sf/1560179 deleted by einsteinmg copy() method of dictionaries is not "deep" (2006-09-17) http://python.org/sf/1560327 closed by gbrandl UCS-4 tcl not found on SUSE 10.1 with tcl and tk 8.4.12-14 (2006-09-02) http://python.org/sf/1550791 closed by loewis New / Reopened RFE __________________ Exception need structured information associated with them (2006-09-15) http://python.org/sf/1559549 opened by Ned Batchelder String searching performance improvement (2006-09-19) CLOSED http://python.org/sf/1561634 opened by Nick Welch RFE Closed __________ String searching performance improvement (2006-09-19) http://python.org/sf/1561634 deleted by mackstann From p.f.moore at gmail.com Thu Sep 21 22:22:55 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 21 Sep 2006 21:22:55 +0100 
Subject: [Python-Dev] New relative import issue In-Reply-To: References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru> <05af01c6dd7e$a2209560$e303030a@trilan> Message-ID: <79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com> On 9/21/06, Guido van Rossum wrote: > I think one missing feature is a mechanism whereby you can say "THIS > package (gives top-level package name) lives HERE (gives filesystem > location of package)" without adding the parent of HERE to sys.path > for all module searches. I think Phillip Eby's egg system might > benefit from this. This is pretty easy to do with a custom importer on sys.meta_path. Getting the details right is a touch fiddly, but it's conceptually straightforward. Hmm, I might play with this - a set of PEP 302 importers to completely customise the import mechanism. The never-completed "phase 2" of the PEP included a reimplementation of the built in import mechanism as hooks. Is there any interest in this actually happening? I've been looking for an interesting coding project for a while (although I never have any free time...) Paul. From guido at python.org Thu Sep 21 22:54:45 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 21 Sep 2006 13:54:45 -0700 Subject: [Python-Dev] New relative import issue In-Reply-To: <79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com> References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru> <05af01c6dd7e$a2209560$e303030a@trilan> <79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com> Message-ID: On 9/21/06, Paul Moore wrote: > On 9/21/06, Guido van Rossum wrote: > > I think one missing feature is a mechanism whereby you can say "THIS > > package (gives top-level package name) lives HERE (gives filesystem > > location of package)" without adding the parent of HERE to sys.path > > for all module searches. 
I think Phillip Eby's egg system might > > benefit from this. > > This is pretty easy to do with a custom importer on sys.meta_path. > Getting the details right is a touch fiddly, but it's conceptually > straightforward. Isn't the main problem how to specify a bunch of these in the environment? Or can this be done through .pkg files? Those aren't cheap either though -- it would be best if the work was only done when the package is actually needed. > Hmm, I might play with this - a set of PEP 302 importers to completely > customise the import mechanism. The never-completed "phase 2" of the > PEP included a reimplementation of the built in import mechanism as > hooks. Is there any interest in this actually happening? I've been > looking for an interesting coding project for a while (although I > never have any free time...) There's a general desire to reimplement import entirely in Python for more flexibility. I believe Brett Cannon is working on this. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From p.f.moore at gmail.com Thu Sep 21 23:23:08 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 21 Sep 2006 22:23:08 +0100 Subject: [Python-Dev] New relative import issue In-Reply-To: References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru> <05af01c6dd7e$a2209560$e303030a@trilan> <79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com> Message-ID: <79990c6b0609211423o72fa04fcgbdfb48655c5ab777@mail.gmail.com> On 9/21/06, Guido van Rossum wrote: > On 9/21/06, Paul Moore wrote: > > On 9/21/06, Guido van Rossum wrote: > > > I think one missing feature is a mechanism whereby you can say "THIS > > > package (gives top-level package name) lives HERE (gives filesystem > > > location of package)" without adding the parent of HERE to sys.path > > > for all module searches. I think Phillip Eby's egg system might > > > benefit from this. 
> > > > This is pretty easy to do with a custom importer on sys.meta_path. > > Getting the details right is a touch fiddly, but it's conceptually > > straightforward. > > Isn't the main problem how to specify a bunch of these in the > environment? Or can this be done through .pkg files? Those aren't > cheap either though -- it would be best if the work was only done when > the package is actually needed. Hmm, I wasn't thinking of the environment. I pretty much never use PYTHONPATH, so I tend to forget about that aspect. I was assuming an importer object with a "register(package_name, filesystem_path)" method. Then register the packages you want in your code, or in site.py. I've attached a trivial proof of concept. But yes, you'd need to consider the environment. Possibly just have an initialisation function called at load time (I'm assuming the new hook is defined in a system module of some sort - I mean when that system module is loaded) which parses an environment variable and issues a set of register() calls. Paul. -------------- next part -------------- A non-text attachment was scrubbed... Name: imphook.py Type: text/x-python Size: 1247 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20060921/56e5b9b7/attachment.py From pje at telecommunity.com Thu Sep 21 23:23:13 2006 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Thu, 21 Sep 2006 17:23:13 -0400 Subject: [Python-Dev] New relative import issue In-Reply-To: References: <79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com> <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru> <05af01c6dd7e$a2209560$e303030a@trilan> <79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com> Message-ID: <5.1.1.6.0.20060921170416.027233b8@sparrow.telecommunity.com> At 01:54 PM 9/21/2006 -0700, Guido van Rossum wrote: >On 9/21/06, Paul Moore wrote: > > On 9/21/06, Guido van Rossum wrote: > > > I think one missing feature is a mechanism whereby you can say "THIS > > > package (gives top-level package name) lives HERE (gives filesystem > > > location of package)" without adding the parent of HERE to sys.path > > > for all module searches. I think Phillip Eby's egg system might > > > benefit from this. > > > > This is pretty easy to do with a custom importer on sys.meta_path. > > Getting the details right is a touch fiddly, but it's conceptually > > straightforward. > >Isn't the main problem how to specify a bunch of these in the >environment? Yes, that's exactly the problem, assuming that by environment you mean the operating environment, as opposed to e.g. os.environ. (Environment variables are problematic for installation purposes, as on Unix-y systems there is no one obvious place to set them, and on Windows, the one obvious place is one that the user may have no permissions for!) 
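[Editorial sketch] The "custom importer on sys.meta_path" with a "register(package_name, filesystem_path)" method, as proposed in this thread, might look roughly like the following. This is not the scrubbed imphook.py attachment. It is a minimal sketch that assumes today's importlib API (find_spec), which postdates this discussion; PEP 302 hooks at the time used find_module/load_module. The names RegisteredPackageFinder and finder are hypothetical.

```python
import importlib.abc
import importlib.util
import os
import sys


class RegisteredPackageFinder(importlib.abc.MetaPathFinder):
    """Hypothetical finder: maps top-level package names to directories."""

    def __init__(self):
        self._locations = {}

    def register(self, package_name, filesystem_path):
        # "THIS package lives HERE", without touching sys.path.
        self._locations[package_name] = os.path.abspath(filesystem_path)

    def find_spec(self, fullname, path=None, target=None):
        location = self._locations.get(fullname)
        if location is None:
            # Not a registered top-level package. Submodules are found by
            # the normal machinery through the package's __path__.
            return None
        init_py = os.path.join(location, "__init__.py")
        if not os.path.isfile(init_py):
            return None
        return importlib.util.spec_from_file_location(
            fullname, init_py, submodule_search_locations=[location]
        )


# Install one shared finder ahead of the default path-based finder.
finder = RegisteredPackageFinder()
sys.meta_path.insert(0, finder)
```

Calling finder.register("somepkg", "/some/dir") would then make "import somepkg" load from that directory only, leaving other module searches unaffected. It does nothing about the objection raised in the thread: an installer still needs some data file or other out-of-process place to record these registrations.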
From grig.gheorghiu at gmail.com Thu Sep 21 23:34:40 2006 From: grig.gheorghiu at gmail.com (Grig Gheorghiu) Date: Thu, 21 Sep 2006 14:34:40 -0700 Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine Message-ID: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com> One of the Pybots buildslaves has been failing the 'test' step, with the culprit being test_itertools: test_itertools test test_itertools failed -- Traceback (most recent call last): File "/Users/builder/pybots/pybot/trunk.osaf-x86/build/Lib/test/test_itertools.py", line 62, in test_count self.assertEqual(repr(c), 'count(-9)') AssertionError: 'count(4294967287)' != 'count(-9)' This started to happen after . The complete log for the test step on that buildslave is here: http://www.python.org/dev/buildbot/community/all/x86%20OSX%20trunk/builds/19/step-test/0 Grig -- http://agiletesting.blogspot.com From p.f.moore at gmail.com Thu Sep 21 23:39:05 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 21 Sep 2006 22:39:05 +0100 Subject: [Python-Dev] New relative import issue In-Reply-To: <5.1.1.6.0.20060921170416.027233b8@sparrow.telecommunity.com> References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru> <05af01c6dd7e$a2209560$e303030a@trilan> <79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com> <5.1.1.6.0.20060921170416.027233b8@sparrow.telecommunity.com> Message-ID: <79990c6b0609211439q65fba28cu31611fa7291eabf1@mail.gmail.com> On 9/21/06, Phillip J. Eby wrote: > >Isn't the main problem how to specify a bunch of these in the > >environment? > > Yes, that's exactly the problem, assuming that by environment you mean the > operating environment, as opposed to e.g. os.environ. Hmm, now I don't understand again. What "operating environment" might there be, other than - os.environ - code that gets executed before the import ? 
There are clearly application design issues involved here (application configuration via initialisation files, plugin registries, etc etc). But in purely technical terms, don't they all boil down to executing a registration function (either directly by the user, or by the application on behalf of the user)? I don't think I'd expect anything at the language (or base library) level beyond a registration function and possibly an OS environment check. Paul. From jackdied at jackdied.com Thu Sep 21 23:50:19 2006 From: jackdied at jackdied.com (Jack Diederich) Date: Thu, 21 Sep 2006 17:50:19 -0400 Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine In-Reply-To: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com> References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com> Message-ID: <20060921215019.GA6677@performancedrivers.com> The python binary is out of step with the test_itertools.py version. You can generate this same error on your own box by reverting the change to itertoolsmodule.c but leaving the new test in test_itertools.py I don't know why this only happened on that OSX buildslave On Thu, Sep 21, 2006 at 02:34:40PM -0700, Grig Gheorghiu wrote: > One of the Pybots buildslaves has been failing the 'test' step, with > the culprit being test_itertools: > > test_itertools > test test_itertools failed -- Traceback (most recent call last): > File > "/Users/builder/pybots/pybot/trunk.osaf-x86/build/Lib/test/test_itertools.py", > line 62, in test_count > self.assertEqual(repr(c), 'count(-9)') > AssertionError: 'count(4294967287)' != 'count(-9)' > > This started to happen after > . 
> > The complete log for the test step on that buildslave is here: > > http://www.python.org/dev/buildbot/community/all/x86%20OSX%20trunk/builds/19/step-test/0 > > Grig > > > -- > http://agiletesting.blogspot.com > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jackdied%40jackdied.com > From brett at python.org Thu Sep 21 23:55:52 2006 From: brett at python.org (Brett Cannon) Date: Thu, 21 Sep 2006 14:55:52 -0700 Subject: [Python-Dev] New relative import issue In-Reply-To: References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru> <05af01c6dd7e$a2209560$e303030a@trilan> <79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com> Message-ID: On 9/21/06, Guido van Rossum wrote: > > On 9/21/06, Paul Moore wrote: > [SNIP] > > Hmm, I might play with this - a set of PEP 302 importers to completely > > customise the import mechanism. The never-completed "phase 2" of the > > PEP included a reimplementation of the built in import mechanism as > > hooks. Is there any interest in this actually happening? I've been > > looking for an interesting coding project for a while (although I > > never have any free time...) > > There's a general desire to reimplement import entirely in Python for > more flexibility. I believe Brett Cannon is working on this. Since I need to control imports to the point of being able to deny importing built-in and extension modules, I was planning on re-implementing the import system to use PEP 302 importers. Possibly do it in pure Python for ease-of-use. Then that can be worked off of for possible Py3K improvements to the import system. But either way I will be messing with the import system in the relatively near future. 
If you want to help, Paul (or anyone else), just send me an email and we can try to coordinate something (plan to do the work in the sandbox as a separate thing from my security stuff). -Brett From grig.gheorghiu at gmail.com Fri Sep 22 00:28:04 2006 From: grig.gheorghiu at gmail.com (Grig Gheorghiu) Date: Thu, 21 Sep 2006 15:28:04 -0700 Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine In-Reply-To: <20060921215019.GA6677@performancedrivers.com> References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com> <20060921215019.GA6677@performancedrivers.com> Message-ID: <3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com> On 9/21/06, Jack Diederich wrote: > The python binary is out of step with the test_itertools.py version. > You can generate this same error on your own box by reverting the > change to itertoolsmodule.c but leaving the new test in test_itertools.py > > I don't know why this only happened on that OSX buildslave Not sure what you mean by out of step. The binary was built out of the very latest itertoolsmodule.c, and test_itertools.py was also updated from svn. So they're both in sync IMO.
That test passes successfully on all the other buildslaves in the Pybots farm (x86 Ubuntu, Debian, Gentoo, RH9, AMD-64 Ubuntu) Grig From guido at python.org Fri Sep 22 00:28:43 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 21 Sep 2006 15:28:43 -0700 Subject: [Python-Dev] New relative import issue In-Reply-To: <79990c6b0609211423o72fa04fcgbdfb48655c5ab777@mail.gmail.com> References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru> <05af01c6dd7e$a2209560$e303030a@trilan> <79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com> <79990c6b0609211423o72fa04fcgbdfb48655c5ab777@mail.gmail.com> Message-ID: On 9/21/06, Paul Moore wrote: > On 9/21/06, Guido van Rossum wrote: > > Isn't the main problem how to specify a bunch of these in the > > environment? Or can this be done through .pkg files? Those aren't > > cheap either though -- it would be best if the work was only done when > > the package is actually needed. > > Hmm, I wasn't thinking of the environment. I pretty much never use > PYTHONPATH, so I tend to forget about that aspect. As Phillip understood, I meant the environment to include the filesystem (and on Windows, the registry -- in fact, Python on Windows *has* exactly such a mechanism in the registry, although I believe it's rarely used these days -- it was done by Mark Hammond to support COM servers I believe.) > I was assuming an > importer object with a "register(package_name, filesystem_path)" > method. Then register the packages you want in your code, or in > site.py. Neither is an acceptable method for an installer tool (e.g. eggs) to register new packages. It needs to be some kind of data file or set of data files. > But yes, you'd need to consider the environment.
Possibly just have an > initialisation function called at load time (I'm assuming the new hook > is defined in a system module of some sort - I mean when that system > module is loaded) which parses an environment variable and issues a > set of register() calls. os.environ is useless because there's no way for a package installer to set it for all users. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Fri Sep 22 00:50:28 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 21 Sep 2006 18:50:28 -0400 Subject: [Python-Dev] New relative import issue In-Reply-To: References: <79990c6b0609211423o72fa04fcgbdfb48655c5ab777@mail.gmail.com> <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru> <05af01c6dd7e$a2209560$e303030a@trilan> <79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com> <79990c6b0609211423o72fa04fcgbdfb48655c5ab777@mail.gmail.com> Message-ID: <5.1.1.6.0.20060921184915.02f4a4a8@sparrow.telecommunity.com> At 03:28 PM 9/21/2006 -0700, Guido van Rossum wrote: >os.environ is useless because there's no way for a package installer >to set it for all users. Or even for *one* user! :) From greg.ewing at canterbury.ac.nz Fri Sep 22 01:37:37 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 22 Sep 2006 11:37:37 +1200 Subject: [Python-Dev] New relative import issue In-Reply-To: References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <45108611.7090009@canterbury.ac.nz> Message-ID: <45132241.7030307@canterbury.ac.nz> Guido van Rossum wrote: > Eek? If there are two third-party top-level packages A and B, by > different third parties, and A depends on B, how should A find B if > not via sys.path or something that is sufficiently equivalent as to > have the same problems? 
Some kind of configuration mechanism is needed, but I don't see why it can't be a static, declarative one rather than computed at run time. Whoever installs package A should be responsible for setting up whatever environment is necessary around it for it to find package B and anything else it directly depends on. The program C which uses package A needs to be told where to find it. But C doesn't need to know where to find B, the dependency on which is an implementation detail of A, because A already knows where to find it. In the Eiffel world, there's a thing called an ACE (Assembly of Classes in Eiffel), which is a kind of meta-language for describing the shape of the class namespace, and it allows each source file to have its own unique view of that namespace. I'm groping towards something equivalent for the Python module namespace. (AMP - Assembly of Modules in Python?) -- Greg From greg.ewing at canterbury.ac.nz Fri Sep 22 02:07:13 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 22 Sep 2006 12:07:13 +1200 Subject: [Python-Dev] New relative import issue In-Reply-To: References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <45108611.7090009@canterbury.ac.nz> Message-ID: <45132931.1070803@canterbury.ac.nz> Another thought on static module namespace configuration: It would make things a *lot* easier for py2exe, py2app and the like that have to figure out what packages a program depends on without running the program. -- Greg From pje at telecommunity.com Fri Sep 22 02:15:41 2006 From: pje at telecommunity.com (Phillip J.
Eby) Date: Thu, 21 Sep 2006 20:15:41 -0400 Subject: [Python-Dev] New relative import issue In-Reply-To: <45132931.1070803@canterbury.ac.nz> References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <45108611.7090009@canterbury.ac.nz> Message-ID: <5.1.1.6.0.20060921201311.02735808@sparrow.telecommunity.com> At 12:07 PM 9/22/2006 +1200, Greg Ewing wrote: >Another thought on static module namespace configuration: >It would make things a *lot* easier for py2exe, py2app >and the like that have to figure out what packages >a program depends on without running the program. Setuptools users already explicitly declare what projects their projects depend on; this is how easy_install can then find and install those dependencies. So, there is at least one system already available for Python that manages this type of thing already, and my understanding is that the py2exe and py2app developers plan to support using this dependency information in the future. From guido at python.org Fri Sep 22 02:17:24 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 21 Sep 2006 17:17:24 -0700 Subject: [Python-Dev] New relative import issue In-Reply-To: <45132241.7030307@canterbury.ac.nz> References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <45108611.7090009@canterbury.ac.nz> <45132241.7030307@canterbury.ac.nz> Message-ID: On 9/21/06, Greg Ewing wrote: > Guido van Rossum wrote: > > > Eek? If there are two third-party top-level packages A and B, by > > different third parties, and A depends on B, how should A find B if > > not via sys.path or something that is sufficiently equivalent as to > > have the same problems? > > Some kind of configuration mechanism is needed, but > I don't see why it can't be a static, declarative one > rather than computed at run time. That would preclude writing the code that interprets the static data in Python itself. Despite the good use cases I think that's a big showstopper. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Fri Sep 22 02:17:24 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 22 Sep 2006 12:17:24 +1200 Subject: [Python-Dev] New relative import issue In-Reply-To: <05af01c6dd7e$a2209560$e303030a@trilan> References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru> <05af01c6dd7e$a2209560$e303030a@trilan> Message-ID: <45132B94.6050401@canterbury.ac.nz> Giovanni Bajo wrote: > My idea (and interpretation of Greg's statement) is that a module/package > should be able to live with either relative imports within itself, or fully > absolute imports. I think it goes further than that -- each module should (potentially) have its own unique view of the module namespace, defined at the time the module is installed, that can't be disturbed by anything that any other module does. -- Greg From guido at python.org Fri Sep 22 02:18:41 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 21 Sep 2006 17:18:41 -0700 Subject: [Python-Dev] New relative import issue In-Reply-To: <5.1.1.6.0.20060921201311.02735808@sparrow.telecommunity.com> References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <45108611.7090009@canterbury.ac.nz> <45132931.1070803@canterbury.ac.nz> <5.1.1.6.0.20060921201311.02735808@sparrow.telecommunity.com> Message-ID: I think it would be worth writing up a PEP to describe this, if it's to become a de-facto standard. That might be a better path towards standardization than just checking in the code... :-/ --Guido On 9/21/06, Phillip J. Eby wrote: > At 12:07 PM 9/22/2006 +1200, Greg Ewing wrote: > >Another thought on static module namespace configuration: > >It would make things a *lot* easier for py2exe, py2app > >and the like that have to figure out what packages > >a program depends on without running the program. 
> > Setuptools users already explicitly declare what projects their projects > depend on; this is how easy_install can then find and install those > dependencies. So, there is at least one system already available for > Python that manages this type of thing already, and my understanding is > that the py2exe and py2app developers plan to support using this dependency > information in the future. > > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Sep 22 02:20:03 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 21 Sep 2006 17:20:03 -0700 Subject: [Python-Dev] New relative import issue In-Reply-To: <45132B94.6050401@canterbury.ac.nz> References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru> <05af01c6dd7e$a2209560$e303030a@trilan> <45132B94.6050401@canterbury.ac.nz> Message-ID: On 9/21/06, Greg Ewing wrote: > Giovanni Bajo wrote: > > > My idea (and interpretation of Greg's statement) is that a module/package > > should be able to live with either relative imports within itself, or fully > > absolute imports. > > I think it goes further than that -- each module should > (potentially) have its own unique view of the module > namespace, defined at the time the module is installed, > that can't be disturbed by anything that any other module > does. Well, maybe. But there's also the requirement that if packages A and B both import C, they should get the same C. Having multiple versions of the same package loaded simultaneously sounds like a recipe for disaster. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Fri Sep 22 02:21:01 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 22 Sep 2006 12:21:01 +1200 Subject: [Python-Dev] list.discard? 
(Re: dict.discard) In-Reply-To: <20060921134249.GA9238@niemeyer.net> References: <20060921134249.GA9238@niemeyer.net> Message-ID: <45132C6D.9010806@canterbury.ac.nz> Gustavo Niemeyer wrote: > >>> print set.discard.__doc__ > Remove an element from a set if it is a member. Actually I'd like this for lists. Often I find myself writing if x not in somelist: somelist.remove(x) A single method for doing this would be handy, and more efficient. -- Greg From fdrake at acm.org Fri Sep 22 02:28:00 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu, 21 Sep 2006 20:28:00 -0400 Subject: [Python-Dev] list.discard? (Re: dict.discard) In-Reply-To: <45132C6D.9010806@canterbury.ac.nz> References: <20060921134249.GA9238@niemeyer.net> <45132C6D.9010806@canterbury.ac.nz> Message-ID: <200609212028.00824.fdrake@acm.org> On Thursday 21 September 2006 20:21, Greg Ewing wrote: > if x not in somelist: > somelist.remove(x) I'm just guessing you really meant "if x in somelist". ;-) -Fred -- Fred L. Drake, Jr. From greg.ewing at canterbury.ac.nz Fri Sep 22 02:40:03 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 22 Sep 2006 12:40:03 +1200 Subject: [Python-Dev] New relative import issue In-Reply-To: References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru> <05af01c6dd7e$a2209560$e303030a@trilan> Message-ID: <451330E3.1020701@canterbury.ac.nz> Guido van Rossum wrote: > While I agree with your idea(l), I don't think that's what Greg meant. > He clearly say "sys.path should not exist at all". Refining that a bit, I don't think there should be a *single* sys.path for the whole program -- more like each module having its own sys.path. And, at least in most cases, its contents should be set up from static information that exists outside the program, established when the module is installed. One reason for this is the lack of any absolute notion of a "program". 
What is a program on one level can be part of a larger program on another level. For example, a module with test code that is run when it's invoked as a main script. Sometimes it's a program of its own, other times it's not. And it shouldn't *matter* whether it's a program or not when it comes to being able to find other modules that it needs to import. So using a piece of program-wide shared state for this seems wrong. -- Greg From scott+python-dev at scottdial.com Fri Sep 22 02:32:52 2006 From: scott+python-dev at scottdial.com (Scott Dial) Date: Thu, 21 Sep 2006 20:32:52 -0400 Subject: [Python-Dev] New relative import issue In-Reply-To: References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru> <05af01c6dd7e$a2209560$e303030a@trilan> <45132B94.6050401@canterbury.ac.nz> Message-ID: <45132F34.801@scottdial.com> Guido van Rossum wrote: > On 9/21/06, Greg Ewing wrote: >> I think it goes further than that -- each module should >> (potentially) have its own unique view of the module >> namespace, defined at the time the module is installed, >> that can't be disturbed by anything that any other module >> does. > > Well, maybe. But there's also the requirement that if packages A and B > both import C, they should get the same C. Having multiple versions of > the same package loaded simultaneously sounds like a recipe for > disaster. I have exactly this scenario in my current codebase for a server. It was absolutely necessary for me to update a module in Twisted because all other solutions I could come up with were less desirable. Either I send my patch upstream and wait (can't wait), or I fork out another version and place it at the top of sys.path (this seems ok). Except an even better solution is to maintain my own subset of Twisted, because I am localized to a particularly small corner of the codebase. 
I can continue to use upstream updates to the rest of Twisted without any fussing about merging changes and so forth. And if Twisted was allowed to decide how it saw its own world, then I would have to go back to maintaining my own complete branch. While I don't strictly need to be able to do this, I wanted to at least raise my hand and say, "I abuse this facet of the current import mechanism." -- Scott Dial scott at scottdial.com scodial at cs.indiana.edu From jcarlson at uci.edu Fri Sep 22 03:02:07 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Thu, 21 Sep 2006 18:02:07 -0700 Subject: [Python-Dev] list.discard? (Re: dict.discard) In-Reply-To: <45132C6D.9010806@canterbury.ac.nz> References: <20060921134249.GA9238@niemeyer.net> <45132C6D.9010806@canterbury.ac.nz> Message-ID: <20060921175719.0842.JCARLSON@uci.edu> Greg Ewing wrote: > Gustavo Niemeyer wrote: > > > >>> print set.discard.__doc__ > > Remove an element from a set if it is a member. > > Actually I'd like this for lists. Often I find myself > writing > > if x not in somelist: > somelist.remove(x) > > A single method for doing this would be handy, and > more efficient. A marginal calling time improvement; but we are still talking linear time containment test. I'm -0, if only because I've never really found the need to use list.remove(), so this API expansion doesn't feel necessary to me. - Josiah From pje at telecommunity.com Fri Sep 22 03:40:57 2006 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Thu, 21 Sep 2006 21:40:57 -0400 Subject: [Python-Dev] New relative import issue In-Reply-To: <451330E3.1020701@canterbury.ac.nz> References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru> <05af01c6dd7e$a2209560$e303030a@trilan> Message-ID: <5.1.1.6.0.20060921212856.027284d8@sparrow.telecommunity.com> At 12:40 PM 9/22/2006 +1200, Greg Ewing wrote: >Guido van Rossum wrote: > > > While I agree with your idea(l), I don't think that's what Greg meant. > > He clearly say "sys.path should not exist at all". > >Refining that a bit, I don't think there should be >a *single* sys.path for the whole program -- more >like each module having its own sys.path. And, at >least in most cases, its contents should be set >up from static information that exists outside the >program, established when the module is installed. Now I'm a little bit more in agreement with you, but not by much. :) In the Java world, the OSGi framework (which is a big inspiration for many aspects of setuptools) effectively has a sys.path per installed project (modulo certain issues for dynamic imports, which I'll get into more below). And, when Bob Ippolito and I were first designing the egg runtime system, we considered using some hacks to do the same thing, such that you actually *could* have more than one version of a package present and being used by two other packages that require it. However, we eventually tossed it as a YAGNI; dependency resolution is too damn hard already without also having to deal with crud like C extensions only being able to be loaded once, etc., in addition to the necessary import hackery. So, in principle, it's a good idea. In practice, I don't think it can be achieved on the CPython platform. 
You need something like a Jython or IronPython or a PyPy+ctypes-ish platform, where all the C is effectively abstracted behind some barrier that prevents the module problem from occurring, and you could in principle load the modules in different interpreters (or "class loaders" in the Java context). Amusingly, this is one of the few instances where every Python *except* CPython is probably in a better position to implement the feature... :) >One reason for this is the lack of any absolute >notion of a "program". What is a program on one >level can be part of a larger program on another >level. For example, a module with test code that >is run when it's invoked as a main script. >Sometimes it's a program of its own, other times >it's not. And it shouldn't *matter* whether it's >a program or not when it comes to being able to >find other modules that it needs to import. So >using a piece of program-wide shared state for >this seems wrong. An interesting point about this is that it coincidentally solves the problem of dynamic interpretation of meta-syntactic features. That is, if there is a static way to know what modules are accessible to the module, then the resolution of meta-syntax features like macros or custom statements is simpler than if a runtime resolution is required. Of course, in all of this you're still ignoring the part where some modules may need to perform dynamic or "weak" imports. So at least *some* portion of a module's import path *is* dependent on the notion of the "current program". (See the documentation for the "Importing" package at http://peak.telecommunity.com/DevCenter/Importing for an explanation of "weak" importing.) 
From jackdied at jackdied.com Fri Sep 22 04:08:58 2006 From: jackdied at jackdied.com (Jack Diederich) Date: Thu, 21 Sep 2006 22:08:58 -0400 Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine In-Reply-To: <3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com> References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com> <20060921215019.GA6677@performancedrivers.com> <3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com> Message-ID: <20060922020858.GB6677@performancedrivers.com> On Thu, Sep 21, 2006 at 03:28:04PM -0700, Grig Gheorghiu wrote: > On 9/21/06, Jack Diederich wrote: > > The python binary is out of step with the test_itertools.py version. > > You can generate this same error on your own box by reverting the > > change to itertoolsmodule.c but leaving the new test in test_itertools.py > > > > I don't know why this only happened on that OSX buildslave > > Not sure what you mean by out of step. The binary was built out of the > very latest itertoolsmodule.c, and test_itertools.py was also updated > from svn. So they're both in sync IMO. That tests passes successfully > on all the other buildslaves in the Pybots farm (x86 Ubuntu, Debian, > Gentoo, RH9, AMD-64 Ubuntu) > When I saw the failure, first I cursed (a lot). Then I followed the repr all the way down into stringobject.c, no dice. Then I noticed that the failure is exactly what you get if the test was updated but the old module wasn't. Faced with the choice of believing in a really strange platform specific bug in a commonly used routine that resulted in exactly the failure caused by one of the two files being updated or believing a failure occurred in the long chain of networks, disks, file systems, build tools, and operating systems that would result in only one of the files being updated - I went with the latter. I'll continue in my belief until my dying day or until someone with OSX confirms it is a bug, whichever comes first. 
not-gonna-sweat-it-ly, -Jack From grig.gheorghiu at gmail.com Fri Sep 22 04:16:33 2006 From: grig.gheorghiu at gmail.com (Grig Gheorghiu) Date: Thu, 21 Sep 2006 19:16:33 -0700 Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine In-Reply-To: <20060922020858.GB6677@performancedrivers.com> References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com> <20060921215019.GA6677@performancedrivers.com> <3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com> <20060922020858.GB6677@performancedrivers.com> Message-ID: <3f09d5a00609211916g58d3aabam683517fe047f352b@mail.gmail.com> On 9/21/06, Jack Diederich wrote: > On Thu, Sep 21, 2006 at 03:28:04PM -0700, Grig Gheorghiu wrote: > > On 9/21/06, Jack Diederich wrote: > > > The python binary is out of step with the test_itertools.py version. > > > You can generate this same error on your own box by reverting the > > > change to itertoolsmodule.c but leaving the new test in test_itertools.py > > > > > > I don't know why this only happened on that OSX buildslave > > > > Not sure what you mean by out of step. The binary was built out of the > > very latest itertoolsmodule.c, and test_itertools.py was also updated > > from svn. So they're both in sync IMO. That tests passes successfully > > on all the other buildslaves in the Pybots farm (x86 Ubuntu, Debian, > > Gentoo, RH9, AMD-64 Ubuntu) > > > > When I saw the failure, first I cursed (a lot). Then I followed the repr > all the way down into stringobject.c, no dice. Then I noticed that the > failure is exactly what you get if the test was updated but the old > module wasn't. 
> > Faced with the choice of believing in a really strange platform specific > bug in a commonly used routine that resulted in exactly the failure caused > by one of the two files being updated or believing a failure occurred in the > long chain of networks, disks, file systems, build tools, and operating > systems that would result in only one of the files being updated - > I went with the latter. > > I'll continue in my belief until my dying day or until someone with OSX > confirms it is a bug, whichever comes first. > > not-gonna-sweat-it-ly, > > -Jack > _______________________________________________ OK, sorry for having caused you so much grief....I'll investigate some more on the Pybots side and I'll let you know what I find. Grig From grig.gheorghiu at gmail.com Fri Sep 22 04:31:17 2006 From: grig.gheorghiu at gmail.com (Grig Gheorghiu) Date: Thu, 21 Sep 2006 19:31:17 -0700 Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine In-Reply-To: <3f09d5a00609211916g58d3aabam683517fe047f352b@mail.gmail.com> References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com> <20060921215019.GA6677@performancedrivers.com> <3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com> <20060922020858.GB6677@performancedrivers.com> <3f09d5a00609211916g58d3aabam683517fe047f352b@mail.gmail.com> Message-ID: <3f09d5a00609211931j3cf5a084y8e4ea2603a65034@mail.gmail.com> On 9/21/06, Grig Gheorghiu wrote: > On 9/21/06, Jack Diederich wrote: > > On Thu, Sep 21, 2006 at 03:28:04PM -0700, Grig Gheorghiu wrote: > > > On 9/21/06, Jack Diederich wrote: > > > > The python binary is out of step with the test_itertools.py version. > > > > You can generate this same error on your own box by reverting the > > > > change to itertoolsmodule.c but leaving the new test in test_itertools.py > > > > > > > > I don't know why this only happened on that OSX buildslave > > > > > > Not sure what you mean by out of step. 
The binary was built out of the > > > very latest itertoolsmodule.c, and test_itertools.py was also updated > > > from svn. So they're both in sync IMO. That tests passes successfully > > > on all the other buildslaves in the Pybots farm (x86 Ubuntu, Debian, > > > Gentoo, RH9, AMD-64 Ubuntu) > > > > > > > When I saw the failure, first I cursed (a lot). Then I followed the repr > > all the way down into stringobject.c, no dice. Then I noticed that the > > failure is exactly what you get if the test was updated but the old > > module wasn't. > > > > Faced with the choice of believing in a really strange platform specific > > bug in a commonly used routine that resulted in exactly the failure caused > > by one of the two files being updated or believing a failure occurred in the > > long chain of networks, disks, file systems, build tools, and operating > > systems that would result in only one of the files being updated - > > I went with the latter. > > > > I'll continue in my belief until my dying day or until someone with OSX > > confirms it is a bug, whichever comes first. > > > > not-gonna-sweat-it-ly, > > > > -Jack > > _______________________________________________ > > OK, sorry for having caused you so much grief....I'll investigate some > more on the Pybots side and I'll let you know what I find. > > Grig > Actually, that test fails also in the official Python buildbot farm, on a g4 osx machine. See http://www.python.org/dev/buildbot/trunk/g4%20osx.4%20trunk/builds/1449/step-test/0 So it looks like it's an OS X specific issue. 
Grig From mhammond at skippinet.com.au Fri Sep 22 04:29:45 2006 From: mhammond at skippinet.com.au (Mark Hammond) Date: Fri, 22 Sep 2006 12:29:45 +1000 Subject: [Python-Dev] New relative import issue In-Reply-To: Message-ID: <13c301c6ddee$f8964370$050a0a0a@enfoldsystems.local> Guido writes: > As Phillip understood, I meant the environment to include the > filesystem (and on Windows, the registry -- in fact, Python on Windows > *has* exactly such a mechanism in the registry, although I believe > it's rarely used these days -- it was done by Mark Hammond to support > COM servers I believe.) It is rarely used these days due to the fact it is truly global to the machine. These days, it is not uncommon to have multiple copies of the same Python version installed on the same box - generally installed privately into an application by the vendor - eg, Komodo and Open Office both do this to some degree. The problem with a global registry is that *all* Python installations honoured it. This meant bugs in the vendor applications, as their 'import foo' did *not* locate the foo module inside their directory, but instead loaded a completely unrelated one, which promptly crashed. A mechanism similar to .pth files, where the "declaration" of a module's location is stored privately to the installation seems a more workable approach. Mark From skip at pobox.com Fri Sep 22 05:08:56 2006 From: skip at pobox.com (skip at pobox.com) Date: Thu, 21 Sep 2006 22:08:56 -0500 Subject: [Python-Dev] list.discard? (Re: dict.discard) In-Reply-To: <45132C6D.9010806@canterbury.ac.nz> References: <20060921134249.GA9238@niemeyer.net> <45132C6D.9010806@canterbury.ac.nz> Message-ID: <17683.21448.909748.200493@montanaro.dyndns.org> Greg> Actually I'd like [discard] for lists. It's obvious for sets and dictionaries that there is only one thing to discard and that after the operation you're guaranteed the key no longer exists. 
Would you want the same semantics for lists or the semantics of list.remove where it only removes the first instance? When I want to remove something from a list I typically write: while x in somelist: somelist.remove(x) not "if" as in your example. Skip From jcarlson at uci.edu Fri Sep 22 05:44:30 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Thu, 21 Sep 2006 20:44:30 -0700 Subject: [Python-Dev] New relative import issue In-Reply-To: <45132241.7030307@canterbury.ac.nz> References: <45132241.7030307@canterbury.ac.nz> Message-ID: <20060921183846.0845.JCARLSON@uci.edu> Greg Ewing wrote: > > Guido van Rossum wrote: > > > Eek? If there are two third-party top-level packages A and B, by > > different third parties, and A depends on B, how should A find B if > > not via sys.path or something that is sufficiently equivalent as to > > have the same problems? > > Some kind of configuration mechanism is needed, but > I don't see why it can't be a static, declarative one > rather than computed at run time. I could be missing something, or be completely off-topic, but why not both, or really a mechanism to define: 1. Installation time package location (register package X in the package registry at path Y and persist across Python sessions). 2. Runtime package location (package X is in path Y, do not persist across Python sessions). With 1 and 2, we remove the need for .pth files, all packages to be installed into Lib/site-packages, and sys.path manipulation. You want access to package X? packages.register('X', '~/mypackages/X') packages.register('X', '~/mypackages/X', persist=True) packages.register('X', '~/mypackages/X', systemwide=True) This can be implemented with a fairly simple package registry, contained within a (small) SQLite database (which is conveniently shipped in Python 2.5). There can be a system-wide database that all users use as a base, with a user-defined package registry (per user) where the system-wide packages can be augmented. 
With a little work, it could even be possible to define importers during registration (filesystem, zip, database, etc.) or include a tracing mechanism for py2exe/distutils/py2app/cx_freeze/etc. (optionally writing to a simplified database-like file for distribution so that SQLite doesn't need to be shipped). - Josiah From martin at v.loewis.de Fri Sep 22 06:09:41 2006 From: martin at v.loewis.de ("Martin v. Löwis") Date: Fri, 22 Sep 2006 06:09:41 +0200 Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine In-Reply-To: <20060922020858.GB6677@performancedrivers.com> References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com> <20060921215019.GA6677@performancedrivers.com> <3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com> <20060922020858.GB6677@performancedrivers.com> Message-ID: <45136205.6060603@v.loewis.de> Jack Diederich wrote: > Faced with the choice of believing in a really strange platform specific > bug in a commonly used routine that resulted in exactly the failure caused > by one of the two files being updated or believing a failure occurred in the > long chain of networks, disks, file systems, build tools, and operating > systems that would result in only one of the files being updated - > I went with the latter. Please reconsider how subversion works. It has the notion of atomic commits, so you either get the entire change, or none at all. Fortunately, the buildbot keeps logs of everything it does: http://www.python.org/dev/buildbot/trunk/g4%20osx.4%20trunk/builds/1449/step-svn/0 shows U Lib/test/test_itertools.py U Modules/itertoolsmodule.c Updated to revision 51950. So it said it updated both files. But perhaps it didn't build them? Let's check: http://www.python.org/dev/buildbot/trunk/g4%20osx.4%20trunk/builds/1449/step-compile/0 has this: building 'itertools' extension gcc -fno-strict-aliasing -Wno-long-double -no-cpp-precomp -mno-fused-madd -g -Wall -Wstrict-prototypes -I.
-I/Users/buildslave/bb/trunk.psf-g4/build/./Include -I/Users/buildslave/bb/trunk.psf-g4/build/./Mac/Include -I./Include -I. -I/usr/local/include -I/Users/buildslave/bb/trunk.psf-g4/build/Include -I/Users/buildslave/bb/trunk.psf-g4/build -c /Users/buildslave/bb/trunk.psf-g4/build/Modules/itertoolsmodule.c -o build/temp.macosx-10.3-ppc-2.6/Users/buildslave/bb/trunk.psf-g4/build/Modules/itertoolsmodule.o gcc -bundle -undefined dynamic_lookup build/temp.macosx-10.3-ppc-2.6/Users/buildslave/bb/trunk.psf-g4/build/Modules/itertoolsmodule.o -L/usr/local/lib -o build/lib.macosx-10.3-ppc-2.6/itertools.so So itertools.so is regenerated, as it should; qed. Regards, Martin From pje at telecommunity.com Fri Sep 22 06:10:45 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 22 Sep 2006 00:10:45 -0400 Subject: [Python-Dev] New relative import issue In-Reply-To: <20060921183846.0845.JCARLSON@uci.edu> References: <45132241.7030307@canterbury.ac.nz> <45132241.7030307@canterbury.ac.nz> Message-ID: <5.1.1.6.0.20060921235853.03e01748@sparrow.telecommunity.com> At 08:44 PM 9/21/2006 -0700, Josiah Carlson wrote: >This can be implemented with a fairly simple package registry, contained >within a (small) SQLite database (which is conveniently shipped in >Python 2.5). There can be a system-wide database that all users use as >a base, with a user-defined package registry (per user) where the >system-wide packages can be augmented. As far as I can tell, you're ignoring that per-user must *also* be per-version, and per-application. Each application or runtime environment needs its own private set of information like this. Next, putting the installation data inside a database instead of per-installation-unit files presents problems of its own. While some system packaging tools allow install/uninstall scripts to run, they are often frowned upon, and can be unintentionally bypassed. These are just a few of the issues that come to mind. 
Realistically speaking, .pth files are currently the most effective mechanism we have, and there actually isn't much that can be done to improve upon them. What's more needed are better mechanisms for creating and managing Python "environments" (to use a term coined by Ian Bicking and Jim Fulton over on the distutils-sig), which are individual contexts in which Python applications run. Some current tools in development by Ian and Jim include: http://cheeseshop.python.org/pypi/workingenv.py/ http://cheeseshop.python.org/pypi/zc.buildout/ I don't know that much about either, but I do know that for zc.buildout, Jim's goal was to be able to install Python scripts with pre-baked sys.path based on package dependencies from setuptools, and as far as I know, he achieved it. Anyway, system-wide and per-user environment information isn't nearly sufficient to address the issues that people have when developing and deploying multiple applications on a server, or even using multiple applications on a client installation (e.g. somebody using both the Enthought Python IDE and Chandler on the same machine). These relatively simple use cases rapidly demonstrate the inadequacy of system-wide or per-user configuration of what packages are available. 
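The .pth mechanism described above can be tried outside site-packages. The following is a minimal sketch, assuming a current Python 3 interpreter: site.addsitedir() is the standard-library call that scans a directory for .pth files, while the directory names ("env", "vendor") and the helper add_env_dir() are hypothetical and only there for illustration. It is not how workingenv.py or zc.buildout are implemented.

```python
import os
import site
import sys
import tempfile


def add_env_dir(env_dir):
    """Process every .pth file found in env_dir (hypothetical helper).

    site.addsitedir() appends env_dir itself to sys.path and then
    appends each existing directory listed in the .pth files it finds.
    """
    site.addsitedir(env_dir)


# Throwaway layout for the demonstration; the names are assumptions.
root = tempfile.mkdtemp()
env_dir = os.path.join(root, "env")        # per-application directory holding .pth files
pkg_parent = os.path.join(root, "vendor")  # directory that would contain package X
os.makedirs(env_dir)
os.makedirs(pkg_parent)

# A .pth file is plain text: one directory per line.
with open(os.path.join(env_dir, "x.pth"), "w") as f:
    f.write(pkg_parent + "\n")

before = list(sys.path)
add_env_dir(env_dir)
added = [p for p in sys.path if p not in before]
print(added)
```

Because the .pth file lives in a directory chosen by the application rather than in a system-wide location, each application can carry its own set of package locations, which is the per-environment property argued for above.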
From steve at holdenweb.com Fri Sep 22 06:38:24 2006 From: steve at holdenweb.com (Steve Holden) Date: Fri, 22 Sep 2006 00:38:24 -0400 Subject: [Python-Dev] New relative import issue In-Reply-To: <79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com> References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru> <05af01c6dd7e$a2209560$e303030a@trilan> <79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com> Message-ID: Paul Moore wrote: > On 9/21/06, Guido van Rossum wrote: > >>I think one missing feature is a mechanism whereby you can say "THIS >>package (gives top-level package name) lives HERE (gives filesystem >>location of package)" without adding the parent of HERE to sys.path >>for all module searches. I think Phillip Eby's egg system might >>benefit from this. > > > This is pretty easy to do with a custom importer on sys.meta_path. > Getting the details right is a touch fiddly, but it's conceptually > straightforward. > > Hmm, I might play with this - a set of PEP 302 importers to completely > customise the import mechanism. The never-completed "phase 2" of the > PEP included a reimplementation of the built in import mechanism as > hooks. Is there any interest in this actually happening? I've been > looking for an interesting coding project for a while (although I > never have any free time...) > My interest in such a project would be in replacing a bunch of legacy C code with varying implementations of the import mechanism in pure Python strictly according to the dictats of PEP 302, using sys.path_hooks and sys.path (meta_path is for future consideration ;-). Some readers may remember a lightning talk I gave at OSCON about three years ago. In that I discussed a database structure that allowed different implementations of modules to be loaded according to compatibility requirements established as a result of testing. 
Although I now have a working database import mechanism based on PEP 302 it's by no means obvious how that can be used exclusively (in other words, replacing the current import mechanism: the present implementation relies on an import of MySQLdb, which has many dependencies that clearly must be importable before the DB mechanism is in place). And I certainly haven't followed up by establishing the compatibility data that such an implementation would require. Has anyone done any work on (for example) zip-only distributions? regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From steve at holdenweb.com Fri Sep 22 06:41:38 2006 From: steve at holdenweb.com (Steve Holden) Date: Fri, 22 Sep 2006 00:41:38 -0400 Subject: [Python-Dev] list.discard? (Re: dict.discard) In-Reply-To: <200609212028.00824.fdrake@acm.org> References: <20060921134249.GA9238@niemeyer.net> <45132C6D.9010806@canterbury.ac.nz> <200609212028.00824.fdrake@acm.org> Message-ID: Fred L. Drake, Jr. wrote: > On Thursday 21 September 2006 20:21, Greg Ewing wrote: > > if x not in somelist: > > somelist.remove(x) > > I'm just guessing you really meant "if x in somelist". ;-) > No you aren't, that's clearly an *informed* guess. 
regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From jackdied at jackdied.com Fri Sep 22 06:58:03 2006 From: jackdied at jackdied.com (Jack Diederich) Date: Fri, 22 Sep 2006 00:58:03 -0400 Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine In-Reply-To: <45136205.6060603@v.loewis.de> References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com> <20060921215019.GA6677@performancedrivers.com> <3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com> <20060922020858.GB6677@performancedrivers.com> <45136205.6060603@v.loewis.de> Message-ID: <20060922045803.GC6677@performancedrivers.com> On Fri, Sep 22, 2006 at 06:09:41AM +0200, "Martin v. Löwis" wrote: > Jack Diederich wrote: > > Faced with the choice of believing in a really strange platform specific > > bug in a commonly used routine that resulted in exactly the failure caused > > by one of the two files being updated or believing a failure occurred in the > > long chain of networks, disks, file systems, build tools, and operating > > systems that would result in only one of the files being updated - > > I went with the latter. > > Please reconsider how subversion works. It has the notion of atomic > commits, so you either get the entire change, or none at all. > > Fortunately, the buildbot keeps logs of everything it does: > > http://www.python.org/dev/buildbot/trunk/g4%20osx.4%20trunk/builds/1449/step-svn/0 > > shows > > U Lib/test/test_itertools.py > U Modules/itertoolsmodule.c > Updated to revision 51950. > > So it said it updated both files. But perhaps it didn't build them?
> Let's check: > > > http://www.python.org/dev/buildbot/trunk/g4%20osx.4%20trunk/builds/1449/step-compile/0 > > has this: > > building 'itertools' extension > > gcc -fno-strict-aliasing -Wno-long-double -no-cpp-precomp > -mno-fused-madd -g -Wall -Wstrict-prototypes -I. > -I/Users/buildslave/bb/trunk.psf-g4/build/./Include > -I/Users/buildslave/bb/trunk.psf-g4/build/./Mac/Include -I./Include -I. > -I/usr/local/include -I/Users/buildslave/bb/trunk.psf-g4/build/Include > -I/Users/buildslave/bb/trunk.psf-g4/build -c > /Users/buildslave/bb/trunk.psf-g4/build/Modules/itertoolsmodule.c -o > build/temp.macosx-10.3-ppc-2.6/Users/buildslave/bb/trunk.psf-g4/build/Modules/itertoolsmodule.o > > gcc -bundle -undefined dynamic_lookup > build/temp.macosx-10.3-ppc-2.6/Users/buildslave/bb/trunk.psf-g4/build/Modules/itertoolsmodule.o > -L/usr/local/lib -o build/lib.macosx-10.3-ppc-2.6/itertools.so > > So itertools.so is regenerated, as it should; qed. > I should leave the tongue-in-cheek bombast to Tim and Fredrik, especially when dealing with what might be an OS & machine specific bug. The next checkin and re-test will or won't highlight a failure and certainly someone with a g4 will try it out before 2.5.1 goes out so we'll know if it was a fluke soonish. The original error was mine, I typed "Size_t" instead of "Ssize_t" and while my one-char patch might also be wrong (I hope not, I'm red-faced enough as is) we should find out soon enough.
-Jack From nnorwitz at gmail.com Fri Sep 22 07:23:54 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Thu, 21 Sep 2006 22:23:54 -0700 Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine In-Reply-To: <20060922045803.GC6677@performancedrivers.com> References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com> <20060921215019.GA6677@performancedrivers.com> <3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com> <20060922020858.GB6677@performancedrivers.com> <45136205.6060603@v.loewis.de> <20060922045803.GC6677@performancedrivers.com> Message-ID: On 9/21/06, Jack Diederich wrote: > > I should leave the tongue-in-cheek bombast to Tim and Fredrik, especially > when dealing with what might be an OS & machine specific bug. The next > checkin and re-test will or won't highlight a failure and certainly someone > with a g4 will try it out before 2.5.1 goes out so we'll know if it was a > fluke soonish. The original error was mine, I typed "Size_t" instead of > "Ssize_t" and while my one-char patch might also be wrong (I hope not, I'm > red-faced enough as is) we should find out soon enough. It looks like %zd of a negative number is treated as an unsigned number on OS X, even though the man page says it should be signed. """ The z modifier, when applied to a d or i conversion, indicates that the argument is of a signed type equivalent in size to a size_t. """ The program below returns -123 on Linux and 4294967173 on OS X.
n
--
#include <stdio.h>
int main()
{
    char buffer[256];
    if(sprintf(buffer, "%zd", (size_t)-123) < 0)
        return 1;
    printf("%s\n", buffer);
    return 0;
}
From tim.peters at gmail.com Fri Sep 22 08:12:07 2006 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 22 Sep 2006 02:12:07 -0400 Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine In-Reply-To: References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com> <20060921215019.GA6677@performancedrivers.com> <3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com> <20060922020858.GB6677@performancedrivers.com> <45136205.6060603@v.loewis.de> <20060922045803.GC6677@performancedrivers.com> Message-ID: <1f7befae0609212312h28a961ffhbead29c6bab3c0f6@mail.gmail.com>
[Neal Norwitz] > It looks like %zd of a negative number is treated as an unsigned > number on OS X, even though the man page says it should be signed. > > """ > The z modifier, when applied to a d or i conversion, indicates that > the argument is of a signed type equivalent in size to a size_t. > """ It's not just some man page ;-), this is required by the C99 standard (which introduced the `z` length modifier -- and it's the `d` or `i` here that imply `signed`, `z` is only supposed to specify the width of the integer type, and can also be applied to codes for unsigned integer types, like %zu and %zx). > The program below returns -123 on Linux and 4294967173 on OS X. > > n > -- > #include <stdio.h> > int main() > { > char buffer[256]; > if(sprintf(buffer, "%zd", (size_t)-123) < 0) > return 1; > printf("%s\n", buffer); > return 0; > } Well, to be strictly anal, while the result of (size_t)-123 is defined, the result of casting /that/ back to a signed type of the same width is not defined.
Maybe your compiler was "doing you a favor" ;-) From jackdied at jackdied.com Fri Sep 22 07:43:16 2006 From: jackdied at jackdied.com (Jack Diederich) Date: Fri, 22 Sep 2006 01:43:16 -0400 Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine In-Reply-To: References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com> <20060921215019.GA6677@performancedrivers.com> <3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com> <20060922020858.GB6677@performancedrivers.com> <45136205.6060603@v.loewis.de> <20060922045803.GC6677@performancedrivers.com> Message-ID: <20060922054316.GD6677@performancedrivers.com> On Thu, Sep 21, 2006 at 10:23:54PM -0700, Neal Norwitz wrote: > On 9/21/06, Jack Diederich wrote: > > > > I should leave the tongue-in-cheek bombast to Tim and Fredrik, especially > > when dealing with what might be an OS & machine specific bug. The next > > checkin and re-test will or won't highlight a failure and certainly someone > > with a g4 will try it out before 2.5.1 goes out so we'll know if it was a > > fluke soonish. The original error was mine, I typed "Size_t" instead of > > "Ssize_t" and while my one-char patch might also be wrong (I hope not, I'm > > red-faced enough as is) we should find out soon enough. > > It looks like %zd of a negative number is treated as an unsigned > number on OS X, even though the man page says it should be signed. > > """ > The z modifier, when applied to a d or i conversion, indicates that > the argument is of a signed type equivalent in size to a size_t. > """ > > The program below returns -123 on Linux and 4294967173 on OS X. > > n > -- > #include <stdio.h> > int main() > { > char buffer[256]; > if(sprintf(buffer, "%zd", (size_t)-123) < 0) > return 1; > printf("%s\n", buffer); > return 0; > } Consider me blushing even harder for denying the power of the buildbot (and against all evidence). Yikes, didn't any other tests trigger this?
sprat:~/src/python-head# find ./ -name '*.c' | xargs grep '%zd' | wc -l 65 -Jack From nnorwitz at gmail.com Fri Sep 22 08:37:37 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Thu, 21 Sep 2006 23:37:37 -0700 Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine In-Reply-To: <1f7befae0609212312h28a961ffhbead29c6bab3c0f6@mail.gmail.com> References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com> <20060921215019.GA6677@performancedrivers.com> <3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com> <20060922020858.GB6677@performancedrivers.com> <45136205.6060603@v.loewis.de> <20060922045803.GC6677@performancedrivers.com> <1f7befae0609212312h28a961ffhbead29c6bab3c0f6@mail.gmail.com> Message-ID: On 9/21/06, Tim Peters wrote: > > Well, to be strictly anal, while the result of > > (size_t)-123 > > is defined, the result of casting /that/ back to a signed type of the > same width is not defined. Maybe your compiler was "doing you a > favor" ;-) I also tried with a cast to an ssize_t and replacing %zd with an %zi. None of them make a difference; all return an unsigned value. This is with powerpc-apple-darwin8-gcc-4.0.0 (GCC) 4.0.0 20041026 (Apple Computer, Inc. build 4061). Although i would expect the issue is in the std C library rather than the compiler. Forcing PY_FORMAT_SIZE_T to be "l" instead of "z" fixes this problem. BTW, this is the same issue on Mac OS X: >>> struct.pack('=b', -599999) __main__:1: DeprecationWarning: 'b' format requires 4294967168 <= number <= 127 'A' n -- From jcarlson at uci.edu Fri Sep 22 09:08:03 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 22 Sep 2006 00:08:03 -0700 Subject: [Python-Dev] New relative import issue In-Reply-To: <5.1.1.6.0.20060921235853.03e01748@sparrow.telecommunity.com> References: <20060921183846.0845.JCARLSON@uci.edu> <5.1.1.6.0.20060921235853.03e01748@sparrow.telecommunity.com> Message-ID: <20060921233257.0848.JCARLSON@uci.edu> "Phillip J. 
Eby" wrote: > > At 08:44 PM 9/21/2006 -0700, Josiah Carlson wrote: > >This can be implemented with a fairly simple package registry, contained > >within a (small) SQLite database (which is conveniently shipped in > >Python 2.5). There can be a system-wide database that all users use as > >a base, with a user-defined package registry (per user) where the > >system-wide packages can be augmented. > > As far as I can tell, you're ignoring that per-user must *also* be > per-version, and per-application. Each application or runtime environment > needs its own private set of information like this. Having a different database per Python version is not significantly different than having a different Python binary for each Python version. About the only (annoying) nit is that the systemwide database needs to be easily accessible to the Python runtime, and is possibly volatile. Maybe a symlink in the same path as the actual Python binary on *nix, and the file located next to the binary on Windows. I didn't mention the following because I thought it would be superfluous, but it seems that I should have stated it right out. My thoughts were that on startup, Python would first query the 'system' database, caching its results in a dictionary, then query the user's listing, updating the dictionary as necessary, then unload the databases. On demand, when code runs packages.register(), if both persist and systemwide are False, it just updates the dictionary. If either is true, it opens up and updates the relevant database. With such a semantic, every time Python gets run, every instance gets its own private set of paths, derived from the system database, user database, and runtime-defined packages. > Next, putting the installation data inside a database instead of > per-installation-unit files presents problems of its own. While some > system packaging tools allow install/uninstall scripts to run, they are > often frowned upon, and can be unintentionally bypassed.
This is easily remedied with a proper 'packages' implementation: python -Mpackages name path Note that Python could auto-insert standard library and site-packages 'packages' on startup (creating the initial dictionary, then the systemwide, then the user, ...). > These are just a few of the issues that come to mind. Realistically > speaking, .pth files are currently the most effective mechanism we have, > and there actually isn't much that can be done to improve upon them. Except that .pth files are only usable in certain (likely) system paths, that the user may not have write access to. There have previously been proposals to add support for .pth files in the path of the run .py file, but they don't seem to have gotten any support. > What's more needed are better mechanisms for creating and managing Python > "environments" (to use a term coined by Ian Bicking and Jim Fulton over on > the distutils-sig), which are individual contexts in which Python > applications run. Some current tools in development by Ian and Jim include: > > Anyway, system-wide and per-user environment information isn't nearly > sufficient to address the issues that people have when developing and > deploying multiple applications on a server, or even using multiple > applications on a client installation (e.g. somebody using both the > Enthought Python IDE and Chandler on the same machine). These relatively > simple use cases rapidly demonstrate the inadequacy of system-wide or > per-user configuration of what packages are available. It wouldn't be terribly difficult to add environment switching and environment derivation (copying or linked, though copying would be simpler). packages.derive_environment(parent_environment) packages.register(name, path, env=environment) packages.use(environment) It also wouldn't be terribly difficult to set up environments that required certain packages... 
packages.new_environment(environment, *required_packages, test=True) To verify that the Python installation has the required packages, then later... packages.new_environment(environment, *required_packages, persist=True) I believe that most of the concerns that you have brought up can be addressed, and I think that it could be far nicer to deal with than the current sys.path hackery. The system database location is a bit annoying, but I lack the *nix experience to say where such a database could or should be located. - Josiah From fredrik at pythonware.com Fri Sep 22 10:35:50 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 22 Sep 2006 10:35:50 +0200 Subject: [Python-Dev] list.discard? (Re: dict.discard) References: <20060921134249.GA9238@niemeyer.net> <45132C6D.9010806@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Actually I'd like this for lists. Often I find myself > writing > > if x not in somelist: > somelist.remove(x) > > A single method for doing this would be handy, and > more efficient. 
there is a single method that does this, of course, but you have to sprinkle some sugar on it: try: somelist.remove(x) except ValueError: pass From ronaldoussoren at mac.com Fri Sep 22 11:40:41 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Fri, 22 Sep 2006 11:40:41 +0200 Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine In-Reply-To: References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com> <20060921215019.GA6677@performancedrivers.com> <3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com> <20060922020858.GB6677@performancedrivers.com> <45136205.6060603@v.loewis.de> <20060922045803.GC6677@performancedrivers.com> <1f7befae0609212312h28a961ffhbead29c6bab3c0f6@mail.gmail.com> Message-ID: <1020669.1158918041341.JavaMail.ronaldoussoren@mac.com> On Friday, September 22, 2006, at 08:38AM, Neal Norwitz wrote: >On 9/21/06, Tim Peters wrote: >> >> Well, to be strictly anal, while the result of >> >> (size_t)-123 >> >> is defined, the result of casting /that/ back to a signed type of the >> same width is not defined. Maybe your compiler was "doing you a >> favor" ;-) > >I also tried with a cast to an ssize_t and replacing %zd with an %zi. >None of them make a difference; all return an unsigned value. This is >with powerpc-apple-darwin8-gcc-4.0.0 (GCC) 4.0.0 20041026 (Apple >Computer, Inc. build 4061). Although i would expect the issue is in >the std C library rather than the compiler. > >Forcing PY_FORMAT_SIZE_T to be "l" instead of "z" fixes this problem. > >BTW, this is the same issue on Mac OS X: > >>>> struct.pack('=b', -599999) >__main__:1: DeprecationWarning: 'b' format requires 4294967168 <= number <= 127 Has anyone filed a bug at bugreport.apple.com about this (that is '%zd' not behaving as the documentation says it should behave)? I'll file a bug (as well), but the more people tell Apple about this the more likely it is that someone will fix this. 
Ronald >'A' > >n >-- >_______________________________________________ >Python-Dev mailing list >Python-Dev at python.org >http://mail.python.org/mailman/listinfo/python-dev >Unsubscribe: http://mail.python.org/mailman/options/python-dev/ronaldoussoren%40mac.com > > From ncoghlan at gmail.com Fri Sep 22 13:09:44 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 22 Sep 2006 21:09:44 +1000 Subject: [Python-Dev] New relative import issue In-Reply-To: References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru> <05af01c6dd7e$a2209560$e303030a@trilan> <79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com> Message-ID: <4513C478.6040307@gmail.com> Brett Cannon wrote: > But either way I will be messing with the import system in the > relatively near future. If you want to help, Paul (or anyone else), > just send me an email and we can try to coordinate something (plan to do > the work in the sandbox as a separate thing from my security stuff). Starting with pkgutil.get_loader and removing the current dependency on imp.find_module and imp.load_module would probably be a decent way to start. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From fdrake at acm.org Fri Sep 22 14:44:55 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri, 22 Sep 2006 08:44:55 -0400 Subject: [Python-Dev] [Python-checkins] release25-maint is UNFROZEN In-Reply-To: <20060921123510.GA22457@code0.codespeak.net> References: <200609212112.04923.anthony@interlink.com.au> <20060921123510.GA22457@code0.codespeak.net> Message-ID: <200609220844.55724.fdrake@acm.org> On Thursday 21 September 2006 08:35, Armin Rigo wrote: > Thanks for the hassle! I've got another bit of it for you, though. The > freezed 2.5 documentation doesn't seem to be available on-line. 
At > least, the doc links from the release page point to the 'dev' 2.6a0 > version, and the URL following the common scheme - > http://www.python.org/doc/2.5/ - doesn't work.

This should mostly be working now. The page at www.python.org/doc/2.5/ isn't "really" right, but will do the trick. Hopefully I'll be able to work out how these pages should be updated properly at the Arlington sprint this weekend, at which point I can update PEP 101 appropriately and make sure this gets done when releases are made.

-Fred

--
Fred L. Drake, Jr.

From rokkamraja at gmail.com Fri Sep 22 15:19:44 2006
From: rokkamraja at gmail.com (Raja Rokkam)
Date: Fri, 22 Sep 2006 18:49:44 +0530
Subject: [Python-Dev] Python network Programmign
Message-ID: <357297a00609220619x3e968d30p7fcafbb7683e5a69@mail.gmail.com>

Hi,

I am currently doing my final year project, "Secure mobile Robot Management". I have done the theoretical aspects of it till now and am now thinking of coding it. I would like to code in Python, but I am new to Python network programming.

Some of the features of my project are:

1. Each robot can send data to any other robot.
2. Each robot can receive data from any other robot.
3. Every robot has at least 1 other bot in its communication range.
4. The maximum size of a data packet is limited to 35 bytes.
5. Each mobile robot maintains a table with routes.
6. All the routes stored in the routing table include a field named life-time.
7. A Route Discovery Process is initiated if there is no known route to another bot.
8. There is no server over here.
9. Every bot should be able to process the data from other bots, and both multicast/unicast need to be supported.

Assume the environment is a gridded mesh with bots exploring the area. They need to perform a set of tasks (assume finding some locations which are dangerous, or something like that).

My main concern is how to go about modifying the headers such that everything fits in 35 bytes.
I would like to know how to proceed and if any links or resources in this regard. How to modify the headers ? ie. all in 35 bytes . Thank You, Raja. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060922/64a12c24/attachment.htm From fredrik at pythonware.com Fri Sep 22 15:32:50 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 22 Sep 2006 15:32:50 +0200 Subject: [Python-Dev] Python network Programmign References: <357297a00609220619x3e968d30p7fcafbb7683e5a69@mail.gmail.com> Message-ID: Raja Rokkam wrote: > I would like to code in Python , but i am new to Python Network Programming wrong list: python-dev is for people who develop the python core, not people who want to develop *in* python. see http://www.python.org/community/lists/ for a list of more appropriate forums. cheers /F From pje at telecommunity.com Fri Sep 22 18:25:01 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 22 Sep 2006 12:25:01 -0400 Subject: [Python-Dev] New relative import issue In-Reply-To: <20060921233257.0848.JCARLSON@uci.edu> References: <5.1.1.6.0.20060921235853.03e01748@sparrow.telecommunity.com> <20060921183846.0845.JCARLSON@uci.edu> <5.1.1.6.0.20060921235853.03e01748@sparrow.telecommunity.com> Message-ID: <5.1.1.6.0.20060922120555.0270b420@sparrow.telecommunity.com> At 12:08 AM 9/22/2006 -0700, Josiah Carlson wrote: >"Phillip J. Eby" wrote: > > > > At 08:44 PM 9/21/2006 -0700, Josiah Carlson wrote: > > >This can be implemented with a fairly simple package registry, contained > > >within a (small) SQLite database (which is conveniently shipped in > > >Python 2.5). There can be a system-wide database that all users use as > > >a base, with a user-defined package registry (per user) where the > > >system-wide packages can be augmented. > > > > As far as I can tell, you're ignoring that per-user must *also* be > > per-version, and per-application. 
Each application or runtime environment > > needs its own private set of information like this. > >Having a different database per Python version is not significantly >different than having a different Python binary for each Python version. You misunderstood me: I mean that the per-user database must be able to store information for *different Python versions*. Having a single per-user database without the ability to include configuration for more than one Python version (analagous to the current situation with the distutils per-user config file) is problematic. In truth, a per-user configuration is just a special case of the real need: to have per-application environments. In effect, a per-user environment is a fallback for not having an appplication environment, and the system environment is a fallback for not having a user environment. >About the only (annoying) nit is that the systemwide database needs to >be easily accessable to the Python runtime, and is possibly volatile. >Maybe a symlink in the same path as the actual Python binary on *nix, >and the file located next to the binary on Windows. > >I didn't mention the following because I thought it would be superfluous, >but it seems that I should have stated it right out. My thoughts were >that on startup, Python would first query the 'system' database, caching >its results in a dictionary, then query the user's listing, updating the >dictionary as necessary, then unload the databases. On demand, when >code runs packages.register(), if both persist and systemwide are False, >it just updates the dictionary. If either are true, it opens up and >updates the relevant database. Using a database as the primary mechanism for managing import locations simply isn't workable. You might as well suggest that each environment consist of a single large zipfile containing the packages in question: this would actually be *more* practical (and fast!) 
in terms of Python startup, and is no different from having a database with respect to the need for installation and uninstallation to modify a central file! I'm not proposing we do that -- I'm just pointing out why using an actual database isn't really workable, considering that it has all of the disadvantages of a big zipfile, and none of the advantages (like speed, having code already written that supports it, etc.) >This is easily remedied with a proper 'packages' implementation: > > python -Mpackages name path > >Note that Python could auto-insert standard library and site-packages >'packages' on startup (creating the initial dictionary, then the >systemwide, then the user, ...). I presume here you're suggesting a way to select a runtime environment from the command line, which would certainly be a good idea. > > These are just a few of the issues that come to mind. Realistically > > speaking, .pth files are currently the most effective mechanism we have, > > and there actually isn't much that can be done to improve upon them. > >Except that .pth files are only usable in certain (likely) system paths, >that the user may not have write access to. There have previously been >proposals to add support for .pth files in the path of the run .py file, >but they don't seem to have gotten any support. Setuptools works around this by installing an enhancement for the 'site' module that extends .pth support to include all PYTHONPATH directories. The enhancement delegates to the original site module after recording data about sys.path that the site module destroys at startup. >I believe that most of the concerns that you have brought up can be >addressed, Well, as I said, I've already dealt with them, using .pth files, for the use cases I care about. Ian Bicking and Jim Fulton have also gone farther with work on tools to create environments with greater isolation or more fixed version linkages than what setuptools does. 
(Setuptools-generated environments dynamically select requirements based on available versions at runtime, while Ian and Jim's tools create environments whose inter-package linkages are frozen at installation time.) >and I think that it could be far nicer to deal with than the >current sys.path hackery. I'm not sure of that, since I don't yet know how your approach would deal with namespace packages, which are distributed in pieces and assembled later. For example, many PEAK and Zope distributions live in the peak.* and zope.* package namespaces, but are installed separately, and glued together via __path__ changes (see the pkgutil docs). Thus, if you are talking about a packagename->importer mapping, it has to take into consideration the possibility of multiple import locations for the same package. > The system database location is a bit annoying, >but I lack the *nix experience to say where such a database could or >should be located. This issue is a triviality compared to the more fundamental flaws (or at any rate, holes) in what you're currently proposing. I wouldn't worry about it at all right now. That having been said, I find the discussion stimulating, because I do plan to revisit the environments issue in setuptools 0.7, so who knows what ideas may come up? From fuzzyman at voidspace.org.uk Fri Sep 22 19:43:42 2006 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Fri, 22 Sep 2006 18:43:42 +0100 Subject: [Python-Dev] Suggestion for a new built-in - flatten Message-ID: <451420CE.8070003@voidspace.org.uk> Hello all, I have a suggestion for a new Python built in function: 'flatten'. This would (as if it needs explanation) take a single sequence, where each element can be a sequence (or iterable ?) nested to an arbitrary depth. It would return a flattened list. A useful restriction could be that it wouldn't expand strings :-) I've needed this several times, and recently twice at work. There are several implementations in the Python cookbook. 
When I posted on my blog recently asking for one liners to flatten a list of lists (only 1 level of nesting), I had 26 responses, several of them saying it was a problem they had encountered before. There are also numerous places on the web bewailing the lack of this as a built-in.

All of this points to the fact that it is something that would be appreciated as a built in.

There is an implementation already in Tkinter :

    from _tkinter import _flatten as flatten

There are several different possible approaches in pure Python, but is this an idea that has legs ?

All the best,

Michael Foord
http://www.voidspace.org.uk/python/index.shtml

From theller at python.net Fri Sep 22 20:10:23 2006
From: theller at python.net (Thomas Heller)
Date: Fri, 22 Sep 2006 20:10:23 +0200
Subject: [Python-Dev] Relative import bug?
Message-ID:

Consider a package containing these files:

a/__init__.py
a/b/__init__.py
a/b/x.py
a/b/y.py

If x.py contains this:

"""
from ..b import y
import a.b.x
from ..b import x
"""

Python trunk and Python 2.5 both complain:

Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import a.b.x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "a\b\x.py", line 2, in <module>
    from ..b import x
ImportError: cannot import name x
>>>

A bug?

Thomas

From pje at telecommunity.com Fri Sep 22 20:44:55 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 22 Sep 2006 14:44:55 -0400
Subject: [Python-Dev] Relative import bug?
In-Reply-To:
Message-ID: <5.1.1.6.0.20060922143559.02f03498@sparrow.telecommunity.com>

At 08:10 PM 9/22/2006 +0200, Thomas Heller wrote:
>Consider a package containing these files:
>
>a/__init__.py
>a/b/__init__.py
>a/b/x.py
>a/b/y.py
>
>If x.py contains this:
>
>"""
>from ..b import y
>import a.b.x
>from ..b import x
>"""
>
>Python trunk and Python 2.5 both complain:
>
>Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)]
>on win32
>Type "help", "copyright", "credits" or "license" for more information.
> >>> import a.b.x
>Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
>  File "a\b\x.py", line 2, in <module>
>    from ..b import x
>ImportError: cannot import name x
> >>>
>
>A bug?

If it is, it has nothing to do with relative importing per se. Note that changing it to "from a.b import x" produces the exact same error.

This looks like a "standard" circular import bug. What's happening is that the first import doesn't set "a.b.x = x" until after a.b.x is fully imported. But subsequent "import a.b.x" statements don't set it either, because they are satisfied by finding 'a.b.x' in sys.modules. So, when the 'from ... import x' runs, it tries to get the 'x' attribute of 'a.b' (whether it gets a.b relatively or absolutely), and fails.

If you make the last import be "import a.b.x as x", you'll get a better error message:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "a/b/x.py", line 3, in <module>
    import a.b.x as x
AttributeError: 'module' object has no attribute 'x'

But the entire issue is a bug that exists in Python 2.4, and possibly prior versions as well.
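
The ordering described above (the module is entered in sys.modules before its body runs, but is bound as an attribute of its parent package only after the body finishes) can be checked with a small sketch. This is an illustration only: the package name demo_a and the recorded flags are made up for the demonstration, it builds a throwaway package in a temporary directory, and it assumes a current CPython 3. It checks the ordering only and does not try to reproduce the traceback from the thread.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical package layout, mirroring a/b/x.py from the thread.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_a", "b")
os.makedirs(pkg_dir)
for directory in (os.path.join(root, "demo_a"), pkg_dir):
    with open(os.path.join(directory, "__init__.py"), "w") as f:
        f.write("")

# x.py records what the import machinery has done while its body is running.
with open(os.path.join(pkg_dir, "x.py"), "w") as f:
    f.write(
        "import sys\n"
        "IN_SYS_MODULES = 'demo_a.b.x' in sys.modules\n"
        "BOUND_ON_PARENT = hasattr(sys.modules['demo_a.b'], 'x')\n"
    )

sys.path.insert(0, root)
importlib.invalidate_caches()
x = importlib.import_module("demo_a.b.x")

# After the import completes, the parent package has the attribute.
bound_afterwards = hasattr(sys.modules["demo_a.b"], "x")
print(x.IN_SYS_MODULES, x.BOUND_ON_PARENT, bound_afterwards)
```

While x.py is still executing, an import statement can be satisfied from sys.modules even though attribute lookup on the parent package would fail, which is the gap the explanation above points at.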
From dave at boost-consulting.com Fri Sep 22 20:45:17 2006
From: dave at boost-consulting.com (David Abrahams)
Date: Fri, 22 Sep 2006 14:45:17 -0400
Subject: [Python-Dev] Pep 353: Py_ssize_t advice
Message-ID: <871wq3eo5e.fsf@pereiro.luannocracy.com>

Pep 353 advises the use of this incantation:

  #if PY_VERSION_HEX < 0x02050000
  typedef int Py_ssize_t;
  #define PY_SSIZE_T_MAX INT_MAX
  #define PY_SSIZE_T_MIN INT_MIN
  #endif

I just wanted to point out that this advice could lead to library header collisions when multiple 3rd parties decide to follow it. I suggest it be changed to something like:

  #if PY_VERSION_HEX < 0x02050000 && !defined(PY_SSIZE_T_MIN)
  typedef int Py_ssize_t;
  #define PY_SSIZE_T_MAX INT_MAX
  #define PY_SSIZE_T_MIN INT_MIN
  #endif

(C++ allows restating of typedefs; if C allows it, that should be something like):

  #if PY_VERSION_HEX < 0x02050000
  typedef int Py_ssize_t;
  # if !defined(PY_SSIZE_T_MIN)
  #  define PY_SSIZE_T_MAX INT_MAX
  #  define PY_SSIZE_T_MIN INT_MIN
  # endif
  #endif

You may say that library developers should know better, but I just had an argument with a very bright guy who didn't get it at first.

Thanks, and HTH.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

From skip at pobox.com Fri Sep 22 20:46:15 2006
From: skip at pobox.com (skip at pobox.com)
Date: Fri, 22 Sep 2006 13:46:15 -0500
Subject: [Python-Dev] Suggestion for a new built-in - flatten
In-Reply-To: <451420CE.8070003@voidspace.org.uk>
References: <451420CE.8070003@voidspace.org.uk>
Message-ID: <17684.12151.408448.905468@montanaro.dyndns.org>

    Michael> There are several different possible approaches in pure Python,
    Michael> but is this an idea that has legs ?

Why not add it to itertools? Then, if you need a true list, just call list() on the returned iterator.
Skip From jcarlson at uci.edu Fri Sep 22 20:57:10 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 22 Sep 2006 11:57:10 -0700 Subject: [Python-Dev] Suggestion for a new built-in - flatten In-Reply-To: <451420CE.8070003@voidspace.org.uk> References: <451420CE.8070003@voidspace.org.uk> Message-ID: <20060922114820.0851.JCARLSON@uci.edu> Michael Foord wrote: > > Hello all, > > I have a suggestion for a new Python built in function: 'flatten'. This has been brought up many times. I'm -1 on its inclusion, if only because it's a fairly simple 9-line function (at least the trivial version I came up with), and not all X-line functions should be in the standard library. Also, while I have had need for such a function in the past, I have found that I haven't needed it in a few years. - Josiah From brett at python.org Fri Sep 22 21:01:28 2006 From: brett at python.org (Brett Cannon) Date: Fri, 22 Sep 2006 12:01:28 -0700 Subject: [Python-Dev] Suggestion for a new built-in - flatten In-Reply-To: <17684.12151.408448.905468@montanaro.dyndns.org> References: <451420CE.8070003@voidspace.org.uk> <17684.12151.408448.905468@montanaro.dyndns.org> Message-ID: On 9/22/06, skip at pobox.com wrote: > > > Michael> There are several different possible approaches in pure > Python, > Michael> but is this an idea that has legs ? > > Why not add it to itertools? Then, if you need a true list, just call > list() on the returned iterator. Yeah, this is a better solution. flatten() just doesn't scream "built-in!" to me. -Brett -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20060922/35d33452/attachment.htm From bob at redivi.com Fri Sep 22 21:05:19 2006 From: bob at redivi.com (Bob Ippolito) Date: Fri, 22 Sep 2006 12:05:19 -0700 Subject: [Python-Dev] Suggestion for a new built-in - flatten In-Reply-To: <20060922114820.0851.JCARLSON@uci.edu> References: <451420CE.8070003@voidspace.org.uk> <20060922114820.0851.JCARLSON@uci.edu> Message-ID: <6a36e7290609221205y392c6defy19ea2a004b82725a@mail.gmail.com> On 9/22/06, Josiah Carlson wrote: > > Michael Foord wrote: > > > > Hello all, > > > > I have a suggestion for a new Python built in function: 'flatten'. > > This has been brought up many times. I'm -1 on its inclusion, if only > because it's a fairly simple 9-line function (at least the trivial > version I came up with), and not all X-line functions should be in the > standard library. Also, while I have had need for such a function in > the past, I have found that I haven't needed it in a few years. I think instead of adding a flatten function perhaps we should think about adding something like Erlang's "iolist" support. The idea is that methods like "writelines" should be able to take nested iterators and consume any object they find that implements the buffer protocol. -bob From ferringb at gmail.com Fri Sep 22 21:26:37 2006 From: ferringb at gmail.com (Brian Harring) Date: Fri, 22 Sep 2006 12:26:37 -0700 Subject: [Python-Dev] Suggestion for a new built-in - flatten In-Reply-To: <6a36e7290609221205y392c6defy19ea2a004b82725a@mail.gmail.com> References: <451420CE.8070003@voidspace.org.uk> <20060922114820.0851.JCARLSON@uci.edu> <6a36e7290609221205y392c6defy19ea2a004b82725a@mail.gmail.com> Message-ID: <20060922192637.GA10582@seldon> On Fri, Sep 22, 2006 at 12:05:19PM -0700, Bob Ippolito wrote: > On 9/22/06, Josiah Carlson wrote: > > > > Michael Foord wrote: > > > > > > Hello all, > > > > > > I have a suggestion for a new Python built in function: 'flatten'. 
> > > > This has been brought up many times. I'm -1 on its inclusion, if only > > because it's a fairly simple 9-line function (at least the trivial > > version I came up with), and not all X-line functions should be in the > > standard library. Also, while I have had need for such a function in > > the past, I have found that I haven't needed it in a few years. > > I think instead of adding a flatten function perhaps we should think > about adding something like Erlang's "iolist" support. The idea is > that methods like "writelines" should be able to take nested iterators > and consume any object they find that implements the buffer protocol. Which is no different then just passing in a generator/iterator that does flattening. Don't much see the point in gumming up the file protocol with this special casing; still will have requests for a flattener elsewhere. If flattening was added, should definitely be a general obj, not a special casing in one method in my opinion. ~harring -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20060922/4181f583/attachment.pgp From theller at python.net Fri Sep 22 21:28:24 2006 From: theller at python.net (Thomas Heller) Date: Fri, 22 Sep 2006 21:28:24 +0200 Subject: [Python-Dev] Relative import bug? In-Reply-To: <5.1.1.6.0.20060922143559.02f03498@sparrow.telecommunity.com> References: <5.1.1.6.0.20060922143559.02f03498@sparrow.telecommunity.com> Message-ID: Phillip J. Eby schrieb: > At 08:10 PM 9/22/2006 +0200, Thomas Heller wrote: >>If x.py contains this: >> >>""" >>from ..b import y >>import a.b.x >>from ..b import x >>""" ... >>ImportError: cannot import name x >> >>> >> >>A bug? > > If it is, it has nothing to do with relative importing per se. Note that > changing it to "from a.b import x" produces the exact same error. 
> > This looks like a "standard" circular import bug. Of course. Thanks. Thomas From glyph at divmod.com Fri Sep 22 21:29:28 2006 From: glyph at divmod.com (glyph at divmod.com) Date: Fri, 22 Sep 2006 15:29:28 -0400 Subject: [Python-Dev] Suggestion for a new built-in - flatten In-Reply-To: <451420CE.8070003@voidspace.org.uk> Message-ID: <20060922192928.1717.1975026622.divmod.quotient.57018@ohm> On Fri, 22 Sep 2006 18:43:42 +0100, Michael Foord wrote: >I have a suggestion for a new Python built in function: 'flatten'. This seems superficially like a good idea, but I think adding it to Python anywhere would do a lot more harm than good. I can see that consensus is already strongly against a builtin, but I think it would be bad to add to itertools too. Flattening always *seems* to be a trivial and obvious operation. "I just need something that takes a group of deeply structured data and turns it into a group of shallowly structured data.". Everyone that has this requirement assumes that their list of implicit requirements for "flattening" is the obviously correct one. This wouldn't be a problem except that everyone has a different idea of those requirements:). Here are a few issues. What do you do when you encounter a dict? You can treat it as its keys(), its values(), or its items(). What do you do when you encounter an iterable object? What order do you flatten set()s in? (and, ha ha, do you Set the same?) How are user-defined flattening behaviors registered? Is it a new special method, a registration API? How do you pass information about the flattening in progress to the user-defined behaviors? If you do something special to iterables, do you special-case strings? Why or why not? What do you do if you encounter a function? This is kind of a trick question, since Nevow's "flattener" *calls* functions as it encounters them, then treats the *result* of calling them as further input. If you don't think that functions are special, what about *generator* functions? 
How do you tell the difference? What about functions that return generators but aren't themselves generators? What about functions that return non-generator iterators? What about pre-generated generator objects (if you don't want to treat iterables as special, are generators special?). Do you produce the output as a structured list or an iterator that works incrementally? Also, at least Nevow uses "flatten" to mean "serialize to bytes", not "produce a flat list", and I imagine at least a few other web frameworks do as well. That starts to get into encoding issues. If you make a decision one way or another on any of these questions of policy, you are going to make flatten() useless to a significant portion of its potential userbase. The only difference between having it in the standard library and not is that if it's there, they'll spend an hour being confused by the weird way that it's dealing with rather than just doing the "obvious" thing, and they'll take a minute to write the 10-line function that they need. Without the standard library, they'll skip to step 2 and save a lot of time. I would love to see a unified API that figured out all of these problems, and put them together into a (non-stdlib) library that anyone interested could use for a few years to work the kinks out. Although it might be nice to have a simple "flatten" interface, I don't think that it would ever be simple enough to stick into a builtin; it would just be the default instance of the IncrementalDestructuringProcess class with the most popular (as determined by polling users of the library after a year or so) IncrementalDestructuringTypePolicy. 
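
To make the policy questions above concrete, here is a minimal sketch of one possible flatten. It is not any of the implementations mentioned in the thread; the function name and the atomic parameter are assumptions made for illustration. It hard-codes several of the choices listed above: strings and bytes are treated as atoms, a dict contributes its keys, functions are yielded rather than called, and the result is an incremental iterator rather than a list.

```python
from collections.abc import Iterable


def flatten(items, atomic=(str, bytes)):
    """Yield the leaves of an arbitrarily nested iterable (one possible policy)."""
    for item in items:
        # Policy choice: anything iterable is descended into, except the
        # types named in `atomic`, which are yielded whole.
        if isinstance(item, Iterable) and not isinstance(item, atomic):
            yield from flatten(item, atomic)
        else:
            yield item


result = list(flatten([1, [2, (3, "ab")], {"k": 0}, b"xy"]))
print(result)
```

Changing any one of those choices (for example, yielding dict items instead of keys, or calling functions as they are encountered) gives a different and equally defensible function, which is the point being made above.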
From jcarlson at uci.edu Fri Sep 22 21:42:17 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 22 Sep 2006 12:42:17 -0700 Subject: [Python-Dev] New relative import issue In-Reply-To: <5.1.1.6.0.20060922120555.0270b420@sparrow.telecommunity.com> References: <20060921233257.0848.JCARLSON@uci.edu> <5.1.1.6.0.20060922120555.0270b420@sparrow.telecommunity.com> Message-ID: <20060922112345.084E.JCARLSON@uci.edu> "Phillip J. Eby" wrote: > At 12:08 AM 9/22/2006 -0700, Josiah Carlson wrote: > >"Phillip J. Eby" wrote: > > > At 08:44 PM 9/21/2006 -0700, Josiah Carlson wrote: [snip] > You misunderstood me: I mean that the per-user database must be able to > store information for *different Python versions*. Having a single > per-user database without the ability to include configuration for more > than one Python version (analagous to the current situation with the > distutils per-user config file) is problematic. Just like having different systemwide databases for each Python version makes sense, why wouldn't we have different user databases for each Python version? Something like ~/.python_packages.2.6 and ~/.python_packages.3.0 Also, by separating out the files per Python version, we can also guarantee database compatability for any fixed Python series (2.5.x, etc.). I don't know if the internal organization of SQLite databases changes between revisions in a backwards compatible way, so this may not actually be a concern (it is with bsddb). > In truth, a per-user configuration is just a special case of the real need: > to have per-application environments. In effect, a per-user environment is > a fallback for not having an appplication environment, and the system > environment is a fallback for not having a user environment. I think you are mostly correct. 
The reason you are not completely correct is that if I were to install psyco, and I want all applications that could use it to use it (they guard the psyco import with a try/except), I merely need to register the package in the systemwide (or user) package registery. No need to muck about with each environment I (or my installed applications) have defined, it just works. Is it a "fallback"? Sure, but I prefer to call them "convenient defaults". > >I didn't mention the following because I thought it would be superfluous, > >but it seems that I should have stated it right out. My thoughts were > >that on startup, Python would first query the 'system' database, caching > >its results in a dictionary, then query the user's listing, updating the > >dictionary as necessary, then unload the databases. On demand, when > >code runs packages.register(), if both persist and systemwide are False, > >it just updates the dictionary. If either are true, it opens up and > >updates the relevant database. > > Using a database as the primary mechanism for managing import locations > simply isn't workable. Why? Remember that this database isn't anything other than a persistance mechanism that has pre-built locking semantics for multi-process opening, reading, writing, and closing. Given proper cross-platform locking, we could use any persistance mechanism as a replacement; miniconf, Pickle, marshal; whatever. > You might as well suggest that each environment > consist of a single large zipfile containing the packages in question: this > would actually be *more* practical (and fast!) in terms of Python startup, > and is no different from having a database with respect to the need for > installation and uninstallation to modify a central file! We should remember that the sizes of databases that (I expect) will be common, we are talking about maybe 30k if a user has installed every package in pypi. 
And after the initial query, everything will be stored in a dictionary or dictionary-like object, offering faster query times than even a zip file (though loading the module/package from disk won't have its performance improved). > I'm not proposing we do that -- I'm just pointing out why using an actual > database isn't really workable, considering that it has all of the > disadvantages of a big zipfile, and none of the advantages (like speed, > having code already written that supports it, etc.) SQLite is pretty fast. And for startup, we are really only performing a single query per database "SELECT * FROM package_registry". It will end up reading the entire database, but these databases will be generally small, perhaps a few dozen rows, maybe a few thousand if we have set up a bunch of installation-time application environments. > >This is easily remedied with a proper 'packages' implementation: > > > > python -Mpackages name path > > > >Note that Python could auto-insert standard library and site-packages > >'packages' on startup (creating the initial dictionary, then the > >systemwide, then the user, ...). > > I presume here you're suggesting a way to select a runtime environment from > the command line, which would certainly be a good idea. Actually, I'm offering a way of *registering* a package with the repository from the command line. I'm of the opinion that setting the environment via command line for the subsequent Python runs is a bad idea, but then again, I have been using wxPython's wxversion method for a while to select which wxPython installation I want to use, and find things like: import wxversion wxversion.ensureMinimal('2.6-unicode', optionsRequired=True) To be exactly the amount of control I want, where I want it. Further, a non-command-line mechanism to handle environment would save people from mucking up their Python runtime environment if they forget to switch it back to a 'default'. 
With a package registry (perhaps as I have been describing, perhaps something different), all of the disparate ways of choosing a version of a library during import can be removed in favor of a single mechanism. This single mechanism could handle things like the wxPython 'ensureMinimal', perhaps even 'ensure exact' or 'use latest'. > > > These are just a few of the issues that come to mind. Realistically > > > speaking, .pth files are currently the most effective mechanism we have, > > > and there actually isn't much that can be done to improve upon them. > > > >Except that .pth files are only usable in certain (likely) system paths, > >that the user may not have write access to. There have previously been > >proposals to add support for .pth files in the path of the run .py file, > >but they don't seem to have gotten any support. > > Setuptools works around this by installing an enhancement for the 'site' > module that extends .pth support to include all PYTHONPATH > directories. The enhancement delegates to the original site module after > recording data about sys.path that the site module destroys at startup. But wasn't there a recent discussion describing how keeping persistant environment variables is a PITA both during install and runtime? Extending .pth files to PYTHONPATH seems to me like a hack meant to work around the fact that Python doesn't have a package registry. And really, all of the current sys.path + .pth + PYTHONPATH stuff could be subsumed into a *single* mechanism. I'm of the opinion that the current system of paths, etc., are a bit cumbersome. And I think that we can do better, either with the mechanism I am describing, or otherwise. > >I believe that most of the concerns that you have brought up can be > >addressed, > > Well, as I said, I've already dealt with them, using .pth files, for the > use cases I care about. 
Ian Bicking and Jim Fulton have also gone farther > with work on tools to create environments with greater isolation or more > fixed version linkages than what setuptools does. (Setuptools-generated > environments dynamically select requirements based on available versions at > runtime, while Ian and Jim's tools create environments whose inter-package > linkages are frozen at installation time.) All of these cases could be handled by a properly designed package registry mechanism. > >and I think that it could be far nicer to deal with than the > >current sys.path hackery. > > I'm not sure of that, since I don't yet know how your approach would deal > with namespace packages, which are distributed in pieces and assembled > later. For example, many PEAK and Zope distributions live in the peak.* > and zope.* package namespaces, but are installed separately, and glued > together via __path__ changes (see the pkgutil docs). packages.register('zope', '/path/to/zope') And if the installation path is different: packages.register('zope.subpackage', '/different/path/to/subpackage/') Otherwise the importer will know where the zope (or peak) package exists in the filesystem (or otherwise), and search it whenever 'from zope import ...' is performed. > Thus, if you are talking about a packagename->importer mapping, it has to > take into consideration the possibility of multiple import locations for > the same package. Indeed. But this is not any different than the "multiple import locations for any absolute import" in all Pythons. Only now we don't need to rely on sys.path, .pth, PYTHONPATH, or monkey patching site.py, and we don't need to be adding packages to the root of the absolute import hierarchy: I can add my own package/module to the email package if I want, and I don't even need to bork the system install to do it. 
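
As a minimal sketch of the packages.register() idea above, assuming the registry is nothing more than a mapping from dotted package names to directories: the PackageRegistry class and its locate() method are hypothetical names invented for this illustration, and a real implementation would also have to hook into the import machinery, which is not shown.

```python
import posixpath


class PackageRegistry:
    """Hypothetical name -> path registry; not an existing Python API."""

    def __init__(self):
        self._paths = {}

    def register(self, name, path):
        """Record where the package called `name` lives."""
        self._paths[name] = path

    def locate(self, name):
        """Return the directory to search for `name`, or None if unknown.

        The longest registered prefix wins, so a separately installed
        subpackage overrides the location implied by its parent.
        """
        parts = name.split(".")
        for i in range(len(parts), 0, -1):
            prefix = ".".join(parts[:i])
            if prefix in self._paths:
                # Remaining name components become subdirectories.
                return posixpath.join(self._paths[prefix], *parts[i:])
        return None


packages = PackageRegistry()
packages.register("zope", "/path/to/zope")
packages.register("zope.subpackage", "/different/path/to/subpackage/")

print(packages.locate("zope.interface"))
print(packages.locate("zope.subpackage.mod"))
```

The longest-prefix rule is one possible way to get the behaviour described: 'zope.interface' is found under the registered zope directory, while anything below 'zope.subpackage' is looked up in its own location.
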
- Josiah From bob at redivi.com Fri Sep 22 21:42:16 2006 From: bob at redivi.com (Bob Ippolito) Date: Fri, 22 Sep 2006 12:42:16 -0700 Subject: [Python-Dev] Suggestion for a new built-in - flatten In-Reply-To: <20060922192637.GA10582@seldon> References: <451420CE.8070003@voidspace.org.uk> <20060922114820.0851.JCARLSON@uci.edu> <6a36e7290609221205y392c6defy19ea2a004b82725a@mail.gmail.com> <20060922192637.GA10582@seldon> Message-ID: <6a36e7290609221242j165f23bfq22c0502b7afe9ffa@mail.gmail.com> On 9/22/06, Brian Harring wrote: > On Fri, Sep 22, 2006 at 12:05:19PM -0700, Bob Ippolito wrote: > > On 9/22/06, Josiah Carlson wrote: > > > > > > Michael Foord wrote: > > > > > > > > Hello all, > > > > > > > > I have a suggestion for a new Python built in function: 'flatten'. > > > > > > This has been brought up many times. I'm -1 on its inclusion, if only > > > because it's a fairly simple 9-line function (at least the trivial > > > version I came up with), and not all X-line functions should be in the > > > standard library. Also, while I have had need for such a function in > > > the past, I have found that I haven't needed it in a few years. > > > > I think instead of adding a flatten function perhaps we should think > > about adding something like Erlang's "iolist" support. The idea is > > that methods like "writelines" should be able to take nested iterators > > and consume any object they find that implements the buffer protocol. > > Which is no different then just passing in a generator/iterator that > does flattening. > > Don't much see the point in gumming up the file protocol with this > special casing; still will have requests for a flattener elsewhere. > > If flattening was added, should definitely be a general obj, not a > special casing in one method in my opinion. 
I disagree, the reason for iolist is performance and convenience; the required indirection of having to explicitly call a flattener function removes some optimization potential and makes it less convenient to use. While there certainly should be a general mechanism available to perform the task (easily accessible from C), the user would be better served by not having to explicitly call itertools.iterbuffers every time they want to write recursive iterables of stuff. -bob From fuzzyman at voidspace.org.uk Fri Sep 22 21:55:18 2006 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Fri, 22 Sep 2006 20:55:18 +0100 Subject: [Python-Dev] Suggestion for a new built-in - flatten In-Reply-To: <20060922192928.1717.1975026622.divmod.quotient.57018@ohm> References: <20060922192928.1717.1975026622.divmod.quotient.57018@ohm> Message-ID: <45143FA6.3020600@voidspace.org.uk> glyph at divmod.com wrote: > On Fri, 22 Sep 2006 18:43:42 +0100, Michael Foord wrote: > > >> I have a suggestion for a new Python built in function: 'flatten'. >> > > This seems superficially like a good idea, but I think adding it to Python anywhere would do a lot more harm than good. I can see that consensus is already strongly against a builtin, but I think it would be bad to add to itertools too. > > Flattening always *seems* to be a trivial and obvious operation. "I just need something that takes a group of deeply structured data and turns it into a group of shallowly structured data.". Everyone that has this requirement assumes that their list of implicit requirements for "flattening" is the obviously correct one. > > This wouldn't be a problem except that everyone has a different idea of those requirements:). > > Here are a few issues. > > What do you do when you encounter a dict? You can treat it as its keys(), its values(), or its items(). > > What do you do when you encounter an iterable object? > > What order do you flatten set()s in? (and, ha ha, do you Set the same?) 
> > How are user-defined flattening behaviors registered? Is it a new special method, a registration API? > > How do you pass information about the flattening in progress to the user-defined behaviors? > > If you do something special to iterables, do you special-case strings? Why or why not? > > If you consume iterables, and only special case strings - then none of the issues you raise above seem to be a problem. Sets and dictionaries are both iterable. If it's not iterable it's an element. I'd prefer to see this as a built-in, lots of people seem to want it. IMHO Having it in itertools is a good compromise. > What do you do if you encounter a function? This is kind of a trick question, since Nevow's "flattener" *calls* functions as it encounters them, then treats the *result* of calling them as further input. > Sounds like not what anyone would normally expect. > If you don't think that functions are special, what about *generator* functions? How do you tell the difference? What about functions that return generators but aren't themselves generators? What about functions that return non-generator iterators? What about pre-generated generator objects (if you don't want to treat iterables as special, are generators special?). > > What does the list constructor do with these ? Do the same. > Do you produce the output as a structured list or an iterator that works incrementally? > Either would be fine. I had in mind a list, but converting an iterator into a list is trivial. > Also, at least Nevow uses "flatten" to mean "serialize to bytes", not "produce a flat list", and I imagine at least a few other web frameworks do as well. That starts to get into encoding issues. > > Not a use of the term I've come across. On the other hand I've heard of flatten in the context of nested data-structures many times. 
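
For reference, the policy argued for in the replies above (consume every iterable, treat strings as atoms, treat anything non-iterable as an element) fits in a few lines. This is only a sketch of that one policy in present-day Python, not a proposed standard implementation; treating bytes like strings is an extra assumption made here.

```python
def flatten(obj):
    """Yield the leaves of arbitrarily nested iterables, depth first.

    Strings (and, by assumption, bytes) are treated as atoms. Dictionaries
    are consumed like any other iterable, so they contribute their keys.
    """
    if isinstance(obj, (str, bytes)):
        yield obj
        return
    try:
        iterator = iter(obj)
    except TypeError:
        # Not iterable: it is an element.
        yield obj
        return
    for item in iterator:
        yield from flatten(item)


print(list(flatten([1, [2, [3, "abc"]], (4, 5)])))
```

Every question raised in the quoted objections (dicts, functions, user-defined hooks) is answered here by the single rule "do whatever iter() does", which is exactly the policy choice under dispute. Very deep nesting would hit the recursion limit.
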
> If you make a decision one way or another on any of these questions of policy, you are going to make flatten() useless to a significant portion of its potential userbase. The only difference between having it in the standard library and not is that if it's there, they'll spend an hour being confused by the weird way that it's dealing with rather than just doing the "obvious" thing, and they'll take a minute to write the 10-line function that they need. Without the standard library, they'll skip to step 2 and save a lot of time. > I think that you're over complicating it and that the term flatten is really fairly straightforward. Especially if it's clearly documented in terms of consuming iterables. All the best, Michael Foord http://www.voidspace.org.uk > I would love to see a unified API that figured out all of these problems, and put them together into a (non-stdlib) library that anyone interested could use for a few years to work the kinks out. Although it might be nice to have a simple "flatten" interface, I don't think that it would ever be simple enough to stick into a builtin; it would just be the default instance of the IncrementalDestructuringProcess class with the most popular (as determined by polling users of the library after a year or so) IncrementalDestructuringTypePolicy.
From jcarlson at uci.edu Fri Sep 22 22:17:23 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 22 Sep 2006 13:17:23 -0700 Subject: [Python-Dev] Suggestion for a new built-in - flatten In-Reply-To: <6a36e7290609221242j165f23bfq22c0502b7afe9ffa@mail.gmail.com> References: <20060922192637.GA10582@seldon> <6a36e7290609221242j165f23bfq22c0502b7afe9ffa@mail.gmail.com> Message-ID: <20060922131249.0854.JCARLSON@uci.edu> "Bob Ippolito" wrote: > On 9/22/06, Brian Harring wrote: > > On Fri, Sep 22, 2006 at 12:05:19PM -0700, Bob Ippolito wrote: > > > I think instead of adding a flatten function perhaps we should think > > > about adding something like Erlang's "iolist" support. The idea is > > > that methods like "writelines" should be able to take nested iterators > > > and consume any object they find that implements the buffer protocol. > > > > Which is no different then just passing in a generator/iterator that > > does flattening. > > > > Don't much see the point in gumming up the file protocol with this > > special casing; still will have requests for a flattener elsewhere. > > > > If flattening was added, should definitely be a general obj, not a > > special casing in one method in my opinion. > > I disagree, the reason for iolist is performance and convenience; the > required indirection of having to explicitly call a flattener function > removes some optimization potential and makes it less convenient to > use. Sorry Bob, but I disagree. In the few times where I've needed to 'write a list of buffers to a file handle', I find that iterating over the buffers to be sufficient. And honestly, in all of my time dealing with socket and file IO, I've never needed to write a list of iterators of buffers. Not to say that YAGNI, but I'd like to see an example where 1) it was being used in the wild, and 2) where it would be a measurable speedup.
- Josiah From martin at v.loewis.de Fri Sep 22 22:14:42 2006 From: martin at v.loewis.de ("Martin v. Löwis") Date: Fri, 22 Sep 2006 22:14:42 +0200 Subject: [Python-Dev] Pep 353: Py_ssize_t advice In-Reply-To: <871wq3eo5e.fsf@pereiro.luannocracy.com> References: <871wq3eo5e.fsf@pereiro.luannocracy.com> Message-ID: <45144432.6010304@v.loewis.de> David Abrahams wrote: > #if PY_VERSION_HEX < 0x02050000 > typedef int Py_ssize_t; > #define PY_SSIZE_T_MAX INT_MAX > #define PY_SSIZE_T_MIN INT_MIN > #endif > > I just wanted to point out that this advice could lead to library > header collisions when multiple 3rd parties decide to follow it. I > suggest it be changed to something like: > > #if PY_VERSION_HEX < 0x02050000 && !defined(PY_SSIZE_T_MIN) Strictly speaking, this shouldn't be necessary. C allows redefinition of an object-like macro if the replacement list is identical (for some definition of identical which applies if the fragment is copied literally from the PEP). So I assume you had a non-identical replacement list? Can you share what alternative definition you were using? In any case, I still think this is good practice, so I added it to the PEP. > (C++ allows restating of typedefs; if C allows it, that should be > something like): C also allows this; yet, our advice would be that these three names always get defined together - if that is followed, having a single guard macro should suffice. PY_SSIZE_T_MIN, as you propose, should be sufficient. Regards, Martin From rhettinger at ewtllc.com Fri Sep 22 22:14:58 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Fri, 22 Sep 2006 13:14:58 -0700 Subject: [Python-Dev] Suggestion for a new built-in - flatten In-Reply-To: <451420CE.8070003@voidspace.org.uk> Message-ID: [Michael Foord] >I have a suggestion for a new Python built in function: 'flatten'. > ... > There are several different possible approaches in pure Python, > but is this an idea that has legs ? No legs.
It has been discussed ad nauseam on comp.lang.python. People seem to enjoy writing their own versions of flatten more than finding legitimate use cases that don't already have trivial solutions. A general purpose flattener needs some way to be told what is atomic and what can be further subdivided. Also, it is not obvious how the algorithm should be extended to cover inputs with tree-like data structures with data at nodes as well as the leaves (preorder, postorder, inorder traversal, etc.). I say use your favorite cookbook approach and leave it out of the language. Raymond From martin at v.loewis.de Fri Sep 22 22:21:35 2006 From: martin at v.loewis.de ("Martin v. Löwis") Date: Fri, 22 Sep 2006 22:21:35 +0200 Subject: [Python-Dev] GCC patch for catching errors in PyArg_ParseTuple Message-ID: <451445CF.7080407@v.loewis.de> I wrote a patch for the GCC trunk to add an __attribute__((format(PyArg_ParseTuple, 2, 3))) declaration to functions (this specific declaration should go to PyArg_ParseTuple only). With that patch, parameter types are compared with the string parameter (if that's a literal), and errors are reported if there is a type mismatch (provided -Wformat is given). I'll post more about this patch in the near future, and commit some bug fixes I found with it, but here is the patch, in a publish-early fashion. There is little chance that this can go into GCC (as it is too specific), so it likely needs to be maintained separately. It was written for the current trunk, but hopefully applies to most recent releases. Regards, Martin -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: pyformat.diff Url: http://mail.python.org/pipermail/python-dev/attachments/20060922/588484ab/attachment-0001.diff From pje at telecommunity.com Fri Sep 22 22:25:49 2006 From: pje at telecommunity.com (Phillip J.
Eby) Date: Fri, 22 Sep 2006 16:25:49 -0400 Subject: [Python-Dev] New relative import issue In-Reply-To: <20060922112345.084E.JCARLSON@uci.edu> References: <5.1.1.6.0.20060922120555.0270b420@sparrow.telecommunity.com> <20060921233257.0848.JCARLSON@uci.edu> <5.1.1.6.0.20060922120555.0270b420@sparrow.telecommunity.com> Message-ID: <5.1.1.6.0.20060922160541.028188e8@sparrow.telecommunity.com> At 12:42 PM 9/22/2006 -0700, Josiah Carlson wrote: > > You might as well suggest that each environment > > consist of a single large zipfile containing the packages in question: > this > > would actually be *more* practical (and fast!) in terms of Python startup, > > and is no different from having a database with respect to the need for > > installation and uninstallation to modify a central file! > >We should remember that the sizes of databases that (I expect) will be >common, we are talking about maybe 30k if a user has installed every >package in pypi. And after the initial query, everything will be stored >in a dictionary or dictionary-like object, offering faster query times >than even a zip file Measure it. Be sure to include the time to import SQLite vs. the time to import the zipimport module. >SQLite is pretty fast. And for startup, we are really only performing a >single query per database "SELECT * FROM package_registry". It will end >up reading the entire database, but these databases will be generally >small, perhaps a few dozen rows, maybe a few thousand if we have set up >a bunch of installation-time application environments. Again, seriously, compare this against a zipfile. You'll find that there's absolutely no comparison between reading this and reading a zipfile central directory -- which also results in an in-memory cache that can then be used to seek() directly to the module. >Actually, I'm offering a way of *registering* a package with the >repository from the command line. 
I'm of the opinion that setting the >environment via command line for the subsequent Python runs is a bad >idea, but then again, I have been using wxPython's wxversion method for >a while to select which wxPython installation I want to use, and find >things like: > > import wxversion > wxversion.ensureMinimal('2.6-unicode', optionsRequired=True) > >To be exactly the amount of control I want, where I want it. Well, that's already easy to do for arbitrary packages and arbitrary versions with setuptools. Eggs installed in "multi-version" mode are added to sys.path at runtime if/when they are requested. >With a package registry (perhaps as I have been describing, perhaps >something different), all of the disparate ways of choosing a version of >a library during import can be removed in favor of a single mechanism. >This single mechanism could handle things like the wxPython >'ensureMinimal', perhaps even 'ensure exact' or 'use latest'. This discussion is mostly making me realize that sys.path is exactly the right thing to have, and that the only thing that actually need fixing is universal .pth support, and maybe some utility functions for better sys.path manipulation within .pth files. I suggest that there is no way an arbitrary "registry" implementation is going to be faster than reading lines from a text file. > > Setuptools works around this by installing an enhancement for the 'site' > > module that extends .pth support to include all PYTHONPATH > > directories. The enhancement delegates to the original site module after > > recording data about sys.path that the site module destroys at startup. > >But wasn't there a recent discussion describing how keeping persistant >environment variables is a PITA both during install and runtime? Yes, exactly. >Extending .pth files to PYTHONPATH seems to me like a hack meant to work >around the fact that Python doesn't have a package registry. 
And really, >all of the current sys.path + .pth + PYTHONPATH stuff could be subsumed >into a *single* mechanism. Sure -- I suggest that the single mechanism is none other than *sys.path*. The .pth files, PYTHONPATH, and a new command-line option merely being ways to set it. All of the discussion that's taken place here has sufficed at this point to convince me that sys.path isn't broken at all, and doesn't need fixing. Some tweaks to 'site' and maybe a new command-line option will suffice to clean everything up quite nicely. I say this because all of the version and dependency management things that people are asking about can already be achieved by setuptools, so clearly the underlying machinery is fine. It wasn't until this message of yours that I realized that you are trying to solve a bunch of problems that are quite solvable within the existing machinery. I was mainly interested in cleaning up the final awkwardness that's effectively caused by lack of .pth support for the startup script directory. > > I'm not sure of that, since I don't yet know how your approach would deal > > with namespace packages, which are distributed in pieces and assembled > > later. For example, many PEAK and Zope distributions live in the peak.* > > and zope.* package namespaces, but are installed separately, and glued > > together via __path__ changes (see the pkgutil docs). > > packages.register('zope', '/path/to/zope') > >And if the installation path is different: > > packages.register('zope.subpackage', '/different/path/to/subpackage/') > >Otherwise the importer will know where the zope (or peak) package exists >in the filesystem (or otherwise), and search it whenever 'from zope >import ...' is performed. If you're talking about replacing the current import machinery, you would have to leave this to Py3K, otherwise all you've done is add a *new* import hook, i.e. a "sys.package_loaders" dictionary or some such. 
If you wanted something like that now, of course, you could slap an importer into sys.meta_path that then did a lookup in sys.package_loaders. Getting this mechanism bootstrapped, however, is left as an exercise for the reader. ;) Note, by the way, that it might be quite possible to do away with everything but sys.meta_path in Py3K, prepopulated with such an importer (along with ones to support builtin and frozen modules). You could then import a backward-compatibility module that would add support for sys.path and for package __path__ attributes, by adding a new entry to sys.meta_path. But this is strictly a pipe dream where Python 2.x is concerned. From bob at redivi.com Fri Sep 22 22:34:23 2006 From: bob at redivi.com (Bob Ippolito) Date: Fri, 22 Sep 2006 13:34:23 -0700 Subject: [Python-Dev] Suggestion for a new built-in - flatten In-Reply-To: <20060922131249.0854.JCARLSON@uci.edu> References: <20060922192637.GA10582@seldon> <6a36e7290609221242j165f23bfq22c0502b7afe9ffa@mail.gmail.com> <20060922131249.0854.JCARLSON@uci.edu> Message-ID: <6a36e7290609221334q7ec72a5cu5000347ee13248fa@mail.gmail.com> On 9/22/06, Josiah Carlson wrote: > > "Bob Ippolito" wrote: > > On 9/22/06, Brian Harring wrote: > > > On Fri, Sep 22, 2006 at 12:05:19PM -0700, Bob Ippolito wrote: > > > > I think instead of adding a flatten function perhaps we should think > > > > about adding something like Erlang's "iolist" support. The idea is > > > > that methods like "writelines" should be able to take nested iterators > > > > and consume any object they find that implements the buffer protocol. > > > > > > Which is no different then just passing in a generator/iterator that > > > does flattening. > > > > > > Don't much see the point in gumming up the file protocol with this > > > special casing; still will have requests for a flattener elsewhere. > > > > > > If flattening was added, should definitely be a general obj, not a > > > special casing in one method in my opinion. 
> > > > I disagree, the reason for iolist is performance and convenience; the > > required indirection of having to explicitly call a flattener function > > removes some optimization potential and makes it less convenient to > > use. > > Sorry Bob, but I disagree. In the few times where I've needed to 'write > a list of buffers to a file handle', I find that iterating over the > buffers to be sufficient. And honestly, in all of my time dealing > with socket and file IO, I've never needed to write a list of iterators > of buffers. Not to say that YAGNI, but I'd like to see an example where > 1) it was being used in the wild, and 2) where it would be a measurable > speedup. The primary use for this is structured data, mostly file formats, where you can't write the beginning until you have a bunch of information about the entire structure such as the number of items or the count of bytes when serialized. An efficient way to do that is just to build a bunch of nested lists that you can use to calculate the size (iolist_size(...) in Erlang) instead of having to write a visitor that constructs a new flat list or writes to StringIO first. I suppose in the most common case, for performance reasons, you would want to restrict this to sequences only (as in PySequence_Fast) because iolist_size(...) should be non-destructive (or else it has to flatten into a new list anyway). I've definitely done this before in Python, most recently here: http://svn.red-bean.com/bob/flashticle/trunk/flashticle/ The flatten function in this case is flashticle.util.iter_only, and it's used in flashticle.actions, flashticle.amf, flashticle.flv, flashticle.swf, and flashticle.remoting. -bob From dave at boost-consulting.com Sat Sep 23 00:17:13 2006 From: dave at boost-consulting.com (David Abrahams) Date: Fri, 22 Sep 2006 18:17:13 -0400 Subject: [Python-Dev] Pep 353: Py_ssize_t advice In-Reply-To: <45144432.6010304@v.loewis.de> (Martin v. 
Löwis's message of "Fri, 22 Sep 2006 22:14:42 +0200") References: <871wq3eo5e.fsf@pereiro.luannocracy.com> <45144432.6010304@v.loewis.de> Message-ID: <87k63vzguu.fsf@pereiro.luannocracy.com> "Martin v. Löwis" writes: > David Abrahams wrote: >> #if PY_VERSION_HEX < 0x02050000 >> typedef int Py_ssize_t; >> #define PY_SSIZE_T_MAX INT_MAX >> #define PY_SSIZE_T_MIN INT_MIN >> #endif >> >> I just wanted to point out that this advice could lead to library >> header collisions when multiple 3rd parties decide to follow it. I >> suggest it be changed to something like: >> >> #if PY_VERSION_HEX < 0x02050000 && !defined(PY_SSIZE_T_MIN) > > Strictly speaking, this shouldn't be necessary. C allows redefinition > of an object-like macro if the replacement list is identical (for > some definition of identical which applies if the fragment is > copied literally from the PEP). > > So I assume you had non-identical replacement list? No: a. I didn't actually experience a collision; I only anticipated it b. We were using C++, which IIRC does not allow such redefinition c. anyway you'll get a nasty warning, which for some people will be just as bad as an error > Can you share what alternative definition you were using? > > In any case, I still think this is good practice, so I added it > to the PEP. Thanks, -- Dave Abrahams Boost Consulting www.boost-consulting.com From rasky at develer.com Sat Sep 23 00:42:33 2006 From: rasky at develer.com (Giovanni Bajo) Date: Sat, 23 Sep 2006 00:42:33 +0200 Subject: [Python-Dev] GCC patch for catching errors in PyArg_ParseTuple References: <451445CF.7080407@v.loewis.de> Message-ID: <00c401c6de98$64cefd80$4bbd2997@bagio> Martin v. Löwis wrote: >> I'll post more about this patch in the near future, and commit >> some bug fixes I found with it, but here is the patch, in >> a publish-early fashion.
>> >> There is little chance that this can go into GCC (as it is too >> specific), so it likely needs to be maintained separately. >> It was written for the current trunk, but hopefully applies >> to most recent releases. A way not to maintain this patch forever would be to devise a way to make format syntax "pluggable" / "scriptable". There have been previous discussions on the GCC mailing lists. Giovanni Bajo From typo_pl at hotmail.com Sat Sep 23 00:46:32 2006 From: typo_pl at hotmail.com (Johnny Lee) Date: Fri, 22 Sep 2006 22:46:32 +0000 Subject: [Python-Dev] Typo.pl scan of Python 2.5 source code Message-ID: Hello, My name is Johnny Lee. I have developed a *ahem* perl script which scans C/C++ source files for typos. I ran the typo.pl script on the released Python 2.5 source code. The scan took about two minutes and produced ~340 typos. After spending about 13 minutes weeding out the obvious false positives, 149 typos remain. One of the pros/cons of the script is that it doesn't need to be integrated into the build process to work. It just searches for files with typical C/C++ source code file extensions and scans them. The downside is if the source file is not included in the build process, then the script is scanning an irrelevant file. Unless you aid the script via some parameters, it will scan all the code, even stuff inside #ifdef's that wouldn't normally be compiled. You can access the list of typos from The Perl 1999 paper can be read at I've mapped the Python memory-related calls PyMem_Alloc, PyMem_Realloc, etc. to the same behaviour as the C std library malloc, realloc, etc. since Include\pymem.h seems to map them to those calls. If that assumption is not valid, then you can ignore typos that involve those PyMem_XXX calls. The Python 2.5 typos can be classified into 7 types. 1) if (X = 0): Assignment within an if statement.
Typically a false positive, but sometimes it catches something valid. In Python's case, the one typo is: if (status = ERROR_MORE_DATA) but the previous code statement returns an error code into the status variable. 2) realloc overwrite src if NULL, i.e. p = realloc(p, new_size); If realloc() fails, it will return NULL. If you assign the return value to the same variable you passed into realloc, then you've overwritten the variable and possibly leaked the memory that the variable pointed to. 3) if (CreateFileMapping == IHV): On Win32, the CreateFileMapping() API will return NULL on failure, not INVALID_HANDLE_VALUE. The Python code does not check for NULL though. 4) if ((X!=0) || (X!=1)): The problem with code of this type is that it doesn't work. In the Python case, we have in a large if statement: quotetabs && ((data[in]!='\t')||(data[in]!=' ')) Now if data[in] == '\t', then it will fail the first data[in] comparison but it will pass the second data[in] comparison. Typically you want "&&" not "||". 5) using API result w/no check: There are several APIs that should be checked for success before using the returned ptrs/cookies, i.e. malloc, realloc, and fopen among others. 6) XX;; Just being anal here. Two semicolons in a row. Second one is extraneous. 7) extraneous test for non-NULL ptr: Several memory calls that free memory accept NULL ptrs. So testing for NULL before calling them is redundant and wastes code space. Now some codepaths may be time-critical, but probably not all, and smaller code usually helps. If you have any questions, comments, feel free to email. I hope this scan is useful. Thanks for your time, J -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060922/93285d97/attachment.htm From jcarlson at uci.edu Sat Sep 23 02:03:45 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 22 Sep 2006 17:03:45 -0700 Subject: [Python-Dev] New relative import issue In-Reply-To: <5.1.1.6.0.20060922160541.028188e8@sparrow.telecommunity.com> References: <20060922112345.084E.JCARLSON@uci.edu> <5.1.1.6.0.20060922160541.028188e8@sparrow.telecommunity.com> Message-ID: <20060922134229.0857.JCARLSON@uci.edu> "Phillip J. Eby" wrote: > At 12:42 PM 9/22/2006 -0700, Josiah Carlson wrote: [snip] > Measure it. Be sure to include the time to import SQLite vs. the time to > import the zipimport module. [snip] > Again, seriously, compare this against a zipfile. You'll find that there's > absolutely no comparison between reading this and reading a zipfile central > directory -- which also results in an in-memory cache that can then be used > to seek() directly to the module. They are not directly comparable. The registry of packages can do more than zipimport in terms of package naming and hierarchy, but it's not an importer; it's a conceptual replacement of sys.path. I have already stated that the actual imports from this registry won't be any faster, as it will still need to read modules/packages from disk *after* it has decided on a list of paths to check for the package/module. Further, whether we use SQLite, or any one of a number of other persistence mechanisms, such a choice should depend on a few things (speed being one of them, though maybe not the *only* consideration). Perhaps even a zip file whose 'files' are named with the desired package hierarchy, and whose contents are something like: import imp globals().update(imp.load_XXX(...).__dict__) del imp > >Actually, I'm offering a way of *registering* a package with the > >repository from the command line.
I'm of the opinion that setting the > >environment via command line for the subsequent Python runs is a bad > >idea, but then again, I have been using wxPython's wxversion method for > >a while to select which wxPython installation I want to use, and find > >things like: > > > > import wxversion > > wxversion.ensureMinimal('2.6-unicode', optionsRequired=True) > > > >To be exactly the amount of control I want, where I want it. > > Well, that's already easy to do for arbitrary packages and arbitrary > versions with setuptools. Eggs installed in "multi-version" mode are added > to sys.path at runtime if/when they are requested. Why do we have to use eggs or setuptools to get a feature that *arguably* should have existed a decade ago in core Python? The core functionality I'm talking about is: packages.register(name, path, env=None, system=False, persist=False) #system==True implies persist==True packages.copy_env(fr_env, to_env) packages.use_env(env) packages.check(name, version=None) packages.use(name, version) With those 5 functions and a few tricks, we can replace all user-level .pth and PYTHONPATH use, and sys.path manipulation done in other 3rd party packages (setuptools, etc.) are easily handled and supported. > >With a package registry (perhaps as I have been describing, perhaps > >something different), all of the disparate ways of choosing a version of > >a library during import can be removed in favor of a single mechanism. > >This single mechanism could handle things like the wxPython > >'ensureMinimal', perhaps even 'ensure exact' or 'use latest'. > > This discussion is mostly making me realize that sys.path is exactly the > right thing to have, and that the only thing that actually need fixing is > universal .pth support, and maybe some utility functions for better > sys.path manipulation within .pth files. I suggest that there is no way an > arbitrary "registry" implementation is going to be faster than reading > lines from a text file. 
>
> > > Setuptools works around this by installing an enhancement for the 'site'
> > > module that extends .pth support to include all PYTHONPATH
> > > directories. The enhancement delegates to the original site module after
> > > recording data about sys.path that the site module destroys at startup.
> >
> >But wasn't there a recent discussion describing how keeping persistent
> >environment variables is a PITA both during install and runtime?
>
> Yes, exactly.

You have confused me, because not only have you just said "we use
PYTHONPATH as a solution", but you have just acknowledged that using
PYTHONPATH is not reasonable as a solution. You have also just said
that we need to add features to .pth support so that it is more usable.

So, sys.path "is exactly the right thing to have", but we need to add
more features to make it better.

Ok, here's a sample .pth file if we are willing to make it better (in my
opinion):

    zope,/path/to/zope,3.2.1,netserver
    zope.subpackage,/path/to/subpackage,.1.1,netserver

That's a CSV file with rows defining packages, and columns in order:
package name, path to package, version, and a semicolon-separated list
of environments that this package is available in (a leading semicolon,
or a double semicolon says that it is available when no environment is
specified).

With a base sys.path, a dictionary of environment -> packages created
from .pth files, and a simple function, one can generally develop an
applicable sys.path on demand to some choose_environment() call.

This is, effectively, a variant of what I was suggesting, only with a
different persistence representation.

> >Extending .pth files to PYTHONPATH seems to me like a hack meant to work
> >around the fact that Python doesn't have a package registry. And really,
> >all of the current sys.path + .pth + PYTHONPATH stuff could be subsumed
> >into a *single* mechanism.
>
> Sure -- I suggest that the single mechanism is none other than
> *sys.path*.
> The .pth files, PYTHONPATH, and a new command-line option
> merely being ways to set it.

I guess we disagree on what is meant by "single" in this context.

> All of the discussion that's taken place here has sufficed at this point to
> convince me that sys.path isn't broken at all, and doesn't need
> fixing. Some tweaks to 'site' and maybe a new command-line option will
> suffice to clean everything up quite nicely.
>
> I say this because all of the version and dependency management things that
> people are asking about can already be achieved by setuptools, so clearly
> the underlying machinery is fine. It wasn't until this message of yours
> that I realized that you are trying to solve a bunch of problems that are
> quite solvable within the existing machinery. I was mainly interested in
> cleaning up the final awkwardness that's effectively caused by lack of .pth
> support for the startup script directory.

Indeed, everything is solvable within the existing machinery. But it's
not a question of solvable, it's a question of whether we can make
things better. When I have had the occasion to use .pth files, I've
been somewhat disappointed. Given even the few functions I've defined
for an API, or the .pth variant I described, I know I wouldn't be
disappointed in trying to set up independent package version
installations, application environments, etc. They all come fairly
naturally.

> > > I'm not sure of that, since I don't yet know how your approach would deal
> > > with namespace packages, which are distributed in pieces and assembled
> > > later. For example, many PEAK and Zope distributions live in the peak.*
> > > and zope.* package namespaces, but are installed separately, and glued
> > > together via __path__ changes (see the pkgutil docs).
> >
> >     packages.register('zope', '/path/to/zope')
> >
> >And if the installation path is different:
> >
> >     packages.register('zope.subpackage', '/different/path/to/subpackage/')
> >
> >Otherwise the importer will know where the zope (or peak) package exists
> >in the filesystem (or otherwise), and search it whenever 'from zope
> >import ...' is performed.
>
> If you're talking about replacing the current import machinery, you would
> have to leave this to Py3K, otherwise all you've done is add a *new* import
> hook, i.e. a "sys.package_loaders" dictionary or some such.

It could coexist happily next to sys.path-based machinery, and it is
likely easier for it to do so (replacing the sys.path bits in the core
language is more work than I would be willing to do).

> If you wanted something like that now, of course, you could slap an
> importer into sys.meta_path that then did a lookup in
> sys.package_loaders. Getting this mechanism bootstrapped, however, is left
> as an exercise for the reader. ;)

I just about cry every time I think about adding an import hook. If
others think that this functionality has legs to stand on, I may just
have to get help from experienced users.

> Note, by the way, that it might be quite possible to do away with
> everything but sys.meta_path in Py3K, prepopulated with such an importer
> (along with ones to support builtin and frozen modules). You could then
> import a backward-compatibility module that would add support for sys.path
> and for package __path__ attributes, by adding a new entry to
> sys.meta_path. But this is strictly a pipe dream where Python 2.x is
> concerned.

Indeed, actually removing sys.path from 2.x is a non-starter. But
replacing user-level modifications of sys.path with calls to a registry?
That seems possible, if not desirable, from a "let us not monkey patch
the Python runtime" perspective.
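
For illustration only, the registry idea discussed above could be
sketched roughly as follows. This is a minimal sketch under stated
assumptions: the class name PackageRegistry, its methods and the
'netserver' environment are hypothetical and do not exist in Python or
setuptools. It only computes a candidate search path and does not hook
into the import machinery at all.

```python
class PackageRegistry:
    """Hypothetical in-memory registry: environment -> {package name: path}."""

    def __init__(self):
        # The key None stands for "no environment specified".
        self._envs = {None: {}}

    def register(self, name, path, env=None):
        """Record where a package lives, optionally only for one environment."""
        self._envs.setdefault(env, {})[name] = path

    def copy_env(self, fr_env, to_env):
        """Start a new environment as a copy of an existing one."""
        self._envs[to_env] = dict(self._envs[fr_env])

    def search_path(self, env=None, base=()):
        """Return base entries, then default-environment paths, then env paths."""
        paths = list(base)
        keys = (None,) if env is None else (None, env)
        for key in keys:
            packages = self._envs.get(key, {})
            for name in sorted(packages):
                if packages[name] not in paths:
                    paths.append(packages[name])
        return paths


registry = PackageRegistry()
registry.register('zope', '/path/to/zope')
registry.register('zope.subpackage', '/different/path/to/subpackage/',
                  env='netserver')
candidate_path = registry.search_path('netserver', base=['/usr/lib/python'])
```

The resulting list could then be assigned to sys.path by whatever
use_env()-style call is chosen; persistence (SQLite, CSV, or otherwise)
is deliberately left out of the sketch.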
- Josiah From glyph at divmod.com Sat Sep 23 02:35:04 2006 From: glyph at divmod.com (glyph at divmod.com) Date: Fri, 22 Sep 2006 20:35:04 -0400 Subject: [Python-Dev] Suggestion for a new built-in - flatten In-Reply-To: <45143FA6.3020600@voidspace.org.uk> Message-ID: <20060923003504.1717.1400241516.divmod.quotient.57242@ohm> On Fri, 22 Sep 2006 20:55:18 +0100, Michael Foord wrote: >glyph at divmod.com wrote: >>On Fri, 22 Sep 2006 18:43:42 +0100, Michael Foord >> wrote: >>This wouldn't be a problem except that everyone has a different idea of >>those requirements:). You didn't really address this, and it was my main point. In fact, you more or less made my point for me. You just assume that the type of application you have in mind right now is the only one that wants to use a flatten function, and dismiss out of hand any uses that I might have in mind. >If you consume iterables, and only special case strings - then none of the >issues you raise above seem to be a problem. You have just made two major policy decisions about the flattener without presenting a specific use case or set of use cases it is meant to be restricted to. For example, you suggest special casing strings. Why? Your guideline otherwise is to follow what the iter() or list() functions do. What about user-defined classes which subclass str and implement __iter__? >Sets and dictionaries are both iterable. > >If it's not iterable it's an element. > >I'd prefer to see this as a built-in, lots of people seem to want it. IMHO Can you give specific examples? The only significant use of a flattener I'm intimately familiar with (Nevow) works absolutely nothing like what you described. >Having it in itertools is a good compromise. No need to compromise with me. I am not in a position to reject your change. 
No particular reason for me to make any concessions either: I'm simply trying to communicate the fact that I think this is a terrible idea, not come to an agreement with you about how progress might be made. Absolutely no changes on this front are A-OK by me :).

You have made a case for the fact that, perhaps, you should have a utility library which all your projects could use for consistency and to avoid repeating yourself, since you have a clearly defined need for what a flattener should do. I haven't read anything that indicates there's a good reason for this function to be in the standard library. What are the use cases?

It's definitely better for the core language to define lots of basic types so that you can say something in a library like "returns a dict mapping strings to ints" without having a huge argument about what "dict" and "string" and "int" mean. What's the benefit to having everyone flatten things the same way, though? Flattening really isn't that common of an operation, and in the cases where it's needed, a unified approach would only help if you had two flattenable data-structures from different libraries which needed to be combined. I can't say I've ever seen a case where that would happen, let alone for it to be common enough that there should be something in the core language to support it.

>>What do you do if you encounter a function? This is kind of a trick
>>question, since Nevow's "flattener" *calls* functions as it encounters
>>them, then treats the *result* of calling them as further input.
>>
>Sounds like not what anyone would normally expect.

Of course not. My point is that there is nothing that anyone would "normally" expect from a flattener except a few basic common features. Bob's use-case is completely different from yours, for example: he's talking about flattening to support high-performance I/O.

>What does the list constructor do with these ? Do the same.
>>> list('hello') ['h', 'e', 'l', 'l', 'o'] What more can I say? >>Do you produce the output as a structured list or an iterator that works >>incrementally? >Either would be fine. I had in mind a list, but converting an iterator into >a list is trivial. There are applications where this makes a big difference. Bob, for example, suggested that this should only work on structures that support the PySequence_Fast operations. >>Also, at least Nevow uses "flatten" to mean "serialize to bytes", not >>"produce a flat list", and I imagine at least a few other web frameworks do >>as well. That starts to get into encoding issues. >Not a use of the term I've come across. On the other hand I've heard of >flatten in the context of nested data-structures many times. Nevertheless the only respondent even mildly in favor of your proposal so far also mentions flattening sequences of bytes, although not quite as directly. >I think that you're over complicating it and that the term flatten is really >fairly straightforward. Especially if it's clearly documented in terms of >consuming iterables. And I think that you're over-simplifying. If you can demonstrate that there is really a broad consensus that this sort of thing is useful in a wide variety of applications, then sure, I wouldn't complain too much. But I've spent a LOT of time thinking about what "flattening" is, and several applications that I've worked on have very different ideas about how it should work, and I see very little benefit to unifying them. That's just the work of one programmer; I have to assume that the broader domain of all applications which do structure flattening is much more diverse. From greg.ewing at canterbury.ac.nz Sat Sep 23 03:34:56 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 23 Sep 2006 13:34:56 +1200 Subject: [Python-Dev] list.discard? 
(Re: dict.discard) In-Reply-To: <17683.21448.909748.200493@montanaro.dyndns.org> References: <20060921134249.GA9238@niemeyer.net> <45132C6D.9010806@canterbury.ac.nz> <17683.21448.909748.200493@montanaro.dyndns.org> Message-ID: <45148F40.40802@canterbury.ac.nz> skip at pobox.com wrote: > It's obvious for sets and dictionaries that there is only one thing to > discard and that after the operation you're guaranteed the key no longer > exists. Would you want the same semantics for lists or the semantics of > list.remove where it only removes the first instance? In my use cases I usually know that there is either zero or one occurrences in the list. But maybe it would be more useful to have a remove_all() method, whose behaviour with zero occurrences would just be a special case. Or maybe remove() should just do nothing if the item is not found. I don't think I've ever found getting an exception from it to be useful, and I've often found it a nuisance. What experiences have others had with it? -- Greg From nnorwitz at gmail.com Sat Sep 23 06:51:38 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Fri, 22 Sep 2006 21:51:38 -0700 Subject: [Python-Dev] Typo.pl scan of Python 2.5 source code In-Reply-To: References: Message-ID: On 9/22/06, Johnny Lee wrote: > > Hello, > My name is Johnny Lee. I have developed a *ahem* perl script which scans > C/C++ source files for typos. Hi Johnny. Thanks for running your script, even if it is written in Perl and ran on Windows. :-) > The Python 2.5 typos can be classified into 7 types. > > 2) realloc overwrite src if NULL, i.e. p = realloc(p, new_size); > If realloc() fails, it will return NULL. If you assign the return value to > the same variable you passed into realloc, > then you've overwritten the variable and possibly leaked the memory that the > variable pointed to. A bunch of these warnings were accurate and a bunch were not. There were 2 reasons for the false positives. 
1) The pointer was aliased, thus not lost,
2) On failure, we exited (Parser/*.c)

> 4) if ((X!=0) || (X!=1))

These 2 cases occurred in binascii. I have no idea if the warning is
right or the code is.

> 6) XX;;
> Just being anal here. Two semicolons in a row. Second one is extraneous.

I already checked in a fix for these on HEAD. Hard for even me to
screw up those fixes. :-)

> 7) extraneous test for non-NULL ptr
> Several memory calls that free memory accept NULL ptrs.
> So testing for NULL before calling them is redundant and wastes code space.
> Now some codepaths may be time-critical, but probably not all, and smaller
> code usually helps.

I ignored these as I'm not certain all the platforms we run on accept
free(NULL).

Below is my categorization of the warnings except #7. Hopefully
someone will fix all the real problems in the first batch.

Thanks again!

n
--

# Problems
Objects\fileobject.c (338): realloc overwrite src if NULL; 17:
file->f_setbuf=(char*)PyMem_Realloc(file->f_setbuf,bufsize)
Objects\fileobject.c (342): using PyMem_Realloc result w/no check 30:
setvbuf(file->f_fp, file->f_setbuf, type, bufsize); [file->f_setbuf]
Objects\listobject.c (2619): using PyMem_MALLOC result w/no check 30:
garbage[i] = selfitems[cur]; [garbage]
Parser\myreadline.c (144): realloc overwrite src if NULL; 17:
p=(char*)PyMem_REALLOC(p,n+incr)
Modules\_csv.c (564): realloc overwrite src if NULL; 17:
self->field=PyMem_Realloc(self->field,self->field_size)
Modules\_localemodule.c (366): realloc overwrite src if NULL; 17:
buf=PyMem_Realloc(buf,n2)
Modules\_randommodule.c (290): realloc overwrite src if NULL; 17:
key=(unsigned#long*)PyMem_Realloc(key,bigger*sizeof(*key))
Modules\arraymodule.c (1675): realloc overwrite src if NULL; 17:
self->ob_item=(char*)PyMem_REALLOC(self->ob_item,itemsize*self->ob_size)
Modules\cPickle.c (536): realloc overwrite src if NULL; 17:
self->buf=(char*)realloc(self->buf,n)
Modules\cPickle.c (592): realloc overwrite src if NULL; 17:
self->buf=(char*)realloc(self->buf,bigger) Modules\cPickle.c (4369): realloc overwrite src if NULL; 17: self->marks=(int*)realloc(self->marks,s*sizeof(int)) Modules\cStringIO.c (344): realloc overwrite src if NULL; 17: self->buf=(char*)realloc(self->buf,self->buf_size) Modules\cStringIO.c (380): realloc overwrite src if NULL; 17: oself->buf=(char*)realloc(oself->buf,oself->buf_size) Modules\_ctypes\_ctypes.c (2209): using PyMem_Malloc result w/no check 30: memset(obj->b_ptr, 0, dict->size); [obj->b_ptr] Modules\_ctypes\callproc.c (1472): using PyMem_Malloc result w/no check 30: strcpy(conversion_mode_encoding, coding); [conversion_mode_encoding] Modules\_ctypes\callproc.c (1478): using PyMem_Malloc result w/no check 30: strcpy(conversion_mode_errors, mode); [conversion_mode_errors] Modules\_ctypes\stgdict.c (362): using PyMem_Malloc result w/no check 30: memset(stgdict->ffi_type_pointer.elements, 0, [stgdict->ffi_type_pointer.elements] Modules\_ctypes\stgdict.c (376): using PyMem_Malloc result w/no check 30: memset(stgdict->ffi_type_pointer.elements, 0, [stgdict->ffi_type_pointer.elements] # No idea if the code or tool is right. Modules\binascii.c (1161) Modules\binascii.c (1231) # Platform specific files. I didn't review and won't fix without testing. Python\thread_lwp.h (107): using malloc result w/no check 30: lock->lock_locked = 0; [lock] Python\thread_os2.h (141): using malloc result w/no check 30: (long)sem)); [sem] Python\thread_os2.h (155): using malloc result w/no check 30: lock->is_set = 0; [lock] Python\thread_pth.h (133): using malloc result w/no check 30: memset((void *)lock, '\0', sizeof(pth_lock)); [lock] Python\thread_solaris.h (48): using malloc result w/no check 30: funcarg->func = func; [funcarg] Python\thread_solaris.h (133): using malloc result w/no check 30: if(mutex_init(lock,USYNC_THREAD,0)) [lock] # Who cares about these modules. Modules\almodule.c:182 Modules\svmodule.c:547 # Not a problem. 
Parser\firstsets.c (76) Parser\grammar.c (40) Parser\grammar.c (59) Parser\grammar.c (83) Parser\grammar.c (102) Parser\node.c (95) Parser\pgen.c (52) Parser\pgen.c (69) Parser\pgen.c (126) Parser\pgen.c (438) Parser\pgen.c (462) Parser\tokenizer.c (797) Parser\tokenizer.c (869) Modules\_bsddb.c (2633) Modules\_csv.c (1069) Modules\arraymodule.c (1871) Modules\gcmodule.c (1363) Modules\zlib\trees.c (375) From martin at v.loewis.de Sat Sep 23 07:27:20 2006 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sat, 23 Sep 2006 07:27:20 +0200 Subject: [Python-Dev] Pep 353: Py_ssize_t advice In-Reply-To: <87k63vzguu.fsf@pereiro.luannocracy.com> References: <871wq3eo5e.fsf@pereiro.luannocracy.com> <45144432.6010304@v.loewis.de> <87k63vzguu.fsf@pereiro.luannocracy.com> Message-ID: <4514C5B8.1070903@v.loewis.de> David Abrahams schrieb: > b. We were using C++, which IIRC does not allow such redefinition You remember incorrectly. 16.3/2 (cpp.replace) says # An identifier currently defined as a macro without use of lparen (an # object-like macro) may be redefined by another #define preprocessing # directive provided that the second definition is an object-like macro # definition and the two replacement lists are identical, otherwise the # program is ill-formed. > c. anyway you'll get a nasty warning, which for some people will be > just as bad as an error Try for yourself. You get the warning only if the redefinition is not identical to the original definition (or an object-like macro is redefined as a function-like macro or vice versa). 
Regards, Martin From martin at v.loewis.de Sat Sep 23 07:33:05 2006 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 23 Sep 2006 07:33:05 +0200 Subject: [Python-Dev] GCC patch for catching errors in PyArg_ParseTuple In-Reply-To: <00c401c6de98$64cefd80$4bbd2997@bagio> References: <451445CF.7080407@v.loewis.de> <00c401c6de98$64cefd80$4bbd2997@bagio> Message-ID: <4514C711.9090003@v.loewis.de> Giovanni Bajo schrieb: > A way not to maintain this patch forever would be to devise a way to make > format syntax "pluggable" / "scriptable". There have been previous discussions > on the GCC mailing lists. Perhaps. I very much doubt that this can or will be done, in a way that would support PyArg_ParseTuple. It's probably easier to replace PyArg_ParseTuple with something that can be statically checked by any compiler. Regards, Martin From anthony at interlink.com.au Sat Sep 23 10:40:02 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Sat, 23 Sep 2006 18:40:02 +1000 Subject: [Python-Dev] AST structure and maintenance branches Message-ID: <200609231840.03859.anthony@interlink.com.au> I'd like to propose that the AST format returned by passing PyCF_ONLY_AST to compile() get the same guarantee in maintenance branches as the bytecode format - that is, unless it's absolutely necessary, we'll keep it the same. Otherwise anyone trying to write tools to manipulate the AST is in for a massive world of hurt. Anyone have any problems with this, or can it be added to PEP 6? Anthony From barry at barrys-emacs.org Sat Sep 23 14:06:34 2006 From: barry at barrys-emacs.org (Barry Scott) Date: Sat, 23 Sep 2006 13:06:34 +0100 Subject: [Python-Dev] Maybe we should have a C++ extension for testing... 
In-Reply-To: <17672.17407.88122.884957@montanaro.dyndns.org>
References: <17672.17407.88122.884957@montanaro.dyndns.org>
Message-ID:

On Sep 13, 2006, at 18:46, skip at pobox.com wrote:

>
> Building Python with C and then linking in extensions written in or
> wrapped with C++ can present problems, at least in some situations. I
> don't know if it's kosher to build that way, but folks do. We're
> bumping into such problems at work using Solaris 10 and Python 2.4
> (building matplotlib, which is largely written in C++), and it appears
> others have similar problems:
>
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6395191
> http://mail.python.org/pipermail/patches/2005-June/017820.html
> http://mail.python.org/pipermail/python-bugs-list/2005-November/030900.html
>
> I attached a comment to the third item yesterday (even though it was
> closed).
>
> One of our C++ gurus (that's definitely not me!) patched the Python
> source to include at the top of Python.h. That seems to have solved our
> problems, but seems to be a symptomatic fix. I got to thinking,
> should we a) encourage people to compile Python with a C++ compiler if
> most/all of their extensions are written in C++ anyway (does that even
> work if one or more extensions are written in C?), or b) should the
> standard distribution maybe include a toy extension written in C++
> whose sole purpose is to test for cross-language problems?

Mixing of C and C++ code is fully supported by the compilers and linkers.
There is no need to compile the python core as C++ code; indeed, if you
did, only C++ extensions could use it!

In the distant past there had been problems with some unix distributions
linking python in such a way that C++ code would not initialise. The
major distributions seem to have sorted these problems out. But clearly
Solaris has a problem.

It would be worth finding out why it was necessary to include to fix the problems.
If you do add a C++ test extension it will need to do whatever it was that fixes.
From what I can remember, attempts to use std::cout would fail and I think
static object initialisation would fail. The test code would need to do
all these things and verify they are working.

Barry
(PyCXX cxx.sourceforge.net)

From martin at v.loewis.de Sat Sep 23 14:43:55 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 23 Sep 2006 14:43:55 +0200
Subject: [Python-Dev] Maybe we should have a C++ extension for testing...
In-Reply-To: <17672.17407.88122.884957@montanaro.dyndns.org>
References: <17672.17407.88122.884957@montanaro.dyndns.org>
Message-ID: <45152C0B.5070202@v.loewis.de>

skip at pobox.com schrieb:
> One of our C++ gurus (that's definitely not me!) patched the Python source
> to include at the top of Python.h. That seems to have solved our
> problems, but seems to be a symptomatic fix.

Indeed. The right fix is likely different, and relates to the question
of what API Sun defines in its header files, and which of these a given
gcc version uses.

> I got to thinking, should we
> a) encourage people to compile Python with a C++ compiler if most/all of
> their extensions are written in C++ anyway (does that even work if one or
> more extensions are written in C?)

I can't see how this could help. The problem you have is specific
to Solaris, and specific to using GCC on Solaris. This is just a tiny
fraction of Python users. Without further investigation, it might
even depend on the specific version of GCC being used (and the
specific Solaris version).

> or b) should the standard distribution
> maybe include a toy extension written in C++ whose sole purpose is to test
> for cross-language problems?

Again, this isn't likely to help. If such a problem exists, it is only
found if somebody builds Python on that platform. You are perhaps the
first one to do so in this specific combination, so you would have
encountered the problem first.
Would that have helped you? > Either/or/neither/something else? Something else. Find and understand all platform quirks on platforms we encounter, and come up with a solution. Fix them one by one, as we encounter them, and document all work-arounds being made, so we can take them out when the system disappears (or subsequent releases fixed the platform bugs). Doing so requires a good understanding of C and C++, of course. Regards, Martin From dave at boost-consulting.com Sat Sep 23 15:14:24 2006 From: dave at boost-consulting.com (David Abrahams) Date: Sat, 23 Sep 2006 09:14:24 -0400 Subject: [Python-Dev] Pep 353: Py_ssize_t advice In-Reply-To: <4514C5B8.1070903@v.loewis.de> (Martin v. =?utf-8?Q?L=C3=B6wi?= =?utf-8?Q?s's?= message of "Sat, 23 Sep 2006 07:27:20 +0200") References: <871wq3eo5e.fsf@pereiro.luannocracy.com> <45144432.6010304@v.loewis.de> <87k63vzguu.fsf@pereiro.luannocracy.com> <4514C5B8.1070903@v.loewis.de> Message-ID: <874puy209b.fsf@pereiro.luannocracy.com> "Martin v. L?wis" writes: >> c. anyway you'll get a nasty warning, which for some people will be >> just as bad as an error > > Try for yourself. You get the warning only if the redefinition is not > identical to the original definition (or an object-like macro is > redefined as a function-like macro or vice versa). I'm confident that whether you get the warning otherwise is dependent both on the compiler and the compiler-flags you use. But this question is academic now, I think, since you accepted my suggestion. -- Dave Abrahams Boost Consulting www.boost-consulting.com From skip at pobox.com Sat Sep 23 15:27:12 2006 From: skip at pobox.com (skip at pobox.com) Date: Sat, 23 Sep 2006 08:27:12 -0500 Subject: [Python-Dev] list.discard? 
(Re: dict.discard) In-Reply-To: <45148F40.40802@canterbury.ac.nz> References: <20060921134249.GA9238@niemeyer.net> <45132C6D.9010806@canterbury.ac.nz> <17683.21448.909748.200493@montanaro.dyndns.org> <45148F40.40802@canterbury.ac.nz> Message-ID: <17685.13872.773665.230012@montanaro.dyndns.org> Greg> Or maybe remove() should just do nothing if the item is not Greg> found. If that's the case, I'd argue that dict.remove and set.remove should behave the same way, making .discard unnecessary. OTOH, perhaps lists should grow a .discard method. Skip From glassfordm at hotmail.com Fri Sep 22 14:24:32 2006 From: glassfordm at hotmail.com (Michael Glassford) Date: Fri, 22 Sep 2006 08:24:32 -0400 Subject: [Python-Dev] Python 2.5 bug? Changes in behavior of traceback module Message-ID: In Python 2.4, traceback.print_exc() and traceback.format_exc() silently do nothing if there is no active exception; in Python 2.5, they raise an exception. Not too difficult to handle, but unexpected (and a pain if you use it in a lot of places). I assume it was an unintentional change? 
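
For code that has to run unchanged on both versions, one possible guard
is to check for an active exception before calling into the traceback
module. This is only a sketch: safe_format_exc is a hypothetical helper
name, and returning an empty string when nothing is being handled is an
assumption made here for illustration, not the 2.4 behaviour.

```python
import sys
import traceback


def safe_format_exc():
    """Format the exception currently being handled, or return ''.

    sys.exc_info() returns (None, None, None) when no exception is
    active, so the traceback module is only called when there is
    something to format.
    """
    if sys.exc_info()[0] is None:
        return ""
    return traceback.format_exc()


# Outside any except block there is nothing to report.
nothing = safe_format_exc()

# Inside an except block the usual formatted traceback comes back.
try:
    1 / 0
except ZeroDivisionError:
    report = safe_format_exc()
```
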
Mike

In Python 2.4:

>>> import traceback
>>> traceback.print_exc()
None
>>> traceback.format_exc()
'None\n'

In Python 2.5:

>>> import traceback
>>> traceback.print_exc()
Traceback (most recent call last):
  File "", line 1, in
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/traceback.py", line 227, in print_exc
    print_exception(etype, value, tb, limit, file)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/traceback.py", line 126, in print_exception
    lines = format_exception_only(etype, value)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/traceback.py", line 176, in format_exception_only
    stype = etype.__name__
AttributeError: 'NoneType' object has no attribute '__name__'
>>> traceback.format_exc()
Traceback (most recent call last):
  File "", line 1, in
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/traceback.py", line 236, in format_exc
    return ''.join(format_exception(etype, value, tb, limit))
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/traceback.py", line 145, in format_exception
    list = list + format_exception_only(etype, value)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/traceback.py", line 176, in format_exception_only
    stype = etype.__name__
AttributeError: 'NoneType' object has no attribute '__name__'

From skip at pobox.com Sat Sep 23 15:37:38 2006
From: skip at pobox.com (skip at pobox.com)
Date: Sat, 23 Sep 2006 08:37:38 -0500
Subject: [Python-Dev] Maybe we should have a C++ extension for testing...
In-Reply-To: <45152C0B.5070202@v.loewis.de>
References: <17672.17407.88122.884957@montanaro.dyndns.org> <45152C0B.5070202@v.loewis.de>
Message-ID: <17685.14498.20311.248692@montanaro.dyndns.org>

    Martin> The problem you have is specific to Solaris, and specific to
    Martin> using GCC on Solaris.

So can we fix this in pyport.h or with suitable Configure script
machinations?
Even though the current patch we're using is trivial I'd really like to avoid patching the Python distribution when we install it. Skip From martin at v.loewis.de Sat Sep 23 16:23:17 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 23 Sep 2006 16:23:17 +0200 Subject: [Python-Dev] Maybe we should have a C++ extension for testing... In-Reply-To: <17685.14498.20311.248692@montanaro.dyndns.org> References: <17672.17407.88122.884957@montanaro.dyndns.org> <45152C0B.5070202@v.loewis.de> <17685.14498.20311.248692@montanaro.dyndns.org> Message-ID: <45154355.4040000@v.loewis.de> skip at pobox.com schrieb: > Martin> The problem you have is specific to Solaris, and specific to > Martin> using GCC on Solaris. > > So can we fix this in pyport.h or with suitable Configure script > machinations? Even though the current patch we're using is trivial I'd > really like to avoid patching the Python distribution when we install it. Yes. However, to do so, somebody would have to understand the problem in detail first. Regards, Martin From mwh at python.net Sat Sep 23 16:59:14 2006 From: mwh at python.net (Michael Hudson) Date: Sat, 23 Sep 2006 15:59:14 +0100 Subject: [Python-Dev] AST structure and maintenance branches In-Reply-To: <200609231840.03859.anthony@interlink.com.au> (Anthony Baxter's message of "Sat, 23 Sep 2006 18:40:02 +1000") References: <200609231840.03859.anthony@interlink.com.au> Message-ID: <2mu02yk4sd.fsf@starship.python.net> Anthony Baxter writes: > I'd like to propose that the AST format returned by passing PyCF_ONLY_AST to > compile() get the same guarantee in maintenance branches as the bytecode > format - that is, unless it's absolutely necessary, we'll keep it the same. > Otherwise anyone trying to write tools to manipulate the AST is in for a > massive world of hurt. > > Anyone have any problems with this, or can it be added to PEP 6? Sounds like a good idea. Cheers, mwh -- Reading Slashdot can [...] 
often be worse than useless, especially to young and budding programmers: it can give you exactly the wrong idea about the technical issues it raises. -- http://www.cs.washington.edu/homes/klee/misc/slashdot.html#reasons From gh at ghaering.de Sat Sep 23 19:31:00 2006 From: gh at ghaering.de (=?ISO-8859-1?Q?Gerhard_H=E4ring?=) Date: Sat, 23 Sep 2006 19:31:00 +0200 Subject: [Python-Dev] Need help with C - problem in sqlite3 module Message-ID: <45156F54.3010606@ghaering.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Looks like I don't know C so well after all ... Apparently at least gcc on Linux exports all symbols by default that are not static. This creates problems with Python extensions that export symbols that are also used in other contexts. For example some people use Python and the sqlite3 module under Apache, and the sqlite3 module exports a symbol cache_init, but cache_init is also used by Apache's mod_cache module. Thus there are crashes when using the sqlite3 module that only occur in the mod_python context. Can somebody with more knowledge about C tell me how to fix the sqlite3 module or compiler settings for distutils so that this does not happen? Of course this only happens because the sqlite3 module is distributed among multiple .c files and thus I couldn't make everything "static". Thanks in advance. 
- -- Gerhard -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.2 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFFFW9UdIO4ozGCH14RApFQAKC+BJd8mGlCXJa89swOcMvASoj6GgCfZxf+ tZ/iVO8xTEV7qNeXBcDT0WU= =lX07 -----END PGP SIGNATURE----- From jeremy.kloth at 4suite.org Sat Sep 23 20:07:29 2006 From: jeremy.kloth at 4suite.org (Jeremy Kloth) Date: Sat, 23 Sep 2006 12:07:29 -0600 Subject: [Python-Dev] Need help with C - problem in sqlite3 module In-Reply-To: <45156F54.3010606@ghaering.de> References: <45156F54.3010606@ghaering.de> Message-ID: <200609231207.30418.jeremy.kloth@4suite.org> On Saturday, September 23, 2006 11:31 am, Gerhard Häring wrote: > Looks like I don't know C so well after all ... > > Apparently at least gcc on Linux exports all symbols by default that are > not static. This creates problems with Python extensions that export > symbols that are also used in other contexts. For example some people use > Python and the sqlite3 module under Apache, and the sqlite3 module exports > a symbol cache_init, but cache_init is also used by Apache's mod_cache > module. Thus there are crashes when using the sqlite3 module that only > occur in the mod_python context. > > Can somebody with more knowledge about C tell me how to fix the sqlite3 > module or compiler settings for distutils so that this does not happen? > > Of course this only happens because the sqlite3 module is distributed among > multiple .c files and thus I couldn't make everything "static". GCC's symbol visibility is supposed to address this exact problem. It would be nice if -fvisibility=hidden was used to build Python (and its extensions) by default on supported platforms/compilers. It shouldn't be much of an issue wrt. exported symbols as they already need to be tracked for Windows where symbols are hidden by default (unlike traditional *nix).
-- Jeremy Kloth http://4suite.org/ From brett at python.org Sat Sep 23 21:12:05 2006 From: brett at python.org (Brett Cannon) Date: Sat, 23 Sep 2006 12:12:05 -0700 Subject: [Python-Dev] AST structure and maintenance branches In-Reply-To: <200609231840.03859.anthony@interlink.com.au> References: <200609231840.03859.anthony@interlink.com.au> Message-ID: On 9/23/06, Anthony Baxter wrote: > > I'd like to propose that the AST format returned by passing PyCF_ONLY_AST to > compile() get the same guarantee in maintenance branches as the bytecode > format - that is, unless it's absolutely necessary, we'll keep it the same. > Otherwise anyone trying to write tools to manipulate the AST is in for a > massive world of hurt. > > Anyone have any problems with this, or can it be added to PEP 6? Works for me. -Brett From david.nospam.hopwood at blueyonder.co.uk Sat Sep 23 22:00:29 2006 From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood) Date: Sat, 23 Sep 2006 21:00:29 +0100 Subject: [Python-Dev] Pep 353: Py_ssize_t advice In-Reply-To: <45144432.6010304@v.loewis.de> References: <871wq3eo5e.fsf@pereiro.luannocracy.com> <45144432.6010304@v.loewis.de> Message-ID: <4515925D.6010408@blueyonder.co.uk> Martin v. Löwis wrote: > David Abrahams schrieb: > >>(C++ allows restating of typedefs; if C allows it, that should be >>something like): > > C also allows this; [...] This is nitpicking, since you agreed the change to the PEP, but are you sure that C allows this? From C99 + TC1 + TC2 (http://www.open-std.org/JTC1/SC22/WG14/www/standards): # 6.2.2 Linkages of identifiers # # 6 The following identifiers have no linkage: an identifier declared # to be anything other than an object or a function; [...] (i.e.
typedef identifiers have no linkage) # 6.7 Declarations # # Constraints # 3 If an identifier has no linkage, there shall be no more than one # declaration of the identifier (in a declarator or type specifier) # with the same scope and in the same name space, except for tags as # specified in 6.7.2.3. # 6.7.2.3 Tags # # Constraints # 1 A specific type shall have its content defined at most once. (There is nothing else in 6.7.2.3 that applies to typedefs.) Since 6.7 (3) and 6.7.2.3 (1) are constraints, I read this as saying that a C99 implementation must produce a diagnostic if a typedef is redeclared in the same scope. If the program is run despite the diagnostic, its behaviour is undefined. Several C compilers I've used in the past have needed the idempotence guard on typedefs, in any case. -- David Hopwood From martin at v.loewis.de Sat Sep 23 22:18:32 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 23 Sep 2006 22:18:32 +0200 Subject: [Python-Dev] Need help with C - problem in sqlite3 module In-Reply-To: <45156F54.3010606@ghaering.de> References: <45156F54.3010606@ghaering.de> Message-ID: <45159698.9080405@v.loewis.de> Gerhard Häring schrieb: > Apparently at least gcc on Linux exports all symbols by default that are > not static. Correct. Various factors influence run-time symbol binding, though. > This creates problems with Python extensions that export > symbols that are also used in other contexts. For example some people use > Python and the sqlite3 module under Apache, and the sqlite3 module exports > a symbol cache_init, but cache_init is also used by Apache's mod_cache > module. Thus there are crashes when using the sqlite3 module that only > occur in the mod_python context. > > Can somebody with more knowledge about C tell me how to fix the sqlite3 > module or compiler settings for distutils so that this does not happen? The only reliable way is to do renaming.
This was one of the primary reasons of the "grand renaming" in Python, where the Py prefix was introduced. > Of course this only happens because the sqlite3 module is distributed among > multiple .c files and thus I couldn't make everything "static". In the specific case, I can't understand that reason. cache_init is declared in cache.c, and only used in cache.c (to fill a tp_init slot). So just make the symbol static. As a lesson learned, you should go through the module and make all functions static, then see what functions really need to be extern. You should then rename these functions, say by adding a PySQLite prefix. All dynamic symbols remaining should then either have the PySQLite prefix, except for init_sqlite3. In fact, since most operations in Python go through function pointers, there is typically very little need for extern functions in a Python extension module, even if that module consists of multiple C files. Regards, Martin P.S. Currently, on my system, the following symbols are extern in this module 00005890 T _authorizer_callback 0000dec0 A __bss_start 00007600 T _build_column_name 00005df0 T _build_py_params 00007ee0 T build_row_cast_map 00004880 T cache_dealloc 00004990 T cache_display 00004b90 T cache_get 00004da0 T cache_init 00004930 T cache_setup_types 0000d4a0 D CacheType 00004e80 T check_connection 00009f60 T check_remaining_sql 00005420 T check_thread 00006430 T _connection_begin 00005cb0 T connection_call 000068d0 T connection_close 000061c0 T connection_commit 000059b0 T connection_create_aggregate 00005ab0 T connection_create_function 000057a0 T connection_cursor 00006530 T connection_dealloc 00005320 T connection_execute 00005220 T connection_executemany 00005120 T connection_executescript 00006970 T connection_init 00006700 T connection_rollback 000056d0 T connection_set_authorizer 000050e0 T connection_setup_types 0000d5e0 D ConnectionType 0000ded8 B converters 000094d0 T converters_init 00007110 T cursor_close 00007190 T 
cursor_dealloc 00008d90 T cursor_execute 00008d50 T cursor_executemany 000072e0 T cursor_executescript 00007c90 T cursor_fetchall 00007d30 T cursor_fetchmany 00007e10 T cursor_fetchone 000070b0 T cursor_getiter 00007530 T cursor_init 00007b50 T cursor_iternext 000070e0 T cursor_setup_types 0000d980 D CursorType 0000decc B DatabaseError 0000ded4 B DataError 00005bb0 T _drop_unused_statement_references 0000dec0 A _edata 0000def0 B _enable_callback_tracebacks 0000defc A _end 0000dee8 B Error 00007710 T _fetch_one_row 00006cb0 T _final_callback 0000aac4 T _fini 00006830 T flush_statement_cache 00006fa0 T _func_callback 00007e60 T _get_converter 00003bd4 T _init 00009520 T init_sqlite3 0000deec B IntegrityError 0000ded0 B InterfaceError 0000dedc B InternalError 00008dd0 T microprotocols_adapt 00009040 T microprotocols_add 000090e0 T microprotocols_init 000047a0 T new_node 00004810 T node_dealloc 0000d3e0 D NodeType 0000def8 B NotSupportedError 0000dee4 B OperationalError 0000dee0 B OptimizedUnicode 00009ae0 T prepare_protocol_dealloc 00009ac0 T prepare_protocol_init 00009b10 T prepare_protocol_setup_types 0000dec8 B ProgrammingError 0000dec4 B psyco_adapters 00008fc0 T psyco_microprotocols_adapt 000070c0 T pysqlite_noop 00008110 T _query_execute 00006690 T reset_all_statements 0000dd20 D row_as_mapping 00009b50 T row_dealloc 00009e40 T row_init 00009bd0 T row_length 00009c40 T row_setup_types 00009c80 T row_subscript 0000dd40 D RowType 0000a910 T _seterror 00005fc0 T _set_result 00006c70 T _sqlite3_result_error 0000dc60 D SQLitePrepareProtocolType 0000aa30 T _sqlite_step_with_busyhandler 0000a2a0 T statement_bind_parameter 0000a530 T statement_bind_parameters 0000a7f0 T statement_create 0000a0f0 T statement_dealloc 0000a080 T statement_finalize 00009f30 T statement_mark_dirty 0000a210 T statement_recompile 0000a190 T statement_reset 0000a040 T statement_setup_types 0000de00 D StatementType 000076a0 T unicode_from_string 0000def4 B Warning From martin at v.loewis.de Sat 
Sep 23 22:19:39 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 23 Sep 2006 22:19:39 +0200 Subject: [Python-Dev] Need help with C - problem in sqlite3 module In-Reply-To: <200609231207.30418.jeremy.kloth@4suite.org> References: <45156F54.3010606@ghaering.de> <200609231207.30418.jeremy.kloth@4suite.org> Message-ID: <451596DB.7060609@v.loewis.de> Jeremy Kloth schrieb: > GCC's symbol visibility is supposed to address this exact problem. It would > be nice if -fvisibility=hidden was used to build Python (and its extensions) > by default on supported platforms/compilers. It shouldn't be much of an > issue wrt. exported symbols as they already need to be tracked for Windows > where symbols are hidden by default (unlike traditional *nix). Of course, this doesn't help on systems where gcc isn't used. So for Python itself, we should always look for a solution that works across compilers. Regards, Martin From martin at v.loewis.de Sat Sep 23 22:21:06 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 23 Sep 2006 22:21:06 +0200 Subject: [Python-Dev] Pep 353: Py_ssize_t advice In-Reply-To: <4515925D.6010408@blueyonder.co.uk> References: <871wq3eo5e.fsf@pereiro.luannocracy.com> <45144432.6010304@v.loewis.de> <4515925D.6010408@blueyonder.co.uk> Message-ID: <45159732.2000205@v.loewis.de> David Hopwood schrieb: >>> (C++ allows restating of typedefs; if C allows it, that should be >>> something like): >> C also allows this; [...] > > This is nitpicking, since you agreed the change to the PEP, but are you > sure that C allows this? I was sure, but I was also wrong. Thanks for pointing that out. 
Regards, Martin From arigo at tunes.org Sun Sep 24 00:15:11 2006 From: arigo at tunes.org (Armin Rigo) Date: Sun, 24 Sep 2006 00:15:11 +0200 Subject: [Python-Dev] New relative import issue In-Reply-To: References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru> <05af01c6dd7e$a2209560$e303030a@trilan> Message-ID: <20060923221510.GA21803@code0.codespeak.net> Hi Guido, On Thu, Sep 21, 2006 at 07:22:04AM -0700, Guido van Rossum wrote: > sys.path exists to stitch together the toplevel module/package > namespace from diverse sources. > > Import hooks and sys.path hackery exist so that module/package sources > don't have to be restricted to the filesystem (as well as to allow > unbridled experimentation by those so inclined :-). This doesn't match my experience, which is that sys.path hackery is required in any project that is larger than one directory, but is not itself a library. The basic assumption is that I don't want to put whole applications in 'site-packages' or in my $PYTHONPATH; I would like them to work in a self-contained, zero-installation way, much like they do if they are built from several modules in a single directory. For example, consider an application with the following structure: myapp/ main.py a/ __init__.py b.py test_b.py c/ __init__.py This theoretical example shows main.py (the main entry point) at the root of the package directories - it is the only place where it can be if it needs to import the packages a and c. The module a.b can import c, too (and this is not bad design - think about c as a package regrouping utilities that make sense for the whole application). But then the testing script test_b.py cannot import the whole application any more. Imports of a or c will fail, and even a relative import of b will crash when b tries to import c. The only way I can think of is to insert the root directory in sys.path from within test_b.py, and then use absolute imports. 
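A minimal sketch of the sys.path insertion described above, as it might appear at the top of test_b.py. The helper names guess_root and add_root_to_sys_path are hypothetical, and the heuristic (walk upward while the directory contains an __init__.py) is only one assumed way to locate the application root; it is not taken from any particular library.

```python
import os
import sys


def guess_root(path):
    """Walk upward from `path` while the directory is a package.

    Returns the first ancestor directory that does not contain an
    __init__.py, i.e. the directory that should be on sys.path so that
    the top-level packages can be imported absolutely.
    """
    directory = os.path.dirname(os.path.abspath(path))
    while os.path.isfile(os.path.join(directory, "__init__.py")):
        parent = os.path.dirname(directory)
        if parent == directory:  # reached the filesystem root, stop
            break
        directory = parent
    return directory


def add_root_to_sys_path(path):
    """Put the guessed root at the front of sys.path (once) and return it."""
    root = guess_root(path)
    if root not in sys.path:
        sys.path.insert(0, root)
    return root
```

With the layout above, calling add_root_to_sys_path(__file__) at the start of myapp/a/test_b.py would put myapp/ on sys.path, after which absolute imports such as "import a.b" and "import c" can be resolved regardless of the directory the test is started from.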
(For example, to support this way of organizing applications, the 'py' lib provides a call py.magic.autopath() that can be dropped at the start of test_b.py. It hacks sys.path by guessing the "real" root according to how many levels of __init__.py there are...) A bientot, Armin. From krcmar at datinel.cz Sun Sep 24 00:14:41 2006 From: krcmar at datinel.cz (Milan Krcmar) Date: Sun, 24 Sep 2006 00:14:41 +0200 Subject: [Python-Dev] Minipython Message-ID: <20060923221441.GB5227@hornet.din.cz> I would like to run Python scripts on an embedded MIPS Linux platform having only 2 MiB of flash ROM and 16 MiB of RAM for everything. Current (2.5) stripped and gzipped (I am going to use a compressed filesystem) CPython binary, compiled with defaults on a i386/glibc Linux, results in 500 KiB of "flash". How to make the Python interpreter even smaller? - can I completely drop out lexical analysis of sourcecode and compilation to bytecode? is it relevant enough to the size of interpreter? - should I drop "useless" compiled-in modules? (what I need is a replacement for advanced bash scripting, being able to write more complex scripts and avoid forking tens of processes for things like searching filesystem, formating dates etc.) I don't want to re-invent the wheel, but all my attempts at finding Python for embedded systems ended in instructions for embedding Python in another program :-) Can you give me any information to start with? I would prefer stripping current version of Python rather than returning to a years-old (but smaller) version and remembering what of the new syntax/functionality to avoid. 
TIA, Milan From rasky at develer.com Sun Sep 24 01:11:06 2006 From: rasky at develer.com (Giovanni Bajo) Date: Sun, 24 Sep 2006 01:11:06 +0200 Subject: [Python-Dev] New relative import issue References: <20060918091314.GA26814@code0.codespeak.net> <450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru> <05af01c6dd7e$a2209560$e303030a@trilan> <20060923221510.GA21803@code0.codespeak.net> Message-ID: <06e501c6df65$8c521b80$4bbd2997@bagio> Armin Rigo wrote: > This doesn't match my experience, which is that sys.path hackery is > required in any project that is larger than one directory, but is not > itself a library. [...] > myapp/ > main.py > a/ > __init__.py > b.py > test_b.py > c/ > __init__.py > > This theoretical example shows main.py (the main entry point) at the > root of the package directories - it is the only place where it can be > if it needs to import the packages a and c. The module a.b can import > c, too (and this is not bad design - think about c as a package > regrouping utilities that make sense for the whole application). But > then the testing script test_b.py cannot import the whole application > any more. Imports of a or c will fail, and even a relative import of > b will crash when b tries to import c. The only way I can think of > is to insert the root directory in sys.path from within test_b.py, > and then use absolute imports. This also matches my experience, but I never used sys.path hackery for this kind of things. I either set PYTHONPATH while I work on "myapp" (which I consider not such a big trouble after all, and surely much less invasive than adding specific Python code tweaking sys.path into all the tests), or, even more simply, I run the test from myapp main directory (manually typing "myapp/b/test_b.py"). 
There is also another possibility, which is having a smarter test framework where you can specify substrings of test names: I don't know py.test in detail, but in my own framework I can say something like "./run_tests.py PAT", which basically means "recursively discover and run all files named test_NAME, and where PAT is a substring of NAME). > (For example, to support this way of organizing applications, the 'py' > lib provides a call py.magic.autopath() that can be dropped at the > start of test_b.py. It hacks sys.path by guessing the "real" root > according to how many levels of __init__.py there are...) Since I consider this more of an environmental problem, I would not find satisfying any kind of solution at the single module level (and even less so one requiring so much guess-work as this one). Giovanni Bajo From rasky at develer.com Sun Sep 24 01:16:54 2006 From: rasky at develer.com (Giovanni Bajo) Date: Sun, 24 Sep 2006 01:16:54 +0200 Subject: [Python-Dev] Minipython References: <20060923221441.GB5227@hornet.din.cz> Message-ID: <070101c6df66$5bc1e5d0$4bbd2997@bagio> Milan Krcmar wrote: > Current (2.5) stripped and gzipped (I am going to use a compressed > filesystem) CPython binary, compiled with defaults on a i386/glibc > Linux, results in 500 KiB of "flash". How to make the Python > interpreter even smaller? In my experience, the biggest gain can be obtained by dropping the rarely-used CJK codecs (for Asian languages). That should sum up to almost 800K (uncompressed), IIRC. After that, I once had to strip down the binary even more, and found out (by guesswork and inspection of map files) that there is no other low hanging fruit. By carefully selecting which modules to link in, I was able to reduce of another 300K or so, but nothing really incredible. I would also suggest -ffunction-sections in these cases, but you might already know that. 
Giovanni Bajo From rasky at develer.com Sun Sep 24 01:29:27 2006 From: rasky at develer.com (Giovanni Bajo) Date: Sun, 24 Sep 2006 01:29:27 +0200 Subject: [Python-Dev] Removing __del__ References: <324634B71B159D469BCEB616678A6B94F94C3B@ingdexs1.ingdirect.com><008901c6de94$d2072ed0$4bbd2997@bagio><20060922235602.GA3427@panix.com><6a36e7290609221735hcbd3df2ne41406323ce5fd72@mail.gmail.com><039d01c6def1$46df1ef0$4bbd2997@bagio><6a36e7290609230222w1fe8dfaam4780a1fd81481cd0@mail.gmail.com><03bb01c6def4$257b6c70$4bbd2997@bagio> <8764fesfsj.fsf@qrnik.zagroda> Message-ID: <07b101c6df68$1ca8dfa0$4bbd2997@bagio> Marcin 'Qrczak' Kowalczyk wrote: >> 1) There's a way to destruct the handle BEFORE __del__ is called, >> which would require killing the weakref / deregistering the >> finalization hook. > > Weakrefs should have a method which runs their callback and > unregisters them. > >> 2) The objects required in the destructor can be mutated / changed >> during the lifetime of the instance. For instance, a class that >> wraps Win32 FindFirstFirst/FindFirstNext and support transparent >> directory recursion needs something similar. > > Listing files with transparent directory recursion can be implemented > in terms of listing files of a given directory, such that a finalizer > is only used with the low level object. > >> Another example is a class which creates named temporary files >> and needs to remove them on finalization. It might need to create >> several different temporary files (say, self.handle is the filename >> in that case)[1], so the filename needed in the destructor changes >> during the lifetime of the instance. > > Again: move the finalizer to a single temporary file object, and refer > to such object instead of a raw handle. Yes, I know Python is turing-complete even without __del__, but that is not my point. 
The fact that we can enhance weakrefs and find a very complicated way to solve problems which __del__ solves right now easily does not make things different. People are still proposing to drop a feature which is perceived as "easy" by users, and replace it with a complicated set of workarounds, which are prone to mistakes, more verbose, hard to learn and to maintain. I'm totally in favor of the general idea of dropping rarely used features (like __var in the other thread). I just can't see how dropping __del__ makes things easier, while it surely makes life a lot harder for the legitimate users of it. Giovanni Bajo From mwh at python.net Sun Sep 24 01:36:41 2006 From: mwh at python.net (Michael Hudson) Date: Sun, 24 Sep 2006 00:36:41 +0100 Subject: [Python-Dev] Minipython In-Reply-To: <20060923221441.GB5227@hornet.din.cz> (Milan Krcmar's message of "Sun, 24 Sep 2006 00:14:41 +0200") References: <20060923221441.GB5227@hornet.din.cz> Message-ID: <2mhcyyjgty.fsf@starship.python.net> Milan Krcmar writes: > I would like to run Python scripts on an embedded MIPS Linux platform > having only 2 MiB of flash ROM and 16 MiB of RAM for everything. > > Current (2.5) stripped and gzipped (I am going to use a compressed > filesystem) CPython binary, compiled with defaults on a i386/glibc > Linux, results in 500 KiB of "flash". How to make the Python interpreter > even smaller? > > - can I completely drop out lexical analysis of sourcecode and compilation > to bytecode? is it relevant enough to the size of interpreter? I don't think there's a configure flag for this or anything, and it might be a bit hairy to do it, but it's possible and it would probably save a bit. There is a configure option to remove unicode support. It's not terribly well supported and stops working every now and again, but it's probably much easier to start with. There was at one point and may still be an option to not include the complex type. > - should I drop "useless" compiled-in modules?
(what I need is a > replacement for advanced bash scripting, being able to write more > complex scripts and avoid forking tens of processes for things like > searching filesystem, formating dates etc.) Yes, definitely. > I don't want to re-invent the wheel, but all my attempts at finding > Python for embedded systems ended in instructions for embedding > Python in another program :-) > > Can you give me any information to start with? I would prefer stripping > current version of Python rather than returning to a years-old (but > smaller) version and remembering what of the new syntax/functionality to > avoid. Well, I would start by looking at what is taking up the space... Cheers, mwh -- C++ is a siren song. It *looks* like a HLL in which you ought to be able to write an application, but it really isn't. -- Alain Picard, comp.lang.lisp From martin at v.loewis.de Sun Sep 24 06:49:34 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 24 Sep 2006 06:49:34 +0200 Subject: [Python-Dev] Minipython In-Reply-To: <20060923221441.GB5227@hornet.din.cz> References: <20060923221441.GB5227@hornet.din.cz> Message-ID: <45160E5E.8040406@v.loewis.de> Milan Krcmar schrieb: > Can you give me any information to start with? I would prefer stripping > current version of Python rather than returning to a years-old (but > smaller) version and remembering what of the new syntax/functionality to > avoid. I would start with dropping support for dynamic loading of extension modules, and link all necessary modules statically. Then, do what Michael Hudson says: find out what is taking up space. size */*.o|sort -n should give a good starting point; on my system, I get [...] 
29356 1416 156 30928 78d0 Objects/classobject.o 30663 0 0 30663 77c7 Objects/unicodectype.o 33530 480 536 34546 86f2 Python/Python-ast.o 33624 1792 616 36032 8cc0 Objects/longobject.o 36603 16 288 36907 902b Python/ceval.o 36710 2532 0 39242 994a Modules/_sre.o 39169 9473 1032 49674 c20a Objects/stringobject.o 52965 0 36 53001 cf09 Python/compile.o 66197 4592 436 71225 11639 Objects/typeobject.o 74111 9779 1160 85050 14c3a Objects/unicodeobject.o Michael already mentioned you can drop unicodeobject if you want to. compile.o would also offer savings, but stripping it might not be easy. Dropping _sre is quite easy. If you manage to drop compile.o, then dropping Python-ast.o (along with the rest of the compiler) should also be possible. unicodectype will go away if the Unicode type goes, but can probably be removed separately. And so on. When you come to a solution that satisfies your needs, don't forget to document it somewhere. Regards, Martin From krcmar at datinel.cz Sun Sep 24 10:37:55 2006 From: krcmar at datinel.cz (Milan Krcmar) Date: Sun, 24 Sep 2006 10:37:55 +0200 Subject: [Python-Dev] Minipython In-Reply-To: <45160E5E.8040406@v.loewis.de> References: <20060923221441.GB5227@hornet.din.cz> <45160E5E.8040406@v.loewis.de> Message-ID: <20060924083755.GA27480@hornet.din.cz> Thank you people. I'm going to try to strip unneeded things and let you know the result. Along with running Python on an embedded system, I am considering two more things. Suppose the system to be a small Linux router, which, after the kernel starts, merely configures lots of parameters of the kernel and then runs some daemons for gathering statistics and allowing remote control of the host. Python helps mainly in the startup phase of configuring the kernel according to human-readable configuration files. This has been solved by shell scripts.
Python is not as suitable for running external processes and process pipes as a shell, but I'd like to write a module (at least) helping it in the sense of scsh (a "Scheme shell", http://www.scsh.net). A more advanced solution is to replace system's init (/sbin/init) by Python. It should even speed the startup up as it will not need to run shell many times. To avoid running other processes, I want to "port them" to Python. Processes for kernel configuration, like iproute2, iptables etc. are often built above their own library, which can be used as a starting point. (Yes, it does matter, at startup, routers run such processes hundreds of times). Milan On Sun, Sep 24, 2006 at 06:49:34AM +0200, "Martin v. Löwis" wrote: > Milan Krcmar schrieb: > > Can you give me any information to start with? I would prefer stripping > > current version of Python rather than returning to a years-old (but > > smaller) version and remembering what of the new syntax/functionality to > > avoid. > > I would start with dropping support for dynamic loading of extension > modules, and link all necessary modules statically. > > Then, do what Michael Hudson says: find out what is taking up space. > > size */*.o|sort -n > > should give a good starting point; on my system, I get > > [...] > 29356 1416 156 30928 78d0 Objects/classobject.o > 30663 0 0 30663 77c7 Objects/unicodectype.o > 33530 480 536 34546 86f2 Python/Python-ast.o > 33624 1792 616 36032 8cc0 Objects/longobject.o > 36603 16 288 36907 902b Python/ceval.o > 36710 2532 0 39242 994a Modules/_sre.o > 39169 9473 1032 49674 c20a Objects/stringobject.o > 52965 0 36 53001 cf09 Python/compile.o > 66197 4592 436 71225 11639 Objects/typeobject.o > 74111 9779 1160 85050 14c3a Objects/unicodeobject.o > > Michael already mentioned you can drop unicodeobject if you want > to. compile.o would also offer savings, but stripping it might > not be easy. Dropping _sre is quite easy.
If you manage to > drop compile.o, then dropping Python-ast.o (along with the > rest of the compiler) should also be possible. > unicodectype will go away if the Unicode type goes, but can > probably be removed separately. And so on. > > When you come to a solution that satisfies your needs, > don't forget to document it somewhere. > > Regards, > Martin From gjcarneiro at gmail.com Sun Sep 24 14:07:33 2006 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Sun, 24 Sep 2006 13:07:33 +0100 Subject: [Python-Dev] PyErr_CheckSignals error return value Message-ID: int PyErr_CheckSignals() Documentation for PyErr_CheckSignals [1] says "If an exception is raised the error indicator is set and the function returns 1; otherwise the function returns 0.". But the code I see tells me the function returns -1 on error. What to do? Fix the code, or the documentation? [1] http://docs.python.org/api/exceptionHandling.html#l2h-115 -- Gustavo J. A. M. Carneiro "The universe is always one step beyond logic." From g.brandl at gmx.net Sun Sep 24 14:50:50 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 24 Sep 2006 14:50:50 +0200 Subject: [Python-Dev] Python 2.5 bug? Changes in behavior of traceback module In-Reply-To: References: Message-ID: Michael Glassford wrote: > In Python 2.4, traceback.print_exc() and traceback.format_exc() silently > do nothing if there is no active exception; in Python 2.5, they raise an > exception. Not too difficult to handle, but unexpected (and a pain if > you use it in a lot of places). I assume it was an unintentional change? This was certainly an unintentional change while restructuring some internal traceback routines. It's now fixed in SVN. 
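For code that has to run on interpreters both with and without this fix, one defensive option is to check for an active exception before calling into the traceback module. The following is only a sketch; the wrapper name format_current_exception is hypothetical and not part of the standard library.

```python
import sys
import traceback


def format_current_exception():
    """Return the formatted active exception, or '' if there is none.

    Checking sys.exc_info() first means traceback.format_exc() is never
    called without an exception being handled.
    """
    if sys.exc_info()[0] is None:
        return ""
    return traceback.format_exc()
```

Called outside of an except block, the wrapper returns an empty string; inside one, it returns the same text that traceback.format_exc() would produce.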
Georg From gjcarneiro at gmail.com Sun Sep 24 16:17:42 2006 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Sun, 24 Sep 2006 15:17:42 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: <450632A7.40504@canterbury.ac.nz> <4506553D.1020307@canterbury.ac.nz> Message-ID: -> http://www.python.org/sf/1564547 -- Gustavo J. A. M. Carneiro "The universe is always one step beyond logic." From python at rcn.com Mon Sep 25 01:21:27 2006 From: python at rcn.com (python at rcn.com) Date: Sun, 24 Sep 2006 19:21:27 -0400 (EDT) Subject: [Python-Dev] list.discard? (Re: dict.discard) Message-ID: <20060924192127.AFZ50059@ms09.lnh.mail.rcn.net> > When I want to remove something from a list I typically write: > > while x in somelist: > somelist.remove(x) An O(n) version of removeall: somelist[:] = [e for e in somelist if e != x] Raymond From unknown_kev_cat at hotmail.com Mon Sep 25 01:25:27 2006 From: unknown_kev_cat at hotmail.com (Joe Smith) Date: Sun, 24 Sep 2006 19:25:27 -0400 Subject: [Python-Dev] Typo.pl scan of Python 2.5 source code References: Message-ID: "Neal Norwitz" wrote in message news:ee2a432c0609222151k2bf1a211u44d9e44dcc6bbf5d at mail.gmail.com... > I ignored these as I'm not certain all the platforms we run on accept > free(NULL). > That sounds like exactly what the autotools are designed for. You simply use free(), and have autoconf check for support of free(NULL). If free(NULL) is broken then a macro is defined: "#define free(p) (p==NULL)||free(p)" Or something like that. Note that this does not clutter up the main program any. In fact it simplifies it. It also potentially speeds up platforms with a working free, without any negative speed implications for other platforms. The only downside is a slight, presumably negligible, increase in build time. 
From mwh at python.net Mon Sep 25 11:08:08 2006 From: mwh at python.net (Michael Hudson) Date: Mon, 25 Sep 2006 10:08:08 +0100 Subject: [Python-Dev] Typo.pl scan of Python 2.5 source code In-Reply-To: (Neal Norwitz's message of "Fri, 22 Sep 2006 21:51:38 -0700") References: Message-ID: <2m64fcjouf.fsf@starship.python.net> "Neal Norwitz" writes: > I ignored these as I'm not certain all the platforms we run on accept > free(NULL). It's mandated by C99, and I don't *think* it changed from the previous version (I only have a bootleg copy of C99 :). Cheers, mwh -- TRSDOS: Friendly old lizard. Or, at least, content to sit there eating flies. -- Jim's pedigree of operating systems, asr From talin at acm.org Mon Sep 25 11:27:57 2006 From: talin at acm.org (Talin) Date: Mon, 25 Sep 2006 02:27:57 -0700 Subject: [Python-Dev] Minipython In-Reply-To: <20060924083755.GA27480@hornet.din.cz> References: <20060923221441.GB5227@hornet.din.cz> <45160E5E.8040406@v.loewis.de> <20060924083755.GA27480@hornet.din.cz> Message-ID: <4517A11D.1050004@acm.org> Milan Krcmar wrote: > Thank you people. I'm going to try to strip unneeded things and let you > know the result. > > Along with running Python on an embedded system, I am considering two > more things. Suppose the system to be a small Linux router, which, after > the kernel starts, merely configures lots of parameters of the kernel > and then runs some daemons for gathering statistics and allowing remote > control of the host. > > Python helps mainly in the startup phase of configuring kernel according > to a human-readable confgiuration files. This has been solved by shell > scripts. Python is not as suitable for running external processes and > process pipes as a shell, but I'd like to write a module (at least) > helping him in the sense of scsh (a "Scheme shell", > http://www.scsh.net). > > A more advanced solution is to replace system's init (/sbin/init) by > Python. 
It should even speed the startup up as it will not need to run > shell many times. To avoid running another processes, I want to "port > them" to Python. Processes for kernel configuration, like iproute2, > iptables etc. are often built above its own library, which can be used as > a start point. (Yes, it does matter, at startup, routers run such processes > hundreds times). > > Milan One alternative you might want to look into is the language "Lua" (www.lua.org), which is similar to Python in some respects (also has some similarities to Javascript), but specifically optimized for embedding in larger apps - meaning that it has a much smaller footprint, a much smaller standard library, less built-in data types and so on. (For example, dicts, lists, and objects are all merged into a single type called a 'table', which is just a generic indexable container.) Lua's C API consists of just a few dozen functions. It's not as powerful as Python of course, although it's surprisingly powerful for its size - it has closures, continuations, and all of the goodness you would expect from a modern language. Lua provides 'meta-mechanisms' for extending the language rather than implementing language features directly. So even though it's not a pure object-oriented language, it provides mechanisms for implementing classes and inheritance. And it's fast, since it has less baggage to carry around. It has a few warts - for example, I don't like the fact that referring to an undefined variable silently returns nil instead of returning an error, but I suppose in some environments that's a feature. A lot of game companies use Lua for embedded scripting languages in their games. (Console-based games in particular have strict memory requirements, since there's no virtual memory on consoles.) -- Talin From glassfordm at hotmail.com Mon Sep 25 14:23:51 2006 From: glassfordm at hotmail.com (Michael Glassford) Date: Mon, 25 Sep 2006 08:23:51 -0400 Subject: [Python-Dev] Python 2.5 bug? 
Changes in behavior of traceback module In-Reply-To: References: Message-ID: <4517CA57.3040500@hotmail.com> Thanks! Mike Georg Brandl wrote: > Michael Glassford wrote: >> In Python 2.4, traceback.print_exc() and traceback.format_exc() silently >> do nothing if there is no active exception; in Python 2.5, they raise an >> exception. Not too difficult to handle, but unexpected (and a pain if >> you use it in a lot of places). I assume it was an unintentional change? > > This was certainly an unintentional change while restructuring some > internal traceback routines. > > It's now fixed in SVN. > > Georg > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/python-python-dev%40m.gmane.org > From steven.bethard at gmail.com Tue Sep 26 06:04:37 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Mon, 25 Sep 2006 22:04:37 -0600 Subject: [Python-Dev] python-dev summary for 2006-08-01 to 2006-08-15 Message-ID: Sorry about the delay. Here's the summary for the first half of August. As always, comments and corrections are greatly appreciated. ========= Summaries ========= -------------------------------- Mixing str and unicode dict keys -------------------------------- Ralf Schmitt noted that in Python head, inserting str and unicode keys to the same dictionary would sometimes raise UnicodeDecodeErrors:: >>> d = {} >>> d[u'm\xe1s'] = 1 >>> d['m\xe1s'] = 1 Traceback (most recent call last): ... UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 1: ordinal not in range(128) This error showed up as a result of Armin Rigo's `patch to stop dict lookup from hiding exceptions`_, which meant that the UnicodeDecodeError raised when a str object is compared to a non-ASCII unicode object was no longer silenced. 
In the end, people agreed that UnicodeDecodeError should not be raised for equality comparisons, and in general, ``__eq__()`` methods should not raise exceptions. But comparing str and unicode objects is often a programming error, so in addition to just returning False, equality comparisons on str and non-ASCII unicode now issue a warning with the UnicodeDecodeError message.

.. _patch to stop dict lookup from hiding exceptions: http://bugs.python.org/1497053

Contributing threads:

- `unicode hell/mixing str and unicode as dictionary keys `__
- `Dicts are broken Was: unicode hell/mixing str and unicode asdictionarykeys `__
- `Dicts are broken ... `__
- `Dict suppressing exceptions `__

-----------------------
Rounding floats to ints
-----------------------

Bob Ippolito pointed out a long-standing bug in the struct module where floats were automatically converted to ints. Michael Urman showed a simple case that would provoke an exception if the bug were fixed::

    pack('>H', round(value * 32768))

The source of this bug is the expectation that ``round()`` returns an int, when it actually returns a float. There was then some discussion about splitting the round functionality into two functions: ``__builtin__.round()`` which would round floats to ints, and ``math.round()`` which would round floats to floats. There was also some discussion about the optional argument to ``round()`` which currently specifies the number of decimal places to round to -- a number of folks felt that it was a mistake to round to *decimal* places when a float can only truly reflect *binary* places.

In the end, there were no definite conclusions about the future of ``round()``, but it seemed like the discussion might be resumed on the Python 3000 list.
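As a rough illustration of the ``pack('>H', round(...))`` case above, the sketch below converts explicitly to an int before packing, so the call does not depend on what ``round()`` returns or on struct's float coercion. The helper name ``pack_fixed_point`` and the input range are assumptions made for this example only.

```python
import struct


def pack_fixed_point(value):
    """Pack a fraction in [0, 1] as an unsigned 16-bit big-endian integer."""
    # Round first, then convert explicitly so struct always receives an int.
    scaled = int(round(value * 32768))
    return struct.pack('>H', scaled)


packed = pack_fixed_point(0.5)
(unpacked,) = struct.unpack('>H', packed)
print(unpacked)
```

The explicit ``int()`` is harmless when ``round()`` already yields an integer and keeps the intent visible when it does not.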
Contributing threads: - `struct module and coercing floats to integers `__ - `Rounding float to int directly (Re: struct module and coercing floats to integers) `__ - `Rounding float to int directly (Re: struct module and coercing floats to integers) `__ - `Rounding float to int directly ... `__ - `struct module and coercing floats to integers `__ --------------------------- Assigning to function calls --------------------------- Neal Becker proposed that code by ``X() += 2`` be allowed so that you could call __iadd__ on objects immediately after creation. People pointed out that allowing augmented *assignment* is misleading when no assignment can occur, and it would be better just to call the method directly, e.g. ``X().__iadd__(2)``. Contributing threads: - `SyntaxError: can't assign to function call `__ - `Split augmented assignment into two operator sets? [Re: SyntaxError: can't assign to function call] `__ --------------------------------------- PEP 357: Integer clipping and __index__ --------------------------------------- After some further discussion on the `__index__ issue`_ of last fortnight, Travis E. Oliphant proposed `a patch for __index__`_ that introduced three new C API functions: * PyIndex_Check(obj) -- checks for nb_index * PyObject* PyNumber_Index(obj) -- calls nb_index if possible or raises a TypeError * Py_ssize_t PyNumber_AsSsize_t(obj, err) -- converts the object to a Py_ssize_t, raising err on overflow After a few minor edits, this patch was checked in. .. __index__ issue: http://www.python.org/dev/summary/2006-07-16_2006-07-31/#pep-357-integer-clipping-and-index .. 
a patch for __index__: http://bugs.python.org/1538606 Contributing threads: - `Bad interaction of __index__ and sequence repeat `__ - `__index__ clipping `__ - `Fwd: [Python-checkins] r51236 - in python/trunk: Doc/api/abstract.tex Include/abstract.h Include/object.h Lib/test/test_index.py Misc/NEWS Modules/arraymodule.c Modules/mmapmodule.c Modules/operator.c Objects/abstract.c Objects/classobject.c Objects/ `__ - `Fwd: [Python-checkins] r51236 - in python/trunk: Doc/api/abstract.tex Include/abstract.h Include/object.h Lib/test/test_index.py Misc/NEWS Modules/arraymodule.c Modules/mmapmodule.c Modules/operator.c Objects/abstract.c Objects/class `__ ---------------------------- OpenSSL and Windows binaries ---------------------------- Jim Jewett pointed out that a default build of OpenSSL includes the patented IDEA cipher, and asked whether that needed to be kept out of the Windows binary versions. There was some concern about dropping a feature, but Gregory P. Smith pointed out that IDEA isn't directly exposed to any Python user, and suggested that IDEA should never be required by any sane SSL connection. Martin v. L?wis promised to look into making the change. Contributing threads: - `windows 2.5 build: use OpenSSL for hashlib [bug 1535502] `__ - `openSSL and windows binaries - license `__ ---------------------------- Type of range object members ---------------------------- Alexander Belopolsky proposed making the members of the ``range()`` object use Py_ssize_t instead of C longs. Guido indicated that this was basically wasted effort -- in the long run, the members should be PyObject* so that they can handle Python longs correctly, so converting them to Py_ssize_t would be an intermediate step that wouldn't help in the transition. There was then some discussion about the int and long types in Python 3000, with Guido suggesting two separate implementations that would be mostly hidden at the Python level. 
Contributing thread: - `Type of range object members `__ ------------------------ Distutils version number ------------------------ A user noted that Python 2.4.3 shipped with distutils 2.4.1 and the version number of distutils in the repository was only 2.4.0 and requested that Python 2.5 include the newer distutils. In fact, the newest distutils was already the one in the repository but the version number had not been appropriately bumped. For a short while, the distutils number was automatically generated from the Python one, but Marc-Andre Lemburg volunteered to manually bump it so that it would be easier to use the SVN distutils with a different Python version. Contributing threads: - `Which version of distutils to ship with Python 2.5? `__ - `no remaining issues blocking 2.5 release `__ ------------------------------------- Dict containment and unhashable items ------------------------------------- tomer filiba suggested that dict.__contain__ should return False instead of raising a TypeError in situations like:: >>> a={1:2, 3:4} >>> [] in a Traceback (most recent call last): File "", line 1, in ? TypeError: list objects are unhashable Guido suggested that swallowing the TypeError here would be a mistake as it would also swallow any TypeErrors produced by faulty ``__hash__()`` methods. Contributing threads: - `dict containment annoyance `__ - `NotHashableError? (Re: dict containment annoyance) `__ ------------------------------- Returning longs from __hash__() ------------------------------- Armin Rigo pointed out that Python 2.5's change that allows id() to return ints or longs would have caused some breakage for custom hash functions like:: def __hash__(self): return id(self) Though it has long been documented that the result of ``id()`` is not suitable as a hash value, code like this is apparently common. So Martin v. L?wis and Armin arranged for ``PyLong_Type.tp_hash`` to be called in the code for ``hash()``. 
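The behaviour described above can be seen from Python code. This is a small sketch, assuming a CPython interpreter that includes the change; the class names are hypothetical.

```python
class ById(object):
    """The common (if discouraged) pattern from the discussion."""

    def __hash__(self):
        return id(self)


class Wide(object):
    """Returns a value far too large for a machine-sized hash."""

    def __hash__(self):
        return 2 ** 70


obj = ById()
table = {obj: 'found'}      # usable as a dict key whatever id() returns

wide_hash = hash(Wide())    # reduced the same way as hashing the integer itself
print(table[obj], wide_hash == hash(2 ** 70))
```

The second class is only there to force the oversized case; real code would normally derive its hash from the object's state.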
Contributing thread: - `returning longs from __hash__() `__ ---------------------- instancemethod builtin ---------------------- Nick Coghlan suggested adding an ``instancemethod()`` builtin along the lines of ``staticmethod()`` and ``classmethod()`` which would allow arbitrary callables to act more like functions. In particular, Nick was considering code like:: class C(object): method = some_callable Currently, if ``some_callable`` did not define the ``__get__()`` method, ``C().method`` would not bind the ``C`` instance as the first argument. By introducing ``instancemethod()``, this problem could be solved like:: class C(object): method = instancemethod(some_callable) There wasn't much of a reaction one way or another, so it looked like the idea would at least temporarily be shelved. Contributing thread: - `2.6 idea: a 'function' builtin to parallel classmethod and staticmethod `__ -------------------------------- Unicode versions and unicodedata -------------------------------- Armin Ronacher noted that Python 2.5 implements Unicode 4.1 but while a ucd_3_2_0 object is available (implementing Unicode 3.2), no ucd_4_1_0 object is available. Martin v. L?wis explained that the ucd_3_2_0 object is only available because IDNA needs it, and that there are no current plans to expose any other Unicode versions (and that ucd_3_2_0 may go away when IDNA no longer needs it). Contributing thread: - `Unicode Data in Python2.5 is missing a ucd_4_1_0 object `__ ================== Previous Summaries ================== - `Release manager pronouncement needed: PEP 302 Fix `__ =============== Skipped Threads =============== - `clock_gettime() vs. gettimeofday()? `__ - `Strange memo behavior from cPickle `__ - `internal weakref API should be Py_ssize_t? 
`__ - `Weekly Python Patch/Bug Summary `__ - `Releasemanager, please approve #1532975 `__ - `FW: using globals `__ - `TRUNK FREEZE 2006-07-03, 00:00 UTC for 2.5b3 `__ - `segmentation fault in Python 2.5b3 (trunk:51066) `__ - `using globals `__ - `uuid module - byte order issue `__ - `RELEASED Python 2.5 (beta 3) `__ - `TRUNK is UNFROZEN `__ - `2.5 status `__ - `Python 2.5b3 and AIX 4.3 - It Works `__ - `More tracker demos online `__ - `need an SSH key removed `__ - `BZ2File.writelines should raise more meaningful exceptions `__ - `test_mailbox on Cygwin `__ - `cgi.FieldStorage DOS (sf bug #1112549) `__ - `2.5b3, commit r46372 regressed PEP 302 machinery (sf not letting me post) `__ - `free(): invalid pointer `__ - `should i put this on the bug tracker ? `__ - `Is this a bug? `__ - `httplib and bad response chunking `__ - `cgi DoS attack `__ - `DRAFT: python-dev summary for 2006-07-01 to 2006-07-15 `__ - `SimpleXMLWriter missing from elementtree `__ - `DRAFT: python-dev summary for 2006-07-16 to 2006-07-31 `__ - `Is module clearing still necessary? [Re: Is this a bug?] `__ - `PyThreadState_SetAsyncExc bug? `__ - `Elementtree and Namespaces in 2.5 `__ - `Errors after running make test `__ - `What is the status of file.readinto? `__ - `Recent logging spew `__ - `[Python-3000] Python 2.5 release schedule (was: threading, part 2) `__ - `test_socketserver failure on cygwin `__ - `ANN: byteplay - a bytecode assembler/disassembler `__ - `Arlington VA sprint on Sept. 23 `__ - `IDLE patches - bugfix or not? `__ - `Four issue trackers submitted for Infrastructue Committee's tracker search `__ From anthony at interlink.com.au Tue Sep 26 06:14:43 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Tue, 26 Sep 2006 14:14:43 +1000 Subject: [Python-Dev] 2.4.4c1 October 11, 2.4.4 final October 18 Message-ID: <200609261414.46940.anthony@interlink.com.au> The plan for 2.4.4 is to have a release candidate on October 11th, and a final release on October 18th. 
This is very likely to be the last ever 2.4 release, after which 2.4.4 joins 2.3.5 and earlier in the old folks home, where it can live out its remaining life with dignity and respect. If you know of any backports that should go in, please make sure you get them done before the 11th. Anthony From fredrik at pythonware.com Tue Sep 26 08:23:10 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 26 Sep 2006 08:23:10 +0200 Subject: [Python-Dev] 2.4.4c1 October 11, 2.4.4 final October 18 In-Reply-To: <200609261414.46940.anthony@interlink.com.au> References: <200609261414.46940.anthony@interlink.com.au> Message-ID: Anthony Baxter wrote: > The plan for 2.4.4 is to have a release candidate on October 11th, and a final > release on October 18th. This is very likely to be the last ever 2.4 release, > after which 2.4.4 joins 2.3.5 and earlier in the old folks home "finally leaves school" is a more correct description, I think. my 2.3 and 2.4 installations are in a pretty good shape, after all... From skip at pobox.com Wed Sep 27 02:40:43 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue, 26 Sep 2006 19:40:43 -0500 Subject: [Python-Dev] 2.4.4c1 October 11, 2.4.4 final October 18 In-Reply-To: <200609261414.46940.anthony@interlink.com.au> References: <200609261414.46940.anthony@interlink.com.au> Message-ID: <17689.51339.120140.761854@montanaro.dyndns.org> Anthony> The plan for 2.4.4 is to have a release candidate on October Anthony> 11th, and a final release on October 18th. This is very likely Anthony> to be the last ever 2.4 release, after which 2.4.4 joins 2.3.5 Anthony> and earlier in the old folks home, where it can live out its Anthony> remaining life with dignity and respect. Anthony> If you know of any backports that should go in, please make Anthony> sure you get them done before the 11th. John Hunter (matplotlib author) recently made me aware of a problem with code.InteractiveConsole.
It doesn't protect itself from the user closing sys.stdout: % ./python.exe Python 2.4.4c0 (#2, Sep 26 2006, 06:26:16) [GCC 4.0.0 (Apple Computer, Inc. build 5026)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import code >>> c = code.InteractiveConsole() >>> c.interact() Python 2.4.4c0 (#2, Sep 26 2006, 06:26:16) [GCC 4.0.0 (Apple Computer, Inc. build 5026)] on darwin Type "help", "copyright", "credits" or "license" for more information. (InteractiveConsole) >>> import sys >>> sys.stdout.close() Traceback (most recent call last): File "", line 1, in ? File "/Users/skip/src/python-svn/release24-maint/Lib/code.py", line 234, in interact line = self.raw_input(prompt) File "/Users/skip/src/python-svn/release24-maint/Lib/code.py", line 277, in raw_input return raw_input(prompt) ValueError: I/O operation on closed file I think the right thing is for InteractiveConsole to dup sys.std{in,out,err} and do its own thing for its raw_input() method instead of naively calling the raw_input() builtin. I outlined a solution for ipython in a message to ipython-dev, but even better would be if InteractiveConsole itself was fixed: John Hunter alerted me to a segfault problem in code.InteractiveConsole when sys.stdout is closed. This problem is present in Python up to 2.4.3 as far as I can tell, but is fixed in later versions of Python (2.5, 2.4.4 when it's released, svn trunk). Even with that fix, if the user calls sys.stdout.close() you'll get a ValueError and your console will be useless. I took a look at the code in Python that the InteractiveConsole class exercises and see that the cause is that the naive raw_input() method simply calls the raw_input() builtin. That function gets the "stdin" and "stdout" functions from the sys module and there's no way to override that behavior. In my opinion, the best thing to do would be to subclass InteractiveConsole and provide a more robust raw_input() method. 
Ideally, I think you'd want to dup() the file descriptors for sys.{stdin,stdout} and use those instead of calling the builtin raw_input(). Something like (untested):

    import code
    import os
    import sys

    class IC(code.InteractiveConsole):
        def __init__(self):
            code.InteractiveConsole.__init__(self)
            # Private copies of the streams; output and error need write mode.
            self.input = os.fdopen(os.dup(sys.stdin.fileno()))
            self.output = os.fdopen(os.dup(sys.stdout.fileno()), "w")
            self.error = os.fdopen(os.dup(sys.stderr.fileno()), "w")

        def raw_input(self, prompt=""):
            if prompt:
                self.output.write(prompt)
                self.output.flush()
            line = self.input.readline()
            if not line:
                # interact() relies on EOFError to detect end of input.
                raise EOFError
            return line.rstrip("\n")

        def write(self, data):
            self.error.write(data)
            self.error.flush()

Also, the runcode() method will have to be overridden to use self.output instead of sys.stdout. Those couple changes should (hopefully) insulate IPython from such user wackiness. I'm happy to work up a patch for 2.4.4, 2.5.1 and 2.6.0. Does this group think that's the right route to take? Skip From skip at pobox.com Wed Sep 27 12:50:04 2006 From: skip at pobox.com (skip at pobox.com) Date: Wed, 27 Sep 2006 05:50:04 -0500 Subject: [Python-Dev] 2.4.4c1 October 11, 2.4.4 final October 18 In-Reply-To: <17689.51339.120140.761854@montanaro.dyndns.org> References: <200609261414.46940.anthony@interlink.com.au> <17689.51339.120140.761854@montanaro.dyndns.org> Message-ID: <17690.22364.692472.235533@montanaro.dyndns.org> Anthony> If you know of any backports that should go in, please make Anthony> sure you get them done before the 11th. skip> John Hunter (matplotlib author) recently made me aware of a skip> problem with code.InteractiveConsole. It doesn't protect itself skip> from the user closing sys.stdout: ... I attached a patch for code.py to http://sourceforge.net/support/tracker.php?aid=1563079 If someone wants to take a peek, that would be appreciated. It seems to me that it certainly should go into 2.5.1 and 2.6. Whether it's deemed serious enough to go into 2.4.4 is another question.
Skip From skip at pobox.com Wed Sep 27 17:28:46 2006 From: skip at pobox.com (skip at pobox.com) Date: Wed, 27 Sep 2006 10:28:46 -0500 Subject: [Python-Dev] [SECUNIA] "buffer overrun in repr() for unicode strings" Potential Vulnerability (fwd) Message-ID: <17690.39086.849178.331542@montanaro.dyndns.org> This came in to the webmaster address and was also addressed to a number of individuals (looks like the SF project admins). This appears like it would be of general interest to this group. Looking through this message and the various bug tracker items it's not clear to me if Secunia wants to know if the patch (which I believe has already been applied to all three active svn branches) is the source of the problem or if they want to know if it solves the buffer overrun problem. Are they suggesting that 10*size should be the character multiple in all cases? Skip -------------- next part -------------- An embedded message was scrubbed... From: Secunia Research Subject: [SECUNIA] "buffer overrun in repr() for unicode strings" Potential Vulnerability Date: Wed, 27 Sep 2006 15:18:46 +0200 Size: 5508 Url: http://mail.python.org/pipermail/python-dev/attachments/20060927/fdfd4bdf/attachment.mht From amk at amk.ca Wed Sep 27 18:40:04 2006 From: amk at amk.ca (A.M. Kuchling) Date: Wed, 27 Sep 2006 12:40:04 -0400 Subject: [Python-Dev] List of candidate 2.4.4 bugs? Message-ID: <20060927164004.GA12389@localhost.localdomain> Is anyone maintaining a list of candidate bugs to be fixed in 2.4.4? If not, should we start a wiki page for the purpose? --amk From martin at v.loewis.de Wed Sep 27 18:56:14 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 27 Sep 2006 18:56:14 +0200 Subject: [Python-Dev] List of candidate 2.4.4 bugs? In-Reply-To: <20060927164004.GA12389@localhost.localdomain> References: <20060927164004.GA12389@localhost.localdomain> Message-ID: <451AAD2E.1070705@v.loewis.de> A.M. 
Kuchling wrote: > Is anyone maintaining a list of candidate bugs to be fixed in 2.4.4? I don't think so. Also, I see little chance that many bugs will be fixed that aren't already. People should really do constant backporting, instead of starting backports when a subminor release is made. Of course, there are some things that people remember and want to see fixed, but they are pretty arbitrary. Regards, Martin From amk at amk.ca Wed Sep 27 19:35:42 2006 From: amk at amk.ca (A.M. Kuchling) Date: Wed, 27 Sep 2006 13:35:42 -0400 Subject: [Python-Dev] List of candidate 2.4.4 bugs? In-Reply-To: <451AAD2E.1070705@v.loewis.de> References: <20060927164004.GA12389@localhost.localdomain> <451AAD2E.1070705@v.loewis.de> Message-ID: <20060927173542.GA10686@rogue.amk.ca> On Wed, Sep 27, 2006 at 06:56:14PM +0200, "Martin v. Löwis" wrote: > I don't think so. Also, I see little chance that many bugs will be fixed > that aren't already. People should really do constant backporting, > instead of starting backports when a subminor release is made. Agreed. One reason I often don't backport a bug is because I'm not sure if there will be another bugfix release; if not, it's wasted effort, and I wasn't sure if a 2.4.4 release was ever going to happen. After 2.4.4, will there be a 2.4.5 or is that too unlikely? I've done an 'svn log' on the modules I'm familiar with (curses, zlib, gzip) and will look at backporting the results.
Grepping for 'backport candidate' in 'svn log -r37910:HEAD' turns up 30-odd checkins that contain the phrase: r51728 r51669 r47171 r47061 r46991 r46882 r46879 r46878 r46602 r46589 r45234 r41842 r41696 r41531 r39767 r39743 r39739 r39650 r39645 r39595 r39594 r39491 r39135 r39044 r39030 r39012 r38932 r38927 r38887 r38826 r38781 r38772 r38745 --amk From fredrik at pythonware.com Wed Sep 27 19:25:13 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 27 Sep 2006 19:25:13 +0200 Subject: [Python-Dev] 2.4.4c1 October 11, 2.4.4 final October 18 In-Reply-To: <17689.51339.120140.761854@montanaro.dyndns.org> References: <200609261414.46940.anthony@interlink.com.au> <17689.51339.120140.761854@montanaro.dyndns.org> Message-ID: skip at pobox.com wrote: > I think the right thing is for InteractiveConsole to dup sys.std{in,out,err} > and do its own thing for its raw_input() method instead of naively calling > the raw_input() builtin. what guarantees that sys.stdin etc has a valid and dup:able fileno when the console is instantiated ? From skip at pobox.com Wed Sep 27 20:06:34 2006 From: skip at pobox.com (skip at pobox.com) Date: Wed, 27 Sep 2006 13:06:34 -0500 Subject: [Python-Dev] 2.4.4c1 October 11, 2.4.4 final October 18 In-Reply-To: References: <200609261414.46940.anthony@interlink.com.au> <17689.51339.120140.761854@montanaro.dyndns.org> Message-ID: <17690.48554.84183.999508@montanaro.dyndns.org> Fredrik> skip at pobox.com wrote: >> I think the right thing is for InteractiveConsole to dup >> sys.std{in,out,err} and do its own thing for its raw_input() method >> instead of naively calling the raw_input() builtin. Fredrik> what guarantees that sys.stdin etc has a valid and dup:able Fredrik> fileno when the console is instantiated ? Nothing, I suppose. I'm just concerned that the InteractiveConsole instance keep working after its interact() method is called. 
Skip From jimjjewett at gmail.com Wed Sep 27 20:10:16 2006 From: jimjjewett at gmail.com (Jim Jewett) Date: Wed, 27 Sep 2006 14:10:16 -0400 Subject: [Python-Dev] openssl - was: 2.4.4c1 October 11, 2.4.4 final October 18 Message-ID: OpenSSL should probably be upgraded to 0.9.8.c (or possibly 0.9.7.k) because of the security patch. http://www.openssl.org/ http://www.openssl.org/news/secadv_20060905.txt I'm not sure which version shipped with the 2.4 windows binaries, but externals (for 2.5) still points to 0.9.8.a, which is vulnerable. openssl has also patched 0.9.7.k (0.9.7 was released in 2003) and the patch itself http://www.openssl.org/news/patch-CVE-2006-4339.txt should apply to 0.9.6 (released in 2000). -jJ From martin at v.loewis.de Wed Sep 27 20:31:11 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 27 Sep 2006 20:31:11 +0200 Subject: [Python-Dev] openssl - was: 2.4.4c1 October 11, 2.4.4 final October 18 In-Reply-To: References: Message-ID: <451AC36F.5020404@v.loewis.de> Jim Jewett schrieb: > OpenSSL should probably be upgraded to 0.9.8.c (or possibly 0.9.7.k) > because of the security patch. > > http://www.openssl.org/ > http://www.openssl.org/news/secadv_20060905.txt > > I'm not sure which version shipped with the 2.4 windows binaries, but > externals (for 2.5) still points to 0.9.8.a, which is vulnerable. If there is any change, it should be to 0.9.7k; we shouldn't switch to a new "branch" of OpenSSL in micro releases. However, I'm uncertain whether I can do the update in next weeks. Regards, Martin From martin at v.loewis.de Wed Sep 27 21:20:08 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 27 Sep 2006 21:20:08 +0200 Subject: [Python-Dev] List of candidate 2.4.4 bugs? 
In-Reply-To: <20060927173542.GA10686@rogue.amk.ca> References: <20060927164004.GA12389@localhost.localdomain> <451AAD2E.1070705@v.loewis.de> <20060927173542.GA10686@rogue.amk.ca> Message-ID: <451ACEE8.2020604@v.loewis.de> A.M. Kuchling schrieb: > One reason I often don't backport a bug is because I'm not sure if > there will be another bugfix release; if not, it's wasted effort, and > I wasn't sure if a 2.4.4 release was ever going to happen. After > 2.4.4, will there be a 2.4.5 or is that too unlikely? The "tradition" seems to be that there will be one last bug fix release after a feature release is made; IOW, two branches are always maintained (the trunk and the last release). Following this tradition, there wouldn't be another 2.4.x release (and I think Anthony already said so). Likewise, 2.5. will be maintained until 2.6 is released, and one last 2.5.x release will be made shortly after 2.6. Regards, Martin From gustavo at niemeyer.net Wed Sep 27 22:15:21 2006 From: gustavo at niemeyer.net (Gustavo Niemeyer) Date: Wed, 27 Sep 2006 17:15:21 -0300 Subject: [Python-Dev] Minipython In-Reply-To: <20060923221441.GB5227@hornet.din.cz> References: <20060923221441.GB5227@hornet.din.cz> Message-ID: <20060927201521.GA24770@niemeyer.net> > I would like to run Python scripts on an embedded MIPS Linux platform > having only 2 MiB of flash ROM and 16 MiB of RAM for everything. (...) Have you looked at Python for S60 and Python for the Maemo platform? If not directly useful, they should provide some hints. [1] http://opensource.nokia.com/projects/pythonfors60/ [2] http://pymaemo.sf.net -- Gustavo Niemeyer http://niemeyer.net From brett at python.org Wed Sep 27 23:11:30 2006 From: brett at python.org (Brett Cannon) Date: Wed, 27 Sep 2006 14:11:30 -0700 Subject: [Python-Dev] difficulty of implementing phase 2 of PEP 302 in Python source Message-ID: I am at the point with my security work that I need to consider how I am going to restrict importing modules. 
My current plan is to basically implement phase 2 of PEP 302 and control imports through what importer objects are provided. This work should lead to a meta_path importer for built-ins and then path_hooks importers for .py, .pyc, and extension modules. But it has been suggested here that the import machinery be rewritten in Python. Now I have never touched the import code since it has always had the reputation of being less than friendly to work with. I am asking for opinions from people who have worked with the import machinery before if it is so bad that it is worth trying to re-implement the import semantics in pure Python or if in the name of time to just work with the C code. Basically I will end up breaking up built-in, .py, .pyc, and extension modules into individual importers and then have a chaining class to act as a combined .pyc/.py combination importer (this will also make writing out to .pyc files an optional step of the .py import). Any opinions would be greatly appreciated on this. I need to get back to my supervisor by the end of the day Friday with a decision as to whether I think it is worth the rewrite. If you are interested in helping with the Python rewrite (or in general if the work is done with the C code), please let me know since if enough people want to help with the Python rewrite it might help wash out the extra time needed to make it work. -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060927/60f301e2/attachment.htm From pje at telecommunity.com Thu Sep 28 00:31:34 2006 From: pje at telecommunity.com (Phillip J. 
Eby)
Date: Wed, 27 Sep 2006 18:31:34 -0400
Subject: [Python-Dev] difficulty of implementing phase 2 of PEP 302 in Python source
In-Reply-To:
Message-ID: <5.1.1.6.0.20060927181456.03e8c088@sparrow.telecommunity.com>

At 02:11 PM 9/27/2006 -0700, Brett Cannon wrote:
>But it has been suggested here that the import machinery be rewritten in
>Python. Now I have never touched the import code since it has always had
>the reputation of being less than friendly to work with. I am asking for
>opinions from people who have worked with the import machinery before if
>it is so bad that it is worth trying to re-implement the import semantics
>in pure Python or if in the name of time to just work with the C
>code. Basically I will end up breaking up built-in, .py, .pyc, and
>extension modules into individual importers and then have a chaining class
>to act as a combined .pyc/.py combination importer (this will also make
>writing out to .pyc files an optional step of the .py import).

The problem you would run into here would be supporting zip imports. It would probably be more useful to have a mapping of file types to "format handlers", because then a filesystem importer or zip importer would then be able to work with any .py/.pyc/.pyo/whatever formats, along with any new ones that are invented, without reinventing the wheel.

Thus, whether it's file import, zip import, web import, or whatever, the same handlers would be reusable, and when people invent new extensions like .ptl, .kid, etc., they can just register format handlers instead.

Format handlers could of course be based on the PEP 302 protocol, and simply accept a "parent importer" with a get_data() method. So, let's say you have a PyImporter:

    class PyImporter:
        def __init__(self, parent_importer):
            self.parent = parent_importer

        def find_module(self, fullname):
            path = fullname.split('.')[-1]+'.py'
            try:
                source = self.parent.get_data(path)
            except IOError:
                return None
            else:
                return PySourceLoader(source)

See what I mean?
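[Editor's sketch, not part of the original message.] The PyImporter above can be exercised end to end with a toy parent importer. The following is only a minimal illustration under stated assumptions: DictStore is a hypothetical in-memory data store, and the body of PySourceLoader is invented here, since the message names the class without defining it. Nothing is registered with the real import system; the objects are called by hand.

```python
import types


class DictStore:
    """Hypothetical parent (storage) importer: serves file contents from a dict."""

    def __init__(self, files):
        self.files = files

    def get_data(self, path):
        # Signal "not here" the same way the message suggests: raise IOError.
        try:
            return self.files[path]
        except KeyError:
            raise IOError(path)


class PySourceLoader:
    """Assumed loader body: compiles source text into a fresh module object."""

    def __init__(self, source):
        self.source = source

    def load_module(self, fullname):
        module = types.ModuleType(fullname)
        code = compile(self.source, fullname + ".py", "exec")
        exec(code, module.__dict__)
        return module


class PyImporter:
    """Format importer from the message: knows about .py, not about storage."""

    def __init__(self, parent_importer):
        self.parent = parent_importer

    def find_module(self, fullname):
        path = fullname.split(".")[-1] + ".py"
        try:
            source = self.parent.get_data(path)
        except IOError:
            return None
        return PySourceLoader(source)


store = DictStore({"hello.py": "GREETING = 'hi'\n"})
importer = PyImporter(store)

hello = importer.find_module("hello").load_module("hello")
missing = importer.find_module("nothere")  # parent raises IOError, so None
```

Swapping DictStore for a zip-backed or filesystem-backed object with the same get_data() method would leave PyImporter untouched, which is the separation being argued for.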
The importers and loaders thus don't have to do direct filesystem operations. Of course, to fully support .pyc timestamp checking and writeback, you'd need some sort of "stat" or "getmtime" feature on the parent importer, as well as perhaps an optional "save_data" method. These would be extensions to PEP 302, but welcome ones. Anyway, based on my previous work with pkg_resource, pkgutil, zipimport, import.c, etc. I would say this is how I'd want to structure a reimplementation of the core system. And if it were for Py3K, I'd probably treat sys.path and all the import hooks associated with it as a single meta-importer on sys.meta_path -- listed after a meta-importer for handling frozen and built-in modules. (I.e., the meta-importer that uses sys.path and its path hooks would be last on sys.meta_path.) In other words, sys.meta_path is really the only critical import hook from the raw interpreter's point of view. sys.path, however, (along with sys.path_hooks and sys.path_importer_cache) is critical from the perspective of users, applications, etc., as there has to be some way to get things onto Python's path in the first place. From brett at python.org Thu Sep 28 01:11:33 2006 From: brett at python.org (Brett Cannon) Date: Wed, 27 Sep 2006 16:11:33 -0700 Subject: [Python-Dev] difficulty of implementing phase 2 of PEP 302 in Python source In-Reply-To: <5.1.1.6.0.20060927181456.03e8c088@sparrow.telecommunity.com> References: <5.1.1.6.0.20060927181456.03e8c088@sparrow.telecommunity.com> Message-ID: On 9/27/06, Phillip J. Eby wrote: > > At 02:11 PM 9/27/2006 -0700, Brett Cannon wrote: > >But it has been suggested here that the import machinery be rewritten in > >Python. Now I have never touched the import code since it has always had > >the reputation of being less than friendly to work with. 
I am asking for > >opinions from people who have worked with the import machinery before if > >it is so bad that it is worth trying to re-implement the import semantics > >in pure Python or if in the name of time to just work with the C > >code. Basically I will end up breaking up built-in, .py, .pyc, and > >extension modules into individual importers and then have a chaining > class > >to act as a combined .pyc/.py combination importer (this will also make > >writing out to .pyc files an optional step of the .py import). > > The problem you would run into here would be supporting zip imports. I have not looked at zipimport so I don't know the exact issue in terms of how it hooks into the import machinery. But a C level API will most likely be needed. It > would probably be more useful to have a mapping of file types to "format > handlers", because then a filesystem importer or zip importer would then > be > able to work with any .py/.pyc/.pyo/whatever formats, along with any new > ones that are invented, without reinventing the wheel. So you are saying the zipimporter would then pull out of the zip file the individual file to import and pass that to the format-specific importer? Thus, whether it's file import, zip import, web import, or whatever, the > same handlers would be reusable, and when people invent new extensions > like > .ptl, .kid, etc., they can just register format handlers instead. So a sepration of data store from data interpretation for importation. My only worry is a possible explosion of checks for the various data types. If you are using the file data store and had .py, .pyc, .so, module.so, .ptl, and .kid registered that might suck in terms of performance hit. And I am assuming for a web import that it would decide based on the extension of the resulting web address? And checking for the various types might not work well for other data store types. 
Guess you would need a way to register with the data store exactly what types of data interpretation you might want to check. Other option is to just have the data store do its magic and somehow know what kind of data interpretation is needed for the string returned (e.g., a database data store might implicitly only store .py code and thus know that it will only return a string of source). Then that string and the supposed file extension is passed ot the next step of creating a module from that data string. Format handlers could of course be based on the PEP 302 protocol, and > simply accept a "parent importer" with a get_data() method. So, let's say > you have a PyImporter: > > class PyImporter: > def __init__(self, parent_importer): > self.parent = parent_importer > > def find_module(self, fullname): > path = fullname.split('.')[-1]+'.py' > try: > source = self.parent.get_data(path) > except IOError: > return None > else: > return PySourceLoader(source) > > See what I mean? The importers and loaders thus don't have to do direct > filesystem operations. I think so. Basically you want more of a way to stack imports so that the basic importers are just passed the string of what it is supposed to load from. Other importers higher in the chain can handle getting that string. Of course, to fully support .pyc timestamp checking and writeback, you'd > need some sort of "stat" or "getmtime" feature on the parent importer, as > well as perhaps an optional "save_data" method. These would be extensions > to PEP 302, but welcome ones. Could pass the string representing the location of where the string came from. That would allow for the required stat calls for .pyc files as needed without having to implement methods just for this one use case. Anyway, based on my previous work with pkg_resource, pkgutil, zipimport, > import.c, etc. I would say this is how I'd want to structure a > reimplementation of the core system. 
And if it were for Py3K, I'd > probably > treat sys.path and all the import hooks associated with it as a single > meta-importer on sys.meta_path -- listed after a meta-importer for > handling > frozen and built-in modules. (I.e., the meta-importer that uses sys.path > and its path hooks would be last on sys.meta_path.) Ah, interesting idea! Could even go as far as removing sys.path and just making it an attribute of the base importer if you really wanted to make it just meta_path for imports. In other words, sys.meta_path is really the only critical import hook from > the raw interpreter's point of view. sys.path, however, (along with > sys.path_hooks and sys.path_importer_cache) is critical from the > perspective of users, applications, etc., as there has to be some way to > get things onto Python's path in the first place. > > Yeah, I think I get it. I don't know how much it simplifies things for users but I think it might make it easier for alternative import writers. -Brett From pje at telecommunity.com Thu Sep 28 01:41:18 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 27 Sep 2006 19:41:18 -0400 Subject: [Python-Dev] difficulty of implementing phase 2 of PEP 302 in Python source In-Reply-To: References: <5.1.1.6.0.20060927181456.03e8c088@sparrow.telecommunity.com> <5.1.1.6.0.20060927181456.03e8c088@sparrow.telecommunity.com> Message-ID: <5.1.1.6.0.20060927192728.02df6310@sparrow.telecommunity.com> At 04:11 PM 9/27/2006 -0700, Brett Cannon wrote: >On 9/27/06, Phillip J. Eby ><pje at telecommunity.com> wrote: >>At 02:11 PM 9/27/2006 -0700, Brett Cannon wrote: >> >But it has been suggested here that the import machinery be rewritten in >> >Python. Now I have never touched the import code since it has always had >> >the reputation of being less than friendly to work with.
I am asking for >> >opinions from people who have worked with the import machinery before if >> >it is so bad that it is worth trying to re-implement the import semantics >> >in pure Python or if in the name of time to just work with the C >> >code. Basically I will end up breaking up built-in, .py, .pyc, and >> >extension modules into individual importers and then have a chaining class >> >to act as a combined .pyc/.py combination importer (this will also make >> >writing out to .pyc files an optional step of the .py import). >> >>The problem you would run into here would be supporting zip imports. > >I have not looked at zipimport so I don't know the exact issue in terms of >how it hooks into the import machinery. But a C level API will most >likely be needed. I was actually assuming you planned to reimplement that in Python as well, and hence the need for the storage/format separation. >> It >>would probably be more useful to have a mapping of file types to "format >>handlers", because then a filesystem importer or zip importer would then be >>able to work with any .py/.pyc/.pyo/whatever formats, along with any new >>ones that are invented, without reinventing the wheel. > >So you are saying the zipimporter would then pull out of the zip file the >individual file to import and pass that to the format-specific importer? No, I'm saying that the zipimporter would simply call the format importers in sequence, as in your original concept. However, these importers would call *back* to the zipimporter to ask if the file they are looking for is there. >>Thus, whether it's file import, zip import, web import, or whatever, the >>same handlers would be reusable, and when people invent new extensions like >>.ptl, .kid, etc., they can just register format handlers instead. > >So a sepration of data store from data interpretation for importation. My >only worry is a possible explosion of checks for the various data >types. 
If you are using the file data store and had .py, .pyc, .so, >module.so , .ptl, and .kid registered that might suck in terms of >performance hit. Look at it this way: the parent importer can always pull a directory listing once and cache it for the duration of its calls to the child importers. In practice, however, I suspect that the stat calls will be faster. In the case of a zipimport parent, the zip directory is already cached. Also, keep in mind that most imports will likely occur *before* any special additional types get registered, so the hits will be minimal. And the more of sys.path is taken up by zip files, the less of a hit it will be for each query. > And I am assuming for a web import that it would decide based on the > extension of the resulting web address? No - you'd effectively end up doing a web hit for each possible extension. Which would suck, but that's what caching is for. Realistically, you wouldn't want to do web-based imports without some disk-based caching anyway. > And checking for the various types might not work well for other data > store types. Guess you would need a way to register with the data store > exactly what types of data interpretation you might want to check. No, you just need a method on the parent importer like get_data(). >Other option is to just have the data store do its magic and somehow know >what kind of data interpretation is needed for the string returned (e.g., >a database data store might implicitly only store .py code and thus know >that it will only return a string of source). Then that string and the >supposed file extension is passed ot the next step of creating a module >from that data string. Again, all that's way more complex than you need; you can do the same thing by just raising IOError from get_data() when asked for something that's not a .py. >>Format handlers could of course be based on the PEP 302 protocol, and >>simply accept a "parent importer" with a get_data() method. 
So, let's say >>you have a PyImporter: >> >> class PyImporter: >> def __init__(self, parent_importer): >> self.parent = parent_importer >> >> def find_module(self, fullname): >> path = fullname.split('.')[-1]+'.py' >> try: >> source = self.parent.get_data(path) >> except IOError: >> return None >> else: >> return PySourceLoader(source) >> >>See what I mean? The importers and loaders thus don't have to do direct >>filesystem operations. > >I think so. Basically you want more of a way to stack imports so that the >basic importers are just passed the string of what it is supposed to load >from. Other importers higher in the chain can handle getting that string. No, they're full importers; they're not passed "a string". The only difference between this and your original idea of an importer chain is that I'm saying the chained format-specific importers need to know who their "parent" importer (the data store) is, so they can be data-store independent. Everything else can be done with that, and perhaps a few extra parent importer methods for stat, save, etc. >>Of course, to fully support .pyc timestamp checking and writeback, you'd >>need some sort of "stat" or "getmtime" feature on the parent importer, as >>well as perhaps an optional "save_data" method. These would be extensions >>to PEP 302, but welcome ones. > >Could pass the string representing the location of where the string came >from. That would allow for the required stat calls for .pyc files as >needed without having to implement methods just for this one use case. Huh? In order to know if a .pyc is up to date, you need the st_mtime of the .py file. That can't be done in the parent importer without giving it format knowledge, which goes against the point of the exercise. Thus, something like stat() and save() methods need to be available on the parent, if it can support them. >>Anyway, based on my previous work with pkg_resource, pkgutil, zipimport, >>import.c , etc. 
I would say this is how I'd want to structure a >>reimplementation of the core system. And if it were for Py3K, I'd probably >>treat sys.path and all the import hooks associated with it as a single >>meta-importer on sys.meta_path -- listed after a meta-importer for handling >>frozen and built-in modules. (I.e., the meta-importer that uses sys.path >>and its path hooks would be last on sys.meta_path.) > >Ah, interesting idea! Could even go as far as removing sys.path and just >making it an attribute of the base importer if you really wanted to make >it just meta_path for imports. Perhaps, but then that just means you have to have a new variable for 'sys.path_importer' or some such, just to get at it. (i.e., code won't be able to assume it's always the last item in sys.meta_path). So this seems wasteful and changing things just for the sake of change, vs. just keeping the other PEP 302 sys variables. I just think the *implementation* of them can move to sys.meta_path, as that simplifies the main __import__ function down to just calling meta_path importers in sequence, modulo some package issues. One other rather tricky matter is that the sys.path meta-importer has to deal with package __path__ management... and actually, meta_path importers are supposed to receive a copy of sys.path... ugh. Well, it was a nice idea, but I guess you can't actually implement sys.path using a meta_path importer. :( For Py3K, we could drop the path argument to find_module() and manage it, but it can't be done and still allow current meta_path hooks to work right. >>In other words, sys.meta_path is really the only critical import hook from >>the raw interpreter's point of view. sys.path, however, (along with >>sys.path_hooks and sys.path_importer_cache) is critical from the >>perspective of users, applications, etc., as there has to be some way to >>get things onto Python's path in the first place. > >Yeah, I think I get it. 
I don't know how much it simplifies things for >users but I think it might make it easier for alternative import writers. That was the idea, yes. :) From brett at python.org Thu Sep 28 02:26:14 2006 From: brett at python.org (Brett Cannon) Date: Wed, 27 Sep 2006 17:26:14 -0700 Subject: [Python-Dev] difficulty of implementing phase 2 of PEP 302 in Python source In-Reply-To: <5.1.1.6.0.20060927192728.02df6310@sparrow.telecommunity.com> References: <5.1.1.6.0.20060927181456.03e8c088@sparrow.telecommunity.com> <5.1.1.6.0.20060927192728.02df6310@sparrow.telecommunity.com> Message-ID: On 9/27/06, Phillip J. Eby wrote: > > At 04:11 PM 9/27/2006 -0700, Brett Cannon wrote: > > > >On 9/27/06, Phillip J. Eby > ><pje at telecommunity.com> wrote: > >>At 02:11 PM 9/27/2006 -0700, Brett Cannon wrote: > >> >But it has been suggested here that the import machinery be rewritten > in > >> >Python. Now I have never touched the import code since it has always > had > >> >the reputation of being less than friendly to work with. I am asking > for > >> >opinions from people who have worked with the import machinery before > if > >> >it is so bad that it is worth trying to re-implement the import > semantics > >> >in pure Python or if in the name of time to just work with the C > >> >code. Basically I will end up breaking up built-in, .py, .pyc, and > >> >extension modules into individual importers and then have a chaining > class > >> >to act as a combined .pyc/.py combination importer (this will also > make > >> >writing out to .pyc files an optional step of the .py import). > >> > >>The problem you would run into here would be supporting zip imports. > > > >I have not looked at zipimport so I don't know the exact issue in terms > of > >how it hooks into the import machinery. But a C level API will most > >likely be needed. > > I was actually assuming you planned to reimplement that in Python as well, > and hence the need for the storage/format separation. 
I was not explictly planning on it. >> It > >>would probably be more useful to have a mapping of file types to "format > >>handlers", because then a filesystem importer or zip importer would then > be > >>able to work with any .py/.pyc/.pyo/whatever formats, along with any new > >>ones that are invented, without reinventing the wheel. > > > >So you are saying the zipimporter would then pull out of the zip file the > >individual file to import and pass that to the format-specific importer? > > No, I'm saying that the zipimporter would simply call the format importers > in sequence, as in your original concept. However, these importers would > call *back* to the zipimporter to ask if the file they are looking for is > there. Ah, OK. So for importing 'email', the zipimporter would call the .pyc importer and it would ask the zipimporter, "can you get me email.pyc?" and if it said no it would move on to asking the .py importer for email.py, etc. >>Thus, whether it's file import, zip import, web import, or whatever, the > >>same handlers would be reusable, and when people invent new extensions > like > >>.ptl, .kid, etc., they can just register format handlers instead. > > > >So a sepration of data store from data interpretation for > importation. My > >only worry is a possible explosion of checks for the various data > >types. If you are using the file data store and had .py, .pyc, .so, > >module.so , .ptl, and .kid registered that might suck in terms of > >performance hit. > > Look at it this way: the parent importer can always pull a directory > listing once and cache it for the duration of its calls to the child > importers. In practice, however, I suspect that the stat calls will be > faster. In the case of a zipimport parent, the zip directory is already > cached. > > Also, keep in mind that most imports will likely occur *before* any > special > additional types get registered, so the hits will be minimal. 
And the > more > of sys.path is taken up by zip files, the less of a hit it will be for > each > query. That's fine. Just thinking about how the current situation sucks for NFS but how caching just isn't done. But obvoiusly this could change. > And I am assuming for a web import that it would decide based on the > > extension of the resulting web address? > > No - you'd effectively end up doing a web hit for each possible > extension. Which would suck, but that's what caching is > for. Realistically, you wouldn't want to do web-based imports without > some > disk-based caching anyway. > > > And checking for the various types might not work well for other data > > store types. Guess you would need a way to register with the data store > > exactly what types of data interpretation you might want to check. > > No, you just need a method on the parent importer like get_data(). > > > >Other option is to just have the data store do its magic and somehow know > >what kind of data interpretation is needed for the string returned (e.g., > >a database data store might implicitly only store .py code and thus know > >that it will only return a string of source). Then that string and the > >supposed file extension is passed ot the next step of creating a module > >from that data string. > > Again, all that's way more complex than you need; you can do the same > thing > by just raising IOError from get_data() when asked for something that's > not > a .py. > > > >>Format handlers could of course be based on the PEP 302 protocol, and > >>simply accept a "parent importer" with a get_data() method. 
So, let's > say > >>you have a PyImporter: > >> > >> class PyImporter: > >> def __init__(self, parent_importer): > >> self.parent = parent_importer > >> > >> def find_module(self, fullname): > >> path = fullname.split('.')[-1]+'.py' > >> try: > >> source = self.parent.get_data(path) > >> except IOError: > >> return None > >> else: > >> return PySourceLoader(source) > >> > >>See what I mean? The importers and loaders thus don't have to do direct > >>filesystem operations. > > > >I think so. Basically you want more of a way to stack imports so that > the > >basic importers are just passed the string of what it is supposed to load > >from. Other importers higher in the chain can handle getting that > string. > > No, they're full importers; they're not passed "a string". The only > difference between this and your original idea of an importer chain is > that > I'm saying the chained format-specific importers need to know who their > "parent" importer (the data store) is, so they can be data-store > independent. Everything else can be done with that, and perhaps a few > extra parent importer methods for stat, save, etc. OK. >>Of course, to fully support .pyc timestamp checking and writeback, you'd > >>need some sort of "stat" or "getmtime" feature on the parent importer, > as > >>well as perhaps an optional "save_data" method. These would be > extensions > >>to PEP 302, but welcome ones. > > > >Could pass the string representing the location of where the string came > >from. That would allow for the required stat calls for .pyc files as > >needed without having to implement methods just for this one use case. > > Huh? In order to know if a .pyc is up to date, you need the st_mtime of > the .py file. That can't be done in the parent importer without giving it > format knowledge, which goes against the point of the exercise. 
Sorry, thought .pyc files based whether they needed to be recompiled based on the stat info on the .py and .pyc file, not on data stored from within the .pyc . Thus, > something like stat() and save() methods need to be available on the > parent, if it can support them. > > > >>Anyway, based on my previous work with pkg_resource, pkgutil, zipimport, > >>import.c , etc. I would say this is how I'd want to structure a > >>reimplementation of the core system. And if it were for Py3K, I'd > probably > >>treat sys.path and all the import hooks associated with it as a single > >>meta-importer on sys.meta_path -- listed after a meta-importer for > handling > >>frozen and built-in modules. (I.e., the meta-importer that uses > sys.path > >>and its path hooks would be last on sys.meta_path.) > > > >Ah, interesting idea! Could even go as far as removing sys.path and just > >making it an attribute of the base importer if you really wanted to make > >it just meta_path for imports. > > Perhaps, but then that just means you have to have a new variable for > 'sys.path_importer' or some such, just to get at it. (i.e., code won't be > able to assume it's always the last item in sys.meta_path). So this seems > wasteful and changing things just for the sake of change, vs. just keeping > the other PEP 302 sys variables. I just think the *implementation* of > them > can move to sys.meta_path, as that simplifies the main __import__ function > down to just calling meta_path importers in sequence, modulo some package > issues. > > One other rather tricky matter is that the sys.path meta-importer has to > deal with package __path__ management... and actually, meta_path > importers > are supposed to receive a copy of sys.path... ugh. Well, it was a nice > idea, but I guess you can't actually implement sys.path using a meta_path > importer. 
:( For Py3K, we could drop the path argument to find_module() > and manage it, but it can't be done and still allow current meta_path > hooks > to work right. Ah, true. >>In other words, sys.meta_path is really the only critical import hook from > >>the raw interpreter's point of view. sys.path, however, (along with > >>sys.path_hooks and sys.path_importer_cache) is critical from the > >>perspective of users, applications, etc., as there has to be some way to > >>get things onto Python's path in the first place. > > > >Yeah, I think I get it. I don't know how much it simplifies things for > >users but I think it might make it easier for alternative import writers. > > That was the idea, yes. :) =) -Brett From pje at telecommunity.com Thu Sep 28 02:41:15 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 27 Sep 2006 20:41:15 -0400 Subject: [Python-Dev] difficulty of implementing phase 2 of PEP 302 in Python source In-Reply-To: References: <5.1.1.6.0.20060927192728.02df6310@sparrow.telecommunity.com> <5.1.1.6.0.20060927181456.03e8c088@sparrow.telecommunity.com> <5.1.1.6.0.20060927192728.02df6310@sparrow.telecommunity.com> Message-ID: <5.1.1.6.0.20060927203149.028c5e90@sparrow.telecommunity.com> At 05:26 PM 9/27/2006 -0700, Brett Cannon wrote: >Ah, OK. So for importing 'email', the zipimporter would call the .pyc >importer and it would ask the zipimporter, "can you get me email.pyc?" and >if it said no it would move on to asking the .py importer for email.py, etc. Yes, exactly. >That's fine. Just thinking about how the current situation sucks for NFS >but how caching just isn't done. But obviously this could change. Well, with this design, you can have a CachingFilesystemImporter as your storage mechanism to speed things up.
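[Editor's sketch, not part of the original message.] One way a CachingFilesystemImporter could behave, as a rough illustration only: the message gives the name but no implementation, so the constructor argument, the `misses` counter, and the decision to list the directory exactly once are all assumptions made here. The probing loop at the bottom stands in for the chain of format importers asking for email.pyc and then email.py.

```python
import os
import tempfile


class CachingFilesystemImporter:
    """Assumed storage importer: lists its directory once, then answers from the cache."""

    def __init__(self, directory):
        self.directory = directory
        # One listdir() up front instead of one filesystem probe per candidate name.
        self._names = set(os.listdir(directory))
        self.misses = 0  # illustrative counter, not part of any protocol

    def get_data(self, path):
        if path not in self._names:
            # Answered from the cached listing; the filesystem is not touched.
            self.misses += 1
            raise IOError(path)
        with open(os.path.join(self.directory, path), "rb") as f:
            return f.read()


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "email.py"), "wb") as f:
        f.write(b"x = 1\n")

    store = CachingFilesystemImporter(tmp)

    # Stand-in for the format importers: try .pyc first, then .py.
    found = None
    for ext in (".pyc", ".py"):
        try:
            found = (ext, store.get_data("email" + ext))
            break
        except IOError:
            continue
```

The obvious cost of this design is staleness: files created after the listing was taken are invisible until the cache is rebuilt, so a real version would need some invalidation policy.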
>> >>Of course, to fully support .pyc timestamp checking and writeback, you'd >> >>need some sort of "stat" or "getmtime" feature on the parent importer, as >> >>well as perhaps an optional "save_data" method. These would be extensions >> >>to PEP 302, but welcome ones. >> > >> >Could pass the string representing the location of where the string came >> >from. That would allow for the required stat calls for .pyc files as >> >needed without having to implement methods just for this one use case. >> >>Huh? In order to know if a .pyc is up to date, you need the st_mtime of >>the .py file. That can't be done in the parent importer without giving it >>format knowledge, which goes against the point of the exercise. > >Sorry, thought .pyc files based whether they needed to be recompiled based >on the stat info on the .py and .pyc file, not on data stored from within >the .pyc . It's not just that (although I believe it's also the case that there is a timestamp inside .pyc), it's that to do the check in the parent importer, the parent importer would have to know that there is such a thing as .py-and-.pyc. The whole point of this design is that the parent importer doesn't have to know *anything* about filename extensions OR how those files are formatted internally. In this scheme, adding more child importers is sufficient to add all the special handling needed for .py/.pyc-style schemes. Of course, for maximum flexibility, you might want get_stream() and get_file() methods optionally available, since a .so loader really needs a file, and .pyc might want to read in two stages. But the child importers can be defensively coded so as to be able to live with only a parent.get_data(), if necessary, and do the enhanced behaviors only if stat() or get_stream() or write_data() etc. attributes are available on the parent. If we get some standards for these additional attributes, we can document them as standard PEP 302 extensions. 
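[Editor's sketch, not part of the original message.] The "defensively coded" child importer idea can be illustrated with a freshness check that uses getmtime() only when the parent offers it. Everything here is hypothetical: the store classes, the method name getmtime, and especially the toy .pyc layout (four filler bytes followed by a 4-byte little-endian timestamp), which is chosen for the example and is not a claim about the real file format.

```python
import struct


class PlainStore:
    """Hypothetical parent importer offering only get_data()."""

    def __init__(self, files):
        self.files = files

    def get_data(self, path):
        try:
            return self.files[path]
        except KeyError:
            raise IOError(path)


class TimestampedStore(PlainStore):
    """Hypothetical parent importer that also offers the optional getmtime()."""

    def __init__(self, files, mtimes):
        PlainStore.__init__(self, files)
        self.mtimes = mtimes

    def getmtime(self, path):
        try:
            return self.mtimes[path]
        except KeyError:
            raise IOError(path)


def pyc_is_fresh(parent, modname):
    """Return True/False if freshness can be checked, None if the parent can't say."""
    getmtime = getattr(parent, "getmtime", None)
    if getmtime is None:
        return None  # parent has get_data() only; the caller picks a fallback
    try:
        data = parent.get_data(modname + ".pyc")
        source_mtime = getmtime(modname + ".py")
    except IOError:
        return False
    # Toy layout (assumption): bytes 4..8 hold the source mtime at compile time.
    (stored,) = struct.unpack("<I", data[4:8])
    return stored == int(source_mtime)


files = {"m.py": b"x = 1\n", "m.pyc": b"MAGC" + struct.pack("<I", 1000) + b"code"}

plain = PlainStore(files)
current = TimestampedStore(files, {"m.py": 1000})
stale = TimestampedStore(files, {"m.py": 2000})
```

The format knowledge (that .py and .pyc are related, and where the timestamp lives) stays entirely in the child; the parent only answers "give me these bytes" and, optionally, "when was this modified".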
The format importer mechanism might want to have something like 'sys.import_formats' as a list of importer classes (or factories). Parent (storage) importer classes would then create instances to use. If you add a new format importer to sys.import_formats, you would of course need to clear sys.path_importer_cache, so that the individual importers are rebuilt on the next import, and thus they will create new child importer chains.

Yeah, that pretty much ought to do it.

From xah at xahlee.org Thu Sep 28 12:49:22 2006
From: xah at xahlee.org (xah lee)
Date: Thu, 28 Sep 2006 03:49:22 -0700
Subject: [Python-Dev] Python Doc problems
Message-ID: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org>

There are a lot of reports on the lousy state of the Python docs. I'm not much in the Python community, so I don't know whether the developers are doing anything about it.

Anyway, I've rewritten Python's RE module documentation, at:

http://xahlee.org/perl-python/python_re-write/lib/module-re.html

and have recently made the terms of use clear.

May I ask what the Python developers are doing about Python's docs? Are you guys aware that there are rampant criticisms of the Python docs, and many diverse attempts by various individuals to rewrite the docs by starting another wiki or site?

Xah
xah at xahlee.org
http://xahlee.org/

From amk at amk.ca Thu Sep 28 14:12:55 2006
From: amk at amk.ca (A.M. Kuchling)
Date: Thu, 28 Sep 2006 08:12:55 -0400
Subject: [Python-Dev] Collecting 2.4.4 fixes
Message-ID: <20060928121255.GE5511@localhost.localdomain>

I've put some candidate fixes and listed some tasks at .
--amk From jeremy at alum.mit.edu Thu Sep 28 16:30:25 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Thu, 28 Sep 2006 10:30:25 -0400 Subject: [Python-Dev] AST structure and maintenance branches In-Reply-To: <200609231840.03859.anthony@interlink.com.au> References: <200609231840.03859.anthony@interlink.com.au> Message-ID: On 9/23/06, Anthony Baxter wrote: > I'd like to propose that the AST format returned by passing PyCF_ONLY_AST to > compile() get the same guarantee in maintenance branches as the bytecode > format - that is, unless it's absolutely necessary, we'll keep it the same. > Otherwise anyone trying to write tools to manipulate the AST is in for a > massive world of hurt. > > Anyone have any problems with this, or can it be added to PEP 6? It's possible we should poll developers of other Python implementations and find out if anyone has objections to supporting this AST format. But in principle, it sounds like a good idea to me. Jeremy From anthony at interlink.com.au Thu Sep 28 16:37:16 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 29 Sep 2006 00:37:16 +1000 Subject: [Python-Dev] AST structure and maintenance branches In-Reply-To: References: <200609231840.03859.anthony@interlink.com.au> Message-ID: <200609290037.18885.anthony@interlink.com.au> On Friday 29 September 2006 00:30, Jeremy Hylton wrote: > On 9/23/06, Anthony Baxter wrote: > > I'd like to propose that the AST format returned by passing PyCF_ONLY_AST > > to compile() get the same guarantee in maintenance branches as the > > bytecode format - that is, unless it's absolutely necessary, we'll keep > > it the same. Otherwise anyone trying to write tools to manipulate the AST > > is in for a massive world of hurt. > > > > Anyone have any problems with this, or can it be added to PEP 6? > > It's possible we should poll developers of other Python > implementations and find out if anyone has objections to supporting > this AST format. 
But in principle, it sounds like a good idea to me. I think it's extremely likely that the AST format will change over time - with major releases. I'd just like to guarantee that we won't mess with it other than that. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From jeremy at alum.mit.edu Thu Sep 28 16:42:15 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Thu, 28 Sep 2006 10:42:15 -0400 Subject: [Python-Dev] AST structure and maintenance branches In-Reply-To: <200609290037.18885.anthony@interlink.com.au> References: <200609231840.03859.anthony@interlink.com.au> <200609290037.18885.anthony@interlink.com.au> Message-ID: On 9/28/06, Anthony Baxter wrote: > On Friday 29 September 2006 00:30, Jeremy Hylton wrote: > > On 9/23/06, Anthony Baxter wrote: > > > I'd like to propose that the AST format returned by passing PyCF_ONLY_AST > > > to compile() get the same guarantee in maintenance branches as the > > > bytecode format - that is, unless it's absolutely necessary, we'll keep > > > it the same. Otherwise anyone trying to write tools to manipulate the AST > > > is in for a massive world of hurt. > > > > > > Anyone have any problems with this, or can it be added to PEP 6? > > > > It's possible we should poll developers of other Python > > implementations and find out if anyone has objections to supporting > > this AST format. But in principle, it sounds like a good idea to me. > > I think it's extremely likely that the AST format will change over time - > with major releases. I'd just like to guarantee that we won't mess with it > other than that. Good point. I'm fine with the change, then. 
Jeremy From jcarlson at uci.edu Thu Sep 28 19:40:24 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Thu, 28 Sep 2006 10:40:24 -0700 Subject: [Python-Dev] Python Doc problems In-Reply-To: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org> References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org> Message-ID: <20060928095951.08BF.JCARLSON@uci.edu> xah lee wrote: > There are a lot reports on the lousy state of python docs. I'm not > much in the python community so i don't know what the developers are > doing anything about it. I don't know about everyone else, but when I receive comments like "the docs are lousy, fix them", it is more than a bit difficult to know where to start, and/or what would be better. Case-by-case examples of "the phrasing of the docs here is confusing" are helpful, as are actual documentation patches (even plain text is fine). While I have heard comments along the lines of "the docs could be better", I've never heard the claim that the Python docs are "lousy". > anyway, i've rewrote the Python's RE module documentation, at: > http://xahlee.org/perl-python/python_re-write/lib/module-re.html > and have recently made the term of user clear. Aside from a few sections in the original docs, and also some sections in your docs, about the only part of the original docs that I find unclear is that some sections do not have function names sorted lexically. This is confusing compared to other module documentation available in the stdlib. I would also like to make one comment about your updated docs (I didn't read them all, I'm on vacation): in the section about 'Regex Functions' you used r'\w+@\w+\.com' as a regular expression for an email address in information about the search() function. This particular RE will only give results for the simplest of email addresses.
I understand that you wanted to provide an example, but providing a generally broken example will be detrimental to newer Python RE users, especially those who were looking for a regular expression for email addresses. I would say slim it down to domain names, but even the correct RE for domain names (with or without internationalization) is ugly. I don't currently have an idea of what kind of example would be simple and illustrative, but maybe someone else has an idea. > may i ask what the python developers is doing about the python's > docs? Are you guys aware, that there are rampant criticisms of python > docs and many diverse tries by various individuals to rewrite the doc > by starting another wiki or site? If there are "rampant criticisms" of the Python docs, then those that are complaining should take specific examples of their complaints to the sourceforge bug tracker and submit documentation patches for the relevant sections. And personally, I've not noticed that criticisms of the Python docs are "rampant", but maybe there is some "I hate Python docs" newsgroup or mailing list that I'm not subscribed to. While I personally think that having a wiki attached to the documentation is a decent idea, I fear that we would run into a situation like php, where the documentation is so atrocious that users need to comment on basically every function in every package to understand what the heck is going on. - Josiah From brett at python.org Thu Sep 28 20:25:25 2006 From: brett at python.org (Brett Cannon) Date: Thu, 28 Sep 2006 11:25:25 -0700 Subject: [Python-Dev] difficulty of implementing phase 2 of PEP 302 in Python source In-Reply-To: <5.1.1.6.0.20060927203149.028c5e90@sparrow.telecommunity.com> References: <5.1.1.6.0.20060927181456.03e8c088@sparrow.telecommunity.com> <5.1.1.6.0.20060927192728.02df6310@sparrow.telecommunity.com> <5.1.1.6.0.20060927203149.028c5e90@sparrow.telecommunity.com> Message-ID: On 9/27/06, Phillip J. 
Eby wrote: > > At 05:26 PM 9/27/2006 -0700, Brett Cannon wrote: > >Ah, OK. So for importing 'email', the zipimporter would call the .pyc > >importer and it would ask the zipimporter, "can you get me email.pyc?" > and > >if it said no it would move on to asking the .py importer for email.py, > etc. > > Yes, exactly. > > > >That's fine. Just thinking about how the current situation sucks for NFS > >but how caching just isn't done. But obvoiusly this could change. > > Well, with this design, you can have a CachingFilesystemImporter as your > storage mechanism to speed things up. > > > >> >>Of course, to fully support .pyc timestamp checking and writeback, > you'd > >> >>need some sort of "stat" or "getmtime" feature on the parent > importer, as > >> >>well as perhaps an optional "save_data" method. These would be > extensions > >> >>to PEP 302, but welcome ones. > >> > > >> >Could pass the string representing the location of where the string > came > >> >from. That would allow for the required stat calls for .pyc files as > >> >needed without having to implement methods just for this one use case. > >> > >>Huh? In order to know if a .pyc is up to date, you need the st_mtime of > >>the .py file. That can't be done in the parent importer without giving > it > >>format knowledge, which goes against the point of the exercise. > > > >Sorry, thought .pyc files based whether they needed to be recompiled > based > >on the stat info on the .py and .pyc file, not on data stored from within > >the .pyc . > > It's not just that (although I believe it's also the case that there is a > timestamp inside .pyc), it's that to do the check in the parent importer, > the parent importer would have to know that there is such a thing as > .py-and-.pyc. The whole point of this design is that the parent importer > doesn't have to know *anything* about filename extensions OR how those > files are formatted internally. 
In this scheme, adding more child > importers is sufficient to add all the special handling needed for > .py/.pyc-style schemes. > > Of course, for maximum flexibility, you might want get_stream() and > get_file() methods optionally available, since a .so loader really needs a > file, and .pyc might want to read in two stages. But the child importers > can be defensively coded so as to be able to live with only a > parent.get_data(), if necessary, and do the enhanced behaviors only if > stat() or get_stream() or write_data() etc. attributes are available on the > parent. Yeah, how to get the proper information to the data importers is going to be the trick. > If we get some standards for these additional attributes, we can document > them as standard PEP 302 extensions. > > The format importer mechanism might want to have something like > 'sys.import_formats' as a list of importer classes (or factories). Parent > (storage) importer classes would then create instances to use. > > If you add a new format importer to sys.import_formats, you would of course > need to clear sys.path_importer_cache, so that the individual importers are > rebuilt on the next import, and thus they will create new child importer > chains. > > Yeah, that pretty much ought to do it. I will think about it, but I am still trying to get the original question of how bad the C code is compared to rewriting import in Python from people. =) -Brett From pje at telecommunity.com Thu Sep 28 20:35:30 2006 From: pje at telecommunity.com (Phillip J.
Eby) Date: Thu, 28 Sep 2006 14:35:30 -0400 Subject: [Python-Dev] difficulty of implementing phase 2 of PEP 302 in Python source In-Reply-To: References: <5.1.1.6.0.20060927203149.028c5e90@sparrow.telecommunity.com> <5.1.1.6.0.20060927181456.03e8c088@sparrow.telecommunity.com> <5.1.1.6.0.20060927192728.02df6310@sparrow.telecommunity.com> <5.1.1.6.0.20060927203149.028c5e90@sparrow.telecommunity.com> Message-ID: <5.1.1.6.0.20060928143335.02e08360@sparrow.telecommunity.com> At 11:25 AM 9/28/2006 -0700, Brett Cannon wrote: >I will think about it, but I am still trying to get the original question >of how bad the C code is compared to rewriting import in Python from >people. =) I would say that the C code is *delicate*, not necessarily bad. In most ways, it's rather straightforward, it's actually the requirements that are complex. :) A Python implementation, however, would be a good idea to have around for PyPy, Py3K, and other versions of Python, and as a refactoring basis for writing any new C code. From tomerfiliba at gmail.com Thu Sep 28 20:40:34 2006 From: tomerfiliba at gmail.com (tomer filiba) Date: Thu, 28 Sep 2006 20:40:34 +0200 Subject: [Python-Dev] weakref enhancements Message-ID: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com> i'd like to suggest adding weak attributes and weak methods to the std weakref module. weakattrs are weakly-referenced attributes. when the value they reference is no longer strongly-referenced by something else, the weakattrs "nullify" themselves. weakmethod is a method decorator, like classmethod et al, that returns "weakly bound" methods. weakmethod's im_self is a weakref.proxy to `self`, which means the mere method will not keep the entire instance alive. instead, you'll get a ReferenceError. i think these two features are quite useful, and being part of the stdlib, would provide programmers with easy-to-use solutions to object-aliveness issues. 
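For readers who want something concrete, here is a minimal sketch of what the two proposed features might look like as descriptors, written in modern Python. It is not the implementation from the linked recipes; the class names mirror the proposal, but the details (the __dict__ slot naming, returning None for a dead reference, the Node example class) are assumptions made for illustration.

```python
import gc
import types
import weakref


class weakattr:
    """Descriptor that stores a weak reference and dereferences on access."""

    def __set_name__(self, owner, name):
        self._slot = "_weakattr_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        ref = obj.__dict__.get(self._slot)
        # An unset or dead reference reads as None ("nullified").
        return None if ref is None else ref()

    def __set__(self, obj, value):
        # Raises TypeError for values that cannot be weakly referenced.
        obj.__dict__[self._slot] = weakref.ref(value)


class weakmethod:
    """Method decorator whose bound methods hold only a proxy to self."""

    def __init__(self, func):
        self._func = func

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self._func
        return types.MethodType(self._func, weakref.proxy(obj))


class Node:
    """Hypothetical example class: a child points weakly at its parent."""

    parent = weakattr()

    def __init__(self, name):
        self.name = name

    @weakmethod
    def describe(self):
        return "node " + self.name
```

With these definitions, assigning child.parent = root does not keep root alive, and a saved root.describe raises ReferenceError when called after root has been collected, because the method body touches self through a dead proxy.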
more info, examples, and suggested implementation: * http://sebulba.wikispaces.com/recipe+weakattr * http://sebulba.wikispaces.com/recipe+weakmethod -tomer From theller at python.net Thu Sep 28 20:54:02 2006 From: theller at python.net (Thomas Heller) Date: Thu, 28 Sep 2006 20:54:02 +0200 Subject: [Python-Dev] difficulty of implementing phase 2 of PEP 302 in Python source In-Reply-To: <5.1.1.6.0.20060928143335.02e08360@sparrow.telecommunity.com> References: <5.1.1.6.0.20060927203149.028c5e90@sparrow.telecommunity.com> <5.1.1.6.0.20060927181456.03e8c088@sparrow.telecommunity.com> <5.1.1.6.0.20060927192728.02df6310@sparrow.telecommunity.com> <5.1.1.6.0.20060927203149.028c5e90@sparrow.telecommunity.com> <5.1.1.6.0.20060928143335.02e08360@sparrow.telecommunity.com> Message-ID: Phillip J. Eby wrote: > At 11:25 AM 9/28/2006 -0700, Brett Cannon wrote: >>I will think about it, but I am still trying to get the original question >>of how bad the C code is compared to rewriting import in Python from >>people. =) > > I would say that the C code is *delicate*, not necessarily bad. In most > ways, it's rather straightforward, it's actually the requirements that are > complex. :) > > A Python implementation, however, would be a good idea to have around for > PyPy, Py3K, and other versions of Python, and as a refactoring basis for > writing any new C code. FYI, Gordon McMillan had a Python 'model' of the import mechanism in his, (not sure if it was really named) "iu.py". It was part of his installer utility, maybe the code still lives in the PyInstaller project. IIRC, parts of PEP 302 were inspired by his code.
Thomas From rhettinger at ewtllc.com Thu Sep 28 21:02:20 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Thu, 28 Sep 2006 12:02:20 -0700 Subject: [Python-Dev] weakref enhancements In-Reply-To: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com> References: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com> Message-ID: <451C1C3C.5010004@ewtllc.com> tomer filiba wrote: > i'd like to suggest adding weak attributes and weak methods to the std weakref > module. . . . > > i think these two features are quite useful, and being part of the stdlib, would > provide programmers with easy-to-use solutions to object-aliveness issues. > > more info, examples, and suggested implementation: > * http://sebulba.wikispaces.com/recipe+weakattr > * http://sebulba.wikispaces.com/recipe+weakmethod > > I'm sceptical that these would find use in practice. The cited links have only toy examples and as motivation reference Greg Ewing's posting saying only "I'm thinking it would be nice . . . This could probably be done fairly easily with a property descriptor." Also, I question the utility of maintaining a weakref to a method or attribute instead of holding one for the object or class. As long as the enclosing object or class lives, so too will their methods and attributes. So what is the point of a tighter weakref granularity? So, before being entertained for addition to the standard library, this idea should probably first be posted as an ASPN recipe, then we can see if any use cases emerge in actual practice. Then we could look at sample code fragments to see if any real-world code is actually improved with the new toys. My bet is that very few will emerge, that most would be better served by a simple decorator, and that an expanding weakref zoo will only make the module more difficult to learn.
Raymond From python at rcn.com Thu Sep 28 21:23:31 2006 From: python at rcn.com (Raymond Hettinger) Date: Thu, 28 Sep 2006 12:23:31 -0700 Subject: [Python-Dev] weakref enhancements References: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com> <451C1C3C.5010004@ewtllc.com> Message-ID: <011401c6e333$96b35450$ea146b0a@RaymondLaptop1> > Also, I question the utility of maintaining a weakref to a method or > attribute instead of holding one for the object or class. Strike that paragraph -- the proposed weakattrs have references away from the object, not to the object. Raymond From p.f.moore at gmail.com Thu Sep 28 21:25:55 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 28 Sep 2006 20:25:55 +0100 Subject: [Python-Dev] difficulty of implementing phase 2 of PEP 302 in Python source In-Reply-To: References: <5.1.1.6.0.20060927181456.03e8c088@sparrow.telecommunity.com> <5.1.1.6.0.20060927192728.02df6310@sparrow.telecommunity.com> <5.1.1.6.0.20060927203149.028c5e90@sparrow.telecommunity.com> <5.1.1.6.0.20060928143335.02e08360@sparrow.telecommunity.com> Message-ID: <79990c6b0609281225t27df9e4al7aa491c2d7008d6a@mail.gmail.com> > Phillip J. Eby wrote: > > I would say that the C code is *delicate*, not necessarily bad. In most > > ways, it's rather straightforward, it's actually the requirements that are > > complex. :) From what I recall, that's right. The C code's main disadvantage is that it isn't very well commented (as far as I recall) and there's no documentation of precisely what it's trying to achieve (insofar as there isn't a precise spec for how importing works in the Python docs, covering all the subtleties of things like package imports, package __path__ entries, reloading, etc etc...) > > A Python implementation, however, would be a good idea to have around for > > PyPy, Py3K, and other versions of Python, and as a refactoring basis for > > writing any new C code.
It would also provide the basis of a much better spec - both because a clear spec would need to be established before you could write it, and because Python code is inherently readable... On 9/28/06, Thomas Heller wrote: > FYI, Gordon McMillan had a Python 'model' of the import mechanism in his, > (not sure if it was really named) "iu.py". It was part of his installer utility, > maybe the code still lives in the PyInstaller project. IIRC, parts of PEP 302 were > inspired by his code. That's right. Lots of the path importer and metapath stuff came from iu.py. I have an oldish copy (Installer 5b5_2, from 2003) if you can't get it anywhere else... Paul. From tomerfiliba at gmail.com Thu Sep 28 21:57:32 2006 From: tomerfiliba at gmail.com (tomer filiba) Date: Thu, 28 Sep 2006 21:57:32 +0200 Subject: [Python-Dev] weakref enhancements In-Reply-To: <011401c6e333$96b35450$ea146b0a@RaymondLaptop1> References: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com> <451C1C3C.5010004@ewtllc.com> <011401c6e333$96b35450$ea146b0a@RaymondLaptop1> Message-ID: <1d85506f0609281257j4187a573n756b38b4ea072a8a@mail.gmail.com> > I'm sceptical that these would find use in practice. > [..] > Also, I question the utility of maintaining a weakref to a method or > attribute instead of holding one for the object or class. As long as > the enclosing object or class lives, so too will their methods and > attributes. So what is the point of a tighter weakref granularity? i didn't just come up with them "out of boredom", i have had specific use cases for these, mainly in rpyc3000... but since the rpyc300 code base is still far from completion, i don't want to give examples at this early stage. however, these two are theoretically useful, so i refactored them out of my code into recipes. -tomer On 9/28/06, Raymond Hettinger wrote: > > > Also, I question the utility of maintaining a weakref to a method or > > attribute instead of holding one for the object or class.
> > Strike that paragraph -- the proposed weakattrs have references away from the > object, not to the object. > > > Raymond > From aleaxit at gmail.com Thu Sep 28 23:14:13 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Thu, 28 Sep 2006 14:14:13 -0700 Subject: [Python-Dev] weakref enhancements In-Reply-To: <1d85506f0609281257j4187a573n756b38b4ea072a8a@mail.gmail.com> References: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com> <451C1C3C.5010004@ewtllc.com> <011401c6e333$96b35450$ea146b0a@RaymondLaptop1> <1d85506f0609281257j4187a573n756b38b4ea072a8a@mail.gmail.com> Message-ID: On 9/28/06, tomer filiba wrote: > > I'm sceptical that these would find use in practice. > > [..] > > Also, I question the utility of maintaining a weakref to a method or > > attribute instead of holding one for the object or class. As long as > > the enclosing object or class lives, so too will their methods and > > attributes. So what is the point of a tighter weakref granularity? > > i didn't just come up with them "out of boredom", i have had specific > use cases for these, mainly in rpyc3000... but since the rpyc300 > code base is still far from completion, i don't want to give examples > at this early stage. > > however, these two are theoretically useful, so i refactored them out > of my code into recipes. I've had use cases for "weakrefs to boundmethods" (and there IS a Cookbook recipe for them), as follows: sometimes I'm maintaining a container of callables, which may be of various kinds including functions, boundmethods, etc; but I'd like the mere presence of a callable in the container not to keep the callable alive (especially when the callable in turn keeps alive an object with possibly massive state).
In practice I use wrapping and tricks, but it would be nice to have cleaner standard library support for this. (Often the container needs to be some form of a Queue.Queue, since queues of callables are a form I use very often to dispatch work requests to worker-threads in a threadpool). Alex From rhettinger at ewtllc.com Fri Sep 29 01:08:56 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Thu, 28 Sep 2006 16:08:56 -0700 Subject: [Python-Dev] weakref enhancements In-Reply-To: References: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com> <451C1C3C.5010004@ewtllc.com> <011401c6e333$96b35450$ea146b0a@RaymondLaptop1> <1d85506f0609281257j4187a573n756b38b4ea072a8a@mail.gmail.com> Message-ID: <451C5608.3020000@ewtllc.com> [Alex Martelli] >I've had use cases for "weakrefs to boundmethods" (and there IS a >Cookbook recipe for them), > Weakmethods make some sense (though they raise the question of why bound methods are being kept when the underlying object is no longer in use -- possibly as unintended side-effect of aggressive optimization). I'm more concerned about weakattr which hides the weak referencing from client code when it is usually the client that needs to know about the refcounts:

    n = SomeClass(x)
    obj.a = n
    del n        # hmm, what happens now?

If obj.a is a weakattr, then n gets vaporized; otherwise, it lives. It is clearer and less error-prone to keep the responsibility with the caller:

    n = SomeClass(x)
    obj.a = weakref.proxy(n)
    del n        # now, it is clear what happens

The wiki-space example shows objects that directly assign a copy of self to an attribute of self. Even in that simplified, self-referential example, it is clear that correct functioning (when __del__ gets called) depends on knowing whether or not assignments are creating references. Accordingly, the code would be better-off if the weak-referencing assignment was made explicit rather than hiding the weak-referencing wrapper in a descriptor.
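The explicit-proxy version can be run end to end as a small sketch in modern Python. SomeClass, Holder, and is_dead() are stand-ins invented for this illustration; only weakref.proxy and ReferenceError are standard library behaviour.

```python
import gc
import weakref


class SomeClass:
    def __init__(self, x):
        self.x = x


class Holder:
    """Stand-in for `obj` in the message above."""


def is_dead(proxy):
    """Return True if the proxy's referent has been collected."""
    try:
        proxy.x
    except ReferenceError:
        return True
    return False


obj = Holder()
n = SomeClass(42)
obj.a = weakref.proxy(n)      # the caller opts in to weak referencing
alive_before = not is_dead(obj.a)

del n                         # the last strong reference goes away
gc.collect()                  # CPython frees n by refcount; harmless elsewhere
dead_after = is_dead(obj.a)
```

Because the weak reference is created at the assignment site, a reader of the calling code can see that deleting n invalidates obj.a, which is the point being argued for here.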
Raymond From bob at redivi.com Fri Sep 29 01:39:08 2006 From: bob at redivi.com (Bob Ippolito) Date: Thu, 28 Sep 2006 16:39:08 -0700 Subject: [Python-Dev] weakref enhancements In-Reply-To: <451C5608.3020000@ewtllc.com> References: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com> <451C1C3C.5010004@ewtllc.com> <011401c6e333$96b35450$ea146b0a@RaymondLaptop1> <1d85506f0609281257j4187a573n756b38b4ea072a8a@mail.gmail.com> <451C5608.3020000@ewtllc.com> Message-ID: <6a36e7290609281639r38aa53dbk6f1f64be58cfaad2@mail.gmail.com> On 9/28/06, Raymond Hettinger wrote: > [Alex Martelli] > > >I've had use cases for "weakrefs to boundmethods" (and there IS a > >Cookbook recipe for them), > > > Weakmethods make some sense (though they raise the question of why bound > methods are being kept when the underlying object is no longer in use -- > possibly as unintended side-effect of aggressive optimization). There are *definitely* use cases for keeping bound methods around. Contrived example:

    one_of = set([1,2,3,4]).__contains__
    filter(one_of, [2,4,6,8,10])

-bob From raymond.hettinger at verizon.net Fri Sep 29 03:03:43 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Thu, 28 Sep 2006 18:03:43 -0700 Subject: [Python-Dev] weakref enhancements References: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com><451C1C3C.5010004@ewtllc.com><011401c6e333$96b35450$ea146b0a@RaymondLaptop1><1d85506f0609281257j4187a573n756b38b4ea072a8a@mail.gmail.com><451C5608.3020000@ewtllc.com> <6a36e7290609281639r38aa53dbk6f1f64be58cfaad2@mail.gmail.com> Message-ID: <006f01c6e363$1bf64670$ea146b0a@RaymondLaptop1>
> There are *definitely* use cases for keeping bound methods around.
>
> Contrived example:
>
>     one_of = set([1,2,3,4]).__contains__
>     filter(one_of, [2,4,6,8,10])

ISTM, the example shows the (undisputed) utility of regular bound methods.
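To make the point about regular bound methods concrete, the following sketch shows that an ordinary bound method holds a strong reference to its object. It is written for modern Python, where filter() returns a lazy iterator and so is wrapped in list(); the Bag class and the tracker variable are hypothetical names used only to observe object lifetime.

```python
import gc
import weakref


class Bag:
    def __init__(self, items):
        self.items = set(items)

    def has(self, item):
        return item in self.items


bag = Bag([1, 2, 3, 4])
tracker = weakref.ref(bag)    # lets us observe whether the Bag is alive
one_of = bag.has              # regular bound method: strong ref to bag
del bag

gc.collect()
kept_alive = tracker() is not None
matches = list(filter(one_of, [2, 4, 6, 8, 10]))

del one_of                    # dropping the method releases the Bag
gc.collect()
released = tracker() is None
```

The Bag survives the del only because one_of refers to it; a weakly bound method would instead leave one_of alive but unusable at that point, which is the case the question that follows is asking about.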
How does it show the need for methods bound weakly to the underlying object, where the underlying can be deleted while the bound method persists, alive but unusable? Raymond From bob at redivi.com Fri Sep 29 03:13:14 2006 From: bob at redivi.com (Bob Ippolito) Date: Thu, 28 Sep 2006 18:13:14 -0700 Subject: [Python-Dev] weakref enhancements In-Reply-To: <006f01c6e363$1bf64670$ea146b0a@RaymondLaptop1> References: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com> <451C1C3C.5010004@ewtllc.com> <011401c6e333$96b35450$ea146b0a@RaymondLaptop1> <1d85506f0609281257j4187a573n756b38b4ea072a8a@mail.gmail.com> <451C5608.3020000@ewtllc.com> <6a36e7290609281639r38aa53dbk6f1f64be58cfaad2@mail.gmail.com> <006f01c6e363$1bf64670$ea146b0a@RaymondLaptop1> Message-ID: <6a36e7290609281813j1517017bga3304284dd6325a@mail.gmail.com> On 9/28/06, Raymond Hettinger wrote: > > There are *definitely* use cases for keeping bound methods around. > > > > Contrived example: > > > > one_of = set([1,2,3,4]).__contains__ > > filter(one_of, [2,4,6,8,10]) > > ISTM, the example shows the (undisputed) utility of regular bound methods. > > How does it show the need for methods bound weakly to the underlying object, > where the underlying can be deleted while the bound method persists, alive but > unusable? It doesn't. I seem to have misinterpreted your "Weakmethods have some use (...)" sentence. Sorry for the noise. -bob From greg.ewing at canterbury.ac.nz Fri Sep 29 03:13:46 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 29 Sep 2006 13:13:46 +1200 Subject: [Python-Dev] weakref enhancements In-Reply-To: <451C1C3C.5010004@ewtllc.com> References: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com> <451C1C3C.5010004@ewtllc.com> Message-ID: <451C734A.2060203@canterbury.ac.nz> Raymond Hettinger wrote: > Also, I question the utility of maintaining a weakref to a method or > attribute instead of holding one for the object or class. 
As long as > the enclosing object or class lives, so too will their methods and > attributes. So what is the point of a tighter weakref granualarity? I think you're misunderstanding what the OP means. A weak attribute isn't a weak reference to an attribute, it's an attribute that holds a weak reference and is automatically dereferenced when you access it. A frequent potential use case is parent-child relationships. To avoid creating cycles you'd like to make the link from child to parent weak, but doing that with raw weakrefs is somewhat tedious and doesn't feel worth the bother. If I could just declare the attribute to be weak and then use it like a normal attribute from then on, I would probably use this technique more often. > So, before being entertained for addition to the standard library, this > idea should probably first be posted as an ASPN recipe, That's a reasonable idea. -- Greg From stephen at xemacs.org Fri Sep 29 02:49:35 2006 From: stephen at xemacs.org (stephen at xemacs.org) Date: Fri, 29 Sep 2006 09:49:35 +0900 Subject: [Python-Dev] Python Doc problems In-Reply-To: <20060928095951.08BF.JCARLSON@uci.edu> References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org> <20060928095951.08BF.JCARLSON@uci.edu> Message-ID: <17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp> Josiah Carlson writes: > fine). While I have heard comments along the lines of "the docs could > be better", I've never heard the claim that the Python docs are "lousy". FYI, I have heard this, recently, from Tom Lord (aka developer of Arch, rx, guile, etc). Since he also took a swipe at Emacsen, I pressed him on what he meant. He immediately backtracked on "(all) Python docs" and "lousy", but did say that in his opinion scripting languages that provide docstrings have lost a fair amount of coherence in their documentation, and that Python's are consistent with the general trend. (He's started using Python relatively recently and does not claim a historical perspective.) 
What is lost, according to him, is information about how the elements of a module work together. The docstrings tend to be narrowly focused on the particular function or variable, and too often discuss implementation details. On the other hand, manuals tend to become either tutorials or compendia of the docstrings. > If there are "rampant criticisms" of the Python docs, then those that > are complaining should take specific examples of their complaints to the > sourceforge bug tracker and submit documentation patches for the > relevant sections. What they *should* do, but don't, is not necessarily a reflection on the accuracy of what they say. FWIW ... I find the documentation for the language, the standard library, and the Python applications I use quite adequate for my own use. From stephen at xemacs.org Fri Sep 29 03:29:00 2006 From: stephen at xemacs.org (stephen at xemacs.org) Date: Fri, 29 Sep 2006 10:29:00 +0900 Subject: [Python-Dev] Python Doc problems In-Reply-To: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org> References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org> Message-ID: <17692.30428.14501.569620@uwakimon.sk.tsukuba.ac.jp> xah lee writes: > anyway, i've rewrote the Python's RE module documentation, at: > http://xahlee.org/perl-python/python_re-write/lib/module-re.html -1 The current docs could be improved (but not by me, at least not today), but I don't consider the general direction of Xah's edits desirable. E.g., the current table of contents is just as accurate and more precise than Xah's top node, which makes navigation faster for someone who knows what he forgot. In general his changes improve the "narrative flow", but for me that's a very low priority in a reference manual, while the cost in loss of navigability of his changes is pretty high for me.
From steve at holdenweb.com Fri Sep 29 03:41:01 2006 From: steve at holdenweb.com (Steve Holden) Date: Fri, 29 Sep 2006 02:41:01 +0100 Subject: [Python-Dev] Python Doc problems In-Reply-To: <17692.30428.14501.569620@uwakimon.sk.tsukuba.ac.jp> References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org> <17692.30428.14501.569620@uwakimon.sk.tsukuba.ac.jp> Message-ID: stephen at xemacs.org wrote: > xah lee writes: > > > anyway, i've rewrote the Python's RE module documentation, at: > > http://xahlee.org/perl-python/python_re-write/lib/module-re.html > > -1 > > The current docs could be improved (but not by me, at least not > today), but I don't consider the general direction of Xah's edits > desirable. E.g., the current table of contents is just as accurate and > more precise than Xah's top node, which makes navigation faster for > someone who knows what he forgot. In general his changes > improve the "narrative flow", but for me that's a very low priority in > a reference manual, while the cost in loss of navigability of his > changes is pretty high for me. > 'Fraid that doesn't get him any nearer his hundred bucks, then. Xah: the money is still on offer should you choose to rewrite until the criteria are satisfied.
regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From barry at python.org Fri Sep 29 03:59:24 2006 From: barry at python.org (Barry Warsaw) Date: Thu, 28 Sep 2006 21:59:24 -0400 Subject: [Python-Dev] Python Doc problems In-Reply-To: <17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp> References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org> <20060928095951.08BF.JCARLSON@uci.edu> <17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp> Message-ID: <34B6804F-9F83-4415-8C2F-BEDD6CD9F63B@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sep 28, 2006, at 8:49 PM, wrote: > What is lost according to him is information about how the elements of > a module work together. The docstrings tend to be narrowly focused on > the particular function or variable, and too often discuss > implementation details. On the other hand, manuals tend to become > either tutorials or compedia of the docstrings. There's no doubt that writing good documentation is an art form. There's also the pull between wanting to write reference docs for those who know what they've forgotten (I love that phrase!) and writing the introductory or "how it hangs together" documentation. It's not easy at all, and some of Python's documentation does better at this than others. In the vast array of FOSS and for-pay docs I've read in my career, Python actually ain't too bad. 
:) - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRRx+AXEjvBPtnXfVAQJgNgP8D9f2ZUqIDTUmQU8BRx4iqjbXQANrdHt1 usZCwguIS4pa0pmUp73E514y+tDs1UzU1E2I2itIifqtKXZuPOSZYG/DWcg4h8vh KPCygqSDNiW5dr77UP4QBXk3DOoj68E/WpLWOquoLB/eOYWOa08lh+XEJ9ShHF1F WfHMygrtpqk= =vEEN -----END PGP SIGNATURE----- From greg.ewing at canterbury.ac.nz Fri Sep 29 04:24:23 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 29 Sep 2006 14:24:23 +1200 Subject: [Python-Dev] Python Doc problems In-Reply-To: <34B6804F-9F83-4415-8C2F-BEDD6CD9F63B@python.org> References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org> <20060928095951.08BF.JCARLSON@uci.edu> <17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp> <34B6804F-9F83-4415-8C2F-BEDD6CD9F63B@python.org> Message-ID: <451C83D7.5090705@canterbury.ac.nz> Barry Warsaw wrote: > There's also the pull between wanting to write reference docs for > those who know what they've forgotten (I love that phrase!) and > writing the introductory or "how it hangs together" documentation. The trick to this, I think, is not to try to make the same piece of documentation serve both purposes. An example of a good way to do it is the original Inside Macintosh series. Each chapter started with a narrative-style "About this module" kind of section, that introduced the relevant concepts and explained how they fitted together, without going into low-level details. Then there was a "Reference" section that systematically went through and gave all the details of the API. While Inside Mac could often be criticised for omitting rather important info in either section now and then, I think they had the basic structure of the docs right. 
-- Greg

From tomerfiliba at gmail.com Fri Sep 29 09:33:35 2006
From: tomerfiliba at gmail.com (tomer filiba)
Date: Fri, 29 Sep 2006 09:33:35 +0200
Subject: [Python-Dev] weakref enhancements
In-Reply-To: <451C5608.3020000@ewtllc.com>
References: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com> <451C1C3C.5010004@ewtllc.com> <011401c6e333$96b35450$ea146b0a@RaymondLaptop1> <1d85506f0609281257j4187a573n756b38b4ea072a8a@mail.gmail.com> <451C5608.3020000@ewtllc.com>
Message-ID: <1d85506f0609290033l1b276ea7j3a833c57c281c343@mail.gmail.com>

this may still be premature, but i see people misunderstood the purpose.

weakattrs are not likely to be used "externally", out of the scope of the object. they are just meant to provide an easy to use means for not holding cyclic references between parents and children.

many graph-like structures, e.g., rpyc's node and proxies, are interconnected in both ways, and weakattrs help to solve that: i don't want a proxy of a node to keep the node alive.

weakmethods are used very similarly. nodes have a method called "getmodule", that performs remote importing of modules. i expose these modules as a namespace object, so you could do:

>>> mynode.modules.sys

or

>>> mynode.modules.xml.dom.minidom.parseString

instead of

>>> mynode.getmodule("xml.dom.minidom").parseString

here's a sketch:

    class ModuleNamespace:
        def __init__(self, importerfunc):
            self.importerfunc = importerfunc

    class Node:
        def __init__(self, stream):
            ....
            self.modules = ModuleNamespace(self.getmodule)

        @weakmethod
        def getmodule(self, name):
            ....

i define this getmodule method as a *weakmethod*, so the mere existence of the ModuleNamespace instance will not keep the node alive. when the node loses all external references, the ModuleNamespace should just "commit suicide", and allow the node to be reclaimed.

yes, you can do all of these with today's weakref, but it takes quite a lot of hassle to manually set up weakproxies every time.
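
[Editorial sketch, not part of the original message.] To make the weakmethod idea concrete, here is one way such a decorator might be written in present-day Python 3. The names weakmethod, ModuleNamespace and Node follow the sketch in the message; the descriptor body, the __getattr__ hook and the string returned by getmodule are assumptions made for illustration, and this is not rpyc's actual code nor the recipe proposed on the wiki.

```python
import gc
import weakref


class weakmethod:
    """Descriptor (hypothetical): the bound callable it hands out keeps
    only a weak reference to the instance it was looked up on."""

    def __init__(self, func):
        self.func = func

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class, not on an instance
        ref = weakref.ref(obj)
        func = self.func

        def caller(*args, **kwargs):
            target = ref()
            if target is None:
                raise ReferenceError("weakly referenced object no longer exists")
            return func(target, *args, **kwargs)

        return caller


class ModuleNamespace:
    def __init__(self, importerfunc):
        self.importerfunc = importerfunc

    def __getattr__(self, name):
        # Assumed behaviour: unknown attributes are resolved via the importer.
        return self.importerfunc(name)


class Node:
    def __init__(self):
        # self.getmodule is the weak caller, so no node -> namespace -> node cycle.
        self.modules = ModuleNamespace(self.getmodule)

    @weakmethod
    def getmodule(self, name):
        return "module:" + name  # stand-in for the remote import
```

With this arrangement the namespace object can outlive the node without keeping it alive; calling through it after the node is gone raises ReferenceError rather than resurrecting anything.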
-tomer

On 9/29/06, Raymond Hettinger wrote:
> [Alex Martelli]
>
> >I've had use cases for "weakrefs to boundmethods" (and there IS a
> >Cookbook recipe for them),
> >
> Weakmethods make some sense (though they raise the question of why bound
> methods are being kept when the underlying object is no longer in use --
> possibly as an unintended side-effect of aggressive optimization).
>
> I'm more concerned about weakattr which hides the weak referencing from
> client code when it is usually the client that needs to know about the
> refcounts:
>
>     n = SomeClass(x)
>     obj.a = n
>     del n      # hmm, what happens now?
>
> If obj.a is a weakattr, then n gets vaporized; otherwise, it lives.
>
> It is clearer and less error-prone to keep the responsibility with the
> caller:
>
>     n = SomeClass(x)
>     obj.a = weakref.proxy(n)
>     del n      # now, it is clear what happens
>
> The wiki-space example shows objects that directly assign a copy of self
> to an attribute of self. Even in that simplified, self-referential
> example, it is clear that correct functioning (when __del__ gets called)
> depends on knowing whether or not assignments are creating references.
> Accordingly, the code would be better-off if the weak-referencing
> assignment was made explicit rather than hiding the weak-referencing
> wrapper in a descriptor.
>
>
>
> Raymond
>

From nick at craig-wood.com Fri Sep 29 10:14:02 2006
From: nick at craig-wood.com (Nick Craig-Wood)
Date: Fri, 29 Sep 2006 09:14:02 +0100
Subject: [Python-Dev] Caching float(0.0)
Message-ID: <20060929081402.GB19781@craig-wood.com>

I just discovered that in a program of mine it was wasting 7MB out of 200MB by storing multiple copies of 0.0. I found this a bit surprising since I'm used to small ints and strings being cached.

I added the apparently nonsensical lines

+        if age == 0.0:
+            age = 0.0                   # return a common object for the common case

and got 7MB of memory back!
Eg :- Python 2.5c1 (r25c1:51305, Aug 19 2006, 18:23:29) [GCC 4.1.2 20060814 (prerelease) (Debian 4.1.1-11)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> a=0.0 >>> print id(a), id(0.0) 134738828 134738844 >>> Is there any reason why float() shouldn't cache the value of 0.0 since it is by far and away the most common value? A full cache of floats probably doesn't make much sense though since there are so many 'more' of them than integers and defining small isn't obvious. -- Nick Craig-Wood -- http://www.craig-wood.com/nick From Jack.Jansen at cwi.nl Fri Sep 29 11:25:50 2006 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Fri, 29 Sep 2006 11:25:50 +0200 Subject: [Python-Dev] Python Doc problems In-Reply-To: <451C83D7.5090705@canterbury.ac.nz> References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org> <20060928095951.08BF.JCARLSON@uci.edu> <17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp> <34B6804F-9F83-4415-8C2F-BEDD6CD9F63B@python.org> <451C83D7.5090705@canterbury.ac.nz> Message-ID: <32936525-6318-45DA-A8A9-57D3755C4F10@cwi.nl> On 29-sep-2006, at 4:24, Greg Ewing wrote: > An example of a good way to do it is the original Inside > Macintosh series. Each chapter started with a narrative-style > "About this module" kind of section, that introduced the > relevant concepts and explained how they fitted together, > without going into low-level details. Then there was a > "Reference" section that systematically went through and > gave all the details of the API. Yep, this is exactly what I often miss in the Python library docs. The module intro sections often do contain the "executive summary" of the module, so that you can quickly see whether this module could indeed help you solve the problem at hand. 
But then you go straight to descriptions of classes and methods, and there is often no info on how things are plumbed together, both within the module (how the classes relate to each other) and more globally (how this module relates to others, see also). A similar thing occurs one level higher in the library hierarchy: the section introductions are little more than a list of all the modules in the section. -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From amk at amk.ca Fri Sep 29 14:10:35 2006 From: amk at amk.ca (A.M. Kuchling) Date: Fri, 29 Sep 2006 08:10:35 -0400 Subject: [Python-Dev] Python Doc problems In-Reply-To: <17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp> References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org> <20060928095951.08BF.JCARLSON@uci.edu> <17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20060929121035.GA4884@localhost.localdomain> On Fri, Sep 29, 2006 at 09:49:35AM +0900, stephen at xemacs.org wrote: > What is lost according to him is information about how the elements of > a module work together. The docstrings tend to be narrowly focused on > the particular function or variable, and too often discuss > implementation details. I agree with this, and am not very interested in tools such as epydoc for this reason. In such autogenerated documentation, you wind up with a list of every single class and function, and both trivial and important classes are given exactly the same emphasis. Such docs are useful as a reference when you know what class you need to look at, but then pydoc also works well for that purpose.
--amk From ndbecker2 at gmail.com Fri Sep 29 14:20:48 2006 From: ndbecker2 at gmail.com (Neal Becker) Date: Fri, 29 Sep 2006 08:20:48 -0400 Subject: [Python-Dev] os.unlink() closes file? Message-ID: It seems (I haven't looked at source) that os.unlink() will close the file? If so, please make this optional. It breaks the unix idiom for making a temporary file. (Yes, I know there is a tempfile module, but I need some behavior it doesn't implement so I want to do it myself). From ronaldoussoren at mac.com Fri Sep 29 14:36:23 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Fri, 29 Sep 2006 14:36:23 +0200 Subject: [Python-Dev] os.unlink() closes file? In-Reply-To: References: Message-ID: <10638710.1159533383149.JavaMail.ronaldoussoren@mac.com> On Friday, September 29, 2006, at 02:22PM, Neal Becker wrote: >It seems (I haven't looked at source) that os.unlink() will close the file? > >If so, please make this optional. It breaks the unix idiom for making a >temporary file. > >(Yes, I know there is a tempfile module, but I need some behavior it doesn't >implement so I want to do it myself). On what platform? Do you have a script that demonstrates your problem? If yes, please file a bug in the bugtracker at http://www.sf.net/projects/python. AFAIK os.unlink doesn't close files, and I cannot reproduce this problem (python2.3 on Solaris 9). 
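
[Editorial sketch, not part of the original message.] The Unix idiom under discussion can be checked with a few lines of Python. This is a minimal illustration that assumes a POSIX system, where removing a name leaves already-open descriptors usable; on Windows the unlink of an open file would normally fail. The helper name anonymous_scratch_file is hypothetical, and tempfile.mkstemp is used here only to obtain a unique name.

```python
import os
import tempfile


def anonymous_scratch_file():
    """Create a file, remove its name right away, and return the
    still-open file object together with the (now dangling) path."""
    fd, path = tempfile.mkstemp()
    try:
        os.unlink(path)  # the directory entry goes away; fd stays valid
    except OSError:
        os.close(fd)
        raise
    return os.fdopen(fd, "w+b"), path
```

If os.unlink() closed the file, the subsequent write and read on the returned object would fail; on the platforms mentioned in the thread they do not.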
Ronald From skip at pobox.com Fri Sep 29 15:05:18 2006 From: skip at pobox.com (skip at pobox.com) Date: Fri, 29 Sep 2006 08:05:18 -0500 Subject: [Python-Dev] Python Doc problems In-Reply-To: <20060929121035.GA4884@localhost.localdomain> References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org> <20060928095951.08BF.JCARLSON@uci.edu> <17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp> <20060929121035.GA4884@localhost.localdomain> Message-ID: <17693.6670.189595.646482@montanaro.dyndns.org> Andrew> In such autogenerated documentation, you wind up with a list of Andrew> every single class and function, and both trivial and important Andrew> classes are given exactly the same emphasis. I find this true where I work as well. Doxygen is used as a documentation generation tool for our C++ class libraries. Too many people use that as a crutch to often avoid writing documentation altogether. It's worse in many ways than tools like epydoc, because you don't need to write any docstrings (or specially formatted comments) to generate reams and reams of virtual paper. This sort of documentation is all but useless for a Python programmer like myself. I don't really need to know the five syntactic constructor variants. I need to know how to use the classes which have been exposed to me. I guess this is a long-winded way of saying, "me too". Skip From ndbecker2 at gmail.com Fri Sep 29 15:18:17 2006 From: ndbecker2 at gmail.com (Neal Becker) Date: Fri, 29 Sep 2006 09:18:17 -0400 Subject: [Python-Dev] os.unlink() closes file? References: <10638710.1159533383149.JavaMail.ronaldoussoren@mac.com> Message-ID: Ronald Oussoren wrote: > > On Friday, September 29, 2006, at 02:22PM, Neal Becker > wrote: > >>It seems (I haven't looked at source) that os.unlink() will close the >>file? >> >>If so, please make this optional. It breaks the unix idiom for making a >>temporary file. 
>> >>(Yes, I know there is a tempfile module, but I need some behavior it >>doesn't implement so I want to do it myself). > > On what platform? Do you have a script that demonstrates your problem? If > yes, please file a bug in the bugtracker at > http://www.sf.net/projects/python. > > AFAIK os.unlink doesn't close files, and I cannot reproduce this problem > (python2.3 on Solaris 9). > Sorry, my mistake. From fredrik at pythonware.com Fri Sep 29 17:11:02 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 29 Sep 2006 17:11:02 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <20060929081402.GB19781@craig-wood.com> References: <20060929081402.GB19781@craig-wood.com> Message-ID: Nick Craig-Wood wrote: > Is there any reason why float() shouldn't cache the value of 0.0 since > it is by far and away the most common value? says who ? (I just checked the program I'm working on, and my analysis tells me that the most common floating point value in that program is 121.216, which occurs 32 times. from what I can tell, 0.0 isn't used at all.) From kristjan at ccpgames.com Fri Sep 29 17:18:17 2006 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=) Date: Fri, 29 Sep 2006 15:18:17 -0000 Subject: [Python-Dev] Caching float(0.0) Message-ID: <129CEF95A523704B9D46959C922A280002FE99A1@nemesis.central.ccp.cc> Acting on this excellent advice, I have patched in a reuse for -1.0, 0.0 and 1.0 for EVE Online. We use vectors and stuff a lot, and 0.0 is very, very common. I'll report on the refcount of this for you shortly. K > -----Original Message----- > From: python-dev-bounces+kristjan=ccpgames.com at python.org > [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] > On Behalf Of Fredrik Lundh > Sent: 29. 
september 2006 15:11 > To: python-dev at python.org > Subject: Re: [Python-Dev] Caching float(0.0) > > Nick Craig-Wood wrote: > > > Is there any reason why float() shouldn't cache the value > of 0.0 since > > it is by far and away the most common value? > > says who ? > > (I just checked the program I'm working on, and my analysis > tells me that the most common floating point value in that > program is 121.216, which occurs 32 times. from what I can > tell, 0.0 isn't used at all.) > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/kristjan%40c cpgames.com > From kristjan at ccpgames.com Fri Sep 29 18:11:25 2006 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=) Date: Fri, 29 Sep 2006 16:11:25 -0000 Subject: [Python-Dev] Caching float(0.0) Message-ID: <129CEF95A523704B9D46959C922A280002FE99A2@nemesis.central.ccp.cc> Well gentlemen, I did gather some stats on the frequency of PyFloat_FromDouble(). out of the 1000 first different floats allocated, we get this frequency distribution once our server has started up: - stats [1000]({v=0.00000000000000000 c=410612 },{v=1.0000000000000000 c=107838 },{v=0.75000000000000000 c=25487 },{v=5.0000000000000000 c=22557 },...) 
  std::vector >
+ [0]   {v=0.00000000000000000 c=410612 }   entry
+ [1]   {v=1.0000000000000000 c=107838 }    entry
+ [2]   {v=0.75000000000000000 c=25487 }    entry
+ [3]   {v=5.0000000000000000 c=22557 }     entry
+ [4]   {v=10000.000000000000 c=18530 }     entry
+ [5]   {v=-1.0000000000000000 c=14950 }    entry
+ [6]   {v=2.0000000000000000 c=14460 }     entry
+ [7]   {v=1500.0000000000000 c=13470 }     entry
+ [8]   {v=100.00000000000000 c=11913 }     entry
+ [9]   {v=0.50000000000000000 c=11497 }    entry
+ [10]  {v=3.0000000000000000 c=9833 }      entry
+ [11]  {v=20.000000000000000 c=9019 }      entry
+ [12]  {v=0.90000000000000002 c=8954 }     entry
+ [13]  {v=10.000000000000000 c=8377 }      entry
+ [14]  {v=4.0000000000000000 c=7890 }      entry
+ [15]  {v=0.050000000000000003 c=7732 }    entry
+ [16]  {v=1000.0000000000000 c=7456 }      entry
+ [17]  {v=0.40000000000000002 c=7427 }     entry
+ [18]  {v=-100.00000000000000 c=7071 }     entry
+ [19]  {v=5000.0000000000000 c=6851 }      entry
+ [20]  {v=1000000.0000000000 c=6503 }      entry
+ [21]  {v=0.070000000000000007 c=6071 }    entry

(here I omit the rest).

In addition, my shared 0.0 double has some 200000 references at this point. 0.0 is very, very common. The same can be said about all the integers up to 5.0 as well as -1.0

I think I will add a simple cache for these values for Eve. something like:

    int i = (int) fval;
    if ((double)i == fval && i>=-1 && i<6) {
        Py_INCREF(table[i]);
        return table[i];
    }

Cheers,
Kristján

> -----Original Message-----
> From: python-dev-bounces+kristjan=ccpgames.com at python.org
> [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org]
> On Behalf Of Kristján V. Jónsson
> Sent: 29. september 2006 15:18
> To: Fredrik Lundh; python-dev at python.org
> Subject: Re: [Python-Dev] Caching float(0.0)
>
> Acting on this excellent advice, I have patched in a reuse
> for -1.0, 0.0 and 1.0 for EVE Online. We use vectors and
> stuff a lot, and 0.0 is very, very common. I'll report on
> the refcount of this for you shortly.
> > K > > > -----Original Message----- > > From: python-dev-bounces+kristjan=ccpgames.com at python.org > > [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] > > On Behalf Of Fredrik Lundh > > Sent: 29. september 2006 15:11 > > To: python-dev at python.org > > Subject: Re: [Python-Dev] Caching float(0.0) > > > > Nick Craig-Wood wrote: > > > > > Is there any reason why float() shouldn't cache the value > > of 0.0 since > > > it is by far and away the most common value? > > > > says who ? > > > > (I just checked the program I'm working on, and my analysis > tells me > > that the most common floating point value in that program > is 121.216, > > which occurs 32 times. from what I can tell, 0.0 isn't > used at all.) > > > > > > > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > http://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: > > http://mail.python.org/mailman/options/python-dev/kristjan%40c > cpgames.com > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/kristjan%40c cpgames.com > From lcaamano at gmail.com Fri Sep 29 18:49:23 2006 From: lcaamano at gmail.com (Luis P Caamano) Date: Fri, 29 Sep 2006 12:49:23 -0400 Subject: [Python-Dev] PEP 355 status Message-ID: What's the status of PEP 355, Path - Object oriented filesystem paths? We'd like to start using the current reference implementation but we'd like to do it in a manner that minimizes any changes needed when Path becomes part of stdlib. In particular, the reference implementation in http://wiki.python.org/moin/PathModule names the class 'path' instead of 'Path', which seems like a source of name conflict problems. How would you recommend one starts using it now, as is or renaming class path to Path? 
Thanks

--
Luis P Caamano
Atlanta, GA USA

From jason.orendorff at gmail.com Fri Sep 29 19:47:54 2006
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Fri, 29 Sep 2006 13:47:54 -0400
Subject: [Python-Dev] Caching float(0.0)
In-Reply-To:
References: <20060929081402.GB19781@craig-wood.com>
Message-ID:

On 9/29/06, Fredrik Lundh wrote:
> (I just checked the program I'm working on, and my analysis tells me
> that the most common floating point value in that program is 121.216,
> which occurs 32 times. from what I can tell, 0.0 isn't used at all.)

*bemused look* Fredrik, can you share the reason why this number occurs 32 times in this program? I don't mean to imply anything by that; it just sounds like it might be a fun story. :)

Anyway, this kind of static analysis is probably more entertaining than relevant. For your enjoyment, the most-used float literals in python25\Lib, omitting test directories, are:

    1e-006:  5 hits
    4.0:     6 hits
    0.05:    7 hits
    6.0:     8 hits
    0.5:    13 hits
    2.0:    25 hits
    0.0:    36 hits
    1.0:    62 hits

There are two hits each for -1.0 and -0.5.

In my own Python code, I don't even have enough float literals to bother with.

-j

From nmm1 at cus.cam.ac.uk Fri Sep 29 20:03:42 2006
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Fri, 29 Sep 2006 19:03:42 +0100
Subject: [Python-Dev] Caching float(0.0)
Message-ID:

"Jason Orendorff" wrote:
>
> Anyway, this kind of static analysis is probably more entertaining
> than relevant. ...

Well, yes. One can tell that by the piffling little counts being bandied about! More seriously, yes, it is Well Known that 0.0 is the Most Common Floating-Point Number in most numerical codes; a lot of older (and perhaps modern) sparse matrix algorithms use that to save space.

In the software floating-point that I have started to draft some example code but have had to shelve (no, I haven't forgotten) the values I predefine are Invalid, Missing, True Zero and Approximate Zero. The infinities and infinitesimals (a.k.a.
signed zeroes) could also be included, but are less common and more complicated. And so could common integers and fractions. It is generally NOT worth doing a cache lookup for genuinely numerical code, as the common cases that are not the above rarely account for enough of the numbers to be worth it. I did a fair amount of investigation looking for compressibility at one time, and that conclusion jumped out at me. The exact best choice depends entirely on what you are doing. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From guido at python.org Fri Sep 29 21:03:03 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 29 Sep 2006 12:03:03 -0700 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: Message-ID: I see some confusion in this thread. If a *LITERAL* 0.0 (or any other float literal) is used, you only get one object, no matter how many times it is used. But if the result of a *COMPUTATION* returns 0.0, you get a new object for each such result. If you have 70 MB worth of zeros, that's clearly computation results, not literals. Attempts to remove literal references from source code won't help much. I'm personally +0 on caching computational results with common float values such as 0 and small (positive or negative) powers of two, e.g. 0.5, 1.0, 2.0. 
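
[Editorial sketch, not part of the original message.] The caching being discussed happens inside the interpreter, but the same effect can be had at application level, which is roughly what the `age = 0.0` trick earlier in the thread does. The following is a minimal Python sketch under stated assumptions: the helper name shared_float and the table _SHARED are hypothetical, and the range -1.0 to 5.0 simply mirrors the values proposed in the thread. It keeps -0.0 out of the cache, since -0.0 compares equal to 0.0 and would otherwise lose its sign.

```python
import math

# Hypothetical table of shared float objects for a few very common values.
_SHARED = {float(i): float(i) for i in range(-1, 6)}


def shared_float(value):
    """Return one shared object for common float values, else value itself."""
    if type(value) is not float:
        return value  # leave ints and other types untouched
    if value == 0.0 and math.copysign(1.0, value) < 0:
        return value  # keep negative zero distinct from the cached 0.0
    return _SHARED.get(value, value)
```

A program that stores many computed results could pass them through such a helper before putting them in long-lived containers; whether the dictionary lookup pays for itself depends on how skewed the value distribution is, which is the point being debated above.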
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From simon at brunningonline.net Fri Sep 29 21:11:13 2006 From: simon at brunningonline.net (Simon Brunning) Date: Fri, 29 Sep 2006 20:11:13 +0100 Subject: [Python-Dev] Python Doc problems In-Reply-To: <451C83D7.5090705@canterbury.ac.nz> References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org> <20060928095951.08BF.JCARLSON@uci.edu> <17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp> <34B6804F-9F83-4415-8C2F-BEDD6CD9F63B@python.org> <451C83D7.5090705@canterbury.ac.nz> Message-ID: <8c7f10c60609291211u5804a9fdi20e09adfd7b56d74@mail.gmail.com> On 9/29/06, Greg Ewing wrote: > An example of a good way to do it is the original Inside > Macintosh series. Each chapter started with a narrative-style > "About this module" kind of section, that introduced the > relevant concepts and explained how they fitted together, > without going into low-level details. Then there was a > "Reference" section that systematically went through and > gave all the details of the API. The "How to use this module" sections sound like /F's "The Python Standard Library", of which I keep the dead tree version on my desk and the PDF version on my hard drive for when I'm coding in the pub. It or something like it would be a superb addition to the (already very good IMHO) Python docs.
-- Cheers, Simon B, simon at brunningonline.net From fredrik at pythonware.com Fri Sep 29 21:27:05 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 29 Sep 2006 21:27:05 +0200 Subject: [Python-Dev] Python Doc problems In-Reply-To: <8c7f10c60609291211u5804a9fdi20e09adfd7b56d74@mail.gmail.com> References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org> <20060928095951.08BF.JCARLSON@uci.edu> <17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp> <34B6804F-9F83-4415-8C2F-BEDD6CD9F63B@python.org> <451C83D7.5090705@canterbury.ac.nz> <8c7f10c60609291211u5804a9fdi20e09adfd7b56d74@mail.gmail.com> Message-ID: Simon Brunning wrote: > The "How to use this module" sections sound like /F's "The Python > Standard Library", of which I keep the dead tree version on my desk > and the PDF vesion on my hard drive for when I'm coding in the pub. It > or something like it would be a superb addition to the (already very > good IMHO) Python docs. that's what my old seealso proposal was supposed to address: http://effbot.org/zone/idea-seealso.htm the standard library's seealso file is here: http://effbot.org/librarybook/seealso.xml From guido at python.org Fri Sep 29 21:29:31 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 29 Sep 2006 12:29:31 -0700 Subject: [Python-Dev] Python Doc problems In-Reply-To: <20060929121035.GA4884@localhost.localdomain> References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org> <20060928095951.08BF.JCARLSON@uci.edu> <17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp> <20060929121035.GA4884@localhost.localdomain> Message-ID: On 9/29/06, A.M. Kuchling wrote: > On Fri, Sep 29, 2006 at 09:49:35AM +0900, stephen at xemacs.org wrote: > > What is lost according to him is information about how the elements of > > a module work together. The docstrings tend to be narrowly focused on > > the particular function or variable, and too often discuss > > implementation details. 
> > I agree with this, and am not very interested in tools such as epydoc > for this reason. In such autogenerated documentation, you wind up > with a list of every single class and function, and both trivial and > important classes are given exactly the same emphasis. Such docs are > useful as a reference when you know what class you need to look at, > but then pydoc also works well for that purpose. Right. BTW isn't xah a well-known troll? (There are exactly 666 Google hits for the query ``xah troll'' -- draw your own conclusions. :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Sep 29 21:38:22 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 29 Sep 2006 12:38:22 -0700 Subject: [Python-Dev] PEP 355 status In-Reply-To: References: Message-ID: I would recommend not using it. IMO it's an amalgam of unrelated functionality (much like the Java equivalent BTW) and the existing os and os.path modules work just fine. Those who disagree with me haven't done a very good job of convincing me, so I expect this PEP to remain in limbo indefinitely, until it is eventually withdrawn or rejected. --Guido On 9/29/06, Luis P Caamano wrote: > What's the status of PEP 355, Path - Object oriented filesystem paths? > > We'd like to start using the current reference implementation but we'd > like to do it in a manner that minimizes any changes needed when Path > becomes part of stdlib. > > In particular, the reference implementation in > http://wiki.python.org/moin/PathModule names the class 'path' instead > of 'Path', which seems like a source of name conflict problems. > > How would you recommend one starts using it now, as is or renaming > class path to Path? 
> > Thanks > > -- > Luis P Caamano > Atlanta, GA USA > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From lcaamano at gmail.com Fri Sep 29 22:15:44 2006 From: lcaamano at gmail.com (Luis P Caamano) Date: Fri, 29 Sep 2006 16:15:44 -0400 Subject: [Python-Dev] PEP 355 status In-Reply-To: References: Message-ID: Thanks for your reply, that's the kind of info I was looking for to decide what to do. Good enough, I'll move on then. Thanks -- Luis P Caamano Atlanta, GA USA On 9/29/06, Guido van Rossum wrote: > I would recommend not using it. IMO it's an amalgam of unrelated > functionality (much like the Java equivalent BTW) and the existing os > and os.path modules work just fine. Those who disagree with me haven't > done a very good job of convincing me, so I expect this PEP to remain > in limbo indefinitely, until it is eventually withdrawn or rejected. > > --Guido > > On 9/29/06, Luis P Caamano wrote: > > What's the status of PEP 355, Path - Object oriented filesystem paths? > > > > We'd like to start using the current reference implementation but we'd > > like to do it in a manner that minimizes any changes needed when Path > > becomes part of stdlib. > > > > In particular, the reference implementation in > > http://wiki.python.org/moin/PathModule names the class 'path' instead > > of 'Path', which seems like a source of name conflict problems. > > > > How would you recommend one starts using it now, as is or renaming > > class path to Path? 
> > > > Thanks > > > > -- > > Luis P Caamano > > Atlanta, GA USA > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > http://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > > > > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > From g.brandl at gmx.net Fri Sep 29 22:18:16 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 29 Sep 2006 22:18:16 +0200 Subject: [Python-Dev] PEP 355 status In-Reply-To: References: Message-ID: Shouldn't that paragraph be added to the PEP (e.g. under a "Status" subheading)? enjoying-top-posting-ly, Georg Guido van Rossum wrote: > I would recommend not using it. IMO it's an amalgam of unrelated > functionality (much like the Java equivalent BTW) and the existing os > and os.path modules work just fine. Those who disagree with me haven't > done a very good job of convincing me, so I expect this PEP to remain > in limbo indefinitely, until it is eventually withdrawn or rejected. > > --Guido > > On 9/29/06, Luis P Caamano wrote: >> What's the status of PEP 355, Path - Object oriented filesystem paths? >> >> We'd like to start using the current reference implementation but we'd >> like to do it in a manner that minimizes any changes needed when Path >> becomes part of stdlib. >> >> In particular, the reference implementation in >> http://wiki.python.org/moin/PathModule names the class 'path' instead >> of 'Path', which seems like a source of name conflict problems. >> >> How would you recommend one starts using it now, as is or renaming >> class path to Path? 
>> >> Thanks >> >> -- >> Luis P Caamano >> Atlanta, GA USA >> From bjourne at gmail.com Fri Sep 29 23:48:37 2006 From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=) Date: Fri, 29 Sep 2006 23:48:37 +0200 Subject: [Python-Dev] Python Doc problems In-Reply-To: <20060928095951.08BF.JCARLSON@uci.edu> References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org> <20060928095951.08BF.JCARLSON@uci.edu> Message-ID: <740c3aec0609291448p38b8f3ebp75f0608ba6a99259@mail.gmail.com> > If there are "rampant criticisms" of the Python docs, then those that > are complaining should take specific examples of their complaints to the > sourceforge bug tracker and submit documentation patches for the > relevant sections. And personally, I've not noticed that criticisms of > the Python docs are "rampant", but maybe there is some "I hate Python > docs" newsgroup or mailing list that I'm not subscribed to. Meh! The number one complaint IS that you have to take your complaints to the sourceforge bug tracker and submit documentation patches. For documentation changes, that is way too much overhead for too little gain. But thankfully I think there are people working on fixing those problems, which is very nice. -- mvh Björn From tzot at mediconsa.com Sat Sep 30 01:27:25 2006 From: tzot at mediconsa.com (Christos Georgiou) Date: Sat, 30 Sep 2006 02:27:25 +0300 Subject: [Python-Dev] Tix not included in 2.5 for Windows Message-ID: Does anyone know why this happens? I can't find any information pointing to this being deliberate. I just upgraded to 2.5 on Windows (after making sure I can build extensions with the freeware VC++ Toolkit 2003) and some of my programs stopped operating.
I saw in a French forum that someone else had the same problem, and what they did was to copy the relevant files from a 2.4.3 installation. I did the same, and it seems it works, with only a console message appearing as soon as a root window is created: attempt to provide package Tix 8.1 failed: package Tix 8.1.8.4 provided instead Cheers. From jcarlson at uci.edu Sat Sep 30 01:54:10 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 29 Sep 2006 16:54:10 -0700 Subject: [Python-Dev] Python Doc problems In-Reply-To: <740c3aec0609291448p38b8f3ebp75f0608ba6a99259@mail.gmail.com> References: <20060928095951.08BF.JCARLSON@uci.edu> <740c3aec0609291448p38b8f3ebp75f0608ba6a99259@mail.gmail.com> Message-ID: <20060929164528.08DD.JCARLSON@uci.edu> "BJörn Lindqvist" wrote: > > If there are "rampant criticisms" of the Python docs, then those that > > are complaining should take specific examples of their complaints to the > > sourceforge bug tracker and submit documentation patches for the > > relevant sections. And personally, I've not noticed that criticisms of > > the Python docs are "rampant", but maybe there is some "I hate Python > > docs" newsgroup or mailing list that I'm not subscribed to. > > Meh! The number one complaint IS that you have to take your complaints > to the sourceforge bug tracker and submit documentation patches. For > documentation changes, that is way too much overhead for too little > gain. But thankfully I think there are people working on fixing those > problems, which is very nice. Are you telling me that people want to be able to complain into the ether and get their complaints heard? I hope not, because that would be insane. Also, "doc patches" are basically "the function foo() should be documented as ..."; users don't need to know or learn TeX. Should there be an easier method of submitting doc fixes, etc.? Sure. But people are still going to need to actually *report* the fixes they want, which they aren't doing in *any* form now.
- Josiah From brett at python.org Sat Sep 30 02:23:55 2006 From: brett at python.org (Brett Cannon) Date: Fri, 29 Sep 2006 17:23:55 -0700 Subject: [Python-Dev] Python Doc problems In-Reply-To: <740c3aec0609291448p38b8f3ebp75f0608ba6a99259@mail.gmail.com> References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org> <20060928095951.08BF.JCARLSON@uci.edu> <740c3aec0609291448p38b8f3ebp75f0608ba6a99259@mail.gmail.com> Message-ID: On 9/29/06, BJörn Lindqvist wrote: > > > If there are "rampant criticisms" of the Python docs, then those that > > are complaining should take specific examples of their complaints to the > > sourceforge bug tracker and submit documentation patches for the > > relevant sections. And personally, I've not noticed that criticisms of > > the Python docs are "rampant", but maybe there is some "I hate Python > > docs" newsgroup or mailing list that I'm not subscribed to. > > Meh! The number one complaint IS that you have to take your complaints > to the sourceforge bug tracker and submit documentation patches. For > documentation changes, that is way too much overhead for too little > gain. But thankfully I think there are people working on fixing those > problems, which is very nice. The PSF Infrastructure committee has already met and drafted our suggestions. Expect a post to the list on Monday or Tuesday outlining our recommendation on a new issue tracker. -Brett -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060929/6760e511/attachment.html From greg.ewing at canterbury.ac.nz Sat Sep 30 02:57:55 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 30 Sep 2006 12:57:55 +1200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <20060929081402.GB19781@craig-wood.com> References: <20060929081402.GB19781@craig-wood.com> Message-ID: <451DC113.4040002@canterbury.ac.nz> Nick Craig-Wood wrote: > Is there any reason why float() shouldn't cache the value of 0.0 since > it is by far and away the most common value? 1.0 might be another candidate for cacheing. Although the fact that nobody has complained about this before suggests that it might not be a frequent enough problem to be worth the effort. -- Greg From bob at redivi.com Sat Sep 30 03:15:15 2006 From: bob at redivi.com (Bob Ippolito) Date: Fri, 29 Sep 2006 18:15:15 -0700 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <451DC113.4040002@canterbury.ac.nz> References: <20060929081402.GB19781@craig-wood.com> <451DC113.4040002@canterbury.ac.nz> Message-ID: <6a36e7290609291815i115b41b3o4ab6d196f404557f@mail.gmail.com> On 9/29/06, Greg Ewing wrote: > Nick Craig-Wood wrote: > > > Is there any reason why float() shouldn't cache the value of 0.0 since > > it is by far and away the most common value? > > 1.0 might be another candidate for cacheing. > > Although the fact that nobody has complained about this > before suggests that it might not be a frequent enough > problem to be worth the effort. My guess is that people do have this problem, they just don't know where that memory has gone. I know I don't count objects unless I have a process that's leaking memory or it grows so big that I notice (by swapping or chance). That said, I've never noticed this particular issue.. but I deal with mostly strings. I have had issues with the allocator a few times that I had to work around, but not this sort of issue. 
-bob From rrr at ronadam.com Sat Sep 30 03:15:04 2006 From: rrr at ronadam.com (Ron Adam) Date: Fri, 29 Sep 2006 20:15:04 -0500 Subject: [Python-Dev] Python Doc problems In-Reply-To: <20060929164528.08DD.JCARLSON@uci.edu> References: <20060928095951.08BF.JCARLSON@uci.edu> <740c3aec0609291448p38b8f3ebp75f0608ba6a99259@mail.gmail.com> <20060929164528.08DD.JCARLSON@uci.edu> Message-ID: <451DC518.10609@ronadam.com> Josiah Carlson wrote: > "BJörn Lindqvist" wrote: >>> If there are "rampant criticisms" of the Python docs, then those that >>> are complaining should take specific examples of their complaints to the >>> sourceforge bug tracker and submit documentation patches for the >>> relevant sections. And personally, I've not noticed that criticisms of >>> the Python docs are "rampant", but maybe there is some "I hate Python >>> docs" newsgroup or mailing list that I'm not subscribed to. >> Meh! The number one complaint IS that you have to take your complaints >> to the sourceforge bug tracker and submit documentation patches. For >> documentation changes, that is way too much overhead for too little >> gain. But thankfully I think there are people working on fixing those >> problems, which is very nice. > > Are you telling me that people want to be able to complain into the > ether and get their complaints heard? I hope not, because that would be > insane. Also, "doc patches" are basically "the function foo() should be > documented as ..."; users don't need to know or learn TeX. Should there > be an easier method of submitting doc fixes, etc.? Sure. But people are > still going to need to actually *report* the fixes they want, which they > aren't doing in *any* form now. Maybe a doc fix day (similar to the bug fix day) would be good. That way we can report a lot of minor doc fixes at once and then they can be fixed in batches. For example, here are things I think may be thought of as too trivial to report but that affect readability and ease of use with Python's help() function ...
A *lot* of doc strings have lines that wrap when they are displayed by Python's help() in a standard 80 column console window.

There are also two (maybe more) modules that have single backslash characters in their doc strings that get eaten when viewed by pydoc. cookielib.py - has single '\'s in a diagram. SimpleXMLRPCServer.py - line 31... code example with line continuation.

I wonder if double \ should also be allowed as line continuations so that doctests would look and work ok in doc strings when viewed by Python's help()?

Anyway, if someone wants to search for other things of that type they can play around with the hacked together tool included below. Setting it low enough so that indented methods don't wrap with the help() function brings up several thousand instances. I'm hoping most of those are duplicated/inherited doc strings.

Many of those are documented format lines with the form ... name( longs_list_of_arguments ... ) -> long_list_of_return_types ...

Rather than fix all of those, I'm changing the version of pydoc I've been rewriting to wordwrap lines. Although that's not the prettiest solution, it's better than breaking the indented margin.

Have fun... ;-)

Ron

"""
Find doc string lines are not longer than n characters.

Dedenting the doc strings before testing may give
more meaningful results.
"""
import sys
import os
import inspect
import types

class NullType(object):
    """ A simple Null object to use when None is a valid
        argument, or when redirecting print to Null. """
    def write(self, *args):
        pass
    def __repr__(self):
        return "Null"

Null = NullType()

check = 'CHECKING__________'
checkerr = 'ERROR CHECKING____'
err_obj = []
err_num = 0
stdout = sys.stdout
stderr = sys.stderr
seporator = '--------------------------------------------------------'
linelength = 100

def main():
    sys_path = sys.path
    # remove invalid dirs
    for f in sys_path[:]:
        try:
            os.listdir(f)
        except:
            sys_path.remove(f)
    #checkmodule('__builtin__')
    for mod in sys.builtin_module_names:
        checkmodule(mod)
    for dir_ in sys.path:
        for f in os.listdir(dir_):
            if f.endswith('.py') or f.endswith('.pyw') or f.endswith('.pyd'):
                try:
                    checkmodule(f.partition('.')[0])
                except Exception:
                    print seporator
                    print checkerr, f, err_obj
                    print '   %s: %s' % (sys.exc_type.__name__, sys.exc_value)
    print seporator

def checkmodule(modname):
    global err_obj
    err_obj = [modname]
    # Silent text printed on import.
    sys.stdout = sys.stderr = Null
    try:
        module = __import__(modname)
    finally:
        sys.stdout = stdout
        sys.stderr = stderr
    try:
        checkobj(module)            # module doc string
        for o1 in dir(module):
            obj1 = getattr(module, o1)
            err_obj = [modname, o1]
            checkobj(obj1)          # class and function doc strings
            for o2 in dir(obj1):
                obj2 = getattr(obj1, o2)
                err_obj = [modname, o1, o2]
                checkobj(obj2)      # method doc strings
    finally:
        del module

def checkobj(obj):
    global err_num
    if not hasattr(obj, '__doc__'):
        return
    doc = str(obj.__doc__)
    err_obj.append('__doc__')
    lines = doc.split('\n')
    longlines = [x for x in lines if len(x) > linelength]
    if longlines:
        err_num += 1
        print seporator
        print '#%i: %s' % (err_num, '.'.join([str(x) for x in err_obj]))
        print
        for x in longlines:
            print len(x), repr(x.strip())

if __name__ == '__main__':
    main()

From glyph at divmod.com Sat Sep 30 06:52:58 2006 From: glyph at divmod.com (glyph at divmod.com) Date: Sat, 30 Sep 2006 00:52:58 -0400 Subject: [Python-Dev] PEP 355 status In-Reply-To: Message-ID: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> On Fri, 29 Sep 2006 12:38:22 -0700, Guido van Rossum
wrote: >I would recommend not using it. IMO it's an amalgam of unrelated >functionality (much like the Java equivalent BTW) and the existing os >and os.path modules work just fine. Those who disagree with me haven't >done a very good job of convincing me, so I expect this PEP to remain >in limbo indefinitely, until it is eventually withdrawn or rejected. Personally I don't like the path module in question either, and I think that PEP 355 presents an exceptionally weak case, but I do believe that there are several serious use-cases for "object oriented" filesystem access. Twisted has a module for doing this: http://twistedmatrix.com/trac/browser/trunk/twisted/python/filepath.py I hope to one day propose this module as a replacement, or update, for PEP 355, but I have neither the time nor the motivation to do it currently. I wouldn't propose it now; it is, for example, mostly undocumented, missing some useful functionality, and has some weird warts (for example, the name of the path-as-string attribute is "path"). However, since it's come up I thought I'd share a few of the use-cases for the general feature, and the things that Twisted has done with it. 1: Testing. If you want to provide filesystem stubs to test code which interacts with the filesystem, it is fragile and extremely complex to temporarily replace the 'os' module; you have to provide a replacement which knows about all the hairy string manipulations one can perform on paths, and you'll almost always forget some weird platform feature. If you have an object with a narrow interface to duck-type instead; for example, a "walk" method which returns similar objects, or an "open" method which returns a file-like object, mocking the appropriate parts of it in a test is a lot easier. 
The proposed PEP 355 module can be used for this, but its interface is pretty wide and implicit (and portions of it are platform-specific), and because it is also a string you may still have to deal with platform-specific features in tests (or even mixed os.path manipulations, on the same object). This is especially helpful when writing tests for error conditions that are difficult to reproduce on an actual filesystem, such as a network filesystem becoming unavailable. 2: Fast failure, or for lack of a better phrase, "type correctness". PEP 355 gets close to this idea when it talks about datetimes and sockets not being strings. In many cases, code that manipulates filesystems is passing around 'str' or 'unicode' objects, and may be accidentally passed the contents of a file rather than its name, leading to a bizarre failure further down the line. FilePath fails immediately with an "unsupported operand types" TypeError in that case. It also provides nice, immediate feedback at the prompt that the object you're dealing with is supposed to be a filesystem path, with no confusion as to whether it represents a relative or absolute path, or a path relative to a particular directory. Again, the PEP 355 module's subclassing of strings creates problems, because you don't get an immediate and obvious exception if you try to interpolate it with a non-path-name string, it silently "succeeds". 3: Safety. Almost every web server ever written (yes, including twisted.web) has been bitten by the "/../../../" bug at least once. The default child(name) method of Twisted's file path class will only let you go "down" (to go "up" you have to call the parent() method), and will trap obscure platform features like the "NUL" and "CON" files on Windows so that you can't trick a program into manipulating something that isn't actually a file. You can take strings you've read from an untrusted source and pass them to FilePath.child and get something relatively safe out. 
PEP 355 doesn't mention this at all. 4: last, but certainly not least: filesystem polymorphism. For an example of what I mean, take a look at this in-development module: http://twistedmatrix.com/trac/browser/trunk/twisted/python/zippath.py It's currently far too informal, and incomplete, and there's no specified interface. However, this module shows that by being objects and not module-methods, FilePath objects can also provide a sort of virtual filesystem for Python programs. With FilePath plus ZipPath, You can write Python programs which can operate on a filesystem directory or a directory within a Zip archive, depending on what object they are passed. On a more subjective note, I've been gradually moving over personal utility scripts from os.path manipulations to twisted.python.filepath for years. I can't say that this will be everyone's experience, but in the same way that Python scripts avoid the class of errors present in most shell scripts (quoting), t.p.f scripts avoid the class of errors present in most Python scripts (off-by-one errors when looking at separators or extensions). I hope that eventually Python will include some form of OO filesystem access, but I am equally hopeful that the current PEP 355 path.py is not it. From ncoghlan at gmail.com Sat Sep 30 07:17:16 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 30 Sep 2006 15:17:16 +1000 Subject: [Python-Dev] PEP 355 status In-Reply-To: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> Message-ID: <451DFDDC.9020708@gmail.com> glyph at divmod.com wrote: > I hope that eventually Python will include some form of OO filesystem > access, but I am equally hopeful that the current PEP 355 path.py is not > it. +1 Cheers, Nick. 
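A minimal sketch of the "only go down" idea from glyph's point 3 (safety), written against plain os.path rather than Twisted's FilePath. The helper name safe_child and its particular checks are assumptions made for illustration; this is not Twisted's implementation, and it does not handle the Windows device names (NUL, CON) mentioned in that point.

```python
import os.path


def safe_child(base, name):
    """Join one untrusted path segment onto base, refusing to go "up".

    Hypothetical helper for illustration only.
    """
    # Reject anything that is not a single, plain path segment.
    if name in ("", ".", "..") or "\0" in name:
        raise ValueError("not a plain child name: %r" % (name,))
    if "/" in name or os.sep in name or (os.altsep and os.altsep in name):
        raise ValueError("child name contains a separator: %r" % (name,))
    base = os.path.abspath(base)
    child = os.path.abspath(os.path.join(base, name))
    # Belt and braces: the result must still sit directly under base.
    if os.path.dirname(child) != base:
        raise ValueError("%r escapes %r" % (name, base))
    return child
```

With something like this, a name read from a URL can be passed straight in; "index.html" is accepted, while "..", "../etc/passwd" and "a/b" raise ValueError instead of silently leaving the base directory.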
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From steve at holdenweb.com Sat Sep 30 09:41:38 2006 From: steve at holdenweb.com (Steve Holden) Date: Sat, 30 Sep 2006 08:41:38 +0100 Subject: [Python-Dev] Python Doc problems In-Reply-To: References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org> <20060928095951.08BF.JCARLSON@uci.edu> <17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp> <20060929121035.GA4884@localhost.localdomain> Message-ID: <451E1FB2.9050209@holdenweb.com> Guido van Rossum wrote: > On 9/29/06, A.M. Kuchling wrote: > >>On Fri, Sep 29, 2006 at 09:49:35AM +0900, stephen at xemacs.org wrote: >> >>>What is lost according to him is information about how the elements of >>>a module work together. The docstrings tend to be narrowly focused on >>>the particular function or variable, and too often discuss >>>implementation details. >> >>I agree with this, and am not very interested in tools such as epydoc >>for this reason. In such autogenerated documentation, you wind up >>with a list of every single class and function, and both trivial and >>important classes are given exactly the same emphasis. Such docs are >>useful as a reference when you know what class you need to look at, >>but then pydoc also works well for that purpose. > > > Right. > > BTW isn't xah a well-known troll? (There are exactly 666 Google hits > for the query ``xah troll'' -- draw your own conclusions. :-) > The calming influence of c.l.py appears to have worked its magic on xah to the extent that his most recent post didn't contain any expletives. Maybe there's hope for him yet. 
668-and-counting-ly y'rs - steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From steve at holdenweb.com Sat Sep 30 09:45:03 2006 From: steve at holdenweb.com (Steve Holden) Date: Sat, 30 Sep 2006 08:45:03 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: <20060929081402.GB19781@craig-wood.com> Message-ID: Jason Orendorff wrote: > On 9/29/06, Fredrik Lundh wrote: > >>(I just checked the program I'm working on, and my analysis tells me >>that the most common floating point value in that program is 121.216, >>which occurs 32 times. from what I can tell, 0.0 isn't used at all.) > > > *bemused look* Fredrik, can you share the reason why this number > occurs 32 times in this program? I don't mean to imply anything by > that; it just sounds like it might be a fun story. :) > > Anyway, this kind of static analysis is probably more entertaining > than relevant. For your enjoyment, the most-used float literals in > python25\Lib, omitting test directories, are: > > 1e-006: 5 hits > 4.0: 6 hits > 0.05: 7 hits > 6.0: 8 hits > 0.5: 13 hits > 2.0: 25 hits > 0.0: 36 hits > 1.0: 62 hits > > There are two hits each for -1.0 and -0.5. > > In my own Python code, I don't even have enough float literals to bother with. > By these statistics I think the answer to the original question is clearly "no" in the general case. 
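A count like the one Jason quotes can be approximated with the standard tokenize module. This is only a sketch: the helper name count_float_literals is made up here, it works on a source string rather than walking python25\Lib, and it is not the script Jason actually used.

```python
import ast
import io
import tokenize
from collections import Counter


def count_float_literals(source):
    """Return a Counter mapping float literal text to its number of uses."""
    counts = Counter()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.NUMBER:
            continue
        # literal_eval turns the token text into an int, float or complex;
        # only real float literals are of interest here.
        if isinstance(ast.literal_eval(tok.string), float):
            counts[tok.string] += 1
    return counts


sample = "x = 0.0\ny = x + 1.0\nz = [0.0, 2, 3j, 1.0, 1.0]\n"
for text, hits in count_float_literals(sample).most_common():
    print("%s: %d hits" % (text, hits))
```

Note that this counts literals in source text, which is the static view Jason describes as more entertaining than relevant; it says nothing about how many float objects exist at run time.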
regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From martin at v.loewis.de Sat Sep 30 10:43:01 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 30 Sep 2006 10:43:01 +0200 Subject: [Python-Dev] Tix not included in 2.5 for Windows In-Reply-To: References: Message-ID: <451E2E15.4040906@v.loewis.de> Christos Georgiou wrote: > Does anyone know why this happens? I can't find any information pointing to > this being deliberate. It may well be that Tix wasn't included on Windows. I don't test Tix regularly, and nobody reported missing it during the beta test. Please submit a bug report to sf.net/projects/python. Notice that Python 2.5 ships with a different Tcl version than 2.4; using the 2.4 Tix binaries in 2.5 may cause crashes. Regards, Martin From martin at v.loewis.de Sat Sep 30 10:47:46 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 30 Sep 2006 10:47:46 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <6a36e7290609291815i115b41b3o4ab6d196f404557f@mail.gmail.com> References: <20060929081402.GB19781@craig-wood.com> <451DC113.4040002@canterbury.ac.nz> <6a36e7290609291815i115b41b3o4ab6d196f404557f@mail.gmail.com> Message-ID: <451E2F32.9070405@v.loewis.de> Bob Ippolito wrote: > My guess is that people do have this problem, they just don't know > where that memory has gone. I know I don't count objects unless I have > a process that's leaking memory or it grows so big that I notice (by > swapping or chance). Right. Although I do wonder what kind of software people write to run into this problem. As Guido points out, the numbers must be the result from some computation, or created by an extension module by different means.
If people have many *simultaneous* copies of 0.0, I would expect there is something else really wrong with the data structures or algorithms they use. Regards, Martin From ncoghlan at gmail.com Sat Sep 30 10:59:25 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 30 Sep 2006 18:59:25 +1000 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <451E2F32.9070405@v.loewis.de> References: <20060929081402.GB19781@craig-wood.com> <451DC113.4040002@canterbury.ac.nz> <6a36e7290609291815i115b41b3o4ab6d196f404557f@mail.gmail.com> <451E2F32.9070405@v.loewis.de> Message-ID: <451E31ED.7030905@gmail.com> Martin v. Löwis wrote: > Bob Ippolito wrote: >> My guess is that people do have this problem, they just don't know >> where that memory has gone. I know I don't count objects unless I have >> a process that's leaking memory or it grows so big that I notice (by >> swapping or chance). > > Right. Although I do wonder what kind of software people write to run > into this problem. As Guido points out, the numbers must be the result > from some computation, or created by an extension module by different > means. If people have many *simultaneous* copies of 0.0, I would expect > there is something else really wrong with the data structures or > algorithms they use. I suspect the problem would typically stem from floating point values that are read in from a human-readable file rather than being the result of a 'calculation' as such: >>> float('1') is float('1') False >>> float('0') is float('0') False Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From tzot at mediconsa.com Sat Sep 30 11:23:28 2006 From: tzot at mediconsa.com (Christos Georgiou) Date: Sat, 30 Sep 2006 12:23:28 +0300 Subject: [Python-Dev] Tix not included in 2.5 for Windows References: <451E2E15.4040906@v.loewis.de> Message-ID: ""Martin v.
Löwis"" wrote in message news:451E2E15.4040906 at v.loewis.de... > Please submit a bug report to sf.net/projects/python. Done: www.python.org/sf/1568240 From kristjan at ccpgames.com Sat Sep 30 13:20:07 2006 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=) Date: Sat, 30 Sep 2006 11:20:07 -0000 Subject: [Python-Dev] Caching float(0.0) Message-ID: <129CEF95A523704B9D46959C922A28000451FED3@nemesis.central.ccp.cc> Well, a lot of extension code, like ours, uses PyFloat_FromDouble(foo); this can be from vectors and stuff. Very often these are values from a database. Integral float values are very common in such cases, and it didn't occur to me that they weren't being reused, at least for small values. Also, a lot of arithmetic involving floats is expected to end in integers, like computing some index from a float value. Integers get promoted to floats when touched by them, as you know. Anyway, I now precreate integral values from -10 to 10 with great effect. The cost is minimal, the benefit great. Cheers, Kristján -----Original Message----- From: python-dev-bounces+kristjan=ccpgames.com at python.org [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] On Behalf Of "Martin v. Löwis" Sent: 30 September 2006 08:48 To: Bob Ippolito Cc: python-dev at python.org Subject: Re: [Python-Dev] Caching float(0.0) Bob Ippolito wrote: > My guess is that people do have this problem, they just don't know > where that memory has gone. I know I don't count objects unless I have > a process that's leaking memory or it grows so big that I notice (by > swapping or chance). Right. Although I do wonder what kind of software people write to run into this problem. As Guido points out, the numbers must be the result from some computation, or created by an extension module by different means. If people have many *simultaneous* copies of 0.0, I would expect there is something else really wrong with the data structures or algorithms they use.
Regards, Martin From mwh at python.net Sat Sep 30 13:52:20 2006 From: mwh at python.net (Michael Hudson) Date: Sat, 30 Sep 2006 12:52:20 +0100 Subject: [Python-Dev] PEP 355 status In-Reply-To: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> (glyph@divmod.com's message of "Sat, 30 Sep 2006 00:52:58 -0400") References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> Message-ID: <2mk63lfu6j.fsf@starship.python.net> glyph at divmod.com writes: > I hope that eventually Python will include some form of OO > filesystem access, but I am equally hopeful that the current PEP 355 > path.py is not it. I think I agree with this too. For another source of ideas there is the 'py.path' bit of the py lib, which, um, doesn't seem to be documented terribly well, but allows access to remote svn repositories as well as local filesystems (at least). Cheers, mwh -- 3. Syntactic sugar causes cancer of the semicolon. -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html From guido at python.org Sat Sep 30 17:09:58 2006 From: guido at python.org (Guido van Rossum) Date: Sat, 30 Sep 2006 08:09:58 -0700 Subject: [Python-Dev] PEP 355 status In-Reply-To: <2mk63lfu6j.fsf@starship.python.net> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> Message-ID: OK. Pronouncement: PEP 355 is dead. The authors (or the PEP editor) can update the PEP. I'm looking forward to a new PEP. --Guido On 9/30/06, Michael Hudson wrote: > glyph at divmod.com writes: > > > I hope that eventually Python will include some form of OO > > filesystem access, but I am equally hopeful that the current PEP 355 > > path.py is not it. > > I think I agree with this too.
For another source of ideas there is > the 'py.path' bit of the py lib, which, um, doesn't seem to be > documented terribly well, but allows access to remote svn repositories > as well as local filesystems (at least). > > Cheers, > mwh > > -- > 3. Syntactic sugar causes cancer of the semicolon. > -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From Hans.Polak at capgemini.com Fri Sep 29 12:46:43 2006 From: Hans.Polak at capgemini.com (Hans Polak) Date: Fri, 29 Sep 2006 12:46:43 +0200 Subject: [Python-Dev] PEP 351 - do while Message-ID: <001301c6e3b4$8de5bda0$1d2c440a@spain.capgemini.com> Hi, Just an opinion, but many uses of the 'while true loop' are instances of a 'do loop'. I appreciate the language layout question, so I'll give you an alternative: do: while Cheers, Hans Polak. -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060929/906c9e6c/attachment.htm From tjreedy at udel.edu Sat Sep 30 21:53:33 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 30 Sep 2006 15:53:33 -0400 Subject: [Python-Dev] Caching float(0.0) References: <20060929081402.GB19781@craig-wood.com> <451DC113.4040002@canterbury.ac.nz> <6a36e7290609291815i115b41b3o4ab6d196f404557f@mail.gmail.com> <451E2F32.9070405@v.loewis.de> <451E31ED.7030905@gmail.com> Message-ID: "Nick Coghlan" wrote in message news:451E31ED.7030905 at gmail.com... >I suspect the problem would typically stem from floating point values that >are read in from a human-readable file rather than being the result of a >'calculation' as such: For such situations, one could create a translation dict for both common float values and for non-numeric missing value indicators. For instance, flotran = {'*': None, '1.0':1.0, '2.0':2.0, '4.0':4.0} The details, of course, depend on the specific case. tjr From Scott.Daniels at Acm.Org Sat Sep 30 23:13:42 2006 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sat, 30 Sep 2006 14:13:42 -0700 Subject: [Python-Dev] Tix not included in 2.5 for Windows In-Reply-To: References: Message-ID: Christos Georgiou wrote: > Does anyone know why this happens? I can't find any information pointing to > this being deliberate. > > I just upgraded to 2.5 on Windows (after making sure I can build extensions > with the freeware VC++ Toolkit 2003) and some of my programs stopped > operating. I saw in a French forum that someone else had the same problem, > and what they did was to copy the relevant files from a 2.4.3 installation. > I did the same, and it seems it works, with only a console message appearing > as soon as a root window is created: Also note: the OS X universal build seems to include a Tix runtime for the non-Intel processor, but not for the Intel processor. This makes me think there is a build problem. 
-- Scott David Daniels Scott.Daniels at Acm.Org From brett at python.org Sat Sep 30 23:26:57 2006 From: brett at python.org (Brett Cannon) Date: Sat, 30 Sep 2006 14:26:57 -0700 Subject: [Python-Dev] Possible semantic changes for PEP 352 in 2.6 Message-ID: I am working on PEP 352 stuff for 2.6 and there are two changes that I think should be made that are not explicitly laid out in the PEP. The first, and most dramatic, involves what is legal to list in an 'except' clause. Right now you can list *anything*. This means ``except 42`` is totally legal even though raising a number is not. Since I am deprecating catching string exceptions, I can go ahead and deprecate catching *any* object that is not a legitimate object to be raised. The second thing is changing PyErr_GivenExceptionMatches() to return 0 on false, 1 on true, and -1 on error. As of right now there is no defined error return value. While it could be suggested to check PyErr_Occurred() after every call, there is a way to have the return value reflect all possible outcomes, so I think this change should be made. Does anybody have objections to any of the changes I am proposing? -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060930/06950d01/attachment.html
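[Editorial illustration, not part of Brett's message.] The asymmetry described above, between what may be raised and what may be listed in an 'except' clause, can be sketched as follows. This is a minimal sketch that assumes a Python 3 interpreter, where listing a non-exception in an 'except' clause is an outright TypeError rather than the deprecation proposed for 2.6. The helper names try_raise and try_catch are hypothetical and exist only for this example.

```python
def try_raise(obj):
    """Attempt `raise obj` and return the type of whatever was actually raised."""
    try:
        raise obj
    except BaseException as exc:
        return type(exc)


def try_catch(clause):
    """Raise ValueError and try to catch it with `except clause`.

    Returns "caught" if the clause matched, otherwise the type of the
    exception that escaped the inner try statement.
    """
    try:
        try:
            raise ValueError("boom")
        except clause:  # the clause is only checked once an exception is in flight
            return "caught"
    except BaseException as exc:
        return type(exc)


# Raising a number is rejected (assumption: Python 3 semantics).
assert try_raise(42) is TypeError
# Listing a number in an except clause is rejected as well under Python 3.
assert try_catch(42) is TypeError
# A legitimate exception class still matches as usual.
assert try_catch(ValueError) == "caught"
# A legitimate but unrelated class simply does not match.
assert try_catch(LookupError) is ValueError
```

Note that the except clause is only evaluated when an exception actually reaches it, so a bogus clause goes unnoticed on the success path.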