From draghuram at gmail.com Fri Jun 1 00:10:08 2007 From: draghuram at gmail.com (Raghuram Devarakonda) Date: Thu, 31 May 2007 18:10:08 -0400 Subject: [Python-Dev] removing use of mimetools, multifile, and rfc822 In-Reply-To: References: Message-ID: <2c51ecee0705311510j6f7afaf9q2f2e917bb30d8fc4@mail.gmail.com> On 5/31/07, Barry Warsaw wrote: > > In other words this email is to hopefully inspire someone to remove > > the uses > > of rfc822, mimetools, and multifile from the stdlib so the > > DeprecationWarnings can finally go in. > > +1 for deprecating these. I don't have time to slog through the > stdlib and do the work, but I would be happy to help answer questions > about alternatives. I will give it a shot and will try to come up with a patch. Thanks, Raghu From brett at python.org Fri Jun 1 04:10:30 2007 From: brett at python.org (Brett Cannon) Date: Thu, 31 May 2007 19:10:30 -0700 Subject: [Python-Dev] failures in test_sqlite when entire test suite run Message-ID: I have been getting failures from test_sqlite off the trunk when I run the entire test suite (as ``./python.exe Lib/test/regrtest.py``) with this error on OS X 10.4.9 and sqlite3 3.3.16: Traceback (most recent call last): File "/Users/drifty/Dev/python/2.x/pristine/Lib/sqlite3/test/regression.py", line 29, in setUp self.con = sqlite.connect(":memory:") ProgrammingError: library routine called out of sequence When run in isolation it is fine. Anyone have a guess as to what is going on? -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20070531/880f45fd/attachment.html From bjourne at gmail.com Fri Jun 1 17:54:40 2007 From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=) Date: Fri, 1 Jun 2007 17:54:40 +0200 Subject: [Python-Dev] Minor ConfigParser Change In-Reply-To: <200705310045.58802.fdrake@acm.org> References: <46585729.2030305@gmail.com> <4E9372E6B2234D4F859320D896059A9508DE8B40B5@exchis.ccp.ad.local> <46588B22.3090808@gmail.com> <200705310045.58802.fdrake@acm.org> Message-ID: <740c3aec0706010854m426efa53s36a175923edda136@mail.gmail.com> Patches are applied once, but thousands of people read the code in the standard library each month. The standard library should be as readable as possible to make it as easy as possible to maintain. It is just good software development methodology. Many parts of the standard library are arcane and almost impossible to understand (see httplib for example) because refactoring changes are Not done. So if someone wants to improve the code why not let them? -- mvh Bj?rn From draghuram at gmail.com Fri Jun 1 18:45:39 2007 From: draghuram at gmail.com (Raghuram Devarakonda) Date: Fri, 1 Jun 2007 12:45:39 -0400 Subject: [Python-Dev] error in Misc/NEWS Message-ID: <2c51ecee0706010945n7f144a0fn6c49b03216c54570@mail.gmail.com> There is an entry in "Core and builtins" section of Misc/NEWS: "Bug #1722484: remove docstrings again when running with -OO.". The actual bug is 1722485. Incidentally, 1722484 appears to be spam. From tjreedy at udel.edu Fri Jun 1 20:19:46 2007 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 1 Jun 2007 14:19:46 -0400 Subject: [Python-Dev] error in Misc/NEWS References: <2c51ecee0706010945n7f144a0fn6c49b03216c54570@mail.gmail.com> Message-ID: "Raghuram Devarakonda" wrote in message news:2c51ecee0706010945n7f144a0fn6c49b03216c54570 at mail.gmail.com... 
| There is an entry in "Core and builtins" section of Misc/NEWS: | | "Bug #1722484: remove docstrings again when running with -OO.". | | The actual bug is 1722485. Incidentally, 1722484 appears to be spam. Sure enough. But it is another project -- and submitted anonymously. From g.brandl at gmx.net Fri Jun 1 21:20:38 2007 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 01 Jun 2007 21:20:38 +0200 Subject: [Python-Dev] error in Misc/NEWS In-Reply-To: <2c51ecee0706010945n7f144a0fn6c49b03216c54570@mail.gmail.com> References: <2c51ecee0706010945n7f144a0fn6c49b03216c54570@mail.gmail.com> Message-ID: Raghuram Devarakonda schrieb: > There is an entry in "Core and builtins" section of Misc/NEWS: > > "Bug #1722484: remove docstrings again when running with -OO.". > > The actual bug is 1722485. Incidentally, 1722484 appears to be spam. Fixed, thanks for spotting (you really read the commit logs thoroughly, don't you? ;) Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From fdrake at acm.org Fri Jun 1 23:08:45 2007 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri, 1 Jun 2007 17:08:45 -0400 Subject: [Python-Dev] Minor ConfigParser Change In-Reply-To: <740c3aec0706010854m426efa53s36a175923edda136@mail.gmail.com> References: <46585729.2030305@gmail.com> <200705310045.58802.fdrake@acm.org> <740c3aec0706010854m426efa53s36a175923edda136@mail.gmail.com> Message-ID: <200706011708.45883.fdrake@acm.org> On Friday 01 June 2007, BJörn Lindqvist wrote: > Patches are applied once, but thousands of people read the code in the > standard library each month. The standard library should be as > readable as possible to make it as easy as possible to maintain. It is > just good software development methodology. Rest assured, I understand your sentiment here, and am not personally against an occasional clean-up. ConfigParser in particular is old and highly idiosyncratic. > Many parts of the standard library are arcane and almost impossible to > understand (see httplib for example) because refactoring changes are > Not done. So if someone wants to improve the code why not let them? Changes in general are a source of risk; they have to be considered carefully. We've seen too many cases in which a change was thought to be safe, but broke something for someone. Avoiding style-only changes helps avoid introducing problems without being able to predict them; there are tests for ConfigParser, but it's hard to be sure every corner case has been covered. This is a general policy in the Python project, not simply my preference. I'd love to be able to say "yes, the code is painful to read, let's make it nicer", but it's hard to say that without being able to say "I'm sure it won't break anything for anybody." Python's too flexible for that to be easy. -Fred -- Fred L. Drake, Jr. 
From draghuram at gmail.com Sat Jun 2 01:00:31 2007 From: draghuram at gmail.com (Raghuram Devarakonda) Date: Fri, 1 Jun 2007 19:00:31 -0400 Subject: [Python-Dev] error in Misc/NEWS In-Reply-To: References: <2c51ecee0706010945n7f144a0fn6c49b03216c54570@mail.gmail.com> Message-ID: <2c51ecee0706011600r5920375ftadd9a3bf161167a2@mail.gmail.com> On 6/1/07, Georg Brandl wrote: > Raghuram Devarakonda schrieb: > > There is an entry in "Core and builtins" section of Misc/NEWS: > > > > "Bug #1722484: remove docstrings again when running with -OO.". > > > > The actual bug is 1722485. Incidentally, 1722484 appears to be spam. > > Fixed, thanks for spotting (you really read the commit logs thoroughly, > don't you? ;) I was just scanning the file for the comment related to my patch (my first one, btw) when I spotted this. From brett at python.org Sat Jun 2 05:08:02 2007 From: brett at python.org (Brett Cannon) Date: Fri, 1 Jun 2007 20:08:02 -0700 Subject: [Python-Dev] failures in test_sqlite when entire test suite run In-Reply-To: References: Message-ID: On 5/31/07, Brett Cannon wrote: > > I have been getting failures from test_sqlite off the trunk when I run the > entire test suite (as ``./python.exe Lib/test/regrtest.py``) with this error > on OS X 10.4.9 and sqlite3 3.3.16: > > Traceback (most recent call last): > File > "/Users/drifty/Dev/python/2.x/pristine/Lib/sqlite3/test/regression.py", line > 29, in setUp > self.con = sqlite.connect(":memory:") > ProgrammingError: library routine called out of sequence > > > When run in isolation it is fine. Anyone have a guess as to what is going > on? Nevermind. It has started to pass again for me. -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20070601/b6c4aeef/attachment.htm From status at bugs.python.org Sun Jun 3 02:00:56 2007 From: status at bugs.python.org (Tracker) Date: Sun, 3 Jun 2007 00:00:56 +0000 (UTC) Subject: [Python-Dev] Summary of Tracker Issues Message-ID: <20070603000056.126EE780B3@psf.upfronthosting.co.za> ACTIVITY SUMMARY (05/27/07 - 06/03/07) Tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue number. Do NOT respond to this message. 1649 open ( +0) / 8584 closed ( +0) / 10233 total ( +0) Average duration of open issues: 813 days. Median duration of open issues: 764 days. Open Issues Breakdown open 1649 ( +0) pending 0 ( +0) -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20070603/529c96ba/attachment.html From josepharmbruster at gmail.com Sun Jun 3 02:34:22 2007 From: josepharmbruster at gmail.com (Joseph Armbruster) Date: Sat, 02 Jun 2007 20:34:22 -0400 Subject: [Python-Dev] popen test case inquiry in r55735 using PCBuild8 Message-ID: <46620C8E.1@gmail.com> All, I wanted to pass this one around before opening an issue on it. When running the unit test for popen via rt.bat (in PCBuild8), I received the following error: === BEGIN ERROR === C:\Documents and Settings\joe\Desktop\Development\Python\trunk\PCbuild8>rt test_popen Deleting .pyc/.pyo files ... 
43 .pyc deleted, 0 .pyo deleted C:\Documents and Settings\joe\Desktop\Development\Python\trunk\PCbuild8>win32Release\python.exe -E -tt ../lib/test/regrtest.py te st_popen test_popen test test_popen failed -- Traceback (most recent call last): File "C:\Documents and Settings\joe\Desktop\Development\Python\trunk\lib\test\test_popen.py", line 31, in test_popen ["foo", "bar"] File "C:\Documents and Settings\joe\Desktop\Development\Python\trunk\lib\test\test_popen.py", line 24, in _do_test_commandline got = eval(data)[1:] # strip off argv[0] File "", line 0 ^ SyntaxError: unexpected EOF while parsing 1 test failed: test_popen === END ERROR === Only naturally, I looked into what was causing it and noticed the following: Line 23 of the test_popen.py appears to be returning '' and assigning this to data. data = os.popen(cmd).read() The problem with is, the next line (24) assumes the previous line will work and goes on to perform the following strip and assert: got = eval(data)[1:] # strip off argv[0] self.assertEqual(got, expected) So, in a perfect world, ['-c','foo','bar']\n is what data Should be. I put some quick debug statements after line 23 in test_popen.py to verify this and I observed the following: data= cmd= "C:\Documents and Settings\joe\Desktop\Development\Python\trunk\PCbuild8\win32Release\python.exe" -c "import sys;print sys.argv" foo bar Now, on to the 'interesting' part. From the command line, observe the following: C:\Documents and Settings\joe\Desktop\Development\Python\trunk\PCbuild8\win32release>python -c "import sys; print sys.argv" foo bar ['-c', 'foo', 'bar'] Outside of the popen call failing. I am wondering if an appropriate assert should be performed on the data object, prior to line 24. In addition, if you debug into the posixmodule, this is the scoop: 1. breakpoint set in posixmodule at the start of posix_popen 2. i run in debug 3. run the following: import os tmp = os.popen('"C:/Documents and Settings/joe/Desktop/Development/Python/trunk/PCbuild8/win32Release/python.exe" -c "import sys;print sys.argv" foo bar') 3. call enters posixmodule posix_popen and follows path: f = _PyPopen(cmdstring, tm | _O_TEXT, POPEN_1); 4. enters posixmodule: _PyPopen 5. enters posixmodule: _PyPopenCreateProcess 6. enters posixmodule linen 4920 where the CreateProcess is... s2 checks out as: "C:\WINDOWS\system32\cmd.exe /c "C:/Documents and Settings/joe/Desktop/Development/Python/trunk/PCbuild8/win32Release/python.exe" -c "import sys;print sys.argv" foo bar" this call returns nonzero, which means it "succeeded". see: [ http://msdn2.microsoft.com/en-us/library/ms682425.aspx ] On another note, I ran across CreateProcessW and am interested in questioning whether or not this has a place in posixmodule? Any on yet another note, when I ran test_popen.py straight from /lib (using my std::Python25 install, I obtained the following debug output in the same statement of interest) data=['-c', 'foo', 'bar'] cmd=c:\python25\python.exe -c "import sys;print sys.argv" foo bar Your thoughts ? Joseph Armbruster From talin at acm.org Sun Jun 3 21:07:24 2007 From: talin at acm.org (Talin) Date: Sun, 03 Jun 2007 12:07:24 -0700 Subject: [Python-Dev] Substantial rewrite of PEP 3101 Message-ID: <4663116C.8020201@acm.org> I've rewritten large portions of PEP 3101, incorporating some material from Patrick Maupin and Eric Smith, as well as rethinking the whole custom formatter design. 
Although it isn't showing up on the web site yet, you can view the copy in subversion (and the diffs) here: http://svn.python.org/view/peps/trunk/pep-3101.txt Please let me know of any errors you find, either by mailing me directly, or replying to the topic in Python-3000. (I.e. lets not start a thread here.) -- Talin From mhammond at skippinet.com.au Mon Jun 4 14:38:32 2007 From: mhammond at skippinet.com.au (Mark Hammond) Date: Mon, 4 Jun 2007 22:38:32 +1000 Subject: [Python-Dev] popen test case inquiry in r55735 using PCBuild8 In-Reply-To: <46620C8E.1@gmail.com> Message-ID: <082d01c7a6a5$436a19b0$1f0a0a0a@enfoldsystems.local> > All, > > I wanted to pass this one around before opening an issue on it. > When running the unit test for popen via rt.bat (in PCBuild8), > I received the following error: > > === BEGIN ERROR === > > C:\Documents and > Settings\joe\Desktop\Development\Python\trunk\PCbuild8>rt test_popen > Deleting .pyc/.pyo files ... > 43 .pyc deleted, 0 .pyo deleted > > C:\Documents and > Settings\joe\Desktop\Development\Python\trunk\PCbuild8>win32Re > lease\python.exe -E -tt ../lib/test/regrtest.py test_popen > test_popen > test test_popen failed -- Traceback (most recent call last): > File "C:\Documents and Settings\joe\Desktop\Development\Python\... I can't reproduce this. I expect you will find it is due to the space in the filename of your Python directory, via cmd.exe's documented behaviour with quote characters. A patch that allows the test suite to work in such an environment would be welcome, but I think you might end up needing access to GetShortPathName() rather than CreateProcess(). Cheers, Mark From josepharmbruster at gmail.com Mon Jun 4 15:09:44 2007 From: josepharmbruster at gmail.com (Joseph Armbruster) Date: Mon, 4 Jun 2007 09:09:44 -0400 Subject: [Python-Dev] popen test case inquiry in r55735 using PCBuild8 In-Reply-To: <082d01c7a6a5$436a19b0$1f0a0a0a@enfoldsystems.local> References: <46620C8E.1@gmail.com> <082d01c7a6a5$436a19b0$1f0a0a0a@enfoldsystems.local> Message-ID: <938f42d70706040609s5c35268sfd64fe0df167e241@mail.gmail.com> Mark, Sounds good, I will get patching tonight. Any thoughts on CreateProcessW ? Joseph Armbruster On 6/4/07, Mark Hammond wrote: > > > All, > > > > I wanted to pass this one around before opening an issue on it. > > When running the unit test for popen via rt.bat (in PCBuild8), > > I received the following error: > > > > === BEGIN ERROR === > > > > C:\Documents and > > Settings\joe\Desktop\Development\Python\trunk\PCbuild8>rt test_popen > > Deleting .pyc/.pyo files ... > > 43 .pyc deleted, 0 .pyo deleted > > > > C:\Documents and > > Settings\joe\Desktop\Development\Python\trunk\PCbuild8>win32Re > > lease\python.exe -E -tt ../lib/test/regrtest.py test_popen > > test_popen > > test test_popen failed -- Traceback (most recent call last): > > File "C:\Documents and Settings\joe\Desktop\Development\Python\... > > I can't reproduce this. I expect you will find it is due to the space in > the filename of your Python directory, via cmd.exe's documented behaviour > with quote characters. A patch that allows the test suite to work in such > an environment would be welcome, but I think you might end up needing > access > to GetShortPathName() rather than CreateProcess(). > > Cheers, > > Mark > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20070604/1f600622/attachment.htm From bjourne at gmail.com Mon Jun 4 21:32:14 2007 From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=) Date: Mon, 4 Jun 2007 21:32:14 +0200 Subject: [Python-Dev] What exception should Thread.start() raise? Message-ID: <740c3aec0706041232s1ab331ees1592ca1204d6c47a@mail.gmail.com> The threading module contains buggy code: class Thread(_Verbose): ... def start(self): assert self.__initialized, "Thread.__init__() not called" assert not self.__started, "thread already started" ... If you run such code with python -O, weird stuff may happen when you call mythread.start() multiple times. -O removes assert statements so the code won't fail with an AssertionError which would be expected. So what real exception should Thread.start() raise? I have suggested adding an IllegalStateError modelled after java's IllegalStateException, but that idea was rejected. So what exception should be raised here, is it a RuntimeError? -- mvh Bj?rn From steven.bethard at gmail.com Mon Jun 4 21:50:39 2007 From: steven.bethard at gmail.com (Steven Bethard) Date: Mon, 4 Jun 2007 13:50:39 -0600 Subject: [Python-Dev] What exception should Thread.start() raise? In-Reply-To: <740c3aec0706041232s1ab331ees1592ca1204d6c47a@mail.gmail.com> References: <740c3aec0706041232s1ab331ees1592ca1204d6c47a@mail.gmail.com> Message-ID: On 6/4/07, BJ?rn Lindqvist wrote: > The threading module contains buggy code: > > class Thread(_Verbose): > ... > def start(self): > assert self.__initialized, "Thread.__init__() not called" > assert not self.__started, "thread already started" > ... > > If you run such code with python -O, weird stuff may happen when you > call mythread.start() multiple times. -O removes assert statements so > the code won't fail with an AssertionError which would be expected. > > So what real exception should Thread.start() raise? I have suggested > adding an IllegalStateError modelled after java's > IllegalStateException, but that idea was rejected. So what exception > should be raised here, is it a RuntimeError? If you want to be fully backwards compatible, you could just write this like:: def start(self): if not self.__initialized: raise AssertionError("Thread.__init__() not called") if self.__started: raise AssertionError("thread already started") But I doubt anyone is actually catching the AssertionError, so changing the error type would probably be okay. STeVe -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy From guido at python.org Mon Jun 4 22:33:11 2007 From: guido at python.org (Guido van Rossum) Date: Mon, 4 Jun 2007 13:33:11 -0700 Subject: [Python-Dev] What exception should Thread.start() raise? In-Reply-To: References: <740c3aec0706041232s1ab331ees1592ca1204d6c47a@mail.gmail.com> Message-ID: On 6/4/07, Steven Bethard wrote: > On 6/4/07, BJ?rn Lindqvist wrote: > > The threading module contains buggy code: > > > > class Thread(_Verbose): > > ... > > def start(self): > > assert self.__initialized, "Thread.__init__() not called" > > assert not self.__started, "thread already started" > > ... > > > > If you run such code with python -O, weird stuff may happen when you > > call mythread.start() multiple times. -O removes assert statements so > > the code won't fail with an AssertionError which would be expected. > > > > So what real exception should Thread.start() raise? 
I have suggested > > adding an IllegalStateError modelled after java's > > IllegalStateException, but that idea was rejected. So what exception > > should be raised here, is it a RuntimeError? > > If you want to be fully backwards compatible, you could just write this like:: > > def start(self): > if not self.__initialized: > raise AssertionError("Thread.__init__() not called") > if self.__started: > raise AssertionError("thread already started") > > But I doubt anyone is actually catching the AssertionError, so > changing the error type would probably be okay. Anything that causes an "assert" to fail is technically using "undefined" behavior. I am in favor of changing this case to RuntimeError, which is the error Python usually uses for state problems. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jimjjewett at gmail.com Tue Jun 5 00:55:01 2007 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 4 Jun 2007 18:55:01 -0400 Subject: [Python-Dev] svn viewer confused Message-ID: Choosing a revision, such as http://svn.python.org/view/python/trunk/Objects/?rev=55606&sortby=date&view=log does not lead to the correct generated page; it either times out or generates a much older changelog. From tcdelaney at optusnet.com.au Tue Jun 5 14:03:23 2007 From: tcdelaney at optusnet.com.au (Tim Delaney) Date: Tue, 5 Jun 2007 22:03:23 +1000 Subject: [Python-Dev] Patch #1731330 - pysqlite_cache_display - missing Py_DECREF Message-ID: <008601c7a769$84a7c2a0$0201a8c0@mshome.net> I've added patch #1731330 to fix a missing Py_DECREF in pysqlite_cache_display. I've attached the diff to this email. I haven't actually been able to test this - haven't been able to get pysqlite compiled here on cygwin yet. I just noticed it when taking an example of using PyObject_Print ... Cheers, Tim Delaney -------------- next part -------------- A non-text attachment was scrubbed... Name: sqlite_cache.diff Type: application/octet-stream Size: 426 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20070605/bb5dc4fe/attachment.obj From bjourne at gmail.com Tue Jun 5 18:34:49 2007 From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=) Date: Tue, 5 Jun 2007 18:34:49 +0200 Subject: [Python-Dev] Minor ConfigParser Change In-Reply-To: <200706011708.45883.fdrake@acm.org> References: <46585729.2030305@gmail.com> <200705310045.58802.fdrake@acm.org> <740c3aec0706010854m426efa53s36a175923edda136@mail.gmail.com> <200706011708.45883.fdrake@acm.org> Message-ID: <740c3aec0706050934u6c53f4a5k3f6fda82cd1e6f72@mail.gmail.com> On 6/1/07, Fred L. Drake, Jr. wrote: > Changes in general are a source of risk; they have to be considered carefully. > We've seen too many cases in which a change was thought to be safe, but broke > something for someone. Avoiding style-only changes helps avoid introducing > problems without being able to predict them; there are tests for > ConfigParser, but it's hard to be sure every corner case has been covered. I understand what you mean, all changes carry a certain risk. Especially in code that is so widely relied upon as the Standard Library. But the alternative, which is to let the code rot, while one-line fixes are applied upon it, is a much worse alternative. It is true that unit tests does not cover all corner cases and that you can't be 100% sure that a change won't break something for someone. But on the other hand, the whole point with unit tests is to facilitate exactly these kind of changes. 
If something breaks then that is a great opportunity to introduce more tests. > This is a general policy in the Python project, not simply my preference. I'd > love to be able to say "yes, the code is painful to read, let's make it > nicer", but it's hard to say that without being able to say "I'm sure it > won't break anything for anybody." Python's too flexible for that to be > easy. While what you have stated is the policy, I can't help but think that it is totally misguided (no offense intended). Maybe the policy can be reevaluated? -- mvh Bj?rn From guido at python.org Tue Jun 5 23:25:55 2007 From: guido at python.org (Guido van Rossum) Date: Tue, 5 Jun 2007 14:25:55 -0700 Subject: [Python-Dev] Patch #1731330 - pysqlite_cache_display - missing Py_DECREF In-Reply-To: <008601c7a769$84a7c2a0$0201a8c0@mshome.net> References: <008601c7a769$84a7c2a0$0201a8c0@mshome.net> Message-ID: On 6/5/07, Tim Delaney wrote: > I've added patch #1731330 to fix a missing Py_DECREF in > pysqlite_cache_display. I've attached the diff to this email. > > I haven't actually been able to test this - haven't been able to get > pysqlite compiled here on cygwin yet. I just noticed it when taking an > example of using PyObject_Print ... Committed revision 55783. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tcdelaney at optusnet.com.au Wed Jun 6 01:31:26 2007 From: tcdelaney at optusnet.com.au (Tim Delaney) Date: Wed, 6 Jun 2007 09:31:26 +1000 Subject: [Python-Dev] Patch #1731330 - pysqlite_cache_display - missing Py_DECREF In-Reply-To: References: <008601c7a769$84a7c2a0$0201a8c0@mshome.net> Message-ID: <98985ab20706051631m7128b02apc4fb9daa810a6985@mail.gmail.com> On 06/06/07, Guido van Rossum wrote: > > On 6/5/07, Tim Delaney wrote: > > I've added patch #1731330 to fix a missing Py_DECREF in > > pysqlite_cache_display. I've attached the diff to this email. > > > > I haven't actually been able to test this - haven't been able to get > > pysqlite compiled here on cygwin yet. I just noticed it when taking an > > example of using PyObject_Print ... > > Committed revision 55783. Thanks. I've added a comment that it also needs to be applied to p3yk. I've done a quick seach for other places with the same code, on the off-chance that it was copied from elsewhere. Didn't turn up any other cases. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20070606/15d04766/attachment.html From josepharmbruster at gmail.com Wed Jun 6 03:09:56 2007 From: josepharmbruster at gmail.com (Joseph Armbruster) Date: Tue, 05 Jun 2007 21:09:56 -0400 Subject: [Python-Dev] popen test case inquiry in r55735 using PCBuild8 In-Reply-To: <938f42d70706040609s5c35268sfd64fe0df167e241@mail.gmail.com> References: <46620C8E.1@gmail.com> <082d01c7a6a5$436a19b0$1f0a0a0a@enfoldsystems.local> <938f42d70706040609s5c35268sfd64fe0df167e241@mail.gmail.com> Message-ID: <46660964.6090600@gmail.com> Mark, My apologies for being a day late, got working on some other things. So here's the scoop as it relates to the issue at hand: - If you run rt.bat from the trunk as-is and place it in a path that contains an empty space, you receive the error outlined in resultwithspace.txt. - If you run rt.bat from the trunk as-is and place it in a path that does not contain an empty space, you receive no errors as outlined in resultwithoutspace.txt. - If you run rt.bat with the patch, on Windows XP, you receive no errors as outlined in resultafterpatch.txt. 
The patch is attached. Probably my biggest question now is the use of GetVersion as opposed to GetVersionEx. According to the MSDN, it doesn't appear to be all that undesirable: http://msdn2.microsoft.com/en-us/library/ms724451.aspx Your thoughts? Joseph Armbruster Joseph Armbruster wrote: > Mark, > > Sounds good, I will get patching tonight. Any thoughts on CreateProcessW ? > > Joseph Armbruster > > On 6/4/07, *Mark Hammond* < mhammond at skippinet.com.au > > wrote: > > > All, > > > > I wanted to pass this one around before opening an issue on it. > > When running the unit test for popen via rt.bat (in PCBuild8), > > I received the following error: > > > > === BEGIN ERROR === > > > > C:\Documents and > > Settings\joe\Desktop\Development\Python\trunk\PCbuild8>rt test_popen > > Deleting .pyc/.pyo files ... > > 43 .pyc deleted, 0 .pyo deleted > > > > C:\Documents and > > Settings\joe\Desktop\Development\Python\trunk\PCbuild8>win32Re > > lease\python.exe -E -tt ../lib/test/regrtest.py test_popen > > test_popen > > test test_popen failed -- Traceback (most recent call last): > > File "C:\Documents and Settings\joe\Desktop\Development\Python\... > > I can't reproduce this. I expect you will find it is due to the > space in > the filename of your Python directory, via cmd.exe's documented > behaviour > with quote characters. A patch that allows the test suite to work > in such > an environment would be welcome, but I think you might end up > needing access > to GetShortPathName() rather than CreateProcess(). > > Cheers, > > Mark > > -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: popen.patch Url: http://mail.python.org/pipermail/python-dev/attachments/20070605/c95b09df/attachment.pot -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: resultafterpatch.txt Url: http://mail.python.org/pipermail/python-dev/attachments/20070605/c95b09df/attachment.txt -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: resultwithoutspace.txt Url: http://mail.python.org/pipermail/python-dev/attachments/20070605/c95b09df/attachment-0001.txt -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: resultwithspace.txt Url: http://mail.python.org/pipermail/python-dev/attachments/20070605/c95b09df/attachment-0002.txt From mhammond at skippinet.com.au Wed Jun 6 05:26:33 2007 From: mhammond at skippinet.com.au (Mark Hammond) Date: Wed, 6 Jun 2007 13:26:33 +1000 Subject: [Python-Dev] popen test case inquiry in r55735 using PCBuild8 In-Reply-To: <46660964.6090600@gmail.com> Message-ID: <09d201c7a7ea$7bc23720$1f0a0a0a@enfoldsystems.local> > My apologies for being a day late, got working on some other things. > So here's the scoop as it relates to the issue at hand: > > - If you run rt.bat from the trunk as-is and place it in a path that > contains an empty space, you receive the error outlined in > resultwithspace.txt. > > - If you run rt.bat from the trunk as-is and place it in a path that > does not contain an empty space, you receive no errors as outlined in > resultwithoutspace.txt. > > - If you run rt.bat with the patch, on Windows XP, you > receive no errors as outlined in resultafterpatch.txt. In that last step, you failed to indicate if the path had a space or not. ie, on Windows XP I get that behaviour now without needing to apply the patch. > The patch is attached. 
The vast majority of the patch is insignificant - it is either adding braces where they are not necessary, or changing whitespace inappropriately (the spaces you replaced are so the lines all line up regardless of the tab width.) It seems there is only one significant block in your patch, and its not clear to me what the intent of the patch is - I admit I didn't apply it and look at it in-place, but a couple of comments indicating exactly what you are trying to do would be good, especially as I'm not aware of this behaviour change from Win2K -> WinXP. > Probably my biggest question now is > the use of GetVersion as opposed to GetVersionEx. The existing code explicitly checks if it is the 9x or NT family, which your patch no longer does. It seems to me that Windows ME will also qualify - although in general the strcmp for command.com will succeed, if an alternative shell is installed on a ME box it will do the wrong thing. If you need to check anything more than the high-bit of GetVersion(), IMO it should be replaced with GetVersionEx(). Cheers, Mark From mhammond at skippinet.com.au Wed Jun 6 05:45:08 2007 From: mhammond at skippinet.com.au (Mark Hammond) Date: Wed, 6 Jun 2007 13:45:08 +1000 Subject: [Python-Dev] popen test case inquiry in r55735 using PCBuild8 In-Reply-To: <46660964.6090600@gmail.com> Message-ID: <09d901c7a7ed$162d3e20$1f0a0a0a@enfoldsystems.local> > My apologies for being a day late, got working on some other things. > So here's the scoop as it relates to the issue at hand: Something else I meant to mention: your problem is that the test suite fails in some circumstances, but these circumstances are not met for most core developers or when running the python test suite from the directory it is built in, but your proposed fix is a patch to os.popen(). There would also need to be new test cases added to demonstrate this bug in a "normal" test run. Cheers, Mark From bjourne at gmail.com Wed Jun 6 17:29:26 2007 From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=) Date: Wed, 6 Jun 2007 17:29:26 +0200 Subject: [Python-Dev] Patch reviews Message-ID: <740c3aec0706060829l7cfa5d87o14916a93f87bbe44@mail.gmail.com> Here is a review of some patches: * [ 1673759 ] '%G' string formatting doesn't catch same errors as '%g' This patch is done, has been reviewed and works as advertised. Just needs someone to commit it I think. * [ 1100942 ] datetime.strptime constructor added Doesn't apply cleanly, emits compiler warnings, but works as expected. Lacks tests. * [ 968063 ] Add fileinput.islastline() Good and useful patch (see the pup.py program) but lacks unit tests and documentation. * [ 1501979 ] syntax errors on continuation lines Doesn't apply cleanly, but that is easy to fix. Needs someone to fix a few minor flaws in it, but the patch works really well. * [ 1375011 ] Improper handling of duplicate cookies Fixes a fairly exotic bug in which Pythons cookie handling deviates in an obscure way from Netscapes cookie specification. See the bug about it at 1372650. As far as I can understand, the patch fixes the problem. If someone still does the 5 for 1 deal, my patch is [ 1676135 ]. 
-- mvh Bj?rn From facundo at taniquetil.com.ar Wed Jun 6 21:19:36 2007 From: facundo at taniquetil.com.ar (Facundo Batista) Date: Wed, 6 Jun 2007 19:19:36 +0000 (UTC) Subject: [Python-Dev] Timeout in urllib2.urlopen Message-ID: facundo at expiron:~/devel/reps/python/trunk$ ./python Python 2.6a0 (trunk, Jun 6 2007, 12:32:23) [GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import urllib2 >>> u = urllib2.urlopen("http://www.taniquetil.com.ar/plog") >>> u.headers.items() [..., ('content-type', 'text/html;charset=iso-8859-1'), ...] >>> >>> u = urllib2.urlopen("http://www.taniquetil.com.ar/plog", timeout=3) Traceback (most recent call last): ... urllib2.URLError: >>> Ok, my blog is a bit slow, but the timeout is working ok, :D Regards, -- . Facundo . Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From theller at ctypes.org Wed Jun 6 21:34:42 2007 From: theller at ctypes.org (Thomas Heller) Date: Wed, 06 Jun 2007 21:34:42 +0200 Subject: [Python-Dev] Windows buildbot (Was: buildbot failure in x86 W2k trunk) In-Reply-To: References: <20070520071645.BA1C01E4004@bag.python.org> <464FFFDC.4020600@v.loewis.de> <46547C7F.7040908@activestate.com> <4654C280.1080802@v.loewis.de> Message-ID: Thomas Heller schrieb: > Thomas Heller schrieb: >>>> Are there others that can provide a Windows buildbot? It would probably >>>> be good to have two -- and a WinXP one would be good. >>> >>> It certainly would be good. Unfortunately, Windows users are not that >>> much engaged in the open source culture, so few of them volunteer >>> (plus it's more painful, with Windows not being a true multi-user >>> system). >> >> I'll try to setup a buildbot under WinXP. > > The buildbot is now working. Should I try to setup another buildbot client for win32/AMD64? Thomas From martin at v.loewis.de Wed Jun 6 22:55:40 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 06 Jun 2007 22:55:40 +0200 Subject: [Python-Dev] Windows buildbot (Was: buildbot failure in x86 W2k trunk) In-Reply-To: References: <20070520071645.BA1C01E4004@bag.python.org> <464FFFDC.4020600@v.loewis.de> <46547C7F.7040908@activestate.com> <4654C280.1080802@v.loewis.de> Message-ID: <46671F4C.1000601@v.loewis.de> > Should I try to setup another buildbot client for win32/AMD64? We don't have a Win64 buildbot yet. Depending on whether you plan to use PCbuild or PCbuild8, this might be a challenge to get working (I think it's a challenge either way, but different). If it could actually work, it might be useful. Regards, Martin From armin.ronacher at active-4.com Thu Jun 7 15:18:43 2007 From: armin.ronacher at active-4.com (Armin Ronacher) Date: Thu, 7 Jun 2007 13:18:43 +0000 (UTC) Subject: [Python-Dev] NaN / Infinity in Python Message-ID: Hi, It's one of those "non issues" but there are still some situations where you have to deal with Infinity and NaN, even in Python. Basically one the problems is that the repr of floating point numbers is platform depending and sometimes yields "nan" which is not evaluable. It's true that eval() is probably a bad thing but there are some libraries that use repr/%r for code generation and it could happen that they produce erroneous code because of that. Also there is no way to get the Infinity and NaN values and also no way to test if they exist. 
Maybe changing the repr of `nan` to `math.NaN` and `inf` to `math.Infinity` as well as `-inf` to `(-math.Infinity)` and add that code to the math module (of course as a C implementation, there are even macros for testing for NaN values):: Infinity = 1e10000 NaN = Infinity / Infinity def is_nan(x): return type(x) is float and x != x def is_finite(x): return x != Infinity Bugs related to this issue: - http://bugs.python.org/1732212 [repr of 'nan' floats not parseable] - http://bugs.python.org/1481296 [long(float('nan'))!=0L] Regards, Armin From jcarlson at uci.edu Thu Jun 7 20:46:57 2007 From: jcarlson at uci.edu (Josiah Carlson) Date: Thu, 07 Jun 2007 11:46:57 -0700 Subject: [Python-Dev] NaN / Infinity in Python In-Reply-To: References: Message-ID: <20070607113552.6F4D.JCARLSON@uci.edu> Armin Ronacher wrote: > It's one of those "non issues" but there are still some situations where you > have to deal with Infinity and NaN, even in Python. Basically one the problems > is that the repr of floating point numbers is platform depending and sometimes > yields "nan" which is not evaluable. It's true that eval() is probably a bad > thing but there are some libraries that use repr/%r for code generation and it > could happen that they produce erroneous code because of that. Also there is no > way to get the Infinity and NaN values and also no way to test if they exist. > > Maybe changing the repr of `nan` to `math.NaN` and `inf` to `math.Infinity` as > well as `-inf` to `(-math.Infinity)` and add that code to the math module (of > course as a C implementation, there are even macros for testing for NaN values):: That would work for eval(repr(x)), but it fails for float(repr(x)). I believe this particular issue has been brought up before, as well as the particular solution, but I can't remember the final outcome. > Infinity = 1e10000 Has the storage of infinity in .pyc files been fixed? For a while it was broken. > NaN = Infinity / Infinity > > def is_nan(x): > return type(x) is float and x != x > > def is_finite(x): > return x != Infinity you mean def is_finite(x): return x not in (Infinity, -Infinity) - Josiah From theller at ctypes.org Fri Jun 8 15:45:47 2007 From: theller at ctypes.org (Thomas Heller) Date: Fri, 08 Jun 2007 15:45:47 +0200 Subject: [Python-Dev] Windows buildbot (Was: buildbot failure in x86 W2k trunk) In-Reply-To: <46671F4C.1000601@v.loewis.de> References: <20070520071645.BA1C01E4004@bag.python.org> <464FFFDC.4020600@v.loewis.de> <46547C7F.7040908@activestate.com> <4654C280.1080802@v.loewis.de> <46671F4C.1000601@v.loewis.de> Message-ID: Martin v. L?wis schrieb: >> Should I try to setup another buildbot client for win32/AMD64? > > We don't have a Win64 buildbot yet. Depending on whether you plan > to use PCbuild or PCbuild8, this might be a challenge to get working > (I think it's a challenge either way, but different). > If it could actually work, it might be useful. For release25-maint, probably PCBuild should be used since that is used to create the installer. For trunk/Python 2.6 I don't know what you will use for the release version. Where do you think is the challange to get it working? For the buildbot client itself I would use the same stuff as in the WinXP buildbot (32-bit python2.4, twisted, buildbot, pywin32). For the build scripts in Tools\buildbot I made some small changes to the batch files (appended). They look for a PROCESSOR_ARCHITECTURE env var, and if this is equal to AMD64 the build target and one or two small other things are changed from the default. 
So these scripts currently build the PCBuild process. I can run Tools\buildbot\build.bat, then Tools\buildbot\test.bat, and Tools\buildbot\clean.bat successfully. Of course this does not mean that *everything* is built correctly - currently _sqlite3, bz2, _tkinter, and _ssl are not build because of several reasons. If you want me to go online with the buildbot please send me a HOST:PORT and PASSWORD. Thomas Index: Tools/buildbot/build.bat =================================================================== --- Tools/buildbot/build.bat (revision 55792) +++ Tools/buildbot/build.bat (working copy) @@ -2,4 +2,6 @@ cmd /c Tools\buildbot\external.bat call "%VS71COMNTOOLS%vsvars32.bat" cmd /q/c Tools\buildbot\kill_python.bat -devenv.com /useenv /build Debug PCbuild\pcbuild.sln +set TARGET=Debug +if "%PROCESSOR_ARCHITECTURE%" == "AMD64" set _TARGET=ReleaseAMD64 +devenv.com /useenv /build %_TARGET% PCbuild\pcbuild.sln Index: Tools/buildbot/test.bat =================================================================== --- Tools/buildbot/test.bat (revision 55792) +++ Tools/buildbot/test.bat (working copy) @@ -1,3 +1,5 @@ @rem Used by the buildbot "test" step. cd PCbuild -call rt.bat -d -q -uall -rw +set _DEBUG=-d +if "%PROCESSOR_ARCHITECTURE%"=="AMD64" set _DEBUG= +call rt.bat %_DEBUG% -q -uall -rw Index: Tools/buildbot/clean.bat =================================================================== --- Tools/buildbot/clean.bat (revision 55792) +++ Tools/buildbot/clean.bat (working copy) @@ -2,5 +2,9 @@ call "%VS71COMNTOOLS%vsvars32.bat" cd PCbuild @echo Deleting .pyc/.pyo files ... -python_d.exe rmpyc.py -devenv.com /clean Debug pcbuild.sln +set _PYTHON=python_d.exe +if "%PROCESSOR_ARCHITECTURE%"=="AMD64" set _PYTHON=python.exe +%_PYTHON% rmpyc.py +set TARGET=Debug +if "%PROCESSOR_ARCHITECTURE%" == "AMD64" set TARGET=ReleaseAMD64 +devenv.com /clean %TARGET% pcbuild.sln From martin at v.loewis.de Fri Jun 8 21:26:28 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 08 Jun 2007 21:26:28 +0200 Subject: [Python-Dev] Windows buildbot (Was: buildbot failure in x86 W2k trunk) In-Reply-To: References: <20070520071645.BA1C01E4004@bag.python.org> <464FFFDC.4020600@v.loewis.de> <46547C7F.7040908@activestate.com> <4654C280.1080802@v.loewis.de> <46671F4C.1000601@v.loewis.de> Message-ID: <4669AD64.3010302@v.loewis.de> > For release25-maint, probably PCBuild should be used since that is used to create the installer. > For trunk/Python 2.6 I don't know what you will use for the release version. I actually don't know either, yet. I would like to use Orcas, but it's not clear when this will be released; neither is clear when 2.6 will be released. I notice that Kristjan would like to see a PCbuild8 buildbot. > Where do you think is the challange to get it working? For the buildbot client itself > I would use the same stuff as in the WinXP buildbot (32-bit python2.4, twisted, buildbot, > pywin32). I think the scripts in Tools\buildbot might be tricky, along with possible changes to the master. > For the build scripts in Tools\buildbot I made some small changes to the batch files (appended). > They look for a PROCESSOR_ARCHITECTURE env var, and if this is equal to AMD64 the build target > and one or two small other things are changed from the default. So these scripts currently build > the PCBuild process. That's an interesting option. I had myself arranged for the master to issue a different build command to an AMD64 build slave. 
> If you want me to go online with the buildbot please send me a HOST:PORT and PASSWORD. Doing so in a separate message. Regards, Martin From lance.ellinghaus at eds.com Sat Jun 9 00:43:16 2007 From: lance.ellinghaus at eds.com (Ellinghaus, Lance) Date: Fri, 8 Jun 2007 18:43:16 -0400 Subject: [Python-Dev] Compiling 2.5.1 under Studio 11 Message-ID: <752A61D5C34D41478E638FC92AF9051B0101B61E@usahm207.amer.corp.eds.com> Hello, I am having a couple of issues compiling Python 2.5.1 under Sun Solaris Studio 11 on Solaris 8. Everything compiles correctly except the _ctypes module because it cannot use the libffi that comes with Python and it does not exist on the system. Has anyone gotten it to compile correctly using Studio 11? I know it will compile under GCC but I am not allowed to use GCC. Also, during the pyexpat tests, Python generates a segfault. Are there any patches to fix these? Thank you, Lance -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20070608/62c3c0ca/attachment.html From eyal.lotem at gmail.com Sat Jun 9 05:49:57 2007 From: eyal.lotem at gmail.com (Eyal Lotem) Date: Sat, 9 Jun 2007 06:49:57 +0300 Subject: [Python-Dev] cProfile with generator throwing Message-ID: Hi. It seems that cProfile does not support throwing exceptions into generators properly, when an external timer routine is used. The problem is that _lsprof.c: ptrace_enter_call assumes that there are no exceptions set when it is called, which is not true when the generator frame is being gen_send_ex'd to send an exception into it (Maybe you could say that only CallExternalTimer assumes this, but I am not sure). This assumption causes its eventual call to CallExternalTimer to discover that an error is set and assume that it was caused by its own work (which it wasn't). I am not sure what the right way to fix this is, so I cannot send a patch. Here is a minimalist example to reproduce the bug: >>> import cProfile >>> import time >>> p=cProfile.Profile(time.clock) >>> def f(): ... yield 1 ... >>> p.run("f().throw(Exception())") Exception exceptions.Exception: Exception() in ignored Traceback (most recent call last): File "", line 1, in File "/usr/lib/python2.5/cProfile.py", line 135, in run return self.runctx(cmd, dict, dict) File "/usr/lib/python2.5/cProfile.py", line 140, in runctx exec cmd in globals, locals File "", line 1, in File "", line 1, in f SystemError: error return without exception set From g.brandl at gmx.net Sat Jun 9 09:42:08 2007 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 09 Jun 2007 09:42:08 +0200 Subject: [Python-Dev] cProfile with generator throwing In-Reply-To: References: Message-ID: Eyal Lotem schrieb: > Hi. It seems that cProfile does not support throwing exceptions into > generators properly, when an external timer routine is used. > > The problem is that _lsprof.c: ptrace_enter_call assumes that there > are no exceptions set when it is called, which is not true when the > generator frame is being gen_send_ex'd to send an exception into it > (Maybe you could say that only CallExternalTimer assumes this, but I > am not sure). This assumption causes its eventual call to > CallExternalTimer to discover that an error is set and assume that it > was caused by its own work (which it wasn't). > > I am not sure what the right way to fix this is, so I cannot send a patch. 
> Here is a minimalist example to reproduce the bug: > >>>> import cProfile >>>> import time >>>> p=cProfile.Profile(time.clock) >>>> def f(): > ... yield 1 > ... >>>> p.run("f().throw(Exception())") > Exception exceptions.Exception: Exception() in object at 0xb7f5a304> ignored > Traceback (most recent call last): > File "", line 1, in > File "/usr/lib/python2.5/cProfile.py", line 135, in run > return self.runctx(cmd, dict, dict) > File "/usr/lib/python2.5/cProfile.py", line 140, in runctx > exec cmd in globals, locals > File "", line 1, in > File "", line 1, in f > SystemError: error return without exception set There might be a similar problem with trace functions, see bug #1733757 which is quite obscure too. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From martin at v.loewis.de Sat Jun 9 10:41:22 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 09 Jun 2007 10:41:22 +0200 Subject: [Python-Dev] Compiling 2.5.1 under Studio 11 In-Reply-To: <752A61D5C34D41478E638FC92AF9051B0101B61E@usahm207.amer.corp.eds.com> References: <752A61D5C34D41478E638FC92AF9051B0101B61E@usahm207.amer.corp.eds.com> Message-ID: <466A67B2.6030505@v.loewis.de> > I am having a couple of issues compiling Python 2.5.1 under Sun Solaris > Studio 11 on Solaris 8. > > Everything compiles correctly except the _ctypes module because it > cannot use the libffi that comes with Python and it does not exist on > the system. > > Has anyone gotten it to compile correctly using Studio 11? This is not a question for python-dev; please ask it on comp.lang.python. In any case, what processor are you using? I have compiled Python successfully with Sun C 5.8. > Also, during the pyexpat tests, Python generates a segfault. > > Are there any patches to fix these? Without knowing what precisely the problem is, it is difficult to say whether it has been fixed. Regards, Martin From snaury at gmail.com Sat Jun 9 22:23:20 2007 From: snaury at gmail.com (Alexey Borzenkov) Date: Sun, 10 Jun 2007 00:23:20 +0400 Subject: [Python-Dev] zipfile and unicode filenames Message-ID: Hi everyone, Today I've stumbled upon a bug in my program that wasn't very straightforward to understand. The problem is that I was passing unicode filenames to zipfile.ZipFile.write and I had sys.setdefaultencoding() in effect, which resulted in a situation where most of the bytes generated in zipfile.ZipInfo.FileHeader would pass thru, except for a few, which caused codec error on another machine (where filenames got infectiously upgraded to unicode). The problem here is that it was absolutely unclear at first that I get unicode filenames passed to write, and it incorrectly accepted them silently. Is it worth to submit a bug report on this? The desired behavior here would be to either a) disallow unicode strings as arcname are raise an exception (since it is used in concatenation with raw data it is likely to cause problems because of auto upgrading raw data to unicode), or b) silently encode unicode strings to raw strings (something like if isinstance(filename, unicode): filename = filename.encode() in zipfile.ZipInfo constructor). So, should I submit a bug report, and which behavior would be actually correct? 
From eyal.lotem at gmail.com Sat Jun 9 23:23:41 2007 From: eyal.lotem at gmail.com (Eyal Lotem) Date: Sun, 10 Jun 2007 00:23:41 +0300 Subject: [Python-Dev] Instance variable access and descriptors Message-ID: Hi. I was surprised to find in my profiling that instance variable access was pretty slow. I looked through the CPython code involved, and discovered something that really surprises me. Python, probably through the valid assumption that most attribute lookups go to the class, tries to look for the attribute in the class first, and in the instance, second. What Python currently does is quite peculiar! Here's a short description o PyObject_GenericGetAttr: A. Python looks for a descriptor in the _entire_ mro hierarchy (len(mro) class/type check and dict lookups). B. If Python found a descriptor and it has both get and set functions - it uses it to get the value and returns, skipping the next stage. C. If Python either did not find a descriptor, or found one that has no setter, it will try a lookup in the instance dict. D. If Python failed to find it in the instance, it will use the descriptor's getter, and if it has no getter it will use the descriptor itself. I believe the average costs of A are much higher than of C. Because there is just 1 instance dict to look through, and it is also typically smaller than the class dicts (in rare cases of worse-case timings of hash lookups), while there are len(mro) dicts to look for a descriptor in. This means that for simple instance variable lookups, Python is paying the full mro lookup price! I believe that this should be changed, so that Python first looks for the attribute in the instance's dict and only then through the dict's mro. This will have the following effects: A. It will break code that uses instance.__dict__['var'] directly, when 'var' exists as a property with a __set__ in the class. I believe this is not significant. B. It will simplify getattr's semantics. Python should _always_ give precedence to instance attributes over class ones, rather than have very weird special-cases (such as a property with a __set__). C. It will greatly speed up instance variable access, especially when the class has a large mro. I think obviously the code breakage is the worst problem. This could probably be addressed by a transition version in which Python warns about any instance attributes that existed in the mro as descriptors as well. What do you think? From steven.bethard at gmail.com Sat Jun 9 23:51:57 2007 From: steven.bethard at gmail.com (Steven Bethard) Date: Sat, 9 Jun 2007 15:51:57 -0600 Subject: [Python-Dev] Instance variable access and descriptors In-Reply-To: References: Message-ID: On 6/9/07, Eyal Lotem wrote: > I believe that this should be changed, so that Python first looks for > the attribute in the instance's dict and only then through the dict's > mro. [snip] > What do you think? Are you suggesting that the following code should print "43" instead of "42"? :: >>> class C(object): ... x = property(lambda self: self._x) ... def __init__(self): ... self._x = 42 ... >>> c = C() >>> c.__dict__['x'] = 43 >>> c.x 42 If so, this is a pretty substantial backwards incompatibility, and you should probably post this to python-ideas first to hash things out. If people like it there, the right target is probably Python 3000, not Python 2.x. STeVe -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. 
--- Bucky Katt, Get Fuzzy From steven.bethard at gmail.com Sun Jun 10 00:18:37 2007 From: steven.bethard at gmail.com (Steven Bethard) Date: Sat, 9 Jun 2007 16:18:37 -0600 Subject: [Python-Dev] Instance variable access and descriptors In-Reply-To: References: Message-ID: > On 6/10/07, Steven Bethard wrote: > > On 6/9/07, Eyal Lotem wrote: > > > I believe that this should be changed, so that Python first looks for > > > the attribute in the instance's dict and only then through the dict's > > > mro. > > > > Are you suggesting that the following code should print "43" instead of "42"? > > :: > > > > >>> class C(object): > > ... x = property(lambda self: self._x) > > ... def __init__(self): > > ... self._x = 42 > > ... > > >>> c = C() > > >>> c.__dict__['x'] = 43 > > >>> c.x > > 42 On 6/9/07, Eyal Lotem wrote: > Yes, I do suggest that. > But its important to notice that this is not a suggestion in order to > improve Python, but one that makes it possible to get reasonable > performance out of CPython. As such, I don't believe it should be done > in Py3K. > > Firstly, like everything that breaks backwards compatibility, it is > possible to have a transitional version that spits warnings for all > problems (detect name clashes between properties and instance dict). Sure, but then you're talking about really introducing this in Python 2.7, with 2.6 as a transitional version. So take a minute to look at the release timelines: http://www.python.org/dev/peps/pep-0361/ The initial 2.6 target is for April 2008. http://www.python.org/dev/peps/pep-3000/ I hope to have a first alpha release (3.0a1) out in the first half of 2007; it should take no more than a year from then before the first proper release, named Python 3.0 So I'm expecting Python 3.0 to come out *before* 2.7. Thus if you're proposing a backwards-incompatible change that would have to wait until 2.7 anyway, why not propose it for 3.0 where backwards-incompatible changes are more acceptable? STeVe -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy From pje at telecommunity.com Sun Jun 10 01:30:41 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat, 09 Jun 2007 19:30:41 -0400 Subject: [Python-Dev] Instance variable access and descriptors In-Reply-To: References: Message-ID: <20070609232842.005993A4060@sparrow.telecommunity.com> At 12:23 AM 6/10/2007 +0300, Eyal Lotem wrote: >A. It will break code that uses instance.__dict__['var'] directly, >when 'var' exists as a property with a __set__ in the class. I believe >this is not significant. >B. It will simplify getattr's semantics. Python should _always_ give >precedence to instance attributes over class ones, rather than have >very weird special-cases (such as a property with a __set__). Actually, these are features that are both used and desirable; I've been using them both since Python 2.2 (i.e., for many years now). I'm -1 on removing these features from any version of Python, even 3.0. >C. It will greatly speed up instance variable access, especially when >the class has a large mro. ...at the cost of slowing down access to properties and __slots__, by adding an *extra* dictionary lookup there. Note, by the way, that if you want to change attribute lookup semantics, you can always override __getattribute__ and make it work whatever way you like, without forcing everybody else to change *their* code. 
From status at bugs.python.org Sun Jun 10 02:00:44 2007 From: status at bugs.python.org (Tracker) Date: Sun, 10 Jun 2007 00:00:44 +0000 (UTC) Subject: [Python-Dev] Summary of Tracker Issues Message-ID: <20070610000044.83913780EA@psf.upfronthosting.co.za> ACTIVITY SUMMARY (06/03/07 - 06/10/07) Tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue number. Do NOT respond to this message. 1645 open ( +0) / 8584 closed ( +0) / 10229 total ( +0) Average duration of open issues: 822 days. Median duration of open issues: 770 days. Open Issues Breakdown open 1645 ( +0) pending 0 ( +0) Issues Now Closed (1) _____________________ New issue test for email 87 days http://bugs.python.org/issue1001 admin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20070610/4a68494b/attachment.html From bioinformed at gmail.com Sun Jun 10 02:28:31 2007 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Sat, 9 Jun 2007 20:28:31 -0400 Subject: [Python-Dev] Instance variable access and descriptors In-Reply-To: <20070609232842.005993A4060@sparrow.telecommunity.com> References: <20070609232842.005993A4060@sparrow.telecommunity.com> Message-ID: <2e1434c10706091728w593f4016p5f09ee49be97e80d@mail.gmail.com> I agree with Phillip with regard to the semantics. They are semantically desirable. However, there is a patch to add a mro cache to speed up these sorts of cases on the Python tracker, originally submitted by Armin Rigo. He saw ~20% speedups, others see less. It is currently just sitting there with no apparent activity. So if the overhead of mro lookups is that bothersome, it may be well worth your time to review the patch. URL: http://sourceforge.net/tracker/index.php?func=detail&aid=1700288&group_id=5470&atid=305470 -Kevin On 6/9/07, Phillip J. Eby wrote: > > At 12:23 AM 6/10/2007 +0300, Eyal Lotem wrote: > >A. It will break code that uses instance.__dict__['var'] directly, > >when 'var' exists as a property with a __set__ in the class. I believe > >this is not significant. > >B. It will simplify getattr's semantics. Python should _always_ give > >precedence to instance attributes over class ones, rather than have > >very weird special-cases (such as a property with a __set__). > > Actually, these are features that are both used and desirable; I've > been using them both since Python 2.2 (i.e., for many years > now). I'm -1 on removing these features from any version of Python, even > 3.0. > > > >C. It will greatly speed up instance variable access, especially when > >the class has a large mro. > > ...at the cost of slowing down access to properties and __slots__, by > adding an *extra* dictionary lookup there. > > Note, by the way, that if you want to change attribute lookup > semantics, you can always override __getattribute__ and make it work > whatever way you like, without forcing everybody else to change *their* > code. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/jacobs%40bioinformed.com > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20070609/154fe92e/attachment.htm From eyal.lotem at gmail.com Sun Jun 10 02:59:04 2007 From: eyal.lotem at gmail.com (Eyal Lotem) Date: Sun, 10 Jun 2007 03:59:04 +0300 Subject: [Python-Dev] Frame zombies Message-ID: I was just looking through the code that handles frames (as part of my current effort to determine how to improve on CPython's performance), when I noticed the freelist/zombie mechanism for frame allocation handling. While the zombie mechanism seems like a nice optimization, I believe there can be a couple of improvements. Currently, each code object has a zombie frame that last executed it. This zombie is reused when that code object is re-executed in a frame. When a frame is released, it is reassigned as the zombie of the code, and iff the code object already has a zombie assigned to it, it places the frame in the freelist. If I understand correctly, this means, that the "freelist" is actually only ever used for recursive-call frames that were released. It also means that these released frames will be reassigned to other code objects in the future - in which case they will be reallocated, perhaps unnecessarily. "freelist" is just temporary storage for released recursive calls. A program with no recursions will always have an empty freelist. What is bounding the memory consumption of this mechanism is a limit on the number of frames in the freelist (and the fact that there is a limited number of code objects, each of which may have an additional zombie frame). I believe a better way to implement this mechanism: A. Replace the co_zombie frame with a co_zombie_freelist. B. Remove the global freelist. In other words, have a free list for each code object, rather than one-per-code-object and a freelist. This can be memory-bound by limiting the freelist size in each code object. This can be use a bit more memory if a recursion is called just once - and then discarded (waste for example 10 frames instead of 1), but can save a lot of realloc calls when there is more than one recursion used in the same program. It is also possible to substantially increase the number of frames stored per code-object, and then use some kind of more sophisticated aging mechanism on the zombie freelists to periodically get rid of unused freelists. That kind of mechanism would mean that even in the case of recursive calls, virtually all frame allocs are available from the freelist. I also believe this to be somewhat simpler, as there is only one concept (a zombie freelist) rather than 2 (a zombie code object and a freelist for recursive calls), and frames are never realloc'd, but only allocated. Should I make a patch? From eyal.lotem at gmail.com Sun Jun 10 03:13:38 2007 From: eyal.lotem at gmail.com (Eyal Lotem) Date: Sun, 10 Jun 2007 04:13:38 +0300 Subject: [Python-Dev] Instance variable access and descriptors In-Reply-To: <2e1434c10706091728w593f4016p5f09ee49be97e80d@mail.gmail.com> References: <20070609232842.005993A4060@sparrow.telecommunity.com> <2e1434c10706091728w593f4016p5f09ee49be97e80d@mail.gmail.com> Message-ID: I must be missing something, as I really see no reason to keep the existing semantics other than backwards compatibility (which can be achieved by introducing a __fastattr__ or such). Can you explain under which situations or find any example situation where the existing semantics are desirable? 
As for the mro cache - thanks for pointing it out - I think it can serve as a platform for another idea that in conjunction with psyco, can possibly speed up CPython very significantly (will create a thread about this soon). Please note that speeding up the mro-lookup solves only half of the problem (if it was solved - which it seems not to have been), the more important half of the problem remains, allow me to emphasize: ALL instance attribute accesses look up in both instance and class dicts, when it could look just in the instance dict. This is made worse by the fact that the class dict lookup is more expensive (with or without the mro cache). Some code that accesses a lot of instance attributes in an inner loop can easily be sped up by a factor of 2 or more (depending on the size of the mro). Eyal On 6/10/07, Kevin Jacobs wrote: > I agree with Phillip with regard to the semantics. They are semantically > desirable. However, there is a patch to add a mro cache to speed up these > sorts of cases on the Python tracker, originally submitted by Armin Rigo. > He saw ~20% speedups, others see less. It is currently just sitting there > with no apparent activity. So if the overhead of mro lookups is that > bothersome, it may be well worth your time to review the patch. > > URL: > http://sourceforge.net/tracker/index.php?func=detail&aid=1700288&group_id=5470&atid=305470 > > -Kevin > > > > On 6/9/07, Phillip J. Eby wrote: > > > > At 12:23 AM 6/10/2007 +0300, Eyal Lotem wrote: > > >A. It will break code that uses instance.__dict__['var'] directly, > > >when 'var' exists as a property with a __set__ in the class. I believe > > >this is not significant. > > >B. It will simplify getattr's semantics. Python should _always_ give > > >precedence to instance attributes over class ones, rather than have > > >very weird special-cases (such as a property with a __set__). > > > > Actually, these are features that are both used and desirable; I've > > been using them both since Python 2.2 (i.e., for many years > > now). I'm -1 on removing these features from any version of Python, even > 3.0. > > > > > > >C. It will greatly speed up instance variable access, especially when > > >the class has a large mro. > > > > ...at the cost of slowing down access to properties and __slots__, by > > adding an *extra* dictionary lookup there. > > > > Note, by the way, that if you want to change attribute lookup > > semantics, you can always override __getattribute__ and make it work > > whatever way you like, without forcing everybody else to change *their* > code. > > > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > http://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/jacobs%40bioinformed.com > > > > From eyal.lotem at gmail.com Sun Jun 10 03:14:18 2007 From: eyal.lotem at gmail.com (Eyal Lotem) Date: Sun, 10 Jun 2007 04:14:18 +0300 Subject: [Python-Dev] Fwd: Instance variable access and descriptors In-Reply-To: References: <20070609232842.005993A4060@sparrow.telecommunity.com> Message-ID: On 6/10/07, Phillip J. Eby wrote: > At 12:23 AM 6/10/2007 +0300, Eyal Lotem wrote: > >A. It will break code that uses instance.__dict__['var'] directly, > >when 'var' exists as a property with a __set__ in the class. I believe > >this is not significant. > >B. It will simplify getattr's semantics. 
Python should _always_ give > >precedence to instance attributes over class ones, rather than have > >very weird special-cases (such as a property with a __set__). > > Actually, these are features that are both used and desirable; I've > been using them both since Python 2.2 (i.e., for many years > now). I'm -1 on removing these features from any version of Python, even 3.0. It is the same feature, actually, two sides of the same coin. Why do you use self.__dict__['propertyname'] when you can use self._propertyname? Why even call the first form, which is both longer and causes performance problems "a feature"? > >C. It will greatly speed up instance variable access, especially when > >the class has a large mro. > > ...at the cost of slowing down access to properties and __slots__, by > adding an *extra* dictionary lookup there. It will slow down access to properties - but that slowdown is insignificant: A. The vast majority of lookups are *NOT* of properties. They are the rare case and should not be the focus of optimization. B. Property access involves calling Python functions - which is heavier than a single dict lookup. C. The dict lookup to find the property in the __mro__ can involve many dicts (so in those cases adding a single dict lookup is not heavy). > Note, by the way, that if you want to change attribute lookup > semantics, you can always override __getattribute__ and make it work > whatever way you like, without forcing everybody else to change *their* code. If I write my own __getattribute__ I lose the performance benefit that I am after. I do agree that code shouldn't be broken, that's why a transitional that requires using __fastlookup__ can be used (Unfortunately, from __future__ cannot be used as it is not local to a module, but to a class hierarchy - unless one imports a feature from __future__ into a class). From martin at v.loewis.de Sun Jun 10 05:36:39 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 10 Jun 2007 05:36:39 +0200 Subject: [Python-Dev] zipfile and unicode filenames In-Reply-To: References: Message-ID: <466B71C7.3040607@v.loewis.de> > Today I've stumbled upon a bug in my program that wasn't very > straightforward to understand. Unfortunately, it isn't straight-forward to understand your description of it, either. > The problem is that I was passing > unicode filenames to zipfile.ZipFile.write and I had > sys.setdefaultencoding() in effect What do you mean here? How can sys.setdefaultencoding() be "in effect"? There is always a default encoding; did you mean you changed the default? > which resulted in a situation > where most of the bytes generated in zipfile.ZipInfo.FileHeader would > pass thru, except for a few, which caused codec error on another > machine (where filenames got infectiously upgraded to unicode). Was the problem that most of the bytes would pass thru, or was the problem that a few did not pass thru? Why did filenames in the FileHeader infectiously upgraded to unicode on the other machine, but not on the first machine? > The > problem here is that it was absolutely unclear at first that I get > unicode filenames passed to write, and it incorrectly accepted them > silently. Is it worth to submit a bug report on this? Try to let me rephrase what I understood so far: "I changed the default system encoding from ASCII to some other value, and that caused zipfile.py to generate an incorrect zipfile. Is that a bug in zipfile?" To that, the answer is a clear "no". 
If you change the default encoding, you are on your own. Don't do that. > So, should I submit a bug report, and which behavior would be actually correct? The issue of non-ASCII file names in zipfiles is fairly well understood. The ZIP format historically did not support them well. I believe this has recently been improved, but that format change has not propagated into the zipfile module, yet. Howeer, everybody is aware of the situation, so there is no need to report a bug. Regards, Martin From martin at v.loewis.de Sun Jun 10 06:17:57 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 10 Jun 2007 06:17:57 +0200 Subject: [Python-Dev] Frame zombies In-Reply-To: References: Message-ID: <466B7B75.2080804@v.loewis.de> > Should I make a patch? -1. This could consume quite a lot of memory, and I doubt that the speed improvement would be significant. Instead, I would check whether performance could be improved by just dropping the freelist. Looking at the code, I see that it performs a realloc (!) of the frame object if the one it found is too small. That surely must be expensive, and should be replaced with a free/malloc pair instead. I'd be curious to see whether malloc on today's systems is still so slow as to justify a free list. If it is, I would propose to create multiple free lists per size classes, e.g. for frames with 10, 20, 30, etc. variables, rather than having free lists per code object. Regards, Martin From eyal.lotem at gmail.com Sun Jun 10 06:38:03 2007 From: eyal.lotem at gmail.com (Eyal Lotem) Date: Sun, 10 Jun 2007 07:38:03 +0300 Subject: [Python-Dev] Frame zombies In-Reply-To: <466B7B75.2080804@v.loewis.de> References: <466B7B75.2080804@v.loewis.de> Message-ID: The freelist currently serves as a good optimization of a special case of a recurring recursion. If the same code object (or one of the same size) is used for recursive calls repeatedly, the freelist will realloc-to-same-size (which probably has no serious cost) and thus the cost of allocating/deallocating frames was averted. I think that in general, the complexity of a sophisticated and efficient aging mechanism is not worth it just to optimize recursive calls. The only question is whether it is truly a memory problem, if using, say, up-to 50 frames per code object? Note that _only_ recursions will have more than 1 frame attached. How many recursions are used and then discarded? How slow is it to constantly malloc/free frames in a recursion? My proposal will accelerate the following example: def f(x, y): if 0 == x: return f(x-1, y) def g(x): if 0 == x: return g(x-1) while True: f(100, 100) g(100) The current implementation will work well with the following: while True: f(100, 100) But removing freelist altogether will not work well with any type of recursion. Eyal On 6/10/07, "Martin v. L?wis" wrote: > > Should I make a patch? > > -1. This could consume quite a lot of memory, and I doubt > that the speed improvement would be significant. Instead, > I would check whether performance could be improved by > just dropping the freelist. Looking at the code, I see > that it performs a realloc (!) of the frame object if > the one it found is too small. That surely must be > expensive, and should be replaced with a free/malloc pair > instead. > > I'd be curious to see whether malloc on today's systems > is still so slow as to justify a free list. If it is, > I would propose to create multiple free lists per size > classes, e.g. for frames with 10, 20, 30, etc. 
variables, > rather than having free lists per code object. > > Regards, > Martin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/eyal.lotem%40gmail.com > From martin at v.loewis.de Sun Jun 10 08:16:11 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 10 Jun 2007 08:16:11 +0200 Subject: [Python-Dev] Frame zombies In-Reply-To: References: <466B7B75.2080804@v.loewis.de> Message-ID: <466B972B.1090802@v.loewis.de> > Note that _only_ recursions will have more than 1 frame attached. That's not true; in the presence of threads, the same method may also be invoked more than one time simultaneously. > But removing freelist altogether will not work well with any type of > recursion. How do you know that? Did you measure the time? On what system? What were the results? Performance optimization without measuring is just unacceptable. Regards, Martin From martin at v.loewis.de Sun Jun 10 10:38:15 2007 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sun, 10 Jun 2007 10:38:15 +0200 Subject: [Python-Dev] zipfile and unicode filenames In-Reply-To: References: <466B71C7.3040607@v.loewis.de> Message-ID: <466BB877.8000404@v.loewis.de> > sys.setdefaultencoding() > exists for a reason, wouldn't it be better if stdlib could cope with > that at least with zipfile? sys.setdefaultencoding just does not work. Many more things break when you call it. It only exists because people like you insisted that it exists. > Also note that I'm trying to ask if zipfile should be improved, how it > should be improved, and this possible improvement is not even for me > (because now I know how zipfile behaves and I will work correctly with > it, but someone else might stumble upon this very unexpectedly). If you want to come up with a patch: sure. The zipfile module should handle Unicode strings, encoding them in the encoding that the ZIP specification defines (both the formal one, and the informal-defined-by-pkwares-implementation). The tricky question is what to do when reading in zipfiles with non-ASCII characters (and yes, I understand that in your case there were only ASCII characters in the file names). > The problem was that sourcedir was unicode, and on my machine > everything went ok multiple times. zipfile.ZipInfo.FileHeader would > return unicode, but then when it writes it to a file it gets back to > str (because mappings back and forth were identical). The problem > happened when on a different machine header suddenly got byte 0x98 in > position 10 (seems to be compress_size), which cp1251 codec couldn't > decode. You see, arcname didn't even have unicode characters, but the > mere fact that it was unicode made header upgrade to unicode in > "return header + self.filename + self.extra". Ok, now I understand. If filename is a Unicode string, header is converted using the system encoding; depending on the exact value of header and depending on the system encoding, this may cause a decoding error. This bug has been reported as http://bugs.python.org/1170311 > Because that's not supposed to work sanely when self.filename is > unicode I'm asking if the right behavior would be to a) disallow > unicode filenames in zipfile.ZipInfo, b) automatically convert > filename to str in zipfile.ZipInfo, c) leave everything as it is. 
The correct behavior would be b); the difficult details are what encoding to use. Regards, Martin From gjcarneiro at gmail.com Sun Jun 10 12:27:16 2007 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Sun, 10 Jun 2007 11:27:16 +0100 Subject: [Python-Dev] Fwd: Instance variable access and descriptors In-Reply-To: References: <20070609232842.005993A4060@sparrow.telecommunity.com> Message-ID: I have to agree with you. If removing support for self.__dict__['propertyname'] (where propertyname is also the name of a descriptor) is the price to pay for significant speedup, so be it. People doing that are asking for trouble anyway! On 10/06/07, Eyal Lotem wrote: > > On 6/10/07, Phillip J. Eby wrote: > > At 12:23 AM 6/10/2007 +0300, Eyal Lotem wrote: > > >A. It will break code that uses instance.__dict__['var'] directly, > > >when 'var' exists as a property with a __set__ in the class. I believe > > >this is not significant. > > >B. It will simplify getattr's semantics. Python should _always_ give > > >precedence to instance attributes over class ones, rather than have > > >very weird special-cases (such as a property with a __set__). > > > > Actually, these are features that are both used and desirable; I've > > been using them both since Python 2.2 (i.e., for many years > > now). I'm -1 on removing these features from any version of Python, > even 3.0. > > It is the same feature, actually, two sides of the same coin. > Why do you use self.__dict__['propertyname'] when you can use > self._propertyname? > Why even call the first form, which is both longer and causes > performance problems "a feature"? > > > > >C. It will greatly speed up instance variable access, especially when > > >the class has a large mro. > > > > ...at the cost of slowing down access to properties and __slots__, by > > adding an *extra* dictionary lookup there. > It will slow down access to properties - but that slowdown is > insignificant: > A. The vast majority of lookups are *NOT* of properties. They are the > rare case and should not be the focus of optimization. > B. Property access involves calling Python functions - which is > heavier than a single dict lookup. > C. The dict lookup to find the property in the __mro__ can involve > many dicts (so in those cases adding a single dict lookup is not > heavy). > > > Note, by the way, that if you want to change attribute lookup > > semantics, you can always override __getattribute__ and make it work > > whatever way you like, without forcing everybody else to change *their* > code. > If I write my own __getattribute__ I lose the performance benefit that > I am after. > I do agree that code shouldn't be broken, that's why a transitional > that requires using __fastlookup__ can be used (Unfortunately, from > __future__ cannot be used as it is not local to a module, but to a > class hierarchy - unless one imports a feature from __future__ into a > class). > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/gjcarneiro%40gmail.com > -- Gustavo J. A. M. Carneiro INESC Porto "The universe is always one step beyond logic." -- Frank Herbert -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20070610/4335b0d9/attachment.html From snaury at gmail.com Sun Jun 10 12:40:19 2007 From: snaury at gmail.com (Alexey Borzenkov) Date: Sun, 10 Jun 2007 14:40:19 +0400 Subject: [Python-Dev] zipfile and unicode filenames In-Reply-To: <466BB877.8000404@v.loewis.de> References: <466B71C7.3040607@v.loewis.de> <466BB877.8000404@v.loewis.de> Message-ID: > > Also note that I'm trying to ask if zipfile should be improved, how it > > should be improved, and this possible improvement is not even for me > > (because now I know how zipfile behaves and I will work correctly with > > it, but someone else might stumble upon this very unexpectedly). > If you want to come up with a patch: sure. The zipfile module should > handle Unicode strings, encoding them in the encoding that the ZIP > specification defines (both the formal one, and the > informal-defined-by-pkwares-implementation). I don't think always encoding them to utf-8 (and using bit 11 of flag_bits) is a good idea, since there's a chance to create archives that won't be correctly readable by programs not supporting this bit (it's no secret that currently some programs just assume that filenames are encoded using one of system encodings). This is too complex and hazy to implement. Even if I know what is the situation on Windows (i.e. using OEM, also called DOS encoding, but I'm not sure how to determine its codec name from within python apart from calling GetConsoleCP), I'm totally unaware of the situation on other operating systems. > The tricky question is what to do when reading in zipfiles with > non-ASCII characters (and yes, I understand that in your case > there were only ASCII characters in the file names). I don't think it should be changed. > Ok, now I understand. If filename is a Unicode string, header is > converted using the system encoding; depending on the exact value > of header and depending on the system encoding, this may cause > a decoding error. > > This bug has been reported as http://bugs.python.org/1170311 I see. Well, that's all easier now then, as I can just create a patch for an already existing bug. > > Because that's not supposed to work sanely when self.filename is > > unicode I'm asking if the right behavior would be to a) disallow > > unicode filenames in zipfile.ZipInfo, b) automatically convert > > filename to str in zipfile.ZipInfo, c) leave everything as it is. > The correct behavior would be b); the difficult details are what > encoding to use. Current zipfile seems to officially support ascii filenames only anyway, so the patch can be as simple as this: Index: Lib/zipfile.py =================================================================== --- Lib/zipfile.py (revision 55850) +++ Lib/zipfile.py (working copy) @@ -252,12 +252,13 @@ self.extract_version = max(45, self.extract_version) self.create_version = max(45, self.extract_version) + filename = str(self.filename) header = struct.pack(structFileHeader, stringFileHeader, self.extract_version, self.reserved, self.flag_bits, self.compress_type, dostime, dosdate, CRC, compress_size, file_size, - len(self.filename), len(extra)) - return header + self.filename + extra + len(filename), len(extra)) + return header + filename + extra def _decodeExtra(self): # Try to decode the extra field. This doesn't introduce new features, just enforces filenames to be ascii (or whatever default encoding is) encodable. 
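[For reference, a small standalone sketch of the failure mode this patch works around; the header bytes below are made up for illustration. In Python 2, concatenating a byte string with a unicode string implicitly decodes the byte string with the default encoding, so a packed header containing a byte such as 0x98 raises UnicodeDecodeError as soon as the filename is a unicode object.]

    header = 'PK\x03\x04\x98'          # stand-in for the packed header bytes
    filename = u'foo.txt'              # unicode filename passed by the caller

    try:
        record = header + filename     # implicitly decodes header with the default encoding
    except UnicodeDecodeError, e:
        print 'concatenation failed:', e

    record = header + filename.encode('ascii')   # roughly what the patch does
    print repr(record)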
From snaury at gmail.com Sun Jun 10 12:56:45 2007 From: snaury at gmail.com (Alexey Borzenkov) Date: Sun, 10 Jun 2007 14:56:45 +0400 Subject: [Python-Dev] zipfile and unicode filenames In-Reply-To: References: <466B71C7.3040607@v.loewis.de> <466BB877.8000404@v.loewis.de> Message-ID: > Current zipfile seems to officially support ascii filenames only > anyway, so the patch can be as simple as this: Submitted patch and test case as http://python.org/sf/1734346 From martin at v.loewis.de Sun Jun 10 18:45:51 2007 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sun, 10 Jun 2007 18:45:51 +0200 Subject: [Python-Dev] zipfile and unicode filenames In-Reply-To: References: <466B71C7.3040607@v.loewis.de> <466BB877.8000404@v.loewis.de> Message-ID: <466C2ABF.2090500@v.loewis.de> > I don't think always encoding them to utf-8 (and using bit 11 of > flag_bits) is a good idea, since there's a chance to create archives > that won't be correctly readable by programs not supporting this bit > (it's no secret that currently some programs just assume that > filenames are encoded using one of system encodings). I think it is also fairly uniformly agreed that these programs are incorrect; the official encoding of file names in a zip file is Windows/DOS code page 437. > This is too > complex and hazy to implement. Even if I know what is the situation on > Windows (i.e. using OEM, also called DOS encoding, but I'm not sure > how to determine its codec name from within python apart from calling > GetConsoleCP), I'm totally unaware of the situation on other operating > systems. I don't think that the situation on Windows is that the OEM code page should be used. Instead, CP 437 should be used, independent of the OEM code page. >> The tricky question is what to do when reading in zipfiles with >> non-ASCII characters (and yes, I understand that in your case >> there were only ASCII characters in the file names). > > I don't think it should be changed. In Python 3, it will certainly change, since the string type will be unicode-based. It probably should not change for the rest of 2.x. > Current zipfile seems to officially support ascii filenames only > anyway That's not true. You can use any byte string as the file name that you want, including non-ASCII strings encoded in CP437. > + filename = str(self.filename) That would be incorrect, as it relies on the system encoding, which shouldn't be relied upon. Plus, it would allow arbitrary non-string things as filenames. What it should do instead (IMO) is to encode in CP437. Bonus points if it falls back to the UTF-8 feature of zip files if encoding as CP437 fails. Regards, Martin From pje at telecommunity.com Sun Jun 10 20:08:48 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun, 10 Jun 2007 14:08:48 -0400 Subject: [Python-Dev] Fwd: Instance variable access and descriptors In-Reply-To: References: <20070609232842.005993A4060@sparrow.telecommunity.com> Message-ID: <20070610180649.D510C3A407F@sparrow.telecommunity.com> At 04:14 AM 6/10/2007 +0300, Eyal Lotem wrote: >On 6/10/07, Phillip J. Eby wrote: > > At 12:23 AM 6/10/2007 +0300, Eyal Lotem wrote: > > >A. It will break code that uses instance.__dict__['var'] directly, > > >when 'var' exists as a property with a __set__ in the class. I believe > > >this is not significant. > > >B. It will simplify getattr's semantics. Python should _always_ give > > >precedence to instance attributes over class ones, rather than have > > >very weird special-cases (such as a property with a __set__). 
> > > > Actually, these are features that are both used and desirable; I've > > been using them both since Python 2.2 (i.e., for many years > > now). I'm -1 on removing these features from any version of > Python, even 3.0. > >It is the same feature, actually, two sides of the same coin. >Why do you use self.__dict__['propertyname'] when you can use >self._propertyname? Because I'm *not writing this by hand*. I'm using descriptors that know what attribute name they're responsible for, and do the access directly. >Why even call the first form, which is both longer and causes >performance problems "a feature"? If you don't understand that, IMO you don't yet understand enough about the descriptor architecture to be proposing changes to it. > > Note, by the way, that if you want to change attribute lookup > > semantics, you can always override __getattribute__ and make it work > > whatever way you like, without forcing everybody else to change > *their* code. >If I write my own __getattribute__ I lose the performance benefit that >I am after. Not if you write it in C. >I do agree that code shouldn't be broken, that's why a transitional >that requires using __fastlookup__ can be used (Unfortunately, from >__future__ cannot be used as it is not local to a module, but to a >class hierarchy - unless one imports a feature from __future__ into a >class). I have no idea what you're talking about here. From snaury at gmail.com Sun Jun 10 20:17:16 2007 From: snaury at gmail.com (Alexey Borzenkov) Date: Sun, 10 Jun 2007 22:17:16 +0400 Subject: [Python-Dev] zipfile and unicode filenames In-Reply-To: <466C2ABF.2090500@v.loewis.de> References: <466B71C7.3040607@v.loewis.de> <466BB877.8000404@v.loewis.de> <466C2ABF.2090500@v.loewis.de> Message-ID: On 6/10/07, "Martin v. L?wis" wrote: > > I don't think always encoding them to utf-8 (and using bit 11 of > > flag_bits) is a good idea, since there's a chance to create archives > > that won't be correctly readable by programs not supporting this bit > > (it's no secret that currently some programs just assume that > > filenames are encoded using one of system encodings). > I think it is also fairly uniformly agreed that these programs are > incorrect; the official encoding of file names in a zip file is > Windows/DOS code page 437. Before replying to you I actually did some quick tests. I packed a file with localized filename and then opened it using explorer and also viewed it using the hexeditor: 7-Zip: directory cp866, header cp866: explorer sees correct filename. zipfile: directory cp1251, header cp1251: explorer sees incorrect filename. pkzip25.exe: directory cp866, header cp1251: explorer sees correct filenames, zipfile complains that filenames differ. zip.exe: directory cp1251, header cp1251: explorer sees incorrect filenames. Also note, that modifying filename in directory with a hex editor to cp866 made explorer see correct filenames. Another experiment with pkzip25 showed that modifying filename in directory makes it extract files with that filenam, i.e. it ignores header filename. The same behavior is showed by 7-Zip. So the general idea is that at least directory filename has some sort of convention of using oem (dos, console) encoding on Windows, cp866 in my case. Header filenames have different encodings, and seem to be ignored. > I don't think that the situation on Windows is that the OEM code page > should be used. Instead, CP 437 should be used, independent of the OEM > code page. And on the contrary, pkzip25 made by PKWARE Inc. 
themselves behaves otherwise. > > + filename = str(self.filename) > That would be incorrect, as it relies on the system encoding, > which shouldn't be relied upon. Well, as I've seen in numerous examples above, system (or actually dos) encoding is actually what is used by at least by three major programs: 7-zip, pkzip25 and explorer, at least on windows. > Plus, it would allow arbitrary > non-string things as filenames. Hmm... why is that bad? > What it should do instead > (IMO) is to encode in CP437. Bonus points if it falls back > to the UTF-8 feature of zip files if encoding as CP437 fails. And encoding to cp437 would be incorrect, as no currently existing program would correctly work on non-english Windows OSes. I think that letting the user deciding on the encoding is the right way to go here, as you can't know what user actually wants these days, it's all too hazy to me. And in case unicode is passed it just converts it using ascii (or default) codec. One can specify ascii codec there explicitly, if using system encoding is really an issue. From pje at telecommunity.com Sun Jun 10 20:28:23 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun, 10 Jun 2007 14:28:23 -0400 Subject: [Python-Dev] Fwd: Instance variable access and descriptors In-Reply-To: References: <20070609232842.005993A4060@sparrow.telecommunity.com> Message-ID: <20070610182622.921653A407F@sparrow.telecommunity.com> At 11:27 AM 6/10/2007 +0100, Gustavo Carneiro wrote: > I have to agree with you. If removing support for > self.__dict__['propertyname'] (where propertyname is also the name > of a descriptor) is the price to pay for significant speedup, so be > it. People doing that are asking for trouble anyway! How so? This order of lookup is explicitly defined by the precedence rules of PEP 252: """When a dynamic attribute (one defined in a regular object's __dict__) has the same name as a static attribute (one defined by a meta-object in the inheritance graph rooted at the regular object's __class__), the static attribute has precedence if it is a descriptor that defines a __set__ method (see below); otherwise (if there is no __set__ method) the dynamic attribute has precedence. In other words, for data attributes (those with a __set__ method), the static definition overrides the dynamic definition, but for other attributes, dynamic overrides static.""" I fail to see how relying on explicitly-documented language behavior is "asking for trouble". From martin at v.loewis.de Sun Jun 10 21:43:19 2007 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sun, 10 Jun 2007 21:43:19 +0200 Subject: [Python-Dev] zipfile and unicode filenames In-Reply-To: References: <466B71C7.3040607@v.loewis.de> <466BB877.8000404@v.loewis.de> <466C2ABF.2090500@v.loewis.de> Message-ID: <466C5457.1020904@v.loewis.de> > So the general idea is that at least directory filename has some sort > of convention of using oem (dos, console) encoding on Windows, cp866 > in my case. Header filenames have different encodings, and seem to be > ignored. Ok, then this is what the zipfile module should implement. >> That would be incorrect, as it relies on the system encoding, >> which shouldn't be relied upon. > > Well, as I've seen in numerous examples above, system (or actually > dos) encoding is actually what is used by at least by three major > programs: 7-zip, pkzip25 and explorer, at least on windows. Please don't confuse Python's "system encoding" with the system's (or user's) standard encoding - they are not related at all. 
Using the OEM code page if everybody else does it is fine. Using the encoding that somebody hand-coded into the Python installation is not. >> Plus, it would allow arbitrary >> non-string things as filenames. > > Hmm... why is that bad? Errors should never pass silently. >> What it should do instead >> (IMO) is to encode in CP437. Bonus points if it falls back >> to the UTF-8 feature of zip files if encoding as CP437 fails. > > And encoding to cp437 would be incorrect, as no currently existing > program would correctly work on non-english Windows OSes. I think that > letting the user deciding on the encoding is the right way to go here, > as you can't know what user actually wants these days, it's all too > hazy to me. Asking "the user" is not practical. If "the user" was aware of the problem, you would not have run into the problem in the first place - you would have known to encode all file names before passing them into the zipfile module. The automatic mode should follow the standard or the conventions; "the user" (in quotes, because the end user is rarely bothered with that detail) can still override that explicitly. Regards, Martin From snaury at gmail.com Sun Jun 10 22:26:33 2007 From: snaury at gmail.com (Alexey Borzenkov) Date: Mon, 11 Jun 2007 00:26:33 +0400 Subject: [Python-Dev] zipfile and unicode filenames In-Reply-To: <466C5457.1020904@v.loewis.de> References: <466B71C7.3040607@v.loewis.de> <466BB877.8000404@v.loewis.de> <466C2ABF.2090500@v.loewis.de> <466C5457.1020904@v.loewis.de> Message-ID: On 6/10/07, "Martin v. L?wis" wrote: > > So the general idea is that at least directory filename has some sort > > of convention of using oem (dos, console) encoding on Windows, cp866 > > in my case. Header filenames have different encodings, and seem to be > > ignored. > Ok, then this is what the zipfile module should implement. But this is only on Windows! I have no clue what's the common situation on other OSes and don't even know how to sanely get OEM codepage on Windows (the obvious way with ctypes.kernel32.GetOEMCP() doesn't seem good to me). So I guess that's bad idea anyway, maybe conforming to language bit is better (ascii will stay ascii anyway). What about this? Index: Lib/zipfile.py =================================================================== --- Lib/zipfile.py (revision 55850) +++ Lib/zipfile.py (working copy) @@ -252,6 +252,7 @@ self.extract_version = max(45, self.extract_version) self.create_version = max(45, self.extract_version) + self._encodeFilename() header = struct.pack(structFileHeader, stringFileHeader, self.extract_version, self.reserved, self.flag_bits, self.compress_type, dostime, dosdate, CRC, @@ -259,6 +260,16 @@ len(self.filename), len(extra)) return header + self.filename + extra + def _encodeFilename(self): + if isinstance(self.filename, unicode): + self.filename = self.filename.encode('utf-8') + self.flag_bits = self.flag_bits | 0x800 + + def _decodeFilename(self): + if self.flag_bits & 0x800: + self.filename = self.filename.decode('utf-8') + self.flag_bits = self.flag_bits & ~0x800 + def _decodeExtra(self): # Try to decode the extra field. 
extra = self.extra @@ -683,6 +694,7 @@ t>>11, (t>>5)&0x3F, (t&0x1F) * 2 ) x._decodeExtra() + x._decodeFilename() x.header_offset = x.header_offset + concat self.filelist.append(x) self.NameToInfo[x.filename] = x @@ -967,6 +979,7 @@ extract_version = zinfo.extract_version create_version = zinfo.create_version + zinfo._encodeFilename() centdir = struct.pack(structCentralDir, stringCentralDir, create_version, zinfo.create_system, extract_version, zinfo.reserved, Index: Lib/test/test_zipfile.py =================================================================== --- Lib/test/test_zipfile.py (revision 55850) +++ Lib/test/test_zipfile.py (working copy) @@ -515,6 +515,11 @@ # and report that the first file in the archive was corrupt. self.assertRaises(RuntimeError, zipf.testzip) + def testUnicodeFilenames(self): + zf = zipfile.ZipFile(TESTFN, "w") + zf.writestr(u"foo.txt", "Test for unicode filename") + zf.close() + def tearDown(self): support.unlink(TESTFN) support.unlink(TESTFN2) The problem is that I don't know if anything actually supports bit 11 at the time and can't even tell if I did this correctly or not. :( From martin at v.loewis.de Sun Jun 10 22:47:54 2007 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sun, 10 Jun 2007 22:47:54 +0200 Subject: [Python-Dev] zipfile and unicode filenames In-Reply-To: References: <466B71C7.3040607@v.loewis.de> <466BB877.8000404@v.loewis.de> <466C2ABF.2090500@v.loewis.de> <466C5457.1020904@v.loewis.de> Message-ID: <466C637A.2020404@v.loewis.de> > But this is only on Windows! I have no clue what's the common > situation on other OSes and don't even know how to sanely get OEM > codepage on Windows (the obvious way with ctypes.kernel32.GetOEMCP() > doesn't seem good to me). > > So I guess that's bad idea anyway, maybe conforming to language bit is > better (ascii will stay ascii anyway). > > What about this? I haven't checked (*) whether you got the right value for flag_bits; assuming you do, this looks good. For compatibility, I would propose to use UTF-8 only if the file name is not ASCII. Even though the OEM code pages vary, they are (mostly) ASCII supersets. So if the string can be encoded in ASCII, there is no need to set the UTF-8 flag bit. OTOH, I now wonder whether it would *hurt* to have the flag bit: if old zip software does not choke if the flag is set, then it can just as well be set, as ASCII strings automatically get encoded as ASCII in UTF-8. Regards, Martin (*) I just now read http://www.pkware.com/documents/casestudies/APPNOTE.TXT and 0x800 seems to be the right value indeed. Notice, in appendix D, that the specification says that the historical encoding of file names is code page 437. From aahz at pythoncraft.com Mon Jun 11 00:37:12 2007 From: aahz at pythoncraft.com (Aahz) Date: Sun, 10 Jun 2007 15:37:12 -0700 Subject: [Python-Dev] Instance variable access and descriptors In-Reply-To: References: Message-ID: <20070610223712.GA15827@panix.com> On Sun, Jun 10, 2007, Eyal Lotem wrote: > > Python, probably through the valid assumption that most attribute > lookups go to the class, tries to look for the attribute in the class > first, and in the instance, second. > > What Python currently does is quite peculiar! > Here's a short description o PyObject_GenericGetAttr: > > A. Python looks for a descriptor in the _entire_ mro hierarchy > (len(mro) class/type check and dict lookups). > B. If Python found a descriptor and it has both get and set functions > - it uses it to get the value and returns, skipping the next stage. 
> C. If Python either did not find a descriptor, or found one that has > no setter, it will try a lookup in the instance dict. > D. If Python failed to find it in the instance, it will use the > descriptor's getter, and if it has no getter it will use the > descriptor itself. Guido, Ping, and I tried working on this at the sprint for PyCon 2003. We were unable to find any solution that did not affect critical-path timing. As other people have noted, the current semantics cannot be changed. I'll also echo other people and suggest that this discusion be moved to python-ideas if you want to continue pushing for a change in semantics. I just did a Google for my notes from PyCon 2003 and it appears that I never sent them out (probably because they aren't particularly comprehensible). Here they are for the record (from 3/25/2003): ''' CACHE_ATTR is the name used to describe a speedup (for new-style classes only) in attribute lookup by caching the location of attributes in the MRO. Some of the non-obvious bits of code: * If a new-style class has any classic classes in its bases, we can't do attribute caching (we need to weakrefs to the derived classes). * If searching the MRO for an attribute discovers a data descriptor (has tp_descr_set), that overrides any attribute that might be in the instance; however, the existence of tp_descr_get still permits the instance to override its bases (but tp_descr_get is called if there is no instance attribute). * We need to invalidate the cache for the updated attribute in all derived classes in the following cases: * an attribute is added or deleted to the class or its base classes * an attribute has its status changed to or from being a data descriptor This file uses Python pseudocode to describe changes necessary to implement CACHE_ATTR at the C level. Except for class Meta, these are all exact descriptions of the work being done. Except for class Meta the changes go into object.c (Meta goes into typeobject.c). The pseudocode looks somewhat C-like to ease the transformation. ''' NULL = object() def getattr(inst, name): isdata, where = lookup(inst.__class__, name) if isdata: descr = where[name] if hasattr(descr, "__get__"): return descr.__get__(inst) else: return descr value = inst.__dict__.get(name, NULL) if value != NULL: return value if where == NULL: raise AttributError descr = where[name] if hasattr(descr, "__get__"): value = descr.__get__(inst) else: value = descr return value def setattr(inst, name, value): isdata, where = lookup(inst.__class__, name) if isdata: descr = where[name] descr.__set__(inst, value) return inst.__dict__[name] = value def lookup(cls, name): if cls.__cache__ != NULL: pair = cls.__cache__.get(name) else: pair = NULL if pair: return pair else: for c in cls.__mro__: where = c.__dict__ if name in where: descr = where[name] isdata = hasattr(descr, "__set__") pair = isdata, where break else: pair = False, NULL if cls.__cache__ != NULL: cls.__cache__[name] = pair return pair ''' These changes go into typeobject.c; they are not a complete description of what happens during creation/updates, only the changes necessary to implement CACHE_ATTRO. 
''' from types import ClassType class Meta(type): def _invalidate(cls, name): if name in cls.__cache__: del cls.__cache__[name] for c in cls.__subclasses__(): if name not in c.__dict__: self._invalidate(c, name) def _build_cache(cls, bases): for base in bases: if type(base.__class__) is ClassType: cls.__cache__ = NULL break else: cls.__cache__ = {} def __new__ (cls, bases): self._build_cache(cls, bases) def __setbases__(cls, bases): self._build_cache(cls, bases) def __setattr__(cls, name, value): if cls.__cache__ != NULL: old = cls.__dict__.get(name, NULL) wasdata = old != NULL and hasattr(old, "__set__") isdata = value != NULL and hasattr(value, "__set__") if wasdata != isdata or (old == NULL) != (value === NULL): self._invalidate(cls, name) type.__setattr__(cls, name, value) def __delattr__(cls, name): self.__setattr__(cls, name, NULL) -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "as long as we like the same operating system, things are cool." --piranha From eyal.lotem at gmail.com Mon Jun 11 03:54:48 2007 From: eyal.lotem at gmail.com (Eyal Lotem) Date: Mon, 11 Jun 2007 04:54:48 +0300 Subject: [Python-Dev] Frame zombies In-Reply-To: <466B972B.1090802@v.loewis.de> References: <466B7B75.2080804@v.loewis.de> <466B972B.1090802@v.loewis.de> Message-ID: On 6/10/07, "Martin v. L?wis" wrote: > > Note that _only_ recursions will have more than 1 frame attached. > > That's not true; in the presence of threads, the same method > may also be invoked more than one time simultaneously. Yes, I have missed that, and realized that I missed that myself a bit later. I guess I can rationalize that with the fact that I myself tend to avoid threads. > > But removing freelist altogether will not work well with any type of > > recursion. > > How do you know that? Did you measure the time? On what system? > What were the results? > > Performance optimization without measuring is just unacceptable. I agree, I may have used the wrong tone above. Removing the freelist will probably either not have a significant effect (at worst, its adding very little work of maintaining it), or improve recursions and functions that tend to be running simultaniously in multiple threads (as in those cases the realloc will not actually resize the frame, and mallocs/free will indeed be saved). But do note my corrected tone (I said "probably" :-) - and anyone is welcome to try removing it and see if they get a performance benefit. The fact threading also causes the same code object to be used in multiple frames makes everything a little less predictable and may mean that having a larger-than-1 number of frames associated with each code object may indeed yield a performance benefit. I am not sure how to benchmark such modifications. Is there any benchmark that includes threaded use of the same functions in typical use cases? > Regards, > Martin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/eyal.lotem%40gmail.com > From martin at v.loewis.de Mon Jun 11 04:58:02 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 11 Jun 2007 04:58:02 +0200 Subject: [Python-Dev] Frame zombies In-Reply-To: References: <466B7B75.2080804@v.loewis.de> <466B972B.1090802@v.loewis.de> Message-ID: <466CBA3A.8050307@v.loewis.de> > I am not sure how to benchmark such modifications. 
Is there any > benchmark that includes threaded use of the same functions in typical > use cases? I don't think it's necessary to benchmark that specific case - *any* kind of micro-benchmark would be better than none. If you want to introduce free lists per code object, you need to benchmark such code, and compare it to the status quo. While doing so, I'd ask to also measure the case that the free list is dropped without a replacement. Regards, Martin From snaury at gmail.com Mon Jun 11 06:30:29 2007 From: snaury at gmail.com (Alexey Borzenkov) Date: Mon, 11 Jun 2007 08:30:29 +0400 Subject: [Python-Dev] zipfile and unicode filenames In-Reply-To: <466C637A.2020404@v.loewis.de> References: <466B71C7.3040607@v.loewis.de> <466BB877.8000404@v.loewis.de> <466C2ABF.2090500@v.loewis.de> <466C5457.1020904@v.loewis.de> <466C637A.2020404@v.loewis.de> Message-ID: On 6/11/07, "Martin v. L?wis" wrote: > For compatibility, I would propose to use UTF-8 only if the file > name is not ASCII. Even though the OEM code pages vary, they > are (mostly) ASCII supersets. So if the string can be encoded > in ASCII, there is no need to set the UTF-8 flag bit. Done: Index: Lib/zipfile.py =================================================================== --- Lib/zipfile.py (revision 55850) +++ Lib/zipfile.py (working copy) @@ -252,13 +252,29 @@ self.extract_version = max(45, self.extract_version) self.create_version = max(45, self.extract_version) + filename, flag_bits = self._encodeFilenameFlags() header = struct.pack(structFileHeader, stringFileHeader, - self.extract_version, self.reserved, self.flag_bits, + self.extract_version, self.reserved, flag_bits, self.compress_type, dostime, dosdate, CRC, compress_size, file_size, - len(self.filename), len(extra)) - return header + self.filename + extra + len(filename), len(extra)) + return header + filename + extra + def _encodeFilenameFlags(self): + if isinstance(self.filename, unicode): + try: + return self.filename.encode('ascii'), self.flag_bits + except UnicodeEncodeError: + return self.filename.encode('utf-8'), self.flag_bits | 0x800 + else: + return self.filename, self.flag_bits + + def _decodeFilenameFlags(self): + if self.flag_bits & 0x800: + return self.filename.decode('utf-8'), self.flag_bits & ~0x800 + else: + return self.filename, self.flag_bits + def _decodeExtra(self): # Try to decode the extra field. 
extra = self.extra @@ -684,6 +700,7 @@ x._decodeExtra() x.header_offset = x.header_offset + concat + x.filename, x.flag_bits = x._decodeFilenameFlags() self.filelist.append(x) self.NameToInfo[x.filename] = x if self.debug > 2: @@ -967,16 +984,17 @@ extract_version = zinfo.extract_version create_version = zinfo.create_version + filename, flag_bits = zinfo._encodeFilenameFlags() centdir = struct.pack(structCentralDir, stringCentralDir, create_version, zinfo.create_system, extract_version, zinfo.reserved, - zinfo.flag_bits, zinfo.compress_type, dostime, dosdate, + flag_bits, zinfo.compress_type, dostime, dosdate, zinfo.CRC, compress_size, file_size, - len(zinfo.filename), len(extra_data), len(zinfo.comment), + len(filename), len(extra_data), len(zinfo.comment), 0, zinfo.internal_attr, zinfo.external_attr, header_offset) self.fp.write(centdir) - self.fp.write(zinfo.filename) + self.fp.write(filename) self.fp.write(extra_data) self.fp.write(zinfo.comment) Index: Lib/test/test_zipfile.py =================================================================== --- Lib/test/test_zipfile.py (revision 55850) +++ Lib/test/test_zipfile.py (working copy) @@ -515,6 +515,12 @@ # and report that the first file in the archive was corrupt. self.assertRaises(RuntimeError, zipf.testzip) + def testUnicodeFilenames(self): + zf = zipfile.ZipFile(TESTFN, "w") + zf.writestr(u"foo.txt", "Test for unicode filename") + assert isinstance(zf.infolist()[0].filename, unicode) + zf.close() + def tearDown(self): support.unlink(TESTFN) support.unlink(TESTFN2) What I also changed is to encode filenames only for writing to the target file, without damaging ZipInfo. The reason for this is that if user decides to enumerate infolist after she wrote files to ZipFile, she would expect ZipInfo.filename to be what she passed to ZipFile.write/ZipFile.writestr. From eyal.lotem at gmail.com Mon Jun 11 06:55:50 2007 From: eyal.lotem at gmail.com (Eyal Lotem) Date: Mon, 11 Jun 2007 07:55:50 +0300 Subject: [Python-Dev] Question about dictobject.c:lookdict_string Message-ID: My question is specifically regarding the transition back from lookdict_string (the initial value) to the general lookdict. Currently, when a string-only dict is trying to look up any non-string, it reverts back to a general lookdict. Wouldn't it be better (especially in the more important case of a string-key-only dict), to revert to the generic lookdict when a non-string is inserted to the dict, rather than when one is being searched? This seems to me as it would shift this (admittedly very slight) performance cost of a type ptr comparison from the read-access, to write-access on all dicts (which means insertions of new keys in non-string-only dicts may pay for another check, or that the lookdict funcptr will be replaced by two funcptrs so that a different insertion func on string-only dicts is used too [was tempted to say vtable here, but that would add another dereference to lookups]). It would also have the slight benefit of speeding up non-string lookups in string-only dicts. This does not seem like a significant issue, but as I know a lot of effort went into optimizing dicts, I was wondering if I am missing something here. 
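[A toy pure-Python model of the proposal above -- illustrative only; the real change would live in dictobject.c and the names here are made up. The point it shows: the switch from the string-only lookup routine to the general one happens when a non-string key is inserted, so the read path no longer needs the extra type check.]

    class ToyDict(object):
        def __init__(self):
            self._data = {}
            self._lookup = self._lookup_string_only    # fast path by default

        def _lookup_string_only(self, key):
            # Under the proposal no type check is needed here: insertion
            # guarantees that every stored key is a plain str.
            return self._data[key]

        def _lookup_general(self, key):
            return self._data[key]

        def __setitem__(self, key, value):
            # The type check moves to the (less frequent) write path.
            if not isinstance(key, str):
                self._lookup = self._lookup_general
            self._data[key] = value

        def __getitem__(self, key):
            return self._lookup(key)

    d = ToyDict()
    d['name'] = 'value'
    print d['name']     # served by the string-only routine
    d[42] = 'spam'      # first non-string key: the downgrade happens here
    print d[42]         # now served by the general routine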
From snaury at gmail.com Mon Jun 11 07:22:16 2007 From: snaury at gmail.com (Alexey Borzenkov) Date: Mon, 11 Jun 2007 09:22:16 +0400 Subject: [Python-Dev] zipfile and unicode filenames In-Reply-To: References: <466B71C7.3040607@v.loewis.de> <466BB877.8000404@v.loewis.de> <466C2ABF.2090500@v.loewis.de> <466C5457.1020904@v.loewis.de> Message-ID: On 6/11/07, Alexey Borzenkov wrote: > The problem is that I don't know if anything actually supports bit 11 > at the time and can't even tell if I did this correctly or not. :( I downloaded the latest WinZip and can confirm that it parses utf-8 filenames correctly (although it seems to treat presence of bit 11 more like enabling autodetection mode, not strict utf-8, but it must be because it has to cope with lots of incorrect zip files), i.e. in the presence of bit 11 it understands filename to be utf-8, without presence of bit 11 it treats it just like oem. :) From gjcarneiro at gmail.com Mon Jun 11 12:43:16 2007 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Mon, 11 Jun 2007 11:43:16 +0100 Subject: [Python-Dev] Instance variable access and descriptors In-Reply-To: <20070610223712.GA15827@panix.com> References: <20070610223712.GA15827@panix.com> Message-ID: While you're at it, it would be nice to fix this ugly asymmetry I found in descriptors. It seems that descriptor's __get__ is called even when accessed from a class rather than instance, but __set__ is only invoked from instances, never from classes: class Descr(object): def __get__(self, obj, objtype): print "__get__ from instance %s, type %s" % (obj, type) return "foo" def __set__(self, obj, value): print "__set__ on instance %s, value %s" % (obj, value) class Foo(object): foo = Descr() print Foo.foo # works ## doesn't work, goes directly to the class dict, not calling __set__ Foo.foo = 123 Because of this problem, I may have to install properties into a class's metaclass achieve the same effect that I expected to achieve with a simple descriptor :-( On 10/06/07, Aahz wrote: > > On Sun, Jun 10, 2007, Eyal Lotem wrote: > > > > Python, probably through the valid assumption that most attribute > > lookups go to the class, tries to look for the attribute in the class > > first, and in the instance, second. > > > > What Python currently does is quite peculiar! > > Here's a short description o PyObject_GenericGetAttr: > > > > A. Python looks for a descriptor in the _entire_ mro hierarchy > > (len(mro) class/type check and dict lookups). > > B. If Python found a descriptor and it has both get and set functions > > - it uses it to get the value and returns, skipping the next stage. > > C. If Python either did not find a descriptor, or found one that has > > no setter, it will try a lookup in the instance dict. > > D. If Python failed to find it in the instance, it will use the > > descriptor's getter, and if it has no getter it will use the > > descriptor itself. > > Guido, Ping, and I tried working on this at the sprint for PyCon 2003. > We were unable to find any solution that did not affect critical-path > timing. As other people have noted, the current semantics cannot be > changed. I'll also echo other people and suggest that this discusion be > moved to python-ideas if you want to continue pushing for a change in > semantics. > > I just did a Google for my notes from PyCon 2003 and it appears that I > never sent them out (probably because they aren't particularly > comprehensible). 
Here they are for the record (from 3/25/2003): > > ''' > CACHE_ATTR is the name used to describe a speedup (for new-style classes > only) in attribute lookup by caching the location of attributes in the > MRO. Some of the non-obvious bits of code: > > * If a new-style class has any classic classes in its bases, we > can't do attribute caching (we need to weakrefs to the derived > classes). > > * If searching the MRO for an attribute discovers a data descriptor (has > tp_descr_set), that overrides any attribute that might be in the instance; > however, the existence of tp_descr_get still permits the instance to > override its bases (but tp_descr_get is called if there is no instance > attribute). > > * We need to invalidate the cache for the updated attribute in all derived > classes in the following cases: > > * an attribute is added or deleted to the class or its base classes > > * an attribute has its status changed to or from being a data > descriptor > > This file uses Python pseudocode to describe changes necessary to > implement CACHE_ATTR at the C level. Except for class Meta, these are > all exact descriptions of the work being done. Except for class Meta the > changes go into object.c (Meta goes into typeobject.c). The pseudocode > looks somewhat C-like to ease the transformation. > ''' > > NULL = object() > > def getattr(inst, name): > isdata, where = lookup(inst.__class__, name) > if isdata: > descr = where[name] > if hasattr(descr, "__get__"): > return descr.__get__(inst) > else: > return descr > value = inst.__dict__.get(name, NULL) > if value != NULL: > return value > if where == NULL: > raise AttributError > descr = where[name] > if hasattr(descr, "__get__"): > value = descr.__get__(inst) > else: > value = descr > return value > > def setattr(inst, name, value): > isdata, where = lookup(inst.__class__, name) > if isdata: > descr = where[name] > descr.__set__(inst, value) > return > inst.__dict__[name] = value > > def lookup(cls, name): > if cls.__cache__ != NULL: > pair = cls.__cache__.get(name) > else: > pair = NULL > if pair: > return pair > else: > for c in cls.__mro__: > where = c.__dict__ > if name in where: > descr = where[name] > isdata = hasattr(descr, "__set__") > pair = isdata, where > break > else: > pair = False, NULL > if cls.__cache__ != NULL: > cls.__cache__[name] = pair > return pair > > > ''' > These changes go into typeobject.c; they are not a complete > description of what happens during creation/updates, only the > changes necessary to implement CACHE_ATTRO. 
> ''' > > from types import ClassType > > class Meta(type): > def _invalidate(cls, name): > if name in cls.__cache__: > del cls.__cache__[name] > for c in cls.__subclasses__(): > if name not in c.__dict__: > self._invalidate(c, name) > def _build_cache(cls, bases): > for base in bases: > if type(base.__class__) is ClassType: > cls.__cache__ = NULL > break > else: > cls.__cache__ = {} > def __new__ (cls, bases): > self._build_cache(cls, bases) > def __setbases__(cls, bases): > self._build_cache(cls, bases) > def __setattr__(cls, name, value): > if cls.__cache__ != NULL: > old = cls.__dict__.get(name, NULL) > wasdata = old != NULL and hasattr(old, "__set__") > isdata = value != NULL and hasattr(value, "__set__") > if wasdata != isdata or (old == NULL) != (value === NULL): > self._invalidate(cls, name) > type.__setattr__(cls, name, value) > def __delattr__(cls, name): > self.__setattr__(cls, name, NULL) > -- > Aahz (aahz at pythoncraft.com) <*> > http://www.pythoncraft.com/ > > "as long as we like the same operating system, things are cool." --piranha > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/gjcarneiro%40gmail.com > -- Gustavo J. A. M. Carneiro INESC Porto, Telecommunications and Multimedia Unit "The universe is always one step beyond logic." -- Frank Herbert -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20070611/11b461a5/attachment.htm From arigo at tunes.org Mon Jun 11 13:06:13 2007 From: arigo at tunes.org (Armin Rigo) Date: Mon, 11 Jun 2007 13:06:13 +0200 Subject: [Python-Dev] Instance variable access and descriptors In-Reply-To: References: <20070609232842.005993A4060@sparrow.telecommunity.com> <2e1434c10706091728w593f4016p5f09ee49be97e80d@mail.gmail.com> Message-ID: <20070611110613.GA28880@code0.codespeak.net> Hi Eyal, On Sun, Jun 10, 2007 at 04:13:38AM +0300, Eyal Lotem wrote: > I must be missing something, as I really see no reason to keep the > existing semantics other than backwards compatibility (which can be > achieved by introducing a __fastattr__ or such). > > Can you explain under which situations or find any example situation > where the existing semantics are desirable? The existing semantics are essential when dealing with metaclasses. Many of the descriptors of the 'type' class would stop working without it. For example, the fact that 'x.__class__' normally gives the type of 'x' for any object x relies on this. Reading the '__dict__' attribute of types is also based on this. Before proposing changes, be sure you understand exactly how the following works: >>> object.__class__ >>> object.__dict__['__class__'] >>> class A(object): ... pass >>> A.__dict__ >>> A.__dict__['__dict__'] A bientot, Armin. From facundo at taniquetil.com.ar Mon Jun 11 15:33:01 2007 From: facundo at taniquetil.com.ar (Facundo Batista) Date: Mon, 11 Jun 2007 13:33:01 +0000 (UTC) Subject: [Python-Dev] Santa Fe Python Day report Message-ID: It was very succesful, around +300 people assisted, and there were a lot of interesting talks (two introductory talks, Turbogears, PyWeek, Zope 3, security, creating 3D games, Plone, automatic security testings, concurrency, and programming the OLPC). I want to thanks the PSF for the received support. 
Python is developing interestingly in Argentina, and this Python Days are both a prove of that, and a way to get more Python developers. Some links: Santa Fe Python Day: http://www.python-santafe.com.ar/ Python Argentina: http://www.python.com.ar/moin Regards, -- . Facundo . Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From cfbolz at gmx.de Mon Jun 11 17:23:10 2007 From: cfbolz at gmx.de (Carl Friedrich Bolz) Date: Mon, 11 Jun 2007 17:23:10 +0200 Subject: [Python-Dev] Question about dictobject.c:lookdict_string In-Reply-To: References: Message-ID: <466D68DE.6020907@gmx.de> Eyal Lotem wrote: > My question is specifically regarding the transition back from > lookdict_string (the initial value) to the general lookdict. > > Currently, when a string-only dict is trying to look up any > non-string, it reverts back to a general lookdict. > > Wouldn't it be better (especially in the more important case of a > string-key-only dict), to revert to the generic lookdict when a > non-string is inserted to the dict, rather than when one is being > searched? [...] > This does not seem like a significant issue, but as I know a lot of > effort went into optimizing dicts, I was wondering if I am missing > something here. Yes, you are: when doing a lookup with a non-string-key, that key could be an instance of a class that has __hash__ and __eq__ implementations that make the key compare equal to some string that is in the dictionary. So you need to change to lookdict, otherwise that lookup might fail. Cheers, Carl Friedrich Bolz From greg.ewing at canterbury.ac.nz Tue Jun 12 10:01:09 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 12 Jun 2007 20:01:09 +1200 Subject: [Python-Dev] 2.5 slower than 2.4 for some things? Message-ID: <466E52C5.5040906@canterbury.ac.nz> I've had a report from a user that Plex runs about half as fast in 2.5 as it did in 2.4. In particular, the NFA-to-DFA conversion phase, which does a lot of messing about with dicts representing mappings between sets of states. Does anyone in the Ministry for Making Python Blazingly fast happen to know of some change that might have pessimised things in this area? -- Greg -------------- next part -------------- An embedded message was scrubbed... From: Christian Kristukat Subject: plex performance Date: Sat, 09 Jun 2007 21:53:03 +0900 Size: 29661 Url: http://mail.python.org/pipermail/python-dev/attachments/20070612/ba3a7ac6/attachment-0001.mht From greg.ewing at canterbury.ac.nz Tue Jun 12 10:10:26 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 12 Jun 2007 20:10:26 +1200 Subject: [Python-Dev] Instance variable access and descriptors In-Reply-To: <20070609232842.005993A4060@sparrow.telecommunity.com> References: <20070609232842.005993A4060@sparrow.telecommunity.com> Message-ID: <466E54F2.20002@canterbury.ac.nz> Phillip J. Eby wrote: > ...at the cost of slowing down access to properties and __slots__, by > adding an *extra* dictionary lookup there. Rather than spend time tinkering with the lookup order, it might be more productive to look into implementing a cache for attribute lookups. That would help with method lookups as well, which are probably more frequent than instance var accesses. 
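A very rough sketch of the idea in plain Python, ignoring the C level entirely: a per-class cache that remembers where in the MRO a name was found. The mro_cache name and its methods are invented for illustration only; the CACHE_ATTR notes quoted earlier in the thread describe the invalidation rules a real version would need.

    class mro_cache(object):
        """Illustration only: remember which class in the MRO supplies a name.
        Works with new-style classes (uses __mro__)."""
        def __init__(self, cls):
            self.cls = cls
            self.cache = {}                 # name -> (found, value)

        def lookup(self, name):
            try:
                return self.cache[name]
            except KeyError:
                for c in self.cls.__mro__:
                    if name in c.__dict__:
                        entry = (True, c.__dict__[name])
                        break
                else:
                    entry = (False, None)
                self.cache[name] = entry
                return entry

        def invalidate(self, name):
            # must run whenever the class or any base gains or loses 'name'
            self.cache.pop(name, None)

    class A(object):
        x = 1
    class B(A):
        pass

    cache = mro_cache(B)
    print cache.lookup('x')    # (True, 1); repeated lookups are plain dict hits

Repeated lookups of the same name then skip the walk over __mro__ entirely; the hard part, which the quoted notes spend most of their space on, is invalidating entries cheaply when a class or one of its bases is mutated.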
-- Greg From ferringb at gmail.com Tue Jun 12 13:13:01 2007 From: ferringb at gmail.com (Brian Harring) Date: Tue, 12 Jun 2007 04:13:01 -0700 Subject: [Python-Dev] Instance variable access and descriptors In-Reply-To: <466E54F2.20002@canterbury.ac.nz> References: <20070609232842.005993A4060@sparrow.telecommunity.com> <466E54F2.20002@canterbury.ac.nz> Message-ID: <20070612111301.GF5778@seldon> On Tue, Jun 12, 2007 at 08:10:26PM +1200, Greg Ewing wrote: > Phillip J. Eby wrote: > > ...at the cost of slowing down access to properties and __slots__, by > > adding an *extra* dictionary lookup there. > > Rather than spend time tinkering with the lookup order, > it might be more productive to look into implementing > a cache for attribute lookups. That would help with > method lookups as well, which are probably more > frequent than instance var accesses. Was wondering the same; specifically, hijacking pep280 celldict appraoch for this. Downside, this would break code that tries to do PyDict_* calls on a class tp_dict; haven't dug extensively, but I'm sure there are a few out there. Main thing I like about that approach is that it avoids the staleness verification crap, single lookup- it's there or it isn't. It would also be resuable for 280. If folks don't much like the hit from tracing back to a cell holding an actual value, could always implement it such that upon change, the change propagates out to instances registered (iow, change a.__dict__, it notifies b.__dict__ of the change, etc, till it hits a point where the change doesn't need to go further). ~harring -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20070612/435ee92d/attachment.pgp From ocean at m2.ccsnet.ne.jp Tue Jun 12 14:15:49 2007 From: ocean at m2.ccsnet.ne.jp (ocean) Date: Tue, 12 Jun 2007 21:15:49 +0900 Subject: [Python-Dev] 2.5 slower than 2.4 for some things? References: <466E52C5.5040906@canterbury.ac.nz> Message-ID: <005901c7aceb$6a285a70$0300a8c0@whiterabc2znlh> > I've had a report from a user that Plex runs about half > as fast in 2.5 as it did in 2.4. In particular, the > NFA-to-DFA conversion phase, which does a lot of > messing about with dicts representing mappings between > sets of states. > > Does anyone in the Ministry for Making Python Blazingly > fast happen to know of some change that might have > pessimised things in this area? Hello, I investigated. On my environment, consumed time is E:\Plex-1.1.5>py24 plex_test2.py 0.710999965668 E:\Plex-1.1.5>py25 plex_test2.py 0.921999931335 And after I applied this patch to Plex/Machines, (make `Node' new style class) 62c62 < class Node: --- > class Node(object): E:\Plex-1.1.5>py24 plex_test2.py 0.401000022888 E:\Plex-1.1.5>py25 plex_test2.py 0.350999832153 So, probably hash, comparation mechanizm of old/new style class has changed. # improved for new style class, worse for old style class. Maybe optimized for new style class? Try this for minimum test. 
import timeit init = """ class Class: pass c1 = Class() c2 = Class() """ t1 = timeit.Timer(""" c1 < c2 """, init) t2 = timeit.Timer(""" hash(c1) hash(c2) """, init) print t1.timeit(1000) print t2.timeit(1000) From eyal.lotem at gmail.com Tue Jun 12 15:22:15 2007 From: eyal.lotem at gmail.com (Eyal Lotem) Date: Tue, 12 Jun 2007 16:22:15 +0300 Subject: [Python-Dev] Question about dictobject.c:lookdict_string In-Reply-To: <466D68DE.6020907@gmx.de> References: <466D68DE.6020907@gmx.de> Message-ID: On 6/11/07, Carl Friedrich Bolz wrote: > Eyal Lotem wrote: > > My question is specifically regarding the transition back from > > lookdict_string (the initial value) to the general lookdict. > > > > Currently, when a string-only dict is trying to look up any > > non-string, it reverts back to a general lookdict. > > > > Wouldn't it be better (especially in the more important case of a > > string-key-only dict), to revert to the generic lookdict when a > > non-string is inserted to the dict, rather than when one is being > > searched? > [...] > > This does not seem like a significant issue, but as I know a lot of > > effort went into optimizing dicts, I was wondering if I am missing > > something here. > > Yes, you are: when doing a lookup with a non-string-key, that key could > be an instance of a class that has __hash__ and __eq__ implementations > that make the key compare equal to some string that is in the > dictionary. So you need to change to lookdict, otherwise that lookup > might fail. Ah, thanks for clarification. But doesn't it make sense to only revert that single lookup, and not modify the function ptr until the dict contains a non-string? Eyal From orsenthil at gmail.com Tue Jun 12 22:39:10 2007 From: orsenthil at gmail.com (Senthil Kumaran) Date: Wed, 13 Jun 2007 02:09:10 +0530 Subject: [Python-Dev] [RFC] urlparse - parse query facility Message-ID: <7c42eba10706121339n2d04cadapee69443c5636d9@mail.gmail.com> Hi all, This mail is a request for comments on changes to urlparse module. We understand that urlparse returns the 'complete query' value as the query component and does not provide the facilities to separate the query components. User will have to use the cgi module (cgi.parse_qs) to get the query parsed. There has been a discussion in the past, on having a method of parse query string available from urlparse module itself. [1] To implement the query parse feature in urlparse module, we can: a) import cgi and call cgi module's query_ps. This approach will have problems as it i) imports cgi for urlparse module. ii) cgi module in turn imports urllib and urlparse. b) Implement a stand alone query parsing facility in urlparse *AS IN* cgi module. Below method implements the urlparse_qs(url, keep_blank_values,strict_parsing) that will help in parsing the query component of the url. It behaves same as the cgi.parse_qs. Please let me know your comments on the below code. ---------------------------------------------------------------------- def unquote(s): """unquote('abc%20def') -> 'abc def'.""" res = s.split('%') for i in xrange(1, len(res)): item = res[i] try: res[i] = _hextochr[item[:2]] + item[2:] except KeyError: res[i] = '%' + item except UnicodeDecodeError: res[i] = unichr(int(item[:2], 16)) + item[2:] return "".join(res) def urlparse_qs(url, keep_blank_values=0, strict_parsing=0): """Parse a URL query string and return the components as a dictionary. 
Based on the cgi.parse_qs method.This is a utility function provided with urlparse so that users need not use cgi module for parsing the url query string. Arguments: url: URL with query string to be parsed keep_blank_values: flag indicating whether blank values in URL encoded queries should be treated as blank strings. A true value indicates that blanks should be retained as blank strings. The default false value indicates that blank values are to be ignored and treated as if they were not included. strict_parsing: flag indicating what to do with parsing errors. If false (the default), errors are silently ignored. If true, errors raise a ValueError exception. """ scheme, netloc, url, params, querystring, fragment = urlparse(url) pairs = [s2 for s1 in querystring.split('&') for s2 in s1.split(';')] query = [] for name_value in pairs: if not name_value and not strict_parsing: continue nv = name_value.split('=', 1) if len(nv) != 2: if strict_parsing: raise ValueError, "bad query field: %r" % (name_value,) # Handle case of a control-name with no equal sign if keep_blank_values: nv.append('') else: continue if len(nv[1]) or keep_blank_values: name = unquote(nv[0].replace('+', ' ')) value = unquote(nv[1].replace('+', ' ')) query.append((name, value)) dict = {} for name, value in query: if name in dict: dict[name].append(value) else: dict[name] = [value] return dict ---------------------------------------------------------------------- Testing: $ python Python 2.6a0 (trunk, Jun 10 2007, 12:04:03) [GCC 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import urlparse >>> dir(urlparse) ['BaseResult', 'MAX_CACHE_SIZE', 'ParseResult', 'SplitResult', '__all__', '__builtins__', '__doc__', '__file__', '__name__', '_parse_cache', '_splitnetloc', '_splitparams', 'clear_cache', 'non_hierarchical', 'scheme_chars', 'test', 'test_input', 'unquote', 'urldefrag', 'urljoin', 'urlparse', 'urlparse_qs', 'urlsplit', 'urlunparse', 'urlunsplit', 'uses_fragment', 'uses_netloc', 'uses_params', 'uses_query', 'uses_relative'] >>> URL = >>> 'http://www.google.com/search?hl=en&lr=&ie=UTF-8&oe=utf-8&q=south+africa+travel+cape+town' >>> print urlparse.urlparse_qs(URL) {'q': ['south africa travel cape town'], 'oe': ['utf-8'], 'ie': ['UTF-8'], 'hl': ['en']} >>> print urlparse.urlparse_qs(URL,keep_blank_values=1) {'q': ['south africa travel cape town'], 'ie': ['UTF-8'], 'oe': ['utf-8'], 'lr': [''], 'hl': ['en']} >>> Thanks, Senthil [1] http://mail.python.org/pipermail/tutor/2002-August/016823.html -- O.R.Senthil Kumaran http://phoe6.livejournal.com From orsenthil at gmail.com Tue Jun 12 22:58:42 2007 From: orsenthil at gmail.com (Senthil Kumaran) Date: Wed, 13 Jun 2007 02:28:42 +0530 Subject: [Python-Dev] Requesting commit access to python sandbox. Cleanup urllib2 - Summer of Code 2007 Project Message-ID: <7c42eba10706121358g3040faaeu1682e5a460cff557@mail.gmail.com> Hi, I am a student participant of Google Summer of Code 2007 and I am working on the cleanup task of urllib2, with Skip as my mentor. I would like to request for a commit access to the Python Sandbox for implementing the changes as part of the project. I have attached by SSH Public keys. preferred name : senthil.kumaran I am following up and adding comments to the urllib related bugs at sf.net page. I would also like to request addition of my sourceforge id : orsenthil to the python project, so I can close the defects raised against urllib modules. 
Summer of Code Project: http://code.google.com/soc/psf/appinfo.html?csaid=E73A6612F80229B6 The project actually commenced on May 28th itself. But, there was a delay from my side to get started. Ivan Sutherland's essay on Technology and Courage [1] did some good thing to me. :-) Thanks, Senthil [1] http://research.sun.com/techrep/Perspectives/smli_ps-1.pdf#search=%22sutherland%20courage%22 -- O.R.Senthil Kumaran http://phoe6.livejournal.com -------------- next part -------------- A non-text attachment was scrubbed... Name: id_rsa.pub Type: application/octet-stream Size: 228 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20070613/e0d9d5e5/attachment.obj From greg.ewing at canterbury.ac.nz Wed Jun 13 02:37:23 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 13 Jun 2007 12:37:23 +1200 Subject: [Python-Dev] 2.5 slower than 2.4 for some things? In-Reply-To: <005901c7aceb$6a285a70$0300a8c0@whiterabc2znlh> References: <466E52C5.5040906@canterbury.ac.nz> <005901c7aceb$6a285a70$0300a8c0@whiterabc2znlh> Message-ID: <466F3C43.5090608@canterbury.ac.nz> ocean wrote: > So, probably hash, comparation mechanizm of old/new style class has changed. > # improved for new style class, worse for old style class. Maybe optimized > for new style class? Thanks -- it looks like there's a simple solution that will make Plex even faster! I'll pass this on to the OP. -- Greg From ckkart at hoc.net Wed Jun 13 03:15:38 2007 From: ckkart at hoc.net (Christian K) Date: Wed, 13 Jun 2007 10:15:38 +0900 Subject: [Python-Dev] 2.5 slower than 2.4 for some things? In-Reply-To: <005901c7aceb$6a285a70$0300a8c0@whiterabc2znlh> References: <466E52C5.5040906@canterbury.ac.nz> <005901c7aceb$6a285a70$0300a8c0@whiterabc2znlh> Message-ID: ocean wrote: >> I've had a report from a user that Plex runs about half >> as fast in 2.5 as it did in 2.4. In particular, the >> NFA-to-DFA conversion phase, which does a lot of >> messing about with dicts representing mappings between >> sets of states. That was me. >> Does anyone in the Ministry for Making Python Blazingly >> fast happen to know of some change that might have >> pessimised things in this area? > > Hello, I investigated. On my environment, consumed time is > > E:\Plex-1.1.5>py24 plex_test2.py > 0.710999965668 > > E:\Plex-1.1.5>py25 plex_test2.py > 0.921999931335 > > And after I applied this patch to Plex/Machines, (make `Node' new style > class) > > 62c62 > < class Node: > --- >> class Node(object): > > E:\Plex-1.1.5>py24 plex_test2.py > 0.401000022888 > > E:\Plex-1.1.5>py25 plex_test2.py > 0.350999832153 > Nice!. Meanwhile I tried to replace the parsing I did with Plex by re.Scanner. And again there is a remarkable speed difference. 
Again python2.5 is slower: try: from re import Scanner except: from sre import Scanner pars = {} order = [] count = 0 def par(scanner,name): global count, order, pars if name in ['caller','e','pi']: return name if name not in pars.keys(): pars[name] = ('ns', count) order.append(name) ret = 'a[%d]'%count count += 1 else: ret = 'a[%d]'%(order.index(name)) return ret scanner = Scanner([ (r"x", lambda y,x: x), (r"[a-zA-Z]+\.", lambda y,x: x), (r"[a-z]+\(", lambda y,x: x), (r"[a-zA-Z_]\w*", par), (r"\d+\.\d*", lambda y,x: x), (r"\d+", lambda y,x: x), (r"\+|-|\*|/", lambda y,x: x), (r"\s+", None), (r"\)+", lambda y,x: x), (r"\(+", lambda y,x: x), (r",", lambda y,x: x), ]) import profile import pstats def run(): arg = '+amp*exp(-(x-pos)/fwhm)' for i in range(100): scanner.scan(arg) profile.run('run()','profscanner') p = pstats.Stats('profscanner') p.strip_dirs() p.sort_stats('cumulative') p.print_stats() Christian From thopfin at umich.edu Tue Jun 5 18:55:07 2007 From: thopfin at umich.edu (Todd Hopfinger) Date: Tue, 5 Jun 2007 12:55:07 -0400 Subject: [Python-Dev] TLSAbruptCloseError Message-ID: <000801c7a792$45e15e90$d1a41bb0$@edu> I am using TLS Lite and J2ME SecureConnection for the purposes of encrypting traffic to/from a Java Midlet client and a multithreaded Python server. However, I encounter a TLSAbruptCloseError. I have tried to determine the cause of the exception to no avail. I understand that it has to do with close_notify alerts. My abbreviated code follows. // Server def sslSockRecv(conn, num): data = '' while len(data) < num: data = conn.recv(num - len(data)) # TLSAbruptCloseError thrown here if len(data) == 0: raise NotEnoughBytes ('Too few bytes from client. Expected ' + str(num) + '; got ' + str(len(data)), num, len(data)) return data sslSockRecv() throws NotEnoughBytes exception to indicate that the client has closed the connection. The NotEnoughBytes exception handler subsequently closes the SSL connection and then the underlying socket. // Client import javax.microedition.io.SecureConnection; sc = (SecureConnection)Connector.open("ssl://host:port"); inStream = sc.openInputStream(); outStream = sc.openOutputStream(); // read/write some data using streams if (inStream != null) inStream.close(); if (outStream != null) outStream.close(); if (sc != null) sc.close(); When using the Java phone emulator, SSLDump indicates after the application data portions. 3 13 0.3227 (0.0479) C>SV3.0(22) Alert level warning value close_notify 3 0.3228 (0.0000) C>S TCP FIN 3 14 0.3233 (0.0005) S>CV3.0(22) Alert level warning value close_notify However, the server doesn't throw a TLSAbruptCloseError when using the emulator. Using the actual phone does cause a TLSAbruptCloseError on the server but SSLDump reports no errors, just. 4 1.6258 (0.7012) C>S TCP FIN 4 1.6266 (0.0008) S>C TCP FIN Any thoughts? Todd Hopfinger -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20070605/3d14619b/attachment.htm From rolandgeibel at yahoo.de Wed Jun 6 18:59:54 2007 From: rolandgeibel at yahoo.de (Roland Geibel) Date: Wed, 6 Jun 2007 18:59:54 +0200 (CEST) Subject: [Python-Dev] minimal configuration for python on a DSP (C64xx family of TI) Message-ID: <433272.47901.qm@web27412.mail.ukl.yahoo.com> Dear all. We want to make python run on DSP processors (C64xx family of TI). Which would be a minimal configuration (of modules, C-files, ... ) to make it start running (without all of the things useful to add, once it runs). 
Any hints welcome
Roland Geibel
Geibel at vision-comp.com

From shredwheat at gmail.com Fri Jun 8 05:31:23 2007
From: shredwheat at gmail.com (Pete Shinners)
Date: Thu, 7 Jun 2007 20:31:23 -0700
Subject: [Python-Dev] Representation of 'nan'
Message-ID:

The repr() for a float of 'inf' or 'nan' is generated as a string (not a string literal). Perhaps this is only important in how one defines repr(). I've filed a bug, but am not sure if there is a clear solution.
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1732212&group_id=5470

# Repr with a tuple of floats
>>> repr((1.0, 2.0, 3.0))
'(1.0, 2.0, 3.0)'
>>> eval(_)
(1.0, 2.0, 3.0)

# Repr with a tuple of floats, plus nan
>>> repr((1.0, float('nan'), 3.0))
'(1.0, nan, 3.0)'
>>> eval(_)
NameError: name 'nan' is not defined

There are a few alternatives I can think are fairly clean. I think I'd prefer any of these over the current 'nan' implementation. I don't think it is worth adding a nan literal into the language. But something could be changed so that repr of nan meant something. Best option in my opinion would be adding attributes to float, so that float.nan, float.inf, and float.ninf are accessable. This could also help with the odd situations of checking for these out of range values. With that in place, repr could return 'float.nan' instead of 'nan'. This would make the repr string evaluatable again. (In contexts where __builtins__ has not been molested) Another option could be for repr to return 'float("nan")' for these, which would also evaluate correctly. But this doesn't seem a clean use for repr. Is this worth even changing? It's just an irregularity that has come up and surprised a few of us developers.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20070607/0ceaaa66/attachment.html
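Until something like the proposed float.nan attribute exists, checking for these values has to lean on IEEE 754 behaviour itself. A small sketch of the usual idioms, assuming IEEE 754 doubles; the helper names are made up, and math.isnan/math.isinf are not available in 2.5:

    def isnan(x):
        # nan is the only float that compares unequal to itself
        return x != x

    def isinf(x):
        # an infinity is equal to itself, but multiplying it by zero gives nan
        return x == x and x * 0.0 != 0.0

    print isnan(float('nan')), isinf(1e300 * 1e300)   # True True on IEEE 754 platforms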
From martin at v.loewis.de Wed Jun 13 05:46:54 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 13 Jun 2007 05:46:54 +0200
Subject: [Python-Dev] TLSAbruptCloseError
In-Reply-To: <000801c7a792$45e15e90$d1a41bb0$@edu>
References: <000801c7a792$45e15e90$d1a41bb0$@edu>
Message-ID: <466F68AE.7020603@v.loewis.de>

> Any thoughts?

My main thought: this posting is off-topic for python-dev. This list is for the development of Python itself; use comp.lang.python for discussing development *with* Python. However, this may still be the wrong place - perhaps you better ask in a Java forum?

Regards, Martin

From arigo at tunes.org Wed Jun 13 09:46:41 2007
From: arigo at tunes.org (Armin Rigo)
Date: Wed, 13 Jun 2007 09:46:41 +0200
Subject: [Python-Dev] Instance variable access and descriptors
In-Reply-To: <466E54F2.20002@canterbury.ac.nz>
References: <20070609232842.005993A4060@sparrow.telecommunity.com> <466E54F2.20002@canterbury.ac.nz>
Message-ID: <20070613074641.GA8702@code0.codespeak.net>

Hi,

On Tue, Jun 12, 2007 at 08:10:26PM +1200, Greg Ewing wrote:
> Rather than spend time tinkering with the lookup order,
> it might be more productive to look into implementing
> a cache for attribute lookups.

See patch #1700288.

Armin

From jon+python-dev at unequivocal.co.uk Wed Jun 13 11:53:26 2007
From: jon+python-dev at unequivocal.co.uk (Jon Ribbens)
Date: Wed, 13 Jun 2007 10:53:26 +0100
Subject: [Python-Dev] TLSAbruptCloseError
In-Reply-To: <000801c7a792$45e15e90$d1a41bb0$@edu>
References: <000801c7a792$45e15e90$d1a41bb0$@edu>
Message-ID: <20070613095326.GY2531@snowy.squish.net>

On Tue, Jun 05, 2007 at 12:55:07PM -0400, Todd Hopfinger wrote:
> I am using TLS Lite and J2ME SecureConnection for the purposes of
> encrypting traffic to/from a Java Midlet client and a multithreaded Python
> server. However, I encounter a TLSAbruptCloseError. I have tried to
> determine the cause of the exception to no avail. I understand that it has
> to do with close_notify alerts. My abbreviated code follows.

It may or may not be your specific problem, but Microsoft SSL servers tend to just drop the TCP connection when they're done, rather than do a proper SSL shutdown. This tends to make errors such as the above, which you must then ignore.

From ocean at m2.ccsnet.ne.jp Wed Jun 13 19:17:25 2007
From: ocean at m2.ccsnet.ne.jp (ocean)
Date: Thu, 14 Jun 2007 02:17:25 +0900
Subject: [Python-Dev] 2.5 slower than 2.4 for some things?
References: <466E52C5.5040906@canterbury.ac.nz><005901c7aceb$6a285a70$0300a8c0@whiterabc2znlh>
Message-ID: <001701c7adde$b6dfe560$0300a8c0@whiterabc2znlh>

> Meanwhile I tried to replace the parsing I did with Plex by re.Scanner. And
> again there is a remarkable speed difference.
Again python2.5 is slower: > > try: > from re import Scanner > except: > from sre import Scanner > > pars = {} > order = [] > count = 0 > > def par(scanner,name): > global count, order, pars > > if name in ['caller','e','pi']: > return name > if name not in pars.keys(): > pars[name] = ('ns', count) > order.append(name) > ret = 'a[%d]'%count > count += 1 > else: > ret = 'a[%d]'%(order.index(name)) > return ret > > scanner = Scanner([ > (r"x", lambda y,x: x), > (r"[a-zA-Z]+\.", lambda y,x: x), > (r"[a-z]+\(", lambda y,x: x), > (r"[a-zA-Z_]\w*", par), > (r"\d+\.\d*", lambda y,x: x), > (r"\d+", lambda y,x: x), > (r"\+|-|\*|/", lambda y,x: x), > (r"\s+", None), > (r"\)+", lambda y,x: x), > (r"\(+", lambda y,x: x), > (r",", lambda y,x: x), > ]) > > import profile > import pstats > > def run(): > arg = '+amp*exp(-(x-pos)/fwhm)' > for i in range(100): > scanner.scan(arg) > > profile.run('run()','profscanner') > p = pstats.Stats('profscanner') > p.strip_dirs() > p.sort_stats('cumulative') > p.print_stats() Well, I tried this script, there was no big difference. Python2.4 0.772sec Python2.5 0.816sec Probably I found one reason comparation for classic style class is slower on Python2.5. Comparation function instance_compare() calls PyErr_GivenExceptionMatches(), and it was just flag operation on 2.4. But on 2.5, probably related to introduction of BaseException, it checks inherited type tuple. (ie: PyExceptionInstance_Check) From kumar.mcmillan at gmail.com Wed Jun 13 22:30:57 2007 From: kumar.mcmillan at gmail.com (Kumar McMillan) Date: Wed, 13 Jun 2007 15:30:57 -0500 Subject: [Python-Dev] sys.setdefaultencoding() vs. csv module + unicode Message-ID: I'm seeing conflicting opinions on whether to put sys.setdefaultencoding('utf-8') in sitecustomize.py or not ([1] vs. [2]) and frankly I'm confused. The csv module says it's not unicode safe but the 2.5 docs [3] have a workaround for this. While the workaround says nothing about sys.setdefaultencoding() it simply does not work with the default encoding, "ascii." Is this _the_ problem with the csv module? Should I give up and use XML? Below is code that works vs. code that doesn't. Am I interpretting the workaround from the docs wrong? If so, can someone please give me a hint ;) I should also point out that I've tried this with the StringIO queued approach (from the workaround) but that doesn't solve anything. 1) with the default encoding : kumar$ python2.5 Python 2.5 (r25:51918, Sep 19 2006, 08:49:13) [GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin >>> import sys, csv, codecs >>> f = codecs.open('unicsv.csv','wb','utf-8') >>> w = csv.writer(f) >>> w.writerow([u'lang', u'espa\xa4ol']) Traceback (most recent call last): File "", line 1, in UnicodeEncodeError: 'ascii' codec can't encode character u'\xa4' in position 4: ordinal not in range(128) >>> 2) with custom encoding : kumar$ python2.5 -S Python 2.5 (r25:51918, Sep 19 2006, 08:49:13) [GCC 4.0.1 (Apple Computer, Inc. 
build 5341)] on darwin >>> import sys, csv, codecs >>> sys.setdefaultencoding('utf-8') >>> f = codecs.open('unicsv.csv','wb','utf-8') >>> w = csv.writer(f) >>> w.writerow([u'lang', u'espa\xa4ol']) >>> f.close() thanks, Kumar [1] http://mail.python.org/pipermail/python-dev/2007-June/073593.html [2] http://diveintopython.org/xml_processing/unicode.html [3] http://docs.python.org/lib/csv-examples.html#csv-examples From nnorwitz at gmail.com Thu Jun 14 00:55:49 2007 From: nnorwitz at gmail.com (Neal Norwitz) Date: Wed, 13 Jun 2007 15:55:49 -0700 Subject: [Python-Dev] 2.5 slower than 2.4 for some things? In-Reply-To: <001701c7adde$b6dfe560$0300a8c0@whiterabc2znlh> References: <466E52C5.5040906@canterbury.ac.nz> <005901c7aceb$6a285a70$0300a8c0@whiterabc2znlh> <001701c7adde$b6dfe560$0300a8c0@whiterabc2znlh> Message-ID: On 6/13/07, ocean wrote: > > Meanwhile I tried to replace the parsing I did with Plex by re.Scanner. > And > > again there is a remarkable speed difference. Again python2.5 is slower: > > > > try: > > from re import Scanner > > except: > > from sre import Scanner > > > > pars = {} > > order = [] > > count = 0 > > > > def par(scanner,name): > > global count, order, pars > > > > if name in ['caller','e','pi']: > > return name > > if name not in pars.keys(): > > pars[name] = ('ns', count) > > order.append(name) > > ret = 'a[%d]'%count > > count += 1 > > else: > > ret = 'a[%d]'%(order.index(name)) > > return ret > > > > scanner = Scanner([ > > (r"x", lambda y,x: x), > > (r"[a-zA-Z]+\.", lambda y,x: x), > > (r"[a-z]+\(", lambda y,x: x), > > (r"[a-zA-Z_]\w*", par), > > (r"\d+\.\d*", lambda y,x: x), > > (r"\d+", lambda y,x: x), > > (r"\+|-|\*|/", lambda y,x: x), > > (r"\s+", None), > > (r"\)+", lambda y,x: x), > > (r"\(+", lambda y,x: x), > > (r",", lambda y,x: x), > > ]) > > > > import profile > > import pstats > > > > def run(): > > arg = '+amp*exp(-(x-pos)/fwhm)' > > for i in range(100): > > scanner.scan(arg) > > > > profile.run('run()','profscanner') > > p = pstats.Stats('profscanner') > > p.strip_dirs() > > p.sort_stats('cumulative') > > p.print_stats() > > Well, I tried this script, there was no big difference. > Python2.4 0.772sec > Python2.5 0.816sec > > Probably I found one reason comparation for classic style class is slower on > Python2.5. > Comparation function instance_compare() calls PyErr_GivenExceptionMatches(), > and it was just flag operation on 2.4. But on 2.5, probably related to > introduction of BaseException, > it checks inherited type tuple. (ie: PyExceptionInstance_Check) I'm curious about the speed of 2.6 (trunk). I think this should have become faster due to the introduction of fast subtype checks (he says without looking at the code). n From jimjjewett at gmail.com Thu Jun 14 01:27:24 2007 From: jimjjewett at gmail.com (Jim Jewett) Date: Wed, 13 Jun 2007 19:27:24 -0400 Subject: [Python-Dev] [RFC] urlparse - parse query facility Message-ID: > a) import cgi and call cgi module's query_ps. [circular imports] or > b) Implement a stand alone query parsing facility in urlparse *AS IN* > cgi module. Assuming (b), please remove the (code for the) parsing from the cgi module, and just import it back from urlparse (or urllib). Since cgi already imports urllib (which imports urlparse), this isn't adding any dependencies -- but it keeps the code in a single location. -jJ From ocean at m2.ccsnet.ne.jp Thu Jun 14 04:37:08 2007 From: ocean at m2.ccsnet.ne.jp (ocean) Date: Thu, 14 Jun 2007 11:37:08 +0900 Subject: [Python-Dev] 2.5 slower than 2.4 for some things? 
References: <466E52C5.5040906@canterbury.ac.nz> <005901c7aceb$6a285a70$0300a8c0@whiterabc2znlh> <001701c7adde$b6dfe560$0300a8c0@whiterabc2znlh> Message-ID: <002201c7ae2c$e7920910$0300a8c0@whiterabc2znlh> > > Probably I found one reason comparation for classic style class is slower on > > Python2.5. > > Comparation function instance_compare() calls PyErr_GivenExceptionMatches(), > > and it was just flag operation on 2.4. But on 2.5, probably related to > > introduction of BaseException, > > it checks inherited type tuple. (ie: PyExceptionInstance_Check) > > I'm curious about the speed of 2.6 (trunk). I think this should have > become faster due to the introduction of fast subtype checks (he says > without looking at the code). > > n > Yes, I confirmed trunk is faster than 2.5. /////////////////////////////////////// // Code import timeit t = timeit.Timer(""" f1 < f2 """, """ class Foo: pass f1 = Foo() f2 = Foo() """) print t.timeit(10000) /////////////////////////////////////// // Result release-maint24 0.337sec release-maint25 0.625sec trunk 0.494sec ////////////////////////////////////// // Result of plex_test2.py release-maint24 2.944sec release-maint25 4.026sec trunk 3.625sec From orsenthil at users.sourceforge.net Thu Jun 14 04:43:44 2007 From: orsenthil at users.sourceforge.net (O.R.Senthil Kumaran) Date: Thu, 14 Jun 2007 08:13:44 +0530 Subject: [Python-Dev] [RFC] urlparse - parse query facility In-Reply-To: References: Message-ID: <20070614024344.GA3321@gmail.com> * Jim Jewett [2007-06-13 19:27:24]: > > a) import cgi and call cgi module's query_ps. [circular imports] > > or > > > b) Implement a stand alone query parsing facility in urlparse *AS IN* > > cgi module. > > Assuming (b), please remove the (code for the) parsing from the cgi > module, and just import it back from urlparse (or urllib). Since cgi > already imports urllib (which imports urlparse), this isn't adding any > dependencies -- but it keeps the code in a single location. Sure, thats a good idea as I see it. It wont break anything as well. Thanks, -- O.R.Senthil Kumaran http://uthcode.sarovar.org From fdrake at acm.org Thu Jun 14 04:42:21 2007 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed, 13 Jun 2007 22:42:21 -0400 Subject: [Python-Dev] [RFC] urlparse - parse query facility In-Reply-To: <7c42eba10706121339n2d04cadapee69443c5636d9@mail.gmail.com> References: <7c42eba10706121339n2d04cadapee69443c5636d9@mail.gmail.com> Message-ID: <200706132242.21436.fdrake@acm.org> On Tuesday 12 June 2007, Senthil Kumaran wrote: > This mail is a request for comments on changes to urlparse module. We > understand that urlparse returns the 'complete query' value as the query > component and does not > provide the facilities to separate the query components. User will have to > use the cgi module (cgi.parse_qs) to get the query parsed. I agree with the comments Jim provided. > Below method implements the urlparse_qs(url, > keep_blank_values,strict_parsing) that will help in parsing the query > component of the url. It behaves same as the cgi.parse_qs. Except that it takes a URL, not only a query string. > def urlparse_qs(url, keep_blank_values=0, strict_parsing=0): ... > scheme, netloc, url, params, querystring, fragment = urlparse(url) I see no reason to incorporate the URL splitting into the function; the existing function signatures for cgi.parse_qs and cgi.parse_qsl are sufficient. It may be convenient to add methods to the urlparse.BaseResult class providing access to the parsed version of the query on the instance. 
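For reference, a short, untested sketch (Python 2.5) of the calling pattern that split implies today, with the URL taken from the test session earlier in the thread and only the query component handed to the query-string parser:

    import urlparse, cgi

    url = 'http://www.google.com/search?hl=en&lr=&ie=UTF-8&oe=utf-8&q=south+africa+travel+cape+town'
    query = urlparse.urlsplit(url).query
    print cgi.parse_qs(query)
    # {'q': ['south africa travel cape town'], 'oe': ['utf-8'], 'ie': ['UTF-8'], 'hl': ['en']}
    # (key order may vary; the blank 'lr' is dropped unless keep_blank_values is used)

A parse_qs in urlparse with the cgi signature slots straight into that last call, and a convenience method on the result object would simply apply it to self.query.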
-Fred
-- Fred L. Drake, Jr.

From martin at v.loewis.de Thu Jun 14 08:47:43 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 14 Jun 2007 08:47:43 +0200
Subject: [Python-Dev] sys.setdefaultencoding() vs. csv module + unicode
In-Reply-To: References: Message-ID: <4670E48F.3090903@v.loewis.de>

> The csv module says it's not unicode safe but the 2.5 docs [3] have a
> workaround for this. While the workaround says nothing about
> sys.setdefaultencoding() it simply does not work with the default
> encoding, "ascii." Is this _the_ problem with the csv module? Should
> I give up and use XML? Below is code that works vs. code that
> doesn't. Am I interpretting the workaround from the docs wrong?

These questions are off-topic for python-dev; please ask them on comp.lang.python instead. python-dev is for the development *of* Python, not for the development *with* Python.

> kumar$ python2.5
> Python 2.5 (r25:51918, Sep 19 2006, 08:49:13)
> [GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
>>>> import sys, csv, codecs
>>>> f = codecs.open('unicsv.csv','wb','utf-8')
>>>> w = csv.writer(f)
>>>> w.writerow([u'lang', u'espa\xa4ol'])

What you should do here is

def encoderow(r):
    return [s.encode("utf-8") for s in r]

f = open('unicsv.csv', 'wb')
w = csv.writer(f)
w.writerow(encoderow([u'lang', u'espa\xa4ol']))

IOW, you need to encode *before* passing the strings to the CSV module, not afterwards. If it is too tedious for you to put in the encoderow calls all the time, you can write a wrapper for CSV writers which transparently encodes all Unicode strings.

Regards, Martin

From amk at amk.ca Thu Jun 14 21:50:34 2007
From: amk at amk.ca (A.M. Kuchling)
Date: Thu, 14 Jun 2007 15:50:34 -0400
Subject: [Python-Dev] Outcome of Georg's documentation work?
Message-ID: <20070614195034.GA18011@localhost.localdomain>

What was the outcome of the discussion of Georg Brandl's reworked documentation ("The docs, reloaded")? Was any decision made on whether to go with reST, or on what changes need to be made before that's possible? Did Fred Drake say what he thought?

Georg, do you want access to python.org to host a version of the docs there?

--amk

From orsenthil at users.sourceforge.net Sat Jun 16 06:04:59 2007
From: orsenthil at users.sourceforge.net (O.R.Senthil Kumaran)
Date: Sat, 16 Jun 2007 09:34:59 +0530
Subject: [Python-Dev] [RFC] urlparse - parse query facility
In-Reply-To: <200706132242.21436.fdrake@acm.org>
References: <7c42eba10706121339n2d04cadapee69443c5636d9@mail.gmail.com> <200706132242.21436.fdrake@acm.org>
Message-ID: <20070616040459.GA3598@gmail.com>

* Fred L. Drake, Jr. [2007-06-13 22:42:21]:
> I see no reason to incorporate the URL splitting into the function; the
> existing function signatures for cgi.parse_qs and cgi.parse_qsl are
> sufficient.

Thanks for the comments, Fred. I understand that having parse_qs and parse_qsl with those signatures in the urlparse module, and invoking them from the cgi module, will be correct. urlparse will contain parse_qs and parse_qsl, which take the query string (not the URL) and the optional arguments keep_blank_values and strict_parsing (same as cgi). http://deadbeefbabe.org/paste/5154

> It may be convenient to add methods to the urlparse.BaseResult class providing
> access to the parsed version of the query on the instance.
>
This is where I spent a little bit of time, and I am unable to work out conclusively how it can be done. Someone on the list, please help me.
* parse_qs or parse_qsl will be invoked on the query component separately by the user. * If parsed query needs to be available at the instance as a convenience function, then we will have to assume the keep_blank_values and strict_parsing values. * Coding question: Without retyping the bunch of code again in the BaseResult, would is the possible to call parse_qs/parse_qsl function on self.query and provide the result? Basically, what would be a good of doing it. Thanks, Senthil -- O.R.Senthil Kumaran http://uthcode.sarovar.org From fdrake at acm.org Sat Jun 16 07:06:59 2007 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Sat, 16 Jun 2007 01:06:59 -0400 Subject: [Python-Dev] [RFC] urlparse - parse query facility In-Reply-To: <20070616040459.GA3598@gmail.com> References: <7c42eba10706121339n2d04cadapee69443c5636d9@mail.gmail.com> <200706132242.21436.fdrake@acm.org> <20070616040459.GA3598@gmail.com> Message-ID: <200706160107.00288.fdrake@acm.org> On Saturday 16 June 2007, O.R.Senthil Kumaran wrote: > The urlparse will cotain parse_qs and parse_qsl takes the query string > (not url) and with optional arguments keep_blank_values and strict_parsing > (same as cgi). > > http://deadbeefbabe.org/paste/5154 Looks good. > > It may be convenient to add methods to the urlparse.BaseResult class > > providing access to the parsed version of the query on the instance. ... > * parse_qs or parse_qsl will be invoked on the query component separately > by the user. Yes; this doesn't change, really. Methods would still need to be invoked separately, but the query string doesn't need to be passed in; it's part of the data object. > * If parsed query needs to be available at the instance as a convenience > function, then we will have to assume the keep_blank_values and > strict_parsing values. If it were a property, yes, but I think a method on the result object makes more sense because we don't want to assume values for these arguments. > * Coding question: Without retyping the bunch of code again in the > BaseResult, would is the possible to call parse_qs/parse_qsl function on > self.query and provide the result? Basically, what would be a good of > doing it. That's what I was thinking. Just add something like this to BaseResult (untested): def parsedQuery(self, keep_blank_values=False, strict_parsing=False): return parse_qs( self.query, keep_blank_values=keep_blank_values, strict_parsing=strict_parsing) def parsedQueryList(self, keep_blank_values=False, strict_parsing=False): return parse_qsl( self.query, keep_blank_values=keep_blank_values, strict_parsing=strict_parsing) Whether there's a real win with this is unclear. I generally prefer having an object that represents the URL and lets me get what I want from it, rather than having to pass the bits around to separate parsing functions. The result objects were added in 2.5, though, and I've no real idea how widely they've been adopted. -Fred -- Fred L. Drake, Jr. "Chaos is the score upon which reality is written." --Henry Miller From orsenthil at users.sourceforge.net Sat Jun 16 10:41:01 2007 From: orsenthil at users.sourceforge.net (O.R.Senthil Kumaran) Date: Sat, 16 Jun 2007 14:11:01 +0530 Subject: [Python-Dev] [RFC] urlparse - parse query facility In-Reply-To: <200706160107.00288.fdrake@acm.org> References: <7c42eba10706121339n2d04cadapee69443c5636d9@mail.gmail.com> <200706132242.21436.fdrake@acm.org> <20070616040459.GA3598@gmail.com> <200706160107.00288.fdrake@acm.org> Message-ID: <20070616084101.GA4115@gmail.com> * Fred L. Drake, Jr. 
[2007-06-16 01:06:59]: > > * Coding question: Without retyping the bunch of code again in the > > BaseResult, would is the possible to call parse_qs/parse_qsl function on > > self.query and provide the result? Basically, what would be a good of > > doing it. > > That's what I was thinking. Just add something like this to BaseResult > (untested): > > def parsedQuery(self, keep_blank_values=False, strict_parsing=False): > return parse_qs( > self.query, > keep_blank_values=keep_blank_values, > strict_parsing=strict_parsing) > > def parsedQueryList(self, keep_blank_values=False, strict_parsing=False): > return parse_qsl( > self.query, > keep_blank_values=keep_blank_values, > strict_parsing=strict_parsing) Thanks Fred. That really helped. :-) I have updated the urlparse.py module, cgi.py and also included in the tests in the test_urlparse.py to test this new functionality. test run passed for all the valid queries, except for these: # ("=", {}), # ("=&=", {}), # ("=;=", {}), The testcases are basically from test_cgi.py module and there is comment on validity of these 3 tests for query values. Pending stuff is updating the documentation. I maintained all the files temporarily at: http://cvs.sarovar.org/cgi-bin/cvsweb.cgi/python/?cvsroot=uthcode I had requested a commit access to Summer of Code branch in my previous mail, but I guess it not been noticed yet. I shall update the files later or send in as patches for application. > Whether there's a real win with this is unclear. I generally prefer having an > object that represents the URL and lets me get what I want from it, rather > than having to pass the bits around to separate parsing functions. The I agree. This is really convenient when one comes to know about it. Thanks, Senthil -- O.R.Senthil Kumaran http://uthcode.sarovar.org From g.brandl at gmx.net Sat Jun 16 11:31:56 2007 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 16 Jun 2007 11:31:56 +0200 Subject: [Python-Dev] Outcome of Georg's documentation work? In-Reply-To: <20070614195034.GA18011@localhost.localdomain> References: <20070614195034.GA18011@localhost.localdomain> Message-ID: A.M. Kuchling schrieb: > What was the outcome of the discussion of Georg Brandl's reworked > documentation ("The docs, reloaded")? Was any decision made on > whether to go with reST, or on what changes need to made before that's > possible? Did Fred Drake say what he thought? For my part, I'm still working on it and want to integrate a few of the planned interactive features now. > Georg, do you want access to python.org to host a version of the docs > there? That would be really nice. Should I subscribe to the pydotorg list? Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From martin at v.loewis.de Sat Jun 16 12:10:45 2007 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 16 Jun 2007 12:10:45 +0200 Subject: [Python-Dev] Requesting commit access to python sandbox. 
Cleanup urllib2 - Summer of Code 2007 Project In-Reply-To: <7c42eba10706121358g3040faaeu1682e5a460cff557@mail.gmail.com> References: <7c42eba10706121358g3040faaeu1682e5a460cff557@mail.gmail.com> Message-ID: <4673B725.1000309@v.loewis.de> > I am a student participant of Google Summer of Code 2007 and I am > working on the cleanup task of urllib2, with Skip as my mentor. > I would like to request for a commit access to the Python Sandbox for > implementing the changes as part of the project. I have attached by > SSH Public keys. > preferred name : senthil.kumaran I have now given you write access. Please constrain all checkins to the sandbox; checkins elsewhere should be approved by your mentor. Regards, Martin From orsenthil at users.sourceforge.net Sat Jun 16 15:25:05 2007 From: orsenthil at users.sourceforge.net (O.R.Senthil Kumaran) Date: Sat, 16 Jun 2007 18:55:05 +0530 Subject: [Python-Dev] Requesting commit access to python sandbox. Cleanup urllib2 - Summer of Code 2007 Project In-Reply-To: <4673B725.1000309@v.loewis.de> References: <7c42eba10706121358g3040faaeu1682e5a460cff557@mail.gmail.com> <4673B725.1000309@v.loewis.de> Message-ID: <20070616132505.GB3558@gmail.com> * "Martin v. L?wis" [2007-06-16 12:10:45]: > > I am a student participant of Google Summer of Code 2007 and I am > > working on the cleanup task of urllib2, with Skip as my mentor. > > I have now given you write access. Please constrain all checkins > to the sandbox; checkins elsewhere should be approved by your > mentor. Thanks Martin. I shall abide by the guidelines. -- O.R.Senthil Kumaran http://uthcode.sarovar.org From amk at amk.ca Sat Jun 16 20:10:31 2007 From: amk at amk.ca (A.M. Kuchling) Date: Sat, 16 Jun 2007 14:10:31 -0400 Subject: [Python-Dev] Outcome of Georg's documentation work? In-Reply-To: References: <20070614195034.GA18011@localhost.localdomain> Message-ID: <20070616181031.GA9465@andrew-kuchlings-computer.local> On Sat, Jun 16, 2007 at 11:31:56AM +0200, Georg Brandl wrote: > That would be really nice. Should I subscribe to the pydotorg list? Yes, please, and e-mail me an SSH key. Such work should be done on ximinez for security reasons, I think, even though the machine is fairly heavily loaded. --amk From status at bugs.python.org Sun Jun 17 02:00:49 2007 From: status at bugs.python.org (Tracker) Date: Sun, 17 Jun 2007 00:00:49 +0000 (UTC) Subject: [Python-Dev] Summary of Tracker Issues Message-ID: <20070617000049.02C96780DD@psf.upfronthosting.co.za> ACTIVITY SUMMARY (06/10/07 - 06/17/07) Tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue number. Do NOT respond to this message. 1645 open ( +0) / 8584 closed ( +0) / 10229 total ( +0) Average duration of open issues: 829 days. Median duration of open issues: 777 days. Open Issues Breakdown open 1645 ( +0) pending 0 ( +0) -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20070617/18b6ad26/attachment.htm From martin at v.loewis.de Sun Jun 17 19:26:02 2007 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 17 Jun 2007 19:26:02 +0200 Subject: [Python-Dev] Upgrade of {www,svn}.python.org Message-ID: <46756EAA.7000905@v.loewis.de> I'd like to upgrade www.python.org this coming Thursday (June 21), between 6:00am and 12:00am UTC. During that time, neither www nor subversion access will be available (although I hope that I need much less than 6 hours). 
mail.python.org, and all other services running on other machines, will continue to work. I will send another message when I start. Regards, Martin From jeff at taupro.com Sun Jun 17 22:24:54 2007 From: jeff at taupro.com (Jeff Rush) Date: Sun, 17 Jun 2007 15:24:54 -0500 Subject: [Python-Dev] [Pydotorg] Upgrade of {www,svn}.python.org In-Reply-To: <46756EAA.7000905@v.loewis.de> References: <46756EAA.7000905@v.loewis.de> Message-ID: <46759896.9080609@taupro.com> Martin v. L?wis wrote: > I'd like to upgrade www.python.org this coming Thursday (June 21), > between 6:00am and 12:00am UTC. During that time, neither www > nor subversion access will be available (although I hope that > I need much less than 6 hours). > > mail.python.org, and all other services running on other machines, > will continue to work. Is this a software version upgrade or a hardware upgrade re the increase in hard disk space recently mentioned by Sean? If you're already physically at the machine, it'd be nice to get an additional drive added at the same time. -Jeff From martin at v.loewis.de Sun Jun 17 22:28:08 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 17 Jun 2007 22:28:08 +0200 Subject: [Python-Dev] [Pydotorg] Upgrade of {www,svn}.python.org In-Reply-To: <46759896.9080609@taupro.com> References: <46756EAA.7000905@v.loewis.de> <46759896.9080609@taupro.com> Message-ID: <46759958.9010307@v.loewis.de> >> mail.python.org, and all other services running on other machines, >> will continue to work. > > Is this a software version upgrade or a hardware upgrade re the increase in > hard disk space recently mentioned by Sean? If you're already physically at > the machine, it'd be nice to get an additional drive added at the same time. Just a software upgrade, and I will not be physically at the machine - the machine is in Amsterdam, and I am in Berlin. I don't think a hard disk space upgrade is planned, and I don't think it is necessary. Regards, Martin From ocean at m2.ccsnet.ne.jp Mon Jun 18 15:02:56 2007 From: ocean at m2.ccsnet.ne.jp (ocean) Date: Mon, 18 Jun 2007 22:02:56 +0900 Subject: [Python-Dev] Investigated ref leak report related to thread (regrtest.py -R ::) Message-ID: <005b01c7b1a8$fde788a0$0300a8c0@whiterabc2znlh> Hello. I investigated ref leak report related to thread. Please run python regrtest.py -R :: test_leak.py (attached file) Sometimes ref leak is reported. # I saw this as regression failure on python-checkins. # total ref count 92578 -> 92669 _Condition 2 Thread 6 _Event 1 bool 10 instancemethod 1 code 2 dict 9 file 1 frame 3 function 2 int 1 list 2 builtin_function_or_method 5 NoneType 2 str 27 thread.lock 7 tuple 5 type 5 Probably this happens because threading.Thread is implemented as Python code, (expecially threading.Thread#join), the code of regrtest.py if i >= nwarmup: deltas.append(sys.gettotalrefcount() - rc - 2) can run before thread really quits. (before Moudles/threadmodule.c t_bootstrap()'s Py_DECREF(boot->func); Py_DECREF(boot->args); Py_XDECREF(boot->keyw); runs) So I experimentally inserted the code to wait for thread termination. (attached file experimental.patch) And I confirmed error was gone. # Sorry for hackish patch which only runs on windows. It should run # on other platforms if you replace Sleep() in Python/sysmodule.c # sys_debug_ref_leak_leave() with appropriate function. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: test_leak.py Url: http://mail.python.org/pipermail/python-dev/attachments/20070618/b26e2d48/attachment.asc -------------- next part -------------- A non-text attachment was scrubbed... Name: experimental.patch Type: application/octet-stream Size: 2690 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20070618/b26e2d48/attachment.obj From ocean at m2.ccsnet.ne.jp Mon Jun 18 15:08:05 2007 From: ocean at m2.ccsnet.ne.jp (ocean) Date: Mon, 18 Jun 2007 22:08:05 +0900 Subject: [Python-Dev] Investigated ref leak report related to thread(regrtest.py -R ::) References: <005b01c7b1a8$fde788a0$0300a8c0@whiterabc2znlh> Message-ID: <007c01c7b1a9$b5f43ba0$0300a8c0@whiterabc2znlh> Sorry, mailer striped spaces... I'll try attaching files again. -------------- next part -------------- A non-text attachment was scrubbed... Name: archive.zip Type: application/x-zip-compressed Size: 1511 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20070618/d0fae54b/attachment.bin From aahz at pythoncraft.com Mon Jun 18 16:33:59 2007 From: aahz at pythoncraft.com (Aahz) Date: Mon, 18 Jun 2007 07:33:59 -0700 Subject: [Python-Dev] Investigated ref leak report related to thread (regrtest.py -R ::) In-Reply-To: <005b01c7b1a8$fde788a0$0300a8c0@whiterabc2znlh> References: <005b01c7b1a8$fde788a0$0300a8c0@whiterabc2znlh> Message-ID: <20070618143358.GA24375@panix.com> On Mon, Jun 18, 2007, ocean wrote: > > Hello. I investigated ref leak report related to thread. > Please run python regrtest.py -R :: test_leak.py (attached file) > Sometimes ref leak is reported. Please post a bug report to SF and report the bug number here. When you post bugs only to the list they get lost. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "as long as we like the same operating system, things are cool." --piranha From ocean at m2.ccsnet.ne.jp Mon Jun 18 16:46:20 2007 From: ocean at m2.ccsnet.ne.jp (ocean) Date: Mon, 18 Jun 2007 23:46:20 +0900 Subject: [Python-Dev] Investigated ref leak report related to thread (regrtest.py -R ::) References: <005b01c7b1a8$fde788a0$0300a8c0@whiterabc2znlh> <20070618143358.GA24375@panix.com> Message-ID: <002901c7b1b7$70749ad0$0300a8c0@whiterabc2znlh> > Please post a bug report to SF and report the bug number here. When you > post bugs only to the list they get lost. > -- > Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ > > "as long as we like the same operating system, things are cool." --piranha Thank you for pointing it out. Done. http://www.python.org/sf/1739118 From guido at python.org Tue Jun 19 08:32:59 2007 From: guido at python.org (Guido van Rossum) Date: Mon, 18 Jun 2007 23:32:59 -0700 Subject: [Python-Dev] Python 3000 Status Update (Long!) Message-ID: I've written up a comprehensive status report on Python 3000. Please read: http://www.artima.com/weblogs/viewpost.jsp?thread=208549 -- --Guido van Rossum (home page: http://www.python.org/~guido/) From g.brandl at gmx.net Tue Jun 19 10:47:20 2007 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 19 Jun 2007 10:47:20 +0200 Subject: [Python-Dev] Python 3000 Status Update (Long!) In-Reply-To: References: Message-ID: Guido van Rossum schrieb: > I've written up a comprehensive status report on Python 3000. Please read: > > http://www.artima.com/weblogs/viewpost.jsp?thread=208549 Thank you! Now I have something to show to interested people except "read the PEPs". 
A minuscule nit: the rot13 codec has no library equivalent, so it won't be supported anymore :) Georg From ncoghlan at gmail.com Tue Jun 19 13:57:44 2007 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 19 Jun 2007 21:57:44 +1000 Subject: [Python-Dev] Python 3000 Status Update (Long!) In-Reply-To: References: Message-ID: <4677C4B8.8010508@gmail.com> Georg Brandl wrote: > Guido van Rossum schrieb: >> I've written up a comprehensive status report on Python 3000. Please read: >> >> http://www.artima.com/weblogs/viewpost.jsp?thread=208549 > > Thank you! Now I have something to show to interested people except "read > the PEPs". > > A minuscule nit: the rot13 codec has no library equivalent, so it won't be > supported anymore :) Given that there are valid use cases for bytes-to-bytes translations, and a common API for them would be nice, does it make sense to have an additional category of codec that is invoked via specific recoding methods on bytes objects? For example: encoded = data.encode_bytes('bz2') decoded = encoded.decode_bytes('bz2') assert data == decoded Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From fuzzyman at voidspace.org.uk Tue Jun 19 14:31:38 2007 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 19 Jun 2007 13:31:38 +0100 Subject: [Python-Dev] Inspect Patch for IronPython (and Jython?) Compatibility Message-ID: <4677CCAA.1050608@voidspace.org.uk> Hello all, I've just submitted a patch on sourceforge to make inspect compatible with IronPython (and Jython I think). This patch originally comes from the IPCE ( http://fepy.sf.net ) project by Seo Sanghyeon. It is a trivial change really. The patch is number 1739696 http://sourceforge.net/tracker/index.php?func=detail&aid=1739696&group_id=5470&atid=305470 It moves getting a reference to 'code.co_code' into the body of the loop responsible for inspecting anonymous (tuple) arguments. In IronPython, accessing 'co_code' raises a NotImplementedError - meaning that inspect.get_argspec is broken. This patch means that *except* for functions with anonymous tuple arguments, it will work again on IronPython - whilst maintaining full compatibility with the previous behaviour. Jython has a similar patch to overcome the same issue by the way. See http://jython.svn.sourceforge.net/viewvc/jython?view=rev&revision=3200 As it is a bugfix - backporting to 2.5 would be great. Should I generate a separate patch? All the best, Michael Foord From g.brandl at gmx.net Tue Jun 19 14:25:03 2007 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 19 Jun 2007 14:25:03 +0200 Subject: [Python-Dev] Multi-line comments - a case for PEP 3099? Message-ID: Hi, we got another feature request for multi-line comments. While it is nice to comment out multiple lines at once, every editor that deserves that name can add a '#' to multiple lines. And there's always "if 0" and triple-quoted strings... Georg From walter at livinglogic.de Tue Jun 19 14:40:57 2007 From: walter at livinglogic.de (=?UTF-8?B?V2FsdGVyIETDtnJ3YWxk?=) Date: Tue, 19 Jun 2007 14:40:57 +0200 Subject: [Python-Dev] [Python-3000] Python 3000 Status Update (Long!) In-Reply-To: References: <4677C4B8.8010508@gmail.com> Message-ID: <4677CED9.1060800@livinglogic.de> Georg Brandl wrote: > Nick Coghlan schrieb: >> Georg Brandl wrote: >>> Guido van Rossum schrieb: >>>> I've written up a comprehensive status report on Python 3000. 
Please read: >>>> >>>> http://www.artima.com/weblogs/viewpost.jsp?thread=208549 >>> Thank you! Now I have something to show to interested people except "read >>> the PEPs". >>> >>> A minuscule nit: the rot13 codec has no library equivalent, so it won't be >>> supported anymore :) >> Given that there are valid use cases for bytes-to-bytes translations, >> and a common API for them would be nice, does it make sense to have an >> additional category of codec that is invoked via specific recoding >> methods on bytes objects? For example: >> >> encoded = data.encode_bytes('bz2') >> decoded = encoded.decode_bytes('bz2') >> assert data == decoded > > This is exactly what I proposed a while before under the name > bytes.transform(). > > IMO it would make a common use pattern much more convenient and > should be given thought. > > If a PEP is called for, I'd be happy to at least co-author it. Codecs are a major exception to Guido's law: Never have a parameter whose value switches between completely unrelated algorithms. Why don't we put all string transformation functions into a common module (the string module might be a good place): >>> import string >>> string.rot13('abc') Servus, Walter From mal at egenix.com Tue Jun 19 15:19:50 2007 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 19 Jun 2007 15:19:50 +0200 Subject: [Python-Dev] [Python-3000] Python 3000 Status Update (Long!) In-Reply-To: <4677CED9.1060800@livinglogic.de> References: <4677C4B8.8010508@gmail.com> <4677CED9.1060800@livinglogic.de> Message-ID: <4677D7F6.3040304@egenix.com> On 2007-06-19 14:40, Walter D?rwald wrote: > Georg Brandl wrote: >>>> A minuscule nit: the rot13 codec has no library equivalent, so it won't be >>>> supported anymore :) >>> Given that there are valid use cases for bytes-to-bytes translations, >>> and a common API for them would be nice, does it make sense to have an >>> additional category of codec that is invoked via specific recoding >>> methods on bytes objects? For example: >>> >>> encoded = data.encode_bytes('bz2') >>> decoded = encoded.decode_bytes('bz2') >>> assert data == decoded >> This is exactly what I proposed a while before under the name >> bytes.transform(). >> >> IMO it would make a common use pattern much more convenient and >> should be given thought. >> >> If a PEP is called for, I'd be happy to at least co-author it. > > Codecs are a major exception to Guido's law: Never have a parameter > whose value switches between completely unrelated algorithms. I don't see much of a problem with that. Parameters are per-se intended to change the behavior of a function or method. Note that you are referring to the .encode() and .decode() methods - these are just easy to use interfaces to the codecs registered in the system. The codec design allows for different input and output types as it doesn't impose restrictions on these. Codecs are more general in that respect: they don't just deal with Unicode encodings, it's a more general approach that also works with other kinds of data types. The access methods, OTOH, can impose restrictions and probably should to restrict the return types to a predicable set. > Why don't we put all string transformation functions into a common > module (the string module might be a good place): > >>>> import string >>>> string.rot13('abc') I think the string module will have to go away. It doesn't really separate between text and bytes data. Adding more confusion will not really help with making this distinction clear, either, I'm afraid. 
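For context, the 2.x behaviour the thread is weighing against these proposals is the bytes-to-bytes use of str.encode()/str.decode(); a short interpreter session (Python 2.5; output shown is illustrative) gives the flavour:

    >>> 'abc'.encode('rot13')
    'nop'
    >>> data = 'spam' * 10
    >>> packed = data.encode('zlib')      # zlib-compressed byte string
    >>> packed.decode('zlib') == data
    True
    >>> 'hello'.encode('hex')
    '68656c6c6f'

These are the transformations that bytes.transform() / encode_bytes() (or plain modules) would have to cover once str.encode() is reserved for text-to-bytes conversions in Py3k.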
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jun 19 2007) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2007-07-09: EuroPython 2007, Vilnius, Lithuania 19 days to go :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From cbarton at metavr.com Tue Jun 19 16:21:32 2007 From: cbarton at metavr.com (Campbell Barton) Date: Wed, 20 Jun 2007 00:21:32 +1000 Subject: [Python-Dev] Calling Methods from Pythons C API with Keywords In-Reply-To: References: Message-ID: <4677E66C.8000403@metavr.com> Hey Guys, My first post on this list so I hope this is the right place to post and relevant. Im rewriting parts of the Blender3D python API that has got a bit old and needs an update. Im making a PyList subtype with the C/Python API, this involves intercepting calls to standard list methods to make sure Blenders array data is in Sync with the list's data. Iv got it working for tp_as_sequence, tp_as_mapping, iter and dealloc etc but methods are a problem. I want to add my own call's before and after PyLists standard functions but have a proplem with functons that use keywords and have no API equivalent. For example, I cant use the API's PyList_Sort because that dosnt support keywords like... ls.sort(key=lambda a: a.foo)) And the Problem with PyObject_CallMethod is that it dosnt accept keywords. PyObject_CallMethod((PyObject *)mylist, "sort", "O", args); Looking at abstract.c, PyObject_CallMethod uses call_function_tail, which calls "PyObject_Call(callable, args, NULL);" - so Its not currently possible with PyObject_CallMethod. But I cant find any way to do this in a few lines. I could use PyEval_CallObjectWithKeywords but that would mean Id need to get the method from the list manually which Ill look into, but unless Im missing something here, it seems PyObject_CallMethodWithKeywords would be a nice addition to the Python API that cant be done in a straight forward way at the moment. - Thanks From chrism at plope.com Tue Jun 19 16:24:05 2007 From: chrism at plope.com (Chris McDonough) Date: Tue, 19 Jun 2007 10:24:05 -0400 Subject: [Python-Dev] Issues with PEP 3101 (string formatting) Message-ID: Wrt http://www.python.org/dev/peps/pep-3101/ PEP 3101 says Py3K should allow item and attribute access syntax within string templating expressions but "to limit potential security issues", access to underscore prefixed names within attribute/item access expressions will be disallowed. I am a person who has lived with the aftermath of a framework designed to prevent data access by restricting access to underscore- prefixed names (Zope 2, ahem), and I've found it's very hard to explain and justify. As a result, I feel that this is a poor default policy choice for a framework. In some cases, underscore names must become part of an object's external interface. 
Consider a URL with one or more underscore- prefixed path segment elements (because prefixing a filename with an underscore is a perfectly reasonable thing to do on a filesystem, and path elements are often named after file names) fed to a traversal algorithm that attempts to resolve each path element into an object by calling __getitem__ against the parent found by the last path element's traversal result. Perhaps this is poor design and __getitem__ should not be consulted here, but I doubt that highly because there's nothing particularly special about calling a method named __getitem__ as opposed to some method named "traverse". The only precedent within Python 2 for this sort of behavior is limiting access to variables that begin with __ and which do not end with __ to the scope defined by a class and its instances. I personally don't believe this is a very useful feature, but it's still only an advisory policy and you can worm around it with enough gyrations. Given that security is a concern at all, the only truly reasonable way to "limit security issues" is to disallow item and attribute access completely within the string templating expression syntax. It seems gratuituous to me to encourage string templating expressions with item/attribute access, given that you could do it within the format arguments just as easily in the 99% case, and we've (well... I've) happily been living with that restriction for years now. But if this syntax is preserved, there really should be no *default* restrictions on the traversable names within an expression because this will almost certainly become a hard-to-explain, hard-to-justify bug magnet as it has become in Zope. - C From walter at livinglogic.de Tue Jun 19 16:45:46 2007 From: walter at livinglogic.de (=?UTF-8?B?V2FsdGVyIETDtnJ3YWxk?=) Date: Tue, 19 Jun 2007 16:45:46 +0200 Subject: [Python-Dev] [Python-3000] Python 3000 Status Update (Long!) In-Reply-To: References: <4677C4B8.8010508@gmail.com> <4677CED9.1060800@livinglogic.de> Message-ID: <4677EC1A.10306@livinglogic.de> Georg Brandl wrote: > Walter D?rwald schrieb: >> Georg Brandl wrote: >>> Nick Coghlan schrieb: >>>> Georg Brandl wrote: >>>>> Guido van Rossum schrieb: >>>>>> I've written up a comprehensive status report on Python 3000. Please read: >>>>>> >>>>>> http://www.artima.com/weblogs/viewpost.jsp?thread=208549 >>>>> Thank you! Now I have something to show to interested people except "read >>>>> the PEPs". >>>>> >>>>> A minuscule nit: the rot13 codec has no library equivalent, so it won't be >>>>> supported anymore :) >>>> Given that there are valid use cases for bytes-to-bytes translations, >>>> and a common API for them would be nice, does it make sense to have an >>>> additional category of codec that is invoked via specific recoding >>>> methods on bytes objects? For example: >>>> >>>> encoded = data.encode_bytes('bz2') >>>> decoded = encoded.decode_bytes('bz2') >>>> assert data == decoded >>> This is exactly what I proposed a while before under the name >>> bytes.transform(). >>> >>> IMO it would make a common use pattern much more convenient and >>> should be given thought. >>> >>> If a PEP is called for, I'd be happy to at least co-author it. >> Codecs are a major exception to Guido's law: Never have a parameter >> whose value switches between completely unrelated algorithms. > > I don't think that applies here. This is more like __import__(): > depending on the first parameter, completely different things can happen. 
> Yes, the same import algorithm is used, but in the case of > bytes.encode_bytes, the same algorithm is used to find and execute the > codec. What would a registry of tranformation algorithms buy us compared to a module with transformation functions? The function version is shorter: transform.rot13('foo') compared to: 'foo'.transform('rot13') If each transformation has its own function, these functions can have their own arguments, e.g. transform.bz2encode(data: bytes, level: int=6) -> bytes Of course str.transform() could pass along all arguments to the registered function, but that's worse from a documentation viewpoint, because the real signature is hidden deep in the registry. Servus, Walter From guido at python.org Tue Jun 19 17:18:15 2007 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Jun 2007 08:18:15 -0700 Subject: [Python-Dev] Multi-line comments - a case for PEP 3099? In-Reply-To: References: Message-ID: On 6/19/07, Georg Brandl wrote: > we got another feature request for multi-line comments. > > While it is nice to comment out multiple lines at once, every editor > that deserves that name can add a '#' to multiple lines. > > And there's always "if 0" and triple-quoted strings... I'd als say that the case for TOOWTDI is pretty clear on that. But perhaps we can keep the Py3k discussions on the python-3000 at python.org list? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Jun 19 17:20:25 2007 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Jun 2007 08:20:25 -0700 Subject: [Python-Dev] Issues with PEP 3101 (string formatting) In-Reply-To: References: Message-ID: Those are valid concerns. I'm cross-posting this to the python-3000 list in the hope that the PEP's author and defendents can respond. I'm sure we can work something out. Please keep further discussion on the python-3000 at python.org list. --Guido On 6/19/07, Chris McDonough wrote: > Wrt http://www.python.org/dev/peps/pep-3101/ > > PEP 3101 says Py3K should allow item and attribute access syntax > within string templating expressions but "to limit potential security > issues", access to underscore prefixed names within attribute/item > access expressions will be disallowed. > > I am a person who has lived with the aftermath of a framework > designed to prevent data access by restricting access to underscore- > prefixed names (Zope 2, ahem), and I've found it's very hard to > explain and justify. As a result, I feel that this is a poor default > policy choice for a framework. > > In some cases, underscore names must become part of an object's > external interface. Consider a URL with one or more underscore- > prefixed path segment elements (because prefixing a filename with an > underscore is a perfectly reasonable thing to do on a filesystem, and > path elements are often named after file names) fed to a traversal > algorithm that attempts to resolve each path element into an object > by calling __getitem__ against the parent found by the last path > element's traversal result. Perhaps this is poor design and > __getitem__ should not be consulted here, but I doubt that highly > because there's nothing particularly special about calling a method > named __getitem__ as opposed to some method named "traverse". > > The only precedent within Python 2 for this sort of behavior is > limiting access to variables that begin with __ and which do not end > with __ to the scope defined by a class and its instances. 
I > personally don't believe this is a very useful feature, but it's > still only an advisory policy and you can worm around it with enough > gyrations. > > Given that security is a concern at all, the only truly reasonable > way to "limit security issues" is to disallow item and attribute > access completely within the string templating expression syntax. It > seems gratuituous to me to encourage string templating expressions > with item/attribute access, given that you could do it within the > format arguments just as easily in the 99% case, and we've (well... > I've) happily been living with that restriction for years now. > > But if this syntax is preserved, there really should be no *default* > restrictions on the traversable names within an expression because > this will almost certainly become a hard-to-explain, hard-to-justify > bug magnet as it has become in Zope. > > - C > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Jun 19 17:22:20 2007 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Jun 2007 08:22:20 -0700 Subject: [Python-Dev] Inspect Patch for IronPython (and Jython?) Compatibility In-Reply-To: <4677CCAA.1050608@voidspace.org.uk> References: <4677CCAA.1050608@voidspace.org.uk> Message-ID: Let's definitely add this to the trunk (2.6). It sounds fine to me as a bugfix too, since (from your description) it doesn't change the behavior at all in CPython. I won't have the time to submit this, but I'm sure there are others here who do. --Guido On 6/19/07, Michael Foord wrote: > Hello all, > > I've just submitted a patch on sourceforge to make inspect compatible > with IronPython (and Jython I think). This patch originally comes from > the IPCE ( http://fepy.sf.net ) project by Seo Sanghyeon. It is a > trivial change really. > > The patch is number 1739696 > http://sourceforge.net/tracker/index.php?func=detail&aid=1739696&group_id=5470&atid=305470 > > It moves getting a reference to 'code.co_code' into the body of the loop > responsible for inspecting anonymous (tuple) arguments. > > In IronPython, accessing 'co_code' raises a NotImplementedError - > meaning that inspect.get_argspec is broken. > > This patch means that *except* for functions with anonymous tuple > arguments, it will work again on IronPython - whilst maintaining full > compatibility with the previous behaviour. > > Jython has a similar patch to overcome the same issue by the way. See > http://jython.svn.sourceforge.net/viewvc/jython?view=rev&revision=3200 > > As it is a bugfix - backporting to 2.5 would be great. Should I generate > a separate patch? > > All the best, > > Michael Foord > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From g.brandl at gmx.net Tue Jun 19 17:27:58 2007 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 19 Jun 2007 17:27:58 +0200 Subject: [Python-Dev] [Python-3000] Python 3000 Status Update (Long!) 
In-Reply-To: <4677EC1A.10306@livinglogic.de> References: <4677C4B8.8010508@gmail.com> <4677CED9.1060800@livinglogic.de> <4677EC1A.10306@livinglogic.de> Message-ID: Walter D?rwald schrieb: >>>> If a PEP is called for, I'd be happy to at least co-author it. >>> Codecs are a major exception to Guido's law: Never have a parameter >>> whose value switches between completely unrelated algorithms. >> >> I don't think that applies here. This is more like __import__(): >> depending on the first parameter, completely different things can happen. >> Yes, the same import algorithm is used, but in the case of >> bytes.encode_bytes, the same algorithm is used to find and execute the >> codec. > > What would a registry of tranformation algorithms buy us compared to a > module with transformation functions? Easier registering of custom transformations. Without a registry, you'd have to monkey-patch a module. > The function version is shorter: > > transform.rot13('foo') > > compared to: > > 'foo'.transform('rot13') Yes, that's a very convincing argument :) > If each transformation has its own function, these functions can have > their own arguments, e.g. > transform.bz2encode(data: bytes, level: int=6) -> bytes > > Of course str.transform() could pass along all arguments to the > registered function, but that's worse from a documentation viewpoint, > because the real signature is hidden deep in the registry. I don't think transformation functions need arguments. Georg From g.brandl at gmx.net Tue Jun 19 17:30:23 2007 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 19 Jun 2007 17:30:23 +0200 Subject: [Python-Dev] Multi-line comments - a case for PEP 3099? In-Reply-To: References: Message-ID: Guido van Rossum schrieb: > On 6/19/07, Georg Brandl wrote: > >> we got another feature request for multi-line comments. >> >> While it is nice to comment out multiple lines at once, every editor >> that deserves that name can add a '#' to multiple lines. >> >> And there's always "if 0" and triple-quoted strings... > I'd als say that the case for TOOWTDI is pretty clear on that. > > But perhaps we can keep the Py3k discussions on the python-3000 at python.org list? I haven't really seen this as a python-3000 specific issue. Or are you referring to the other cross-posting thread? Georg From guido at python.org Tue Jun 19 18:07:20 2007 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Jun 2007 09:07:20 -0700 Subject: [Python-Dev] Multi-line comments - a case for PEP 3099? In-Reply-To: References: Message-ID: On 6/19/07, Georg Brandl wrote: > Guido van Rossum schrieb: > > On 6/19/07, Georg Brandl wrote: > > > >> we got another feature request for multi-line comments. > >> > >> While it is nice to comment out multiple lines at once, every editor > >> that deserves that name can add a '#' to multiple lines. > >> > >> And there's always "if 0" and triple-quoted strings... > > I'd als say that the case for TOOWTDI is pretty clear on that. > > > > But perhaps we can keep the Py3k discussions on the python-3000 at python.org list? > > I haven't really seen this as a python-3000 specific issue. Or are you > referring to the other cross-posting thread? That too, but at this point *any* feature request is a Py3k request. If it's not good for Py3k there's no point in having it in 2.6. And I'd like new functionality in 2.6 to be restricted to backported Py3k features. 
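The workarounds Georg alludes to are easy enough to show concretely; a small sketch, where the statements inside the blocks are only placeholders:

    # 1) per-line '#' prefixes, as any decent editor will insert for you:
    # x = compute_something()
    # print x

    # 2) an "if 0:" guard -- never executed, but must stay syntactically valid:
    if 0:
        x = compute_something()
        print x

    # 3) a triple-quoted string -- the block becomes an unused string constant
    #    (and must not itself contain an unescaped triple quote):
    """
    x = compute_something()
    print x
    """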
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From fuzzyman at voidspace.org.uk Tue Jun 19 21:50:46 2007 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 19 Jun 2007 20:50:46 +0100 Subject: [Python-Dev] Inspect Patch for IronPython (and Jython?) Compatibility In-Reply-To: References: <4677CCAA.1050608@voidspace.org.uk> Message-ID: <46783396.6090302@voidspace.org.uk> Guido van Rossum wrote: > Let's definitely add this to the trunk (2.6). It sounds fine to me as > a bugfix too, since (from your description) it doesn't change the > behavior at all in CPython. Great. It looks to me like the patch will apply fine against release25-maint. No behaviour change. Thanks Michael Foord > > I won't have the time to submit this, but I'm sure there are others > here who do. > > --Guido > > On 6/19/07, Michael Foord wrote: >> Hello all, >> >> I've just submitted a patch on sourceforge to make inspect compatible >> with IronPython (and Jython I think). This patch originally comes from >> the IPCE ( http://fepy.sf.net ) project by Seo Sanghyeon. It is a >> trivial change really. >> >> The patch is number 1739696 >> http://sourceforge.net/tracker/index.php?func=detail&aid=1739696&group_id=5470&atid=305470 >> >> >> It moves getting a reference to 'code.co_code' into the body of the loop >> responsible for inspecting anonymous (tuple) arguments. >> >> In IronPython, accessing 'co_code' raises a NotImplementedError - >> meaning that inspect.get_argspec is broken. >> >> This patch means that *except* for functions with anonymous tuple >> arguments, it will work again on IronPython - whilst maintaining full >> compatibility with the previous behaviour. >> >> Jython has a similar patch to overcome the same issue by the way. See >> http://jython.svn.sourceforge.net/viewvc/jython?view=rev&revision=3200 >> >> As it is a bugfix - backporting to 2.5 would be great. Should I generate >> a separate patch? >> >> All the best, >> >> Michael Foord >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> http://mail.python.org/mailman/options/python-dev/guido%40python.org >> > > From martin at v.loewis.de Tue Jun 19 22:53:09 2007 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Tue, 19 Jun 2007 22:53:09 +0200 Subject: [Python-Dev] [Python-3000] Python 3000 Status Update (Long!) In-Reply-To: References: <4677C4B8.8010508@gmail.com> <4677CED9.1060800@livinglogic.de> <4677EC1A.10306@livinglogic.de> Message-ID: <46784235.5050102@v.loewis.de> >> What would a registry of tranformation algorithms buy us compared to a >> module with transformation functions? > > Easier registering of custom transformations. Without a registry, you'd have > to monkey-patch a module. Or users would have to invoke the module directly. I think a convention would be enough: rot13.encode(foo) rot13.decode(bar) Then, "registration" would require to put the module on sys.path, which it would for any other kind of registry as well. My main objection to using an encoding is that for these, the algorithm name will *always* be a string literal, completely unlike "real" codecs, where the encoding name often comes from the environment (either from the process environment, or from some kind of input). 
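A minimal sketch of what the module convention sketched above could look like in practice -- the module name and the encode()/decode() signatures are simply taken from the example, not an agreed API:

    # rot13.py -- a self-contained transformation module; "registration"
    # is nothing more than having it importable from sys.path.
    import string

    _TABLE = string.maketrans(
        string.ascii_uppercase + string.ascii_lowercase,
        string.ascii_uppercase[13:] + string.ascii_uppercase[:13] +
        string.ascii_lowercase[13:] + string.ascii_lowercase[:13])

    def encode(data):
        # rot13 is its own inverse, so decoding is the same operation
        return data.translate(_TABLE)

    decode = encode

Used as:

    import rot13
    rot13.encode('abc')   # -> 'nop'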
Regards, Martin From hrvoje.niksic at avl.com Wed Jun 20 09:34:49 2007 From: hrvoje.niksic at avl.com (Hrvoje =?UTF-8?Q?Nik=C5=A1i=C4=87?=) Date: Wed, 20 Jun 2007 09:34:49 +0200 Subject: [Python-Dev] Calling Methods from Pythons C API with Keywords In-Reply-To: <4677E66C.8000403@metavr.com> References: <4677E66C.8000403@metavr.com> Message-ID: <1182324889.6077.111.camel@localhost> On Wed, 2007-06-20 at 00:21 +1000, Campbell Barton wrote: > I want to add my own call's before and after PyLists standard functions > but have a proplem with functons that use keywords and have no API > equivalent. > For example, I cant use the API's PyList_Sort because that dosnt support > keywords like... > > ls.sort(key=lambda a: a.foo)) > > And the Problem with PyObject_CallMethod is that it dosnt accept keywords. Note that you can always simply call PyObject_Call on the bound method object retrieved using PyObject_GetAttrString. The hardest part is usually constructing the keywords dictionary, a job best left to Py_BuildValue and friends. When I need that kind of thing in more than one place, I end up with a utility function like this one: /* Equivalent to PyObject_CallMethod but accepts keyword args. The format... arguments should produce a dictionary that will be passed as keyword arguments to obj.method. Usage example: PyObject *res = call_method(lst, "sort", "{s:O}", "key", keyfun)); */ PyObject * call_method(PyObject *obj, const char *methname, char *format, ...) { va_list va; PyObject *meth = NULL, *args = NULL, *kwds = NULL, *ret = NULL; args = PyTuple_New(0); if (!args) goto out; meth = PyObject_GetAttrString(obj, methname); if (!meth) goto out; va_start(va, format); kwds = Py_VaBuildValue(format, va); va_end(va); if (!kwds) goto out; ret = PyObject_Call(meth, args, kwds); out: Py_XDECREF(meth); Py_XDECREF(args); Py_XDECREF(kwds); return ret; } It would be nice for the Python C API to support a more convenient way of calling objects and methods with keyword arguments. From cbarton at metavr.com Wed Jun 20 12:17:22 2007 From: cbarton at metavr.com (Campbell Barton) Date: Wed, 20 Jun 2007 20:17:22 +1000 Subject: [Python-Dev] Calling Methods from Pythons C API with Keywords In-Reply-To: <1182324889.6077.111.camel@localhost> References: <4677E66C.8000403@metavr.com> <1182324889.6077.111.camel@localhost> Message-ID: <4678FEB2.9050506@metavr.com> Hrvoje Nik?i? wrote: > On Wed, 2007-06-20 at 00:21 +1000, Campbell Barton wrote: >> I want to add my own call's before and after PyLists standard functions >> but have a proplem with functons that use keywords and have no API >> equivalent. >> For example, I cant use the API's PyList_Sort because that dosnt support >> keywords like... >> >> ls.sort(key=lambda a: a.foo)) >> >> And the Problem with PyObject_CallMethod is that it dosnt accept keywords. > > Note that you can always simply call PyObject_Call on the bound method > object retrieved using PyObject_GetAttrString. The hardest part is > usually constructing the keywords dictionary, a job best left to > Py_BuildValue and friends. When I need that kind of thing in more than > one place, I end up with a utility function like this one: > > /* Equivalent to PyObject_CallMethod but accepts keyword args. The > format... arguments should produce a dictionary that will be passed > as keyword arguments to obj.method. > > Usage example: > PyObject *res = call_method(lst, "sort", "{s:O}", "key", keyfun)); > */ > > PyObject * > call_method(PyObject *obj, const char *methname, char *format, ...) 
> { > va_list va; > PyObject *meth = NULL, *args = NULL, *kwds = NULL, *ret = NULL; > > args = PyTuple_New(0); > if (!args) > goto out; > meth = PyObject_GetAttrString(obj, methname); > if (!meth) > goto out; > > va_start(va, format); > kwds = Py_VaBuildValue(format, va); > va_end(va); > if (!kwds) > goto out; > > ret = PyObject_Call(meth, args, kwds); > out: > Py_XDECREF(meth); > Py_XDECREF(args); > Py_XDECREF(kwds); > return ret; > } > > It would be nice for the Python C API to support a more convenient way > of calling objects and methods with keyword arguments. Thanks for the hint, I ended up using PyObject_Call. This seems to work, EXPP_PyTuple_New_Prepend - is a utility function that returns a new tuple with self at the start (needed so args starts with self) I dont think I can use PyObject_GetAttrString because the subtype would return a reference to this function - rather then the lists original function, Id need an instance of a list and dont have one at that point. ______________________ static PyObject * MaterialList_sort(BPy_MaterialList *self, PyObject *args, PyObject *keywds ) { PyObject *ret; PyObject *newargs = EXPP_PyTuple_New_Prepend(args, (PyObject *)self); sync_list_from_materials__internal(self); # makes sure the list matches blenders materials ret = PyObject_Call(PyDict_GetItemString(PyList_Type.tp_dict, "sort"), newargs, keywds); Py_DECREF(newargs); if (ret) sync_materials_from_list__internal(self); # makes blenders materials match the lists return ret; } _____________________ Later on Ill probably avoid using PyDict_GetItemString on PyList_Type.tp_dict all the time since the methods for lists does not change during python running. - Can probably be assigned to a constant. -- Campbell J Barton (ideasman42) From hrvoje.niksic at avl.com Wed Jun 20 13:38:49 2007 From: hrvoje.niksic at avl.com (Hrvoje =?UTF-8?Q?Nik=C5=A1i=C4=87?=) Date: Wed, 20 Jun 2007 13:38:49 +0200 Subject: [Python-Dev] Calling Methods from Pythons C API with Keywords In-Reply-To: <4678FEB2.9050506@metavr.com> References: <4677E66C.8000403@metavr.com> <1182324889.6077.111.camel@localhost> <4678FEB2.9050506@metavr.com> Message-ID: <1182339529.6077.120.camel@localhost> [ Note that this discussion, except maybe for the suggestion to add a simpler way to call a method with keyword args, is off-topic to python-dev. ] On Wed, 2007-06-20 at 20:17 +1000, Campbell Barton wrote: > I dont think I can use PyObject_GetAttrString because the subtype would > return a reference to this function - rather then the lists original > function, Id need an instance of a list and dont have one at that point. Note that PyList_Type is a full-fledged PyObject, so PyObject_GetAttrString works on it just fine. Of course, you would also need to add the "self" argument before the keywords, but that's a trivial change to the function. Calling PyObject_GetAttrString feels cleaner than accessing tp_dict directly, and most importantly call_method as written delegates creation of the dictionaty to Py_BuildValue. From cbarton at metavr.com Wed Jun 20 14:12:42 2007 From: cbarton at metavr.com (Campbell Barton) Date: Wed, 20 Jun 2007 22:12:42 +1000 Subject: [Python-Dev] Calling Methods from Pythons C API with Keywords In-Reply-To: <1182339529.6077.120.camel@localhost> References: <4677E66C.8000403@metavr.com> <1182324889.6077.111.camel@localhost> <4678FEB2.9050506@metavr.com> <1182339529.6077.120.camel@localhost> Message-ID: <467919BA.2090708@metavr.com> Hrvoje Nik??i?? 
wrote: > [ Note that this discussion, except maybe for the suggestion to add a > simpler way to call a method with keyword args, is off-topic to > python-dev. ] Is there a list for this kind of discussion? Iv tried asking questions on the freenode python chat room but almost very few people there do C/Python api development. -- Campbell J Barton (ideasman42) From facundo at taniquetil.com.ar Wed Jun 20 14:36:33 2007 From: facundo at taniquetil.com.ar (Facundo Batista) Date: Wed, 20 Jun 2007 12:36:33 +0000 (UTC) Subject: [Python-Dev] Python 3000 Status Update (Long!) References: Message-ID: Guido van Rossum wrote: > I've written up a comprehensive status report on Python 3000. Please read: > > http://www.artima.com/weblogs/viewpost.jsp?thread=208549 One doubt: In Miscellaneus you say: Ordering comparisons (<, <=, >, >=) will raise TypeError by default instead of returning arbitrary results. Equality comparisons (==, !=) will compare for object identity (is, is not) by default. I *guess* that you're talking about comparisons between different datatypes... but you didn't explicit that in your blog. Am I right? -- . Facundo . Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From hrvoje.niksic at avl.com Wed Jun 20 15:00:58 2007 From: hrvoje.niksic at avl.com (Hrvoje =?UTF-8?Q?Nik=C5=A1i=C4=87?=) Date: Wed, 20 Jun 2007 15:00:58 +0200 Subject: [Python-Dev] Calling Methods from Pythons C API with Keywords In-Reply-To: <467919BA.2090708@metavr.com> References: <4677E66C.8000403@metavr.com> <1182324889.6077.111.camel@localhost> <4678FEB2.9050506@metavr.com> <1182339529.6077.120.camel@localhost> <467919BA.2090708@metavr.com> Message-ID: <1182344458.6077.125.camel@localhost> On Wed, 2007-06-20 at 22:12 +1000, Campbell Barton wrote: > Hrvoje Nik?i? wrote: > > [ Note that this discussion, except maybe for the suggestion to add a > > simpler way to call a method with keyword args, is off-topic to > > python-dev. ] > > Is there a list for this kind of discussion? I believe the appropriate list would be the general Python list/newsgroup. I agree that response about the Python/C API tends to be sparse on general-purpose lists, though. If there is a forum dedicated to discussing the *use* of Python at the C level, I'd like to know about it as well. From guido at python.org Wed Jun 20 15:30:48 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 20 Jun 2007 06:30:48 -0700 Subject: [Python-Dev] Python 3000 Status Update (Long!) In-Reply-To: References: Message-ID: On 6/20/07, Facundo Batista wrote: > Guido van Rossum wrote: > > > I've written up a comprehensive status report on Python 3000. Please read: > > > > http://www.artima.com/weblogs/viewpost.jsp?thread=208549 > > One doubt: In Miscellaneus you say: > > Ordering comparisons (<, <=, >, >=) will raise TypeError by default > instead of returning arbitrary results. Equality comparisons (==, !=) > will compare for object identity (is, is not) by default. > > I *guess* that you're talking about comparisons between different > datatypes... but you didn't explicit that in your blog. > > Am I right? No. The *default* comparison always raises an exception. Of course, most types have a comparison that does the right thing for objects of the same type -- but they still raise an exception when compared (for ordering) to objects of different types (except subtypes or related types). 
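A short session makes the distinction concrete (Python 3000 semantics as described; the exact wording of the error messages is approximate, and the class C is just an example):

    >>> 1 < 'a'            # ordering across unrelated types
    Traceback (most recent call last):
      ...
    TypeError: unorderable types: int() < str()
    >>> 1 == 'a'           # equality is still defined, it just returns False
    False
    >>> class C:
    ...     pass
    ...
    >>> C() < C()          # no default ordering, even within one type
    Traceback (most recent call last):
      ...
    TypeError: unorderable types: C() < C()
    >>> c = C()
    >>> c == C(), c == c   # default equality falls back to identity
    (False, True)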
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From brett at python.org Wed Jun 20 17:43:15 2007 From: brett at python.org (Brett Cannon) Date: Wed, 20 Jun 2007 08:43:15 -0700 Subject: [Python-Dev] cleaning up the email addresses in the PEPs Message-ID: I am working on some code in the sandbox to automatically generate PEP 0. This is also leading to code that checks all the PEPs follow some basic guidelines. One of those guidelines is an author having a single email address. The Owners index at the bottom of PEP 0 is going to be created from the names and email addresses found in the PEPs themselves. But that doesn't work too well when an author has multiple addresses listed. If you are listed below, please choose a single address to use. You can either change the PEPs yourself or just reply with the email you prefer. I can tell you the multiple spellings if you want. If I don't hear from people I will just use my best judgement. And even better, if you spell your name multiple ways in the PEPs (e.g., Martin v. Loewis, Martin v. L?wis, Martin von L?wis) also let it be known which spelling you prefer (unifying name spelling comes after unifying the email addresses). Aahz Ka-Ping Yee: Neil Schemenauer David Goodger: Tim Peters: Martin v. L?wis: Paul Prescod: Jeremy Hylton: Clark C. Evans: Richard Jones: Alex Martelli: Moshe Zadka -Brett From martin at v.loewis.de Wed Jun 20 19:26:32 2007 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Wed, 20 Jun 2007 19:26:32 +0200 Subject: [Python-Dev] Calling Methods from Pythons C API with Keywords In-Reply-To: <467919BA.2090708@metavr.com> References: <4677E66C.8000403@metavr.com> <1182324889.6077.111.camel@localhost> <4678FEB2.9050506@metavr.com> <1182339529.6077.120.camel@localhost> <467919BA.2090708@metavr.com> Message-ID: <46796348.2050902@v.loewis.de> Campbell Barton schrieb: > Hrvoje Nik??i?? wrote: >> [ Note that this discussion, except maybe for the suggestion to add a >> simpler way to call a method with keyword args, is off-topic to >> python-dev. ] > > Is there a list for this kind of discussion? Hrvoje wasn't explicit on *why* this discussion is inappropriate here, so I just add that for better understanding: python-dev is for the development *of* Python, not for the development *with* Python. So you post here if you propose an enhancement or discuss the resolution of a bug. Question of the "how do I" kind are off-topic - posters are expected to know and understand the options, and then discuss the flaws of these options, rather than asking what they are. As Hrvoje says: try python-list (aka comp.lang.python). If you don't get an answer, you didn't phrase your question interestingly enough, or nobody knows the answer, or nobody has the time to tell you. Regards, Martin From martin at v.loewis.de Thu Jun 21 08:10:48 2007 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 21 Jun 2007 08:10:48 +0200 Subject: [Python-Dev] www.python.org outage Message-ID: <467A1668.1020600@v.loewis.de> The scheduled outage starts now. Regards, Martin From martin at v.loewis.de Thu Jun 21 10:41:47 2007 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 21 Jun 2007 10:41:47 +0200 Subject: [Python-Dev] www.python.org back up Message-ID: <467A39CB.4080705@v.loewis.de> I completed the update of dinsdale. Please let me know if you find any new problems with that machine. 
Regards, Martin From hrvoje.niksic at avl.com Thu Jun 21 13:33:56 2007 From: hrvoje.niksic at avl.com (Hrvoje =?UTF-8?Q?Nik=C5=A1i=C4=87?=) Date: Thu, 21 Jun 2007 13:33:56 +0200 Subject: [Python-Dev] Calling Methods from Pythons C API with Keywords In-Reply-To: <46796348.2050902@v.loewis.de> References: <4677E66C.8000403@metavr.com> <1182324889.6077.111.camel@localhost> <4678FEB2.9050506@metavr.com> <1182339529.6077.120.camel@localhost> <467919BA.2090708@metavr.com> <46796348.2050902@v.loewis.de> Message-ID: <1182425636.6077.141.camel@localhost> On Wed, 2007-06-20 at 19:26 +0200, "Martin v. Löwis" wrote: > As Hrvoje says: try python-list (aka comp.lang.python). If you don't > get an answer, you didn't phrase your question interestingly enough, > or nobody knows the answer, or nobody has the time to tell you. The thing with comp.lang.python is that it is followed by a large number of Python users, but a much smaller number of the C API users -- which is only natural, since the group is about Python, not about C. For most users the Python/C API is an implementation detail which they never have to worry about. Furthermore, questions about the C API often concern CPython implementation details and so they don't feel like they would belong in comp.lang.python. As an experiment, it might make sense to open a mailing list dedicated to the Python C API.
It could become a useful > support forum for extension writers (a group very useful to Python) and > maybe even a melting pot for new ideas regarding CPython, much like > comp.lang.python historically provided ideas for Python the language. Agree a Python/C API List would be great, in fact I cant see any reasons not to have it- likely the pure python users dont want to know about refcounting problems.. etc anyway. http://mail.python.org/mailman/listinfo http://www.python.org/community/sigs/ There are lists/newsgroup for py2exe and pyrex, Python-ObjectiveC etc, Python/C API seems much more generic, and its also fairly tricky to use at times - when doing more advanced stuff (subtyping has been tricky for me anyway). I expect the dev's of pyrex, pygame etc might also need to discuss C API spesific issues as well. Iv had roughly this conversation in IRC... Q. Hi, Id like to know how wrap python subtype methods in the C API A. C dosnt have classes, use C++ Q. no I want to use pythons C API, A. Subtypes are easy to do in python.. ...... you get the idea... Quite a few "python only" users dont understand where the Python/C API fits in and its annoying to have to explain the question each time (yes, Iv had these conversations more then once) - Cam From amk at amk.ca Thu Jun 21 17:23:52 2007 From: amk at amk.ca (A.M. Kuchling) Date: Thu, 21 Jun 2007 11:23:52 -0400 Subject: [Python-Dev] Wanted: readers for a mailbox.py article Message-ID: <20070621152352.GA10988@localhost.localdomain> I'm writing an article about the mailbox module for an online publication, and would like to get comments on the current draft from people familiar with the module. If you'd like to take a look, please e-mail me and I'll tell you the draft's URL. --amk From martin at v.loewis.de Thu Jun 21 19:25:06 2007 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Thu, 21 Jun 2007 19:25:06 +0200 Subject: [Python-Dev] Calling Methods from Pythons C API with Keywords In-Reply-To: <1182425636.6077.141.camel@localhost> References: <4677E66C.8000403@metavr.com> <1182324889.6077.111.camel@localhost> <4678FEB2.9050506@metavr.com> <1182339529.6077.120.camel@localhost> <467919BA.2090708@metavr.com> <46796348.2050902@v.loewis.de> <1182425636.6077.141.camel@localhost> Message-ID: <467AB472.6070509@v.loewis.de> > Futrhermore, questions about the C API often concern CPython > implementation details and so they don't feel like they would belong in > comp.lang.python. As an experiment, it might make sense to open a > mailing list dedicated to the Python C API. It could become a useful > support forum for extension writers (a group very useful to Python) and > maybe even a melting pot for new ideas regarding CPython, much like > comp.lang.python historically provided ideas for Python the language. In the past, we created special-interest groups for such discussion. Would you like to coordinate a C sig? See http://www.python.org/community/sigs/ Regards, Martin From arigo at tunes.org Fri Jun 22 11:06:22 2007 From: arigo at tunes.org (Armin Rigo) Date: Fri, 22 Jun 2007 11:06:22 +0200 Subject: [Python-Dev] Vilnius/Post EuroPython PyPy Sprint 12-14th of July Message-ID: <20070622090621.GA19639@code0.codespeak.net> ======================================================== Vilnius/Post EuroPython PyPy Sprint 12-14th of July ======================================================== The PyPy team is sprinting at EuroPython again and we invite you to participate in our 3 day long sprint at the conference hotel - Reval Hotel Lietuva. 
If you plan to attend the sprint we recommend you to listen to the PyPy technical talks (`EuroPython schedule`_) during the conference since it will give you a good overview of the status of development. On the morning of the first sprint day (12th) we will also have a tutorial session for those new to PyPy development. As 3 days is relatively short for a PyPy sprint we suggest to travel back home on the 15th if possible (but it is ok to attend less than 3 days too). ------------------------------ Goals and topics of the sprint ------------------------------ There are many possible and interesting sprint topics to work on - here we list some possible task areas: * completing the missing python 2.5 features and support * write or port more extension modules (e.g. zlib is missing) * identify slow areas of PyPy through benchmarking and work on improvements, possibly moving app-level parts of the Python interpreter to interp-level if useful. * there are some parts of PyPy in need of refactoring, we may spend some time on those, for example: - rctypes and the extension compiler need some rethinking - support for LLVM 2.0 for the llvm backend - ... * some JIT improvement work * port the stackless transform to ootypesystem * other interesting stuff that you would like to work on ...;-) ------------ Registration ------------ If you'd like to come, please subscribe to the `pypy-sprint mailing list`_ and drop a note about your interests and post any questions. More organisational information will be sent to that list. Please register by adding yourself on the following list (via svn): http://codespeak.net/svn/pypy/extradoc/sprintinfo/post-ep2007/people.txt or on the pypy-sprint mailing list if you do not yet have check-in rights: http://codespeak.net/mailman/listinfo/pypy-sprint --------------------------------------- Preparation (if you feel it is needed): --------------------------------------- * read the `getting-started`_ pages on http://codespeak.net/pypy * for inspiration, overview and technical status you are welcome to read `the technical reports available and other relevant documentation`_ * please direct any technical and/or development oriented questions to pypy-dev at codespeak.net and any sprint organizing/logistical questions to pypy-sprint at codespeak.net * if you need information about the conference, potential hotels, directions etc we recommend to look at http://www.europython.org. We are looking forward to meet you at the Vilnius Post EuroPython PyPy sprint! The PyPy team .. See also .. .. _getting-started: http://codespeak.net/pypy/dist/pypy/doc/getting-started.html .. _`pypy-sprint mailing list`: http://codespeak.net/mailman/listinfo/pypy-sprint .. _`the technical reports available and other relevant documentation`: http://codespeak.net/pypy/dist/pypy/doc/index.html .. _`EuroPython schedule`: http://indico.cern.ch/conferenceTimeTable.py?confId=13919&showDate=all&showSession=all&detailLevel=contribution&viewMode=room From henning.vonbargen at arcor.de Fri Jun 22 23:40:04 2007 From: henning.vonbargen at arcor.de (Henning von Bargen) Date: Fri, 22 Jun 2007 23:40:04 +0200 Subject: [Python-Dev] Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks Message-ID: <003a01c7b515$e56a7b50$6401a8c0@max> I'd like to propose a new function "open_noinherit" or maybe even a new mode flag "n" for the builtin "open" (see footnote for the names). 
The new function should work exactly like the builtin "open", with one difference: the open file is not inherited by any child processes (whereas files opened with "open" will be inherited).

The new function can be implemented (basically) using os.O_NOINHERIT on MS Windows and fcntl / FD_CLOEXEC on Posix, respectively. I will post a working Python implementation next week.

There are five reasons for the proposal:

1) The builtin "open" causes unexpected problems in conjunction with subprocesses, in particular in multi-threaded programs. It can cause file permission errors in the subprocess or in the current process. On Microsoft Windows, some of the possible file permission errors are not documented by Microsoft (thus very few programs written for Windows will react properly).
2) Having subprocesses inherit open file handles is a security risk.
3) For the developer, finding "cause and effect" is *very* hard, in particular in multi-threaded programs, when the errors occur only in race conditions.
4) The problems arise in some of the standard library modules as well, e.g. shutil.filecopy.
5) Very few developers are aware of the possible problems.

As a work-around, one can replace open with os.fdopen (os.open (..., + os.O_NOINHERIT), ... ) on Windows, but that's really ugly, hard to read, may raise a different exception than open (IOError instead of OSError), and needs careful work to take platform-specific code into account.

Here is a single-threaded example to demonstrate the effect:

import os
import subprocess

outf = open("blah.tmp", "wt")
subprocess.Popen("notepad.exe")  # or whatever program you like, but
# it must be a program that does not exit immediately!
# Now the subprocess has inherited the open file handle.

# We can still write:
outf.write("Hello world!\n")
outf.close()

# But we can not rename the file (at least on Windows):
os.rename("blah.tmp", "blah.txt")
# this fails with OSError: [Errno 13] Permission denied
# Similar problems occur with other file operations on non-Windows platforms.

OK, in this little program one can easily see what is going wrong. But what if the subprocess exits very quickly? Then perhaps you see the OSError, perhaps not - depending on the process scheduler of your operating system.

In a commercial multi-threaded daemon application, the error only occurred under heavy load and was hard to reproduce - and it was even harder to find the cause. That's because cause and effect were in two different threads in two completely different parts of the program:

- Thread A opens a file and starts to write data
- Thread B starts a subprocess (which inherits the file handle from thread A!)
- Thread A continues writing to the file and closes it.
- And now it's a race condition:
  a) Thread A wants to rename the file
  b) the subprocess exits.
  If a) is first: error; if b) is first: no error.

To make things more complicated, even two subprocesses can disturb each other.

The new function should ideally be implemented in C, because then the GIL could prevent a thread-switch between os.open and the fcntl.F_SETFD call.

Note that the problem described here arises not only for files, but for sockets as well. See bug 1222790: SimpleXMLRPCServer does not set FD_CLOEXEC.

Once there is an easy-to-use, platform-independent, documented builtin "open_noinherit" (or a new mode flag for "open"), the standard library should be reviewed. For each occurrence of "open" or "file", it should be considered whether it is necessary to inherit the file to subprocesses. If not, it should be replaced with open_noinherit.
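To give an idea of what I have in mind, here is a rough, untested sketch of the pure-Python variant (it only handles the simple open modes and skips the exception translation mentioned above):

import os

try:
    import fcntl
except ImportError:
    fcntl = None                    # e.g. on Windows

def open_noinherit(filename, mode="r", bufsize=-1):
    # Sketch only: just the plain "r"/"w"/"a" (and "b") modes,
    # no OSError-to-IOError translation yet.
    if "w" in mode:
        flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    elif "a" in mode:
        flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND
    else:
        flags = os.O_RDONLY
    if "b" in mode and hasattr(os, "O_BINARY"):
        flags |= os.O_BINARY        # Windows text/binary distinction
    if hasattr(os, "O_NOINHERIT"):
        flags |= os.O_NOINHERIT     # Windows: handle is not inherited
    fd = os.open(filename, flags, 0666)
    if fcntl is not None:
        # Posix: mark the descriptor close-on-exec instead
        fcntl.fcntl(fd, fcntl.F_SETFD, fcntl.FD_CLOEXEC)
    return os.fdopen(fd, mode, bufsize)

Note that on Posix this is still not race-free - a thread switch (and a fork in another thread) can happen between os.open and the fcntl call - which is exactly why a C implementation would be preferable, as mentioned above.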
One example is shutil.filecopy, where open_noinherit should be used instead of open. The socket module is another candidate, I think - but I'm not sure about that.

A nice effect of using "open_noinherit" is that - in many cases - one no longer needs to specify close_fds = True when calling subprocess.Popen. [Note that close_fds is *terribly* slow if MAX_OPEN_FILES is "big", e.g. 800, see bug 1663329]

Footnote: While writing this mail, at least 3 times I typed "nonherit" instead of "noinherit". So maybe someone can propose a better name? Or a new mode flag character could be "p" (like "private" or "protected").

Henning

From kbk at shore.net Sat Jun 23 04:17:15 2007 From: kbk at shore.net (Kurt B. Kaiser) Date: Fri, 22 Jun 2007 22:17:15 -0400 (EDT) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200706230217.l5N2HFld023393@hampton.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 385 open (+21) / 3790 closed (+21) / 4175 total (+42) Bugs : 1029 open (+43) / 6744 closed (+43) / 7773 total (+86) RFE : 262 open ( +4) / 291 closed ( +4) / 553 total ( +8) New / Reopened Patches ______________________ syslog syscall support for SysLogLogger (2007-05-02) http://python.org/sf/1711603 reopened by luke-jr syslog syscall support for SysLogLogger (2007-05-02) http://python.org/sf/1711603 reopened by luke-jr syslog syscall support for SysLogLogger (2007-05-02) http://python.org/sf/1711603 reopened by luke-jr syslog syscall support for SysLogLogger (2007-05-02) http://python.org/sf/1711603 reopened by luke-jr dict size changes during iter (2007-05-24) http://python.org/sf/1724999 opened by Ali Gholami Rudi Line ending bug SimpleXMLRPCServer (2007-05-24) http://python.org/sf/1725295 opened by bgrubbs IDLE - cursor color configuration bug (2007-05-25) http://python.org/sf/1725576 opened by Tal Einat Distutils default exclude doesn't match top level .svn (2007-05-25) http://python.org/sf/1725737 opened by Petteri R?ty ftplib.py: IndexError in voidresp occasionally (2007-05-26) http://python.org/sf/1726172 opened by kxroberto Patch to vs 2005 build (2007-05-26) http://python.org/sf/1726195 opened by Joseph Armbruster Windows Build Warnings (2007-05-26) http://python.org/sf/1726196 opened by Joseph Armbruster Line iteration readability (2007-05-26) http://python.org/sf/1726198 opened by Joseph Armbruster SimpleHTTPServer extensions_map (2007-05-26) http://python.org/sf/1726208 opened by Joseph Armbruster ftplib and ProFTPD NLST 226 without 1xx response (2007-05-27) http://python.org/sf/1726451 opened by Kenneth Loafman First steps towards new super (PEP 367) (2007-05-28) CLOSED http://python.org/sf/1727209 opened by Guido van Rossum move intern to sys, make intern() optionally warn (2007-05-31) http://python.org/sf/1728741 opened by Anthony Baxter IDLE - configDialog layout cleanup (2007-06-03) http://python.org/sf/1730217 opened by Tal Einat telnetlib: A callback for monitoring the telnet session (2007-06-04) http://python.org/sf/1730959 opened by Samuel Abels BufReader, TextReader for PEP 3116 "New I/O" (2007-06-04) http://python.org/sf/1731036 opened by Ilguiz Latypov Pruning threading.py from asserts (2007-06-05) CLOSED http://python.org/sf/1731049 opened by Bj?rn Lindqvist Expect skips by platform (2007-06-04) http://python.org/sf/1731169 opened by Matt Kraai Missing Py_DECREF in pysqlite_cache_display (2007-06-05) CLOSED http://python.org/sf/1731330 opened by Tim Delaney Improve doc for time.strptime (2007-06-05) http://python.org/sf/1731659 opened by Bj?rn Lindqvist
urllib.urlretrieve/URLopener.retrieve - 'buff' argument (2007-06-05) http://python.org/sf/1731720 opened by Dariusz Suchojad Document the constants in the socket module (2007-06-06) http://python.org/sf/1732367 opened by Bj?rn Lindqvist Allow T_LONGLONG to accepts ints (2007-06-09) CLOSED http://python.org/sf/1733960 opened by Roger Upole _lsprof.c:ptrace_enter_call assumes PyErr_* is clean (2007-06-09) http://python.org/sf/1733973 opened by Eyal Lotem PY_LLONG_MAX and so on (2007-06-09) CLOSED http://python.org/sf/1734014 opened by Hirokazu Yamamoto Fast path for unicodedata.normalize() (2007-06-10) http://python.org/sf/1734234 opened by Rauli Ruohonen patch for bug 1170311 "zipfile UnicodeDecodeError" (2007-06-10) http://python.org/sf/1734346 opened by Alexey Borzenkov platform.py patch to support turbolinux (2007-06-11) CLOSED http://python.org/sf/1734945 opened by Yayati_Turbolinux Fix selectmodule.c compilation on GNU/Hurd (2007-06-11) http://python.org/sf/1735030 opened by Michael Banck Kill StandardError (2007-06-12) CLOSED http://python.org/sf/1735485 opened by Collin Winter asyncore should handle also ECONNABORTED in recv (2007-06-12) http://python.org/sf/1736101 opened by billiejoex asyncore/asynchat patches (2007-06-12) http://python.org/sf/1736190 opened by Josiah Carlson EasyDialogs patch to remove aepack dependency (2007-06-15) http://python.org/sf/1737832 opened by has help() can't find right source file (2007-06-15) http://python.org/sf/1738179 opened by Greg Couch Add a -z interpreter flag to execute a zip file (2007-06-19) http://python.org/sf/1739468 opened by andy-chu zipfile.testzip() using progressive file reads (2007-06-19) http://python.org/sf/1739648 opened by Grzegorz Adam Hankiewicz Patch inspect.py for IronPython / Jython Compatibility (2007-06-19) http://python.org/sf/1739696 opened by Mike Foord Accelerate attr dict lookups (2007-06-19) http://python.org/sf/1739789 opened by Eyal Lotem Add reduce to functools in 2.6 (2007-06-19) http://python.org/sf/1739906 opened by Christian Heimes Fix Decimal.sqrt bugs described in #1725899 (2007-06-22) http://python.org/sf/1741308 opened by Mark Dickinson Patches Closed ______________ Make isinstance/issubclass overloadable for PEP 3119 (2007-04-26) http://python.org/sf/1708353 closed by gvanrossum subprocess: Support close_fds on Win32 (2007-02-26) http://python.org/sf/1669481 closed by astrand First steps towards new super (PEP 3135) (2007-05-28) http://python.org/sf/1727209 closed by gvanrossum platform.system() returns incorrect value in Vista (2007-05-28) http://python.org/sf/1726668 closed by lemburg Fix warnings related to PyLong_FromVoidPtr (2007-05-05) http://python.org/sf/1713234 closed by theller fix 1668596: copy datafiles properly when package_dir is ' ' (2007-05-17) http://python.org/sf/1720897 closed by nnorwitz Hide iteration variable in list comprehensions (2007-02-15) http://python.org/sf/1660500 closed by gbrandl urllib2 raises an UnboundLocalError if "auth-int" is the qop (2007-02-24) http://python.org/sf/1667860 closed by gbrandl Pruning threading.py from asserts (2007-06-04) http://python.org/sf/1731049 closed by collinwinter Missing Py_DECREF in pysqlite_cache_display (2007-06-05) http://python.org/sf/1731330 closed by gbrandl Fix tests that assume they can write to Lib/test (2006-07-12) http://python.org/sf/1520904 closed by dgreiman Allow specifying headers for MIME parts (2007-02-22) http://python.org/sf/1666625 closed by nnorwitz x64 clean compile patch for _ctypes (2007-05-09) 
http://python.org/sf/1715718 closed by theller bug fix: ctypes truncates 64-bit pointers (2007-04-19) http://python.org/sf/1703286 closed by theller fixes non ansi c declarations in libffi (2007-04-19) http://python.org/sf/1703300 closed by theller Allow T_LONGLONG to accepts ints (2007-06-09) http://python.org/sf/1733960 closed by loewis PY_LLONG_MAX and so on (2007-06-09) http://python.org/sf/1734014 closed by loewis bdist_deb - Debian packager (2004-10-27) http://python.org/sf/1054967 closed by jafo platform.py patch to support turbolinux (2007-06-11) http://python.org/sf/1734945 closed by lemburg Kill StandardError (2007-06-12) http://python.org/sf/1735485 closed by collinwinter locale.getdefaultlocale() bug when _locale is missing (2006-09-06) http://python.org/sf/1553427 closed by gbrandl New / Reopened Bugs ___________________ inspect.formatargspec last argument ignored (2007-05-23) CLOSED http://python.org/sf/1723875 opened by Patrick Dobbs Grammar error in Python Tutorial 2.5 section 8.3 (2007-05-23) CLOSED http://python.org/sf/1724099 opened by sampson cPickle module doesn't work with universal line endings (2007-05-23) http://python.org/sf/1724366 opened by Geoffrey Bache shlex.split problems on Windows (2007-05-24) http://python.org/sf/1724822 opened by Geoffrey Bache bsddb.btopen . del of record doesn't update index (2007-05-25) http://python.org/sf/1725856 opened by Charles Hixson bsddb.btopen . del of record doesn't update index (2007-05-25) CLOSED http://python.org/sf/1725862 opened by Charles Hixson decimal sqrt method doesn't use round-half-even (2007-05-25) http://python.org/sf/1725899 opened by Mark Dickinson Typo in ctypes.wintypes definition of WIN32_FIND_DATA field (2007-05-26) CLOSED http://python.org/sf/1726026 opened by Koby Kahane bsddb.btopen . 
del of record doesn't update index (2007-05-27) CLOSED http://python.org/sf/1726299 opened by Charles Hixson platform.system() returns incorrect value in Vista (2007-05-27) CLOSED http://python.org/sf/1726668 opened by Benjamin Leppard Bug found in datetime for Epoch time = -1 (2007-05-28) http://python.org/sf/1726687 opened by Martin Blais subprocess: unreliability of returncode not clear from docs (2007-05-28) http://python.org/sf/1727024 opened by Dan O'Huiginn 'assert statement' in doc index links to AssertionError (2007-05-29) CLOSED http://python.org/sf/1727417 opened by ?smund Skj?veland xmlrpclib waits indefinately (2007-05-29) http://python.org/sf/1727418 opened by Arno Stienen 64/32-bit issue when unpickling random.Random (2007-05-29) http://python.org/sf/1727780 opened by Charles reading from malformed big5 document hangs cpython (2007-05-30) CLOSED http://python.org/sf/1728403 opened by tsuraan 0.0 and -0.0 end up referring to the same object (2007-05-31) http://python.org/sf/1729014 opened by Johnnyg os.stat producing incorrect / invalid results (2007-05-31) CLOSED http://python.org/sf/1729170 opened by Joe SVNVERSION redefined during compilation (2007-05-31) CLOSED http://python.org/sf/1729277 opened by Brett Cannon Error in example (2007-05-31) CLOSED http://python.org/sf/1729280 opened by accdak test_doctest fails when run in verbose mode (2007-05-31) http://python.org/sf/1729305 opened by Neal Norwitz missing int->Py_ssize_t in documentation (2007-06-01) http://python.org/sf/1729742 opened by Brian Wellington test_bsddb3 malloc corruption bug #1721309 broken in 2.6 (2007-06-02) CLOSED http://python.org/sf/1729929 opened by David Favor 2.5.1 latest svn fails test_curses and test_timeout (2007-06-02) http://python.org/sf/1729930 opened by David Favor cStringIO no loonger accepts array.array objects (2007-06-02) http://python.org/sf/1730114 opened by reedobrien tkFont.__eq__ gives type error (2007-06-02) http://python.org/sf/1730136 opened by L. Peter Deutsch getattr([], '__eq__')(some-object) is NotImplemented (2007-06-03) CLOSED http://python.org/sf/1730322 opened by L. Peter Deutsch When Mesa is built with NPTL support, Python extensions link (2007-06-03) http://python.org/sf/1730372 opened by Gazi Alankus strptime bug in time module (2007-06-03) CLOSED http://python.org/sf/1730389 opened by Emma __cmp__ present in type but not instance?? (2007-06-03) CLOSED http://python.org/sf/1730401 opened by L. Peter Deutsch os._execvpe raises assignment error in python 3000 svn (2007-06-04) CLOSED http://python.org/sf/1730441 opened by nifan dict constructor accesses internal items of dict derivative (2007-06-03) http://python.org/sf/1730480 opened by Blake Ross Importing a submodule after unloading its parent (2007-06-04) http://python.org/sf/1731068 opened by Blake Ross tkinter memory leak problem (2007-06-05) http://python.org/sf/1731706 opened by Robert Hancock race condition in subprocess module (2007-06-05) http://python.org/sf/1731717 opened by dsagal python 2.6 latest fails test_socketserver.py (2007-06-06) http://python.org/sf/1732145 opened by David Favor Unable to Start IDLE (2007-06-06) CLOSED http://python.org/sf/1732160 opened by Kishore Destructor behavior faulty (2007-05-12) http://python.org/sf/1717900 reopened by gbrandl repr of 'nan' floats not parseable (2007-06-06) http://python.org/sf/1732212 opened by Pete Shinners T_LONGLONG chokes on ints (2007-06-06) CLOSED http://python.org/sf/1732557 opened by Roger Upole Built-in open function fail. 
Too many file open (2007-06-07) CLOSED http://python.org/sf/1732629 opened by Alex socket makefile objects are not independent (2007-06-07) http://python.org/sf/1732662 opened by Jan Ondrej Built-in open function fail. Too many file open (2007-06-07) http://python.org/sf/1732686 reopened by alexteo21 Built-in open function fail. Too many file open (2007-06-07) http://python.org/sf/1732686 reopened by alexteo21 Built-in open function fail. Too many file open (2007-06-07) http://python.org/sf/1732686 opened by Alex sqlite3 module trigger problem (2007-06-07) http://python.org/sf/1733085 opened by Oinopion sqlite3.dll cannot be relocated (2007-06-08) http://python.org/sf/1733134 opened by Tim Delaney slice type is unhashable (2007-06-07) http://python.org/sf/1733184 opened by L. Peter Deutsch Solaris 64 bit LD_LIBRARY_PATH_64 needs to be set (2007-06-08) http://python.org/sf/1733484 opened by Brad Hochstetler AIX Objects/buffereobject.c does not build on AIX (2007-06-08) CLOSED http://python.org/sf/1733488 opened by Brad Hochstetler AIX Modules/unicodedata.c does not build (2007-06-08) CLOSED http://python.org/sf/1733493 opened by Brad Hochstetler Modules/ld_so_aix needs to strip path off of whichcc call (2007-06-08) http://python.org/sf/1733509 opened by Brad Hochstetler zlib configure behaves differently than main configure (2007-06-08) http://python.org/sf/1733513 opened by Brad Hochstetler setup.py incorrect for HP (2007-06-08) CLOSED http://python.org/sf/1733518 opened by Brad Hochstetler HP shared object option (2007-06-08) http://python.org/sf/1733523 opened by Brad Hochstetler HP automatic build of zlib (2007-06-08) http://python.org/sf/1733532 opened by Brad Hochstetler windows 64 bit builds (2007-06-08) CLOSED http://python.org/sf/1733536 opened by Brad Hochstetler HP 64 bit does not run (2007-06-08) http://python.org/sf/1733544 opened by Brad Hochstetler AIX shared object build of python 2.5 does not work (2007-06-08) http://python.org/sf/1733546 opened by Brad Hochstetler RuntimeWarning: tp_compare didn't return -1 or -2 (2007-06-08) http://python.org/sf/1733757 opened by Fabio Zadrozny Tkinter is not working on trunk (2.6) (2007-06-09) http://python.org/sf/1733943 opened by Hirokazu Yamamoto mmap.mmap can overrun buffer (2007-06-09) http://python.org/sf/1733986 opened by Roger Upole struct.Struct.size is not documented (2007-06-09) http://python.org/sf/1734111 opened by Yang Yang sqlite3 causes memory read error (2007-06-10) http://python.org/sf/1734164 opened by atsuo ishimoto Repr class from repr module ignores maxtuple attribute (2007-06-11) CLOSED http://python.org/sf/1734723 opened by Jason Roberts Tutorial Section 6.4 (2007-06-10) CLOSED http://python.org/sf/1734732 opened by Eric Naeseth sitecustomize.py not found (2007-06-11) http://python.org/sf/1734860 opened by www.spirito.de file.read() truncating strings under Windows (2007-06-12) http://python.org/sf/1735418 opened by cgkanchi Add O_NOATIME to os module (2007-06-12) http://python.org/sf/1735632 opened by sam morris Mac build fails if not building universal due to libtool (2007-06-12) http://python.org/sf/1736103 opened by Jack Jansen os.popen('yes | echo hello') stuck (2007-06-13) http://python.org/sf/1736483 opened by Eric dict reentrant/threading bug (2007-06-13) http://python.org/sf/1736792 opened by Adam Olsen re.findall hangs python completely (2007-06-14) http://python.org/sf/1737127 reopened by abakker re.findall hangs python completely (2007-06-14) http://python.org/sf/1737127 opened by Arno Bakker 
Add/Remove programs shows Martin v L?wis (2007-06-14) http://python.org/sf/1737210 opened by Simon Dahlbacka telnetlib.Telnet does not process DATA MARK (DM) (2007-06-15) http://python.org/sf/1737737 opened by Norbert Buchm?ller logging.exception() does not allow empty string (2007-06-15) CLOSED http://python.org/sf/1737864 opened by Dmitrii Tisnek parser error : out of memory error (2007-06-15) CLOSED http://python.org/sf/1738193 opened by paul beard Universal MacPython 2.5.1 installation fails (2007-06-16) http://python.org/sf/1738250 opened by Shinichiro Wachi shutil.move doesn't work when only case changes (2007-06-16) http://python.org/sf/1738441 opened by Gabriel Gambetta Python-2.5.1.tar.bz2 build failed at Centos-4.5 server (2007-06-17) http://python.org/sf/1738559 opened by shuvo sqlite3 doc fix (2007-06-17) CLOSED http://python.org/sf/1738670 opened by Mark Carter Tutorial error in 3.1.2 Strings (2007-06-17) CLOSED http://python.org/sf/1738754 opened by otan Bug assigning list comprehension to __slots__ in python 2.5 (2007-06-18) CLOSED http://python.org/sf/1739107 opened by Fran?ois Desloges shutil.rmtree's error message is confusing (2007-06-18) CLOSED http://python.org/sf/1739115 opened by Bj?rn Lindqvist Investigated ref leak report related to thread(regrtest.py - (2007-06-18) http://python.org/sf/1739118 opened by Hirokazu Yamamoto Interactive help raise exception while listing modules (2007-06-19) CLOSED http://python.org/sf/1739659 opened by Dmitry Vasiliev xmlrpclib can no longer marshal Fault objects (2007-06-19) http://python.org/sf/1739842 opened by Mike Bonnet asynchat should call "handle_close" (2007-06-20) http://python.org/sf/1740572 opened by billiejoex python: Modules/gcmodule.c:240: update_refs: Assertion `gc-> (2007-06-20) http://python.org/sf/1740599 opened by Sean struct.pack("I", "foo"); struct.pack("L", "foo") should fail (2007-06-21) http://python.org/sf/1741130 opened by Thomas Heller string formatter %x problem with indirectly given long (2007-06-21) http://python.org/sf/1741218 opened by Kenji Noguchi defined format returns error (2007-06-22) CLOSED http://python.org/sf/1741524 opened by Ted Bell Odd UDP problems in socket library (2007-06-22) http://python.org/sf/1741898 opened by Jay Sherby Bugs Closed ___________ inspect.formatargspec last argument ignored (2007-05-23) http://python.org/sf/1723875 closed by patrickcd Crash in ctypes callproc function with unicode string arg (2007-05-22) http://python.org/sf/1723338 closed by theller Grammar error in Python Tutorial 2.5 section 8.3 (2007-05-23) http://python.org/sf/1724099 closed by gbrandl Option -OO doesn't remove docstrings (2007-05-21) http://python.org/sf/1722485 closed by gbrandl shlex.split problems on Windows (2007-05-24) http://python.org/sf/1724822 closed by gbrandl docu enhancement for logging.handlers.SysLogHandler (2007-05-17) http://python.org/sf/1720726 closed by vsajip tarfile stops expanding with long filenames (2007-05-16) http://python.org/sf/1719898 closed by gustaebel bsddb.btopen . del of record doesn't update index (2007-05-25) http://python.org/sf/1725862 closed by nnorwitz Typo in ctypes.wintypes definition of WIN32_FIND_DATA field (2007-05-26) http://python.org/sf/1726026 closed by theller bsddb.btopen . 
del of record doesn't update index (2007-05-26) http://python.org/sf/1726299 closed by nnorwitz 'assert statement' in doc index links to AssertionError (2007-05-29) http://python.org/sf/1727417 closed by gbrandl reading from malformed big5 document hangs cpython (2007-05-31) http://python.org/sf/1728403 closed by perky os.stat producing incorrect / invalid results (2007-05-31) http://python.org/sf/1729170 closed by loewis SVNVERSION redefined during compilation (2007-06-01) http://python.org/sf/1729277 closed by loewis Error in example (2007-05-31) http://python.org/sf/1729280 closed by nnorwitz distutils chops the first character of filenames (2007-02-25) http://python.org/sf/1668596 closed by nnorwitz test_bsddb3 malloc corruption bug #1721309 broken in 2.6 (2007-06-02) http://python.org/sf/1729929 closed by nnorwitz Compiler is not thread safe? (2007-05-16) http://python.org/sf/1720241 closed by loewis getattr([], '__eq__')(some-object) is NotImplemented (2007-06-03) http://python.org/sf/1730322 closed by collinwinter make testall shows many glibc detected malloc corruptions (2007-05-18) http://python.org/sf/1721309 closed by nnorwitz strptime bug in time module (2007-06-03) http://python.org/sf/1730389 closed by bcannon __cmp__ present in type but not instance?? (2007-06-03) http://python.org/sf/1730401 closed by bcannon os._execvpe raises assignment error in python 3000 svn (2007-06-03) http://python.org/sf/1730441 closed by nnorwitz Const(None) in compiler.ast.Return.value (2007-05-09) http://python.org/sf/1715581 closed by collinwinter CGIHttpServer fails if python exe has spaces (2007-05-02) http://python.org/sf/1711608 closed by collinwinter Unable to Start IDLE (2007-06-06) http://python.org/sf/1732160 closed by nnorwitz T_LONGLONG chokes on ints (2007-06-07) http://python.org/sf/1732557 closed by loewis Built-in open function fail. Too many file open (2007-06-07) http://python.org/sf/1732629 closed by gbrandl Built-in open function fail. Too many file open (2007-06-07) http://python.org/sf/1732686 closed by loewis Built-in open function fail. 
Too many file open (2007-06-07) http://python.org/sf/1732686 closed by gbrandl urllib2 has memory leaks (2006-02-13) http://python.org/sf/1430435 closed by gbrandl AIX Objects/buffereobject.c does not build on AIX (2007-06-08) http://python.org/sf/1733488 closed by loewis AIX Modules/unicodedata.c does not build (2007-06-08) http://python.org/sf/1733493 closed by perky setup.py incorrect for HP (2007-06-08) http://python.org/sf/1733518 closed by loewis windows 64 bit builds (2007-06-08) http://python.org/sf/1733536 closed by loewis ctypes Fundamental data types (2007-04-14) http://python.org/sf/1700455 closed by theller Repr class from repr module ignores maxtuple attribute (2007-06-10) http://python.org/sf/1734723 closed by nnorwitz Tutorial Section 6.4 (2007-06-10) http://python.org/sf/1734732 closed by nnorwitz logging.exception() does not allow empty string (2007-06-15) http://python.org/sf/1737864 closed by gbrandl parser error : out of memory error (2007-06-15) http://python.org/sf/1738193 closed by nnorwitz sqlite3 doc fix (2007-06-17) http://python.org/sf/1738670 closed by nnorwitz Tutorial error in 3.1.2 Strings (2007-06-17) http://python.org/sf/1738754 closed by nnorwitz Bug assigning list comprehension to __slots__ in python 2.5 (2007-06-18) http://python.org/sf/1739107 closed by gbrandl shutil.rmtree's error message is confusing (2007-06-18) http://python.org/sf/1739115 closed by gbrandl Interactive help raise exception while listing modules (2007-06-19) http://python.org/sf/1739659 closed by gbrandl defined format returns error (2007-06-22) http://python.org/sf/1741524 closed by gbrandl New / Reopened RFE __________________ provide a shlex.split alternative for Windows shell syntax (2007-05-24) http://python.org/sf/1724822 reopened by gbrandl add operator.fst and snd functions (2007-05-28) http://python.org/sf/1726697 opened by paul rubin add itertools.ichain function and count.getvalue (2007-05-28) CLOSED http://python.org/sf/1726707 opened by paul rubin -q (quiet) option for python interpreter (2007-05-30) http://python.org/sf/1728488 opened by Marcin Wojdyr ZipFile CallBack Needed... (2007-06-08) http://python.org/sf/1733259 opened by durumdara Newer reply format for imap commands in imaplib.py (2007-06-12) http://python.org/sf/1735509 opened by Naoyuki Tai make colon optional (2007-06-19) CLOSED http://python.org/sf/1739678 opened by Chris add multi-line comments (2007-06-19) CLOSED http://python.org/sf/1739679 opened by Chris RFE Closed __________ add itertools.ichain function and count.getvalue (2007-05-27) http://python.org/sf/1726707 closed by rhettinger new functool: "defaults" decorator (2007-05-15) http://python.org/sf/1719222 closed by rhettinger make colon optional (2007-06-19) http://python.org/sf/1739678 closed by gbrandl add multi-line comments (2007-06-19) http://python.org/sf/1739679 closed by gbrandl From martin at v.loewis.de Sat Jun 23 08:41:54 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 23 Jun 2007 08:41:54 +0200 Subject: [Python-Dev] Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks In-Reply-To: <003a01c7b515$e56a7b50$6401a8c0@max> References: <003a01c7b515$e56a7b50$6401a8c0@max> Message-ID: <467CC0B2.2010700@v.loewis.de> Henning von Bargen schrieb: > I'd like to propose a new function "open_noinherit" > or maybe even a new mode flag "n" for the builtin "open" > (see footnote for the names). Do you have a patch implementing that feature? 
I believe it's unimplementable in Python 2.x: open() is mapped to fopen(), which does not support O_NOINHERIT.

If you don't want the subprocess to inherit handles, why don't you just specify close_fds=True when creating the subprocess?

Regards,
Martin

From martin at v.loewis.de Sat Jun 23 09:32:42 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 23 Jun 2007 09:32:42 +0200
Subject: [Python-Dev] bzr on dinsdale
Message-ID: <467CCC9A.7050502@v.loewis.de>

If I do "bzr status" in dinsdale:/etc/apache2, I get

bzr: ERROR: bzrlib.errors.BzrCheckError: Internal check failed: file u'/etc/init.d/stop-bootlogd' entered as kind 'symlink' id 'stopbootlogd-20070303140018-fe340b888f6e9c69', now of kind 'file'
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/bzrlib/commands.py", line 611, in run_bzr_catch_errors
    return run_bzr(argv)
......
BzrCheckError: Internal check failed: file u'/etc/init.d/stop-bootlogd' entered as kind 'symlink' id 'stopbootlogd-20070303140018-fe340b888f6e9c69', now of kind 'file'

bzr 0.11.0 on python 2.4.4.final.0 (linux2)
arguments: ['/usr/bin/bzr', 'status']

** please send this report to bazaar-ng at lists.ubuntu.com

Can somebody experienced with bzr please help?

Regards,
Martin

From henning.vonbargen at arcor.de Sat Jun 23 10:04:33 2007
From: henning.vonbargen at arcor.de (Henning von Bargen)
Date: Sat, 23 Jun 2007 10:04:33 +0200
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks
References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de>
Message-ID: <001901c7b56d$22c83f80$6401a8c0@max>

""" OT: Argh - my email address is visible in the posting - I am doomed! """

----- Original Message -----
> Martin v. Löwis wrote:
>
> Do you have a patch implementing that feature? I believe
> it's unimplementable in Python 2.x: open() is mapped
> to fopen(), which does not support O_NOINHERIT.

Yes, I have a patch implemented in pure Python.

I got the code on my workplace PC (now I am writing from home, that's why I said I'll post the code later).

The patch uses os.fdopen ( os.open (..., ...), ...). It translates OSError into IOError in order to raise the same class of exception as open(). Unfortunately, the patch ignores the bufsize argument, so it is only a prototype at this time.

I know that open() is mapped to fopen() and that fopen does not support O_NOINHERIT. Thus a correct patch has to be implemented at the C level. It should use open and fdopen instead of fopen - just like the Python prototype. AFAIK in the C stdlib implementation, fopen is implemented based on open anyway.

BTW, to find out what happens, I had to look at the source distribution for the first time after 3 years of using Python.

> If you don't want the subprocess to inherit handles,
> why don't you just specify close_fds=True when creating
> the subprocess?

The subprocess module is a great piece of code, but it has its weaknesses. "close_fds" is one of them. subprocess.py fails on MS Windows if I specify close_fds. And it *cannot* be fixed for MS Windows in the subprocess module. This is due to the different way MS Windows handles handles :-) in child process creation:

In Posix, you can just work through the file numbers range and close the ones you don't want/need in the subprocess. This is how close_fds works internally. It closes the fds starting from 3 to MAX_FDs-1, thus only stdin, stdout and stderr are inherited.
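Roughly like this (a simplified sketch of the idea, not the actual subprocess.py source):

import os

try:
    MAXFD = os.sysconf("SC_OPEN_MAX")
except (AttributeError, ValueError, OSError):
    MAXFD = 256

def close_fds_in_child():
    # After fork(), before exec(): close everything except 0, 1, 2.
    for fd in xrange(3, MAXFD):
        try:
            os.close(fd)
        except OSError:
            pass

This brute-force loop over every possible descriptor is also the reason why close_fds becomes so slow when the fd limit is large (see bug 1663329).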
On MS Windows, AFAIK (correct me if I am wrong), you can only choose either to inherit handles or not *as a whole* - including stdin, stdout and stderr -, when calling CreateProcess. Each handle has a security attribute that specifies whether the handle should be inherited or not - but this has to be specified when creating the handle (in the Windows CreateFile API internally). Thus, on MS Windows, you can either choose to inherit all files opened with "open" + [stdin, stdout, stderr], or to not inherit any files (meaning even stdin, stdout and stderr will not be inherited). In a platform-independent program, close_fds is therefore not an option. Henning From martin at v.loewis.de Sat Jun 23 11:17:20 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 23 Jun 2007 11:17:20 +0200 Subject: [Python-Dev] Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks In-Reply-To: <001901c7b56d$22c83f80$6401a8c0@max> References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de> <001901c7b56d$22c83f80$6401a8c0@max> Message-ID: <467CE520.3000405@v.loewis.de> > Yes, I have a patch implemented in pure Python. > > I got the code on my workplace PC (now I am writing from home, > that's why I said I'll post the code later). > > The patch uses os.fdopen ( os.open (..., ...), ...). > It translates IOError into OSError then to raise the same class > of exception aso open(). Hmm. I don't think I could accept such an implementation (whether in Python or in C). That's very hackish. > AFAIK in the C stdlib implementation, fopen is implemented > based on open anyway. Sure - and in turn, open is implemented on CreateFile. However, I don't think I would like to see an fopen implementation in Python. Python 3 will drop stdio entirely; for 2.x, I'd be cautious to change things because that may break other things in an unexpected manner. > On MS Windows, AFAIK (correct me if I am wrong), you can > only choose either to inherit handles or not *as a whole* > - including stdin, stdout and stderr -, when calling CreateProcess. I'm not sure. In general, that seems to be true. However, according to the ReactOS sources at http://www.reactos.org/generated/doxygen/dd/dda/dll_2win32_2kernel32_2process_2create_8c-source.html#l00624 Windows will duplicate stdin,stdout,stderr from the parent process even if bInheritHandles is false, provided that no handles are specified in the startupinfo, and provided that the program to be started is a console (CUI) program. > Each handle has a security attribute that specifies whether the > handle should be inherited or not - but this has to be specified > when creating the handle (in the Windows CreateFile API internally). Not necessarily. You can turn on the flag later, through SetHandleInformation. > Thus, on MS Windows, you can either choose to inherit all > files opened with "open" + [stdin, stdout, stderr], > or to not inherit any files (meaning even stdin, stdout and stderr > will not be inherited). > > In a platform-independent program, close_fds is therefore not an option. ... assuming you care about whether stdin,stdout,stderr are inherited to GUI programs. If the child process makes no use of stdin/stdout, you can safely set close_fds to true. 
Regards, Martin From dima at hlabs.spb.ru Sat Jun 23 11:38:58 2007 From: dima at hlabs.spb.ru (Dmitry Vasiliev) Date: Sat, 23 Jun 2007 13:38:58 +0400 Subject: [Python-Dev] bzr on dinsdale In-Reply-To: <467CCC9A.7050502@v.loewis.de> References: <467CCC9A.7050502@v.loewis.de> Message-ID: <467CEA32.9040605@hlabs.spb.ru> Martin v. L?wis wrote: > If I do "bzr status" in dinsdale:/etc/apache2, I get > > BzrCheckError: Internal check failed: file u'/etc/init.d/stop-bootlogd' > entered as kind 'symlink' id > 'stopbootlogd-20070303140018-fe340b888f6e9c69', now of kind 'file' > > bzr 0.11.0 on python 2.4.4.final.0 (linux2) > arguments: ['/usr/bin/bzr', 'status'] > > ** please send this report to bazaar-ng at lists.ubuntu.com > > Can somebody experienced with bzr please help? Bzr allow kind changes only starting from version 0.15, for old versions you should first remove file from version control with 'bzr rm' and then add again with 'bzr add'. -- Dmitry Vasiliev http://hlabs.spb.ru From martin at v.loewis.de Sat Jun 23 13:18:48 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 23 Jun 2007 13:18:48 +0200 Subject: [Python-Dev] bzr on dinsdale In-Reply-To: <467CEA32.9040605@hlabs.spb.ru> References: <467CCC9A.7050502@v.loewis.de> <467CEA32.9040605@hlabs.spb.ru> Message-ID: <467D0198.8010103@v.loewis.de> > Bzr allow kind changes only starting from version 0.15, for old versions > you should first remove file from version control with 'bzr rm' and then > add again with 'bzr add'. Thanks! that worked fine. Regards, Martin From henning.vonbargen at arcor.de Sat Jun 23 14:32:24 2007 From: henning.vonbargen at arcor.de (Henning von Bargen) Date: Sat, 23 Jun 2007 14:32:24 +0200 Subject: [Python-Dev] Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de> <001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de> Message-ID: <000801c7b592$8e13e670$6401a8c0@max> > "Martin v. L?wis" wrote: >> Yes, I have a patch implemented in pure Python. >> >> I got the code on my workplace PC (now I am writing from home, >> that's why I said I'll post the code later). >> >> The patch uses os.fdopen ( os.open (..., ...), ...). >> It translates IOError into OSError then to raise the same class >> of exception aso open(). > > Hmm. I don't think I could accept such an implementation > (whether in Python or in C). That's very hackish. Well, if this is your opinion... Take a look at the fopen implementation in stdio's fopen.c: # I found it via Google Code Search in the directory src/libc/ansi/stdio/fopen.c # of http://clio.rice.edu/djgpp/win2k/djsr204_alpha.zip FILE * fopen(const char *file, const char *mode) { FILE *f; int fd, rw, oflags = 0; ... fd = open(file, oflags, 0666); if (fd < 0) return NULL; f->_cnt = 0; f->_file = fd; f->_bufsiz = 0; ... } ... return f; } As you can see, at the C level, basically "fopen" is "open" with a little code around it to parse flags etc. It's the same kind of hackish code. And (apart from the ignored bufsize argument) the prototype is working fine. I have to admit, though, that I am only using it on regular files. Anyway, I don't want to argue about the implementation of a patch. The fact is that until now the python programmer does not have an easy, platform-independent option to open files non-inheritable. As you mentioned yourself, the only way to work around it in a platform-independent manner, IS VERY HACKISH. 
So, shouldn't this hackish-ness better be hidden in the library instead of leaving it as an exercise to the common programmer?

The kind of errors I mentioned ("permission denied" errors that seem to occur without an obvious reason) have cost me at least two weeks of debugging the hard way (with ProcessExplorer etc) and caused my manager to lose his trust in Python altogether... I think it is well worth the effort to keep this trouble away from the Python programmers if possible.

And throughout the standard library modules, "open" is used, causing these problems as soon as sub-processes come into play.

Apart from shutil.copyfile, other examples of using open that can cause trouble are in socket.py (tell me any good reason why socket handles should be inherited by child processes) and even in logging.py.

For example, I used RotatingFileHandler for logging my daemon program activity. Sometimes, the logging itself caused errors, when a still-running child process had inherited the log file handle and log rotation occurred.

>
>> AFAIK in the C stdlib implementation, fopen is implemented
>> based on open anyway.
>
> Sure - and in turn, open is implemented on CreateFile.
> However, I don't think I would like to see an fopen
> implementation in Python. Python 3 will drop stdio entirely;
> for 2.x, I'd be cautious to change things because that
> may break other things in an unexpected manner.

Yeah, if you think it should not be included in 2.x, then the handle inheritance problem should at least be considered in the PEPs [(3116, "New I/O"), (337, "Logging Usage in the Standard Modules")]

>
>> On MS Windows, AFAIK (correct me if I am wrong), you can
>> only choose either to inherit handles or not *as a whole*
>> - including stdin, stdout and stderr -, when calling CreateProcess.
>
> I'm not sure. In general, that seems to be true. However,
> according to the ReactOS sources at
>
> http://www.reactos.org/generated/doxygen/dd/dda/dll_2win32_2kernel32_2process_2create_8c-source.html#l00624
>
> Windows will duplicate stdin,stdout,stderr from the parent
> process even if bInheritHandles is false, provided that
> no handles are specified in the startupinfo, and provided
> that the program to be started is a console (CUI) program.
>
>> Each handle has a security attribute that specifies whether the
>> handle should be inherited or not - but this has to be specified
>> when creating the handle (in the Windows CreateFile API internally).
>
> Not necessarily. You can turn on the flag later, through
> SetHandleInformation.

So do you think that a working "close_fds" could be implemented for Windows as well?

Explicitly turning off the inheritance flag for all child handles except stdin, stdout and stderr in subprocess / popen (the equivalent to what close_fds does for Posix) - that's what I call hackish. And I doubt that it is possible at all, for several reasons:
- you have to KNOW all the handles.
- due to the different process creation in Windows (there's no fork), you would have to set the inheritance flags afterwards
- all this is not thread-safe.

>
>> Thus, on MS Windows, you can either choose to inherit all
>> files opened with "open" + [stdin, stdout, stderr],
>> or to not inherit any files (meaning even stdin, stdout and stderr
>> will not be inherited).
>>
>> In a platform-independent program, close_fds is therefore not an option.
>
> ... assuming you care about whether stdin,stdout,stderr are inherited
> to GUI programs. If the child process makes no use of stdin/stdout, you
> can safely set close_fds to true.

Hmm...
In the bug 1663329 I posted ("subprocess/popen close_fds perform poor if SC_OPEN_MAX is hi"), you suggested:

"""
- you should set the FD_CLOEXEC flag on all file descriptors you don't want to be inherited, using fcntl(fd, F_SETFD, 1)
"""

Apart from the fact that this is not possible on MS Windows, it won't solve the problem! (Because then I couldn't use all those standard modules that use open *without* FD_CLOEXEC).

The fact is that the combination ("multi-threading", "subprocess creation", "standard modules") simply *does not work* flawlessly and produces errors that are hard to understand. And probably most programmers are not even aware of the problem. That's the main reason why I posted here.

And, in my experience, programs tend to get more complex, and in the future I expect to see more multi-threaded Python programs. So the problem will not vanish - we will see it more often than we like...

Regards,
Henning

From apt.shansen at gmail.com Sat Jun 23 17:39:38 2007
From: apt.shansen at gmail.com (Stephen Hansen)
Date: Sat, 23 Jun 2007 08:39:38 -0700
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks
In-Reply-To: <000801c7b592$8e13e670$6401a8c0@max>
References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de> <001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de> <000801c7b592$8e13e670$6401a8c0@max>
Message-ID: <7a9c25c20706230839y5b594ee6i917fbd0924e26db1@mail.gmail.com>

The kind of errors I mentioned ("permission denied" errors that
> seem to occur without an obvious reason) have cost me at least
> two weeks of debugging the hard way (with ProcessExplorer etc)
> and caused my manager to lose his trust in Python altogether...
> I think it is well worth the effort to keep this trouble away from
> the Python programmers if possible.
>
> And throughout the standard library modules, "open" is used,
> causing these problems as soon as sub-processes come into play.
>
> Apart from shutil.copyfile, other examples of using open that can cause
> trouble are in socket.py (tell me any good reason why socket handles
> should be inherited by child processes) and even in logging.py.
>
> For example, I used RotatingFileHandler for logging my daemon
> program activity. Sometimes, the logging itself caused errors,
> when a still-running child process had inherited the log file handle
> and log rotation occurred.

I just wanted to express to the group at large that these experiences aren't just Henning's; we spent a *tremendous* amount of time and effort debugging serious problems that arose from file handles getting shared to subprocesses where it wasn't really expected.

Specifically, the RotatingFileHandler example above. It blatantly just breaks when subprocesses are used and it's an extremely obtuse process to discover why.

It was very costly to the company because it came up at a bad time and was *so* obtuse of an error. At first it looked like some sort of thread-safety problem, so a lot of prying went into that before we got stumped... after all, we *knew* no other process touched that file, and the logging module (and RotatingFileHandler) claimed and looked thread-safe, so.. how could it be having a Permission Denied error when it very clearly is closing the file before rotating it?

Eventually the culprit was found, but it was very painful. A couple similar issues have arisen since, and they're only slightly easier to debug once you are expecting it.
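For what it's worth, the failure mode boils down to something like this (a hypothetical, Windows-flavoured repro; "notepad.exe" just stands in for any long-lived child process):

import logging, logging.handlers, subprocess

logger = logging.getLogger("demo")
logger.setLevel(logging.INFO)
handler = logging.handlers.RotatingFileHandler("app.log", maxBytes=50,
                                               backupCount=3)
logger.addHandler(handler)

logger.info("logged before the child is started")
child = subprocess.Popen("notepad.exe")   # the child inherits the open log handle

for i in range(100):
    logger.info("message %d eventually forces a rollover", i)

# While the child is still running, the rollover's os.rename() fails on
# Windows with "[Errno 13] Permission denied" (logging reports it via
# handleError), even though the handler itself closed the file first.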
But the fact that the simple and obvious features provided in the stdlib break as a result of you launching a subprocess at some point sorta sucks :)

So, yeah. Anything even remotely or vaguely approaching Henning's patch would be really, really appreciated.

--SH
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20070623/4c9eb540/attachment.htm

From henning.vonbargen at arcor.de Sat Jun 23 19:01:45 2007
From: henning.vonbargen at arcor.de (Henning von Bargen)
Date: Sat, 23 Jun 2007 19:01:45 +0200
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks
References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de> <001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de> <000801c7b592$8e13e670$6401a8c0@max> <7a9c25c20706230839y5b594ee6i917fbd0924e26db1@mail.gmail.com>
Message-ID: <001201c7b5b8$2ea03650$6401a8c0@max>

Stephen, thank you for saying it out loud on python-dev. And you know better English words like "tremendous" and "obtuse" (whatever that means :-) that express what a PITA this really is.

When I said it took me two weeks, that's actually not the truth. It was even more.

The first problem was with RotatingFileHandler, and just like you, I first thought it was a threading problem. Thus I wrote my own version of it, which builds a new log file name with a timestamp instead of renaming the old log files.

Actually, the point when I found out that subprocesses were indeed causing the problems was when I had the program running on about 50 computers (for different clients): for some clients the program would run very well, while for other clients there were often errors - and suddenly it came to my mind that the clients with the errors were those who used a subprocess for sending e-mail via MAPI, whereas the clients who didn't experience problems were those who used smtplib for sending e-mail (no subprocesses).

And then it took me a few days to write my replacement open function and to replace each occurrence of "open" with the replacement function.

And then, another few days later, a client told me that the errors *still* occurred (although in rare cases). At first I built a lot of tracing and debugging into the MAPI subprocess "sendmail.exe". Finally I found out that it was actually shutil.filecopy that caused the error. Of course, I hadn't searched for "open" in the whole bunch of standard modules...

Henning

From martin at v.loewis.de Sat Jun 23 20:09:17 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 23 Jun 2007 20:09:17 +0200
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks
In-Reply-To: <000801c7b592$8e13e670$6401a8c0@max>
References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de> <001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de> <000801c7b592$8e13e670$6401a8c0@max>
Message-ID: <467D61CD.9000403@v.loewis.de>

> As you can see, at the C level, basically "fopen" is "open" with a
> little code around it to parse flags etc. It's the same kind of hackish code.

"little code" is quite an understatement. In Microsoft's C library (which we would have to emulate), the argument parsing of fopen is 120 lines of code. In addition, that code changes across compiler versions (where VS 2005 adds additional error checking).
> Anyway, I don't want to argue about the implementation of a patch.
> The fact is that until now the python programmer does not have an
> easy, platform-independent option to open files non-inheritable.
> As you mentioned yourself, the only way to work around it
> in a platform-independent manner, IS VERY HACKISH.
> So, shouldn't this hackish-ness better be hidden in the library
> instead of leaving it as an exercise to the common programmer?

Putting it into the library is fine. However, we need to find an implementation strategy that meets the user's needs, and is still maintainable. Python 3 will offer a clean solution, deviating entirely from stdio. For 2.x, we need to find a better solution than the one you proposed.

> I think it is well worth the effort to keep this trouble away from
> the Python programmers if possible.

I don't argue about efforts - I argue about your proposed solution.

> Apart from shutil.copyfile, other examples of using open that can cause
> trouble are in socket.py (tell me any good reason why socket handles
> should be inherited by child processes) and even in logging.py.

On Unix, it is *very* common to inherit socket handles to child processes. The parent process opens the socket, and the child processes perform accept(3). This allows many processes to serve requests on the same port. In Python, SocketServer.Forking*Server rely on this precise capability.

>> Sure - and in turn, open is implemented on CreateFile.
>> However, I don't think I would like to see an fopen
>> implementation in Python. Python 3 will drop stdio entirely;
>> for 2.x, I'd be cautious to change things because that
>> may break other things in an unexpected manner.
>
> Yeah, if you think it should not be included in 2.x,
> then the handle inheritance problem should at least be considered
> in the PEPs [(3116, "New I/O"), (337, "Logging Usage in the Standard
> Modules")]

I didn't say that a solution shouldn't be included in 2.x. I said *your* solution shouldn't be. In 3.x, your solution won't apply, since Python won't be using stdio (so fdopen becomes irrelevant).

>>> Each handle has a security attribute that specifies whether the
>>> handle should be inherited or not - but this has to be specified
>>> when creating the handle (in the Windows CreateFile API internally).
>>
>> Not necessarily. You can turn on the flag later, through
>> SetHandleInformation.
>
> So do you think that a working "close_fds" could be implemented
> for Windows as well?

No. close_fds should have the semantics of only closing the handles for that subprocess. SetHandleInformation applies to the parent process, and *all* subprocesses. So this is different from close_fds.

> Explicitly turning off the inheritance flag for all child handles except
> stdin, stdout and stderr in subprocess / popen (the equivalent to
> what close_fds does for Posix) - that's what I call hackish.

I didn't propose that, and it wouldn't be the equivalent. In POSIX, the closing occurs in the child process. This is not possible on Windows, as there is no fork().

> And I doubt that it is possible at all, for several reasons:
> - you have to KNOW all the handles.
> - due to the different process creation in Windows (there's no fork),
> you would have to set the inheritance flags afterwards
> - all this is not thread-safe.

All true, and I did not suggest to integrate SetHandleInformation into subprocess. I *ONLY* claimed that you can change the flag after the file was opened.
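To illustrate (only a rough sketch, not a tested or proposed implementation; the ctypes-based Windows branch and the HANDLE_FLAG_INHERIT constant are my assumptions about how it could be spelled from Python):

import sys

def set_noinherit(fileobj):
    # Mark an already-open file as non-inheritable / close-on-exec.
    fd = fileobj.fileno()
    if sys.platform == "win32":
        import msvcrt, ctypes
        HANDLE_FLAG_INHERIT = 0x0001
        handle = msvcrt.get_osfhandle(fd)
        # SetHandleInformation(handle, mask, flags): clear the inherit bit.
        ctypes.windll.kernel32.SetHandleInformation(handle,
                                                    HANDLE_FLAG_INHERIT, 0)
    else:
        import fcntl
        flags = fcntl.fcntl(fd, fcntl.F_GETFD)
        fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)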
With that API, it would be possible to provide cross-platform access to the close-on-exec flag. Applications interested in setting it could then set it right after opening the file. > Apart from the fact that this is not possible on MS Windows, it won't > solve the problem! > (Because then I couldn't use all those standard modules that use open > *without* FD_CLOEXEC). > > The fact is that the combination ("multi-threading", "subprocess > creation", "standard modules") > simply *does not work* flawlessly and produces errors that are hard to > understand. > And probably most progammers are not even aware of the problem. > That's the main reason why I posted here. I don't see how your proposed change solves that. If there was an "n" flag, then the modules in the standard library that open files still won't use it. Regards, Martin From amk at amk.ca Sat Jun 23 20:36:45 2007 From: amk at amk.ca (A.M. Kuchling) Date: Sat, 23 Jun 2007 14:36:45 -0400 Subject: [Python-Dev] Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks In-Reply-To: <7a9c25c20706230839y5b594ee6i917fbd0924e26db1@mail.gmail.com> References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de> <001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de> <000801c7b592$8e13e670$6401a8c0@max> <7a9c25c20706230839y5b594ee6i917fbd0924e26db1@mail.gmail.com> Message-ID: <20070623183645.GA10808@andrew-kuchlings-computer.local> On Sat, Jun 23, 2007 at 08:39:38AM -0700, Stephen Hansen wrote: > I just wanted to express to the group at large that these experiences aren't > just Henning's; we spent a *tremendous* amount of time and effort debugging > serious problems that arose from file handles getting shared to subprocesses > where it wasn't really expected. I've also encountered this when writing programs that are SCGI servers that do a fork. SCGI is like FastCGI; the HTTP server passes requests to a local server using a custom protocol. If the fork doesn't close the SCGI server port, then Apache does nothing until the forked subprocess exits, because the subprocess is keeping the request socket open and alive. One fix is to always use subprocess.Popen and specify that close_fd=True, which wasn't difficult for me, but I can imagine that an easy way to set close-on-exec would be simpler in other cases. --amk From martin at v.loewis.de Sat Jun 23 21:34:55 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 23 Jun 2007 21:34:55 +0200 Subject: [Python-Dev] Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks In-Reply-To: <20070623183645.GA10808@andrew-kuchlings-computer.local> References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de> <001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de> <000801c7b592$8e13e670$6401a8c0@max> <7a9c25c20706230839y5b594ee6i917fbd0924e26db1@mail.gmail.com> <20070623183645.GA10808@andrew-kuchlings-computer.local> Message-ID: <467D75DF.6070509@v.loewis.de> > One fix is to always use subprocess.Popen and specify that > close_fd=True, which wasn't difficult for me, but I can imagine that > an easy way to set close-on-exec would be simpler in other cases. I think the complaint is not so much about simplicity, but correctness. close_fd also closes stdin/stdout/stderr, which might be undesirable and differs from POSIX. 
In any case, providing a uniform set-close-on-exec looks fine to me, provided it is implementable on all interesting platforms. I'm -0 on adding "n" to open, and -1 for adding if it means to reimplement fopen. Regards, Martin From matthieu.brucher at gmail.com Sat Jun 23 22:03:41 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sat, 23 Jun 2007 22:03:41 +0200 Subject: [Python-Dev] Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks In-Reply-To: <467D75DF.6070509@v.loewis.de> References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de> <001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de> <000801c7b592$8e13e670$6401a8c0@max> <7a9c25c20706230839y5b594ee6i917fbd0924e26db1@mail.gmail.com> <20070623183645.GA10808@andrew-kuchlings-computer.local> <467D75DF.6070509@v.loewis.de> Message-ID: Hi, I think the complaint is not so much about simplicity, but correctness. > close_fd also closes stdin/stdout/stderr, which might be undesirable > and differs from POSIX. > According to the docs, stdin/stdout and stderr are not closed ( http://docs.python.org/lib/node529.html) Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20070623/95531174/attachment.html From martin at v.loewis.de Sat Jun 23 23:12:32 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 23 Jun 2007 23:12:32 +0200 Subject: [Python-Dev] Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks In-Reply-To: References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de> <001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de> <000801c7b592$8e13e670$6401a8c0@max> <7a9c25c20706230839y5b594ee6i917fbd0924e26db1@mail.gmail.com> <20070623183645.GA10808@andrew-kuchlings-computer.local> <467D75DF.6070509@v.loewis.de> Message-ID: <467D8CC0.8030808@v.loewis.de> > I think the complaint is not so much about simplicity, but correctness. > close_fd also closes stdin/stdout/stderr, which might be undesirable > and differs from POSIX. > > > According to the docs, stdin/stdout and stderr are not closed ( > http://docs.python.org/lib/node529.html) I don't get your point: The docs says explicitly "Unix only". Regards, Martin From status at bugs.python.org Sun Jun 24 02:00:49 2007 From: status at bugs.python.org (Tracker) Date: Sun, 24 Jun 2007 00:00:49 +0000 (UTC) Subject: [Python-Dev] Summary of Tracker Issues Message-ID: <20070624000049.1CFEB781E0@psf.upfronthosting.co.za> ACTIVITY SUMMARY (06/17/07 - 06/24/07) Tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue number. Do NOT respond to this message. 1645 open ( +0) / 8584 closed ( +0) / 10229 total ( +0) Average duration of open issues: 836 days. Median duration of open issues: 784 days. Open Issues Breakdown open 1645 ( +0) pending 0 ( +0) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20070624/0b2e5907/attachment.htm From talin at acm.org Sun Jun 24 04:28:58 2007 From: talin at acm.org (Talin) Date: Sat, 23 Jun 2007 19:28:58 -0700 Subject: [Python-Dev] [Python-3000] Issues with PEP 3101 (string formatting) In-Reply-To: <20070620085701.GA31968@crater.logilab.fr> References: <20070620085701.GA31968@crater.logilab.fr> Message-ID: <467DD6EA.6010303@acm.org> I haven't responded to this thread because I was hoping some of the original proponents of the feature would come out to defend it. (Remember, 3101 is a synthesis of a lot of people's ideas gleaned from many forum postings - In some cases I am willing to defend particular aspects of the PEP, and in others I just write down what I think the general consensus is.) That being said - from what I've read so far, the evidence on both sides of the argument seems anecdotal to me. I'd rather wait and see what more people have to say on the topic. -- Talin Aur?lien Camp?as wrote: > On Tue, Jun 19, 2007 at 08:20:25AM -0700, Guido van Rossum wrote: >> Those are valid concerns. I'm cross-posting this to the python-3000 >> list in the hope that the PEP's author and defendents can respond. I'm >> sure we can work something out. > > Thanks to raise this. It is horrible enough that I feel obliged to > de-lurk. > > -10 on this part of PEP3101. > > >> Please keep further discussion on the python-3000 at python.org list. >> >> --Guido >> >> On 6/19/07, Chris McDonough wrote: >>> Wrt http://www.python.org/dev/peps/pep-3101/ >>> >>> PEP 3101 says Py3K should allow item and attribute access syntax >>> within string templating expressions but "to limit potential security >>> issues", access to underscore prefixed names within attribute/item >>> access expressions will be disallowed. > > People talking about potential security issues should have an > obligation to show how their proposals *really* improve security (in > general); this is of course, a hard thing to do; mere hand-waving is > not sufficient. > >>> I am a person who has lived with the aftermath of a framework >>> designed to prevent data access by restricting access to underscore- >>> prefixed names (Zope 2, ahem), and I've found it's very hard to >>> explain and justify. As a result, I feel that this is a poor default >>> policy choice for a framework. > > And it's even poorer in the context of a language (for it's probably > harder to escape language-level restrictions than framework > obscurities ...). > >>> In some cases, underscore names must become part of an object's >>> external interface. Consider a URL with one or more underscore- >>> prefixed path segment elements (because prefixing a filename with an >>> underscore is a perfectly reasonable thing to do on a filesystem, and >>> path elements are often named after file names) fed to a traversal >>> algorithm that attempts to resolve each path element into an object >>> by calling __getitem__ against the parent found by the last path >>> element's traversal result. Perhaps this is poor design and >>> __getitem__ should not be consulted here, but I doubt that highly >>> because there's nothing particularly special about calling a method >>> named __getitem__ as opposed to some method named "traverse". > > This is trying to make a technical argument, but the 'consenting > adults' policy might be enough. 
In my experience, zope forbiding > access to _ prefixed attributes just led to work around the > limitation, thus adding more useless indirection to an already crufty > code base. The result is more obfuscation and probably even less > security (as in auditability of the code). > >>> The only precedent within Python 2 for this sort of behavior is >>> limiting access to variables that begin with __ and which do not end >>> with __ to the scope defined by a class and its instances. I >>> personally don't believe this is a very useful feature, but it's >>> still only an advisory policy and you can worm around it with enough >>> gyrations. > > FWIW I've come to never use __attrs. The obfuscation feature seems to > bring nothing but pain (the few times I've fell into that trap as a > beginner python programmer). > >>> Given that security is a concern at all, the only truly reasonable >>> way to "limit security issues" is to disallow item and attribute >>> access completely within the string templating expression syntax. It >>> seems gratuituous to me to encourage string templating expressions >>> with item/attribute access, given that you could do it within the >>> format arguments just as easily in the 99% case, and we've (well... >>> I've) happily been living with that restriction for years now. >>> >>> But if this syntax is preserved, there really should be no *default* >>> restrictions on the traversable names within an expression because >>> this will almost certainly become a hard-to-explain, hard-to-justify >>> bug magnet as it has become in Zope. > > I'd add that Zope in general looks to me like a giant collection of > python anti-patterns and as such can be used as a clue source about > what not to do, especially what not to include in Py3k. > > I don't want to offense people, well no more than necessary (imho zope > *is* an offense to common sense in many ways), but that's the opinion > from someone who earns its living mostly from zope/plone products > dev. and maintenance (these days, anyway). > > Regards, > Aur?lien. > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: http://mail.python.org/mailman/options/python-3000/talin%40acm.org > From henning.vonbargen at arcor.de Sun Jun 24 11:05:54 2007 From: henning.vonbargen at arcor.de (Henning von Bargen) Date: Sun, 24 Jun 2007 11:05:54 +0200 Subject: [Python-Dev] Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de> <001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de> <000801c7b592$8e13e670$6401a8c0@max> <467D61CD.9000403@v.loewis.de> Message-ID: <000f01c7b63e$df834c60$6401a8c0@max> """ My very personal opinion: After a sleepness night, it seems to me that this is not a Python problem (or any other programming language at all). It looks more like an OS design problem (on MS Windows as well as on Linux etc). In an ideal world, when a program asks the OS to start a child process, it should have to explicitly list the handles that should be inherited. """ > Martin v. L?wis wrote: >> As you can see, at the C level, basically "fopen" is "open" with a >> little code around it to parse flags etc. It's the same kind of hackish >> code. > > "little code" is quite an understatement. 
In Microsoft's C library > (which we would have to emulate), the argument parsing of fopen is > 120 lines of code. In addition, that code changes across compiler > versions (where VS 2005 adds additional error checking). Hmm. Wow! > >> Anyway, I don't want to argue about the implementation of a patch. >> The fact is that until now the python programmer does not have an >> easy, platform-independent option to open files non-inheritable. >> As you mentioned yourself, the only way to work around it >> in a platform-independent manner, IS VERY HACKISH. >> So, shouldn't this hackish-ness better be hidden in the library >> instead of leaving it as an execise to the common programmer? > > Putting it into the library is fine. However, we need to find > an implementation strategy that meets the user's needs, and > is still maintainable. > > Python 3 will offer a clean solution, deviating entirely from > stdio. Let me point out that stdio is not the problem. The problem is handle inheritance. So I don't see how this is going to be solve in Python 3 just by not using stdio. Inheritance has to be taken into account regardless of how it is implemented on the C level. And to open a file non-inheritable should be possible in an easy and platform-independent way for the average python programmer. > For 2.x, we need to find a better solution than the one you proposed. Stephen, perhaps you can describe the workaround you used? Maybe it is better than mine. Or anyone else? > >> I think it is well worth the effort to keep this trouble away from >> the Python programmers if possible. > > I don't argue about efforts - I argue about your proposed solution. > >> Apart from shutil.copyfile, other examples of using open that can cause >> trouble are in socket.py (tell me any good reason why socket handles >> should be inherited to child processes) and even in logging.py. > > On Unix, it is *very* common to inherit socket handles to child > processes. The parent process opens the socket, and the child > processes perform accept(3). This allows many processes to > serve requests on the same port. In Python, > SocketServer.Forking*Server rely on this precise capability. Ahh, I see. Maybe this is why my HTTP Server sometimes seems to not react when a subprocess is running... If more than one process has a handle for the same socket, how does the OS know which process should react? > >>> Sure - and in turn, open is implemented on CreateFile. >>> However, I don't think I would like to see an fopen >>> implementation in Python. Python 3 will drop stdio entirely; >>> for 2.x, I'd be cautious to change things because that >>> may break other things in an unexpected manner. >> >> Yeah, if you think it should not be included in 2.x, >> then the handle inheritance problem should at least be considered >> in the PEPs [(3116, "New I/O"), (337, "Logging Usage in the Standard >> Modules")] > > I didn't say that a solution shouldn't be included in 2.x. > I said *your* solution shouldn't be. In 3.x, your solution > won't apply, sine Python won't be using stdio (so > fdopen becomes irrelevant). See above - please take it into account for Python 3 then. > >>>> Each handle has a security attribute that specifies whether the >>>> handle should be inherited or not - but this has to be specified >>>> when creating the handle (in the Windows CreateFile API internally). >>> >>> Not necessarily. You can turn on the flag later, through >>> SetHandleInformation. 
>> >> So do you think that a working "close_fds" could be implemented >> for Windows as well? > > No. close_fds should have the semantics of only closing the handles > for that subprocess. SetHandleInformation applies to the parent > process, and *all* subprocesses. So this is different from close_fds. Yes - that's why I doubt that could work. And according to http://support.microsoft.com/kb/190351/en-us in order to capture stdout and stderr of the child process, one has to specify bInheritHandle=TRUE in CreateProcess, with the net effect that you can only choose if either ALL handles (if not explicitly specified otherwise during handle creation) should be inherited or none of them. > >> Explicitly turning off the inheritance flag for all child handles except >> stdin, stdout and stderr in subprocess / popen (the equivalent to >> what close_fds does for Posix) - that's what I call hackish. > > I didn't propose that, and it wouldn't be the equivalent. In POSIX, > the closing occurs in the child process. This is not possible on > Windows, as there is no fork(). OK - I agree it is not possible. But "avoiding handle inheritance" is what one wants to achieve when specifying close_fds, I think. >> And I doubt that it is possible at all, for two reasons: >> - you have to KNOW all the handles. >> - due to the different process creation in Windows (there's no fork), >> you had to set the inheritance flags afterwards >> - all this is not thread-safe. > > All true, and I did not suggest to integrate SetHandleInformation > into subprocess. I *ONLY* claimed that you can change the flag > after the file was opened. > > With that API, it would be possible to provide cross-platform > access to the close-on-exec flag. Applications interested in setting > it could then set it right after opening the file. YES - that's exactly why I proposed an open_noinherit function. It is a simple solution for a common problem - such a function, documented in the library, would tell developers that they have to be aware of the problem and that a solution exists (though the implementation is more or less hackish due to platform-specific code). > >> Apart from the fact that this is not possible on MS Windows, it won't >> solve the problem! >> (Because then I couldn't use all those standard modules that use open >> *without* FD_CLOEXEC). >> >> The fact is that the combination ("multi-threading", "subprocess >> creation", "standard modules") >> simply *does not work* flawlessly and produces errors that are hard to >> understand. >> And probably most progammers are not even aware of the problem. >> That's the main reason why I posted here. > > I don't see how your proposed change solves that. If there was > an "n" flag, then the modules in the standard library that open > files still won't use it. That's why I said the standard library should be reviewed for unintentionally handle inheritance by the use of open. Note this is a security risk as well, see http://msdn.microsoft.com/msdnmag/issues/0300/security/ Regards, Henning From g.brandl at gmx.net Sun Jun 24 11:09:40 2007 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 24 Jun 2007 11:09:40 +0200 Subject: [Python-Dev] Issues with PEP 3101 (string formatting) In-Reply-To: References: Message-ID: Guido van Rossum schrieb: > Those are valid concerns. I'm cross-posting this to the python-3000 > list in the hope that the PEP's author and defendents can respond. I'm > sure we can work something out. Another question w.r.t. 
new string formatting: Assuming the %-operator for strings goes away as you said in the recent blog post, how are we going to convert string formatting (which I daresay is a very common operation in Python modules) in the 2to3 tool? Of course, "abc" % anything can be converted easily. name % tuple_or_dict can only be converted to name.format(tuple_or_dict), without correcting the format string. name % name can not be converted at all without type inference. Though probably the first type of application is the most frequent one, pre-building (or just loading from elsewhere) of format strings is not so uncommon when it comes to localization, where the format string likely has a _() wrapped around it. Of course, converting format strings manually is a PITA, mainly because it's so common. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From martin at v.loewis.de Sun Jun 24 20:19:40 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 24 Jun 2007 20:19:40 +0200 Subject: [Python-Dev] Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks In-Reply-To: <000f01c7b63e$df834c60$6401a8c0@max> References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de> <001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de> <000801c7b592$8e13e670$6401a8c0@max> <467D61CD.9000403@v.loewis.de> <000f01c7b63e$df834c60$6401a8c0@max> Message-ID: <467EB5BC.5060404@v.loewis.de> >> Putting it into the library is fine. However, we need to find >> an implementation strategy that meets the user's needs, and >> is still maintainable. >> >> Python 3 will offer a clean solution, deviating entirely from >> stdio. > > Let me point out that stdio is not the problem. > The problem is handle inheritance. > So I don't see how this is going to be solve in Python 3 just by > not using stdio. In Python 3, it would be possible to implement the "n" flag for open(), as we call CreateFile directly. > And to open a file non-inheritable should be possible in an easy > and platform-independent way for the average python programmer. I don't see why it is a requirement to *open* the file in non-inheritable mode. Why is not sufficient to *modify* an open file to have its handle non-inheritable in an easy and platform-independent way? > Maybe this is why my HTTP Server sometimes seems to not > react when a subprocess is running... > If more than one process has a handle for the same socket, > how does the OS know which process should react? The processes which don't perform accept(), recv(), or select() operations are not considered by the operating system. So if only one process does recv() (say), then this process will read the data. If multiple processes perform accept() (which is a common case), the system selects a process at random. This is desirable, as the system will then automatically split the load across processes, and the listen backlog cannot pile up: if multiple connection requests arrive at the same time, one process will do accept, and then start to process the connection. Then the second process will take the second request, and so on. If multiple processes perform recv(), the system will again chose randomly. This is mostly undesirable, and should be avoided. 
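To make that pattern concrete, here is a minimal pre-forking sketch (POSIX only; the port and worker count are arbitrary, and this is an illustration, not code taken from SocketServer):

import os, socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(('', 8000))
listener.listen(5)

for i in range(4):          # four workers, chosen arbitrarily
    if os.fork() == 0:      # child: inherits the listening socket
        while True:
            conn, addr = listener.accept()   # kernel hands each new
            conn.sendall('hello\r\n')        # connection to one child
            conn.close()

for i in range(4):          # parent just waits for the workers
    os.wait()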
>> With that API, it would be possible to provide cross-platform >> access to the close-on-exec flag. Applications interested in setting >> it could then set it right after opening the file. > > YES - that's exactly why I proposed an open_noinherit function. I think I missed that proposal. What would that function do? If you propose it to be similar to the open() function, I'd be skeptical. It's not possible to implement that in thread-safe way if you use SetHandleInformation/ioctl. Regards, Martin From foom at fuhm.net Sun Jun 24 21:47:03 2007 From: foom at fuhm.net (James Y Knight) Date: Sun, 24 Jun 2007 15:47:03 -0400 Subject: [Python-Dev] Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks In-Reply-To: <467EB5BC.5060404@v.loewis.de> References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de> <001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de> <000801c7b592$8e13e670$6401a8c0@max> <467D61CD.9000403@v.loewis.de> <000f01c7b63e$df834c60$6401a8c0@max> <467EB5BC.5060404@v.loewis.de> Message-ID: <2CDFA7A4-0A3D-4D04-AB6C-D8A3B1C51CB8@fuhm.net> On Jun 24, 2007, at 2:19 PM, Martin v. L?wis wrote: > I don't see why it is a requirement to *open* the file in > non-inheritable mode. Why is not sufficient to *modify* > an open file to have its handle non-inheritable in > an easy and platform-independent way? Threads. Consider that you may fork a process on one thread right between the calls to open() and fcntl(F_SETFD, FD_CLOEXEC) on another thread. The only way to be safe is to open the file non-inheritable to start with. Now, it is currently impossible under linux to open a file descriptor noninheritable, but they're considering adding that feature (I don't have the thread-refs on me, but it's actually from the last month). The issue is that there's a *bunch* of syscalls that open FDs: this feature would need to be added to all of them, not only "open". It's possible that it makes sense for python to provide "as good as possible" an implementation. At the least, putting the fcntl call in the same C function as open would fix programs that don't open files/ spawn processes outside of the GIL protection. But, like the kernel, this feature then ought to be provided for all APIs that create file descriptors. >>> With that API, it would be possible to provide cross-platform >>> access to the close-on-exec flag. Applications interested in setting >>> it could then set it right after opening the file. >> >> YES - that's exactly why I proposed an open_noinherit function. > > I think I missed that proposal. What would that function do? > > If you propose it to be similar to the open() function, I'd > be skeptical. It's not possible to implement that in thread-safe > way if you use SetHandleInformation/ioctl. Now I'm confused: are you talking about the same thread-safety situation as I described above? If so, why did you ask why it's not sufficient to modify a handle to be non-inheritable? 
James From martin at v.loewis.de Sun Jun 24 22:48:30 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 24 Jun 2007 22:48:30 +0200 Subject: [Python-Dev] Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks In-Reply-To: <2CDFA7A4-0A3D-4D04-AB6C-D8A3B1C51CB8@fuhm.net> References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de> <001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de> <000801c7b592$8e13e670$6401a8c0@max> <467D61CD.9000403@v.loewis.de> <000f01c7b63e$df834c60$6401a8c0@max> <467EB5BC.5060404@v.loewis.de> <2CDFA7A4-0A3D-4D04-AB6C-D8A3B1C51CB8@fuhm.net> Message-ID: <467ED89E.7050405@v.loewis.de> >> I don't see why it is a requirement to *open* the file in >> non-inheritable mode. Why is not sufficient to *modify* >> an open file to have its handle non-inheritable in >> an easy and platform-independent way? > > Threads. Consider that you may fork a process on one thread right > between the calls to open() and fcntl(F_SETFD, FD_CLOEXEC) on another > thread. The only way to be safe is to open the file non-inheritable to > start with. No, that's not the only safe way. The application can synchronize the threads, and prevent starting subprocesses during critical regions. Just define a subprocess_lock, and make all of your threads follow the protocol of locking that lock when either opening a new file, or creating a subprocess. > Now, it is currently impossible under linux to open a file descriptor > noninheritable, but they're considering adding that feature (I don't > have the thread-refs on me, but it's actually from the last month). The > issue is that there's a *bunch* of syscalls that open FDs: this feature > would need to be added to all of them, not only "open". Right. That is what makes it difficult inherently on the API level. > It's possible that it makes sense for python to provide "as good as > possible" an implementation. At the least, putting the fcntl call in the > same C function as open would fix programs that don't open files/spawn > processes outside of the GIL protection. No, that would not work. Python releases the GIL when opening a file (and needs to do so because that may be a long-running operation). > Now I'm confused: are you talking about the same thread-safety situation > as I described above? Yes. > If so, why did you ask why it's not sufficient to > modify a handle to be non-inheritable? Because I wanted to hear what the reasons are to consider that insufficient. I would have expected "ease of use" and such things (perhaps Henning will still bring up other reasons). If thread-safety is a primary concern, then that flag should *not* be added to open(), since it cannot be implemented in a thread-safe manner in a generic way - only the application can perform the proper synchronization. As discussed, there are other handles subject to inheritance, too, and the application would have to use the modify-handle function, anyway, which means it needs to make it thread-safe through explicit locking. 
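A sketch of that protocol (the names are illustrative; the step that clears the inheritance flag inside the critical section would be fcntl/FD_CLOEXEC on POSIX or SetHandleInformation on Windows):

import threading, subprocess

subprocess_lock = threading.Lock()      # one process-wide lock

def open_noinherit_locked(name, mode='r'):
    # Hold the lock while opening the file and clearing its inheritance
    # flag, so no other thread can spawn a child between the two steps.
    subprocess_lock.acquire()
    try:
        f = open(name, mode)
        # ... clear the inheritance flag on f.fileno() here ...
        return f
    finally:
        subprocess_lock.release()

def popen_locked(args, **kwds):
    # Every thread that creates a child process takes the same lock.
    subprocess_lock.acquire()
    try:
        return subprocess.Popen(args, **kwds)
    finally:
        subprocess_lock.release()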
Regards, Martin From rcohen at snurgle.org Sun Jun 24 23:13:03 2007 From: rcohen at snurgle.org (Ross Cohen) Date: Sun, 24 Jun 2007 17:13:03 -0400 Subject: [Python-Dev] Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks In-Reply-To: <467ED89E.7050405@v.loewis.de> References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de> <001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de> <000801c7b592$8e13e670$6401a8c0@max> <467D61CD.9000403@v.loewis.de> <000f01c7b63e$df834c60$6401a8c0@max> <467EB5BC.5060404@v.loewis.de> <2CDFA7A4-0A3D-4D04-AB6C-D8A3B1C51CB8@fuhm.net> <467ED89E.7050405@v.loewis.de> Message-ID: <20070624211303.GG22573@snurgle.org> On Sun, Jun 24, 2007 at 10:48:30PM +0200, "Martin v. L?wis" wrote: > >> I don't see why it is a requirement to *open* the file in > >> non-inheritable mode. Why is not sufficient to *modify* > >> an open file to have its handle non-inheritable in > >> an easy and platform-independent way? > > > > Threads. Consider that you may fork a process on one thread right > > between the calls to open() and fcntl(F_SETFD, FD_CLOEXEC) on another > > thread. The only way to be safe is to open the file non-inheritable to > > start with. > > No, that's not the only safe way. The application can synchronize the > threads, and prevent starting subprocesses during critical regions. > Just define a subprocess_lock, and make all of your threads follow > the protocol of locking that lock when either opening a new file, > or creating a subprocess. The problem here is that sitting in accept() becomes a critical section. While a thread is sitting in that call, no other thread could start a subprocess. A multithreaded server which uses a 1-thread-per-request model wouldn't be possible, at least not in a reasonable amount of comprehensible code. > > Now, it is currently impossible under linux to open a file descriptor > > noninheritable, but they're considering adding that feature (I don't > > have the thread-refs on me, but it's actually from the last month). The > > issue is that there's a *bunch* of syscalls that open FDs: this feature > > would need to be added to all of them, not only "open". > > Right. That is what makes it difficult inherently on the API level. LWN has had good coverage of the discussion: http://lwn.net/Articles/237722/ Ross From henning.vonbargen at arcor.de Mon Jun 25 22:51:28 2007 From: henning.vonbargen at arcor.de (Henning von Bargen) Date: Mon, 25 Jun 2007 22:51:28 +0200 Subject: [Python-Dev] Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks Message-ID: <003601c7b76a$9a2a0550$6401a8c0@max> Hi, # I'm not sure about netiquette here: # I decided to continue posting to the python-list without CCing to everyone. First of all, here's the prototype. It's a prototype and I know it's far from perfect, but it works for me (in production code) - however, I did not yet test it on Non-Windows. --------------------------------- #!/bin/env python # -*- coding: iso-8859-1 -*- """ File ftools.py: Useful tools for working with files. """ import os import os.path import time import shutil rmtree = shutil.rmtree move = shutil.move builtin_open = open if os.name != "nt": import fcntl def open(fname, mode="r", bufsize=None): """ Like the "open" built-in, but does not inherit to child processes. The code is using os.open and os.fdopen. On Windows, to avoid inheritance, os.O_NOINHERIT is used directly in the open call, thus it should be thread-safe. 
On other operating systems, fcntl with FD_CLOEXEC is used right after opening the file; however in a mutli-threaded program it may happen that another thread starts a child process in the fraction of a second between os.open and fcntl. Note: The bufsize argument is ignored (not yet implemented). """ flags = 0 if "r" in mode: flags += os.O_RDONLY elif "w" in mode: flags += os.O_RDWR + os.O_CREAT + os.O_TRUNC elif "a" in mode: flags += os.O_RDWR + os.O_CREAT + os.O_APPEND else: raise NotImplementedError ("mode=" + mode) if os.name == "nt": if "b" in mode: flags += os.O_BINARY else: flags += os.O_TEXT flags += os.O_NOINHERIT try: fd = os.open (fname, flags) if os.name != "nt": old = fcntl.fcntl(fd, fcntl.F_GETFD) fcntl.fcntl(fd, fcntl.F_SETFD, old | fcntl.FD_CLOEXEC) return os.fdopen (fd, mode) except OSError, x: raise IOError(x.errno, x.strerror, x.filename) def copyfile(src, dst): """ Copies a file - like shutil.copyfile, but the files are opened non-inheritable. Note: This prototype does not test _samefile like shutil.copyfile. """ fsrc = None fdst = None try: fsrc = open(src, "rb") fdst = open(dst, "wb") shutil.copyfileobj(fsrc, fdst) finally: if fdst: fdst.close() if fsrc: fsrc.close() ------------------------------------------------ """ blah blah: I googled around a bit, and it more and more seems to me that the Posix system has a serious design flaw, since it seems to be SO hard to write multi-threaded programs that also start child-processes. It's funny that right now Linus Torvalds himself seems to be aware of this problem and that the Linux kernel developers are discussing ways to solve it. Let's hope they find a way to get around it on the OS level. To me, the design of MS Windows looks better in the aspect of process-creation, handle inheritance and multi-threading... Anyway, it has its drawbacks, too. For example, I still cannot specify in a thread-safe way that a handle should be inherited to one child process but not to another - I would have to use a lock to synchronize it, which has its own problems, as Ross Cohen noted. The best solution at the OS level would be to explitly specify the handles/fds I want to be inherited in a "create child process" system call. BTW the Linux kernel developers face the same situation as we do: They could somehow implement a new system function like "open_noinherit", but there's a whole bunch of existing "standard code" that uses open and similar functions like socket(), accept() etc., and they don't want to change all these calls. So perhaps, for Python development, we just have to accept that the problem persists and that at this time a 100% solution just does not exist - and we should watch the discussion on http://lwn.net/Articles/237722/ to see how they solve it for Linux. """ That being said, I still think it's necessary for Python to provide a solution as good as possible. For example, in my production application, by consequently using the ftools module above, I could reduce the error rate dramatically: * With built-in open and shutil.copyfile: Several "Permission denied" and other errors a day * With ftools.open and ftools.copyfile: program running for a week or more without errors. There are still errors sometimes, and I suspect it has to do with the unintenional inheritance of socket handles (I did not dig into SocketServer.py, socket.py and socket.c to solve it). (However, the errors are so rare now that our clients think it's just errors in their network :-). 
Martin, you mentioned that for sockets, inheritance is not a problem unless accept(), recv() or select() is called in the child process (as far as I understood it). Though I am not an expert in socket programming at the C level, I doubt that you are right here. Apart from by own experiences, I've found some evidence in the WWW (searching for "child process socket inherit respond", and for "socket error 10054 process"). * http://mail.python.org/pipermail/python-list/2003-November/236043.html "socket's strange behavior with subprocesses" Funny: Even notepad.exe is used there as an example child process... * http://mail.python.org/pipermail/python-bugs-list/2006-April/032974.html python-Bugs-1469163 SimpleXMLRPCServer doesn't work anymore on Windows (see also Bug 1222790). * http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4202949 Java has switched to non-inheritable sockets as well. * http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5069989 "(process) Runtime.exec unnecessarily inherits open files (win)" * http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4197666" "Socket descriptors leak into processes spawned by Java programs on windows" ... Any Windows Guru around who can explain what's going on with socket handles and CreateProcess? I mean - is the explanation Martin gave for accept(), recv(), select() correct for Windows, too? And if so - how can the errors be explained that are mentioned in the URLs above? Martin v. L?wis wrote: > In Python 3, it would be possible to implement the "n" flag > for open(), as we call CreateFile directly. BTW, if it will be an additional flag to "open", let me correct myself: "n" is not a good character. Now I prefer "p" like personal, private, protected, process. > > And to open a file non-inheritable should be possible in an easy > > and platform-independent way for the average python programmer. > I don't see why it is a requirement to *open* the file in > non-inheritable mode. Why is not sufficient to *modify* > an open file to have its handle non-inheritable in > an easy and platform-independent way? Because it wouldn't be thread-safe, unless a lock is used for synchronizing subprocess and open calls, which would cause other issues. Are you still reading? Here's a pragmatic proposal: - Add the functions in ftools.py (in a more complete version) to the standard library. Perhaps even add them to the subprocess.py module? - Add a note about handle inheritance to the documentation for the subprocess module, saying that for the parent process, one should avoid using open and prefer ftools.open instead. - Add a global switch to the socket module to choose between new and old behaviour: - New behaviour: In the C level socket implementation, use os.O_NOINHERIT resp. fcntl FD_CLOEXEC Remember: In case you write a forking socket server in Python, you have to use the old behaviour (so, in the ForkingServerMixin, expliticly choose the old behaviour). - Change the logging file handler classes to use ftools.open, so that at least the logging module does not produce errors in a multi-threaded program with child processes. - For Python 3000, search the standard library for unintentional inheritance. What do you think? Henning Footnote: I bet that about 50% of all unexplainable, seemingly random, hard-to-reproduce errors in any given program (written in any programming language) that uses child processes are caused by unintentional handle inheritance. 
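For concreteness, the Windows half of the "toggle the flag after the handle was created" approach might look like the sketch below; the POSIX half is the fcntl/FD_CLOEXEC call already used in ftools.open above. set_noinherit is an illustrative name, ctypes (in the stdlib since 2.5) is assumed, and the thread-safety caveat raised earlier still applies: another thread may spawn a child between open() and this call.

import msvcrt, ctypes

HANDLE_FLAG_INHERIT = 0x0001

def set_noinherit(fd):
    # Clear the inherit flag on the Win32 handle underlying a C file
    # descriptor, so later CreateProcess children no longer receive it.
    handle = msvcrt.get_osfhandle(fd)
    ctypes.windll.kernel32.SetHandleInformation(handle,
                                                HANDLE_FLAG_INHERIT, 0)

# usage:  f = open(name, 'rb'); set_noinherit(f.fileno())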
From martin at v.loewis.de Mon Jun 25 23:53:19 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 25 Jun 2007 23:53:19 +0200 Subject: [Python-Dev] Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks In-Reply-To: <003601c7b76a$9a2a0550$6401a8c0@max> References: <003601c7b76a$9a2a0550$6401a8c0@max> Message-ID: <4680394F.6060001@v.loewis.de> > # I'm not sure about netiquette here: > # I decided to continue posting to the python-list without CCing to > everyone. [I assume you mean python-dev] Discussing this issue on the list is fine. Posting code is on the borderline, and will have no effect, i.e. no action will come out of (at least *I* will ignore the code entirely, unless it is an actual patch, and submitted to the bug tracker). > So perhaps, for Python development, we just have to accept > that the problem persists and that at this time a 100% solution > just does not exist - and we should watch the discussion > on http://lwn.net/Articles/237722/ to see how they solve it for > Linux. Exactly. My proposal is still to provide an API to toggle the flag after the handle was created. > Martin, you mentioned that for sockets, inheritance is not a problem > unless accept(), recv() or select() is called in the child process > (as far as I understood it). I did not say "no problems". I said "there is no ambiguity whereto direct the data if the child processes don't perform accept/recv". > * http://mail.python.org/pipermail/python-list/2003-November/236043.html > "socket's strange behavior with subprocesses" > Funny: Even notepad.exe is used there as an example child process... Sure: the system will not shutdown the connection as long as the handle is still open in the subprocess (as the subprocess *might* send more data - which it won't). I think the problem could be avoided by the parent process explicitly performing shutdown(2), but I'm uncertain as I have never actively used shutdown(). > * http://mail.python.org/pipermail/python-bugs-list/2006-April/032974.html > python-Bugs-1469163 SimpleXMLRPCServer doesn't work anymore on Windows > (see also Bug 1222790). I don't understand how this is relevant. This is about CLO_EXEC not being available on Windows, and has nothing to do with socket inheritance. > * http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4202949 > Java has switched to non-inheritable sockets as well. Not surprisingly - they don't support fork(). If they would, they could not have made that change. The bug report is the same issue: clients will be able to connect as long as the listen backlog fills. Then they will be turned down, as notepad will never perform accept. [I'm getting bored trying to explain the other cases as well] > Any Windows Guru around who can explain what's going on with socket > handles and CreateProcess? I mean - is the explanation Martin gave for > accept(), recv(), select() correct for Windows, too? And if so - how can > the errors be explained that are mentioned in the URLs above? See my explanation above. Martin From ckkart at hoc.net Tue Jun 26 04:47:21 2007 From: ckkart at hoc.net (Christian K) Date: Tue, 26 Jun 2007 11:47:21 +0900 Subject: [Python-Dev] csv changed from python 2.4 to 2.5 Message-ID: Hi, I could not find documentation of the following change in python2.5. What is the reason for that? Python 2.4.4 (#2, Apr 12 2007, 21:03:11) [GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import csv >>> d=csv.get_dialect('excel') >>> d.delimiter = ',' >>> ck at kiste:/media/hda6/home/ck/prog/peak-o-mat/trunk$ python2.5 Python 2.5.1 (r251:54863, May 2 2007, 16:56:35) [GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import csv >>> d=csv.get_dialect('excel') >>> d.delimiter = ',' Traceback (most recent call last): File "", line 1, in TypeError: readonly attribute >>> the following however works: Python 2.5.1 (r251:54863, May 2 2007, 16:56:35) [GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import csv >>> d = csv.excel >>> d.delimiter = ',' >>> Christian From brett at python.org Tue Jun 26 22:46:41 2007 From: brett at python.org (Brett Cannon) Date: Tue, 26 Jun 2007 13:46:41 -0700 Subject: [Python-Dev] handling the granularity of possible Py3K warnings in 2.6 Message-ID: My rewrite of import is written for 2.6. But I am going to try to bootstrap it into 3.0. But I want to bootstrap into 2.6 if it works out well in 3.0. That means coding in 2.6 but constantly forward-porting to 3.0. In other words I am going to be a guinea pig of our transition plan. And being the guinea pig means I want to know where we want to take the Py3K warnings in 2.6. As of right now the -3 option causes some DeprecationWarnings to be raised (mostly about callable and dict.has_key). But I was thinking that a certain level of granularity existed amongst what should be warned against. There is syntactic vs. semantic. For syntactic there is stuff you can do in versions of Python older than 2.6 (e.g., backtick removal), stuff you can only do in 2.6 (e.g., new exception syntax), or stuff that you can't do at all without a __future__ statement (e.g., 'print'). For semantics, there is removal (e.g., callable), and then there is new semantics (dict.items). What I was thinking was having these various types of changes be represented by proper warnings. That allows for using the -W option to control what warnings you care about. So I envision something like: + Py3KWarning + Py3KSyntaxWarning + some reasonable name for stuff that can be done in 2.6. + some name for stuff that can be done in older than 2.6. + something for stuff like the 'print' removal that require a __future__ statement. + Py3KSemanticWarning + Py3KDeprecationWarning + Py3KChangedSemanticsWarning (or whatever name you prefer) The key point is that when I am forward-porting I want to easily tell what syntax changes I can deal with now in older, e.g. 2.5 code, stuff that I can change directly in 2.6, and stuff that requires 2to3 or a __future__ statement. Similar idea for semantic changes. That way I can do this all in steps. Does this sound reasonable to people at all? -Brett From nick at craig-wood.com Wed Jun 27 12:50:41 2007 From: nick at craig-wood.com (Nick Craig-Wood) Date: Wed, 27 Jun 2007 11:50:41 +0100 Subject: [Python-Dev] csv changed from python 2.4 to 2.5 In-Reply-To: References: Message-ID: <20070627105041.869BA14C24B@irishsea.home.craig-wood.com> Christian K wrote: > I could not find documentation of the following change in python2.5. What is the > reason for that? > > Python 2.4.4 (#2, Apr 12 2007, 21:03:11) > [GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. 
> >>> import csv > >>> d=csv.get_dialect('excel') > >>> d.delimiter = ',' > >>> > > ck at kiste:/media/hda6/home/ck/prog/peak-o-mat/trunk$ python2.5 > Python 2.5.1 (r251:54863, May 2 2007, 16:56:35) > [GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import csv > >>> d=csv.get_dialect('excel') > >>> d.delimiter = ',' > Traceback (most recent call last): > File "", line 1, in > TypeError: readonly attribute > >>> Looks like this is the reason - the get_dialect call (which is implemented in C) is now returning a read only Dialect object rather than an instance of the original class :- 2.5 >>> import csv >>> d = csv.get_dialect('excel') >>> d.__class__ >>> 2.4 >>> import csv >>> d = csv.get_dialect('excel') >>> d.__class__ >>> > Python 2.5.1 (r251:54863, May 2 2007, 16:56:35) > [GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import csv > >>> d = csv.excel > >>> d.delimiter = ',' > >>> Don't you want to do this anyway? import csv class my_dialect(csv.excel): delimeter = "," -- Nick Craig-Wood -- http://www.craig-wood.com/nick From python-dev at xhaus.com Wed Jun 27 20:26:08 2007 From: python-dev at xhaus.com (Alan Kennedy) Date: Wed, 27 Jun 2007 19:26:08 +0100 Subject: [Python-Dev] Return error codes from getaddrinfo. Message-ID: <4682ABC0.4060407@xhaus.com> Dear all, I'm seeking enlightenment on the error codes returned by the socket.getaddrinfo() function. Consider the following on python 2.5 on Windows >>> import urllib >>> urllib.urlopen("http://nonexistent") [snip traceback] IOError: [Errno socket error] (11001, 'getaddrinfo failed') So the error number is 11001. But when I try to find a symbolic constant in the errno module corresponding to this error number, I can't find one. >>> import errno >>> errno.errorcode[11] 'EAGAIN' >>> errno.errorcode[11001] Traceback (most recent call last): File "", line 1, in KeyError: 11001 Looking through the C source for the socket module doesn't provide any clarity (although my C is a little rusty). That module has a special function, set_gaierror(), for handling error returns from getaddrinfo. But I can't see if or how the resulting error codes relate to the errno module. Is there supposed to be symbolic constants in the errno module corresponding to getaddrinfo errors? I want jython to use the same errno symbolic constants as cpython, to ease portability of code. Regards, Alan. From alexandre at peadrop.com Wed Jun 27 22:15:46 2007 From: alexandre at peadrop.com (Alexandre Vassalotti) Date: Wed, 27 Jun 2007 16:15:46 -0400 Subject: [Python-Dev] What's going on with the check-in emails? Message-ID: Hi, It seems there is a problem with check-in emails -- i.e., none have been sent since r56057 (and the svn tree is at r56098 right now). Does someone has a hint what's going on? Thanks, -- Alexandre From skip at pobox.com Wed Jun 27 22:41:53 2007 From: skip at pobox.com (skip at pobox.com) Date: Wed, 27 Jun 2007 15:41:53 -0500 Subject: [Python-Dev] What's going on with the check-in emails? In-Reply-To: References: Message-ID: <18050.52113.127601.141848@montanaro.dyndns.org> Alexandre> It seems there is a problem with check-in emails -- i.e., Alexandre> none have been sent since r56057 (and the svn tree is at Alexandre> r56098 right now). Does someone has a hint what's going on? I'm not aware of a problem, though I noticed the slowdown in checkin emails recently. 
I forwarded your note to the python.org mailman/postfix gurus. Skip From skip at pobox.com Wed Jun 27 22:59:22 2007 From: skip at pobox.com (skip at pobox.com) Date: Wed, 27 Jun 2007 15:59:22 -0500 Subject: [Python-Dev] csv changed from python 2.4 to 2.5 In-Reply-To: References: Message-ID: <18050.53162.537852.407639@montanaro.dyndns.org> Christian> I could not find documentation of the following change in Christian> python2.5. What is the reason for that? Without looking through the change history for the module it's unclear to me why that would have changed. The thing that changed is that the get_dialect call now returns a _csv.Dialect object instead of an instance of the csv.excel class: % python2.4 Python 2.4.1 (#3, Jul 28 2005, 22:08:40) [GCC 3.3 20030304 (Apple Computer, Inc. build 1671)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import csv >>> d = csv.get_dialect("excel") >>> d % python Python 2.6a0 (trunk:54264M, Mar 10 2007, 15:19:48) [GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import csv >>> d = csv.get_dialect("excel") >>> d <_csv.Dialect object at 0x137fac0> Please submit a bug report on SourceForge. Thx, Skip From martin at v.loewis.de Wed Jun 27 23:47:49 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 27 Jun 2007 23:47:49 +0200 Subject: [Python-Dev] Return error codes from getaddrinfo. In-Reply-To: <4682ABC0.4060407@xhaus.com> References: <4682ABC0.4060407@xhaus.com> Message-ID: <4682DB05.9090108@v.loewis.de> > Is there supposed to be symbolic constants in the errno module > corresponding to getaddrinfo errors? No. On Windows, there is a separate set of error codes, winerror.h If you google for "winerror 11001", you find quickly that it is "host not found". > I want jython to use the same errno symbolic constants as cpython, to > ease portability of code. That will be very difficult to achieve, as Python is (deliberately) not even consistent across systems. Instead, it reports what the platform reports, so you should do the same in Java. Regards, Martin From brett at python.org Thu Jun 28 02:19:36 2007 From: brett at python.org (Brett Cannon) Date: Wed, 27 Jun 2007 17:19:36 -0700 Subject: [Python-Dev] What's going on with the check-in emails? In-Reply-To: References: Message-ID: On 6/27/07, Alexandre Vassalotti wrote: > Hi, > > It seems there is a problem with check-in emails -- i.e., none have > been sent since r56057 (and the svn tree is at r56098 right now). > Does someone has a hint what's going on? > I am having issues as well. I just did a slew of PEP checkins and I have not gotten a single email on them. -Brett From thomas at python.org Thu Jun 28 02:46:11 2007 From: thomas at python.org (Thomas Wouters) Date: Wed, 27 Jun 2007 17:46:11 -0700 Subject: [Python-Dev] What's going on with the check-in emails? In-Reply-To: References: Message-ID: <9e804ac0706271746t60d6ed90le06b854218c09b72@mail.gmail.com> The mail-checkins script broke because of the upgrade of the machine that hosts the subversion repository -- Python 2.3 went away, but two scripts were still using '#!/usr/bin/env python2.3'. They should be fixed now. On 6/27/07, Alexandre Vassalotti wrote: > > Hi, > > It seems there is a problem with check-in emails -- i.e., none have > been sent since r56057 (and the svn tree is at r56098 right now). > Does someone has a hint what's going on? 
> > Thanks, > -- Alexandre > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/thomas%40python.org > -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20070627/0936c738/attachment.htm From tav at espians.com Thu Jun 28 03:04:42 2007 From: tav at espians.com (tav) Date: Thu, 28 Jun 2007 02:04:42 +0100 Subject: [Python-Dev] object capability; func_closure; __subclasses__ Message-ID: <95d8c0810706271804v2f6a8996hc1173481267ea4d2@mail.gmail.com> rehi all, I have been looking at the object capability + Python discussions for a while now, and was wondering what the use cases for ``object.__subclasses__`` and ``FunctionType.func_closure`` were? I don't see __subclasses__ used anywhere in the standard library. And whilst I can see exposing func_closure as being useful in terms of "cloning/modifying" an existing function, isn't it possible to do that without making it introspectable? Years ago, Ka-Ping Yee pointed out: http://mail.python.org/pipermail/python-dev/2003-March/034284.html Derived from this we get: # capability.py functions def Namespace(*args, **kwargs): for arg in args: kwargs[arg.__name__] = arg def get(key): return kwargs.get(key) return get class Getter(object): def __init__(self, getter): self.getter = getter def __repr__(self): return self.getter('__repr__') or object.__repr__(self) def __getattr__(self, attr): return self.getter(attr) # io.py module def FileReader(name): file = open(name, 'r') def __repr__(): return '' % name def read(bufsize=-1): return file.read(bufsize) def close(): return file.close() return Getter(Namespace(__repr__, read, close)) ---- Now, a process A -- which has full access to all objects -- can do: >>> motd = FileReader('/etc/motd') And pass it to "process B" operating in a limited scope, which can then call: >>> motd.read() >>> motd.close() But not: >>> motd = type(motd)(motd.name, 'w') which would have been possible *had* motd been created as a ``file`` type by calling: ``open('/etc/motd', 'r')``. Now, there are probably a million holes in this approach, but as long as process B's __import__ is sanitised and it operates in a "limited" scope with regards to references to other functionality, this seems to be relatively secure. However, this is where __subclasses__ and func_closure get in the way. With object.__subclasses__ (as Brett points out), all defined classes/types are available -- including the ``file`` type we were trying to deny process B access to! Is it necessary to expose this attribute publically? And, likewise with func_closure, one can do motd.read.func_closure[0].cell_contents and get hold of the original ``file`` object. Is it absolutely necessary to expose func_closure in this way? Now, whilst probably wrong, I can see myself being able to create a minimal object capability system in pure python if those 2 "features" disappeared. Am I missing something obvious that prevents me from doing that? Can we get rid of them for Python 2.6? Or even 2.5.2? Is anyone besides PJE actually using them? ;p Thanks in advance for your thoughts. 
-- love, tav founder and ceo, esp metanational llp plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 From ckkart at hoc.net Thu Jun 28 05:36:48 2007 From: ckkart at hoc.net (Christian K) Date: Thu, 28 Jun 2007 12:36:48 +0900 Subject: [Python-Dev] csv changed from python 2.4 to 2.5 In-Reply-To: <18050.53162.537852.407639@montanaro.dyndns.org> References: <18050.53162.537852.407639@montanaro.dyndns.org> Message-ID: skip at pobox.com wrote: > Christian> I could not find documentation of the following change in > Christian> python2.5. What is the reason for that? > > Without looking through the change history for the module it's unclear to me > why that would have changed. The thing that changed is that the get_dialect > call now returns a _csv.Dialect object instead of an instance of the > csv.excel class: > > % python2.4 > Python 2.4.1 (#3, Jul 28 2005, 22:08:40) > [GCC 3.3 20030304 (Apple Computer, Inc. build 1671)] on darwin > Type "help", "copyright", "credits" or "license" for more information. > >>> import csv > >>> d = csv.get_dialect("excel") > >>> d > > > % python > Python 2.6a0 (trunk:54264M, Mar 10 2007, 15:19:48) > [GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin > Type "help", "copyright", "credits" or "license" for more information. > >>> import csv > >>> d = csv.get_dialect("excel") > >>> d > <_csv.Dialect object at 0x137fac0> > > Please submit a bug report on SourceForge. > Ok. Done. Christian From pje at telecommunity.com Thu Jun 28 06:41:35 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 28 Jun 2007 00:41:35 -0400 Subject: [Python-Dev] object capability; func_closure; __subclasses__ In-Reply-To: <95d8c0810706271804v2f6a8996hc1173481267ea4d2@mail.gmail.co m> References: <95d8c0810706271804v2f6a8996hc1173481267ea4d2@mail.gmail.com> Message-ID: <20070628044018.076DA3A40AA@sparrow.telecommunity.com> At 02:04 AM 6/28/2007 +0100, tav wrote: >rehi all, > >I have been looking at the object capability + Python discussions for >a while now, and was wondering what the use cases for >``object.__subclasses__`` and ``FunctionType.func_closure`` were? > >I don't see __subclasses__ used anywhere in the standard library. And >whilst I can see exposing func_closure as being useful in terms of >"cloning/modifying" an existing function, isn't it possible to do that >without making it introspectable? You know, I find it particularly interesting that, as far as I can tell, nobody who proposes making changes to the Python language to add security, ever seems to offer any comparison or contrast of their approaches to Zope's -- which doesn't require any changes to the language. :) >Now, whilst probably wrong, I can see myself being able to create a >minimal object capability system in pure python if those 2 "features" >disappeared. Am I missing something obvious that prevents me from >doing that? Well, you're missing a simpler approach to protecting functions, anyway. The '__call__' attribute of functions is still callable, but doesn't provide any access to func_closure, func_code, etc. I believe this trick also works for bound method objects. I suspect that you could also use ctypes to remove or alter the type.__subclasses__ member, though I suppose you might not consider that to be "pure" Python any more. However, if you use a definition of pure that allows for stdlib modules, then perhaps it works. :) >Can we get rid of them for Python 2.6? Or even 2.5.2? Is anyone >besides PJE actually using them? 
;p I wouldn't object (no pun intended) to moving the type.__subclasses__ method to say, the 'new' or 'gc' modules, since you wouldn't want to make those available to restricted code, but then they'd still be available for folks who need/want them. 'gc' has similar capabilities (again no pun intended) anyway. However, ISTM that this should be a 3.0 change rather than a 2.x one, even so. With regard to the func_closure thing, I'd actually like to make it *writable* as well as readable, and I don't mean just to change the contents of the cells. But, since you can use .__call__ to make a capability without access to func_closure, it doesn't seem like you really need to remove func_closure. From python-dev at xhaus.com Thu Jun 28 09:59:37 2007 From: python-dev at xhaus.com (Alan Kennedy) Date: Thu, 28 Jun 2007 08:59:37 +0100 Subject: [Python-Dev] Return error codes from getaddrinfo. In-Reply-To: <4682DB05.9090108@v.loewis.de> References: <4682ABC0.4060407@xhaus.com> <4682DB05.9090108@v.loewis.de> Message-ID: <46836A69.6040701@xhaus.com> [Alan] >>I want jython to use the same errno symbolic constants as cpython, to >>ease portability of code. [Martin] > That will be very difficult to achieve, as Python is (deliberately) > not even consistent across systems. Instead, it reports what the > platform reports, so you should do the same in Java. I think I need to explain myself more clearly; I'm looking for the errno.SYMBOLIC_CONSTANT for errors returned by the getaddrinfo function. Take the following lines from the cpython 2.5 httplib. Line 998 - 1014 # -=-=-=-=-=-= while True: try: buf = self._ssl.read(self._bufsize) except socket.sslerror, err: if (err[0] == socket.SSL_ERROR_WANT_READ or err[0] == socket.SSL_ERROR_WANT_WRITE): continue if (err[0] == socket.SSL_ERROR_ZERO_RETURN or err[0] == socket.SSL_ERROR_EOF): break raise except socket.error, err: if err[0] == errno.EINTR: continue if err[0] == errno.EBADF: # XXX socket was closed? break raise # -=-=-=-=-=-=-= How can that code work on jython, other than if A: The jython errno module contains definitions for EINTR and EBADF B: The socket module raises the exceptions with the correct errno.SYMBOLIC_CONSTANTS, in the same circumstances as the cpython module. (The actual integers don't matter, but thanks anyway to the three separate people who informed me that googling "11001" was the solution to my problem). And then there are the non-portable uses of error numbers, like this snippet from the 2.5 httplib: Lines 706-711 #-=-=-=-=-=-= try: self.sock.sendall(str) except socket.error, v: if v[0] == 32: # Broken pipe self.close() raise #-=-=-=-=-=-= Do these examples make it clearer why and in what way I want the jython errno symbolic constants to be the same as cpython? Thanks, Alan. From mithun_rn at yahoo.co.in Thu Jun 28 10:41:06 2007 From: mithun_rn at yahoo.co.in (Mithun R N) Date: Thu, 28 Jun 2007 09:41:06 +0100 (BST) Subject: [Python-Dev] Decoding libpython frame information on the stack Message-ID: <242398.97019.qm@web8510.mail.in.yahoo.com> Hi All, Am a new subscriber to this list. Am facing an issue in deciphering core-files of applications with mixed C and libpython frames in it. I was thinking of knowing any work that has been done with respect to getting into the actual python line (file-name.py:) from the libpython frames on the stack while debugging such core-files. If anybody knows some information on this, please let me know. I could not get any link on the web that talks about this feature. Looking forward for your reply. 
Thanks and regards, Mithun Bollywood, fun, friendship, sports and more... you name it, we have it at http://in.groups.yahoo.com From tav at espians.com Thu Jun 28 14:09:01 2007 From: tav at espians.com (tav) Date: Thu, 28 Jun 2007 13:09:01 +0100 Subject: [Python-Dev] object capability; func_closure; __subclasses__ In-Reply-To: <20070628044018.076DA3A40AA@sparrow.telecommunity.com> References: <95d8c0810706271804v2f6a8996hc1173481267ea4d2@mail.gmail.com> <20070628044018.076DA3A40AA@sparrow.telecommunity.com> Message-ID: <95d8c0810706280509n43d1a22cm443f4cbc564e344f@mail.gmail.com> > You know, I find it particularly interesting that, as far as I can > tell, nobody who proposes making changes to the Python language to > add security, ever seems to offer any comparison or contrast of their > approaches to Zope's -- which doesn't require any changes to the language. :) Whilst it is definitely possible to build up a object capability system with a pruned down version of Zope 3's proxy + checker mechanism, it ends up in a system which is quite performance intensive. All those proxies being wrapped/unwrapped/checked... In contrast, the mechanism I am looking at here simply requires *moving* just 2 attributes *out* of the core builtins... Quite cheap, simple and effective, no? > Well, you're missing a simpler approach to protecting functions, > anyway. The '__call__' attribute of functions is still callable, but > doesn't provide any access to func_closure, func_code, etc. I > believe this trick also works for bound method objects. Whilst that would have been a nice trick, what about __call__.__self__ ? Or, setting __call__.__doc__ ? > I suspect that you could also use ctypes to remove or alter the > type.__subclasses__ member, though I suppose you might not consider > that to be "pure" Python any more. However, if you use a definition > of pure that allows for stdlib modules, then perhaps it works. :) Ah, thanks! Will look into that. > I wouldn't object (no pun intended) to moving the type.__subclasses__ > method to say, the 'new' or 'gc' modules, since you wouldn't want to > make those available to restricted code, but then they'd still be > available for folks who need/want them. 'gc' has similar > capabilities (again no pun intended) anyway. Ah, that's a great idea! > However, ISTM that this should be a 3.0 change rather than a 2.x one, > even so. With regard to the func_closure thing, I'd actually like to > make it *writable* as well as readable, and I don't mean just to > change the contents of the cells. But, since you can use .__call__ > to make a capability without access to func_closure, it doesn't seem > like you really need to remove func_closure. I don't object to making func_closure writable either. In fact, as someone who has been following your work on generic functions from way before RuleDispatch, I really want to see PEP 3124 in 3.0 But, all I am asking for is to not expose func_closure (and perhaps some of the other func_*) as members of FunctionType -- isn't it possible to add functionality to the ``new`` module which would allow one to read/write func_closure? 
-- love, tav founder and ceo, esp metanational llp plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 From varmaa at gmail.com Thu Jun 28 15:54:52 2007 From: varmaa at gmail.com (Atul Varma) Date: Thu, 28 Jun 2007 08:54:52 -0500 Subject: [Python-Dev] Decoding libpython frame information on the stack In-Reply-To: <242398.97019.qm@web8510.mail.in.yahoo.com> References: <242398.97019.qm@web8510.mail.in.yahoo.com> Message-ID: <361b27370706280654y199a71e8l6a77a74b07cf0eef@mail.gmail.com> Hi Mithun, Because python-dev is a mailing list for the development *of* Python rather than development *with* Python, I believe you may not have posted to the best list. Further information about this distinction, and some discussion about potentially setting up a special-interest list exclusively for Python/C interactions, can be found in this recent thread: http://mail.python.org/pipermail/python-dev/2007-June/073680.html Regarding your question, I'll try to answer it as best I can: on our Windows application, we use Microsoft minidumps [1] instead of core dumps. At the time that a crash occurs and a minidump is written, we have some code that digs into the Python interpreter state to get a text traceback for every Python thread currently in execution at the time of the crash, which is appended to the log file that is sent with the minidump in the automated bug report. Doing this is a bit risky because it assumes that the relevant parts of the Python interpreter state aren't corrupt at the time of the crash, but precautions can be made to deal with this edge case. So while I can't help you get a bead on debugging core files, you may want to consider a similar solution on the Unix platform. - Atul [1] http://msdn2.microsoft.com/en-us/library/ms680369.aspx On 6/28/07, Mithun R N wrote: > Hi All, > > Am a new subscriber to this list. > Am facing an issue in deciphering core-files of > applications with mixed C and libpython frames in it. > > I was thinking of knowing any work that has been done > with respect to getting into the actual python line > (file-name.py:) from the libpython frames > on the stack while debugging such core-files. If > anybody knows some information on this, please let me > know. I could not get any link on the web that talks > about this feature. > > Looking forward for your reply. > Thanks and regards, > Mithun > > > > Bollywood, fun, friendship, sports and more... you name it, we have it at http://in.groups.yahoo.com > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/varmaa%40gmail.com > From dustin at v.igoro.us Thu Jun 28 16:57:18 2007 From: dustin at v.igoro.us (Dustin J. Mitchell) Date: Thu, 28 Jun 2007 09:57:18 -0500 Subject: [Python-Dev] Decoding libpython frame information on the stack In-Reply-To: <242398.97019.qm@web8510.mail.in.yahoo.com> References: <242398.97019.qm@web8510.mail.in.yahoo.com> Message-ID: <20070628145718.GA20378@v.igoro.us> On Thu, Jun 28, 2007 at 09:41:06AM +0100, Mithun R N wrote: > Am a new subscriber to this list. > Am facing an issue in deciphering core-files of > applications with mixed C and libpython frames in it. > > I was thinking of knowing any work that has been done > with respect to getting into the actual python line > (file-name.py:) from the libpython frames > on the stack while debugging such core-files. 
If > anybody knows some information on this, please let me > know. I could not get any link on the web that talks > about this feature. Dave Beazley once worked on this subject: http://www.usenix.org/events/usenix01/full_papers/beazley/beazley_html/index.html Dustin From pje at telecommunity.com Thu Jun 28 17:03:32 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 28 Jun 2007 11:03:32 -0400 Subject: [Python-Dev] object capability; func_closure; __subclasses__ In-Reply-To: <95d8c0810706280509n43d1a22cm443f4cbc564e344f@mail.gmail.co m> References: <95d8c0810706271804v2f6a8996hc1173481267ea4d2@mail.gmail.com> <20070628044018.076DA3A40AA@sparrow.telecommunity.com> <95d8c0810706280509n43d1a22cm443f4cbc564e344f@mail.gmail.com> Message-ID: <20070628150129.BBADF3A40B1@sparrow.telecommunity.com> At 01:09 PM 6/28/2007 +0100, tav wrote: >>You know, I find it particularly interesting that, as far as I can >>tell, nobody who proposes making changes to the Python language to >>add security, ever seems to offer any comparison or contrast of their >>approaches to Zope's -- which doesn't require any changes to the >>language. :) > >Whilst it is definitely possible to build up a object capability >system with a pruned down version of Zope 3's proxy + checker >mechanism, it ends up in a system which is quite performance >intensive. All those proxies being wrapped/unwrapped/checked... > >In contrast, the mechanism I am looking at here simply requires >*moving* just 2 attributes *out* of the core builtins... > >Quite cheap, simple and effective, no? > >>Well, you're missing a simpler approach to protecting functions, >>anyway. The '__call__' attribute of functions is still callable, but >>doesn't provide any access to func_closure, func_code, etc. I >>believe this trick also works for bound method objects. > >Whilst that would have been a nice trick, what about __call__.__self__ ? Well, there's no __self__ in 2.3 or 2.4; I guess that was added in 2.5. Darn. >Or, setting __call__.__doc__ ? What does that do? >>I suspect that you could also use ctypes to remove or alter the >>type.__subclasses__ member, though I suppose you might not consider >>that to be "pure" Python any more. However, if you use a definition >>of pure that allows for stdlib modules, then perhaps it works. :) > >Ah, thanks! Will look into that. If it works, you could probably do the same thing to remove __call__.__self__. >I don't object to making func_closure writable either. In fact, as >someone who has been following your work on generic functions from way >before RuleDispatch, I really want to see PEP 3124 in 3.0 > >But, all I am asking for is to not expose func_closure (and perhaps >some of the other func_*) as members of FunctionType -- isn't it >possible to add functionality to the ``new`` module which would allow >one to read/write func_closure? In 3.0, I don't mind if the access method moves, I just want to keep the access. OTOH, I don't really care about __call__.__self__, since I got along fine without it in 2.3/2.4 and didn't know it had been added in 2.5. 
:) From tav at espians.com Thu Jun 28 17:14:05 2007 From: tav at espians.com (tav) Date: Thu, 28 Jun 2007 16:14:05 +0100 Subject: [Python-Dev] object capability; func_closure; __subclasses__ In-Reply-To: <20070628150129.BBADF3A40B1@sparrow.telecommunity.com> References: <95d8c0810706271804v2f6a8996hc1173481267ea4d2@mail.gmail.com> <20070628044018.076DA3A40AA@sparrow.telecommunity.com> <95d8c0810706280509n43d1a22cm443f4cbc564e344f@mail.gmail.com> <20070628150129.BBADF3A40B1@sparrow.telecommunity.com> Message-ID: <95d8c0810706280814q2a8a4e56u674d5e89d9d7aea@mail.gmail.com> > Well, there's no __self__ in 2.3 or 2.4; I guess that was added in 2.5. Darn. anyone know *why* it was added? > >Or, setting __call__.__doc__ ? > > What does that do? ah, i just wanted a way of providing documentation, and __call__'s __doc__ isn't writable... > If it works, you could probably do the same thing to remove __call__.__self__. will look into that too... > In 3.0, I don't mind if the access method moves, I just want to keep > the access. OTOH, I don't really care about __call__.__self__, since > I got along fine without it in 2.3/2.4 and didn't know it had been > added in 2.5. :) w00p! so, suggestions as to how one can go about getting those 2 access methods moved? -- thanks, tav founder and ceo, esp metanational llp plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 From pje at telecommunity.com Thu Jun 28 17:44:16 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 28 Jun 2007 11:44:16 -0400 Subject: [Python-Dev] object capability; func_closure; __subclasses__ In-Reply-To: <95d8c0810706280814q2a8a4e56u674d5e89d9d7aea@mail.gmail.com > References: <95d8c0810706271804v2f6a8996hc1173481267ea4d2@mail.gmail.com> <20070628044018.076DA3A40AA@sparrow.telecommunity.com> <95d8c0810706280509n43d1a22cm443f4cbc564e344f@mail.gmail.com> <20070628150129.BBADF3A40B1@sparrow.telecommunity.com> <95d8c0810706280814q2a8a4e56u674d5e89d9d7aea@mail.gmail.com> Message-ID: <20070628154208.DA6673A40B1@sparrow.telecommunity.com> At 04:14 PM 6/28/2007 +0100, tav wrote: > > Well, there's no __self__ in 2.3 or 2.4; I guess that was added > in 2.5. Darn. > >anyone know *why* it was added? > > > >Or, setting __call__.__doc__ ? > > > > What does that do? > >ah, i just wanted a way of providing documentation, and __call__'s >__doc__ isn't writable... > > > If it works, you could probably do the same thing to remove > __call__.__self__. > >will look into that too... > > > In 3.0, I don't mind if the access method moves, I just want to keep > > the access. OTOH, I don't really care about __call__.__self__, since > > I got along fine without it in 2.3/2.4 and didn't know it had been > > added in 2.5. :) > >w00p! > >so, suggestions as to how one can go about getting those 2 access >methods moved? Post a proposal on the Python-3000 list and supply patches to do the moves. From skip at pobox.com Thu Jun 28 18:00:14 2007 From: skip at pobox.com (skip at pobox.com) Date: Thu, 28 Jun 2007 11:00:14 -0500 Subject: [Python-Dev] Decoding libpython frame information on the stack In-Reply-To: <20070628145718.GA20378@v.igoro.us> References: <242398.97019.qm@web8510.mail.in.yahoo.com> <20070628145718.GA20378@v.igoro.us> Message-ID: <18051.56078.34614.249653@montanaro.dyndns.org> >> Am a new subscriber to this list. Am facing an issue in deciphering >> core-files of applications with mixed C and libpython frames in it. 
>> I was thinking of knowing any work that has been done with respect to getting into the actual python line (file-name.py:) from the libpython frames on the stack while debugging such core-files. If anybody knows some information on this, please let me know. I could not get any link on the web that talks about this feature.

Sorry, I missed this the first time round and just saw Dustin's reply. The Python distribution comes with a gdbinit file in the Misc directory. I use it frequently to display Python stack traces from within GDB. Here's the most recent copy online:

    http://svn.python.org/view/python/trunk/Misc/gdbinit?view=markup

The following commands are implemented:

    pystack  - display the full stack trace
    pystackv - as above, but also display local variables
    pyframe  - display just the current frame
    pyframev - as above, but also display local variables
    up, down - move up or down one C stack frame, but display Python
               frame if you move into PyEval_EvalFrame

This should all work within active sessions and sessions debugging core files (e.g., no active process).

It needs some rework. For instance, it assumes you're running within Emacs and puts out lines gud can use to display source lines. These look a little funky when debugging from a terminal window.

Skip

From pje at telecommunity.com  Thu Jun 28 19:26:28 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 28 Jun 2007 13:26:28 -0400
Subject: [Python-Dev] object capability; func_closure; __subclasses__
In-Reply-To: <95d8c0810706280923v6a72cb72w6d5474c41c748c29@mail.gmail.com>
References: <95d8c0810706271804v2f6a8996hc1173481267ea4d2@mail.gmail.com>
	<20070628044018.076DA3A40AA@sparrow.telecommunity.com>
	<95d8c0810706280509n43d1a22cm443f4cbc564e344f@mail.gmail.com>
	<20070628150129.BBADF3A40B1@sparrow.telecommunity.com>
	<95d8c0810706280814q2a8a4e56u674d5e89d9d7aea@mail.gmail.com>
	<20070628154208.DA6673A40B1@sparrow.telecommunity.com>
	<95d8c0810706280923v6a72cb72w6d5474c41c748c29@mail.gmail.com>
Message-ID: <20070628172433.382DC3A40AF@sparrow.telecommunity.com>

At 05:23 PM 6/28/2007 +0100, tav wrote:
>Any pointers on removing members via ctypes front?
>
>Whilst I can understand even the most obscure aspects of your python code fine, I'm not familiar with C/ctypes...

What you want is to get access to the type's real dictionary, not the proxy. Then you can just delete '__subclasses__' from the dictionary using Python code. Here's some code that does the trick:

    from ctypes import pythonapi, POINTER, py_object

    getdict = pythonapi._PyObject_GetDictPtr
    getdict.restype = POINTER(py_object)
    getdict.argtypes = [py_object]

    def dictionary_of(ob):
        dptr = getdict(ob)
        if dptr and dptr.contents:
            return dptr.contents.value

'dictionary_of' returns either a dictionary object, or None if the object has no dictionary. You can then simply delete any unwanted contents. However, you should *never use this* to assign __special__ methods, as Python will not change the type slots correctly. Heck, you should probably never use this, period. :) Usage example:

    print "before", type.__subclasses__
    del dictionary_of(type)['__subclasses__']
    print "after", type.__subclasses__

This will print something like:

    before <built-in method __subclasses__ of type object at 0x...>
    after
    Traceback (most recent call last):
      File "ctypes_dicto.py", line 14, in <module>
        print "after", type.__subclasses__
    AttributeError: type object 'type' has no attribute '__subclasses__'

et voila.
You should also be able to delete unwanted function type attributes like this::

    from types import FunctionType
    del dictionary_of(FunctionType)['func_closure']
    del dictionary_of(FunctionType)['func_code']

Of course, don't blame me if any of this code fries your computer and gives you a disease, doesn't work with new versions of Python, etc. etc. It works for me on Windows and Linux with Python 2.3, 2.4 and 2.5. It may also work with 3.0, but remember that func_* attributes have different names there.

From pje at telecommunity.com  Thu Jun 28 19:34:05 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 28 Jun 2007 13:34:05 -0400
Subject: [Python-Dev] object capability; func_closure; __subclasses__
In-Reply-To: <435DF58A933BA74397B42CDEB8145A860D502D6B@ex9.hostedexchange.local>
References: <95d8c0810706280509n43d1a22cm443f4cbc564e344f@mail.gmail.com>
	<435DF58A933BA74397B42CDEB8145A860D502D6B@ex9.hostedexchange.local>
Message-ID: <20070628173156.9D8773A40AF@sparrow.telecommunity.com>

At 10:20 AM 6/28/2007 -0700, Robert Brewer wrote:
>tav wrote:
> > But, all I am asking for is to not expose func_closure (and perhaps some of the other func_*) as members of FunctionType -- isn't it possible to add functionality to the ``new`` module which would allow one to read/write func_closure?
>
>Would func_closure then also be removed from the FunctionType constructor arg list?

That wouldn't be necessary, since restricted code is probably not going to be allowed access to new in the first place. We're talking about removing read access to the closure *attribute* only. (And write access to func_code would also have to be removed, else you could replace the bytecode in order to grant yourself read access to the closure contents...)

>If so, would I be expected to create a function object and then use the "new" module to supply its func_closure?

Nope. The idea here is that the new module would grow utility functions like get_closure, get_code, set_code, get_subclasses, etc. The 'inspect' module would then use these functions to do its job, and I would use them for generic function stuff.

From martin at v.loewis.de  Thu Jun 28 19:32:57 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 28 Jun 2007 19:32:57 +0200
Subject: [Python-Dev] Return error codes from getaddrinfo.
In-Reply-To: <46836A69.6040701@xhaus.com>
References: <4682ABC0.4060407@xhaus.com> <4682DB05.9090108@v.loewis.de> <46836A69.6040701@xhaus.com>
Message-ID: <4683F0C9.7040708@v.loewis.de>

> [Martin]
>> That will be very difficult to achieve, as Python is (deliberately) not even consistent across systems. Instead, it reports what the platform reports, so you should do the same in Java.
>
> Do these examples make it clearer why and in what way I want the jython errno symbolic constants to be the same as cpython?

I fully understood that, already in your original message. All I was saying is that this will be very difficult to achieve. It would be much easier if you don't take the code of the standard library and the application as given, but instead accept that people may have to change the error conditions somewhat when porting to Jython. Ideally, such porting would allow to still run the same code on CPython, and ideally, you would then provide patches for the Python library to make it run unmodified on Jython (rather than trying to arrange to make the *current* library run unmodified).
Regards, Martin From tav at espians.com Thu Jun 28 19:35:23 2007 From: tav at espians.com (tav) Date: Thu, 28 Jun 2007 18:35:23 +0100 Subject: [Python-Dev] object capability; func_closure; __subclasses__ In-Reply-To: <20070628172433.382DC3A40AF@sparrow.telecommunity.com> References: <95d8c0810706271804v2f6a8996hc1173481267ea4d2@mail.gmail.com> <20070628044018.076DA3A40AA@sparrow.telecommunity.com> <95d8c0810706280509n43d1a22cm443f4cbc564e344f@mail.gmail.com> <20070628150129.BBADF3A40B1@sparrow.telecommunity.com> <95d8c0810706280814q2a8a4e56u674d5e89d9d7aea@mail.gmail.com> <20070628154208.DA6673A40B1@sparrow.telecommunity.com> <95d8c0810706280923v6a72cb72w6d5474c41c748c29@mail.gmail.com> <20070628172433.382DC3A40AF@sparrow.telecommunity.com> Message-ID: <95d8c0810706281035o198ac9f9yba0000278b8b9ba7@mail.gmail.com> I love you PJE! Thank you! =) On 6/28/07, Phillip J. Eby wrote: > At 05:23 PM 6/28/2007 +0100, tav wrote: > >Any pointers on removing members via ctypes front? > > > >Whilst I can understand even the most obscure aspects of your python > >code fine, I'm not familiar with C/ctypes... > > What you want is to get access to the type's real dictionary, not the > proxy. Then you can just delete '__subclasses__' from the dictionary > using Python code. Here's some code that does the trick: > > from ctypes import pythonapi, POINTER, py_object > > getdict = pythonapi._PyObject_GetDictPtr > getdict.restype = POINTER(py_object) > getdict.argtypes = [py_object] > > def dictionary_of(ob): > dptr = getdict(ob) > if dptr and dptr.contents: > return dptr.contents.value > > 'dictionary_of' returns either a dictionary object, or None if the > object has no dictionary. You can then simply delete any unwanted > contents. However, you should *never use this* to assign __special__ > methods, as Python will not change the type slots correctly. Heck, > you should probably never use this, period. :) Usage example: > > print "before", type.__subclasses__ > del dictionary_of(type)['__subclasses__'] > print "after", type.__subclasses__ > > This will print something like: > > before > after > Traceback (most recent call last): > File "ctypes_dicto.py", line 14, in > print "after", type.__subclasses__ > AttributeError: type object 'type' has no attribute '__subclasses__' > > et voila. > > You should also be able to delete unwanted function type attributes like this:: > > from types import FunctionType > del dictionary_of(FunctionType)['func_closure'] > del dictionary_of(FunctionType)['func_code'] > > Of course, don't blame me if any of this code fries your computer and > gives you a disease, doesn't work with new versions of Python, etc. > etc. It works for me on Windows and Linux with Python 2.3, 2.4 and > 2.5. It may also work with 3.0, but remember that func_* attributes > have different names there. 
> > -- love, tav founder and ceo, esp metanational llp plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 From fumanchu at amor.org Thu Jun 28 19:20:31 2007 From: fumanchu at amor.org (Robert Brewer) Date: Thu, 28 Jun 2007 10:20:31 -0700 Subject: [Python-Dev] object capability; func_closure; __subclasses__ In-Reply-To: <95d8c0810706280509n43d1a22cm443f4cbc564e344f@mail.gmail.com> Message-ID: <435DF58A933BA74397B42CDEB8145A860D502D6B@ex9.hostedexchange.local> tav wrote: > But, all I am asking for is to not expose func_closure (and perhaps > some of the other func_*) as members of FunctionType -- isn't it > possible to add functionality to the ``new`` module which would allow > one to read/write func_closure? Would func_closure then also be removed from the FunctionType constructor arg list? If so, would I be expected to create a function object and then use the "new" module to supply its func_closure? Robert Brewer System Architect Amor Ministries fumanchu at amor.org From pje at telecommunity.com Thu Jun 28 19:44:03 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 28 Jun 2007 13:44:03 -0400 Subject: [Python-Dev] object capability; func_closure; __subclasses__ In-Reply-To: <20070628172433.382DC3A40AF@sparrow.telecommunity.com> References: <95d8c0810706271804v2f6a8996hc1173481267ea4d2@mail.gmail.com> <20070628044018.076DA3A40AA@sparrow.telecommunity.com> <95d8c0810706280509n43d1a22cm443f4cbc564e344f@mail.gmail.com> <20070628150129.BBADF3A40B1@sparrow.telecommunity.com> <95d8c0810706280814q2a8a4e56u674d5e89d9d7aea@mail.gmail.com> <20070628154208.DA6673A40B1@sparrow.telecommunity.com> <95d8c0810706280923v6a72cb72w6d5474c41c748c29@mail.gmail.com> <20070628172433.382DC3A40AF@sparrow.telecommunity.com> Message-ID: <20070628174155.617AE3A40AF@sparrow.telecommunity.com> At 01:26 PM 6/28/2007 -0400, Phillip J. Eby wrote: >You should also be able to delete unwanted function type attributes >like this:: > > from types import FunctionType > del dictionary_of(FunctionType)['func_closure'] > del dictionary_of(FunctionType)['func_code'] By the way, you probably want to also delete func_globals and func_defaults, as there are security ramifications to those attributes as well. Probably not so much for func_dict/__dict__ though. And of course, for Python<=2.4 you can just use the __call__ attribute and not bother with deleting anything but __subclasses__. From alexandre at peadrop.com Thu Jun 28 23:34:47 2007 From: alexandre at peadrop.com (Alexandre Vassalotti) Date: Thu, 28 Jun 2007 17:34:47 -0400 Subject: [Python-Dev] What's going on with the check-in emails? In-Reply-To: <9e804ac0706271746t60d6ed90le06b854218c09b72@mail.gmail.com> References: <9e804ac0706271746t60d6ed90le06b854218c09b72@mail.gmail.com> Message-ID: Thanks! The check-in emails are working again. -- Alexandre On 6/27/07, Thomas Wouters wrote: > > The mail-checkins script broke because of the upgrade of the machine that > hosts the subversion repository -- Python 2.3 went away, but two scripts > were still using '#!/usr/bin/env python2.3'. They should be fixed now. > From mithun_rn at yahoo.co.in Fri Jun 29 06:36:39 2007 From: mithun_rn at yahoo.co.in (Mithun R N) Date: Fri, 29 Jun 2007 05:36:39 +0100 (BST) Subject: [Python-Dev] Decoding libpython frame information on the stack In-Reply-To: <18051.56078.34614.249653@montanaro.dyndns.org> Message-ID: <473842.23106.qm@web8502.mail.in.yahoo.com> Hi All, Thanks much for your suggestions and help. 
Shall get back after reading through and trying some stuff mentioned in the emails.

Thanks and regards,
Mithun

--- skip at pobox.com wrote:

> >> Am a new subscriber to this list. Am facing an issue in deciphering core-files of applications with mixed C and libpython frames in it.
>
> >> I was thinking of knowing any work that has been done with respect to getting into the actual python line (file-name.py:) from the libpython frames on the stack while debugging such core-files. If anybody knows some information on this, please let me know. I could not get any link on the web that talks about this feature.
>
> Sorry, I missed this the first time round and just saw Dustin's reply. The Python distribution comes with a gdbinit file in the Misc directory. I use it frequently to display Python stack traces from within GDB. Here's the most recent copy online:
>
>     http://svn.python.org/view/python/trunk/Misc/gdbinit?view=markup
>
> The following commands are implemented:
>
>     pystack  - display the full stack trace
>     pystackv - as above, but also display local variables
>     pyframe  - display just the current frame
>     pyframev - as above, but also display local variables
>     up, down - move up or down one C stack frame, but display Python
>                frame if you move into PyEval_EvalFrame
>
> This should all work within active sessions and sessions debugging core files (e.g., no active process).
>
> It needs some rework. For instance, it assumes you're running within Emacs and puts out lines gud can use to display source lines. These look a little funky when debugging from a terminal window.
>
> Skip

From cbarton at metavr.com  Sat Jun 30 01:37:32 2007
From: cbarton at metavr.com (Campbell Barton)
Date: Sat, 30 Jun 2007 09:37:32 +1000
Subject: [Python-Dev] Py/C API sig is here! --- (Was "Calling Methods from Pythons C API with Keywords")
In-Reply-To: <1182847168.6077.156.camel@localhost>
References: <4677E66C.8000403@metavr.com>
	<1182324889.6077.111.camel@localhost>
	<4678FEB2.9050506@metavr.com>
	<1182339529.6077.120.camel@localhost>
	<467919BA.2090708@metavr.com>
	<46796348.2050902@v.loewis.de>
	<1182425636.6077.141.camel@localhost>
	<467AB472.6070509@v.loewis.de>
	<1182847168.6077.156.camel@localhost>
Message-ID: <468597BC.5080703@metavr.com>

Hrvoje Nikšić wrote:
> On Thu, 2007-06-21 at 19:25 +0200, "Martin v. Löwis" wrote:
>> In the past, we created special-interest groups for such discussion. Would you like to coordinate a C sig? See
>>
>> http://www.python.org/community/sigs/
>
> A SIG sounds like an excellent idea. If created, a newcomer with a C API question could then be redirected to the SIG's mailing list, where (hopefully, in time) there would be enough knowledgable people to answer his question.
>
> As for me coordinating the SIG, I'm not sure if that would be a good idea. For one, I don't know what a coordinator really does and how much time the job takes from one's daily activities. But more importantly, my interest in Python's C API is related to my current needs at work. If the situation at work changes, I will probably have much less time (if any) to devote to the C API discussions.

This mailing list is now running; if you're interested in asking/answering questions about the Py/C API, sign up here.
http://mail.python.org/mailman/listinfo/capi-sig

--
Campbell J Barton (ideasman42)

From henning.vonbargen at arcor.de  Sat Jun 30 14:03:35 2007
From: henning.vonbargen at arcor.de (Henning von Bargen)
Date: Sat, 30 Jun 2007 14:03:35 +0200
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks
References: <003601c7b76a$9a2a0550$6401a8c0@max> <4680394F.6060001@v.loewis.de>
Message-ID: <003a01c7bb0e$affc6b00$6401a8c0@max>

> Martin v. Löwis wrote:
> Exactly. My proposal is still to provide an API to toggle the flag after the handle was created.

OK, here is an API that I tested on Windows and for sockets only. Perhaps someone can test it on Non-Windows (Linux, for example?)

I think the best place for it would be as a new method "set_noinherit" for file and socket objects or as a new function in the os module (thus the implementation should probably be rewritten at the C level).

Note that for file objects, the code would need an additional call to win32file._get_osfhandle, because .fileno() returns the Windows handle only for sockets, but not for files. The code below uses Mark Hammond's win32all library.

    import os

    if os.name == "nt":

        import win32api, win32con

        def set_noinherit(socket, noinherit=True):
            """
            Mark a socket as non-inheritable to child processes.

            This should be called right after socket creation if you
            want to prevent the socket from being inherited to child
            processes.

            Notes:
            Unintentional socket or file inheritance is a security risk
            and can cause errors like "permission denied", "address
            already in use" etc. in programs that start subprocesses,
            particularly in multi-threaded programs. These errors tend
            to occur seemingly randomly and are hard to reproduce (race
            condition!) and even harder to debug.

            Thus it is good practice to call this function as soon as
            possible after opening a file or socket that you don't want
            to be inherited by subprocesses. Note that in a
            multi-threaded environment, it is still possible that
            another thread starts a subprocess after you created a
            file/socket, but before you call set_noinherit.

            Note that for sockets, the new socket returned from
            accept() will be inheritable even if the listener socket
            was not; so you should call set_noinherit for the new
            socket as well.

            Availability: Posix, Windows
            """
            # The inherit bit must be *cleared* to make the handle
            # non-inheritable, so only set it when inheritance is wanted.
            flags = 0
            if not noinherit:
                flags = flags | win32con.HANDLE_FLAG_INHERIT
            win32api.SetHandleInformation(socket.fileno(),
                                          win32con.HANDLE_FLAG_INHERIT,
                                          flags)

    else:

        import fcntl

        def set_noinherit(socket, noinherit=True):
            """
            ... documentation copied from the nt case ...
            """
            fd = socket.fileno()
            # FD_CLOEXEC set means the descriptor is *not* passed on
            # to exec'ed child processes.
            flags = fcntl.fcntl(fd, fcntl.F_GETFD) & ~fcntl.FD_CLOEXEC
            if noinherit:
                flags = flags | fcntl.FD_CLOEXEC
            fcntl.fcntl(fd, fcntl.F_SETFD, flags)

>
> >> Martin, you mentioned that for sockets, inheritance is not a problem unless accept(), recv() or select() is called in the child process (as far as I understood it).
>
> I did not say "no problems". I said "there is no ambiguity whereto direct the data if the child processes don't perform accept/recv".
>
> >> * http://mail.python.org/pipermail/python-list/2003-November/236043.html
> >> "socket's strange behavior with subprocesses"
> >> Funny: Even notepad.exe is used there as an example child process...
>
> Sure: the system will not shutdown the connection as long as the handle is still open in the subprocess (as the subprocess *might* send more data - which it won't).
>
> I think the problem could be avoided by the parent process explicitly performing shutdown(2), but I'm uncertain as I have never actively used shutdown().
>
> >> * http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4202949
> >> Java has switched to non-inheritable sockets as well.
>
> Not surprisingly - they don't support fork(). If they would, they could not have made that change. The bug report is the same issue: clients will be able to connect as long as the listen backlog fills. Then they will be turned down, as notepad will never perform accept.

I think I more or less understand what happens. The fact remains that a subprocess affects the behaviour of server and client programs in an unwanted (and for many people: unexpected) manner.

Regarding the influence of socket inheritance for server apps, I performed some tests on Windows. If a subprocess is started after the parent called accept(), then a client can still connect() and send() to the socket - even if the parent process has closed it and exited meanwhile. This works until the queue is full (whose size was specified in listen()). THEN the client will get (10061, 'Connection refused'), as you already explained. And client and server will have the socket in CLOSE_WAIT and FIN_WAIT2 status, respectively. However I doubt that this is actually a problem as long as the server continues accept()ing. But it means the client and server OS have to manage the socket until the subprocess exits - even though neither client nor server needs the socket anymore.

One might argue that it is not actually a big problem, since the subprocess will exit sooner or later. This is more or less true for Linux, etc. However, sometimes a subprocess might crash or hang. Now what happens if the server program is closed and then started again?

On Linux, no problem (more or less). When the server program is closed, the subprocess will be killed by the OS (I think), and the socket is released (perhaps with a few minutes delay).

On Windows the situation is worse.

Subprocess hanging: When the server program is closed, the subprocess will NOT be killed by the OS ("officially" there isn't a parent-child relationship for processes). It will continue hanging. When the server program is restarted, it will run into an "address already in use" error.

Subprocess crashing: Unfortunately, on a default Windows installation, this will be much the same as a hanging subprocess: Dr. Watson or whichever debugger comes to debug the crashed program. As long as nobody clicks on the messagebox popup, the crashed program will not be freed and all handles will be kept open. Of course, on a server computer, usually there is nobody watching the desktop... Note: It is possible to work around this by installing a different debugger (which usually includes hacking the registry).

These problems can be avoided by calling set_noinherit for the listener socket as well as for the new socket returned by accept() ASAP (see the sketch below).

> >> * http://mail.python.org/pipermail/python-bugs-list/2006-April/032974.html
> >> python-Bugs-1469163 SimpleXMLRPCServer doesn't work anymore on Windows
> >> (see also Bug 1222790).
>
> I don't understand how this is relevant. This is about CLO_EXEC not being available on Windows, and has nothing to do with socket inheritance.

The original bug was about problems due to unwanted handle inheritance with SimpleXMLRPCServer. The bug fix was to set CLO_EXEC. The fix didn't work for Windows of course. A correct bug fix would be to use the "set_noinherit" function above.
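For illustration, a minimal sketch of the pattern described above - marking both the listener and every accepted socket right after creation - assuming the set_noinherit definition from the earlier mail; the port number and the echo-style handling are made-up placeholders:

    import socket

    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    set_noinherit(listener)            # mark the listener right after creation
    listener.bind(("", 8000))
    listener.listen(5)

    while True:
        conn, addr = listener.accept()
        set_noinherit(conn)            # accepted sockets come back inheritable,
                                       # so mark each one as soon as possible
        data = conn.recv(1024)         # ... handle the request; subprocesses
        conn.sendall(data)             # started by other threads will not
        conn.close()                   # inherit the sockets marked above ...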
>
> [I'm getting bored trying to explain the other cases as well]

OK, YOU do understand the issue and know what's going on under the hood. I understand as well - at least now. It cost me several weeks of frustrating debugging, changing code to avoid the built-in "open", reading library source code, searching the internet and cursing... (and I'm not a beginner - nor are others who mentioned similar problems on this list).

Note that the solution REQUIRED avoiding and/or hacking the standard library. As mentioned in previous posts, in a multi-threaded program the correct solution for files is to use the ftools.open on Windows, and a correct solution is not possible for sockets and on Non-Windows due to possible race-conditions. Using set_noinherit will reduce the risk as best as possible.

For Python 2.6 I propose to add the set_noinherit method to file and socket objects.

For Python 3000 I propose that by default files and sockets should be created non-inheritable (though it will not work perfectly for multi-threaded programs on Non-Windows - see the doc in the code). A server that needs handles to be inherited can then still call set_noinherit(False). Typical uses would include SocketServer.ForkingMixIn and the 3 standard handles for subprocess/os popen.

If this seems reasonable and I can help in implementing this, please let me know. The change would prevent other developers from having the same frustrating experiences as I did.

Regards,
Henning

From martin at v.loewis.de  Sat Jun 30 19:24:41 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 30 Jun 2007 19:24:41 +0200
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks
In-Reply-To: <003a01c7bb0e$affc6b00$6401a8c0@max>
References: <003601c7b76a$9a2a0550$6401a8c0@max> <4680394F.6060001@v.loewis.de> <003a01c7bb0e$affc6b00$6401a8c0@max>
Message-ID: <468691D9.1000009@v.loewis.de>

> I think the best place for it would be as a new method "set_noinherit" for file and socket objects or as a new function in the os module (thus the implementation should probably be rewritten at the C level).

Indeed. Can you come up with a C implementation of it? I think it should be a function in the posix/nt module, expecting OS handles; the function in the os module could additionally support sockets and file objects also in a polymorphic way.

> This works until the queue is full (whose size was specified in listen()). THEN the client will get (10061, 'Connection refused'), as you already explained.

That's for accept, yes. For send, you can continue sending until the TCP window closes (plus some unspecified amount of local buffering the OS might do).

> However, sometimes a subprocess might crash or hang. Now what happens if the server program is closed and then started again? On Linux, no problem (more or less). When the server program is closed, the subprocess will be killed by the OS (I think), and the socket is released (perhaps with a few minutes delay).

That's not true. The child process can run indefinitely even though the parent process has terminated. You may be thinking of SIGHUP, which is sent to all processes when the user logs out of the terminal.
Regards,
Martin

From ckkart at hoc.net  Wed Jun 27 13:43:49 2007
From: ckkart at hoc.net (Christian)
Date: Wed, 27 Jun 2007 20:43:49 +0900
Subject: [Python-Dev] csv changed from python 2.4 to 2.5
In-Reply-To: <20070627105041.869BA14C24B@irishsea.home.craig-wood.com>
References: <20070627105041.869BA14C24B@irishsea.home.craig-wood.com>
Message-ID: <46824D75.9080808@hoc.net>

Nick Craig-Wood wrote:
> Christian K wrote: [...]
>> Python 2.5.1 (r251:54863, May 2 2007, 16:56:35)
>> [GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> import csv
>> >>> d = csv.excel
>> >>> d.delimiter = ','
>> >>>
>
> Don't you want to do this anyway?
>
>     import csv
>     class my_dialect(csv.excel):
>         delimiter = ","
>
I could probably do that, sure. I used to register my custom dialects and retrieve and modify them in another place, thus probably misusing the register mechanism as a replacement for a global symbol.

Thanks,
Christian
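As a closing illustration of the subclass-and-register approach discussed above, here is a minimal sketch; the "semicolons" dialect name, the delimiter and the sample data are made up, and the syntax is Python 2.x to match the thread:

    import csv
    import StringIO

    # A dialect is defined by subclassing an existing one and overriding
    # only the attributes that differ.
    class semicolons(csv.excel):
        delimiter = ";"

    # Registering the dialect gives it a global name, so other code can
    # refer to it without importing the class.
    csv.register_dialect("semicolons", semicolons)

    sample = StringIO.StringIO("a;b;c\r\n1;2;3\r\n")
    for row in csv.reader(sample, dialect="semicolons"):
        print row            # ['a', 'b', 'c'], then ['1', '2', '3']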