From greg at krypto.org Tue Jan 1 04:26:54 2013 From: greg at krypto.org (Gregory P. Smith) Date: Mon, 31 Dec 2012 19:26:54 -0800 Subject: [Python-Dev] test___all__ polluting sys.modules? In-Reply-To: References: Message-ID: On Sun, Dec 30, 2012 at 5:33 AM, Eli Bendersky wrote: > > > On Sat, Dec 29, 2012 at 5:34 PM, Nick Coghlan wrote: > >> On Sun, Dec 30, 2012 at 10:46 AM, Benjamin Peterson >> wrote: >> > 2012/12/29 Eli Bendersky : >> >> Hi, >> >> >> >> This came up while investigating some test-order-dependency failures in >> >> issue 16076. >> >> >> >> test___all__ goes over modules that have `__all__` in them and does >> 'from >> >> import *' on them. This leaves a lot of modules in >> sys.modules, >> >> which may interfere with some tests that do fancy things with >> sys,modules. >> >> In particular, the ElementTree tests have trouble with it because they >> >> carefully set up the imports to get the C or the Python version of >> etree >> >> (see issues 15083 and 15075). >> >> >> >> Would it make sense to save the sys.modules state and restore it in >> >> test___all__ so that sys.modules isn't affected by this test? >> > >> > Sounds reasonable to me. >> >> I've tried this as an inherent property of regrtest before (to resolve >> some problems with test_pydoc), and it didn't work - we have too many >> modules with non-local side effects (IIRC, mostly related to the copy >> and pickle registries). >> >> Given that it checks the whole standard library, test___all__ is >> likely to run into the same problem. >> >> > Yes, I'm running into all kinds of weird problems when saving/restoring > sys.modules around test___all__. This is not the first time I get to fight > this test-run-dependency problem and it's very frustrating. > > This may be a naive question, but why don't we run the tests in separate > interpreters? For example with -j we do (which creates all kinds of > strange intermittent problems depending on which tests got bundled into the > same process). 
Is this a matter of performance? Because that would help get > rid of these dependencies between tests, which would probably save core > devs some work and headache. > For test___all__ having it run itself within a subprocess to guarantee it is in its own interpreter is wise. Doing that for all tests by default though is a good way to hide problems with particular tests and underhandedly encourage more bad behavior from tests messing up the global environment as there are no consequences. We should aim to keep our tests as clean as possible. That doesn't mean that they _are_ (we've got a _lot_ of old messy code) but it is a good goal. -gps > > After all, since a lot of the interpreter state is global (for example > sys.modules), does it not make sense to run each test in a clean > environment? Many tests do fancy things with the global environment which > makes them difficult to keep clean and separate. > > >> Hence test.support.import_fresh_module - it can ensure you get the >> module you want, regardless of the preexisting contents of >> sys.modules. ( >> http://docs.python.org/dev/library/test#test.support.import_fresh_module) >> > > Yes, this is the solution currently used in test_xml_etree. However, > once pickling tests are added things stop working. Pickle uses __import__ > to import the module a class belongs to, bypassing all such trickery. So if > test___all__ got _elementtree into sys.modules, pickle's __import__ finds > it even if all the tests in test_xml_etree manage to ignore it for the > Python version because they use import_fresh_module. > > Eli > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/greg%40krypto.org > > -------------- next part -------------- An HTML attachment was scrubbed... 
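The save/restore idea discussed above can be sketched as a context manager. This is only an illustrative sketch (the helper name is made up), and, as Nick notes, it cannot undo non-local side effects such as entries added to the copy and pickle registries:

```python
import sys
from contextlib import contextmanager

@contextmanager
def saved_sys_modules():
    """Snapshot sys.modules and restore it afterwards.

    Only sys.modules changes are undone; side effects elsewhere
    (copy/pickle registries, etc.) persist, which is why this
    approach proved insufficient for regrtest as a whole.
    """
    saved = sys.modules.copy()
    try:
        yield
    finally:
        # Drop modules imported inside the block, put back replaced ones.
        for name in list(sys.modules):
            if name not in saved:
                del sys.modules[name]
        sys.modules.update(saved)
```

Wrapped around a `from module import *` in test___all__, anything the star-import pulled into sys.modules would be dropped again afterwards.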
URL: From greg at krypto.org Tue Jan 1 04:30:51 2013 From: greg at krypto.org (Gregory P. Smith) Date: Mon, 31 Dec 2012 19:30:51 -0800 Subject: [Python-Dev] test___all__ polluting sys.modules? In-Reply-To: <20121230144830.F00802500B2@webabinitio.net> References: <20121230135420.GA24322@sleipnir.bytereef.org> <20121230144830.F00802500B2@webabinitio.net> Message-ID: On Sun, Dec 30, 2012 at 6:48 AM, R. David Murray wrote: > On Mon, 31 Dec 2012 00:38:47 +1000, Nick Coghlan > wrote: > > On Mon, Dec 31, 2012 at 12:19 AM, Eli Bendersky > wrote: > > > On Sun, Dec 30, 2012 at 5:54 AM, Stefan Krah > wrote: > > >> > > >> Eli Bendersky wrote: > > >> > Yes, this is the solution currently used in test_xml_etree. However, > > >> > once > > >> > pickling tests are added things stop working. Pickle uses > __import__ to > > >> > import > > >> > the module a class belongs to, bypassing all such trickery. So if > > >> > test___all__ > > >> > got _elementtree into sys.modules, pickle's __import__ finds it > even if > > >> > all the > > >> > tests in test_xml_etree manage to ignore it for the Python version > > >> > because they > > >> > use import_fresh_module. > > >> > > >> I ran into the same problem for test_decimal. The only thing that > appears > > >> to work is to set sys.modules['decimal'] explicitly before calling > > >> dumps()/loads(). See: > > >> > > >> PythonAPItests.test_pickle() > > >> ContextAPItests.test_pickle() > > > > > > Yes, this seems to have done the trick. Thanks for the suggestion. > > > > It may be worth offering a context manager/decorator equivalent to > > "import_fresh_module". > > I suggested making import_fresh_module a context manager in the issue > that Eli opened about test___all__. > > > > I'm still curious about the test-in-clean-env question though. 
> > > > As Stefan noted, the main advantage we get is that sometimes the > > failure to clean up properly is in the standard lib code rather than > > the tests, and with complete isolation we'd be less likely to notice > > the problem. > > > > Once you combine that with the fact that rearchitecting regrtest to > > work that way would be quite a bit of work, the motivation to make it > > happen goes way down. > > > > However, specifically spinning out the "import the world" tests like > > test_pydoc and test___all__ to a separate process might be worth the > > effort. > > Adding something to regrtest (or unittest?) so that certain nominated > test modules are run in a subprocess has been discussed previously, but > so far no one has stepped up to implement it :) (I think this came up > originally for test_site, but I don't remember for sure.) > test_subprocess and possibly a few others (things testing signal behavior, etc.) already have some individual tests that do this. It's pretty easy: subprocess.Popen of sys.executable, typically with -E, and the code either in a string via -c, or else pointing at a second file, or adding an argument for main in the current file to detect. Making this simpler via something in test.support might make sense if there is a common need for it. -gps -------------- next part -------------- An HTML attachment was scrubbed...
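The pattern Gregory describes can be sketched like this; the child code and its assertion are made-up placeholders for whatever the test actually needs to verify in a pristine interpreter:

```python
import subprocess
import sys

# Code to run in a fresh interpreter: check that the parent's imports
# don't leak in, then report success.
child_code = (
    "import sys\n"
    "assert 'xml.etree.ElementTree' not in sys.modules\n"
    "print('ok')\n"
)

# -E ignores PYTHON* environment variables; -c runs the given string.
out = subprocess.check_output([sys.executable, "-E", "-c", child_code])
assert out.strip() == b"ok"
```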
URL: From solipsis at pitrou.net Tue Jan 1 07:47:21 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 1 Jan 2013 07:47:21 +0100 Subject: [Python-Dev] Branch ancestry hiccup on the Mercurial repo Message-ID: <20130101074721.6bf4157b@pitrou.net> Hello, For the record, due to a bug in the "hg graft" command, a recent changeset of mine basically merged branch 2.7 into 3.2: $ hg log -r "ee8d999b6e05 or parents(ee8d999b6e05)" -G o changeset: 81129:ee8d999b6e05 |\ branch: 3.2 | | parent: 81124:e4ea38a92c4d | | parent: 81128:3436769a7964 | | user: Antoine Pitrou | | date: Fri Dec 28 19:07:43 2012 +0100 | | summary: Forward port new test for SSLSocket.connect_ex() | | | o changeset: 81128:3436769a7964 | | branch: 2.7 | | user: Antoine Pitrou | | date: Fri Dec 28 19:03:43 2012 +0100 | | summary: Backport Python 3.2 fix for issue #12065, and add another test for SSLSocket.connect_ex(). | | o | changeset: 81124:e4ea38a92c4d | | branch: 3.2 | | parent: 81118:b2cd12690a51 | | user: Serhiy Storchaka | | date: Fri Dec 28 09:42:11 2012 +0200 | | summary: Issue #16761: Raise TypeError when int() called with base argument only. | | The first symptoms were reported in http://bz.selenic.com/show_bug.cgi?id=3748 but the actual cause of the issue is http://bz.selenic.com/show_bug.cgi?id=3667. Chances are the problem won't be very annoying in practice, but just FYI. Regards Antoine. From senthil at uthcode.com Tue Jan 1 09:20:29 2013 From: senthil at uthcode.com (Senthil Kumaran) Date: Tue, 1 Jan 2013 00:20:29 -0800 Subject: [Python-Dev] Branch ancestry hiccup on the Mercurial repo In-Reply-To: <20130101074721.6bf4157b@pitrou.net> References: <20130101074721.6bf4157b@pitrou.net> Message-ID: On Mon, Dec 31, 2012 at 10:47 PM, Antoine Pitrou wrote: > > Chances are the problem won't be very annoying in practice, but just > FYI. I dont get this. I see 2.7 as a separate un-merged branch again ( http://hg.python.org/cpython/graph). Just curious, what happened next? 
Thanks, Senthil -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.brandl at gmx.net Tue Jan 1 10:48:24 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 01 Jan 2013 10:48:24 +0100 Subject: [Python-Dev] Branch ancestry hiccup on the Mercurial repo In-Reply-To: References: <20130101074721.6bf4157b@pitrou.net> Message-ID: On 01/01/2013 09:20 AM, Senthil Kumaran wrote: > > > On Mon, Dec 31, 2012 at 10:47 PM, Antoine Pitrou > wrote: > > > Chances are the problem won't be very annoying in practice, but just > FYI. > > > I don't get this. I see 2.7 as a separate un-merged branch again > (http://hg.python.org/cpython/graph). Just curious, what happened next? You can still see the merge near the bottom of the graph at that URL. Nothing too bad happened: since we usually don't merge 2.7 into 3.x anyway, it does not matter that Mercurial now considers all changes before that accident as "merged". Merry 2013, Georg From francismb at email.de Tue Jan 1 15:54:53 2013 From: francismb at email.de (francis) Date: Tue, 01 Jan 2013 15:54:53 +0100 Subject: [Python-Dev] question about packaging In-Reply-To: References: <480CF8A8-0461-4C20-8A3C-2944C883E78B@gmail.com> <20121229114739.54ef8b52@pitrou.net> Message-ID: <50E2F8BD.90603@email.de> >>> The current effort seems to be distlib, Vinay's project to gather the >>> "good parts" of packaging and distutils as a library API: >>> http://packages.python.org/distlib/ >>> (there's an active bitbucket repo) > See > > https://bitbucket.org/vinay.sajip/distlib/ > > for the latest code, which is periodically pushed to > > http://hg.python.org/distlib/ > > The latest documentation is at > > https://distlib.readthedocs.org/en/latest/ > > What about the actual packaging issues from the cpython bug tracker? Should we try to port the ones that apply to https://bitbucket.org/vinay.sajip/distlib/ ? Thanks and Happy 2013!
francis From eliben at gmail.com Tue Jan 1 17:00:01 2013 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 1 Jan 2013 08:00:01 -0800 Subject: [Python-Dev] PyTypeObject type names in Modules/ Message-ID: Hello and happy 2013, Something I noticed earlier today is that some C versions of stdlib modules define their name similarly to the Python version in their PyTypeObject. Some examples: Decimal, xml.etree's Element. Others prepend an underscore, like _pickle.Pickler and many others. What are the tradeoffs involved in this choice? Is there a "right" thing for types that are supposed to be compatible (i.e. the C extension, where available, replaces the Python implementation seamlessly)? I can think of some implications for pickling. Unpickling looks at the class name to figure out how to unpickle a user-defined object, so this can affect the pickle/unpickle compatibility between the C and Python versions. What else? Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Tue Jan 1 17:03:11 2013 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 1 Jan 2013 08:03:11 -0800 Subject: [Python-Dev] test___all__ polluting sys.modules? In-Reply-To: References: <20121230135420.GA24322@sleipnir.bytereef.org> <20121230144830.F00802500B2@webabinitio.net> Message-ID: On Mon, Dec 31, 2012 at 7:30 PM, Gregory P. Smith wrote: > > On Sun, Dec 30, 2012 at 6:48 AM, R. David Murray wrote: > >> On Mon, 31 Dec 2012 00:38:47 +1000, Nick Coghlan >> wrote: >> > On Mon, Dec 31, 2012 at 12:19 AM, Eli Bendersky >> wrote: >> > > On Sun, Dec 30, 2012 at 5:54 AM, Stefan Krah >> wrote: >> > >> >> > >> Eli Bendersky wrote: >> > >> > Yes, this is the solution currently used in test_xml_etree. >> However, >> > >> > once >> > >> > pickling tests are added things stop working. Pickle uses >> __import__ to >> > >> > import >> > >> > the module a class belongs to, bypassing all such trickery.
So if >> > >> > test___all__ >> > >> > got _elementtree into sys.modules, pickle's __import__ finds it >> even if >> > >> > all the >> > >> > tests in test_xml_etree manage to ignore it for the Python version >> > >> > because they >> > >> > use import_fresh_module. >> > >> >> > >> I ran into the same problem for test_decimal. The only thing that >> appears >> > >> to work is to set sys.modules['decimal'] explicitly before calling >> > >> dumps()/loads(). See: >> > >> >> > >> PythonAPItests.test_pickle() >> > >> ContextAPItests.test_pickle() >> > > >> > > Yes, this seems to have done the trick. Thanks for the suggestion. >> > >> > It may be worth offering a context manager/decorator equivalent to >> > "import_fresh_module". >> >> I suggested making import_fresh_module a context manager in the issue >> that Eli opened about test___all__. >> >> > > I'm still curious about the test-in-clean-env question though. >> > >> > As Stefan noted, the main advantage we get is that sometimes the >> > failure to clean up properly is in the standard lib code rather than >> > the tests, and with complete isolation we'd be less likely to notice >> > the problem. >> > >> > Once you combine that with the fact that rearchitecting regrtest to >> > work that way would be quite a bit of work, the motivation to make it >> > happen goes way down. >> > >> > However, specifically spinning out the "import the world" tests like >> > test_pydoc and test___all__ to a separate process might be worth the >> > effort. >> >> Adding something to regertest (or unittest?) so that certain nominated >> test modules are run in a subprocess has been discussed previously, but >> so far no one has stepped up to implement it :) (I think this came up >> originally for test_site, but I don't remember for sure.) >> > > test_subprocess and possibly a few others (things testing signal behavior, > etc) already have some individual tests that do this. its pretty easy. 
> subprocess.Popen of sys.executable typically with -E and the code either > in a string via -c or else point at a second file or add an argument for > main in the current file to detect. making this simpler via something in > test.support might make sense if there is a common need for it. > > Thanks, and I agree that some tests have to be run separately. The patches submitted by Arfrever to http://bugs.python.org/issue1674555 seem interesting and relevant (pulling test_site into a separate process/interpreter). I'll take a look at them once I have some time. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From benjamin at python.org Tue Jan 1 17:15:22 2013 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 1 Jan 2013 10:15:22 -0600 Subject: [Python-Dev] PyTypeObject type names in Modules/ In-Reply-To: References: Message-ID: 2013/1/1 Eli Bendersky : > Hello and happy 2013, > > Something I noticed earlier today is that some C versions of stdlib modules > define their name similarly to the Python version in their PyTypeObject. > Some examples: Decimal, xml.etree's Element. Others prepend an underscore, > like _pickle.Pickler and many others. > > What are the tradeoffs involved in this choice? Is there a "right" thing for > types that are supposed to be compatible (i.e. the C extension, where > available, replaces the Python implementation seamlessly)? > > I can think of some implications for pickling. Unpickling looks at the class > name to figure out how to unpickle a user-defined object, so this can affect > the pickle/unpickle compatibility between the C and Python versions. What > else? I don't think it's terribly important, except that if the object from the C module is directly exposed through the API, it's nicer if its __name__ doesn't have a leading underscore.
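The point under discussion can be seen directly from the interpreter: pickle records the class's module and name, so a C type whose tp_name names the public module (as decimal's C accelerator, with tp_name "decimal.Decimal", does) round-trips transparently. A short illustration:

```python
import decimal
import pickle

d = decimal.Decimal("1.5")

# The C accelerator type reports the public module in its name,
# which is what pickle records in the stream.
assert type(d).__module__ == "decimal"
assert type(d).__name__ == "Decimal"

# Because of that, the round trip works regardless of which
# implementation (C or Python) produced the object.
restored = pickle.loads(pickle.dumps(d))
assert restored == d
```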
-- Regards, Benjamin From eliben at gmail.com Tue Jan 1 17:23:14 2013 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 1 Jan 2013 08:23:14 -0800 Subject: [Python-Dev] PyTypeObject type names in Modules/ In-Reply-To: References: Message-ID: On Tue, Jan 1, 2013 at 8:15 AM, Benjamin Peterson wrote: > 2013/1/1 Eli Bendersky : > > Hello and happy 2013, > > > > Something I noticed earlier today is that some C versions of stdlib > modules > > define their name similarly to the Python version in their PyTypeObject. > > Some examples: Decimal, xml.etree's Element. Others prepend an > understore, > > like _pickle.Pickler and many others. > > > > What are the tradeoffs involved in this choice? Is there a "right" thing > for > > types that are supposed to be compatible (i.e. the C extension, where > > available, replaces the Python implementation seamlessly)? > > > > I can think of some meanings for pickling. Unpickling looks at the class > > name to figure out how to unpickle a user-defined object, so this can > affect > > the pickle/unpickle compatibility between the C and Python versions. What > > else? > > I don't it's terribly important except if the object from the C module > is directly exposed through the API it's nicer if it's __name__ > doesn't have a leading underscore. > Hi Benjamin, Can you elaborate - what you mean by "is directly exposed through the API"? For example, Pickler in 3.x: >>> import pickle >>> pickle.Pickler.__name__ 'Pickler' >>> pickle.Pickler.__module__ '_pickle' -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From benjamin at python.org Tue Jan 1 17:37:55 2013 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 1 Jan 2013 10:37:55 -0600 Subject: [Python-Dev] PyTypeObject type names in Modules/ In-Reply-To: References: Message-ID: 2013/1/1 Eli Bendersky : > > > > On Tue, Jan 1, 2013 at 8:15 AM, Benjamin Peterson > wrote: >> >> 2013/1/1 Eli Bendersky : >> > Hello and happy 2013, >> > >> > Something I noticed earlier today is that some C versions of stdlib >> > modules >> > define their name similarly to the Python version in their PyTypeObject. >> > Some examples: Decimal, xml.etree's Element. Others prepend an >> > understore, >> > like _pickle.Pickler and many others. >> > >> > What are the tradeoffs involved in this choice? Is there a "right" thing >> > for >> > types that are supposed to be compatible (i.e. the C extension, where >> > available, replaces the Python implementation seamlessly)? >> > >> > I can think of some meanings for pickling. Unpickling looks at the class >> > name to figure out how to unpickle a user-defined object, so this can >> > affect >> > the pickle/unpickle compatibility between the C and Python versions. >> > What >> > else? >> >> I don't it's terribly important except if the object from the C module >> is directly exposed through the API it's nicer if it's __name__ >> doesn't have a leading underscore. > > > Hi Benjamin, > > Can you elaborate - what you mean by "is directly exposed through the API"? > > For example, Pickler in 3.x: > >>>> import pickle >>>> pickle.Pickler.__name__ > 'Pickler' >>>> pickle.Pickler.__module__ > '_pickle' That's an example of what I meant. 
-- Regards, Benjamin From solipsis at pitrou.net Tue Jan 1 22:48:48 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 1 Jan 2013 22:48:48 +0100 Subject: [Python-Dev] cpython (3.3): configparser: preserve section order when using `__setitem__` (issue #16820) References: <3YbTJh5XGTzS3t@mail.python.org> Message-ID: <20130101224848.75f07576@pitrou.net> On Tue, 1 Jan 2013 22:37:56 +0100 (CET) lukasz.langa wrote: > http://hg.python.org/cpython/rev/f580342b63d8 > changeset: 81205:f580342b63d8 > branch: 3.3 > parent: 81203:b8b5303ac96f > user: Łukasz Langa > date: Tue Jan 01 22:33:19 2013 +0100 > summary: > configparser: preserve section order when using `__setitem__` (issue #16820) Would this deserve some NEWS entry? Regards Antoine. From lukasz at langa.pl Wed Jan 2 00:02:45 2013 From: lukasz at langa.pl (Łukasz Langa) Date: Wed, 2 Jan 2013 00:02:45 +0100 Subject: [Python-Dev] cpython (3.3): configparser: preserve section order when using `__setitem__` (issue #16820) In-Reply-To: <20130101224848.75f07576@pitrou.net> References: <3YbTJh5XGTzS3t@mail.python.org> <20130101224848.75f07576@pitrou.net> Message-ID: On 1 Jan 2013, at 22:48, Antoine Pitrou wrote: > On Tue, 1 Jan 2013 22:37:56 +0100 (CET) > lukasz.langa wrote: >> http://hg.python.org/cpython/rev/f580342b63d8 >> changeset: 81205:f580342b63d8 >> branch: 3.3 >> parent: 81203:b8b5303ac96f >> user: Łukasz Langa >> date: Tue Jan 01 22:33:19 2013 +0100 >> summary: >> configparser: preserve section order when using `__setitem__` (issue #16820) > > Would this deserve some NEWS entry? Indeed, I missed updating NEWS, sorry. Will do that now for all fixes I did in the past days. -- Best regards, Łukasz Langa Senior Systems Architecture Engineer IT Infrastructure Department Grupa Allegro Sp. z o.o.
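The behavior this changeset establishes — re-assigning a section via `__setitem__` replaces its contents but keeps its original position — can be demonstrated against the fixed configparser:

```python
import configparser

cp = configparser.ConfigParser()
cp["first"] = {"a": "1"}
cp["second"] = {"b": "2"}
assert cp.sections() == ["first", "second"]

# Re-assigning an existing section replaces its contents but,
# with the fix, no longer moves it to the end.
cp["first"] = {"a": "42"}
assert cp.sections() == ["first", "second"]
assert cp["first"]["a"] == "42"
```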
http://lukasz.langa.pl/ +48 791 080 144 From christian at python.org Wed Jan 2 07:14:58 2013 From: christian at python.org (Christian Heimes) Date: Wed, 02 Jan 2013 07:14:58 +0100 Subject: [Python-Dev] PEPs in progress Message-ID: Good morning and happy new year! I would like to let you know that I'm working on two new PEPs. The first PEP addresses hash collision attacks as discussed in http://bugs.python.org/issue14621. I'm planning to make the internal string and bytes hash function more secure, pluggable and even faster. DJB's SipHash is part of the plan as well. Users will be able to choose an algorithm on startup, and embedders will be able to provide their own hash function, too. The second PEP addresses key derivation functions for secure password hashing. I'd like to add PBKDF2, bcrypt and scrypt facilities to the standard library. Regards, Christian From dirkjan at ochtman.nl Wed Jan 2 09:18:17 2013 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Wed, 2 Jan 2013 09:18:17 +0100 Subject: [Python-Dev] PEPs in progress In-Reply-To: References: Message-ID: On Wed, Jan 2, 2013 at 7:14 AM, Christian Heimes wrote: > The second PEP addresses key derivation functions for secure password > hashing. I'd like to add PBKDF2, bcrypt and scrypt facilities to the > standard library. Hashing only? I was hoping you would also include PKCS1 RSA primitives. Cheers, Dirkjan From solipsis at pitrou.net Wed Jan 2 10:49:42 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 2 Jan 2013 10:49:42 +0100 Subject: [Python-Dev] PEPs in progress References: Message-ID: <20130102104942.68e7df31@pitrou.net> On Wed, 2 Jan 2013 09:18:17 +0100, Dirkjan Ochtman wrote: > On Wed, Jan 2, 2013 at 7:14 AM, Christian Heimes > wrote: > > The second PEP addresses key derivation functions for secure > > password hashing. I'd like to add PBKDF2, bcrypt and scrypt > > facilities to the standard library. > > Hashing only? I was hoping you would also include PKCS1 RSA > primitives.
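As background for the KDF proposal: PBKDF2 derives a key by repeatedly applying HMAC to a password and salt. The interface that later landed in CPython's hashlib as hashlib.pbkdf2_hmac (new in 3.4) exposes exactly these parameters; a short sketch:

```python
import hashlib
import os

password = b"correct horse battery staple"
salt = os.urandom(16)   # random per-password salt
iterations = 100_000    # cost parameter, tuned upward over time

key = hashlib.pbkdf2_hmac("sha256", password, salt, iterations)
assert len(key) == 32   # defaults to the digest size of SHA-256

# The derivation is deterministic for the same inputs...
assert key == hashlib.pbkdf2_hmac("sha256", password, salt, iterations)
# ...and changes completely with a different salt.
assert key != hashlib.pbkdf2_hmac("sha256", password, os.urandom(16), iterations)
```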
That's a totally different topic, no? Regards Antoine. From christian at python.org Wed Jan 2 11:06:07 2013 From: christian at python.org (Christian Heimes) Date: Wed, 02 Jan 2013 11:06:07 +0100 Subject: [Python-Dev] PEPs in progress In-Reply-To: <20130102104942.68e7df31@pitrou.net> References: <20130102104942.68e7df31@pitrou.net> Message-ID: <50E4068F.8010206@python.org> On 02.01.2013 10:49, Antoine Pitrou wrote: > On Wed, 2 Jan 2013 09:18:17 +0100, > Dirkjan Ochtman wrote: >> On Wed, Jan 2, 2013 at 7:14 AM, Christian Heimes >> wrote: >>> The second PEP addresses key derivation functions for secure >>> password hashing. I'd like to add PBKDF2, bcrypt and scrypt >>> facilities to the standard library. >> >> Hashing only? I was hoping you would also include PKCS1 RSA >> primitives. > > That's a totally different topic, no? Yeah, these are totally different topics. PKCS #1 is about RSA encryption, decryption, verification and padding. PBKDF2 is specified in PKCS #5 v2.0. PKCS #1 support should be implemented on top of OpenSSL or NaCl. I'm planning to use OpenSSL as the primary implementation and additional code as a fallback implementation of PBKDF2-HMAC-SHA-1, PBKDF2-HMAC-SHA-256 and PBKDF2-HMAC-SHA-512. Christian From solipsis at pitrou.net Wed Jan 2 11:11:47 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 2 Jan 2013 11:11:47 +0100 Subject: [Python-Dev] PEPs in progress References: <20130102104942.68e7df31@pitrou.net> <50E4068F.8010206@python.org> Message-ID: <20130102111147.01a4f860@pitrou.net> On Wed, 02 Jan 2013 11:06:07 +0100, Christian Heimes wrote: > On 02.01.2013 10:49, Antoine Pitrou wrote: > > On Wed, 2 Jan 2013 09:18:17 +0100, > > Dirkjan Ochtman wrote: > >> On Wed, Jan 2, 2013 at 7:14 AM, Christian Heimes > >> wrote: > >>> The second PEP addresses key derivation functions for secure > >>> password hashing. I'd like to add PBKDF2, bcrypt and scrypt > >>> facilities to the standard library. > >> > >> Hashing only?
I was hoping you would also include PKCS1 RSA > >> primitives. > > > > That's a totally different topic, no? > > Yeah, these are totally different topics. PKCS #1 is about RSA > encryption, decryption, verification and padding. PBKDF2 is specified > in PKCS #5 v2.0. > > PKCS #1 support should be implemented on top of OpenSSL or NaCl. I'm > planing to use OpenSSL as primary implementation and additional code > as fallback implementation of PBKDF2-HMAC-SHA-1, PBKDF2-HMAC-SHA-256 > and PBKDF2-HMAC-SHA-512. Does PBKDF2-SHA3 exist? From christian at python.org Wed Jan 2 11:30:36 2013 From: christian at python.org (Christian Heimes) Date: Wed, 02 Jan 2013 11:30:36 +0100 Subject: [Python-Dev] PEPs in progress In-Reply-To: <20130102111147.01a4f860@pitrou.net> References: <20130102104942.68e7df31@pitrou.net> <50E4068F.8010206@python.org> <20130102111147.01a4f860@pitrou.net> Message-ID: <50E40C4C.2050603@python.org> Am 02.01.2013 11:11, schrieb Antoine Pitrou: >> PKCS #1 support should be implemented on top of OpenSSL or NaCl. I'm >> planing to use OpenSSL as primary implementation and additional code >> as fallback implementation of PBKDF2-HMAC-SHA-1, PBKDF2-HMAC-SHA-256 >> and PBKDF2-HMAC-SHA-512. > > Does PBKDF2-SHA3 exist? No HMAC-SHA-3 or PBKDF2-SHA-3 yet. The different design may require different MAC and KDF functions. At least the sponge design isn't vulnerable to length extension attacks like Merkle-Damgard constructions, which removes the need for HMAC-SHA-3. From michael at voidspace.org.uk Wed Jan 2 12:54:21 2013 From: michael at voidspace.org.uk (Michael Foord) Date: Wed, 2 Jan 2013 11:54:21 +0000 Subject: [Python-Dev] hgweb misconfiguration Message-ID: <039D87CC-0F36-4B6B-8E1C-42CD2E907C6C@voidspace.org.uk> Hey folks, Likely the wrong place to report this, but I couldn't work out the best place and figured this is only as bad as anywhere else. 
A user has reported to webmaster that hgweb is misconfigured (or at least the server configuration is interfering with hgweb). The symptom is that this url is incorrectly a 404: http://hg.python.org/cpython/file/3.3/Doc/tools/sphinxext/static/version_switch.js His diagnosis: >> the web server is misconfigured to serve the hgweb static >> files for any directory named 'static'. It is appropriate for it to do >> this for http://hg.python.org/static/ and maybe even >> http://hg.python.org/cpython/static/ and the other repositories, but >> it is inappropriate for it to serve the hgweb static files within the >> repository contents. What should have happened: The web server passes on the request to hgweb, which notices that it is supposed to retrieve a particular version of a file from the Mercurial repository and show it to the user. This works successfully for URLs not containing '/static/' (ending slash significant), e.g., . You can see that hgweb successfully displays the directory listing. However, try adding a slash onto the end of that. The web server picks up '/static/' and takes over the request, listing its static files (rather than letting hgweb display the 'static' directory from the Mercurial repository). The static files shown by the web server's directory listing are used for the user interface of hgweb; for example, the logo . It should be serving those static files on and (where repo is a repository name like cpython), but it is also serving on , which is inappropriate, because it makes it impossible to view version_switch.js and the other files in that directory with the hgweb interface. -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. 
-- the sqlite blessing http://www.sqlite.org/different.html From victor.stinner at gmail.com Thu Jan 3 02:40:32 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 3 Jan 2013 02:40:32 +0100 Subject: [Python-Dev] test___all__ polluting sys.modules? In-Reply-To: References: Message-ID: Oh, I just remembered this old issue which is somewhat similar: http://bugs.python.org/issue11995 "test_pydoc loads all Python modules" Running test_pydoc in a subprocess would work around this issue. Arfrever wrote a patch, I didn't read it (sorry!). Victor From fijall at gmail.com Thu Jan 3 11:22:27 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Thu, 3 Jan 2013 12:22:27 +0200 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org> <3A8E318B-93F9-4CB0-92DE-C5EC7E2DBA77@gmail.com> <20121211101331.05087056@pitrou.net> <0EC9E09D-4E36-4E8E-BAF3-79E187D07AD3@gmail.com> Message-ID: Hello everyone. Thanks, Raymond, for writing down a pure Python version ;-) I did an initial port to RPython for experiments. The results (on large dicts only) are inconclusive - it's either a bit faster or a bit slower, depending on what exactly you do. There is a possibility I messed something up too (there is a branch rdict-experiments in PyPy, in a very sorry state though). One effect we did not think about is that besides the extra indirection, there is an issue with data locality - having to look in two different large lists seems to be a hit. So, while I tried, the approach is not scientific at all, but unless someone points out a clear flaw in the code (it's in pypy/rlib/dict.py in the rdict-experiments branch), I'm probably abandoning this for now.
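For readers following the thread: the proposal splits the dict into a small sparse index table plus a dense, insertion-ordered entries array, so iteration walks only the dense array. A toy pure-Python sketch (linear probing, no deletion or shrinking — the real designs differ in many details):

```python
class CompactDict:
    """Toy two-table dict: sparse index plus dense entries list.

    Iteration walks the dense list, so it is cache-friendly and
    preserves insertion order. Deletion is omitted for brevity.
    """

    def __init__(self):
        self._index = [None] * 8   # sparse: slot -> position in _entries
        self._entries = []         # dense: (hash, key, value) tuples

    def _slot(self, key, hashvalue):
        # Linear probing over the sparse table.
        mask = len(self._index) - 1
        i = hashvalue & mask
        while True:
            pos = self._index[i]
            if pos is None or self._entries[pos][1] == key:
                return i
            i = (i + 1) & mask

    def __setitem__(self, key, value):
        h = hash(key)
        i = self._slot(key, h)
        pos = self._index[i]
        if pos is None:
            self._index[i] = len(self._entries)
            self._entries.append((h, key, value))
            if len(self._entries) * 3 >= len(self._index) * 2:
                self._resize()
        else:
            self._entries[pos] = (h, key, value)

    def __getitem__(self, key):
        pos = self._index[self._slot(key, hash(key))]
        if pos is None:
            raise KeyError(key)
        return self._entries[pos][2]

    def _resize(self):
        # Only the small index table is rebuilt; the entries stay put.
        self._index = [None] * (len(self._index) * 2)
        mask = len(self._index) - 1
        for pos, (h, _key, _value) in enumerate(self._entries):
            i = h & mask
            while self._index[i] is not None:
                i = (i + 1) & mask
            self._index[i] = pos

    def keys(self):
        return [key for _h, key, _v in self._entries]
```

The data-locality concern above corresponds to `__getitem__` touching both `_index` and `_entries` on every lookup, where a single-table dict touches only one array.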
Cheers, fijal From eliben at gmail.com Thu Jan 3 15:40:46 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 3 Jan 2013 06:40:46 -0800 Subject: [Python-Dev] PyTypeObject type names in Modules/ In-Reply-To: References: Message-ID: >> > >> 2013/1/1 Eli Bendersky : > >> > Hello and happy 2013, > >> > > >> > Something I noticed earlier today is that some C versions of stdlib > >> > modules > >> > define their name similarly to the Python version in their > PyTypeObject. > >> > Some examples: Decimal, xml.etree's Element. Others prepend an > >> > understore, > >> > like _pickle.Pickler and many others. > >> > > >> > >> I don't it's terribly important except if the object from the C module > >> is directly exposed through the API it's nicer if it's __name__ > >> doesn't have a leading underscore. > > > As a followup question: would it be considered a compatibility-breaking change to rename PyTypeObject names? As a concrete example, in Modules/_elementtree.c the name of Element_Type (essentially the xml.etree.ElementTree.Element replacement in C) is "Element". For the purpose of pickling/unpickling it should be named "_elementtree.Element" or "xml.etree.ElementTree.Element" or some such thing. Can such a change be made between 3.3 and 3.4? Between 3.3 and 3.3.1? Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From benjamin at python.org Thu Jan 3 15:43:23 2013 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 3 Jan 2013 08:43:23 -0600 Subject: [Python-Dev] PyTypeObject type names in Modules/ In-Reply-To: References: Message-ID: 2013/1/3 Eli Bendersky : > > >> >> >> >> 2013/1/1 Eli Bendersky : >> >> > Hello and happy 2013, >> >> > >> >> > Something I noticed earlier today is that some C versions of stdlib >> >> > modules >> >> > define their name similarly to the Python version in their >> >> > PyTypeObject. >> >> > Some examples: Decimal, xml.etree's Element. 
Others prepend an >> >> > underscore, >> >> > like _pickle.Pickler and many others. >> >> > >> >> >> >> I don't think it's terribly important except if the object from the C module >> >> is directly exposed through the API it's nicer if its __name__ >> >> doesn't have a leading underscore. >> > > > > As a follow-up question: would it be considered a compatibility-breaking > change to rename PyTypeObject names? As a concrete example, in > Modules/_elementtree.c the name of Element_Type (essentially the > xml.etree.ElementTree.Element replacement in C) is "Element". For the > purpose of pickling/unpickling it should be named "_elementtree.Element" or > "xml.etree.ElementTree.Element" or some such thing. Can such a change be > made between 3.3 and 3.4? Between 3.3 and 3.3.1? I don't know much about etree. Can you pickle Element? -- Regards, Benjamin From eliben at gmail.com Thu Jan 3 15:46:27 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 3 Jan 2013 06:46:27 -0800 Subject: [Python-Dev] PyTypeObject type names in Modules/ In-Reply-To: References: Message-ID: On Thu, Jan 3, 2013 at 6:43 AM, Benjamin Peterson wrote: > 2013/1/3 Eli Bendersky : > > > > > >> >> > >> >> 2013/1/1 Eli Bendersky : > >> >> > Hello and happy 2013, > >> >> > > >> >> > Something I noticed earlier today is that some C versions of stdlib > >> >> > modules > >> >> > define their name similarly to the Python version in their > >> >> > PyTypeObject. > >> >> > Some examples: Decimal, xml.etree's Element. Others prepend an > >> >> > underscore, > >> >> > like _pickle.Pickler and many others. > >> >> > > >> >> > >> >> I don't think it's terribly important except if the object from the C > module > >> >> is directly exposed through the API it's nicer if its __name__ > >> >> doesn't have a leading underscore. > >> > > > > > > > As a follow-up question: would it be considered a compatibility-breaking > > change to rename PyTypeObject names?
As a concrete example, in > > Modules/_elementtree.c the name of Element_Type (essentially the > > xml.etree.ElementTree.Element replacement in C) is "Element". For the > > purpose of pickling/unpickling it should be named "_elementtree.Element" > or > > "xml.etree.ElementTree.Element" or some such thing. Can such a change be > > made between 3.3 and 3.4? Between 3.3 and 3.3.1? > > I don't know much about etree. Can you pickle Element? > etree has a C accelerator that was improved and extended in 3.3 and was made the default when importing etree. But a regression (issue #16076) occurs because _elementtree.Element has no pickling support, while the Python version does by default (being a plain Python class). In the aforementioned issue we're trying to resolve the strategy for supporting pickling for _elementtree.Element. I hope my understanding of unpickling is correct - it does need a class name to find the module that can unpickle the object, right? If this is correct, then such a change (name of the type) may be required to solve the regression between 3.3 and 3.3.1 Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From benjamin at python.org Thu Jan 3 15:50:44 2013 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 3 Jan 2013 08:50:44 -0600 Subject: [Python-Dev] PyTypeObject type names in Modules/ In-Reply-To: References: Message-ID: 2013/1/3 Eli Bendersky : > etree has a C accelerator that was improved and extended in 3.3 and was made > the default when importing etree. But a regression (issue #16076) occurs > because _elementtree.Element has no pickling support, while the Python > version does by default (being a plain Python class). In the aforementioned > issue we're trying to resolve the strategy for supporting pickling for > _elementtree.Element. I hope my understanding of unpickling is correct - it > does need a class name to find the module that can unpickle the object, > right?
If this is correct, then such a change (name of the type) may be > required to solve the regression between 3.3 and 3.3.1 Yes, but you're probably going to have to do more than change the class name in order for pickling to work for C types. -- Regards, Benjamin From eliben at gmail.com Thu Jan 3 17:14:08 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 3 Jan 2013 08:14:08 -0800 Subject: [Python-Dev] PyTypeObject type names in Modules/ In-Reply-To: References: Message-ID: On Thu, Jan 3, 2013 at 6:50 AM, Benjamin Peterson wrote: > 2013/1/3 Eli Bendersky : > > etree has a C accelerator that was improved and extended in 3.3 and was > made > > the default when importing etree. But a regression (issue #16076) occurs > > because _elementtree.Element has no pickling support, while the Python > > version does by default (being a plain Python class). In the > aforementioned > > issue we're trying to resolve the strategy for supporting pickling for > > _elementtree.Element. I hope my understanding of unpickling is correct - > it > > does need a class name to find the module that can unpickle the object, > > right? If this is correct, then such a change (name of the type) may be > > required to solve the regression between 3.3 and 3.3.1 > > Yes, but you're probably going to have to do more than change the > class name in order for pickling to work for C types. > Yes, of course. All of that is already implemented in patches Daniel Shahaf has submitted to the issue. The "controversial" question here is whether it's valid to change the __module__ of the type between 3.3 and 3.3.1? Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at bytereef.org Thu Jan 3 20:17:23 2013 From: stefan at bytereef.org (Stefan Krah) Date: Thu, 3 Jan 2013 20:17:23 +0100 Subject: [Python-Dev] PyTypeObject type names in Modules/ In-Reply-To: References: Message-ID: <20130103191722.GA29796@sleipnir.bytereef.org> Eli Bendersky wrote: > Yes, of course.
All of that is already implemented in patches Daniel Shahaf has > submitted to the issue. The "controversial" question here is whether it's valid > to change the __module__ of the type between 3.3 and 3.3.1? In 3.2 and in the Python version of 3.3 we have this: >>> Element.__module__ 'xml.etree.ElementTree' So IMHO changing the C version to do the same in 3.3.1 is a compatibility fix. Stefan Krah From jcea at jcea.es Thu Jan 3 20:58:18 2013 From: jcea at jcea.es (Jesus Cea) Date: Thu, 03 Jan 2013 20:58:18 +0100 Subject: [Python-Dev] Posting frequent spurious changes in bugtracker Message-ID: <50E5E2DA.4090606@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 This happens to me very frequently. I get the notification about new issues open in the bugtracker. If I see an interesting "bug", I usually open a Firefox tab with it, to monitor it, decide if I will work on it in the future, whatever. When I have time to decide, I proceed. If I find the issue interesting but don't have the time to work on it, or somebody else is taking care of it, I just add myself to the nosy list, and close the tab. The problem with this is that even if I reload the tab, flags are usually stale and when I add myself to the nosy list, I revert flags to the old value. This happens frequently. Last time: I think the tracker should keep a "version" hidden field, and refuse to act if the version on the server is different from the just-posted changes. Or something. Or am I doing something fundamentally wrong? I refresh 99% of the time, but could forget sometimes, and some other times I need a "shift+reload" to bypass the browser cache. PS: Tracker tells me about conflicts occasionally, but most of the time the stale flags are not noticed. - -- Jesús Cea Avión _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ .
_/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iQCVAwUBUOXi2Zlgi5GaxT1NAQK6ewQAlCv0RKb3/y6tzdjggm4JqOjt0E9vYyZJ o0FIC+KBTrVtWrACiBOKCHPVsvJ1KB+6RzQ4FJdIyeOzG8Kv/E0HUNDufLGjr41C NCd4O8FN0tpHIfJkpC3PbkinyhZpviQL7YindAteb92sa+QFm8Rp+QCKBFRecMUX A7WLf+Gk8pE= =famI -----END PGP SIGNATURE----- From brett at python.org Thu Jan 3 21:13:37 2013 From: brett at python.org (Brett Cannon) Date: Thu, 3 Jan 2013 15:13:37 -0500 Subject: [Python-Dev] Posting frequent spurious changes in bugtracker In-Reply-To: <50E5E2DA.4090606@jcea.es> References: <50E5E2DA.4090606@jcea.es> Message-ID: On Thu, Jan 3, 2013 at 2:58 PM, Jesus Cea wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > This happens to me very frequently. > > I get the notification about new issues open in the bugtracker. If I > see an interesting "bug", I usually open a Firefox tab with it, to > monitor it, decide if I will work on it in the future, whatever. > > When I have time to decide, I proceed. If I find the issue interesting > but don't have the time to work on it, or somebody else is taking care > of it, I just add myself to the nosy list, and close the tab. > > The problem with this is that even if I reload the tab, flags are > usually stale and when I add myself to the nosy list, I revert flags > to the old value. > > This happens frequently. Last time: > > > I think tracker should keep a "version" hidden field, and refuse to > act if the version on server is different that the just posted > changes. Or something. > I think this should probably go upstream to Roundup if you want to see it implemented as it's a general thing and not specific to our instance. > > Or am I doing something fundamentally wrong?. 
I reflesh 99% of the > time, but could forget sometimes, and some other times I need a > "shift+reload" to bypass browser cache. > > Well, I would argue you are doing something fundamentally wrong by submitting anything without refreshing first. It is a form so technically nothing is being done incorrectly in changing values based on what you submit, whether you view them stale or not. If you want to keep this workflow going I would write a greasemonkey script which refreshes those tabs you leave open on occasion to avoid this issue. -Brett > PS: Tracker tells me about conflicts ocasionally, but most of the time > the stale flags are not noticed. > > - -- > Jes?s Cea Avi?n _/_/ _/_/_/ _/_/_/ > jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ > jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ > . _/_/ _/_/ _/_/ _/_/ _/_/ > "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ > "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ > "El amor es poner tu felicidad en la felicidad de otro" - Leibniz > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.10 (GNU/Linux) > Comment: Using GnuPG with undefined - http://www.enigmail.net/ > > iQCVAwUBUOXi2Zlgi5GaxT1NAQK6ewQAlCv0RKb3/y6tzdjggm4JqOjt0E9vYyZJ > o0FIC+KBTrVtWrACiBOKCHPVsvJ1KB+6RzQ4FJdIyeOzG8Kv/E0HUNDufLGjr41C > NCd4O8FN0tpHIfJkpC3PbkinyhZpviQL7YindAteb92sa+QFm8Rp+QCKBFRecMUX > A7WLf+Gk8pE= > =famI > -----END PGP SIGNATURE----- > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From v+python at g.nevcal.com Thu Jan 3 22:06:59 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Thu, 03 Jan 2013 13:06:59 -0800 Subject: [Python-Dev] Posting frequent spurious changes in bugtracker In-Reply-To: References: <50E5E2DA.4090606@jcea.es> Message-ID: <50E5F2F3.6030708@g.nevcal.com> On 1/3/2013 12:13 PM, Brett Cannon wrote: > It is a form so technically nothing is being done incorrectly in > changing values based on what you submit, whether you view them stale > or not. Well, it sounds like a pretty shaky technology foundation, if simultaneous updates of a shared data repository have race conditions. Certainly leaving a tab open for long periods of time exacerbates the issue, as it severely extends the definition of "simultaneous". Without that, the likelihood of people doing simultaneous updates is seriously reduced, except maybe for bugs with "hot" discussions. Jesus' suggestion of a hidden version field would help, but could be annoying for the case of someone writing a lengthy response, and having it discarded because the hidden version field is too old... so care would have to be taken to preserve such responses when doing the refresh... Another possible implementation would be to track which fields in the form are actually updated by a submitter... and reject a submission only if there was a simultaneous update to that field. Another possible implementation for fields like nosy, would be to display the current list, but provide boxes for additions and deletions, rather than allowing editing. Or maybe just a "remove me" button for deletions would suffice, with a box for additions. Then the processing would avoid adding duplicates. People shouldn't have to do heroic things with refreshing to maintain the consistency of the underlying data... database transaction technology has been around for quite a few years by now, and is well understood. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris.jerdonek at gmail.com Thu Jan 3 22:48:06 2013 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Thu, 3 Jan 2013 13:48:06 -0800 Subject: [Python-Dev] Posting frequent spurious changes in bugtracker In-Reply-To: <50E5F2F3.6030708@g.nevcal.com> References: <50E5E2DA.4090606@jcea.es> <50E5F2F3.6030708@g.nevcal.com> Message-ID: On Thu, Jan 3, 2013 at 1:06 PM, Glenn Linderman wrote: > Jesus' suggestion of a hidden version field would help, but could be > annoying for the case of someone writing a lengthy response, and having it > discarded because the hidden version field is too old... so care would have > to be taken to preserve such responses when doing the refresh... I seem to recall that this already happens in certain circumstances. If someone else posts a comment while you are composing a comment, the user interface notifies you with a red box when you try to submit that the thread has changed. It prevents the post but does preserve the text you are composing. --Chris From benjamin at python.org Thu Jan 3 23:43:39 2013 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 3 Jan 2013 17:43:39 -0500 Subject: [Python-Dev] PyTypeObject type names in Modules/ In-Reply-To: References: Message-ID: 2013/1/3 Eli Bendersky : > > > > On Thu, Jan 3, 2013 at 6:50 AM, Benjamin Peterson > wrote: >> >> 2013/1/3 Eli Bendersky : >> > etree has a C accelerator that was improved and extended in 3.3 and was >> > made >> > the default when importing etree. But a regression (issue #16076) occurs >> > because _elementree.Element has no pickling support, while the Python >> > version does by default (being a plain Python class). In the >> > aforementioned >> > issue we're trying to resolve the strategy for supporting pickling for >> > _elementtree.Element. I hope my understanding of unpickling is correct - >> > it >> > does need a class name to find the module that can unpickle the object, >> > right? 
If this is correct, than such a change (name of the type) may be >> > required to solve the regression between 3.3 and 3.3.1 >> >> Yes, but you're probably going to have to do more than change the >> class name in order for pickling to work for C types. > > > Yes, of course. All of that is already implemented in patches Daniel Shahaf > has submitted to the issue. The "controversial" question here is whether > it's valid to change the __module__ of the type between 3.3 and 3.3.1 ? Failure to pickle suddenly is more of a compatibility issue IMO. -- Regards, Benjamin From brett at python.org Thu Jan 3 23:43:51 2013 From: brett at python.org (Brett Cannon) Date: Thu, 3 Jan 2013 17:43:51 -0500 Subject: [Python-Dev] Posting frequent spurious changes in bugtracker In-Reply-To: <50E5F2F3.6030708@g.nevcal.com> References: <50E5E2DA.4090606@jcea.es> <50E5F2F3.6030708@g.nevcal.com> Message-ID: On Thu, Jan 3, 2013 at 4:06 PM, Glenn Linderman wrote: > On 1/3/2013 12:13 PM, Brett Cannon wrote: > > It is a form so technically nothing is being done incorrectly in changing > values based on what you submit, whether you view them stale or not. > > > Well, it sounds like a pretty shaky technology foundation, if simultaneous > updates of a shared data repository have race conditions. > > Certainly leaving a tab open for long periods of time exacerbates the > issue, as it severely extends the definition of "simultaneous". Without > that, the likelihood of people doing simultaneous updates is seriously > reduced, except maybe for bugs with "hot" discussions. > > I would argue it is no longer simultaneous and thus not a race condition. You can't consider a POST transactional based on when the HTTP GET request for the form completed to when the POST finally occurs. > Jesus' suggestion of a hidden version field would help, but could be > annoying for the case of someone writing a lengthy response, and having it > discarded because the hidden version field is too old... 
so care would have > to be taken to preserve such responses when doing the refresh... > As I said, this belongs upstream in Roundup and not directly in our roundup instance for a proper fix. This is beyond schema and it is heading into low-level Roundup POST functionality. -Brett > Another possible implementation would be to track which fields in the form > are actually updated by a submitter... and reject a submission only if > there was a simultaneous update to that field. > > Another possible implementation for fields like nosy, would be to display > the current list, but provide boxes for additions and deletions, rather > than allowing editing. Or maybe just a "remove me" button for deletions > would suffice, with a box for additions. Then the processing would avoid > adding duplicates. > > People shouldn't have to do heroic things with refreshing to maintain the > consistency of the underlying data... database transaction technology has > been around for quite a few years by now, and is well understood. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From v+python at g.nevcal.com Fri Jan 4 00:03:45 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Thu, 03 Jan 2013 15:03:45 -0800 Subject: [Python-Dev] Posting frequent spurious changes in bugtracker In-Reply-To: References: <50E5E2DA.4090606@jcea.es> <50E5F2F3.6030708@g.nevcal.com> Message-ID: <50E60E51.8080204@g.nevcal.com> On 1/3/2013 2:43 PM, Brett Cannon wrote: > > > > On Thu, Jan 3, 2013 at 4:06 PM, Glenn Linderman > wrote: > > On 1/3/2013 12:13 PM, Brett Cannon wrote: >> It is a form so technically nothing is being done incorrectly in >> changing values based on what you submit, whether you view them >> stale or not. > > Well, it sounds like a pretty shaky technology foundation, if > simultaneous updates of a shared data repository have race conditions.
> > Certainly leaving a tab open for long periods of time exacerbates > the issue, as it severely extends the definition of > "simultaneous". Without that, the likelihood of people doing > simultaneous updates is seriously reduced, except maybe for bugs > with "hot" discussions. > > > I would argue it is no longer simultaneous and thus not a race > condition. You can't consider a POST transactional based on when the > HTTP GET request for the form completed to when the POST finally occurs. Argue away. it still is a race condition, just a slow race. HTTP is stateless, not transactional, agreed, but when a transaction system is implemented on top of HTTP, it needs to implement transactional semantics itself... or else it is a pretty shaky foundation for a database. The POST is an update, the prior GET started the transaction. This is just the definition of a transaction, so any arguments you have can only claim that Roundup doesn't implement transactions, and the bug tracker is not a database... and if you are right, then it is a pretty shaky foundation for a bug tracker database. I think that is what I said in the first place :) > Jesus' suggestion of a hidden version field would help, but could > be annoying for the case of someone writing a lengthy response, > and having it discarded because the hidden version field is too > old... so care would have to be taken to preserve such responses > when doing the refresh... > > > As I said, this belongs upstream in Roundup and not directly in our > roundup instance for a proper fix. This is beyond schema and it > heading into low-level Roundup POST functionality. Agreed, the fix, if to be done correctly, belongs in Roundup, from what I understand of how things are implemented. > -Brett > > > Another possible implementation would be to track which fields in > the form are actually updated by a submitter... and reject a > submission only if there was a simultaneous update to that field. 
> > Another possible implementation for fields like nosy, would be to > display the current list, but provide boxes for additions and > deletions, rather than allowing editing. Or maybe just a "remove > me" button for deletions would suffice, with a box for additions. > Then the processing would avoid adding duplicates. > > People shouldn't have to do heroic things with refreshing to > maintain the consistency of the underlying data... database > transaction technology has been around for quite a few years by > now, and is well understood. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Fri Jan 4 04:20:37 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 04 Jan 2013 12:20:37 +0900 Subject: [Python-Dev] Posting frequent spurious changes in bugtracker In-Reply-To: References: <50E5E2DA.4090606@jcea.es> Message-ID: <87txqxbpmi.fsf@uwakimon.sk.tsukuba.ac.jp> Now that Jesus has provided a heads-up, detailed discussion belongs on the metatracker IMO. Warning: Reply-To set to tracker-discuss at python.org. metatracker workers: Created issue500. More info: see this thread: http://mail.python.org/pipermail/python-dev/2013-January/123437.html Brett Cannon writes: > > Or am I doing something fundamentally wrong?. I reflesh 99% of the > > time, but could forget sometimes, and some other times I need a > > "shift+reload" to bypass browser cache. > > > Well, I would argue you are doing something fundamentally wrong by > submitting anything without refreshing first. It is a form so technically > nothing is being done incorrectly in changing values based on what you > submit, whether you view them stale or not. That's beside the point, Brett. First, his intent is to refresh, so he's doing his best - he merely admits he's human, and would like help from the machine. But the nub of his report is the fact that nothing he *can* do seems to do the whole job of refreshing. 
Nor is he complaining about the form processing at the HTML/HTTP level. The limits of the HTTP transport don't justify the defects of the database backend, although they are likely to cause inconvenience to users (ie, the easy patch is to block the whole update, rather than update only the fields the user has explicitly changed). I need to update XEmacs's tracker anyway, I'll investigate whether upstream has a fix for this, and if not, see if I can come up with a patch for this while I'm updating. Technically it shouldn't be hard to implement Jesus's suggestion, and at least reject potential spurious updates (the tracker db has a timestamp on every change). metatracker workers: no ETA on this, feel free to ping me for updates or prod me to get to work. From trent at snakebite.org Fri Jan 4 11:48:00 2013 From: trent at snakebite.org (Trent Nelson) Date: Fri, 4 Jan 2013 05:48:00 -0500 Subject: [Python-Dev] PandaBoard, Raspberry Pi coming to Buildbot fleet In-Reply-To: References: Message-ID: <20130104104759.GA11908@snakebite.org> On Thu, Dec 20, 2012 at 09:33:27AM -0800, Brian Curtin wrote: > Last week in Raymond's dictionary thread, the topic of ARM came up, > along with the relative lack of build slave coverage. Today Trent > Nelson received the PandaBoard purchased by the PSF, and a Raspberry > Pi should be coming shortly as well. > > http://blog.python.org/2012/12/pandaboard-raspberry-pi-coming-to.html > > Thanks to the PSF for purchasing and thanks to Trent for offering to > host them in Snakebite! The installation of Ubuntu on the Pandaboard went smoothly. However, it crashes after about an hour. Console output: http://trent.snakebite.net/pandaboard-crash.txt Any ARM wizards out there with suggestions? Trent. 
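An aside on the bug-tracker thread above: the "hidden version field" that Jesus proposed, and Stephen offered to implement with the tracker's per-change timestamps, is classic optimistic concurrency control. The sketch below is a hypothetical illustration of the idea in Python; the class and method names are invented for the example and this is not actual Roundup code.

```python
# Hypothetical sketch of the "hidden version field" idea: the rendered
# form carries the record version it was generated from, and a stale
# submission is rejected instead of silently reverting newer changes.

class ConflictError(Exception):
    pass

class Issue:
    def __init__(self, **fields):
        self.fields = dict(fields)
        self.version = 0              # bumped on every successful update

    def render_form(self):
        # The form embeds the version it was rendered from.
        return {"version": self.version, **self.fields}

    def submit(self, form):
        if form["version"] != self.version:
            # Someone else updated the issue after this form was rendered.
            raise ConflictError("stale form; please reload the issue page")
        self.fields.update({k: v for k, v in form.items() if k != "version"})
        self.version += 1

issue = Issue(status="open", nosy=["eli"])
form_a = issue.render_form()          # two browser tabs, same version
form_b = issue.render_form()

form_a["nosy"] = form_a["nosy"] + ["jcea"]
issue.submit(form_a)                  # first submit wins, version becomes 1

form_b["status"] = "closed"
try:
    issue.submit(form_b)              # stale version 0 -> rejected
except ConflictError:
    pass                              # user is asked to reload instead
```

A finer-grained variant, along the lines Glenn suggested, would compare versions only for the fields the submitter actually changed, so an unrelated comment does not invalidate a nosy-list addition.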
From trent at snakebite.org Fri Jan 4 13:10:16 2013 From: trent at snakebite.org (Trent Nelson) Date: Fri, 4 Jan 2013 07:10:16 -0500 Subject: [Python-Dev] PandaBoard, Raspberry Pi coming to Buildbot fleet In-Reply-To: <20130104104759.GA11908@snakebite.org> References: <20130104104759.GA11908@snakebite.org> Message-ID: <20130104121016.GB12364@snakebite.org> On Fri, Jan 04, 2013 at 02:48:00AM -0800, Trent Nelson wrote: > On Thu, Dec 20, 2012 at 09:33:27AM -0800, Brian Curtin wrote: > > Last week in Raymond's dictionary thread, the topic of ARM came up, > > along with the relative lack of build slave coverage. Today Trent > > Nelson received the PandaBoard purchased by the PSF, and a Raspberry > > Pi should be coming shortly as well. > > > > http://blog.python.org/2012/12/pandaboard-raspberry-pi-coming-to.html > > > > Thanks to the PSF for purchasing and thanks to Trent for offering to > > host them in Snakebite! > > The installation of Ubuntu on the Pandaboard went smoothly. > However, it crashes after about an hour. Console output: > > http://trent.snakebite.net/pandaboard-crash.txt > > Any ARM wizards out there with suggestions? Forgot to mention, it's accessible via the p1 alias from the Snakebite menu. Until it crashes, anyway :-) If anyone wants root access to poke around, let me know. Trent. From victor.stinner at gmail.com Fri Jan 4 14:06:22 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 4 Jan 2013 14:06:22 +0100 Subject: [Python-Dev] PandaBoard, Raspberry Pi coming to Buildbot fleet In-Reply-To: <20130104104759.GA11908@snakebite.org> References: <20130104104759.GA11908@snakebite.org> Message-ID: 2013/1/4 Trent Nelson : > The installation of Ubuntu on the Pandaboard went smoothly. > However, it crashes after about an hour. Console output: > > http://trent.snakebite.net/pandaboard-crash.txt > > Any ARM wizards out there with suggestions? 
The bug was already reported to Ubuntu: https://bugs.launchpad.net/ubuntu/+source/linux-meta-ti-omap4/+bug/1012735 The issue contains a workaround. Victor From trent at snakebite.org Fri Jan 4 14:35:58 2013 From: trent at snakebite.org (Trent Nelson) Date: Fri, 4 Jan 2013 08:35:58 -0500 Subject: [Python-Dev] PandaBoard, Raspberry Pi coming to Buildbot fleet In-Reply-To: References: <20130104104759.GA11908@snakebite.org> Message-ID: <20130104133558.GA13662@snakebite.org> On Fri, Jan 04, 2013 at 05:06:22AM -0800, Victor Stinner wrote: > 2013/1/4 Trent Nelson : > > The installation of Ubuntu on the Pandaboard went smoothly. > > However, it crashes after about an hour. Console output: > > > > http://trent.snakebite.net/pandaboard-crash.txt > > > > Any ARM wizards out there with suggestions? > > The bug was already reported to Ubuntu: > https://bugs.launchpad.net/ubuntu/+source/linux-meta-ti-omap4/+bug/1012735 Ah! Thanks. > > The issue contains a workaround. [root at manganese/ttypts/0(~)#] echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor [root at manganese/ttypts/0(~)#] echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor Doesn't get more Linuxy than that :P Trent. From solipsis at pitrou.net Fri Jan 4 15:16:04 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 4 Jan 2013 15:16:04 +0100 Subject: [Python-Dev] PandaBoard, Raspberry Pi coming to Buildbot fleet References: <20130104104759.GA11908@snakebite.org> <20130104133558.GA13662@snakebite.org> Message-ID: <20130104151604.5ec455ba@pitrou.net> Le Fri, 4 Jan 2013 08:35:58 -0500, Trent Nelson a ?crit : > On Fri, Jan 04, 2013 at 05:06:22AM -0800, Victor Stinner wrote: > > 2013/1/4 Trent Nelson : > > > The installation of Ubuntu on the Pandaboard went smoothly. > > > However, it crashes after about an hour. Console output: > > > > > > http://trent.snakebite.net/pandaboard-crash.txt > > > > > > Any ARM wizards out there with suggestions? 
> > > > The bug was already reported to Ubuntu: > > https://bugs.launchpad.net/ubuntu/+source/linux-meta-ti-omap4/+bug/1012735 > > Ah! Thanks. > > > > The issue contains a workaround. > > [root at manganese/ttypts/0(~)#] echo performance > > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor > > [root at manganese/ttypts/0(~)#] echo performance > > > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor > > Doesn't get more Linuxy than that :P You could probably have written: cpu-freq-set -rg performance Regards Antoine. From rovitotv at gmail.com Fri Jan 4 16:04:14 2013 From: rovitotv at gmail.com (Todd V Rovito) Date: Fri, 4 Jan 2013 10:04:14 -0500 Subject: [Python-Dev] Please review simple patch for IDLE documentation last updated 11/28/2012 Message-ID: <2629F645-7CD7-4990-83CB-D837D983E3FC@gmail.com> Greetings, I submitted a simple patch for updates to IDLE's documentation http://bugs.python.org/issue5066 and it has not been reviewed by an official Python developer. Since it is now been over 1month since last update can somebody please review and or commit? I did get some comments from another contributor and updated the patch based on those comments. The original bug was opened in 2009 and the IDLE documentation is out of date with undocumented menu options and inconsistencies between the help file and the HTML documentation. I hate to be pushy but I would like to see some progress made on IDLE. Thanks for the support. Sent from my iPad -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdmurray at bitdance.com Fri Jan 4 16:23:01 2013 From: rdmurray at bitdance.com (R. 
David Murray) Date: Fri, 04 Jan 2013 10:23:01 -0500 Subject: [Python-Dev] Please review simple patch for IDLE documentation last updated 11/28/2012 In-Reply-To: <2629F645-7CD7-4990-83CB-D837D983E3FC@gmail.com> References: <2629F645-7CD7-4990-83CB-D837D983E3FC@gmail.com> Message-ID: <20130104152302.532462500B2@webabinitio.net> On Fri, 04 Jan 2013 10:04:14 -0500, Todd V Rovito wrote: > I submitted a simple patch for updates to IDLE's documentation > http://bugs.python.org/issue5066 and it has not been reviewed by > an official Python developer. Since it is now been over 1month > since last update can somebody please review and or commit? I > did get some comments from another contributor and updated the > patch based on those comments. The original bug was opened in > 2009 and the IDLE documentation is out of date with undocumented > menu options and inconsistencies between the help file and the > HTML documentation. I hate to be pushy but I would like to see > some progress made on IDLE. Thanks for the support. IMO it isn't pushy to ask about an issue that seems ready but on which no action has been taken for an extended period. It could be that the nosy committers have just forgotten about it. In the future, it best way to approach this situation (patch that seems ready but no action has been taken) is to ping the issue first, and if you don't get a response after a few days, to post here as you did. --David From rovitotv at gmail.com Fri Jan 4 16:40:10 2013 From: rovitotv at gmail.com (Todd V Rovito) Date: Fri, 4 Jan 2013 10:40:10 -0500 Subject: [Python-Dev] Please review simple patch for IDLE documentation last updated 11/28/2012 In-Reply-To: <20130104152302.532462500B2@webabinitio.net> References: <2629F645-7CD7-4990-83CB-D837D983E3FC@gmail.com> <20130104152302.532462500B2@webabinitio.net> Message-ID: <67A4979F-806C-4677-AC3E-EB9171C406C8@gmail.com> On Jan 4, 2013, at 10:23 AM, "R. 
David Murray" wrote: > On Fri, 04 Jan 2013 10:04:14 -0500, Todd V Rovito wrote: >> I submitted a simple patch for updates to IDLE's documentation >> http://bugs.python.org/issue5066 and it has not been reviewed by >> an official Python developer. Since it is now been over 1month >> since last update can somebody please review and or commit? I >> did get some comments from another contributor and updated the >> patch based on those comments. The original bug was opened in >> 2009 and the IDLE documentation is out of date with undocumented >> menu options and inconsistencies between the help file and the >> HTML documentation. I hate to be pushy but I would like to see >> some progress made on IDLE. Thanks for the support. > > IMO it isn't pushy to ask about an issue that seems ready but on which > no action has been taken for an extended period. It could be that the > nosy committers have just forgotten about it. > > In the future, it best way to approach this situation (patch that seems > ready but no action has been taken) is to ping the issue first, and if > you don't get a response after a few days, to post here as you did. Good suggestion, I will "ping" the bug report! 
URL: From rdmurray at bitdance.com Fri Jan 4 17:51:02 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Fri, 04 Jan 2013 11:51:02 -0500 Subject: [Python-Dev] Please review simple patch for IDLE documentation last updated 11/28/2012 In-Reply-To: <67A4979F-806C-4677-AC3E-EB9171C406C8@gmail.com> References: <2629F645-7CD7-4990-83CB-D837D983E3FC@gmail.com> <20130104152302.532462500B2@webabinitio.net> <67A4979F-806C-4677-AC3E-EB9171C406C8@gmail.com> Message-ID: <20130104165104.1E1752500B2@webabinitio.net> On Fri, 04 Jan 2013 10:40:10 -0500, Todd V Rovito wrote: > On Jan 4, 2013, at 10:23 AM, "R. David Murray" wrote: > > > On Fri, 04 Jan 2013 10:04:14 -0500, Todd V Rovito wrote: > >> I submitted a simple patch for updates to IDLE's documentation > >> http://bugs.python.org/issue5066 and it has not been reviewed by > >> an official Python developer. Since it is now been over 1month > >> since last update can somebody please review and or commit? I > >> did get some comments from another contributor and updated the > >> patch based on those comments. The original bug was opened in > >> 2009 and the IDLE documentation is out of date with undocumented > >> menu options and inconsistencies between the help file and the > >> HTML documentation. I hate to be pushy but I would like to see > >> some progress made on IDLE. Thanks for the support. > > > > IMO it isn't pushy to ask about an issue that seems ready but on which > > no action has been taken for an extended period. It could be that the > > nosy committers have just forgotten about it. > > > > In the future, it best way to approach this situation (patch that seems > > ready but no action has been taken) is to ping the issue first, and if > > you don't get a response after a few days, to post here as you did. > > Good suggestion I will "ping" the bug report! 
FYI....the excellent > developer's guide http://docs.python.org/devguide/patch.html#reviewing > states "Getting your patch reviewed requires a reviewer to have the > spare time and motivation to look at your patch (we cannot force > anyone to review patches). If your patch has not received any notice > from reviewers (i.e., no comment made) after a substantial amount of > time then you may email python-dev at python.org asking for someone to > take a look at your patch." > > The Developer's Guide does not mention pinging the patch; maybe that is > common sense so it doesn't need to be documented, but I had assumed > somebody was watching bugs that had patches attached but no commits. Adding that to the developers guide sounds like a good idea. You could open a new issue in the tracker with that suggestion :). No, we have no automated monitoring of anything in the bug tracker (other than the weekly issues summary). It is all up to whatever some volunteer chooses to spend time doing. If someone *did* want to take on that role, that would be fantastic. (To automate such monitoring we would need some sort of 'commit ready' flag in the tracker and a protocol for when it gets set...which might not be a bad idea, but would need to be discussed....) --David From storchaka at gmail.com Fri Jan 4 17:56:22 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 04 Jan 2013 18:56:22 +0200 Subject: [Python-Dev] Please review simple patch for IDLE documentation last updated 11/28/2012 In-Reply-To: <20130104165104.1E1752500B2@webabinitio.net> References: <2629F645-7CD7-4990-83CB-D837D983E3FC@gmail.com> <20130104152302.532462500B2@webabinitio.net> <67A4979F-806C-4677-AC3E-EB9171C406C8@gmail.com> <20130104165104.1E1752500B2@webabinitio.net> Message-ID: On 04.01.13 18:51, R. 
David Murray wrote: > (To automate such monitoring we would need some sort of 'commit ready' > flag in the tracker and a protocol for when it gets set...which might > not be a bad idea, but would need to be discussed....) Is not "commit review" stage purposed for this? From status at bugs.python.org Fri Jan 4 18:07:16 2013 From: status at bugs.python.org (Python tracker) Date: Fri, 4 Jan 2013 18:07:16 +0100 (CET) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20130104170716.C44B91CA7F@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2012-12-28 - 2013-01-04) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 3817 (-17) closed 24823 (+78) total 28640 (+61) Open issues with patches: 1667 Issues opened (44) ================== #8952: Doc/c-api/arg.rst: fix documentation of number formats http://bugs.python.org/issue8952 reopened by georg.brandl #16748: Make CPython test package discoverable http://bugs.python.org/issue16748 reopened by serhiy.storchaka #16803: Make test_importlib run tests under both _frozen_importlib and http://bugs.python.org/issue16803 opened by brett.cannon #16804: python3 -S -m site fails http://bugs.python.org/issue16804 opened by pitrou #16805: when building docs on Debian 7 --> ERROR: Error in "note" dire http://bugs.python.org/issue16805 opened by tshepang #16806: col_offset is -1 and lineno is wrong for multiline string expr http://bugs.python.org/issue16806 opened by carsten.klein at axn-software.de #16807: argparse group nesting lost on inheritance http://bugs.python.org/issue16807 opened by Dougal.Sutherland #16808: inspect.stack() should return list of named tuples http://bugs.python.org/issue16808 opened by danielsh #16809: Tk 8.6.0 introduces TypeError. 
(Tk 8.5.13 works) http://bugs.python.org/issue16809 opened by serwy #16811: email.message.Message flatten dies of list index out of range http://bugs.python.org/issue16811 opened by HJarausch #16812: os.symlink can return wrong FileExistsError/WindowsError infor http://bugs.python.org/issue16812 opened by IAmTheClaw #16817: test___all__ affects other tests by doing too much importing http://bugs.python.org/issue16817 opened by eli.bendersky #16821: bundlebuilder broken in 2.7 http://bugs.python.org/issue16821 opened by barry-scott #16822: execv (et al.) should invoke atexit handlers before executing http://bugs.python.org/issue16822 opened by nedbat #16823: Python crashes on running tkinter code with threads http://bugs.python.org/issue16823 opened by Sarbjit.singh #16826: Don't check for PYTHONCASEOK if interpreter started with -E http://bugs.python.org/issue16826 opened by brett.cannon #16827: Remove the relatively advanced content from section 2 in tutor http://bugs.python.org/issue16827 opened by ramchandra.apte #16829: IDLE on POSIX can't print filenames with spaces http://bugs.python.org/issue16829 opened by Rod.Nayfield #16830: Add skip_host and skip_accept_encoding to httplib/http.client http://bugs.python.org/issue16830 opened by kanzure #16831: Better docs for ABCMeta.__subclasshook__ http://bugs.python.org/issue16831 opened by ncoghlan #16832: Expose cache validity checking support in ABCMeta http://bugs.python.org/issue16832 opened by ncoghlan #16834: ioctl mutate_flag behavior in regard to the buffer size limit http://bugs.python.org/issue16834 opened by Yuval.Weinbaum #16835: Update PEP 399 to allow for test discovery http://bugs.python.org/issue16835 opened by brett.cannon #16836: configure script disables support for IPv6 on a system where I http://bugs.python.org/issue16836 opened by schmir #16837: Number ABC can be instantiated http://bugs.python.org/issue16837 opened by belopolsky #16840: Tkinter doesn't support large integers 
http://bugs.python.org/issue16840 opened by serhiy.storchaka #16842: Allow to override a function signature for pydoc with a docstr http://bugs.python.org/issue16842 opened by serhiy.storchaka #16843: sporadic test_sched failure http://bugs.python.org/issue16843 opened by pitrou #16845: warnings.simplefilter should validate input http://bugs.python.org/issue16845 opened by seberg #16846: relative import solution http://bugs.python.org/issue16846 opened by Fixpythonbugs #16848: Mac OS X: python-config --ldflags and location of Python.frame http://bugs.python.org/issue16848 opened by samueljohn #16849: Element.{get,iter} doesn't handle keyword arguments when using http://bugs.python.org/issue16849 opened by kushou #16850: Atomic open + close-and-exec http://bugs.python.org/issue16850 opened by haypo #16851: ismethod and isfunction methods error http://bugs.python.org/issue16851 opened by wdanilo #16852: Fix test discovery for test_genericpath.py http://bugs.python.org/issue16852 opened by zach.ware #16853: add a Selector to the select module http://bugs.python.org/issue16853 opened by neologix #16854: usage() is not defined in Lib/test/regrtest.py http://bugs.python.org/issue16854 opened by serhiy.storchaka #16855: traceback module leaves off module name in last line of format http://bugs.python.org/issue16855 opened by waltermundt #16858: tarfile silently hides errors http://bugs.python.org/issue16858 opened by mmarkk #16859: tarfile.TarInfo.fromtarfile does not check read() return value http://bugs.python.org/issue16859 opened by mmarkk #16860: Use O_CLOEXEC in the tempfile module http://bugs.python.org/issue16860 opened by haypo #16861: Portion of code example not highlighted in collections doc http://bugs.python.org/issue16861 opened by ramchandra.apte #16862: FAQ has outdated information about Stackless http://bugs.python.org/issue16862 opened by ramchandra.apte #16863: Argparse tutorial outdated http://bugs.python.org/issue16863 opened by Charlie.Dimino Most 
recent 15 issues with no replies (15) ========================================== #16861: Portion of code example not highlighted in collections doc http://bugs.python.org/issue16861 #16859: tarfile.TarInfo.fromtarfile does not check read() return value http://bugs.python.org/issue16859 #16855: traceback module leaves off module name in last line of format http://bugs.python.org/issue16855 #16852: Fix test discovery for test_genericpath.py http://bugs.python.org/issue16852 #16845: warnings.simplefilter should validate input http://bugs.python.org/issue16845 #16834: ioctl mutate_flag behavior in regard to the buffer size limit http://bugs.python.org/issue16834 #16832: Expose cache validity checking support in ABCMeta http://bugs.python.org/issue16832 #16823: Python crashes on running tkinter code with threads http://bugs.python.org/issue16823 #16807: argparse group nesting lost on inheritance http://bugs.python.org/issue16807 #16803: Make test_importlib run tests under both _frozen_importlib and http://bugs.python.org/issue16803 #16795: Patch: some changes to AST to make it more useful for static l http://bugs.python.org/issue16795 #16786: argparse doesn't offer localization interface for "version" ac http://bugs.python.org/issue16786 #16776: Document PyCFunction_New and PyCFunction_NewEx functions http://bugs.python.org/issue16776 #16762: test_subprocess failure on OpenBSD/NetBSD buildbots http://bugs.python.org/issue16762 #16754: Incorrect shared library extension on linux http://bugs.python.org/issue16754 Most recent 15 issues waiting for review (15) ============================================= #16860: Use O_CLOEXEC in the tempfile module http://bugs.python.org/issue16860 #16853: add a Selector to the select module http://bugs.python.org/issue16853 #16852: Fix test discovery for test_genericpath.py http://bugs.python.org/issue16852 #16850: Atomic open + close-and-exec http://bugs.python.org/issue16850 #16849: Element.{get,iter} doesn't handle keyword arguments 
when using http://bugs.python.org/issue16849 #16843: sporadic test_sched failure http://bugs.python.org/issue16843 #16840: Tkinter doesn't support large integers http://bugs.python.org/issue16840 #16837: Number ABC can be instantiated http://bugs.python.org/issue16837 #16836: configure script disables support for IPv6 on a system where I http://bugs.python.org/issue16836 #16835: Update PEP 399 to allow for test discovery http://bugs.python.org/issue16835 #16830: Add skip_host and skip_accept_encoding to httplib/http.client http://bugs.python.org/issue16830 #16826: Don't check for PYTHONCASEOK if interpreter started with -E http://bugs.python.org/issue16826 #16808: inspect.stack() should return list of named tuples http://bugs.python.org/issue16808 #16807: argparse group nesting lost on inheritance http://bugs.python.org/issue16807 #16806: col_offset is -1 and lineno is wrong for multiline string expr http://bugs.python.org/issue16806 Top 10 most discussed issues (10) ================================= #16076: xml.etree.ElementTree.Element and xml.etree.ElementTree.TreeBu http://bugs.python.org/issue16076 20 msgs #16748: Make CPython test package discoverable http://bugs.python.org/issue16748 18 msgs #16801: Preserve original representation for integers / floats in docs http://bugs.python.org/issue16801 17 msgs #16850: Atomic open + close-and-exec http://bugs.python.org/issue16850 13 msgs #16805: when building docs on Debian 7 --> ERROR: Error in "note" dire http://bugs.python.org/issue16805 11 msgs #14621: Hash function is not randomized properly http://bugs.python.org/issue14621 9 msgs #16853: add a Selector to the select module http://bugs.python.org/issue16853 9 msgs #16047: Tools/freeze no longer works in Python 3 http://bugs.python.org/issue16047 7 msgs #16061: performance regression in string replace for 3.3 http://bugs.python.org/issue16061 7 msgs #16591: RUNSHARED wrong for OSX no framework http://bugs.python.org/issue16591 7 msgs Issues closed (74) 
================== #2405: Drop w9xpopen and all dependencies http://bugs.python.org/issue2405 closed by brian.curtin #7320: Unable to load external modules on build slave with debug pyth http://bugs.python.org/issue7320 closed by brian.curtin #7651: Python3: guess text file charset using the BOM http://bugs.python.org/issue7651 closed by haypo #9561: distutils: set encoding to utf-8 for input and output files http://bugs.python.org/issue9561 closed by haypo #9586: "warning: comparison between pointer and integer" in multiproc http://bugs.python.org/issue9586 closed by sbt #9644: PEP 383: os.statvfs() does not accept surrogateescape argument http://bugs.python.org/issue9644 closed by haypo #10527: multiprocessing.Pipe problem: "handle out of range in select() http://bugs.python.org/issue10527 closed by giampaolo.rodola #10657: os.lstat/os.stat/os.fstat don't set st_dev (st_rdev) on Window http://bugs.python.org/issue10657 closed by brian.curtin #10848: Move test.regrtest from getopt to argparse http://bugs.python.org/issue10848 closed by brett.cannon #10951: gcc 4.6 warnings http://bugs.python.org/issue10951 closed by haypo #11242: urllib.request.url_open() doesn't support SSLContext http://bugs.python.org/issue11242 closed by haypo #11726: linecache becomes specific to Python scripts in Python 3 http://bugs.python.org/issue11726 closed by haypo #11769: test_notify() of test_threading hang on "x86 XP-4 3.x": http://bugs.python.org/issue11769 closed by haypo #11925: test_ttk_guionly.test_traversal() failed on "x86 Windows7 3.x" http://bugs.python.org/issue11925 closed by haypo #12103: Document how to use open with os.O_CLOEXEC http://bugs.python.org/issue12103 closed by christian.heimes #12112: The new packaging module should not use the locale encoding http://bugs.python.org/issue12112 closed by haypo #12309: os.environ was modified by test_packaging http://bugs.python.org/issue12309 closed by haypo #12336: tkinter.test.test_ttk.test_widgets.test_select() failure on 
Fr http://bugs.python.org/issue12336 closed by haypo #12453: test_import.test_failing_import_sticks() sporadic failures http://bugs.python.org/issue12453 closed by haypo #12466: test_subprocess.test_close_fds() sporadic failures on Mac OS X http://bugs.python.org/issue12466 closed by haypo #12495: Rewrite InterProcessSignalTests http://bugs.python.org/issue12495 closed by haypo #12858: crypt.mksalt: use ssl.RAND_pseudo_bytes() if available http://bugs.python.org/issue12858 closed by haypo #13059: Sporadic test_multiprocessing failure: IOError("bad message le http://bugs.python.org/issue13059 closed by haypo #13100: sre_compile._optimize_unicode() needs a cleanup http://bugs.python.org/issue13100 closed by haypo #13157: Build Python outside the source directory http://bugs.python.org/issue13157 closed by haypo #13209: Refactor code using unicode_encode_call_errorhandler() in unic http://bugs.python.org/issue13209 closed by haypo #13256: Document and test new socket options http://bugs.python.org/issue13256 closed by haypo #13453: Tests and network timeouts http://bugs.python.org/issue13453 closed by haypo #14057: Speedup sysconfig startup http://bugs.python.org/issue14057 closed by haypo #14372: Fix all invalid usage of borrowed references http://bugs.python.org/issue14372 closed by haypo #15516: exception-handling bug in PyString_Format http://bugs.python.org/issue15516 closed by python-dev #15803: Incorrect docstring on ConfigParser.items() http://bugs.python.org/issue15803 closed by lukasz.langa #16009: Json error messages could provide more information about the e http://bugs.python.org/issue16009 closed by ezio.melotti #16165: sched.scheduler.run() blocks scheduler http://bugs.python.org/issue16165 closed by serhiy.storchaka #16208: getaddrinfo returns wrong results if IPv6 is disabled http://bugs.python.org/issue16208 closed by haypo #16281: TODO in tailmatch(): it does not support backward in all cases http://bugs.python.org/issue16281 closed by python-dev 
#16367: io.FileIO.readall() is not 64-bit safe on Windows http://bugs.python.org/issue16367 closed by haypo #16455: Decode command line arguments from ASCII on FreeBSD and Solari http://bugs.python.org/issue16455 closed by haypo #16485: FD leaks in aifc module http://bugs.python.org/issue16485 closed by serhiy.storchaka #16486: Add context manager support to aifc module http://bugs.python.org/issue16486 closed by serhiy.storchaka #16541: tk_setPalette doesn't accept keyword parameters http://bugs.python.org/issue16541 closed by serhiy.storchaka #16640: Less code under lock in sched.scheduler http://bugs.python.org/issue16640 closed by serhiy.storchaka #16641: sched.scheduler.enter arguments should not be modifiable http://bugs.python.org/issue16641 closed by serhiy.storchaka #16642: Mention new "kwargs" named tuple parameter in sched module http://bugs.python.org/issue16642 closed by serhiy.storchaka #16645: Wrong test_extract_hardlink() in test_tarfile.py http://bugs.python.org/issue16645 closed by serhiy.storchaka #16674: Faster getrandbits() for small integers http://bugs.python.org/issue16674 closed by serhiy.storchaka #16688: Backreferences make case-insensitive regex fail on non-ASCII s http://bugs.python.org/issue16688 closed by serhiy.storchaka #16747: Remove 'file' type reference from 'iterable' glossary entry http://bugs.python.org/issue16747 closed by ezio.melotti #16749: Fatal Python error: PyEval_RestoreThread: NULL tstate http://bugs.python.org/issue16749 closed by pitrou #16761: Fix int(base=X) http://bugs.python.org/issue16761 closed by serhiy.storchaka #16782: No curses.initwin: Incorrect package docstring for curses http://bugs.python.org/issue16782 closed by asvetlov #16783: sqlite3 accepts strings it cannot (by default) return http://bugs.python.org/issue16783 closed by terry.reedy #16787: asyncore.dispatcher_with_send - increase the send buffer size http://bugs.python.org/issue16787 closed by neologix #16798: DTD not checked 
http://bugs.python.org/issue16798 closed by terry.reedy #16810: inconsistency in weekday http://bugs.python.org/issue16810 closed by v+python #16813: use paths relative to CPython root in documentation building i http://bugs.python.org/issue16813 closed by chris.jerdonek #16814: use --directory option of make in describing how to build the http://bugs.python.org/issue16814 closed by georg.brandl #16815: Is all OK!! http://bugs.python.org/issue16815 closed by hynek #16816: Bug in hash randomization http://bugs.python.org/issue16816 closed by benjamin.peterson #16818: Couple of mistakes in PEP 431 http://bugs.python.org/issue16818 closed by ezio.melotti #16819: IDLE b"" method completion incorrect http://bugs.python.org/issue16819 closed by serhiy.storchaka #16820: configparser.ConfigParser.clean and .update bugs http://bugs.python.org/issue16820 closed by lukasz.langa #16824: typo in test http://bugs.python.org/issue16824 closed by serhiy.storchaka #16825: all OK!!! http://bugs.python.org/issue16825 closed by neologix #16828: bz2 error on compression of empty string http://bugs.python.org/issue16828 closed by nadeem.vawda #16833: http.client delayed ack / Nagle algorithm optimisation perform http://bugs.python.org/issue16833 closed by pitrou #16838: fail to "from locale import getpreferredencoding" when there's http://bugs.python.org/issue16838 closed by Cravix #16839: segmentation fault when unicode(classic_class_instance) http://bugs.python.org/issue16839 closed by python-dev #16841: Set st_dev on Windows as unsigned long http://bugs.python.org/issue16841 closed by serhiy.storchaka #16844: funcName in logging module for python 2.6 http://bugs.python.org/issue16844 closed by vinay.sajip #16847: sha module broken http://bugs.python.org/issue16847 closed by christian.heimes #16856: Segfault from calling repr() on a dict with a key whose repr r http://bugs.python.org/issue16856 closed by serhiy.storchaka #16857: replace my email address on argparse howto with my 
website http://bugs.python.org/issue16857 closed by python-dev #1653416: OS X print >> f, "Hello" produces no error on read-only f: nor http://bugs.python.org/issue1653416 closed by ned.deily From rdmurray at bitdance.com Fri Jan 4 18:07:32 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Fri, 04 Jan 2013 12:07:32 -0500 Subject: [Python-Dev] Please review simple patch for IDLE documentation last updated 11/28/2012 In-Reply-To: References: <2629F645-7CD7-4990-83CB-D837D983E3FC@gmail.com> <20130104152302.532462500B2@webabinitio.net> <67A4979F-806C-4677-AC3E-EB9171C406C8@gmail.com> <20130104165104.1E1752500B2@webabinitio.net> Message-ID: <20130104170733.089D725003E@webabinitio.net> On Fri, 04 Jan 2013 18:56:22 +0200, Serhiy Storchaka wrote: > On 04.01.13 18:51, R. David Murray wrote: > > (To automate such monitoring we would need some sort of 'commit ready' > > flag in the tracker and a protocol for when it gets set...which might > > not be a bad idea, but would need to be discussed....) > > Is not "commit review" stage purposed for this? Not currently. But we could potentially re-purpose it for that. Currently it is supposed to mean that we are in an RC stage, one committer has signed off on the patch, and we need a second signoff before committing. I don't think we actually work that way any more now that we have switched to hg. Instead what we have been doing is committing as per normal, and then the release manager's decision on whether or not merge the fix into his release clone takes the place of the second review. --David From svenbrauch at googlemail.com Fri Jan 4 21:45:08 2013 From: svenbrauch at googlemail.com (Sven Brauch) Date: Fri, 4 Jan 2013 21:45:08 +0100 Subject: [Python-Dev] Range information in the AST -- once more In-Reply-To: References: Message-ID: 2012/12/27 Sven Brauch : > 2012/12/27 Guido van Rossum : >> So just submit a patch to the tracker... 
>> >> --Guido >> >> >> On Thursday, December 27, 2012, Sven Brauch wrote: >>> >>> 2012/12/27 Nick Coghlan : >>> > It certainly sounds like its worth considering for 3.4. It's a new >>> > feature, though, so it unfortunately wouldn't be possible to backport >>> > it to any earlier releases. >>> >>> Yes, that is understandable. It wouldn't be much of a problem tough, >>> my whole project is pretty bleeding-edge anyways, and depending on >>> python >= 3.4 wouldn't hurt. For me it would only be important to have >>> an acceptable solution for this long-term. >>> >>> Greetings, >>> Sven >>> _______________________________________________ >>> Python-Dev mailing list >>> Python-Dev at python.org >>> http://mail.python.org/mailman/listinfo/python-dev >>> Unsubscribe: >>> http://mail.python.org/mailman/options/python-dev/guido%40python.org >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) > > I submitted a patch to the tracker, see http://bugs.python.org/issue16795. > The patch only contains the very minimum set of changes which are > necessary. There's still a few things you'll need to work around when > writing a static language analyzer, but none of them is too much work, > so I didn't include them for now in order to keep things compact. > > Thanks and best regards, > Sven Ping! Nobody complained since I proposed the patch a week ago, just some people added themselves to the notify list. What do you think? 
Cheers, Sven From rovitotv at gmail.com Fri Jan 4 22:04:12 2013 From: rovitotv at gmail.com (Todd V Rovito) Date: Fri, 4 Jan 2013 16:04:12 -0500 Subject: [Python-Dev] Please review simple patch for IDLE documentation last updated 11/28/2012 In-Reply-To: <20130104165104.1E1752500B2@webabinitio.net> References: <2629F645-7CD7-4990-83CB-D837D983E3FC@gmail.com> <20130104152302.532462500B2@webabinitio.net> <67A4979F-806C-4677-AC3E-EB9171C406C8@gmail.com> <20130104165104.1E1752500B2@webabinitio.net> Message-ID: <82F63E12-9524-4075-89BC-DF8B2FDC1948@gmail.com> On Jan 4, 2013, at 11:51 AM, "R. David Murray" wrote: > On Fri, 04 Jan 2013 10:40:10 -0500, Todd V Rovito wrote: >> On Jan 4, 2013, at 10:23 AM, "R. David Murray" wrote: >> >>> On Fri, 04 Jan 2013 10:04:14 -0500, Todd V Rovito wrote: >>>> I submitted a simple patch for updates to IDLE's documentation >>>> http://bugs.python.org/issue5066 and it has not been reviewed by >>>> an official Python developer. Since it is now been over 1month >>>> since last update can somebody please review and or commit? I >>>> did get some comments from another contributor and updated the >>>> patch based on those comments. The original bug was opened in >>>> 2009 and the IDLE documentation is out of date with undocumented >>>> menu options and inconsistencies between the help file and the >>>> HTML documentation. I hate to be pushy but I would like to see >>>> some progress made on IDLE. Thanks for the support. >>> >>> IMO it isn't pushy to ask about an issue that seems ready but on which >>> no action has been taken for an extended period. It could be that the >>> nosy committers have just forgotten about it. >>> >>> In the future, it best way to approach this situation (patch that seems >>> ready but no action has been taken) is to ping the issue first, and if >>> you don't get a response after a few days, to post here as you did. >> >> Good suggestion I will "ping" the bug report! 
FYI....the excellent >> developer's guide http://docs.python.org/devguide/patch.html#reviewing >> states "Getting your patch reviewed requires a reviewer to have the >> spare time and motivation to look at your patch (we cannot force >> anyone to review patches). If your patch has not received any notice >> from reviewers (i.e., no comment made) after a substantial amount of >> time then you may email python-dev at python.org asking for someone to >> take a look at your patch." >> >> The Developer's guide does not mention pinging the patch maybe that is >> common sense so it doesn't need to be documented but I had assumed >> somebody was watching bugs that had patches applied but no commits. > > Adding that to the developers guide sounds like a good idea. You could > open a new issue in the tracker with that suggestion :). Thanks I will add an issue to the tracker and will even submit a patch. From guido at python.org Fri Jan 4 23:59:40 2013 From: guido at python.org (Guido van Rossum) Date: Fri, 4 Jan 2013 14:59:40 -0800 Subject: [Python-Dev] [Python-ideas] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: On Fri, Jan 4, 2013 at 2:38 PM, Dustin Mitchell wrote: > As the maintainer of a pretty large, complex app written in Twisted, I think > this is great. I look forward to a future of being able to select from a > broad library of async tools, and being able to write tools that can be used > outside of Twisted. Thanks. Me too. :-) > Buildbot began, lo these many years ago, doing a lot of things in memory on > on local disk, neither of which require asynchronous IO. So a lot of API > methods did not originally return Deferreds. Those methods are then used by > other methods, many of which also do not return Deferreds. Now, we want to > use a database backend, and parallelize some of the operations, meaning that > the methods need to return a Deferred. 
Unfortunately, that requires a > complete tree traversal of all of the methods and methods that call them, > rewriting them to take and return Deferreds. There's no "halfway" solution. > This is a little easier with generators (@inlineCallbacks), since the syntax > doesn't change much, but it's a significant change to the API (in fact, this > is a large part of the reason for the big rewrite for Buildbot-0.9.x). > > I bring all this up to say, this PEP will introduce a new "kind" of method > signature into standard Python, one which the caller must know, and the use > of which changes the signature of the caller. That can cause sweeping > changes, and debugging those changes can be tricky. Yes, and this is the biggest unproven point of the PEP. (The rest is all backed by a decade or more of experience.) > Two things can help: > > First, `yield from somemeth()` should work fine even if `somemeth` is not a > coroutine function, and authors of async tools should be encouraged to use > this form to assist future-compatibility. Second, `somemeth()` without a > yield should fail loudly if `somemeth` is a coroutine function. Otherwise, > the effects can be pretty confusing. That would be nice. But the way yield from and generators work, that's hard to accomplish without further changes to the language -- and I don't want to have to change the language again (at least not immediately -- maybe in a few releases, after we've learned what the real issues are). The best I can do for the first requirement is to define @coroutine in a way that if the decorated function isn't a generator, it is wrapped in one. 
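[Editor's sketch] The "@coroutine wraps non-generators" idea above can be illustrated with a small hypothetical decorator -- this is only an illustration of the idea, not Tulip's actual implementation. It relies on two Python 3.3 facts: a `return value` inside a generator becomes the value produced by `yield from`, and the mere presence of a `yield` statement (even an unreachable one) makes a function a generator function:

```python
import functools
import inspect


def coroutine(func):
    # Hypothetical sketch: if the decorated function is not already a
    # generator function, wrap it in one so that callers can always
    # write `yield from func(...)`, whether or not func does real
    # asynchronous work.
    if inspect.isgeneratorfunction(func):
        return func

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)  # becomes StopIteration.value
        yield  # unreachable, but makes wrapper a generator function
    return wrapper


@coroutine
def plain():
    return 42  # an ordinary function today; may grow real yields later


def caller():
    value = yield from plain()  # works either way
    return value
```

Driving `caller()` by hand (calling `next()` until `StopIteration`) surfaces the final value 42 as the exception's `value` attribute, just as an event loop would see it.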
For the second requirement, if you call somemeth() and ignore the result, nothing happens at all -- this is indeed infuriating but I see no way to change this.(*) If you use the result, well, Futures have different attributes than most other objects so hopefully you'll get a loud AttributeError or TypeError soon, but of course if you pass it into something else which uses it, it may still be difficult to track. Hopefully these error messages provide a hint: >>> f.foo Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: 'Future' object has no attribute 'foo' >>> f() Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: 'Future' object is not callable >>> (*) There's a heavy gun we might use, but I would make this optional, as a heavy duty debugging mode only. @coroutine could wrap generators in a lightweight object with a __del__ method and an __iter__ method. If __del__ is called before __iter__ is ever called, it could raise an exception or log a warning. But this probably adds too much overhead to have it always enabled. > In http://code.google.com/p/uthreads, I accomplished the latter by taking > advantage of garbage collection: if the generator is garbage collected > before it's begun, then it's probably not been yielded. This is a bit > gross, but good enough as a debugging technique. Eh, yeah, what I said. :-) > On the topic of debugging, I also took pains to make sure that tracebacks > looked reasonable, filtering out scheduler code[1]. I haven't looked > closely at Tulip to see if that's a problem. Most of the "noise" in the > tracebacks came from the lack of 'yield from', so it may not be an issue at > all. One of the great advantages of using yield from is that the tracebacks automatically look nice.
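[Editor's note: the optional debugging wrapper from the (*) footnote above could be prototyped along these lines. The class and method names are hypothetical, not what Tulip actually ships.]

```python
import warnings

class CoroWrapper:
    """Debug-mode sketch: wrap a generator and warn if it is garbage
    collected before anyone ever iterated it, i.e. a coroutine was
    called but never driven with `yield from`."""

    def __init__(self, gen):
        self.gen = gen
        self._started = False

    def __iter__(self):
        self._started = True
        return self.gen

    def __del__(self):
        if not self._started:
            warnings.warn("coroutine %r was never yielded from" % self.gen,
                          RuntimeWarning)
```

In this scheme, a debug-mode @coroutine would return CoroWrapper(func(*args)) instead of the raw generator; the overhead Guido mentions is the extra wrapper object per call.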
> Dustin > > [1] > http://code.google.com/p/uthreads/source/browse/trunk/uthreads/core.py#253 -- --Guido van Rossum (python.org/~guido) From djmitche at gmail.com Fri Jan 4 23:38:32 2013 From: djmitche at gmail.com (Dustin Mitchell) Date: Fri, 4 Jan 2013 14:38:32 -0800 (PST) Subject: [Python-Dev] [Python-ideas] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: As the maintainer of a pretty large, complex app written in Twisted, I think this is great. I look forward to a future of being able to select from a broad library of async tools, and being able to write tools that can be used outside of Twisted. Buildbot began, lo these many years ago, doing a lot of things in memory on on local disk, neither of which require asynchronous IO. So a lot of API methods did not originally return Deferreds. Those methods are then used by other methods, many of which also do not return Deferreds. Now, we want to use a database backend, and parallelize some of the operations, meaning that the methods need to return a Deferred. Unfortunately, that requires a complete tree traversal of all of the methods and methods that call them, rewriting them to take and return Deferreds. There's no "halfway" solution. This is a little easier with generators (@inlineCallbacks), since the syntax doesn't change much, but it's a significant change to the API (in fact, this is a large part of the reason for the big rewrite for Buildbot-0.9.x). I bring all this up to say, this PEP will introduce a new "kind" of method signature into standard Python, one which the caller must know, and the use of which changes the signature of the caller. That can cause sweeping changes, and debugging those changes can be tricky. Two things can help: First, `yield from somemeth()` should work fine even if `somemeth` is not a coroutine function, and authors of async tools should be encouraged to use this form to assist future-compatibility. 
Second, `somemeth()` without a yield should fail loudly if `somemeth` is a coroutine function. Otherwise, the effects can be pretty confusing. In http://code.google.com/p/uthreads, I accomplished the latter by taking advantage of garbage collection: if the generator is garbage collected before it's begun, then it's probably not been yielded. This is a bit gross, but good enough as a debugging technique. On the topic of debugging, I also took pains to make sure that tracebacks looked reasonable, filtering out scheduler code[1]. I haven't looked closely at Tulip to see if that's a problem. Most of the "noise" in the tracebacks came from the lack of 'yield from', so it may not be an issue at all. Dustin [1] http://code.google.com/p/uthreads/source/browse/trunk/uthreads/core.py#253 -------------- next part -------------- An HTML attachment was scrubbed... URL: From lampenregenschirm at web.de Fri Jan 4 23:59:04 2013 From: lampenregenschirm at web.de (Elli Lola) Date: Fri, 4 Jan 2013 23:59:04 +0100 (CET) Subject: [Python-Dev] test failed: test_urlwithfrag Message-ID: An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Sat Jan 5 00:34:40 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 5 Jan 2013 00:34:40 +0100 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org> <3A8E318B-93F9-4CB0-92DE-C5EC7E2DBA77@gmail.com> <20121211101331.05087056@pitrou.net> <0EC9E09D-4E36-4E8E-BAF3-79E187D07AD3@gmail.com> Message-ID: <20130105003440.3ba164b3@pitrou.net> On Thu, 3 Jan 2013 12:22:27 +0200 Maciej Fijalkowski wrote: > Hello everyone. > > Thanks raymond for writing down a pure python version ;-) > > I did an initial port to RPython for experiments. The results (on > large dicts only) are inconclusive - it's either a bit faster or a bit > slower, depending what exactly you do. 
There is a possibility I messed > something up too (there is a branch rdict-experiments in PyPy, in a > very sorry state though). But what about the memory consumption? This seems to be the main point of Raymond's proposal. Regards Antoine. From guido at python.org Sat Jan 5 00:56:23 2013 From: guido at python.org (Guido van Rossum) Date: Fri, 4 Jan 2013 15:56:23 -0800 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: <41C7D2F7-CA38-46C0-8477-B3F2FA034712@twistedmatrix.com> References: <20121221204411.1269322b@pitrou.net> <41C7D2F7-CA38-46C0-8477-B3F2FA034712@twistedmatrix.com> Message-ID: On Mon, Dec 24, 2012 at 2:58 PM, Glyph wrote: > In my humble (but entirely, verifiably correct) opinion, thinking of this as > a "default" is propagating a design error in the BSD sockets API. Datagram > and stream sockets have radically different semantics. In Twisted, > "dataReceived" and "datagramReceived" are different methods for a good > reason. Again, it's very very easy to fall into the trap of thinking that a > TCP segment is a datagram and writing all your application code as if it > were. After all, it probably works over localhost most of the time! This > difference in semantics mirrored by a difference in method naming has helped > quite a few people grok the distinction between streaming and datagrams over > the years; I think it would be a good idea if Tulip followed suit. Suppose PEP 3156 / Tulip uses data_received() for streams and datagram_received() for datagram protocols (which seems reasonable enough), what API should a datagram transport have for sending datagrams? write_datagram() and write_datagram_list()? (Given that the transport and protocol classes are different for datagrams, I suppose the create*() methods should also be different, rather than just having a type={SOCK_STREAM,SOCK_DATAGRAM} flag. But I can figure that out for myself. 
The naming and exact APIs for the client and server transport creation methods are in flux anyway.) -- --Guido van Rossum (python.org/~guido) From tjreedy at udel.edu Sat Jan 5 02:03:42 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 04 Jan 2013 20:03:42 -0500 Subject: [Python-Dev] test failed: test_urlwithfrag In-Reply-To: References: Message-ID: <50E77BEE.1060108@udel.edu> On 1/4/2013 5:59 PM, Elli Lola wrote: > Dear python team, > > I never used python before and installed it today the first time, so I > have no idea what to do about this failure: This current-version usage question should have been directed to python-list. pydev is only for discussion of future versions. (And if you are in doubt, start with python-list.) However, > $ ./python -m test -v test_urlwithfrag There is no such module in the test package. Hence > ImportError: No module named 'test.test_urlwithfrag' Believe error messages. Look in Lib/test/test_* for the test modules you can run. I tried (interactively, much easier on windows) >>> import test.test_urllib as t; t.test_main() >>> import test.test_urllib_response as t; t.test_main() >>> import test.test_urllib2 as t; t.test_main() and they all work. The other test_urllib* modules are test_urllibnet test_urllib2net test_urllib2_localnet If you get a failure from actually running a test, post on python-list, with the headers you did post here, and ask if it seems to be a problem with your build or python itself. You can also check the tracker to see if it is an already known failure. python-list people are generally very helpful with this sort of thing. 
-- Terry Jan Reedy From senthil at uthcode.com Sat Jan 5 04:49:33 2013 From: senthil at uthcode.com (Senthil Kumaran) Date: Fri, 4 Jan 2013 19:49:33 -0800 Subject: [Python-Dev] test failed: test_urlwithfrag In-Reply-To: References: Message-ID: Yes, this question is for python-lists at python.org On Fri, Jan 4, 2013 at 2:59 PM, Elli Lola wrote: > $ ./python -m test -v test_urlwithfrag Where did you get this command from? It looks to me that more than one person is trying the exact same command and experiencing the same failure (expected result). Perhaps the source which is advertising this is misleading? Thanks, Senthil -------------- next part -------------- An HTML attachment was scrubbed... URL: From glyph at twistedmatrix.com Sat Jan 5 05:19:46 2013 From: glyph at twistedmatrix.com (Glyph) Date: Fri, 4 Jan 2013 20:19:46 -0800 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: <20121221204411.1269322b@pitrou.net> <41C7D2F7-CA38-46C0-8477-B3F2FA034712@twistedmatrix.com> Message-ID: <80704794-03AF-4B48-A665-2F1704B3EC46@twistedmatrix.com> On Jan 4, 2013, at 3:56 PM, Guido van Rossum wrote: > On Mon, Dec 24, 2012 at 2:58 PM, Glyph wrote: >> In my humble (but entirely, verifiably correct) opinion, thinking of this as >> a "default" is propagating a design error in the BSD sockets API. Datagram >> and stream sockets have radically different semantics. In Twisted, >> "dataReceived" and "datagramReceived" are different methods for a good >> reason. Again, it's very very easy to fall into the trap of thinking that a >> TCP segment is a datagram and writing all your application code as if it >> were. After all, it probably works over localhost most of the time! This >> difference in semantics mirrored by a difference in method naming has helped >> quite a few people grok the distinction between streaming and datagrams over >> the years; I think it would be a good idea if Tulip followed suit.
> > Suppose PEP 3156 / Tulip uses data_received() for streams and > datagram_received() for datagram protocols (which seems reasonable > enough), what API should a datagram transport have for sending > datagrams? write_datagram() and write_datagram_list()? Twisted just has a different method called write() which has a different signature (data, address). Probably write_datagram is better. Why write_datagram_list though? Twisted's writeSequence is there to provide the (eventual) opportunity to optimize by writev; since datagrams are always sent one at a time anyway, write_datagram_list would seem to be a very minor optimization. -glyph From guido at python.org Sat Jan 5 05:51:58 2013 From: guido at python.org (Guido van Rossum) Date: Fri, 4 Jan 2013 20:51:58 -0800 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: <80704794-03AF-4B48-A665-2F1704B3EC46@twistedmatrix.com> References: <20121221204411.1269322b@pitrou.net> <41C7D2F7-CA38-46C0-8477-B3F2FA034712@twistedmatrix.com> <80704794-03AF-4B48-A665-2F1704B3EC46@twistedmatrix.com> Message-ID: On Fri, Jan 4, 2013 at 8:19 PM, Glyph wrote: > > On Jan 4, 2013, at 3:56 PM, Guido van Rossum wrote: > >> On Mon, Dec 24, 2012 at 2:58 PM, Glyph wrote: >>> In my humble (but entirely, verifiably correct) opinion, thinking of this as >>> a "default" is propagating a design error in the BSD sockets API. Datagram >>> and stream sockets have radically different semantics.
This >>> difference in semantics mirrored by a difference in method naming has helped >>> quite a few people grok the distinction between streaming and datagrams over >>> the years; I think it would be a good idea if Tulip followed suit. >> >> Suppose PEP 3156 / Tulip uses data_received() for streams and >> datagram_received() for datagram protocols (which seems reasonable >> enough), what API should a datagram transport have for sending >> datagrams? write_datagram() and write_datagram_list()? > > Twisted just have a different method called write() which has a different signature (data, address). Probably write_datagram is better. Why write_datagram_list though? Twisted's writeSequence is there to provide the (eventual) opportunity to optimize by writev; since datagrams are always sent one at a time anyway, write_datagram_list would seem to be a very minor optimization. That makes sense (you can see I haven't tried to use UDP in a long time :-). Should write_datagram() perhaps return a future? Or is there still a use case for buffering datagrams? -- --Guido van Rossum (python.org/~guido) From glyph at twistedmatrix.com Sat Jan 5 06:18:21 2013 From: glyph at twistedmatrix.com (Glyph) Date: Fri, 4 Jan 2013 21:18:21 -0800 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: <20121221204411.1269322b@pitrou.net> <41C7D2F7-CA38-46C0-8477-B3F2FA034712@twistedmatrix.com> <80704794-03AF-4B48-A665-2F1704B3EC46@twistedmatrix.com> Message-ID: <699D76B2-3266-4BF4-944E-24395AC380B4@twistedmatrix.com> On Jan 4, 2013, at 8:51 PM, Guido van Rossum wrote: > On Fri, Jan 4, 2013 at 8:19 PM, Glyph wrote: >> >> On Jan 4, 2013, at 3:56 PM, Guido van Rossum wrote: >> >>> On Mon, Dec 24, 2012 at 2:58 PM, Glyph wrote: >>>> In my humble (but entirely, verifiably correct) opinion, thinking of this as >>>> a "default" is propagating a design error in the BSD sockets API. Datagram >>>> and stream sockets have radically different semantics. 
In Twisted, >>>> "dataReceived" and "datagramReceived" are different methods for a good >>>> reason. Again, it's very very easy to fall into the trap of thinking that a >>>> TCP segment is a datagram and writing all your application code as if it >>>> were. After all, it probably works over localhost most of the time! This >>>> difference in semantics mirrored by a difference in method naming has helped >>>> quite a few people grok the distinction between streaming and datagrams over >>>> the years; I think it would be a good idea if Tulip followed suit. >>> >>> Suppose PEP 3156 / Tulip uses data_received() for streams and >>> datagram_received() for datagram protocols (which seems reasonable >>> enough), what API should a datagram transport have for sending >>> datagrams? write_datagram() and write_datagram_list()? >> >> Twisted just have a different method called write() which has a different signature (data, address). Probably write_datagram is better. Why write_datagram_list though? Twisted's writeSequence is there to provide the (eventual) opportunity to optimize by writev; since datagrams are always sent one at a time anyway, write_datagram_list would seem to be a very minor optimization. > > That makes sense (you can see I haven't tried to use UDP in a long time :-). > > Should write_datagram() perhaps return a future? Or is there still a > use case for buffering datagrams? There's not much value in returning a future even if you don't buffer. UDP packets can be dropped, and there's no acknowledgement from the remote end either when they're received or when they're dropped. You can get a couple hints from ICMP, but you can't rely on it, because lots of networks just dumbly filter it. Personally I think its flow control should work the same way as a TCP stream just for symmetry, but the only time that this becomes important in an application is when you're actually saturating your entire outbound network, and you need to notice and slow down. 
Returning a future would let you do this too, but might mislead users into thinking that "once write_datagram completes, the datagram is sent and the other side has it", which is another pernicious idea it's hard to disabuse people of. -glyph From fijall at gmail.com Sat Jan 5 22:03:42 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sat, 5 Jan 2013 23:03:42 +0200 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <20130105003440.3ba164b3@pitrou.net> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org> <3A8E318B-93F9-4CB0-92DE-C5EC7E2DBA77@gmail.com> <20121211101331.05087056@pitrou.net> <0EC9E09D-4E36-4E8E-BAF3-79E187D07AD3@gmail.com> <20130105003440.3ba164b3@pitrou.net> Message-ID: On Sat, Jan 5, 2013 at 1:34 AM, Antoine Pitrou wrote: > On Thu, 3 Jan 2013 12:22:27 +0200 > Maciej Fijalkowski wrote: >> Hello everyone. >> >> Thanks raymond for writing down a pure python version ;-) >> >> I did an initial port to RPython for experiments. The results (on >> large dicts only) are inconclusive - it's either a bit faster or a bit >> slower, depending what exactly you do. There is a possibility I messed >> something up too (there is a branch rdict-experiments in PyPy, in a >> very sorry state though). > > But what about the memory consumption? This seems to be the main point > of Raymond's proposal. > Er. The memory consumption can be measured on pen and paper, you don't actually need to see right? After a lot more experimentation I came up with something that behaves better for large dictionaries. This was for a long time a weak point of PyPy, because of some GC details. So I guess I'll try to implement it fully and see how it goes. Stay tuned, I'll keep you posted. PS. PyPy does not have lots of small dictionaries because of maps (so you don't have a dict per instance), hence their performance is not at all that interesting for PyPy. 
Cheers, fijal From solipsis at pitrou.net Sun Jan 6 13:26:08 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 6 Jan 2013 13:26:08 +0100 Subject: [Python-Dev] peps: Updates in response to Barry Warsaw's feedback References: <3YfB945Y7zzRpX@mail.python.org> Message-ID: <20130106132608.6f867944@pitrou.net> On Sun, 6 Jan 2013 08:25:44 +0100 (CET) nick.coghlan wrote: > http://hg.python.org/peps/rev/3eb7e4b587da > changeset: 4654:3eb7e4b587da > user: Nick Coghlan > date: Sun Jan 06 17:22:45 2013 +1000 > summary: > Updates in response to Barry Warsaw's feedback > > files: > pep-0432.txt | 217 ++++++++++++++++++++++++++------------ > 1 files changed, 146 insertions(+), 71 deletions(-) > > > diff --git a/pep-0432.txt b/pep-0432.txt > --- a/pep-0432.txt > +++ b/pep-0432.txt > @@ -40,19 +40,21 @@ > well-defined phases during the startup sequence: > > * Pre-Initialization - no interpreter available > -* Initialization - interpreter partially available > -* Initialized - full interpreter available, __main__ related metadata > +* Initializing - interpreter partially available > +* Initialized - interpreter available, __main__ related metadata > incomplete > -* Main Execution - optional state, __main__ related metadata populated, > - bytecode executing in the __main__ module namespace > +* Main Execution - __main__ related metadata populated, bytecode > + executing in the __main__ module namespace (embedding applications > + may choose not to use this phase) Since we are here, I would bikeshed a little and point out that "initializing" and "initialized" are states, not phases. Either way, the nomenclature should be consistent. Regards Antoine.
From ncoghlan at gmail.com Sun Jan 6 13:39:18 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 6 Jan 2013 22:39:18 +1000 Subject: [Python-Dev] peps: Updates in response to Barry Warsaw's feedback In-Reply-To: <20130106132608.6f867944@pitrou.net> References: <3YfB945Y7zzRpX@mail.python.org> <20130106132608.6f867944@pitrou.net> Message-ID: On Sun, Jan 6, 2013 at 10:26 PM, Antoine Pitrou wrote: > Since we are here, I would bikeshed a little and point out that > "initializing" and "initialiazed" are states, not phases. Either way, > the nomenclature should be consistent. States and phases are generally the same thing (e.g. states of matter vs phases of matter). I've switched back and forth between state, stage and phase at various times, though, so the PEP is currently internally inconsistent. If there are any leftover "state" and "stage" references, it's just a mistake since "phase" is the terminology I finally settled on. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tismer at stackless.com Sun Jan 6 19:22:37 2013 From: tismer at stackless.com (Christian Tismer) Date: Sun, 06 Jan 2013 19:22:37 +0100 Subject: [Python-Dev] Why no venv in existing directory? In-Reply-To: <50E9BE11.5040104@stackless.com> References: <26A4F829-8091-4CA4-A4F5-A18B5813C1CC@epy.co.at> <50083843.2040105@oddbird.net> <50E9BE11.5040104@stackless.com> Message-ID: <50E9C0ED.7060701@stackless.com> Minor correction: On 06.01.13 19:10, Christian Tismer wrote: > Yes, you can do the upgrade, but there are a few flaws which keep me > from using this: > > It is pretty common to use virtualenv inside a mercurial checkout. > > With venv, installation with > > python3 -m venv my-repos > > complains that the directory is not empty. 
> With > > python3 -m venv my-repos --clear > > I was shocked, because that kills my repository :-( > After doing a new clone of my repos, I tried with > > python3 -m venv my-repos --update > Sorry, I meant python3 -m venv my-repos --upgrade > This works a little, but does not install the bin/activate source. > -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tismer at stackless.com Sun Jan 6 19:10:25 2013 From: tismer at stackless.com (Christian Tismer) Date: Sun, 06 Jan 2013 19:10:25 +0100 Subject: [Python-Dev] Why no venv in existing directory? In-Reply-To: References: <26A4F829-8091-4CA4-A4F5-A18B5813C1CC@epy.co.at> <50083843.2040105@oddbird.net> Message-ID: <50E9BE11.5040104@stackless.com> Yes, you can do the upgrade, but there are a few flaws which keep me from using this: It is pretty common to use virtualenv inside a mercurial checkout. With venv, installation with python3 -m venv my-repos complains that the directory is not empty. With python3 -m venv my-repos --clear I was shocked, because that kills my repository :-( After doing a new clone of my repos, I tried with python3 -m venv my-repos --update This works a little, but does not install the bin/activate source. Finally, I reverted to using virtualenv instead, although I would love to migrate. But this is not enough. Anything I'm missing here? cheers - chris On 24.08.12 13:30, Andrew Svetlov wrote: > Looks like you can use for that > $ pyvenv . --upgrade > > > On Fri, Aug 24, 2012 at 12:19 PM, "Stefan H. Holek ? Jarn" > wrote: >> FYI, I have created a tracker issue for this: http://bugs.python.org/issue15776 >> >> Stefan >> >> >> On 23.07.2012, at 09:09, Stefan H. 
Holek wrote: >>> The feature certainly is on *my* wish-list but I might be alone here. ;-) -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From kristjan at ccpgames.com Sun Jan 6 22:40:08 2013 From: kristjan at ccpgames.com (Kristján Valur Jónsson) Date: Sun, 6 Jan 2013 21:40:08 +0000 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org> <3A8E318B-93F9-4CB0-92DE-C5EC7E2DBA77@gmail.com> <20121211101331.05087056@pitrou.net> <0EC9E09D-4E36-4E8E-BAF3-79E187D07AD3@gmail.com> <20130105003440.3ba164b3@pitrou.net>, Message-ID: The memory part is also why I am interested in this approach. Another thing has been bothering me. This is the fact that with the default implementation, the small table is only ever populated up to a certain percentage, I can't recall, perhaps 50%. Since the small table is small by definition, I think it ought to be worth investigating if we cannot allow it to fill to 100% before growing, even if it costs some collisions. A linear lookup in a handful of slots can't be that much of a bother, it is only with larger number of entries that the O(1) property starts to matter. K ________________________________________ From: Python-Dev [python-dev-bounces+kristjan=ccpgames.com at python.org] on behalf of Maciej Fijalkowski [fijall at gmail.com] Sent: 5 January 2013 21:03 To: Antoine Pitrou Cc: Subject: Re: [Python-Dev] More compact dictionaries with faster iteration On Sat, Jan 5, 2013 at 1:34 AM, Antoine Pitrou wrote: > On Thu, 3 Jan 2013 12:22:27 +0200 > Maciej Fijalkowski wrote: >> Hello everyone.
>> >> Thanks raymond for writing down a pure python version ;-) >> >> I did an initial port to RPython for experiments. The results (on >> large dicts only) are inconclusive - it's either a bit faster or a bit >> slower, depending what exactly you do. There is a possibility I messed >> something up too (there is a branch rdict-experiments in PyPy, in a >> very sorry state though). > > But what about the memory consumption? This seems to be the main point > of Raymond's proposal. > Er. The memory consumption can be measured on pen and paper, you don't actually need to see right? After a lot more experimentation I came up with something that behaves better for large dictionaries. This was for a long time a weak point of PyPy, because of some GC details. So I guess I'll try to implement it fully and see how it goes. Stay tuned, I'll keep you posted. PS. PyPy does not have lots of small dictionaries because of maps (so you don't have a dict per instance), hence their performance is not at all that interesting for PyPy. Cheers, fijal From raymond.hettinger at gmail.com Mon Jan 7 00:02:50 2013 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 6 Jan 2013 15:02:50 -0800 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org> <3A8E318B-93F9-4CB0-92DE-C5EC7E2DBA77@gmail.com> <20121211101331.05087056@pitrou.net> <0EC9E09D-4E36-4E8E-BAF3-79E187D07AD3@gmail.com> Message-ID: <2104D9D4-466E-43BF-933F-1DEE2627C87E@gmail.com> On Jan 3, 2013, at 2:22 AM, Maciej Fijalkowski wrote: > Hello everyone. > > Thanks raymond for writing down a pure python version ;-) Thanks for running with it.
> > I did an initial port to RPython for experiments. The results (on > large dicts only) are inconclusive - it's either a bit faster or a bit > slower, depending what exactly you do. There is a possibility I messed > something up too (there is a branch rdict-experiments in PyPy, in a > very sorry state though). > > One effect we did not think about is that besides extra indirection, > there is an issue with data locality - having to look in two different > large lists seems to be a hit. In pure python, I didn't see a way to bring the hash/key/value entries side-by-side as they currently are in CPython. How does PyPy currently handle this issue? Is there a change I could make to the recipe that would restore data locality? Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From raymond.hettinger at gmail.com Mon Jan 7 01:55:52 2013 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 6 Jan 2013 16:55:52 -0800 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org> <3A8E318B-93F9-4CB0-92DE-C5EC7E2DBA77@gmail.com> <20121211101331.05087056@pitrou.net> <0EC9E09D-4E36-4E8E-BAF3-79E187D07AD3@gmail.com> <20130105003440.3ba164b3@pitrou.net>, Message-ID: <06BF126C-B710-4EE8-9607-3413F6B5C0E3@gmail.com> On Jan 6, 2013, at 1:40 PM, Kristján Valur Jónsson wrote: > The memory part is also why I am interested in this approach. > Another thing has been bothering me. This is the fact that with the default implementation, the small table is only ever populated up to a certain percentage, I can't recall, perhaps 50%. Dicts resize when the table is over two-thirds full. Small dicts contain zero to five keys. > Since the small table is small by definition, I think it ought to be worth investigating if we cannot allow it to fill to 100% before growing, even if it costs some collisions.
A linear lookup in a handful of slots can't be that much of a bother, it is only with larger number of entries that the O(1) property starts to matter. I looked at this a few years ago and found that it hurt performance considerably. Uncle Timmy (and others) had done a darned fine job of tuning dictionaries. Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From agriff at tin.it Mon Jan 7 09:05:42 2013 From: agriff at tin.it (Andrea Griffini) Date: Mon, 7 Jan 2013 09:05:42 +0100 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org> <3A8E318B-93F9-4CB0-92DE-C5EC7E2DBA77@gmail.com> <20121211101331.05087056@pitrou.net> <0EC9E09D-4E36-4E8E-BAF3-79E187D07AD3@gmail.com> <20130105003440.3ba164b3@pitrou.net> Message-ID: On Sun, Jan 6, 2013 at 10:40 PM, Kristján Valur Jónsson wrote: > A linear lookup in a handful of slots can't be that much of a bother, it is only with larger number of entries that the O(1) property starts to matter. Something I was thinking is whether for small tables it makes sense to actually do the hashing... a small table could be kept with just key/value pairs (saving also the hash space) with a linear scan first for identity and then a second scan for equality. Given the level of optimization of dicts however I'm 99% sure this was already tried before, though. From vinay_sajip at yahoo.co.uk Mon Jan 7 11:52:50 2013 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 7 Jan 2013 10:52:50 +0000 (UTC) Subject: [Python-Dev] Why no venv in existing directory? References: <26A4F829-8091-4CA4-A4F5-A18B5813C1CC@epy.co.at> <50083843.2040105@oddbird.net> <50E9BE11.5040104@stackless.com> Message-ID: Christian Tismer <tismer at stackless.com> writes: > Anything I'm missing here? The use case was missed during PEP and patch review, so the feature didn't make it into 3.3.
Issue #15776 raised the problem; it has been resolved and the feature should appear in Python 3.4. Regards, Vinay Sajip From fijall at gmail.com Mon Jan 7 11:55:39 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Mon, 7 Jan 2013 12:55:39 +0200 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org> <3A8E318B-93F9-4CB0-92DE-C5EC7E2DBA77@gmail.com> <20121211101331.05087056@pitrou.net> <0EC9E09D-4E36-4E8E-BAF3-79E187D07AD3@gmail.com> <20130105003440.3ba164b3@pitrou.net> Message-ID: On Sun, Jan 6, 2013 at 11:40 PM, Kristj?n Valur J?nsson wrote: > The memory part is also why I am interested in this approach. > Another thing has been bothering me. This is the fact that with the default implementation, the smll table is only ever populated up to a certain percentage, I cant recall, perhaps 50%. Since the small table is small by definition, I think it ought to be worth investigating if we cannot allow it to fill to 100% before growing, even if it costs some collisions. A linear lookup in a handful of slots can't be that much of a bother, it is only with larger number of entries that the O(1) property starts to matter. > K If you're very interested in the memory footprint you should do what PyPy does. It gives you no speed benefits without the JIT, but it makes all the instances behave like they are having slots. The trick is to keep a dictionary name -> index into a list on a class (you go through hierarchy of such dicts as you add or remove attributes) and a list of attributes on the instance. We call it maps, V8 calls it hidden classes, it's well documented on the pypy blog. 
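The map trick Maciej describes is easy to sketch in pure Python. The sketch below is illustrative only: the class and method names are invented here, and PyPy's real implementation is written in RPython and is far more sophisticated. The key idea is that the name -> index dictionary lives on a shared "map" object, while each instance carries only a flat list of attribute values.

```python
class Map:
    """A shared shape: maps attribute names to storage indices."""
    def __init__(self, indexes=None):
        self.indexes = indexes or {}   # name -> slot index
        self.transitions = {}          # name -> next Map (shared by all shapes)

    def extend(self, name):
        # Adding a new attribute transitions the instance to a new map,
        # which is cached so instances with the same shape share it.
        if name not in self.transitions:
            new_indexes = dict(self.indexes)
            new_indexes[name] = len(self.indexes)
            self.transitions[name] = Map(new_indexes)
        return self.transitions[name]

EMPTY_MAP = Map()

class Instance:
    def __init__(self):
        self.map = EMPTY_MAP
        self.storage = []              # just the values, no per-instance dict

    def set_attr(self, name, value):
        idx = self.map.indexes.get(name)
        if idx is None:
            self.map = self.map.extend(name)
            self.storage.append(value)
        else:
            self.storage[idx] = value

    def get_attr(self, name):
        return self.storage[self.map.indexes[name]]

a = Instance(); b = Instance()
a.set_attr("x", 1); b.set_attr("x", 2)
# Both instances share one map object, so the name->index dictionary
# is stored once per shape, not once per instance.
assert a.map is b.map
assert a.get_attr("x") == 1 and b.get_attr("x") == 2
```

The memory win is that a thousand instances with the same attributes share a single name -> index dictionary and each pay only for a list of values.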
Cheers, fijal From mark at hotpy.org Mon Jan 7 12:08:36 2013 From: mark at hotpy.org (Mark Shannon) Date: Mon, 07 Jan 2013 11:08:36 +0000 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org> <3A8E318B-93F9-4CB0-92DE-C5EC7E2DBA77@gmail.com> <20121211101331.05087056@pitrou.net> <0EC9E09D-4E36-4E8E-BAF3-79E187D07AD3@gmail.com> <20130105003440.3ba164b3@pitrou.net> Message-ID: <50EAACB4.2030003@hotpy.org> On 07/01/13 10:55, Maciej Fijalkowski wrote: > On Sun, Jan 6, 2013 at 11:40 PM, Kristj?n Valur J?nsson > wrote: >> The memory part is also why I am interested in this approach. >> Another thing has been bothering me. This is the fact that with the default implementation, the smll table is only ever populated up to a certain percentage, I cant recall, perhaps 50%. Since the small table is small by definition, I think it ought to be worth investigating if we cannot allow it to fill to 100% before growing, even if it costs some collisions. A linear lookup in a handful of slots can't be that much of a bother, it is only with larger number of entries that the O(1) property starts to matter. >> K In 3.3 dicts no longer have a small table. > > If you're very interested in the memory footprint you should do what > PyPy does. It gives you no speed benefits without the JIT, but it > makes all the instances behave like they are having slots. The trick > is to keep a dictionary name -> index into a list on a class (you go > through hierarchy of such dicts as you add or remove attributes) and a > list of attributes on the instance. > > We call it maps, V8 calls it hidden classes, it's well documented on > the pypy blog. 
They are called shared keys in CPython 3.3 :) > > Cheers, > fijal > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/mark%40hotpy.org > From fijall at gmail.com Mon Jan 7 15:02:30 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Mon, 7 Jan 2013 16:02:30 +0200 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <50EAACB4.2030003@hotpy.org> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org> <3A8E318B-93F9-4CB0-92DE-C5EC7E2DBA77@gmail.com> <20121211101331.05087056@pitrou.net> <0EC9E09D-4E36-4E8E-BAF3-79E187D07AD3@gmail.com> <20130105003440.3ba164b3@pitrou.net> <50EAACB4.2030003@hotpy.org> Message-ID: On Mon, Jan 7, 2013 at 1:08 PM, Mark Shannon wrote: > > > On 07/01/13 10:55, Maciej Fijalkowski wrote: >> >> On Sun, Jan 6, 2013 at 11:40 PM, Kristj?n Valur J?nsson >> wrote: >>> >>> The memory part is also why I am interested in this approach. >>> Another thing has been bothering me. This is the fact that with the >>> default implementation, the smll table is only ever populated up to a >>> certain percentage, I cant recall, perhaps 50%. Since the small table is >>> small by definition, I think it ought to be worth investigating if we cannot >>> allow it to fill to 100% before growing, even if it costs some collisions. >>> A linear lookup in a handful of slots can't be that much of a bother, it is >>> only with larger number of entries that the O(1) property starts to matter. >>> K > > > In 3.3 dicts no longer have a small table. > >> >> If you're very interested in the memory footprint you should do what >> PyPy does. It gives you no speed benefits without the JIT, but it >> makes all the instances behave like they are having slots. 
The trick >> is to keep a dictionary name -> index into a list on a class (you go >> through hierarchy of such dicts as you add or remove attributes) and a >> list of attributes on the instance. >> >> We call it maps, V8 calls it hidden classes, it's well documented on >> the pypy blog. > > > They are called shared keys in CPython 3.3 :) shared keys don't go to the same lengths, do they? > >> >> Cheers, >> fijal >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> http://mail.python.org/mailman/options/python-dev/mark%40hotpy.org >> > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/fijall%40gmail.com From dimaqq at gmail.com Mon Jan 7 15:06:51 2013 From: dimaqq at gmail.com (Dima Tisnek) Date: Mon, 7 Jan 2013 15:06:51 +0100 Subject: [Python-Dev] Accessing value stack Message-ID: Hi, is it possible to access the values stored on the stack in Python stack machine from Python? As in, consider expression: tmp = foo() + bar() At the point in time when bar() is called, foo() already returned something, and that value is on top of the stack in that frame, a few steps after bar() returns, that value is added to bar and is lost. I would very much like to be able to read (and maybe even change) that ephemeral value. If it is absolutely not possible to achieve from Pythonland, I welcome pointers on how to write a C/cython/types extension to achieve this. Thanks, d. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From phd at phdru.name Mon Jan 7 15:22:27 2013 From: phd at phdru.name (Oleg Broytman) Date: Mon, 7 Jan 2013 18:22:27 +0400 Subject: [Python-Dev] Accessing value stack In-Reply-To: References: Message-ID: <20130107142227.GC20095@iskra.aviel.ru> Hello. We are sorry but we cannot help you. This mailing list is to work on developing Python (adding new features to Python itself and fixing bugs); if you're having problems learning, understanding or using Python, please find another forum. Probably python-list/comp.lang.python mailing list/news group is the best place; there are Python developers who participate in it; you may get a faster, and probably more complete, answer there. See http://www.python.org/community/ for other lists/news groups/fora. Thank you for understanding. On Mon, Jan 07, 2013 at 03:06:51PM +0100, Dima Tisnek wrote: > Hi, is it possible to access the values stored on the stack in Python stack > machine from Python? In short: it's possible but very much discouraged. Don't hack into python internals. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From fijall at gmail.com Mon Jan 7 16:46:20 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Mon, 7 Jan 2013 17:46:20 +0200 Subject: [Python-Dev] Accessing value stack In-Reply-To: <20130107142227.GC20095@iskra.aviel.ru> References: <20130107142227.GC20095@iskra.aviel.ru> Message-ID: On Mon, Jan 7, 2013 at 4:22 PM, Oleg Broytman wrote: > Hello. > > We are sorry but we cannot help you. This mailing list is to work on > developing Python (adding new features to Python itself and fixing bugs); > if you're having problems learning, understanding or using Python, please > find another forum. Probably python-list/comp.lang.python mailing list/news > group is the best place; there are Python developers who participate in it; > you may get a faster, and probably more complete, answer there. 
See > http://www.python.org/community/ for other lists/news groups/fora. Thank > you for understanding. > > On Mon, Jan 07, 2013 at 03:06:51PM +0100, Dima Tisnek wrote: >> Hi, is it possible to access the values stored on the stack in Python stack >> machine from Python? > > In short: it's possible but very much discouraged. Don't hack into > python internals. Is it possible? I claim it's not (from *Python* because obviously data is in memory). Cheers, fijal From brett at python.org Mon Jan 7 17:32:51 2013 From: brett at python.org (Brett Cannon) Date: Mon, 7 Jan 2013 11:32:51 -0500 Subject: [Python-Dev] Accessing value stack In-Reply-To: References: <20130107142227.GC20095@iskra.aviel.ru> Message-ID: On Mon, Jan 7, 2013 at 10:46 AM, Maciej Fijalkowski wrote: > On Mon, Jan 7, 2013 at 4:22 PM, Oleg Broytman wrote: >> Hello. >> >> We are sorry but we cannot help you. This mailing list is to work on >> developing Python (adding new features to Python itself and fixing bugs); >> if you're having problems learning, understanding or using Python, please >> find another forum. Probably python-list/comp.lang.python mailing list/news >> group is the best place; there are Python developers who participate in it; >> you may get a faster, and probably more complete, answer there. See >> http://www.python.org/community/ for other lists/news groups/fora. Thank >> you for understanding. >> >> On Mon, Jan 07, 2013 at 03:06:51PM +0100, Dima Tisnek wrote: >>> Hi, is it possible to access the values stored on the stack in Python stack >>> machine from Python? >> >> In short: it's possible but very much discouraged. Don't hack into >> python internals. > > Is it possible? I claim it's not (from *Python* because obviously data > is in memory). Nope, it's not. 
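Brett's "Nope" is easy to check directly: a frame object's public attributes (CPython-specific sketch, using sys._getframe()) expose the code object, locals, globals and the caller, but nothing reaching the evaluation stack.

```python
import sys

def frame_attributes():
    frame = sys._getframe()
    # Everything a frame exposes to Python code starts with "f_";
    # none of it points at the evaluation stack.
    return sorted(name for name in dir(frame) if name.startswith("f_"))

attrs = frame_attributes()
print(attrs)   # ['f_back', 'f_builtins', 'f_code', 'f_globals', ...]
assert "f_locals" in attrs
assert "f_valuestack" not in attrs   # no such public attribute exists
```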
From regebro at gmail.com Mon Jan 7 17:47:39 2013 From: regebro at gmail.com (Lennart Regebro) Date: Mon, 7 Jan 2013 17:47:39 +0100 Subject: [Python-Dev] FYI: don't CC the peps mailing list In-Reply-To: References: Message-ID: On Mon, Dec 31, 2012 at 2:40 PM, Brett Cannon wrote: > Since this has happened for the second time in the past month, I want to > prevent a trend from starting here. Please do not CC the peps mailing list > on any discussions as it makes it impossible to know what emails are about > an actual update vs. people replying to some discussion which in no way > affects the PEP editors. Emails to the peps list should deal only with > adding/updating peps and only between the PEP editors and PEP authors. > > Lennart, I tossed all emails that got held up in moderation, so please > check that the latest version of your PEP is in as you want, else send a > patch to the peps list directly (inlined text is not as easy to work with). > A simple fix for this is to send separate mails to the two mailing lists. Is that OK? //Lennart -------------- next part -------------- An HTML attachment was scrubbed... URL: From regebro at gmail.com Mon Jan 7 17:57:24 2013 From: regebro at gmail.com (Lennart Regebro) Date: Mon, 7 Jan 2013 17:57:24 +0100 Subject: [Python-Dev] PEP 431 Time zone support improvements - Update In-Reply-To: <201212292005.06359.Arfrever.FTA@gmail.com> References: <201212292005.06359.Arfrever.FTA@gmail.com> Message-ID: On Sat, Dec 29, 2012 at 8:05 PM, Arfrever Frehtes Taifersar Arahesis < arfrever.fta at gmail.com> wrote: > 2012-12-29 19:54:42 Lennart Regebro napisa?(a): > > On Sat, Dec 29, 2012 at 7:38 PM, Tres Seaver > wrote: > > > > > - -Lots for enabling fallback by default except on platforms known not > to > > > have their own database > > > > Well, it's the same thing really. If the platform does have a database, > the > > fallback will not be used. 
> > I suggest that configure script support > --enable-internal-timezone-database / --disable-internal-timezone-database > options. > --disable-internal-timezone-database should cause that installational > targets in Makefile would not install files of timezone > database. Will this help the makers of distributions, like Ubuntu etc? If that is the case, then it might be worth doing, otherwise I don't think it's very useful. //Lennart -------------- next part -------------- An HTML attachment was scrubbed... URL: From regebro at gmail.com Mon Jan 7 18:01:40 2013 From: regebro at gmail.com (Lennart Regebro) Date: Mon, 7 Jan 2013 18:01:40 +0100 Subject: [Python-Dev] PEP 431 Time zone support improvements - Update In-Reply-To: References: Message-ID: On Sun, Dec 30, 2012 at 10:47 AM, Ronald Oussoren wrote: > The module could be split into several modules in a package without > affecting the public API if that would help with maintenance, simular to > unittest. > Is there popular support for doing so, ie make datetime into a package, have all of the pytz modules in there, but letting the public API go only through datetime, ie no timezone submodule? If the list decides so, I'll update the PEP. //Lennart -------------- next part -------------- An HTML attachment was scrubbed... URL: From dirkjan at ochtman.nl Mon Jan 7 18:43:26 2013 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Mon, 7 Jan 2013 18:43:26 +0100 Subject: [Python-Dev] PEP 431 Time zone support improvements - Update In-Reply-To: References: <201212292005.06359.Arfrever.FTA@gmail.com> Message-ID: On Mon, Jan 7, 2013 at 5:57 PM, Lennart Regebro wrote: > Will this help the makers of distributions, like Ubuntu etc? If that is the > case, then it might be worth doing, otherwise I don't think it's very > useful. As a Gentoo packager, I think it's useful. 
Cheers, Dirkjan From brett at python.org Mon Jan 7 19:01:56 2013 From: brett at python.org (Brett Cannon) Date: Mon, 7 Jan 2013 13:01:56 -0500 Subject: [Python-Dev] FYI: don't CC the peps mailing list In-Reply-To: References: Message-ID: Yep, or a BCC. Just as long as people blindly replying to anything PEP-related don't send mail to the PEP mailing list. On Mon, Jan 7, 2013 at 11:47 AM, Lennart Regebro wrote: > On Mon, Dec 31, 2012 at 2:40 PM, Brett Cannon wrote: >> >> Since this has happened for the second time in the past month, I want to >> prevent a trend from starting here. Please do not CC the peps mailing list >> on any discussions as it makes it impossible to know what emails are about >> an actual update vs. people replying to some discussion which in no way >> affects the PEP editors. Emails to the peps list should deal only with >> adding/updating peps and only between the PEP editors and PEP authors. >> >> Lennart, I tossed all emails that got held up in moderation, so please >> check that the latest version of your PEP is in as you want, else send a >> patch to the peps list directly (inlined text is not as easy to work with). > > > A simple fix for this is to send separate mails to the two mailing lists. Is > that OK? > > //Lennart From brett at python.org Mon Jan 7 19:06:35 2013 From: brett at python.org (Brett Cannon) Date: Mon, 7 Jan 2013 13:06:35 -0500 Subject: [Python-Dev] PEP 431 Time zone support improvements - Update In-Reply-To: References: Message-ID: On Mon, Jan 7, 2013 at 12:01 PM, Lennart Regebro wrote: > On Sun, Dec 30, 2012 at 10:47 AM, Ronald Oussoren > wrote: >> >> The module could be split into several modules in a package without >> affecting the public API if that would help with maintenance, simular to >> unittest. > > > Is there popular support for doing so, ie make datetime into a package, have > all of the pytz modules in there, but letting the public API go only through > datetime, ie no timezone submodule? 
> > If the list decides so, I'll update the PEP. +1 for putting this under datetime; "Namespaces are one honking great idea". People looking for date stuff is going to look in datetime or time first so might as well put it there to begin with. From pje at telecommunity.com Mon Jan 7 21:40:49 2013 From: pje at telecommunity.com (PJ Eby) Date: Mon, 7 Jan 2013 15:40:49 -0500 Subject: [Python-Dev] Accessing value stack In-Reply-To: References: <20130107142227.GC20095@iskra.aviel.ru> Message-ID: On Mon, Jan 7, 2013 at 11:32 AM, Brett Cannon wrote: > On Mon, Jan 7, 2013 at 10:46 AM, Maciej Fijalkowski wrote: >> On Mon, Jan 7, 2013 at 4:22 PM, Oleg Broytman wrote: >>> On Mon, Jan 07, 2013 at 03:06:51PM +0100, Dima Tisnek wrote: >>>> Hi, is it possible to access the values stored on the stack in Python stack >>>> machine from Python? >>> >>> In short: it's possible but very much discouraged. Don't hack into >>> python internals. >> >> Is it possible? I claim it's not (from *Python* because obviously data >> is in memory). > > Nope, it's not. I took that as a challenge, and just tried to do it using gc.get_referents(). ;-) Didn't work though... which actually makes me wonder if that's a bug in gc.get_referents(), or whether I'm making a bad assumption about when you'd have to run gc.get_referents() on a frame in order to see items from the value stack included. Could this actually be a bug in frames' GC implementation, or are value stack items not supposed to count for GC purposes? 
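For anyone wanting to repeat PJ's experiment, here is a minimal version (CPython-specific; it deliberately makes no claim about what *should* appear in the result, which is exactly the open question):

```python
import gc
import sys

def referents_of_current_frame():
    frame = sys._getframe()
    # gc.get_referents() asks the frame's C-level tp_traverse slot for
    # its referents; whether in-flight value-stack items show up is an
    # implementation detail of the frame type and the moment you ask.
    return gc.get_referents(frame)

refs = referents_of_current_frame()
assert isinstance(refs, list) and refs   # at least the usual suspects
```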
From fijall at gmail.com Mon Jan 7 21:43:06 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Mon, 7 Jan 2013 22:43:06 +0200 Subject: [Python-Dev] Accessing value stack In-Reply-To: References: <20130107142227.GC20095@iskra.aviel.ru> Message-ID: On Mon, Jan 7, 2013 at 10:40 PM, PJ Eby wrote: > On Mon, Jan 7, 2013 at 11:32 AM, Brett Cannon wrote: >> On Mon, Jan 7, 2013 at 10:46 AM, Maciej Fijalkowski wrote: >>> On Mon, Jan 7, 2013 at 4:22 PM, Oleg Broytman wrote: >>>> On Mon, Jan 07, 2013 at 03:06:51PM +0100, Dima Tisnek wrote: >>>>> Hi, is it possible to access the values stored on the stack in Python stack >>>>> machine from Python? >>>> >>>> In short: it's possible but very much discouraged. Don't hack into >>>> python internals. >>> >>> Is it possible? I claim it's not (from *Python* because obviously data >>> is in memory). >> >> Nope, it's not. > > I took that as a challenge, and just tried to do it using > gc.get_referents(). ;-) > > Didn't work though... which actually makes me wonder if that's a bug > in gc.get_referents(), or whether I'm making a bad assumption about > when you'd have to run gc.get_referents() on a frame in order to see > items from the value stack included. Could this actually be a bug in > frames' GC implementation, or are value stack items not supposed to > count for GC purposes? The value stack is a static array in C (it's not a Python list at all) that's stored on the frame. You can't do it that way. gc.get_referents() returns only Python objects (and in fact should return only Python objects that are accessible otherwise, but those details are really implementation-dependent). From victor.stinner at gmail.com Tue Jan 8 00:48:15 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 8 Jan 2013 00:48:15 +0100 Subject: [Python-Dev] Add "e" (close and exec) mode to open() Message-ID: Hi, I would like to add a new flag to open() mode to close the file on exec: "e".
This feature exists using different APIs depending on the OS and OS version: O_CLOEXEC, FD_CLOEXEC and O_NOINHERIT. Do you consider that such flag would be interesting? On Linux (glibc), the feature is available as the "e" flag for fopen() (O_CLOEXEC or FD_CLOEXEC depending on the kernel version). With Visual Studio, it's the "N" flag (O_NOINHERIT). I would like to modify open() to be able to benefit of O_CLOEXEC atomicity. The FD_CLOEXEC method has a race condition if another thread does fork() between open() and fcntl() (two calls to fcntl() are required). I created a patch available on the following issue: http://bugs.python.org/issue16850 Drawbacks of my idea/patch: - my patch raises NotImplementedError if the platform doesn't support close-and-exec flag. I tested Linux, OpenBSD, FreeBSD, OpenIndiana (Solaris) and Mac OS X: my patch works on all these platforms (except OpenBSD 4.9 because of an OS bug fixed in OpenBSD 5.2). I don't know platform without this flag. - my patch raises an error if opener argument of open() is used: opener and the "e" mode are exclusive (see the issue for details) --- I first proposed to only expose the atomic close-and-exec option, but only a few platforms (Linux 2.6.23+, FreeBSD 8.3+, Windows, QNX) support it. It's more convinient to have the best-effort method (only atomic if available). This article of Ulrich Drepper explain which kind of problem are solved by O_CLOEXEC: http://udrepper.livejournal.com/20407.html Victor From benjamin at python.org Tue Jan 8 01:03:54 2013 From: benjamin at python.org (Benjamin Peterson) Date: Mon, 7 Jan 2013 18:03:54 -0600 Subject: [Python-Dev] Add "e" (close and exec) mode to open() In-Reply-To: References: Message-ID: 2013/1/7 Victor Stinner : > Hi, > > I would like add a new flag to open() mode to close the file on exec: > "e". This feature exists using different APIs depending on the OS and > OS version: O_CLOEXEC, FD_CLOEXEC and O_NOINHERIT. 
Do you consider > that such flag would be interesting? I'm not sure it's worth cluttering the open() interface with such a non-portable option. People requiring such control should use the low-level os.open interface. -- Regards, Benjamin From benjamin at python.org Tue Jan 8 02:18:49 2013 From: benjamin at python.org (Benjamin Peterson) Date: Mon, 7 Jan 2013 19:18:49 -0600 Subject: [Python-Dev] Add "e" (close and exec) mode to open() In-Reply-To: References: Message-ID: 2013/1/7 Gregory P. Smith : > > > > On Mon, Jan 7, 2013 at 4:03 PM, Benjamin Peterson > wrote: >> >> 2013/1/7 Victor Stinner : >> > Hi, >> > >> > I would like to add a new flag to open() mode to close the file on exec: >> > "e". This feature exists using different APIs depending on the OS and >> > OS version: O_CLOEXEC, FD_CLOEXEC and O_NOINHERIT. Do you consider >> > that such flag would be interesting? >> >> I'm not sure it's worth cluttering the open() interface with such a >> non-portable option. People requiring such control should use the >> low-level os.open interface. > > > The ability to supply such flags really belongs on _all_ high or low level > file descriptor creating APIs so that things like subprocess_cloexec_pipe() > would not be necessary: > http://hg.python.org/cpython/file/0afa7b323abb/Modules/_posixsubprocess.c#l729 I think the open() interface should have consistent and non-conditional support for features to the maximum extent possible. The recent addition of "x" is a good example I think. The myriad cloexec APIs between different platforms suggest to me that using this feature requires understanding its various quirks on different platforms.
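Victor's best-effort approach can be sketched in a few lines of Python. This is an illustrative helper, not the API from the patch in issue 16850; the name open_cloexec and the fallback order are assumptions of this sketch.

```python
import os

try:
    import fcntl          # POSIX only
except ImportError:       # e.g. Windows
    fcntl = None

def open_cloexec(path, flags=os.O_RDONLY):
    # Best effort: prefer the atomic O_CLOEXEC flag (Linux 2.6.23+,
    # FreeBSD 8.3+, ...), then Windows' O_NOINHERIT, then fall back to
    # fcntl() + FD_CLOEXEC.  The fallback has the race window discussed
    # in the thread: another thread may fork()+exec() between the open()
    # and the fcntl() calls, which is what the atomic flag avoids.
    if hasattr(os, "O_CLOEXEC"):
        return os.open(path, flags | os.O_CLOEXEC)
    if hasattr(os, "O_NOINHERIT"):
        return os.open(path, flags | os.O_NOINHERIT)
    fd = os.open(path, flags)
    old = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, old | fcntl.FD_CLOEXEC)
    return fd
```

On POSIX the result can be verified with fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC.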
-- Regards, Benjamin From zbyszek at in.waw.pl Tue Jan 8 02:03:45 2013 From: zbyszek at in.waw.pl (Zbigniew =?utf-8?Q?J=C4=99drzejewski-Szmek?=) Date: Tue, 8 Jan 2013 02:03:45 +0100 Subject: [Python-Dev] Add "e" (close and exec) mode to open() In-Reply-To: References: Message-ID: <20130108010345.GC6218@in.waw.pl> On Mon, Jan 07, 2013 at 06:03:54PM -0600, Benjamin Peterson wrote: > 2013/1/7 Victor Stinner : > > Hi, > > > > I would like add a new flag to open() mode to close the file on exec: > > "e". This feature exists using different APIs depending on the OS and > > OS version: O_CLOEXEC, FD_CLOEXEC and O_NOINHERIT. Do you consider > > that such flag would be interesting? Having close-on-exec makes the code much cleaner, often. Definitely interesting. > I'm not sure it's worth cluttering the open() interface with such a > non-portable option. People requiring such control should use the > low-level os.open interface. If the best-effort fallback is included, it is quite portable. Definitely all modern and semi-modern systems support either the atomic or the nonatomic methods. Zbyszek From greg at krypto.org Tue Jan 8 02:06:42 2013 From: greg at krypto.org (Gregory P. Smith) Date: Mon, 7 Jan 2013 17:06:42 -0800 Subject: [Python-Dev] Add "e" (close and exec) mode to open() In-Reply-To: References: Message-ID: On Mon, Jan 7, 2013 at 4:03 PM, Benjamin Peterson wrote: > 2013/1/7 Victor Stinner : > > Hi, > > > > I would like add a new flag to open() mode to close the file on exec: > > "e". This feature exists using different APIs depending on the OS and > > OS version: O_CLOEXEC, FD_CLOEXEC and O_NOINHERIT. Do you consider > > that such flag would be interesting? > > I'm not sure it's worth cluttering the open() interface with such a > non-portable option. People requiring such control should use the > low-level os.open interface. 
> The ability to supply such flags really belongs on _all_ high or low level file descriptor creating APIs so that things like subprocess_cloexec_pipe() would not be necessary: http://hg.python.org/cpython/file/0afa7b323abb/Modules/_posixsubprocess.c#l729 I'm not excited about raising an exception when it isn't supported; it should attempt to get the same behavior via multiple API calls instead (thus introducing a race condition for threads+fork+exec, but there is no choice at that point and it is what most people would want in that situation regardless). Not being supported will be such an extremely rare situation for any OS someone runs a Python 3.4 program on that that detail is pretty much a non-issue either way as far as I'm concerned. -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian at python.org Tue Jan 8 04:38:30 2013 From: brian at python.org (Brian Curtin) Date: Mon, 7 Jan 2013 21:38:30 -0600 Subject: [Python-Dev] FYI - wiki.python.org compromised Message-ID: On December 28th, an unknown attacker used a previously unknown remote code exploit on http://wiki.python.org/. The attacker was able to get shell access as the "moin" user, but no other services were affected. Some time later, the attacker deleted all files owned by the "moin" user, including all instance data for both the Python and Jython wikis. The attack also had full access to all MoinMoin user data on all wikis. In light of this, the Python Software Foundation encourages all wiki users to change their password on other sites if the same one is in use elsewhere. We apologize for the inconvenience and will post further news as we bring the new and improved wiki.python.org online. If you have any questions about this incident please contact jnoller at python.org. Thank you for your patience. 
From alexander.belopolsky at gmail.com Tue Jan 8 04:48:37 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 7 Jan 2013 22:48:37 -0500 Subject: [Python-Dev] PEP 431 Time zone support improvements - Update In-Reply-To: References: Message-ID: On Mon, Jan 7, 2013 at 1:06 PM, Brett Cannon wrote: .. > +1 for putting this under datetime; "Namespaces are one honking great > idea". People looking for date stuff is going to look in datetime or > time first so might as well put it there to begin with. +1 for the same reason. From kalug at kalug.net Tue Jan 8 05:12:59 2013 From: kalug at kalug.net (=?UTF-8?Q?Alu=C3=ADsio_Augusto_Silva_Gon=C3=A7alves?=) Date: Tue, 8 Jan 2013 02:12:59 -0200 Subject: [Python-Dev] FYI - wiki.python.org compromised In-Reply-To: References: Message-ID: Can this possibly be related to the exploit on the Debian wiki ( http://lwn.net/Articles/531726/ )? On Tue, Jan 8, 2013 at 1:38 AM, Brian Curtin wrote: > > On December 28th, an unknown attacker used a previously unknown remote > code exploit on http://wiki.python.org/. The attacker was able to get > shell access as the "moin" user, but no other services were affected. > > Some time later, the attacker deleted all files owned by the "moin" > user, including all instance data for both the Python and Jython > wikis. The attack also had full access to all MoinMoin user data on > all wikis. In light of this, the Python Software Foundation encourages > all wiki users to change their password on other sites if the same one > is in use elsewhere. We apologize for the inconvenience and will post > further news as we bring the new and improved wiki.python.org online. > > If you have any questions about this incident please contact > jnoller at python.org. Thank you for your patience. 
> _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev From victor.stinner at gmail.com Tue Jan 8 09:49:54 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 8 Jan 2013 09:49:54 +0100 Subject: [Python-Dev] Add "e" (close and exec) mode to open() In-Reply-To: References: Message-ID: Benjamin wrote: > I'm not sure it's worth cluttering the open() interface with such a > non-portable option. Zbyszek wrote: > If the best-effort fallback is included, it is quite portable. Definitely > all modern and semi-modern systems support either the atomic or the > nonatomic methods. Gregory wrote: > I'm not excited about raising an exception when it isn't supported; > it should attempt to get the same behavior via multiple API calls instead Yes, I'm proposing the best-effort approach: use O_CLOEXEC/O_NOINHERIT if available, or fcntl()+FD_CLOEXEC otherwise. My patch requires the fcntl() + FD_CLOEXEC flag or the open() + O_NOINHERIT flag. I failed to find an OS where none of these flags/functions is present. Usually, when I want to test portability, I test Linux, Windows and Mac OS X. Then I test FreeBSD, OpenBSD and OpenIndiana (Solaris). I don't test other OSes because I don't know them and they are not installed on my PC :-) (I have many VMs.) Should I try other platforms? Benjamin wrote: > People requiring such control should use the low-level os.open interface. os.open() is not very convenient because you have to choose the flag and the functions depending on the OS. If the open() flag is rejected, we should at least provide a helper for os.open() + fcntl(). > The myriad cloexec APIs between different platforms suggest to me that using this feature requires understanding its various quirks on different platforms. Victor 2013/1/8 Benjamin Peterson : > 2013/1/7 Gregory P.
Smith : >> >> >> >> On Mon, Jan 7, 2013 at 4:03 PM, Benjamin Peterson >> wrote: >>> >>> 2013/1/7 Victor Stinner : >>> > Hi, >>> > >>> > I would like to add a new flag to open() mode to close the file on exec: >>> > "e". This feature exists using different APIs depending on the OS and >>> > OS version: O_CLOEXEC, FD_CLOEXEC and O_NOINHERIT. Do you consider >>> > that such flag would be interesting? >>> >>> I'm not sure it's worth cluttering the open() interface with such a >>> non-portable option. People requiring such control should use the >>> low-level os.open interface. >> >> >> The ability to supply such flags really belongs on _all_ high or low level >> file descriptor creating APIs so that things like subprocess_cloexec_pipe() >> would not be necessary: >> http://hg.python.org/cpython/file/0afa7b323abb/Modules/_posixsubprocess.c#l729 > > I think the open() interface should have consistent and > non-conditional support for features to the maximum extent possible. The > recent addition of "x" is a good example I think. The myriad cloexec > APIs between different platforms suggest to me that using this > feature requires understanding its various quirks on different > platforms. > > > > -- > Regards, > Benjamin From victor.stinner at gmail.com Tue Jan 8 09:54:36 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 8 Jan 2013 09:54:36 +0100 Subject: [Python-Dev] Add "e" (close and exec) mode to open() In-Reply-To: References: Message-ID: Oops, I sent my email too early by mistake (it was not finished). > The myriad cloexec > APIs between different platforms suggest to me that using this > feature requires understanding its various quirks on different > platforms. Sorry, I don't understand. What do you mean by "various quirks"? The "close-on-exec" feature is implemented differently depending on the platform, but it always has the same meaning: it closes the file when a subprocess is created.
Running a subprocess is also implemented differently depending on the OS; there are two main approaches: fork()+exec() on UNIX, on Windows (I don't know how it works on Windows). Extract of the fcntl() manual page on Linux: "If the FD_CLOEXEC bit is 0, the file descriptor will remain open across an execve(2), otherwise it will be closed." I would like to expose the OS feature using a portable API to hide the "myriad cloexec APIs". Victor 2013/1/8 Victor Stinner : > Benjamin wrote: >> I'm not sure it's worth cluttering the open() interface with such a >> non-portable option. > Zbyszek wrote: >> If the best-effort fallback is included, it is quite portable. Definitely >> all modern and semi-modern systems support either the atomic or the >> nonatomic methods. > Gregory wrote: >> I'm not excited about raising an exception when it isn't supported; >> it should attempt to get the same behavior via multiple API calls instead > > Yes, I'm proposing the best-effort approach: use O_CLOEXEC/O_NOINHERIT > if available, or fcntl()+FD_CLOEXEC otherwise. > > My patch requires the fcntl() + FD_CLOEXEC flag or the open() + O_NOINHERIT > flag. I failed to find an OS where none of these flags/functions is > present. > > Usually, when I would like to test the portability, I test Linux, > Windows and Mac OS X. Then I test FreeBSD, OpenBSD and OpenIndiana > (Solaris). I don't test other OSes because I don't know them and they > are not installed on my PC :-) (I have many VMs.) > > Should I try other platforms? > > Benjamin wrote: >> People requiring such control should use the low-level os.open interface. > > os.open() is not very convenient because you have to choose the flag > and the functions depending on the OS. If the open() flag is rejected, > we should at least provide a helper for os.open() + fcntl(). > > The myriad cloexec > APIs between different platforms suggest to me that using this > feature requires understanding its various quirks on different > platforms.
> > Victor > > 2013/1/8 Benjamin Peterson : >> 2013/1/7 Gregory P. Smith : >>> >>> >>> >>> On Mon, Jan 7, 2013 at 4:03 PM, Benjamin Peterson >>> wrote: >>>> >>>> 2013/1/7 Victor Stinner : >>>> > Hi, >>>> > >>>> > I would like to add a new flag to open() mode to close the file on exec: >>>> > "e". This feature exists using different APIs depending on the OS and >>>> > OS version: O_CLOEXEC, FD_CLOEXEC and O_NOINHERIT. Do you consider >>>> > that such a flag would be interesting? >>>> >>>> I'm not sure it's worth cluttering the open() interface with such a >>>> non-portable option. People requiring such control should use the >>>> low-level os.open interface. >>> >>> >>> The ability to supply such flags really belongs on _all_ high or low level >>> file descriptor creating APIs so that things like subprocess_cloexec_pipe() >>> would not be necessary: >>> http://hg.python.org/cpython/file/0afa7b323abb/Modules/_posixsubprocess.c#l729 >> >> I think the open() interface should have consistent and >> non-conditional support features to the maximum extent possible. The >> recent addition of "x" is a good example I think. The myriad cloexec >> APIs between different platforms suggest to me that using this >> feature requires understanding its various quirks on different >> platforms. >> >> >> >> -- >> Regards, >> Benjamin From xwl0765 at 126.com Tue Jan 8 04:21:34 2013 From: xwl0765 at 126.com (Xu Wanglin) Date: Tue, 8 Jan 2013 11:21:34 +0800 (CST) Subject: [Python-Dev] make test Message-ID: <611ddba6.204bd.13c182d6892.Coremail.xwl0765@126.com> When I run the test, I got these: ... ... testShareLocal (test.test_socket.TestSocketSharing) ... skipped 'Windows specific' testTypes (test.test_socket.TestSocketSharing) ...
skipped 'Windows specific' ====================================================================== ERROR: test_idna (test.test_socket.GeneralModuleTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "/data/users/xwl/Python-3.3.0/Lib/test/test_socket.py", line 1183, in test_idna socket.gethostbyname('?????????.python.org') socket.gaierror: [Errno -3] Temporary failure in name resolution ---------------------------------------------------------------------- Ran 483 tests in 24.816s FAILED (errors=1, skipped=29) test test_socket failed make: *** [test] Error 1 Then I run the failing test manually, as the README guided: [xwl at ordosz8003 ~/Python-3.3.0]$ ./python -m test -v test_idna == CPython 3.3.0 (default, Jan 8 2013, 11:07:49) [GCC 4.1.2 20080704 (Red Hat 4.1.2-46)] == Linux-2.6.18-164.el5-x86_64-with-redhat-5.4-Tikanga little-endian == /data/users/xwl/Python-3.3.0/build/test_python_17651 Testing with flags: sys.flags(debug=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=0, no_user_site=0, no_site=0, ignore_environment=0, verbose=0, bytes_warning=0, quiet=0, hash_randomization=1) [1/1] test_idna test test_idna crashed -- Traceback (most recent call last): File "/data/users/xwl/Python-3.3.0/Lib/test/regrtest.py", line 1213, in runtest_inner the_package = __import__(abstest, globals(), locals(), []) ImportError: No module named 'test.test_idna' 1 test failed: test_idna [xwl at ordosz8003 ~/Python-3.3.0]$ pwd /data/users/xwl/Python-3.3.0 [xwl at ordosz8003 ~/Python-3.3.0]$ Please help me to install this great software on my machine. Thank you! David.Xu -------------- next part -------------- An HTML attachment was scrubbed... URL: From yury at shurup.com Tue Jan 8 10:28:25 2013 From: yury at shurup.com (Yury V. Zaytsev) Date: Tue, 08 Jan 2013 10:28:25 +0100 Subject: [Python-Dev] Point of building without threads?
In-Reply-To: <20120507214943.579045c2@pitrou.net> References: <20120507214943.579045c2@pitrou.net> Message-ID: <1357637305.2857.54.camel@newpride> On Mon, 2012-05-07 at 21:49 +0200, Antoine Pitrou wrote: > > I guess a long time ago, threading support in operating systems wasn't > very widespread, but these days all our supported platforms have it. > Is it still useful for production purposes to configure > --without-threads? Do people use this option for something else than > curiosity of mind? I hope that the intent behind asking this question was more of being curious, rather than considering dropping --without-threads: unfortunately, multithreading was, still is and probably will remain troublesome on many supercomputing platforms. Often, once a new supercomputer is launched, as a developer you get a half-baked C/C++ compiler with threading support broken to the point when it's much easier to not use it altogether [*] rather than trying to work around the compiler quirks. Of course, the situation improves over the lifetime of each particular computer, but usually, when everything is halfway working, the computer itself becomes obsolete, so there is not much point in using it anymore. Moreover, these days there is a clear trend towards OpenMP, so it has become even harder to pressure the manufacturers to fix threads, because they have 101 arguments why you should port your code to OpenMP instead. HTH. [*]: Other usual candidates for being broken beyond repair are the linker, especially when it comes to shared libraries, and support for advanced C++ language features, such as templates... -- Sincerely yours, Yury V. Zaytsev From solipsis at pitrou.net Tue Jan 8 11:29:39 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 8 Jan 2013 11:29:39 +0100 Subject: [Python-Dev] Point of building without threads?
References: <20120507214943.579045c2@pitrou.net> <1357637305.2857.54.camel@newpride> Message-ID: <20130108112939.64627666@pitrou.net> Le Tue, 08 Jan 2013 10:28:25 +0100, "Yury V. Zaytsev" a écrit : > On Mon, 2012-05-07 at 21:49 +0200, Antoine Pitrou wrote: > > > > I guess a long time ago, threading support in operating systems > > wasn't very widespread, but these days all our supported platforms > > have it. Is it still useful for production purposes to configure > > --without-threads? Do people use this option for something else than > > curiosity of mind? > > I hope that the intent behind asking this question was more of being > curious, rather than considering dropping --without-threads: > unfortunately, multithreading was, still is and probably will remain > troublesome on many supercomputing platforms. I was actually asking this question in the hope that we could perhaps simplify our range of build options (and the corresponding C #define's), but you made a convincing point that we should keep the --without-threads option :-) Thank you Antoine. From victor.stinner at gmail.com Tue Jan 8 12:56:52 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 8 Jan 2013 12:56:52 +0100 Subject: [Python-Dev] Add "e" (close and exec) mode to open() In-Reply-To: References: Message-ID: 2013/1/8 Victor Stinner : > I don't know of a platform without this flag. According to the following email, fcntl.FD_CLOEXEC was not available in Python 2.2 on Red Hat 7.3 (in 2003): http://communities.mentor.com/community/cs/archives/qmtest/msg00501.html I don't know if the constant was not defined in fcntl.h, or if the constant was just not exposed in Python 2.2? Does anyone have such an old version of RedHat to test if fcntl.FD_CLOEXEC is available (on a recent version of Python)? -- In the Python issue #12107, I can read: "I realize this bugreport cannot fix 35 years of a bad design decision in linux." http://bugs.python.org/issue12107 Well... Ruby made a brave choice :-) Ruby (2.0?)
does set the close-on-exec flag on *all* file descriptors (except 0, 1, 2) *by default*: http://bugs.ruby-lang.org/issues/5041 This change solves the problem of having to close all file descriptors after a fork to run a new program (see closed Python issues #11284 and #8052)... if you are not using C extensions creating file descriptors. Ruby applications relying on passing FDs to child processes have to explicitly disable the close-on-exec flag: this was done in Unicorn, for example. -- See also the issue discussing the usage of O_CLOEXEC and SOCK_CLOEXEC in the libapr: https://issues.apache.org/bugzilla/show_bug.cgi?id=46425 Victor From phd at phdru.name Tue Jan 8 13:48:41 2013 From: phd at phdru.name (Oleg Broytman) Date: Tue, 8 Jan 2013 16:48:41 +0400 Subject: [Python-Dev] make test In-Reply-To: <611ddba6.204bd.13c182d6892.Coremail.xwl0765@126.com> References: <611ddba6.204bd.13c182d6892.Coremail.xwl0765@126.com> Message-ID: <20130108124841.GA21788@iskra.aviel.ru> Hello. We are sorry but we cannot help you. This mailing list is to work on developing Python (adding new features to Python itself and fixing bugs); if you're having problems learning, understanding or using Python, please find another forum. Probably python-list/comp.lang.python mailing list/news group is the best place; there are Python developers who participate in it; you may get a faster, and probably more complete, answer there. See http://www.python.org/community/ for other lists/news groups/fora. Thank you for understanding. On Tue, Jan 08, 2013 at 11:21:34AM +0800, Xu Wanglin wrote: > ImportError: No module named 'test.test_idna' Really, there is no module test.test_idna. Look into Lib/test/ yourself. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From rdmurray at bitdance.com Tue Jan 8 14:19:30 2013 From: rdmurray at bitdance.com (R.
David Murray) Date: Tue, 08 Jan 2013 08:19:30 -0500 Subject: [Python-Dev] make test In-Reply-To: <20130108124841.GA21788@iskra.aviel.ru> References: <611ddba6.204bd.13c182d6892.Coremail.xwl0765@126.com> <20130108124841.GA21788@iskra.aviel.ru> Message-ID: <20130108131930.845A72500B3@webabinitio.net> On Tue, 08 Jan 2013 16:48:41 +0400, Oleg Broytman wrote: > Hello. > > We are sorry but we cannot help you. This mailing list is to work on > developing Python (adding new features to Python itself and fixing bugs); > if you're having problems learning, understanding or using Python, please > find another forum. Probably python-list/comp.lang.python mailing list/news > group is the best place; there are Python developers who participate in it; > you may get a faster, and probably more complete, answer there. See > http://www.python.org/community/ for other lists/news groups/fora. Thank > you for understanding. > > On Tue, Jan 08, 2013 at 11:21:34AM +0800, Xu Wanglin wrote: > > ImportError: No module named 'test.test_idna' > > Really, there is no module test.test_idna. Look into Lib/test/ yourself. Xu's confusion arises from the fact that the test is named test_idna, but that is actually the name of the test *within* the test file that is named to the right of that name in the unittest output. There is a bug in the issue tracker for having unittest print out the complete path to the individual test in a cut-and-pasteable fashion. Perhaps someone will be motivated to work on a fix :) --David PS: as long as I'm writing this, Xu, that error you got looks like a transient error possibly caused by your local name server...if that is the only failure you got, your Python is working perfectly fine and you can go ahead and install/use it. But as Oleg said, you will get much better and faster help for this kind of question from python-list. 
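The "complete path" David refers to is what unittest calls the test id: a dotted module.Class.method string that every test case already exposes through the standard `TestCase.id()` method. A toy illustration (the class here is a stand-in for the real one in Lib/test/test_socket.py, not the actual code):

```python
import unittest

class GeneralModuleTests(unittest.TestCase):
    # Stand-in for the real test class in Lib/test/test_socket.py.
    def test_idna(self):
        self.assertTrue(True)

case = GeneralModuleTests("test_idna")
# Yields something like '__main__.GeneralModuleTests.test_idna'.
# For the real suite the prefix would be test.test_socket, which is
# the name to pass to ./python -m test, not the bare method name.
print(case.id())
```

Printing this id in a cut-and-pasteable form after each failure is exactly the feature the tracker issue asks for.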
From trent at snakebite.org Tue Jan 8 15:02:00 2013 From: trent at snakebite.org (Trent Nelson) Date: Tue, 8 Jan 2013 09:02:00 -0500 Subject: [Python-Dev] Point of building without threads? In-Reply-To: <1357637305.2857.54.camel@newpride> References: <20120507214943.579045c2@pitrou.net> <1357637305.2857.54.camel@newpride> Message-ID: <20130108140159.GA31795@snakebite.org> [ Weird, I can't see your original e-mail Antoine; hijacking Yury's reply instead. ] On Tue, Jan 08, 2013 at 01:28:25AM -0800, Yury V. Zaytsev wrote: > On Mon, 2012-05-07 at 21:49 +0200, Antoine Pitrou wrote: > > > > I guess a long time ago, threading support in operating systems wasn't > > very widespread, but these days all our supported platforms have it. > > Is it still useful for production purposes to configure > > --without-threads? Do people use this option for something else than > > curiosity of mind? All our NetBSD, OpenBSD and DragonFlyBSD slaves use --without-thread. Without it, they all wedge in some way or another. (That should be fixed*/investigated, but, until then, yeah, --without-threads allows for a slightly more useful (but still broken) test suite run on these platforms.) [*]: I suspect the problem with at least OpenBSD is that their userland pthreads implementation just doesn't cut it; there is no hope for the really technical tests that poke and prod at things like correct signal handling and whatnot. Trent. From solipsis at pitrou.net Tue Jan 8 15:09:44 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 8 Jan 2013 15:09:44 +0100 Subject: [Python-Dev] Point of building without threads? References: <20120507214943.579045c2@pitrou.net> <1357637305.2857.54.camel@newpride> <20130108140159.GA31795@snakebite.org> Message-ID: <20130108150944.679c07f2@pitrou.net> Le Tue, 8 Jan 2013 09:02:00 -0500, Trent Nelson a écrit : > [ Weird, I can't see your original e-mail Antoine; hijacking > Yury's reply instead.
] The original e-mail is quite old (it was sent in May) :-) Regards Antoine. From stefan at bytereef.org Tue Jan 8 15:15:45 2013 From: stefan at bytereef.org (Stefan Krah) Date: Tue, 8 Jan 2013 15:15:45 +0100 Subject: [Python-Dev] Point of building without threads? In-Reply-To: <20130108140159.GA31795@snakebite.org> References: <20120507214943.579045c2@pitrou.net> <1357637305.2857.54.camel@newpride> <20130108140159.GA31795@snakebite.org> Message-ID: <20130108141545.GA11755@sleipnir.bytereef.org> Trent Nelson wrote: > All our NetBSD, OpenBSD and DragonFlyBSD slaves use --without-thread. > Without it, they all wedge in some way or another. (That should be > fixed*/investigated, but, until then, yeah, --without-threads allows > for a slightly more useful (but still broken) test suite run on > these platforms.) > > [*]: I suspect the problem with at least OpenBSD is that their > userland pthreads implementation just doesn't cut it; there > is no hope for the really technical tests that poke and > prod at things like correct signal handling and whatnot. For OpenBSD the situation should be fixed in the latest release: http://www.openbsd.org/52.html#new I haven't tried it myself though. Stefan Krah From fijall at gmail.com Tue Jan 8 15:49:52 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Tue, 8 Jan 2013 16:49:52 +0200 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> Message-ID: On Mon, Dec 10, 2012 at 3:44 AM, Raymond Hettinger wrote: > The current memory layout for dictionaries is > unnecessarily inefficient. It has a sparse table of > 24-byte entries containing the hash value, key pointer, > and value pointer. > > Instead, the 24-byte entries should be stored in a > dense table referenced by a sparse table of indices. 
> > For example, the dictionary: > > d = {'timmy': 'red', 'barry': 'green', 'guido': 'blue'} > > is currently stored as: > > entries = [['--', '--', '--'], > [-8522787127447073495, 'barry', 'green'], > ['--', '--', '--'], > ['--', '--', '--'], > ['--', '--', '--'], > [-9092791511155847987, 'timmy', 'red'], > ['--', '--', '--'], > [-6480567542315338377, 'guido', 'blue']] > > Instead, the data should be organized as follows: > > indices = [None, 1, None, None, None, 0, None, 2] > entries = [[-9092791511155847987, 'timmy', 'red'], > [-8522787127447073495, 'barry', 'green'], > [-6480567542315338377, 'guido', 'blue']] > > Only the data layout needs to change. The hash table > algorithms would stay the same. All of the current > optimizations would be kept, including key-sharing > dicts and custom lookup functions for string-only > dicts. There is no change to the hash functions, the > table search order, or collision statistics. > > The memory savings are significant (from 30% to 95% > compression depending on the how full the table is). > Small dicts (size 0, 1, or 2) get the most benefit. > > For a sparse table of size t with n entries, the sizes are: > > curr_size = 24 * t > new_size = 24 * n + sizeof(index) * t > > In the above timmy/barry/guido example, the current > size is 192 bytes (eight 24-byte entries) and the new > size is 80 bytes (three 24-byte entries plus eight > 1-byte indices). That gives 58% compression. > > Note, the sizeof(index) can be as small as a single > byte for small dicts, two bytes for bigger dicts and > up to sizeof(Py_ssize_t) for huge dict. > > In addition to space savings, the new memory layout > makes iteration faster. Currently, keys(), values, and > items() loop over the sparse table, skipping-over free > slots in the hash table. Now, keys/values/items can > loop directly over the dense table, using fewer memory > accesses. > > Another benefit is that resizing is faster and > touches fewer pieces of memory. 
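The indices/entries split quoted above can be sketched in a few lines of runnable Python. This is a deliberately simplified illustration (fixed table of eight slots, plain linear probing instead of CPython's perturbed probe sequence, no deletion or resizing); Raymond's linked recipe is the full proof of concept:

```python
def compact_store(d, size=8):
    """Store dict *d* as a sparse index table plus dense entries."""
    indices = [None] * size            # sparse table of small ints
    entries = []                       # dense [hash, key, value] rows
    for key, value in d.items():
        slot = hash(key) % size
        while indices[slot] is not None:
            slot = (slot + 1) % size   # linear probe on collision
        indices[slot] = len(entries)   # slot points at the dense row
        entries.append([hash(key), key, value])
    return indices, entries

def compact_lookup(indices, entries, key):
    """Find *key* by probing indices, comparing in the dense table."""
    size = len(indices)
    slot = hash(key) % size
    while indices[slot] is not None:
        h, k, v = entries[indices[slot]]
        if k == key:
            return v
        slot = (slot + 1) % size
    raise KeyError(key)
```

Iterating over keys/values/items then walks the dense `entries` list directly, which is where the faster-iteration claim comes from.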
Currently, every > hash/key/value entry is moved or copied during a > resize. In the new layout, only the indices are > updated. For the most part, the hash/key/value entries > never move (except for an occasional swap to fill a > hole left by a deletion). > > With the reduced memory footprint, we can also expect > better cache utilization. > > For those wanting to experiment with the design, > there is a pure Python proof-of-concept here: > > http://code.activestate.com/recipes/578375 > > YMMV: Keep in mind that the above size statistics assume a > build with 64-bit Py_ssize_t and 64-bit pointers. The > space savings percentages are a bit different on other > builds. Also, note that in many applications, the size > of the data dominates the size of the container (i.e. > the weight of a bucket of water is mostly the water, > not the bucket). > > > Raymond > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fijall%40gmail.com One question, Raymond. The compression ratios stay true provided you don't overallocate the entry list. If you do overallocate you don't really gain that much (it all depends vastly on details), or even lose in some cases. What do you think the strategy should be? From trent at snakebite.org Tue Jan 8 16:15:11 2013 From: trent at snakebite.org (Trent Nelson) Date: Tue, 8 Jan 2013 10:15:11 -0500 Subject: [Python-Dev] Point of building without threads? In-Reply-To: <20130108141545.GA11755@sleipnir.bytereef.org> References: <20120507214943.579045c2@pitrou.net> <1357637305.2857.54.camel@newpride> <20130108140159.GA31795@snakebite.org> <20130108141545.GA11755@sleipnir.bytereef.org> Message-ID: <20130108151511.GA31988@snakebite.org> On Tue, Jan 08, 2013 at 06:15:45AM -0800, Stefan Krah wrote: > Trent Nelson wrote: > > All our NetBSD, OpenBSD and DragonFlyBSD slaves use --without-thread.
> > Without it, they all wedge in some way or another. (That should be > > fixed*/investigated, but, until then, yeah, --without-threads allows > > for a slightly more useful (but still broken) test suite run on > > these platforms.) > > > > [*]: I suspect the problem with at least OpenBSD is that their > > userland pthreads implementation just doesn't cut it; there > > is no hope for the really technical tests that poke and > > prod at things like correct signal handling and whatnot. > > For OpenBSD the situation should be fixed in the latest release: > > http://www.openbsd.org/52.html#new > > I haven't tried it myself though. Interesting! I'll look into upgrading the existing Snakebite OpenBSD slaves (they're both at 5.1). Trent. From eliben at gmail.com Tue Jan 8 16:19:56 2013 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 8 Jan 2013 07:19:56 -0800 Subject: [Python-Dev] is this the fault of import_fresh_module or pickle? Message-ID: Hello, I'm still having some struggles with the interaction between pickle and import overriding with import_fresh_module. _elementtree.TreeBuilder can't be pickled at this point. When I do this: from test.support import import_fresh_module import pickle P = import_fresh_module('xml.etree.ElementTree', blocked=['_elementtree']) tb = P.TreeBuilder() print(pickle.dumps(tb)) Everything works fine. However, if I add import_fresh_module for the C module: from test.support import import_fresh_module import pickle C = import_fresh_module('xml.etree.ElementTree', fresh=['_elementtree']) P = import_fresh_module('xml.etree.ElementTree', blocked=['_elementtree']) tb = P.TreeBuilder() print(pickle.dumps(tb)) I get an error from pickle.dumps: Traceback (most recent call last): File "mix_c_py_etree.py", line 10, in print(pickle.dumps(tb)) _pickle.PicklingError: Can't pickle : it's not the same object as xml.etree.ElementTree.TreeBuilder Note that I didn't change the executed code sequence. 
All I did was import the C version of ET before the Python version. I was under the impression this had to keep working since P = import_fresh_module uses blocked=['_elementtree'] properly. This interaction only seems to happen with pickle. What's going on here? Can we somehow improve import_fresh_module to avoid this? Perhaps actually deleting previously imported modules with some special keyword flag? Thanks in advance, Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at bytereef.org Tue Jan 8 17:05:50 2013 From: stefan at bytereef.org (Stefan Krah) Date: Tue, 8 Jan 2013 17:05:50 +0100 Subject: [Python-Dev] is this the fault of import_fresh_module or pickle? In-Reply-To: References: Message-ID: <20130108160550.GA13380@sleipnir.bytereef.org> Eli Bendersky wrote: > Everything works fine. However, if I add import_fresh_module for the C module: > > from test.support import import_fresh_module > import pickle > C = import_fresh_module('xml.etree.ElementTree', fresh=['_elementtree']) > P = import_fresh_module('xml.etree.ElementTree', blocked=['_elementtree']) sys.modules still contains the C version at this point, so: sys.modules['xml.etree.ElementTree'] = P > tb = P.TreeBuilder() > print(pickle.dumps(tb)) > This interaction only seems to happen with pickle. What's going on here? Can we > somehow improve import_fresh_module to avoid this? Perhaps actually deleting > previously imported modules with some special keyword flag? pickle always looks up sys.modules['xml.etree.ElementTree']. Perhaps we could improve something, but this requirement is rather special; personally I'm okay with switching sys.modules explicitly in the tests, because that reminds me of what pickle does. 
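Stefan's explanation (pickle resolves a class by name through sys.modules and then insists on object identity) can be reproduced without ElementTree at all. A toy sketch; the module name `fake_etree` is invented purely for illustration:

```python
import pickle
import sys
import types

def make_module(name):
    # A throwaway module with one class, standing in for the two
    # separately imported copies of xml.etree.ElementTree.
    mod = types.ModuleType(name)
    exec("class TreeBuilder:\n    pass", mod.__dict__)
    return mod

first = make_module("fake_etree")
second = make_module("fake_etree")
sys.modules["fake_etree"] = first    # the copy pickle will find

tb = second.TreeBuilder()            # instance from the *other* copy
err = None
try:
    pickle.dumps(tb)
except pickle.PicklingError as exc:
    err = exc                        # "... not the same object as ..."
print(err)

# The explicit switch Stefan mentions makes pickling work again:
sys.modules["fake_etree"] = second
data = pickle.dumps(tb)
del sys.modules["fake_etree"]        # clean up
```

This mirrors the test situation exactly: import_fresh_module can hand back a blocked Python version, but pickle only consults sys.modules, so the two must be made to agree.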
Stefan Krah From benjamin at python.org Tue Jan 8 17:08:02 2013 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 8 Jan 2013 10:08:02 -0600 Subject: [Python-Dev] Add "e" (close and exec) mode to open() In-Reply-To: References: Message-ID: 2013/1/8 Victor Stinner : > Oops, I sent my email too early by mistake (it was not finished). > >> The myriad cloexec >> APIs between different platforms suggest to me that using this >> feature requires understanding its various quirks on different >> platforms. > > Sorry, I don't understand. What do you mean by "various quirks"? The > "close-on-exec" feature is implemented differently depending on the > platform, but it always has the same meaning. It closes the file when > a subprocess is created. Running a subprocess is also implemented > differently depending on the OS; there are two main approaches: > fork()+exec() on UNIX, on Windows (I don't know how > it works on Windows). > > Extract of the fcntl() manual page on Linux: "If the FD_CLOEXEC bit is > 0, the file descriptor will remain open across an execve(2), otherwise > it will be closed." > > I would like to expose the OS feature using a portable API to hide the > "myriad cloexec APIs". Okay, fair enough, but I really would like it not to ever raise NotImplementedError. Then you would end up having different codepaths for various OSes anyway. -- Regards, Benjamin From victor.stinner at gmail.com Tue Jan 8 17:22:34 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 8 Jan 2013 17:22:34 +0100 Subject: [Python-Dev] Add "e" (close and exec) mode to open() In-Reply-To: References: Message-ID: 2013/1/8 Benjamin Peterson : > Okay, fair enough, but I really would like it not to ever raise > NotImplementedError. Then you would end up having different codepaths > for various OSes anyway. So what do you suggest?
Victor From benjamin at python.org Tue Jan 8 17:26:04 2013 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 8 Jan 2013 10:26:04 -0600 Subject: [Python-Dev] Add "e" (close and exec) mode to open() In-Reply-To: References: Message-ID: 2013/1/8 Victor Stinner : > 2013/1/8 Benjamin Peterson : >> Okay, fair enough, but I really would like it not to ever raise >> NotImplementedError. Then you would end up having different codepaths >> for various OSes anyway. > > So what do you suggest? If the only systems it doesn't work on are ancient RedHat, that's probably okay. -- Regards, Benjamin From eliben at gmail.com Tue Jan 8 17:36:04 2013 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 8 Jan 2013 08:36:04 -0800 Subject: [Python-Dev] is this the fault of import_fresh_module or pickle? In-Reply-To: <20130108160550.GA13380@sleipnir.bytereef.org> References: Message-ID: On Tue, Jan 8, 2013 at 8:05 AM, Stefan Krah wrote: > Eli Bendersky wrote: > > Everything works fine. However, if I add import_fresh_module for the C > module: > > > > from test.support import import_fresh_module > > import pickle > > C = import_fresh_module('xml.etree.ElementTree', fresh=['_elementtree']) > > P = import_fresh_module('xml.etree.ElementTree', > blocked=['_elementtree']) > > sys.modules still contains the C version at this point, so: > > sys.modules['xml.etree.ElementTree'] = P > > > > tb = P.TreeBuilder() > > print(pickle.dumps(tb)) > > > This interaction only seems to happen with pickle. What's going on here? Can we > > somehow improve import_fresh_module to avoid this? Perhaps actually deleting > > previously imported modules with some special keyword flag? > > pickle always looks up sys.modules['xml.etree.ElementTree']. Perhaps we > could improve something, but this requirement is rather special; personally > I'm okay with switching sys.modules explicitly in the tests, because that > reminds me of what pickle does.
> Wouldn't it be better if import_fresh_module or some alternative function > could do that for you? I mean, wipe out the import cache for certain modules I > don't want to be found? For a single test, perhaps. ContextAPItests.test_pickle() from test_decimal would look quite strange if import_fresh_module was used repeatedly. Stefan Krah From eliben at gmail.com Tue Jan 8 17:58:46 2013 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 8 Jan 2013 08:58:46 -0800 Subject: [Python-Dev] is this the fault of import_fresh_module or pickle? In-Reply-To: <20130108163759.97031250BC4@webabinitio.net> References: <20130108160550.GA13380@sleipnir.bytereef.org> <20130108163759.97031250BC4@webabinitio.net> Message-ID: On Tue, Jan 8, 2013 at 8:37 AM, R. David Murray wrote: > On Tue, 08 Jan 2013 17:05:50 +0100, Stefan Krah wrote: > > Eli Bendersky wrote: > > > Everything works fine.
At least, I think that would solve it, I haven't tried it :) --David From victor.stinner at gmail.com Tue Jan 8 17:40:28 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 8 Jan 2013 17:40:28 +0100 Subject: [Python-Dev] Add "e" (close and exec) mode to open() In-Reply-To: References: Message-ID: 2013/1/8 Benjamin Peterson : > 2013/1/8 Victor Stinner : >> 2013/1/8 Benjamin Peterson : >>> Okay, fair enough, but I really would like it not to ever raise >>> NotImplementedError. Then you would end up having different codepaths >>> for various oses anyway. >> >> So what do you suggest? > > If the only systems it doesn't work on is ancient RedHat, that's probably okay. What do you mean? NotIlmplementedError is acceptable if only rare and/or old OS raise such issue? > According to the following email, fcntl.FD_CLOEXEC was not available > in Python 2.2 on Red Hat 7.3 (in 2003): > http://communities.mentor.com/community/cs/archives/qmtest/msg00501.html This issue looks like http://bugs.python.org/issue496171 So it looks like the problem was just that the constant was not exposed properly whereas the OS supports the feature. I guess that FD_CLOEXEC always worked on RedHat. Victor From benjamin at python.org Tue Jan 8 17:45:59 2013 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 8 Jan 2013 10:45:59 -0600 Subject: [Python-Dev] Add "e" (close and exec) mode to open() In-Reply-To: References: Message-ID: 2013/1/8 Victor Stinner : > 2013/1/8 Benjamin Peterson : >> 2013/1/8 Victor Stinner : >>> 2013/1/8 Benjamin Peterson : >>>> Okay, fair enough, but I really would like it not to ever raise >>>> NotImplementedError. Then you would end up having different codepaths >>>> for various oses anyway. >>> >>> So what do you suggest? >> >> If the only systems it doesn't work on is ancient RedHat, that's probably okay. > > What do you mean? NotIlmplementedError is acceptable if only rare > and/or old OS raise such issue? We have to draw the line somewhere. 
People writing Python to run on such systems will already have to be aware of such issues. -- Regards, Benjamin From stefan at bytereef.org Tue Jan 8 17:58:29 2013 From: stefan at bytereef.org (Stefan Krah) Date: Tue, 8 Jan 2013 17:58:29 +0100 Subject: [Python-Dev] is this the fault of import_fresh_module or pickle? In-Reply-To: References: <20130108160550.GA13380@sleipnir.bytereef.org> Message-ID: <20130108165829.GA14635@sleipnir.bytereef.org> Eli Bendersky wrote: > On Tue, Jan 8, 2013 at 8:05 AM, Stefan Krah wrote: > > pickle always looks up sys.modules['xml.etree.ElementTree']. Perhaps we > could improve something, but this requirement is rather special; personally > I'm okay with switching sys.modules explicitly in the tests, because that > reminds me of what pickle does. > > > Wouldn't it be better if import_fresh_module or some alternative function > could do that for you? I mean, wipe out the import cache for certain modules I > don't want to be found? For a single test, perhaps. ContextAPItests.test_pickle() from test_decimal would look quite strange if import_fresh_module was used repeatedly. Stefan Krah From eliben at gmail.com Tue Jan 8 17:58:46 2013 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 8 Jan 2013 08:58:46 -0800 Subject: [Python-Dev] is this the fault of import_fresh_module or pickle? In-Reply-To: <20130108163759.97031250BC4@webabinitio.net> References: <20130108160550.GA13380@sleipnir.bytereef.org> <20130108163759.97031250BC4@webabinitio.net> Message-ID: On Tue, Jan 8, 2013 at 8:37 AM, R. David Murray wrote: > On Tue, 08 Jan 2013 17:05:50 +0100, Stefan Krah > wrote: > > Eli Bendersky wrote: > > > Everything works fine.
However, if I add import_fresh_module for the C > module: > > > > > > from test.support import import_fresh_module > > > import pickle > > > C = import_fresh_module('xml.etree.ElementTree', > fresh=['_elementtree']) > > > P = import_fresh_module('xml.etree.ElementTree', > blocked=['_elementtree']) > > > > sys.modules still contains the C version at this point, so: > > > > sys.modules['xml.etree.ElementTree'] = P > > > > > > > tb = P.TreeBuilder() > > > print(pickle.dumps(tb)) > > > > > This interaction only seems to happen with pickle. What's going on > here? Can we > > > somehow improve import_fresh_module to avoid this? Perhaps actually > deleting > > > previously imported modules with some special keyword flag? > > > > pickle always looks up sys.modules['xml.etree.ElementTree']. Perhaps we > > could improve something, but this requirement is rather special; > personally > > I'm okay with switching sys.modules explicitly in the tests, because that > > reminds me of what pickle does. > > Handling this case is why having a context-manager form of > import_fresh_module was suggested earlier in this meta-thread. At > least, I think that would solve it, I haven't tried it :) > Would you mind extracting just this idea into this discussion so we can focus on it here? I personally don't see how making import_fresh_module a context manager will solve things, unless you add some extra functionality to it? AFAIU it doesn't remove modules from sys.modules *before* importing, at this point. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From regebro at gmail.com Tue Jan 8 19:08:08 2013 From: regebro at gmail.com (Lennart Regebro) Date: Tue, 8 Jan 2013 19:08:08 +0100 Subject: [Python-Dev] PEP 431 Time zone support improvements - Update In-Reply-To: References: Message-ID: On Fri, Dec 28, 2012 at 10:12 PM, Ronald Oussoren wrote: > > On 28 Dec, 2012, at 21:23, Lennart Regebro wrote: > > > Happy Holidays! 
Here is the update of PEP 431 with the changes that > emerged after the earlier discussion. > > Why is the new timezone support added in a submodule of datetime? Adding > the new > function and exception to datetime itself wouldn't clutter the API that > much, and datetime > already contains some timezone support (datetime.tzinfo). Putting the API directly into the datetime module does conflict with the new timezone class from Python 3.2. The timezone() function therefore needs to be called something else, or the timezone class must be renamed. Alternative names for the timezone() function is get_timezone(), which has already been rejected, and zoneinfo() which makes it clear that it's only zoneinfo timezones that are relevant. Or the timezone class get's renamed TimeZone (which is more PEP8 anyway). We can allow the timezone() function to take both timezone(offset, [name]) as now, and timezone(name) and return a TimeZone object in the first case and a zoneinfo based timezone in the second case. Or maybe somebody else can come up with more clever options? -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Tue Jan 8 20:02:50 2013 From: brett at python.org (Brett Cannon) Date: Tue, 8 Jan 2013 14:02:50 -0500 Subject: [Python-Dev] PEP 431 Time zone support improvements - Update In-Reply-To: References: Message-ID: On Tue, Jan 8, 2013 at 1:08 PM, Lennart Regebro wrote: > On Fri, Dec 28, 2012 at 10:12 PM, Ronald Oussoren > wrote: >> >> >> On 28 Dec, 2012, at 21:23, Lennart Regebro wrote: >> >> > Happy Holidays! Here is the update of PEP 431 with the changes that >> > emerged after the earlier discussion. >> >> Why is the new timezone support added in a submodule of datetime? Adding >> the new >> function and exception to datetime itself wouldn't clutter the API that >> much, and datetime >> already contains some timezone support (datetime.tzinfo). 
> > > Putting the API directly into the datetime module does conflict with the new > timezone class from Python 3.2. The timezone() function therefore needs to > be called something else, or the timezone class must be renamed. Can't rename the class since that would potentially break code (e.g. if the class can be pickled). > > Alternative names for the timezone() function is get_timezone(), which has > already been rejected, and zoneinfo() which makes it clear that it's only > zoneinfo timezones that are relevant. I'm personally +0 on zoneinfo(), but I don't have a better suggestion. -Brett > Or the timezone class get's renamed TimeZone (which is more PEP8 anyway). > > We can allow the timezone() function to take both timezone(offset, [name]) > as now, and timezone(name) and return a TimeZone object in the first case > and a zoneinfo based timezone in the second case. > > Or maybe somebody else can come up with more clever options? > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/brett%40python.org > From victor.stinner at gmail.com Tue Jan 8 21:16:41 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 8 Jan 2013 21:16:41 +0100 Subject: [Python-Dev] Point of building without threads? In-Reply-To: <20130108151511.GA31988@snakebite.org> References: <20120507214943.579045c2@pitrou.net> <1357637305.2857.54.camel@newpride> <20130108140159.GA31795@snakebite.org> <20130108141545.GA11755@sleipnir.bytereef.org> <20130108151511.GA31988@snakebite.org> Message-ID: 2013/1/8 Trent Nelson : > On Tue, Jan 08, 2013 at 06:15:45AM -0800, Stefan Krah wrote: >> Trent Nelson wrote: >> > All our NetBSD, OpenBSD and DragonFlyBSD slaves use --without-thread. >> > Without it, they all wedge in some way or another. 
(That should be >> > fixed*/investigated, but, until then, yeah, --without-threads allows >> > for a slightly more useful (but still broken) test suite run on >> > these platforms.) >> > >> > [*]: I suspect the problem with at least OpenBSD is that their >> > userland pthreads implementation just doesn't cut it; there >> > is no hope for the really technical tests that poke and >> > prod at things like correct signal handling and whatnot. >> >> For OpenBSD the situation should be fixed in the latest release: >> >> http://www.openbsd.org/52.html#new >> >> I haven't tried it myself though. > > Interesting! I'll look into upgrading the existing Snakebite > OpenBSD slaves (they're both at 5.1). Oooh yes, many bugs have been fixed by the implementation of threads in the kernel (rthreads in OpenBSD 5.2)! Just one recent example: on OpenBSD 4.9, FD_CLOEXEC doesn't work with fork()+exec() whereas it works with exec(); on OpenBSD 5.2, both cases work as expected. http://bugs.python.org/issue16850#msg179294 Victor From yorik.sar at gmail.com Wed Jan 9 02:14:02 2013 From: yorik.sar at gmail.com (Yuriy Taraday) Date: Wed, 9 Jan 2013 05:14:02 +0400 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted Message-ID: Hello. I've read the PEP and some things raise questions in my consciousness. Here they are. 1. Series of sock_ methods can be organized into a wrapper around sock object. This wrappers can then be saved and used later in async-aware code. This way code like: sock = socket(...) # later, e.g. in connect() yield from tulip.get_event_loop().sock_connect(sock, ...) # later, e.g. in read() data = yield from tulip.get_event_loop().sock_recv(sock, ...) will look like: sock = socket(...) async_sock = tulip.get_event_loop().wrap_socket(sock) # later, e.g. in connect() yield from async_sock.connect(...) # later, e.g. in read() data = yield from async_sock.recv(...) Interface looks cleaner while plain calls (if they ever needed) will be only 5 chars longer. 2.
Not as great, but still possible to wrap fd in similar way to make interface simpler. Instead of: add_reader(fd, callback, *args) remove_reader(fd) We can do: wrap_fd(fd).reader = functools.partial(callback, *args) wrap_fd(fd).reader = None # or del wrap_fd(fd).reader 3. Why not use properties (or fields) instead of methods for cancelled, running and done in Future class? I think, it'll be easier to use since I expect such attributes to be accessed as properties. I see it as some javaism since in Java Future have getters for this fields but they are prefixed with 'is'. 4. Why separate exception() from result() for Future class? It does the same as result() but with different interface (return instead of raise). Doesn't this violate the rule "There should be one obvious way to do it"? 5. I think, protocol and transport methods' names are not easy or understanding enough: - write_eof() does not write anything but closes smth, should be close_writing or smth alike; - the same way eof_received() should become smth like receive_closed; - pause() and resume() work with reading only, so they should be suffixed (prefixed) with read(ing), like pause_reading(), resume_reading(). Kind regards, Yuriy. From ncoghlan at gmail.com Wed Jan 9 02:28:21 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 9 Jan 2013 11:28:21 +1000 Subject: [Python-Dev] is this the fault of import_fresh_module or pickle? In-Reply-To: References: <20130108160550.GA13380@sleipnir.bytereef.org> <20130108163759.97031250BC4@webabinitio.net> Message-ID: On Wed, Jan 9, 2013 at 2:58 AM, Eli Bendersky wrote: >> Handling this case is why having a context-manager form of >> import_fresh_module was suggested earlier in this meta-thread. At >> least, I think that would solve it, I haven't tried it :) > > Would you mind extracting just this idea into this discussion so we can > focus on it here?
I personally don't see how making import_fresh_module a > context manager will solve things, unless you add some extra functionality > to it? AFAIU it doesn't remove modules from sys.modules *before* importing, > at this point. Sure it does, that's how it works: the module being imported, as well as anything requested as a "fresh" module is removed from sys.modules, anything requested as a "blocked" module is replaced with None (or maybe 0 - whatever it is that will force ImportError). It then does the requested import and then *reverts all those changes* to sys.modules. It's that last part which is giving you trouble: by the time you run the actual tests, sys.modules has been reverted to its original state, so pickle gets confused when it attempts to look things up by name. Rather than a context manager form of import_fresh_module, what we really want is a "modules_replaced" context manager: @contextmanager def modules_replaced(replacements): _missing = object() saved = {} try: for name, mod in replacements.items(): saved[name] = sys.modules.get(name, _missing) sys.modules[name] = mod yield finally: for name, mod in saved.items(): if mod is _missing: del sys.modules[name] else: sys.modules[name] = mod And a new import_fresh_modules function that is like import_fresh_module, but returns a 2-tuple of the requested module and a mapping of all the affected modules, rather than just the module object. However, there will still be cases where this doesn't work (i.e. modules with import-time side effects that don't support repeated execution), and the pickle and copy global registries are a couple of the places that are notorious for not coping with repeated imports of a module. In that case, the test case will need to figure out what global state is being modified and deal with that specifically. 
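[For reference, the modules_replaced sketch above expands into a runnable, self-contained example. The fake_etree module names below are invented purely for the demonstration; in the real use case the Python ElementTree would be swapped in for the already-cached C version before running pickle round-trips.]

```python
import sys
import types
from contextlib import contextmanager

_missing = object()

@contextmanager
def modules_replaced(replacements):
    """Temporarily install the given {name: module} mapping in sys.modules,
    restoring the original entries (or their absence) on exit."""
    saved = {}
    try:
        for name, mod in replacements.items():
            saved[name] = sys.modules.get(name, _missing)
            sys.modules[name] = mod
        yield
    finally:
        for name, mod in saved.items():
            if mod is _missing:
                del sys.modules[name]
            else:
                sys.modules[name] = mod

# Toy demonstration: 'fake_etree' stands in for a module whose
# accelerated twin is already cached in sys.modules.
real = types.ModuleType('fake_etree')
real.flavor = 'C'
sys.modules['fake_etree'] = real

fake = types.ModuleType('fake_etree')
fake.flavor = 'Python'

with modules_replaced({'fake_etree': fake}):
    # Anything that looks the module up by name (as pickle does when
    # serializing a class) now sees the replacement.
    assert sys.modules['fake_etree'].flavor == 'Python'

# The original binding is restored on exit.
assert sys.modules['fake_etree'].flavor == 'C'
del sys.modules['fake_etree']
```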
(FWIW, this kind of problem is why import_fresh_module is in test.support rather than importlib and the reload builtin became imp.reload in Python 3 - "module level code is executed once per process" is an assumption engrained in most Python developer's brains, and these functions deliberately violate it). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From benjamin at python.org Wed Jan 9 03:07:42 2013 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 8 Jan 2013 20:07:42 -0600 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: 2013/1/8 Yuriy Taraday : > 4. Why separate exception() from result() for Future class? It does the same > as result() but with different interface (return instead of raise). Doesn't > this violate the rule "There should be one obvious way to do it"? I expect that's a copy-and-paste error. exception() will return the exception if one occured. -- Regards, Benjamin From ncoghlan at gmail.com Wed Jan 9 03:23:51 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 9 Jan 2013 12:23:51 +1000 Subject: [Python-Dev] PEP 431 Time zone support improvements - Update In-Reply-To: References: Message-ID: On Wed, Jan 9, 2013 at 5:02 AM, Brett Cannon wrote: > On Tue, Jan 8, 2013 at 1:08 PM, Lennart Regebro wrote: >> Alternative names for the timezone() function is get_timezone(), which has >> already been rejected, and zoneinfo() which makes it clear that it's only >> zoneinfo timezones that are relevant. > > I'm personally +0 on zoneinfo(), but I don't have a better suggestion. zoneinfo() sounds reasonable to me. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Jan 9 03:31:04 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 9 Jan 2013 12:31:04 +1000 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: On Wed, Jan 9, 2013 at 11:14 AM, Yuriy Taraday wrote: > 4. Why separate exception() from result() for Future class? It does the same > as result() but with different interface (return instead of raise). Doesn't > this violate the rule "There should be one obvious way to do it"? The exception() method exists for the same reason that we support both "key in mapping" and raising KeyError from "mapping[key]": sometimes you want "Look Before You Leap", other times you want to let the exception fly. If you want the latter, just call .result() directly, if you want the former, check .exception() first. Regardless, the Future API isn't really being defined in PEP 3156, as it is mostly inheritied from the previously implemented PEP 3148 (http://www.python.org/dev/peps/pep-3148/#future-objects) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Wed Jan 9 03:49:50 2013 From: guido at python.org (Guido van Rossum) Date: Tue, 8 Jan 2013 18:49:50 -0800 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: On Tue, Jan 8, 2013 at 6:07 PM, Benjamin Peterson wrote: > 2013/1/8 Yuriy Taraday : >> 4. Why separate exception() from result() for Future class? It does the same >> as result() but with different interface (return instead of raise). Doesn't >> this violate the rule "There should be one obvious way to do it"? > > I expect that's a copy-and-paste error. exception() will return the > exception if one occured. I don't see the typo. It is as Nick explained. 
-- --Guido van Rossum (python.org/~guido) From benjamin at python.org Wed Jan 9 03:53:07 2013 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 8 Jan 2013 20:53:07 -0600 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: 2013/1/8 Guido van Rossum : > On Tue, Jan 8, 2013 at 6:07 PM, Benjamin Peterson wrote: >> 2013/1/8 Yuriy Taraday : >>> 4. Why separate exception() from result() for Future class? It does the same >>> as result() but with different interface (return instead of raise). Doesn't >>> this violate the rule "There should be one obvious way to do it"? >> >> I expect that's a copy-and-paste error. exception() will return the >> exception if one occured. > > I don't see the typo. It is as Nick explained. PEP 3156 says "exception(). Difference with PEP 3148: This has no timeout argument and does not wait; if the future is not yet done, it raises an exception." I assume it's not supposed to raise. -- Regards, Benjamin From benjamin at python.org Wed Jan 9 04:02:20 2013 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 8 Jan 2013 21:02:20 -0600 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: 2013/1/8 Benjamin Peterson : > 2013/1/8 Guido van Rossum : >> On Tue, Jan 8, 2013 at 6:07 PM, Benjamin Peterson wrote: >>> 2013/1/8 Yuriy Taraday : >>>> 4. Why separate exception() from result() for Future class? It does the same >>>> as result() but with different interface (return instead of raise). Doesn't >>>> this violate the rule "There should be one obvious way to do it"? >>> >>> I expect that's a copy-and-paste error. exception() will return the >>> exception if one occured. >> >> I don't see the typo. It is as Nick explained. > > PEP 3156 says "exception(). Difference with PEP 3148: This has no > timeout argument and does not wait; if the future is not yet done, it > raises an exception." I assume it's not supposed to raise. Oh, I see. 
"it raises an exception" refers to the not-completed-yet exception. Poor reading on my part; never mind. -- Regards, Benjamin From guido at python.org Wed Jan 9 04:06:19 2013 From: guido at python.org (Guido van Rossum) Date: Tue, 8 Jan 2013 19:06:19 -0800 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: On Tue, Jan 8, 2013 at 6:53 PM, Benjamin Peterson wrote: > 2013/1/8 Guido van Rossum : >> On Tue, Jan 8, 2013 at 6:07 PM, Benjamin Peterson wrote: >>> 2013/1/8 Yuriy Taraday : >>>> 4. Why separate exception() from result() for Future class? It does the same >>>> as result() but with different interface (return instead of raise). Doesn't >>>> this violate the rule "There should be one obvious way to do it"? >>> >>> I expect that's a copy-and-paste error. exception() will return the >>> exception if one occured. >> >> I don't see the typo. It is as Nick explained. > > PEP 3156 says "exception(). Difference with PEP 3148: This has no > timeout argument and does not wait; if the future is not yet done, it > raises an exception." I assume it's not supposed to raise. No, actually, in that case it *does* raise an exception, because it means that the caller didn't understand the interface. It *returns* an exception object when the Future is done but the "result" is exceptional. But it *raises* when the Future is not done yet. -- --Guido van Rossum (python.org/~guido) From yorik.sar at gmail.com Wed Jan 9 04:56:17 2013 From: yorik.sar at gmail.com (Yuriy Taraday) Date: Wed, 9 Jan 2013 07:56:17 +0400 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: On Wed, Jan 9, 2013 at 6:31 AM, Nick Coghlan wrote: > On Wed, Jan 9, 2013 at 11:14 AM, Yuriy Taraday > wrote: > > 4. Why separate exception() from result() for Future class? It does the > same > > as result() but with different interface (return instead of raise).
> Doesn't > > this violate the rule "There should be one obvious way to do it"? > > The exception() method exists for the same reason that we support both > "key in mapping" and raising KeyError from "mapping[key]": sometimes > you want "Look Before You Leap", other times you want to let the > exception fly. If you want the latter, just call .result() directly, > if you want the former, check .exception() first. > Ok, I get it now. Thank you for clarifying. > Regardless, the Future API isn't really being defined in PEP 3156, as > it is mostly inheritied from the previously implemented PEP 3148 > (http://www.python.org/dev/peps/pep-3148/#future-objects) > Then #3 and #4 are about PEP 3148. Why was it done this way? Kind regards, Yuriy. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Jan 9 05:31:58 2013 From: guido at python.org (Guido van Rossum) Date: Tue, 8 Jan 2013 20:31:58 -0800 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: On Tue, Jan 8, 2013 at 5:14 PM, Yuriy Taraday wrote: > I've read the PEP and some things raise questions in my consciousness. Here > they are. Thanks! > 1. Series of sock_ methods can be organized into a wrapper around sock > object. This wrappers can then be saved and used later in async-aware code. > This way code like: > > sock = socket(...) > # later, e.g. in connect() > yield from tulip.get_event_loop().sock_connect(sock, ...) > # later, e.g. in read() > data = yield from tulip.get_event_loop().sock_recv(sock, ...) > > will look like: > > sock = socket(...) > async_sock = tulip.get_event_loop().wrap_socket(sock) > # later, e.g. in connect() > yield from async_sock.connect(...) > # later, e.g. in read() > data = yield from async_sock.recv(...) > > Interface looks cleaner while plain calls (if they ever needed) will be only > 5 chars longer. 
This is a semi-internal API that is mostly useful to Transport implementers, and there won't be many of those. So I prefer the API that has the fewest classes. > 2. Not as great, but still possible to wrap fd in similar way to make > interface simpler. Instead of: > > add_reader(fd, callback, *args) > remove_reader(fd) > > We can do: > > wrap_fd(fd).reader = functools.partial(callback, *args) > wrap_fd(fd).reader = None # or > del wrap_fd(fd).reader Ditto. > 3. Why not use properties (or fields) instead of methods for cancelled, > running and done in Future class? I think, it'll be easier to use since I > expect such attributes to be accessed as properties. I see it as some > javaism since in Java Future have getters for this fields but they are > prefixed with 'is'. Too late, this is how PEP 3148 defined it. It was indeed inspired by Java Futures. However I would defend using methods here, since these are not all that cheap -- they have to acquire and release a lock. > 4. Why separate exception() from result() for Future class? It does the same > as result() but with different interface (return instead of raise). Doesn't > this violate the rule "There should be one obvious way to do it"? Because it is quite awkward to check for an exception if you have to catch it (4 lines instead of 1). > 5. I think, protocol and transport methods' names are not easy or > understanding enough: > - write_eof() does not write anything but closes smth, should be > close_writing or smth alike; > - the same way eof_received() should become smth like receive_closed; I am indeed struggling a bit with these names, but "writing an EOF" is actually how I think of this (maybe I am dating myself to the time of mag tapes though :-). > - pause() and resume() work with reading only, so they should be suffixed > (prefixed) with read(ing), like pause_reading(), resume_reading(). Agreed. 
-- --Guido van Rossum (python.org/~guido) From guido at python.org Wed Jan 9 05:50:32 2013 From: guido at python.org (Guido van Rossum) Date: Tue, 8 Jan 2013 20:50:32 -0800 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: On Tue, Jan 8, 2013 at 8:31 PM, Guido van Rossum wrote: > On Tue, Jan 8, 2013 at 5:14 PM, Yuriy Taraday wrote: >> - pause() and resume() work with reading only, so they should be suffixed >> (prefixed) with read(ing), like pause_reading(), resume_reading(). > > Agreed. I think I want to take that back. I think it is more common for a protocol to want to pause the transport (i.e. hold back data_received() calls) than it is for a transport to want to pause the protocol (i.e. hold back write() calls). So the more common method can have a shorter name. Also, pause_reading() is almost confusing, since the protocol's method is named data_received(), not read_data(). Also, there's no reason for the protocol to want to pause the *write* (send) actions of the transport -- if wanted to write less it should not have called write(). The reason to distinguish between the two modes of pausing is because it is sometimes useful to "stack" multiple protocols, and then a protocol in the middle of the stack acts as a transport to the protocol next to it (and vice versa). See the discussion on this list previously, e.g. http://mail.python.org/pipermail/python-ideas/2013-January/018522.html (search for the keyword "stack" in this long message to find the relevant section). 
-- --Guido van Rossum (python.org/~guido) From yorik.sar at gmail.com Wed Jan 9 06:02:23 2013 From: yorik.sar at gmail.com (Yuriy Taraday) Date: Wed, 9 Jan 2013 09:02:23 +0400 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: On Wed, Jan 9, 2013 at 8:31 AM, Guido van Rossum wrote: > On Tue, Jan 8, 2013 at 5:14 PM, Yuriy Taraday wrote: > > I've read the PEP and some things raise questions in my consciousness. > Here > > they are. > > Thanks! > > > 1. Series of sock_ methods can be organized into a wrapper around sock > > object. This wrappers can then be saved and used later in async-aware > code. > > This is a semi-internal API that is mostly useful to Transport > implementers, and there won't be many of those. So I prefer the API > that has the fewest classes. > > > 2. Not as great, but still possible to wrap fd in similar way to make > > interface simpler. > > Ditto. > Ok, I see. Should transports be bound to event loop on creation? I wonder, what would happen if someone changes current event loop between these calls. > > > 3. Why not use properties (or fields) instead of methods for cancelled, > > running and done in Future class? I think, it'll be easier to use since I > > expect such attributes to be accessed as properties. I see it as some > > javaism since in Java Future have getters for this fields but they are > > prefixed with 'is'. > > Too late, this is how PEP 3148 defined it. It was indeed inspired by > Java Futures. However I would defend using methods here, since these > are not all that cheap -- they have to acquire and release a lock. > > I understand why it should be a method, but still if it's a getter, it should have either get_ or is_ prefix. Are there any way to change this with 'Final' PEP? > > 4. Why separate exception() from result() for Future class? It does the > same > > as result() but with different interface (return instead of raise). 
> Doesn't > > this violate the rule "There should be one obvious way to do it"? > > Because it is quite awkward to check for an exception if you have to > catch it (4 lines instead of 1). > > > 5. I think, protocol and transport methods' names are not easy or > > understanding enough: > > - write_eof() does not write anything but closes smth, should be > > close_writing or smth alike; > > - the same way eof_received() should become smth like receive_closed; > > I am indeed struggling a bit with these names, but "writing an EOF" is > actually how I think of this (maybe I am dating myself to the time of > mag tapes though :-). > > I never saw a computer working with a tape, but it's clear to me what does they do. I've just imagined the amount of words I'll have to say to students about EOFs instead of simple "it closes our end of one half of a socket". > - pause() and resume() work with reading only, so they should be suffixed > > (prefixed) with read(ing), like pause_reading(), resume_reading(). > > Agreed. > > -- > --Guido van Rossum (python.org/~guido) > -- Kind regards, Yuriy. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Jan 9 06:14:05 2013 From: guido at python.org (Guido van Rossum) Date: Tue, 8 Jan 2013 21:14:05 -0800 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: On Tue, Jan 8, 2013 at 9:02 PM, Yuriy Taraday wrote: > On Wed, Jan 9, 2013 at 8:31 AM, Guido van Rossum wrote: >> On Tue, Jan 8, 2013 at 5:14 PM, Yuriy Taraday wrote: >> > 1. Series of sock_ methods can be organized into a wrapper around sock >> > object. This wrappers can then be saved and used later in async-aware >> > code. >> >> This is a semi-internal API that is mostly useful to Transport >> implementers, and there won't be many of those. So I prefer the API >> that has the fewest classes. >> >> > 2. 
Not as great, but still possible to wrap fd in similar way to make >> > interface simpler. >> >> Ditto. > > > Ok, I see. > Should transports be bound to event loop on creation? I wonder, what would > happen if someone changes current event loop between these calls. Yes, this is what the transport implementation does. >> > 3. Why not use properties (or fields) instead of methods for cancelled, >> > running and done in Future class? I think, it'll be easier to use since >> > I >> > expect such attributes to be accessed as properties. I see it as some >> > javaism since in Java Future have getters for this fields but they are >> > prefixed with 'is'. >> >> Too late, this is how PEP 3148 defined it. It was indeed inspired by >> Java Futures. However I would defend using methods here, since these >> are not all that cheap -- they have to acquire and release a lock. >> > > I understand why it should be a method, but still if it's a getter, it > should have either get_ or is_ prefix. Why? That's not a universal coding standard. The names seem clear enough to me. > Are there any way to change this with 'Final' PEP? No, the concurrent.futures package has been released (I forget if it was Python 3.2 or 3.3) and we're bound to backwards compatibility. Also I really don't think it's a big deal at all. >> > 4. Why separate exception() from result() for Future class? It does the >> > same >> > as result() but with different interface (return instead of raise). >> > Doesn't >> > this violate the rule "There should be one obvious way to do it"? >> >> Because it is quite awkward to check for an exception if you have to >> catch it (4 lines instead of 1). >> >> >> > 5. 
I think, protocol and transport methods' names are not easy or >> > understanding enough: >> > - write_eof() does not write anything but closes smth, should be >> > close_writing or smth alike; >> > - the same way eof_received() should become smth like receive_closed; >> >> I am indeed struggling a bit with these names, but "writing an EOF" is >> actually how I think of this (maybe I am dating myself to the time of >> mag tapes though :-). >> > I never saw a computer working with a tape, but it's clear to me what does > they do. > I've just imagined the amount of words I'll have to say to students about > EOFs instead of simple "it closes our end of one half of a socket". But which half? A socket is two independent streams, one in each direction. Twisted uses half_close() for this concept but unless you already know what this is for you are left wondering which half. Which is why I like using 'write' in the name. -- --Guido van Rossum (python.org/~guido) From yorik.sar at gmail.com Wed Jan 9 06:26:09 2013 From: yorik.sar at gmail.com (Yuriy Taraday) Date: Wed, 9 Jan 2013 09:26:09 +0400 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: On Wed, Jan 9, 2013 at 9:14 AM, Guido van Rossum wrote: > On Tue, Jan 8, 2013 at 9:02 PM, Yuriy Taraday wrote: > > Should transports be bound to event loop on creation? I wonder, what > would > > happen if someone changes current event loop between these calls. > > Yes, this is what the transport implementation does. > But in theory every sock_ call is independent and returns Future bound to current event loop. So if one change event loop with active transport, nothing bad should happen. Or I'm missing something. > > I understand why it should be a method, but still if it's a getter, it > > should have either get_ or is_ prefix. > > Why? That's not a universal coding standard. The names seem clear enough > to me. 
> When I see (in autocompletion, for example) or remember name like "running", it triggers thought that it's a field. When I remember smth like is_running, it definitely associates with method. > > Are there any way to change this with 'Final' PEP? > > No, the concurrent.futures package has been released (I forget if it > was Python 3.2 or 3.3) and we're bound to backwards compatibility. > Also I really don't think it's a big deal at all. > Yes, not a big deal. > > >> > 5. I think, protocol and transport methods' names are not easy or > >> > understanding enough: > >> > - write_eof() does not write anything but closes smth, should be > >> > close_writing or smth alike; > >> > - the same way eof_received() should become smth like receive_closed; > >> > >> I am indeed struggling a bit with these names, but "writing an EOF" is > >> actually how I think of this (maybe I am dating myself to the time of > >> mag tapes though :-). > >> > > I never saw a computer working with a tape, but it's clear to me what > does > > they do. > > I've just imagined the amount of words I'll have to say to students about > > EOFs instead of simple "it closes our end of one half of a socket". > > But which half? A socket is two independent streams, one in each > direction. Twisted uses half_close() for this concept but unless you > already know what this is for you are left wondering which half. Which > is why I like using 'write' in the name. Yes, 'write' part is good, I should mention it. I meant to say that I won't need to explain that there were days when we had to handle a special marker at the end of file. -- Kind regards, Yuriy. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Wed Jan 9 06:42:30 2013 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Wed, 09 Jan 2013 14:42:30 +0900 Subject: [Python-Dev] [Python-ideas] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: <87k3rmap4p.fsf@uwakimon.sk.tsukuba.ac.jp> Is this thread really ready to migrate to python-dev when we're still bikeshedding method names? Yuriy Taraday writes: > > But which half? A socket is two independent streams, one in each > > direction. Twisted uses half_close() for this concept but unless you > > already know what this is for you are left wondering which half. Which > > is why I like using 'write' in the name. > > Yes, 'write' part is good, I should mention it. I meant to say that I won't > need to explain that there were days when we had to handle a special marker > at the end of file. Mystery is good for students. Getting serious, "close_writer" occurred to me as a possibility.

From guido at python.org Wed Jan 9 07:02:58 2013 From: guido at python.org (Guido van Rossum) Date: Tue, 8 Jan 2013 22:02:58 -0800 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: On Tue, Jan 8, 2013 at 9:26 PM, Yuriy Taraday wrote: > > > > On Wed, Jan 9, 2013 at 9:14 AM, Guido van Rossum wrote: >> >> On Tue, Jan 8, 2013 at 9:02 PM, Yuriy Taraday wrote: >> > Should transports be bound to event loop on creation? I wonder, what >> > would >> > happen if someone changes current event loop between these calls. >> >> Yes, this is what the transport implementation does. > > > But in theory every sock_ call is independent and returns Future bound to > current event loop. It is bound to the event loop whose sock_() method you called. > So if one change event loop with active transport, nothing bad should > happen. Or I'm missing something. Changing event loops in the middle of event processing is not a common (or even useful) pattern. You start the event loop and then leave it alone.
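[Editorial aside: the "write EOF" / half-close semantics debated in this thread map onto the plain-socket shutdown() call. The following is a small illustrative sketch using ordinary sockets — not the PEP's transport API, whose names were still under discussion here — showing that closing only the write direction leaves the read direction usable.]

```python
import socket

a, b = socket.socketpair()
a.sendall(b"request")
a.shutdown(socket.SHUT_WR)   # half-close: done sending, can still receive

received = b""
while True:                  # the peer reads until it sees EOF
    chunk = b.recv(1024)
    if not chunk:            # b"" is the "EOF" that `a` wrote
        break
    received += chunk

b.sendall(b"reply")          # the other direction is still open
b.close()
reply = a.recv(1024)
a.close()
```

[This is exactly the "which half?" distinction: shutdown(SHUT_WR) closes only the a-to-b stream, which is why names containing 'write' were favored.]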
>> > I understand why it should be a method, but still if it's a getter, it >> > should have either get_ or is_ prefix. >> >> Why? That's not a universal coding standard. The names seem clear enough >> to me. > > > When I see (in autocompletion, for example) or remember name like "running", > it triggers thought that it's a field. When I remember smth like is_running, > it definitely associates with method. That must be pretty specific to your personal experience. >> > Are there any way to change this with 'Final' PEP? >> >> No, the concurrent.futures package has been released (I forget if it >> was Python 3.2 or 3.3) and we're bound to backwards compatibility. >> Also I really don't think it's a big deal at all. > > > Yes, not a big deal. >> >> >> >> > 5. I think, protocol and transport methods' names are not easy or >> >> > understanding enough: >> >> > - write_eof() does not write anything but closes smth, should be >> >> > close_writing or smth alike; >> >> > - the same way eof_received() should become smth like receive_closed; >> >> >> >> I am indeed struggling a bit with these names, but "writing an EOF" is >> >> actually how I think of this (maybe I am dating myself to the time of >> >> mag tapes though :-). >> >> >> > I never saw a computer working with a tape, but it's clear to me what >> > does >> > they do. >> > I've just imagined the amount of words I'll have to say to students >> > about >> > EOFs instead of simple "it closes our end of one half of a socket". >> >> But which half? A socket is two independent streams, one in each >> direction. Twisted uses half_close() for this concept but unless you >> already know what this is for you are left wondering which half. Which >> is why I like using 'write' in the name. > > > Yes, 'write' part is good, I should mention it. I meant to say that I won't > need to explain that there were days when we had to handle a special marker > at the end of file.
But even today you have to mark the end somehow, to distinguish it from "not done yet, more could be coming". The equivalent is typing ^D into a UNIX terminal (or ^Z on Windows). -- --Guido van Rossum (python.org/~guido)

From glyph at twistedmatrix.com Wed Jan 9 10:30:43 2013 From: glyph at twistedmatrix.com (Glyph) Date: Wed, 9 Jan 2013 01:30:43 -0800 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: <69E9D1F0-50C3-4F49-998A-3EEB79611C43@twistedmatrix.com> On Jan 8, 2013, at 9:14 PM, Guido van Rossum wrote: > But which half? A socket is two independent streams, one in each > direction. Twisted uses half_close() for this concept but unless you > already know what this is for you are left wondering which half. Which > is why I like using 'write' in the name. I should add, if you don't already know what this means you really shouldn't be trying to do it ;-). -glyph -------------- next part -------------- An HTML attachment was scrubbed... URL:

From solipsis at pitrou.net Wed Jan 9 10:39:11 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 9 Jan 2013 10:39:11 +0100 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted References: Message-ID: <20130109103911.4f599709@pitrou.net> Hi Yuriy, For the record, it isn't necessary to cross-post. python-ideas is the place for discussing this, and most interested people will be subscribed to both python-ideas and python-dev, and therefore they get duplicate messages. Regards Antoine. Le Wed, 9 Jan 2013 05:14:02 +0400, Yuriy Taraday a écrit : > Hello. > I've read the PEP and some things raise questions in my > consciousness. Here they are.
[snip] From yorik.sar at gmail.com Wed Jan 9 11:45:55 2013 From: yorik.sar at gmail.com (Yuriy Taraday) Date: Wed, 9 Jan 2013 14:45:55 +0400 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: On Wed, Jan 9, 2013 at 8:50 AM, Guido van Rossum wrote: > On Tue, Jan 8, 2013 at 8:31 PM, Guido van Rossum wrote: > > On Tue, Jan 8, 2013 at 5:14 PM, Yuriy Taraday > wrote: > >> - pause() and resume() work with reading only, so they should be > suffixed > >> (prefixed) with read(ing), like pause_reading(), resume_reading(). > > > > Agreed. > > I think I want to take that back. I think it is more common for a > protocol to want to pause the transport (i.e. hold back > data_received() calls) than it is for a transport to want to pause the > protocol (i.e. hold back write() calls). So the more common method can > have a shorter name. Also, pause_reading() is almost confusing, since > the protocol's method is named data_received(), not read_data(). Also, > there's no reason for the protocol to want to pause the *write* (send) > actions of the transport -- if wanted to write less it should not have > called write(). The reason to distinguish between the two modes of > pausing is because it is sometimes useful to "stack" multiple > protocols, and then a protocol in the middle of the stack acts as a > transport to the protocol next to it (and vice versa). See the > discussion on this list previously, e.g. > http://mail.python.org/pipermail/python-ideas/2013-January/018522.html > (search for the keyword "stack" in this long message to find the > relevant section). I totally agree with protocol/transport stacking, anyone should be able to do some ugly thing like FTP over SSL over SOCKS over SSL over HTTP (j/k). Just take a look at what you can do with netgraph in *BSD (anything over anything with any number of layers). But still we shouldn't sacrifice ease of understanding (both docs and code) for couple extra chars (10 actually). 
Yes, 'reading' is misleading, pause_receiving and resume_receiving are better. -- Kind regards, Yuriy. -------------- next part -------------- An HTML attachment was scrubbed... URL: From yorik.sar at gmail.com Wed Jan 9 11:55:27 2013 From: yorik.sar at gmail.com (Yuriy Taraday) Date: Wed, 9 Jan 2013 14:55:27 +0400 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: On Wed, Jan 9, 2013 at 10:02 AM, Guido van Rossum wrote: > Changing event loops in the middle of event processing is not a common > (or even useful) pattern. You start the event loop and then leave it > alone. > Yes. It was not-so-great morning idea. > Yes, 'write' part is good, I should mention it. I meant to say that I > won't > > need to explain that there were days when we had to handle a special > marker > > at the end of file. > > But even today you have to mark the end somehow, to distinguish it > from "not done yet, more could be coming". The equivalent is typing ^D > into a UNIX terminal (or ^Z on Windows). My interns told me that they remember EOF as special object only from high school when they had to study Pascal. I guess, in 5 years students won't understand how one can write an EOF. (and schools will finally replace Pascal with Python) -- Kind regards, Yuriy. -------------- next part -------------- An HTML attachment was scrubbed... URL: From yorik.sar at gmail.com Wed Jan 9 12:00:00 2013 From: yorik.sar at gmail.com (Yuriy Taraday) Date: Wed, 9 Jan 2013 15:00:00 +0400 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: <20130109103911.4f599709@pitrou.net> References: <20130109103911.4f599709@pitrou.net> Message-ID: On Wed, Jan 9, 2013 at 1:39 PM, Antoine Pitrou wrote: > > Hi Yuriy, > > For the record, it isn't necessary to cross-post. python-ideas is > the place for discussing this, and most interested people will be > subscribed to both python-ideas and python-dev, and therefore they get > duplicate messages. 
> Oh, sorry. I just found this thread in both MLs, so decided to send it to both. This will be my last email (for now) on this topic at python-dev. -- Kind regards, Yuriy. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Jan 9 12:06:45 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 9 Jan 2013 21:06:45 +1000 Subject: [Python-Dev] [Python-ideas] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: On Wed, Jan 9, 2013 at 8:55 PM, Yuriy Taraday wrote: > My interns told me that they remember EOF as special object only from high > school when they had to study Pascal. I guess, in 5 years students won't > understand how one can write an EOF. (and schools will finally replace > Pascal with Python) Python really doesn't try to avoid the concept of an End-of-file marker. ================ $ python3 Python 3.2.3 (default, Jun 8 2012, 05:36:09) [GCC 4.7.0 20120507 (Red Hat 4.7.0-5)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> quit Use quit() or Ctrl-D (i.e. EOF) to exit >>> import io >>> print(io.FileIO.read.__doc__) read(size: int) -> bytes. read at most size bytes, returned as bytes. Only makes one system call, so less data may be returned than requested In non-blocking mode, returns None if no data is available. On end-of-file, returns ''. ================ Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at gmail.com Wed Jan 9 13:48:49 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 9 Jan 2013 13:48:49 +0100 Subject: [Python-Dev] Set close-on-exec flag by default in SocketServer Message-ID: Hi, The SocketServer class creates a socket to listen on clients, and a new socket per client (only for stream server like TCPServer, not for UDPServer). Until recently (2011-05-24, issue #5715), the listening socket was not closed after fork for the ForkingMixIn flavor. 
This caused two issues: it's a security leak, and it causes "address already in use" error if the server is restarted (see the first message of #12107 for an example with Django). Our implementation of the XMLRPC server has used fcntl() + FD_CLOEXEC to fix this issue for a long time now (2005-12-04, #1222790). IMO it would be safer to always enable close-on-exec flag on all sockets (listening+clients) *by default*. According to Charles-François Natali, it will break applications. Ruby made a similar brave choice one year ago (2011-10-22), not only on sockets but on all file descriptors: http://bugs.ruby-lang.org/issues/5041 Developers of Ruby servers relying on inheriting sockets in child processes have to explicitly disable close-on-exec flag in their server. Unicorn has been fixed for example. My question is: would you accept to break backward compatibility (in Python 3.4) to fix a potential security vulnerability? If not, an alternative is to add an option, disabled by default, to enable (or disable) explicitly close-on-exec in Python 3.4, and wait for 3.5 to enable the option by default. So applications might disable the flag explicitly in Python 3.4. I wrote a patch attached to the issue #12107 which adds a flag to enable or disable close-on-exec, I chose to enable the flag by default: http://bugs.python.org/issue12107 -- I'm not sure that close-on-exec flag must be set on the listening socket *and* on the client sockets. What do you think? -- It would be nice to be able to set close-on-exec flag directly using socket.socket() constructor. For example, an optional cloexec keyword-only argument can be added to socket.socket(family, type, *, cloexec=False).
Victor From senthil at uthcode.com Wed Jan 9 18:12:41 2013 From: senthil at uthcode.com (Senthil Kumaran) Date: Wed, 9 Jan 2013 09:12:41 -0800 Subject: [Python-Dev] Set close-on-exec flag by default in SocketServer In-Reply-To: References: Message-ID: On Wed, Jan 9, 2013 at 4:48 AM, Victor Stinner wrote: > My question is: would you accept to break backward compatibility (in > Python 3.4) to fix a potential security vulnerability? > > If not, an alternative is to add an option, disabled by default, to > enable (or disable) explicitly close-on-exec in Python 3.4, and wait > for 3.5 to enable the option by default. So applications might disable > the flag explicitly in Python 3.4. If the end goal is indeed going to close-on-exec ON by default, then I think having it 3.4 itself is a good idea. OFF for one release just gives the framework developers who use SocketServer some additional time. Usually, I have realized that framework devs try our release candidates and see if they see any potential changes to be done. If they realize this change in their testing, it would be good for both parties. So, my vote. +1 for making that in 3.4 Thank you, Senthil From trent at snakebite.org Wed Jan 9 19:34:44 2013 From: trent at snakebite.org (Trent Nelson) Date: Wed, 9 Jan 2013 13:34:44 -0500 Subject: [Python-Dev] Arena terminology (PyArena vs obmalloc.c:"arena") Message-ID: <20130109183443.GA35512@snakebite.org> There's no correlation between PyArenas and the extensive use of the term "arena" in obmalloc.c, right? I initially assumed there was, based solely on the common use of the term arena. However, after more investigation, it *appears* as though there's absolutely no correlation. Just wanted to make 100% sure. Trent. 
From benjamin at python.org Wed Jan 9 20:27:42 2013 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 9 Jan 2013 13:27:42 -0600 Subject: [Python-Dev] Arena terminology (PyArena vs obmalloc.c:"arena") In-Reply-To: <20130109183443.GA35512@snakebite.org> References: <20130109183443.GA35512@snakebite.org> Message-ID: 2013/1/9 Trent Nelson : > There's no correlation between PyArenas and the extensive use of the > term "arena" in obmalloc.c, right? Correct. -- Regards, Benjamin

From cf.natali at gmail.com Wed Jan 9 22:15:33 2013 From: cf.natali at gmail.com (Charles-François Natali) Date: Wed, 9 Jan 2013 22:15:33 +0100 Subject: [Python-Dev] Set close-on-exec flag by default in SocketServer In-Reply-To: References: Message-ID: > My question is: would you accept to break backward compatibility (in > Python 3.4) to fix a potential security vulnerability? Although obvious, the security implications are not restricted to sockets (yes, it's a contrived example):

"""
# cat test_inherit.py
import fcntl
import os
import pwd
import sys

f = open("/tmp/passwd", 'w+')
#fcntl.fcntl(f.fileno(), fcntl.F_SETFD, fcntl.FD_CLOEXEC)
if os.fork() == 0:
    os.setuid(pwd.getpwnam('nobody').pw_uid)
    os.execv(sys.executable, ['python', '-c', 'import os; os.write(3, b"owned")'])
else:
    os.waitpid(-1, 0)
    f.seek(0)
    print(f.read())
    f.close()

# python test_inherit.py
owned
"""

> I'm not sure that close-on-exec flag must be set on the listening > socket *and* on the client sockets. What do you think? If the listening socket is inherited, it can lead to EADDRINUSE, or the child process "hijacking" new connections (by accept()ing on the same socket).
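[Editorial note: to make the fcntl() + FD_CLOEXEC approach mentioned throughout this thread concrete, here is a minimal sketch — POSIX only, and not the stdlib's exact code — of flagging a listening socket close-on-exec.]

```python
import fcntl
import socket

def set_cloexec(fd):
    """Mark a file descriptor close-on-exec via fcntl (POSIX only)."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
set_cloexec(listener.fileno())

# The flag is visible through F_GETFD: an exec()'d child will not
# inherit this descriptor (a plain fork()ed child still does).
flag_set = bool(fcntl.fcntl(listener.fileno(), fcntl.F_GETFD) & fcntl.FD_CLOEXEC)
listener.close()
```

[Note that the F_GETFD/F_SETFD sequence above is not atomic with the socket's creation, which is part of why the thread discusses API-level support for the flag.]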
As for the client sockets, there's at least one reason to set them close-on-exec: if a second forked process inherits the first process' client socket, even when the first client closes its file descriptor (and exits), the socket won't be closed until the second process exits too: so one long-running child process can delay other child processes' connection shutdown for arbitrarily long.

From victor.stinner at gmail.com Thu Jan 10 01:13:05 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 10 Jan 2013 01:13:05 +0100 Subject: [Python-Dev] Set close-on-exec flag by default in SocketServer In-Reply-To: References: Message-ID: 2013/1/9 Charles-François Natali : >> My question is: would you accept to break backward compatibility (in >> Python 3.4) to fix a potential security vulnerability? > > Although obvious, the security implications are not restricted to > sockets (yes, it's a contrived example): > ... > f = open("/tmp/passwd", 'w+') > if os.fork() == 0: > ... Do you mean that we should provide a way to enable close-on-exec flag by default on all file descriptors? Adding an "e" flag to open() mode is not compatible with this requirement: we would need the opposite flag to disable explicitly close-on-exec. A better API is maybe to add a "cloexec" argument to open(), socket.socket(), pipe(), etc. It's easier to choose the default value of this argument: * False: backward compatible * True: backward incompatible * value retrieved from a global option, ex: sys.getdefaultcloexec() So a simple call to "sys.setdefaultcloexec(True)" would make your program secure and less error-prone, globally, without having to break backward compatibility.
Example without setting cloexec globally:

f = open("/tmp/passwd", "w+", cloexec=True)
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM, cloexec=True)
rfd, wfd = os.pipe(cloexec=True)

-- "os.pipe(cloexec=True)" doesn't respect the 1-to-1 "rule" of OS functions exposed in the Python module "os", because it may use pipe2() with O_CLOEXEC internally, whereas os.pipe2() does exist... ... But other Python functions of the os module use a different C function depending on the option. Examples: - os.open() uses open(), or openat() if dir_fd is set. - os.utime() uses utimensat(), utime(), utimes(), futimens(), futimes(), futimesat(), ... Victor

From victor.stinner at gmail.com Thu Jan 10 03:16:11 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 10 Jan 2013 03:16:11 +0100 Subject: [Python-Dev] Set close-on-exec flag by default in SocketServer In-Reply-To: References: Message-ID: 2013/1/10 Victor Stinner : > A better API is maybe to add a "cloexec" argument to open(), ... I realized that setting the close-on-exec flag is a non-trivial problem. There are many ways to set it depending on the function, on the OS, and on the OS version. There is also the difficult question of the "default value". I started to work on a PEP: https://bitbucket.org/haypo/misc/src/tip/python/pep_cloexec.rst Contact me if you would like to contribute. Victor

From neologix at free.fr Thu Jan 10 09:06:25 2013 From: neologix at free.fr (Charles-François Natali) Date: Thu, 10 Jan 2013 09:06:25 +0100 Subject: [Python-Dev] Set close-on-exec flag by default in SocketServer In-Reply-To: References: Message-ID: > The SocketServer class creates a socket to listen on clients, and a > new socket per client (only for stream server like TCPServer, not for > UDPServer). > > Until recently (2011-05-24, issue #5715), the listening socket was not > closed after fork for the ForkingMixIn flavor.
This caused two issues: > it's a security leak, and it causes "address already in use" error if > the server is restarted (see the first message of #12107 for an > example with Django). Note that the server socket is actually still not closed in the child process: once this gets fixed, setting FD_CLOEXEC will not be useful anymore (but it would be an extra security measure if it could be done atomically, especially against race conditions in multi-threaded applications). (Same thing for the client socket, which is actually already closed in the parent process). As for the backward compatibility issue, here's a thought: subprocess was changed in 3.2 to close all FDs > 2 in the child process by default. AFAICT, we didn't get a single report complaining about this behavior change. OTOH, we did get numerous bug reports due to FDs inherited by subprocesses before that change. (I know that Python >= 3.2 is less widespread than its predecessors, but still).

From solipsis at pitrou.net Thu Jan 10 10:37:51 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 10 Jan 2013 10:37:51 +0100 Subject: [Python-Dev] fork or exec? References: Message-ID: <20130110103751.523ddb0a@pitrou.net> Hello, Le Wed, 9 Jan 2013 13:48:49 +0100, Victor Stinner a écrit : > > Until recently (2011-05-24, issue #5715), the listening socket was not > closed after fork for the ForkingMixIn flavor. This caused two issues: > it's a security leak, and it causes "address already in use" error if > the server is restarted (see the first message of #12107 for an > example with Django). > [...] > > I wrote a patch attached to the issue #12107 which adds a flag to > enable or disable close-on-exec, I chose to enable the flag by > default: > http://bugs.python.org/issue12107 So, I read your e-mail again and I'm wondering if you're making a logic error, or if I'm misunderstanding something: 1.
first you're talking about duplicate file or socket objects after *fork()* (which is an issue I agree is quite annoying) 2. the solution you're proposing doesn't close the file descriptors after fork() but after *exec()*. Basically the solution doesn't address the problem. Many fork() calls aren't followed by an exec() call (multiprocessing comes to mind). On the other hand, the one widespread user of exec() after fork() in the stdlib, namely subprocess, *already* closes file descriptors by default, so the exec() issue doesn't really exist anymore for us (or is at least quite exotic). Regards Antoine.

From neologix at free.fr Thu Jan 10 11:35:29 2013 From: neologix at free.fr (Charles-François Natali) Date: Thu, 10 Jan 2013 11:35:29 +0100 Subject: [Python-Dev] fork or exec? In-Reply-To: <20130110103751.523ddb0a@pitrou.net> References: <20130110103751.523ddb0a@pitrou.net> Message-ID: > So, I read your e-mail again and I'm wondering if you're making a logic > error, or if I'm misunderstanding something: > > 1. first you're talking about duplicate file or socket objects after > *fork()* (which is an issue I agree is quite annoying) > > 2. the solution you're proposing doesn't close the file descriptors > after fork() but after *exec()*. > > Basically the solution doesn't address the problem. Many fork() calls > aren't followed by an exec() call (multiprocessing comes to mind). Yes. In this specific case, the proper solution is to close the server socket right after fork() in the child process. We can't do anything about file descriptors inherited upon fork() (and shouldn't do anything of course, except on an individual basis like this socket server example). On the other hand, setting file descriptors close-on-exec has the advantage of avoiding file descriptor inheritance to spawned (fork()+exec()) child processes, which, in 99% of cases, don't need them (apart from stdin/stdout/stderr).
Not only can this cause subtle bugs (socket/file not being closed when the parent closes the file descriptor, deadlocks, there are several such examples in the bug tracker), but also a security issue, because contrary to a fork()ed process, which runs code controlled by the library/user, after exec() you might be running arbitrary code. Let's take the example of CGIHTTPServer:

"""
# Child
try:
    try:
        os.setuid(nobody)
    except os.error:
        pass
    os.dup2(self.rfile.fileno(), 0)
    os.dup2(self.wfile.fileno(), 1)
    os.execve(scriptfile, args, env)
"""

The code tries to execute a CGI script as user "nobody" to minimize privilege, but if the current process has a sensitive file opened, the file descriptor will be leaked to the CGI script, which can do anything with it. In short, close-on-exec can solve a whole class of problems (but does not really apply to this specific case). > On the other hand, the one widespread user of exec() after fork() in the > stdlib, namely subprocess, *already* closes file descriptors by > default, so the exec() issue doesn't really exist anymore for us (or > is at least quite exotic). See the above example. There can be valid reasons to use fork()+exec() instead of subprocess. Disclaimer: I'm not saying we should be changing all FDs to close-on-exec by default like Ruby did, I'm just saying that there's a real problem.

From solipsis at pitrou.net Thu Jan 10 11:48:51 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 10 Jan 2013 11:48:51 +0100 Subject: [Python-Dev] fork or exec? References: <20130110103751.523ddb0a@pitrou.net> Message-ID: <20130110114851.2e407871@pitrou.net> Le Thu, 10 Jan 2013 11:35:29 +0100, Charles-François Natali a écrit : > > So, I read your e-mail again and I'm wondering if you're making a > > logic error, or if I'm misunderstanding something: > > > > 1. first you're talking about duplicate file or socket objects after > > *fork()* (which is an issue I agree is quite annoying) > > > > 2.
the solution you're proposing doesn't close the file descriptors > > after fork() but after *exec()*. > > > > Basically the solution doesn't address the problem. Many fork() > > calls aren't followed by an exec() call (multiprocessing comes to > > mind). > > Yes. > In this specific case, the proper solution is to close the server > socket right after fork() in the child process. > > We can't do anything about file descriptors inherited upon fork() (and > shouldn't do anything of course, except on an individual basis like > this socket server example). Having an official afterfork facility would still help. I currently rely on multiprocessing.util.register_after_fork(), even though it's an undocumented API (and, of course, it works only for those child processes launched by multiprocessing, not other fork() uses). Regards Antoine.

From victor.stinner at gmail.com Thu Jan 10 12:13:27 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 10 Jan 2013 12:13:27 +0100 Subject: [Python-Dev] fork or exec? In-Reply-To: <20130110114851.2e407871@pitrou.net> References: <20130110103751.523ddb0a@pitrou.net> <20130110114851.2e407871@pitrou.net> Message-ID: > 2. the solution you're proposing doesn't close the file descriptors > after fork() but after *exec()*. Exact. My PEP cannot solve all issues :-) I will update it to make it more explicit. > Having an official afterfork facility would still help. Yes. IMO it is a different topic: you can also call exec() without calling fork() first ;-) Victor 2013/1/10 Antoine Pitrou : > Le Thu, 10 Jan 2013 11:35:29 +0100, > Charles-François Natali a écrit : >> > So, I read your e-mail again and I'm wondering if you're making a >> > logic error, or if I'm misunderstanding something: >> > >> > 1. first you're talking about duplicate file or socket objects after >> > *fork()* (which is an issue I agree is quite annoying) >> > >> > 2.
the solution you're proposing doesn't close the file descriptors >> > after fork() but after *exec()*. >> > >> > Basically the solution doesn't address the problem. Many fork() >> > calls aren't followed by an exec() call (multiprocessing comes to >> > mind). >> >> Yes. >> In this specific case, the proper solution is to close the server >> socket right after fork() in the child process. >> >> We can't do anything about file descriptors inherited upon fork() (and >> shouldn't do anything of course, except on an individual basis like >> this socket server example). > > Having an official afterfork facility would still help. I currently > rely on multiprocessing.util.register_after_fork(), even though it's an > undocumented API (and, of course, it works only for those child > processes launched by multiprocessing, not other fork() uses). > > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com From christian at python.org Thu Jan 10 12:51:00 2013 From: christian at python.org (Christian Heimes) Date: Thu, 10 Jan 2013 12:51:00 +0100 Subject: [Python-Dev] fork or exec? In-Reply-To: <20130110114851.2e407871@pitrou.net> References: <20130110103751.523ddb0a@pitrou.net> <20130110114851.2e407871@pitrou.net> Message-ID: <50EEAB24.70205@python.org> Am 10.01.2013 11:48, schrieb Antoine Pitrou: > Having an official afterfork facility would still help. I currently > rely on multiprocessing.util.register_after_fork(), even though it's an > undocumented API (and, of course, it works only for those child > processes launched by multiprocessing, not other fork() uses). 
The time machine strikes again: http://bugs.python.org/issue16500 From victor.stinner at gmail.com Thu Jan 10 12:59:02 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 10 Jan 2013 12:59:02 +0100 Subject: [Python-Dev] fork or exec? In-Reply-To: References: <20130110103751.523ddb0a@pitrou.net> Message-ID: 2013/1/10 Charles-Fran?ois Natali : > Disclaimer: I'm not saying we should be changing all FDs to > close-on-exec by default like Ruby did, I'm just saying that there's a > real problem. I changed my mind, the PEP does not propose to change the *default* behaviour (don't set close-on-exec by default). But the PEP proposes to *add a function* to change the default behaviour. Application developers taking care of security can set close-on-exec by default, but they will have maybe to fix bugs (add cloexec=False argument, call os.set_cloexec(fd, True)) because something may expect an inheried file descriptor. IMO it is a smoother transition than forcing all developers to fix their application which may make developers think that Python 3.4 is buggy, and slow down the transition from Python 3.3 to 3.4. For Ruby: the change will only be effective in the next major version of Ruby, which may be the expected "Ruby 2.0". Victor From solipsis at pitrou.net Thu Jan 10 13:52:41 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 10 Jan 2013 13:52:41 +0100 Subject: [Python-Dev] fork or exec? References: <20130110103751.523ddb0a@pitrou.net> Message-ID: <20130110135241.088a6a24@pitrou.net> Le Thu, 10 Jan 2013 12:59:02 +0100, Victor Stinner a ?crit : > 2013/1/10 Charles-Fran?ois Natali : > > Disclaimer: I'm not saying we should be changing all FDs to > > close-on-exec by default like Ruby did, I'm just saying that > > there's a real problem. > > I changed my mind, the PEP does not propose to change the *default* > behaviour (don't set close-on-exec by default). > > But the PEP proposes to *add a function* to change the default > behaviour. 
Application developers taking care of security can set > close-on-exec by default, but they may have to fix bugs (add > a cloexec=False argument, call os.set_cloexec(fd, True)) because > something may expect an inherited file descriptor. Do you have an example of what that "something" may be? Apart from standard streams, I can't think of any inherited file descriptor an external program would want to rely on. In other words, I think close-on-exec by default is probably a reasonable decision. Regards Antoine. From yury at shurup.com Tue Jan 8 22:01:03 2013 From: yury at shurup.com (Yury V. Zaytsev) Date: Tue, 08 Jan 2013 22:01:03 +0100 Subject: [Python-Dev] Point of building without threads? In-Reply-To: <20130108150944.679c07f2@pitrou.net> References: <20120507214943.579045c2@pitrou.net> <1357637305.2857.54.camel@newpride> <20130108140159.GA31795@snakebite.org> <20130108150944.679c07f2@pitrou.net> Message-ID: <1357678863.2605.50.camel@newpride> On Tue, 2013-01-08 at 15:09 +0100, Antoine Pitrou wrote: > > The original e-mail is quite old (it was sent in May) :-) I'm sorry about that! I've just stumbled upon this thread and got scared that --without-threads might be going away soon... We've just been porting Python to a new supercomputer, so that we can use the Python interface to our simulation software and, at the moment, this is the only way we can get it to work due to the deficiencies of the available software stack (which will hopefully be fixed in the future). Anyways, the point is that it's a recurrent problem, and we hit it every single time with each new machine, so I hope you can understand why I decided to post, even though the original message appeared back in May. -- Sincerely yours, Yury V. Zaytsev From jstpierre at mecheye.net Wed Jan 9 06:59:39 2013 From: jstpierre at mecheye.net (Jasper St.
Pierre) Date: Wed, 9 Jan 2013 00:59:39 -0500 Subject: [Python-Dev] [Python-ideas] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: <87k3rmap4p.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87k3rmap4p.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: Well, if we're at the "bikeshedding about names" stage, that means that no serious issues with the proposal are left. So it's a sign of progress. On Wed, Jan 9, 2013 at 12:42 AM, Stephen J. Turnbull wrote: > Is this thread really ready to migrate to python-dev when we're still > bikeshedding method names? > > Yuriy Taraday writes: > > > > But which half? A socket is two independent streams, one in each > > > direction. Twisted uses half_close() for this concept but unless you > > > already know what this is for you are left wondering which half. Which > > > is why I like using 'write' in the name. > > > > Yes, 'write' part is good, I should mention it. I meant to say that I > won't > > need to explain that there were days when we had to handle a special > marker > > at the end of file. > > Mystery is good for students. > > Getting serious, "close_writer" occurred to me as a possibility. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Jasper -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Jan 10 14:54:29 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 10 Jan 2013 23:54:29 +1000 Subject: [Python-Dev] Point of building without threads? In-Reply-To: <1357678863.2605.50.camel@newpride> References: <20120507214943.579045c2@pitrou.net> <1357637305.2857.54.camel@newpride> <20130108140159.GA31795@snakebite.org> <20130108150944.679c07f2@pitrou.net> <1357678863.2605.50.camel@newpride> Message-ID: On Wed, Jan 9, 2013 at 7:01 AM, Yury V.
Zaytsev wrote: > Anyways, the point is that it's a recurrent problem, and we hit it every > single time with each new machine, so I hope you can understand why I > decided to post, even though the original message appeared back in May. Thank you for speaking up. Easier porting to new platforms is certainly one of the reasons we keep that capability, and it's helpful to have it directly confirmed that there *are* users out there that benefit from it :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tseaver at palladion.com Thu Jan 10 18:47:23 2013 From: tseaver at palladion.com (Tres Seaver) Date: Thu, 10 Jan 2013 12:47:23 -0500 Subject: [Python-Dev] fork or exec? In-Reply-To: <20130110135241.088a6a24@pitrou.net> References: <20130110103751.523ddb0a@pitrou.net> <20130110135241.088a6a24@pitrou.net> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 01/10/2013 07:52 AM, Antoine Pitrou wrote: > Le Thu, 10 Jan 2013 12:59:02 +0100, Victor Stinner > a ?crit : > >> 2013/1/10 Charles-Fran?ois Natali : >>> Disclaimer: I'm not saying we should be changing all FDs to >>> close-on-exec by default like Ruby did, I'm just saying that >>> there's a real problem. >> >> I changed my mind, the PEP does not propose to change the *default* >> behaviour (don't set close-on-exec by default). >> >> But the PEP proposes to *add a function* to change the default >> behaviour. Application developers taking care of security can set >> close-on-exec by default, but they will have maybe to fix bugs (add >> cloexec=False argument, call os.set_cloexec(fd, True)) because >> something may expect an inheried file descriptor. > > Do you have an example of what that "something" may be? Apart from > standard streams, I can't think of any inherited file descriptor an > external program would want to rely on. > > In other words, I think close-on-exec by default is probably a > reasonable decision. Why would we wander away from POSIX semantics here? 
There are good reasons not to close open descriptors (the 'pipe()' syscall, for instance), and there is no POSIXy way to ask for them *not* to be closed. Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iEYEARECAAYFAlDu/qsACgkQ+gerLs4ltQ5zxACgu9+OcQx9iMgm9YosUhausoIo orwAoK5NNzGDkC0MKwR8oRDCYTz3zNl4 =TPKH -----END PGP SIGNATURE----- From solipsis at pitrou.net Thu Jan 10 19:30:13 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 10 Jan 2013 19:30:13 +0100 Subject: [Python-Dev] fork or exec? References: <20130110103751.523ddb0a@pitrou.net> <20130110135241.088a6a24@pitrou.net> Message-ID: <20130110193013.5323e161@pitrou.net> On Thu, 10 Jan 2013 12:47:23 -0500 Tres Seaver wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 01/10/2013 07:52 AM, Antoine Pitrou wrote: > > Le Thu, 10 Jan 2013 12:59:02 +0100, Victor Stinner > > a ?crit : > > > >> 2013/1/10 Charles-Fran?ois Natali : > >>> Disclaimer: I'm not saying we should be changing all FDs to > >>> close-on-exec by default like Ruby did, I'm just saying that > >>> there's a real problem. > >> > >> I changed my mind, the PEP does not propose to change the *default* > >> behaviour (don't set close-on-exec by default). > >> > >> But the PEP proposes to *add a function* to change the default > >> behaviour. Application developers taking care of security can set > >> close-on-exec by default, but they will have maybe to fix bugs (add > >> cloexec=False argument, call os.set_cloexec(fd, True)) because > >> something may expect an inheried file descriptor. > > > > Do you have an example of what that "something" may be? 
Apart from > > standard streams, I can't think of any inherited file descriptor an > > external program would want to rely on. > > > > In other words, I think close-on-exec by default is probably a > > reasonable decision. > > Why would we wander away from POSIX semantics here? There are good > reasons not to close open descriptors (the 'pipe()' syscall, for > instance), and there is no POSIXy way to ask for them *not* to be closed. Because Python is not POSIX. (and POSIX made mistakes anyway) Regards Antoine. From tseaver at palladion.com Thu Jan 10 20:16:18 2013 From: tseaver at palladion.com (Tres Seaver) Date: Thu, 10 Jan 2013 14:16:18 -0500 Subject: [Python-Dev] fork or exec? In-Reply-To: <20130110193013.5323e161@pitrou.net> References: <20130110103751.523ddb0a@pitrou.net> <20130110135241.088a6a24@pitrou.net> <20130110193013.5323e161@pitrou.net> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 01/10/2013 01:30 PM, Antoine Pitrou wrote: > On Thu, 10 Jan 2013 12:47:23 -0500 Tres Seaver > wrote: >> Why would we wander away from POSIX semantics here? There are good >> reasons not to close open descriptors (the 'pipe()' syscall, for >> instance), and there is no POSIXy way to ask for them *not* to be >> closed. > > Because Python is not POSIX. (and POSIX made mistakes anyway) Anybody doing systems programming with fork() should expect to be in POSIX land, and would be surprised not to have the semantics it implies. Tres.
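Tres's pipe() point, spelled out as a sketch of classic POSIX descriptor inheritance (note the historical wrinkle: ever since PEP 446 landed in 3.4, os.pipe() descriptors are created non-inheritable, but the os.dup2() copy onto fd 1 is inheritable again, so the example still behaves the POSIX way):

```python
import os

# The child's stdout is a pipe created by the parent and carried
# across exec() -- exactly the inheritance being discussed.
r, w = os.pipe()
pid = os.fork()
if pid == 0:                            # child
    os.close(r)
    os.dup2(w, 1)                       # stdout -> pipe; the dup2() copy is inheritable
    os.execvp("echo", ["echo", "hello"])
else:                                   # parent
    os.close(w)
    with os.fdopen(r) as pipe_end:
        print(pipe_end.read().strip())  # -> hello
    os.waitpid(pid, 0)
```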
- -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iEYEARECAAYFAlDvE4IACgkQ+gerLs4ltQ4gwwCZAbj6cPhTWahGam7bbr8KgcY4 IY8An1yH6YOsSvVoV38l4Ka2pcJUFzDA =gbOD -----END PGP SIGNATURE----- From victor.stinner at gmail.com Thu Jan 10 20:21:19 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 10 Jan 2013 20:21:19 +0100 Subject: [Python-Dev] fork or exec? In-Reply-To: <20130110135241.088a6a24@pitrou.net> References: <20130110103751.523ddb0a@pitrou.net> <20130110135241.088a6a24@pitrou.net> Message-ID: 2013/1/10 Antoine Pitrou : >> I changed my mind, the PEP does not propose to change the *default* >> behaviour (don't set close-on-exec by default). >> >> But the PEP proposes to *add a function* to change the default >> behaviour. Application developers taking care of security can set >> close-on-exec by default, but they may have to fix bugs (add >> a cloexec=False argument, call os.set_cloexec(fd, True)) because >> something may expect an inherited file descriptor. > > Do you have an example of what that "something" may be? > Apart from standard streams, I can't think of any inherited file > descriptor an external program would want to rely on. > > In other words, I think close-on-exec by default is probably a > reasonable decision. Network servers like inetd or the apache MPM (prefork) use a process listening on a socket, and then fork to execute a request in a child process. I don't know how it works exactly, but I guess that the child process needs a socket from the parent to send the answer to the client.
If the socket is closed on exec() (ex: Apache with CGI), it does not work :-) Example with CGIHTTPRequestHandler.run_cgi(), self.connection is the socket coming from accept(): self.rfile = self.connection.makefile('rb', self.rbufsize) self.wfile = self.connection.makefile('wb', self.wbufsize) ... try: os.setuid(nobody) except OSError: pass os.dup2(self.rfile.fileno(), 0) os.dup2(self.wfile.fileno(), 1) os.execve(scriptfile, args, env) We are talking about *one* specific socket. So if the close-on-exec flag is set on all file descriptors, it should be easy to only unset it on one specific file descriptor. For example, the fix for StreamRequestHandler.setup() would be simple, just add cloexec=False to dup2: os.dup2(self.rfile.fileno(), 0, cloexec=False) os.dup2(self.wfile.fileno(), 1, cloexec=False) Such servers are not the common case; they are the exception. So setting close-on-exec by default makes sense. I don't know if there are other use cases using fork()+exec() and *inheriting* one or more file descriptors. -- On Windows, the CGI server uses a classic PIPE to get the output of the subprocess and then copies data from the pipe into the socket. Victor From cf.natali at gmail.com Thu Jan 10 20:55:56 2013 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Thu, 10 Jan 2013 20:55:56 +0100 Subject: [Python-Dev] fork or exec? In-Reply-To: References: <20130110103751.523ddb0a@pitrou.net> <20130110135241.088a6a24@pitrou.net> Message-ID: > Network servers like inetd or the apache MPM (prefork) use a process > listening on a socket, and then fork to execute a request in a child > process. I don't know how it works exactly, but I guess that the child > process needs a socket from the parent to send the answer to the > client. If the socket is closed on exec() (ex: Apache with CGI), it > does not work :-) Yes, but the above (setting close-on-exec by default) would *not* apply to stdin, stdout and stderr.
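A concrete sketch of the two halves of Victor's point: the CGI-style child only needs fds 0-2 to survive exec(), and any extra descriptor can be kept alive by clearing its close-on-exec flag explicitly (fcntl() is shown here as the portable spelling, since the cloexec= keyword is only a proposal at this point):

```python
import fcntl
import os

def clear_cloexec(fd):
    # Let one specific fd survive exec() even under a
    # close-on-exec-by-default policy.
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags & ~fcntl.FD_CLOEXEC)

def exec_cgi(conn_fd, script, args, env):
    # Child side of the CGI pattern: the accepted socket becomes
    # stdin/stdout, which are exempt from the proposed default.
    os.dup2(conn_fd, 0)
    os.dup2(conn_fd, 1)
    os.execve(script, args, env)
```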
inetd servers use dup2(socket, 0); dup2(socket, 1); dup2(socket, 2) before forking, so it would still work. > Example with CGIHTTPRequestHandler.run_cgi(), self.connection is the > socket coming from accept(): > > self.rfile = self.connection.makefile('rb', self.rbufsize) > self.wfile = self.connection.makefile('wb', self.wbufsize) > ... > try: > os.setuid(nobody) > except OSError: > pass > os.dup2(self.rfile.fileno(), 0) > os.dup2(self.wfile.fileno(), 1) > os.execve(scriptfile, args, env) Same thing here. And the same thing holds for shell-type pipelines: you're always using stdin, stdout or stderr. > Do you have an example of what that "something" may be? > Apart from standard streams, I can't think of any inherited file > descriptor an external program would want to rely on. Indeed, it should be really rare. There are far more programs that are bitten by FD inheritance upon exec than programs relying on it, and whereas failures and security issues in the first category are hard to debug and unpredictable (especially in a multi-threaded program), a program relying on a FD that would be closed will fail immediately with EBADF, and so could be updated quickly and easily. > In other words, I think close-on-exec by default is probably a > reasonable decision. close-on-exec should probably have been the default in Unix, and is a much saner option. The only question is whether we're willing to take the risk of breaking an admittedly small handful of applications to avoid a whole class of hard-to-debug bugs and potential security issues. Note that if we do choose to set all file descriptors close-on-exec by default, there are several questions open: - This would hold for open(), socket() and other high-level file-descriptor wrappers. Should it be enabled also for low-level syscall wrappers like os.open(), os.pipe(), etc? - On platforms that don't support atomic close-on-exec (e.g.
open() with O_CLOEXEC, socket() with SOCK_CLOEXEC, pipe2(), etc), this would require extra fcntl()/ioctl() syscalls. The cost is probably negligible, but we'd have to check the impact on some benchmarks. From vinay_sajip at yahoo.co.uk Thu Jan 10 21:15:54 2013 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Thu, 10 Jan 2013 20:15:54 +0000 (UTC) Subject: [Python-Dev] fork or exec? References: <20130110103751.523ddb0a@pitrou.net> <20130110135241.088a6a24@pitrou.net> Message-ID: Charles-François Natali <cf.natali at gmail.com> writes: > > Do you have an example of what that "something" may be? > > Apart from standard streams, I can't think of any inherited file > > descriptor an external program would want to rely on. > > Indeed, it should be really rare. GnuPG allows communicating with it through arbitrary file descriptors [1]. For example, if you use the command-line argument --passphrase-fd n then it will attempt to read the passphrase from file descriptor n. If n is 0, it reads from stdin, as one would expect, but you can use other values for the fd. There are several other such command line options. How would that sit with the current proposal? I maintain a wrapper, python-gnupg, which communicates with the GnuPG process through subprocess. Although there is no in-built use of these parameters, users are allowed to pass additional parameters to GnuPG, and they might use these esoteric GnuPG options. Regards, Vinay Sajip [1] http://www.gnupg.org/documentation/manuals/gnupg-devel/GPG-Esoteric-Options.html From cf.natali at gmail.com Thu Jan 10 21:30:38 2013 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Thu, 10 Jan 2013 21:30:38 +0100 Subject: [Python-Dev] fork or exec? In-Reply-To: References: <20130110103751.523ddb0a@pitrou.net> <20130110135241.088a6a24@pitrou.net> Message-ID: > How would that sit with the current proposal? I maintain a wrapper, > python-gnupg, > which communicates with the GnuPG process through subprocess.
Although there > is > no in-built use of these parameters, users are allowed to pass additional > parameters to GnuPG, and they might use these esoteric GnuPG options. Since you use subprocess, file descriptor passing to gnupg doesn't work since subprocess.Popen changed the close_fds default to True (in 3.2) (which was the right decision, IMO). From xnite at xnite.org Thu Jan 10 21:27:06 2013 From: xnite at xnite.org (Robert Whitney) Date: Thu, 10 Jan 2013 14:27:06 -0600 Subject: [Python-Dev] FYI - wiki.python.org compromised Message-ID: <50EF241A.6080202@xnite.org> To Whoever this may concern, I believe the exploit in use on the Python Wiki could have been the following remote arbitrary code execution exploit that myself and some fellow researchers have been working with over the past few days. I'm not sure if this has quite been reported to the Moin development team, however this exploit would be triggered via a URL much like the following: http://wiki.python.org/WikiSandBox?action=moinexec&c=uname%20-a This URL of course would cause for the page to output the contents of the command "uname -a". I think this is definitely worth your researchers looking into, and please be sure to credit myself (Robert 'xnite' Whitney; http://xnite.org) for finding & reporting this vulnerability. Best of luck, Robert 'xnite' Whitney PS - If you have any further questions on this matter for me, please feel free to use the contact information in my signature below or reply to this email. -- xnite (xnite at xnite.org) Google Voice: 828-45-XNITE (96483) Web: http://xnite.org PGP Key: http://xnite.org/pgpkey From vinay_sajip at yahoo.co.uk Thu Jan 10 22:57:30 2013 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Thu, 10 Jan 2013 21:57:30 +0000 (UTC) Subject: [Python-Dev] fork or exec? References: <20130110103751.523ddb0a@pitrou.net> <20130110135241.088a6a24@pitrou.net> Message-ID: Charles-François Natali <cf.natali at gmail.com> writes: > > > How would that sit with the current proposal?
I maintain a wrapper, > > python-gnupg, > > which communicates with the GnuPG process through subprocess. Although there > > is > > no in-built use of these parameters, users are allowed to pass additional > > parameters to GnuPG, and they might use these esoteric GnuPG options. > > Since you use subprocess, file descriptor passing to gnupg doesn't > work since subprocess.Popen changed the close_fds default to True (in 3.2) > (which was the right decision, IMO). That could always be overcome by passing close_fds=False explicitly to subprocess from my code, though, right? I'm not doing that now, but then I'm not using the esoteric options in python-gnupg code, either. My point was that the GnuPG usage looked like an example where fds other than 0, 1 and 2 might be used by design in not-uncommonly-used programs. From a discussion I had with Barry Warsaw a while ago, I seem to remember that there was other software which relied on these features. See [1] for details. Regards, Vinay Sajip [1] https://code.google.com/p/python-gnupg/issues/detail?id=46 From victor.stinner at gmail.com Thu Jan 10 23:08:53 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 10 Jan 2013 23:08:53 +0100 Subject: [Python-Dev] fork or exec? In-Reply-To: References: <20130110103751.523ddb0a@pitrou.net> <20130110135241.088a6a24@pitrou.net> Message-ID: 2013/1/10 Charles-François Natali : >> Network servers like inetd or the apache MPM (prefork) use a process >> listening on a socket, and then fork to execute a request in a child >> process. I don't know how it works exactly, but I guess that the child >> process needs a socket from the parent to send the answer to the >> client. If the socket is closed on exec() (ex: Apache with CGI), it >> does not work :-) > > Yes, but the above (setting close-on-exec by default) would *not* > apply to stdin, stdout and stderr. inetd servers use dup2(socket, 0); > dup2(socket, 1); dup2(socket, 2) before forking, so it would still work.
Do you mean that file descriptors 0, 1 and 2 must be handled differently? A program can start without stdout (fd 1) nor stderr (fd 2), even if it's not the most common case :-) For example, should "os.dup2(fd, 1)" set the close-on-exec flag on the file descriptor 1? > There are far more programs that are bitten by FD inheritance upon > exec than programs relying on it, and whereas failures and security > issues in the first category are hard to debug and unpredictable > (especially in a multi-threaded program), a program relying on a FD > that would be closed will fail immediately with EBADF, and so could be > updated quickly and easily. > >> In other words, I think close-on-exec by default is probably a >> reasonable decision. > > close-on-exec should probably have been the default in Unix, and is a > much saner option. I will update the PEP to at least mention this option (enable close-on-exec by default). > Note that if we do choose to set all file descriptors close-on-exec by > default, there are several questions open: > - This would hold for open(), socket() and other high-level > file-descriptor wrappers. Should it be enabled also for low-level > syscall wrappers like os.open(), os.pipe(), etc? os.pipe() has no higher-level API, and it would be convenient to be able to set the close-on-exec flag, at least for the subprocess module. For os.open(), the question is more difficult. I proposed to add a cloexec keyword argument to os.open(). os.open(..., cloexec=False) would not do anything special, whereas os.open(..., cloexec=True) would use the best method to set the close-on-exec flag, including adding O_CLOEXEC or O_NOINHERIT to flags. > - On platforms that don't support atomic close-on-exec (e.g. open() > with O_CLOEXEC, socket() with SOCK_CLOEXEC, pipe2(), etc), this would > require extra fcntl()/ioctl() syscalls. The cost is probably > negligible, but we'd have to check the impact on some benchmarks. Sure.
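What Victor describes for os.open() can be approximated today; a sketch (assumes POSIX; O_CLOEXEC needs Linux 2.6.23+ or a similarly recent kernel, and the fcntl() fallback is not atomic with respect to a concurrent fork()+exec() in another thread):

```python
import fcntl
import os

def open_cloexec(path, flags):
    # Best case: set the flag atomically at creation time.
    o_cloexec = getattr(os, "O_CLOEXEC", 0)
    if o_cloexec:
        return os.open(path, flags | o_cloexec)
    # Fallback: the racy two-step fcntl() dance.
    fd = os.open(path, flags)
    old = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, old | fcntl.FD_CLOEXEC)
    return fd
```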
We have been working for a long time on reducing the number of syscalls at startup. I proposed to use ioctl(fd, FIOCLEX), which only adds one syscall, instead of two for fcntl(fd, F_SETFD, new_flags) (the old flags must be read to be future-proof, as advised in the glibc documentation). On Windows, Linux 2.6.23+ and FreeBSD 8.3+, no additional syscall is needed: O_NOINHERIT and O_CLOEXEC are enough (Linux just needs one additional syscall to ensure that O_CLOEXEC is supported). So the worst case is when fcntl() is the only available option: 2 additional syscalls are required per new file descriptor. If setting close-on-exec has a significant cost, the configurable default value (sys.setdefaultcloexec()) should be considered more seriously. -- While we are talking about enabling close-on-exec by default and the number of syscalls: I would like to know how a new file descriptor can be created without the close-on-exec flag. What would be the API? Most existing APIs are designed to *set* the flag, but how should one specify that the flag must be unset? An option is to add a new "cloexec" keyword argument which would be True by default, so you set the argument to False to not set the flag. Example: os.pipe(cloexec=False). I dislike this option because developers may write os.pipe(cloexec=True), which would be useless and may be confusing if the flag is True by default. Victor From paul at boddie.org.uk Thu Jan 10 23:32:33 2013 From: paul at boddie.org.uk (Paul Boddie) Date: Thu, 10 Jan 2013 23:32:33 +0100 Subject: [Python-Dev] FYI - wiki.python.org compromised Message-ID: <201301102332.33222.paul@boddie.org.uk> Robert Whitney wrote: > To Whoever this may concern, > > I believe the exploit in use on the Python Wiki could have been the > following remote arbitrary code execution exploit that myself and some > fellow researchers have been working with over the past few days. I'm
I'm > not sure if this has quite been reported to the Moin development team, > however this exploit would be triggered via a URL much like the following: > http://wiki.python.org/WikiSandBox?action=moinexec&c=uname%20-a Did you check the MoinMoin security fixes page? http://moinmo.in/SecurityFixes What you describe is mentioned as "remote code execution vulnerability in twikidraw/anywikidraw action CVE-2012-6081". > This URL of course would cause for the page to output the contents of > the command "uname -a". I think this is definitely worth your > researchers looking into, and please be sure to credit myself (Robert > 'xnite' Whitney; http://xnite.org) for finding & reporting this > vulnerability. Have you discovered anything beyond the findings of the referenced, reported vulnerability, or any of those mentioned in the Debian advisory? http://www.debian.org/security/2012/dsa-2593 If so, I'm sure that the MoinMoin developers would be interested in working with you to responsibly mitigate the impact of any deployed, vulnerable code. Paul P.S. Although I don't speak for the MoinMoin developers in any way, please be advised that any replies to me may be shared with those developers and indeed any other parties I choose. From victor.stinner at gmail.com Fri Jan 11 00:12:47 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 11 Jan 2013 00:12:47 +0100 Subject: [Python-Dev] fork or exec? In-Reply-To: References: <20130110103751.523ddb0a@pitrou.net> <20130110135241.088a6a24@pitrou.net> Message-ID: 2013/1/10 Tres Seaver : > Why would we wander away from POSIX semantics here? There are good > reasons not to close open descriptors (the 'pipe()' syscall, for > instance), and there is no POSIXy way to ask for them *not* to be closed. There are different methods to disable the close-on-exec flag. 
See os.set_cloexec() function of my PEP draft: https://bitbucket.org/haypo/misc/src/tip/python/pep_cloexec.rst Victor From stephen at xemacs.org Fri Jan 11 00:51:37 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 11 Jan 2013 08:51:37 +0900 Subject: [Python-Dev] FYI - wiki.python.org compromised In-Reply-To: <50EF241A.6080202@xnite.org> References: <50EF241A.6080202@xnite.org> Message-ID: <87zk0g8uly.fsf@uwakimon.sk.tsukuba.ac.jp> Robert Whitney writes: > I believe the exploit in use on the Python Wiki could have > been the following remote arbitrary code execution exploit that > myself and some fellow researchers have been working with over the > past few days. AFAIK, Python has a security policy. One point of that policy is to avoid announcing details of exploits on public channels until a policy for addressing them is implemented. If you have something to report, the best channel is security at python.org. http://www.python.org/news/security From tjreedy at udel.edu Fri Jan 11 00:53:58 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 10 Jan 2013 18:53:58 -0500 Subject: [Python-Dev] Point of building without threads? In-Reply-To: References: <20120507214943.579045c2@pitrou.net> <1357637305.2857.54.camel@newpride> <20130108140159.GA31795@snakebite.org> <20130108150944.679c07f2@pitrou.net> <1357678863.2605.50.camel@newpride> Message-ID: On 1/10/2013 8:54 AM, Nick Coghlan wrote: > On Wed, Jan 9, 2013 at 7:01 AM, Yury V. Zaytsev wrote: >> Anyways, the point is that it's a recurrent problem, and we hit it every >> single time with each new machine, so I hope you can understand why I >> decided to post, even though the original message appeared back in May. > > Thank you for speaking up. 
Easier porting to new platforms is > certainly one of the reasons we keep that capability, and it's helpful > to have it directly confirmed that there *are* users out there that > benefit from it :) If there is not currently a comment in the setup code explaining this, perhaps it would be a good idea, lest some future developer be tempted to delete the option. -- Terry Jan Reedy From ncoghlan at gmail.com Fri Jan 11 07:09:52 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 11 Jan 2013 16:09:52 +1000 Subject: [Python-Dev] [Python-checkins] Daily reference leaks (aef7db0d3893): sum=287 In-Reply-To: References: Message-ID: On Fri, Jan 11, 2013 at 2:57 PM, wrote: > results for aef7db0d3893 on branch "default" > -------------------------------------------- > > test_dbm leaked [2, 0, 0] references, sum=2 > test_dbm leaked [2, 2, 1] memory blocks, sum=5 Hmm, I'm starting to wonder if there's something to this one - it seems to be popping up a bit lately. > test_xml_etree_c leaked [56, 56, 56] references, sum=168 > test_xml_etree_c leaked [36, 38, 38] memory blocks, sum=112 I'm gonna take a wild guess and suggest there may be a problem with the recent pickling fix in the C extension :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From cf.natali at gmail.com Fri Jan 11 07:30:46 2013 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Fri, 11 Jan 2013 07:30:46 +0100 Subject: [Python-Dev] fork or exec? In-Reply-To: References: <20130110103751.523ddb0a@pitrou.net> <20130110135241.088a6a24@pitrou.net> Message-ID: > That could always be overcome by passing close_fds=False explicitly to > subprocess from my code, though, right? I'm not doing that now, but then I'm not > using the esoteric options in python-gnupg code, either. You could do that, or better explicitly support this option, and only specify this file descriptor in the subprocess.Popen pass_fds argument.
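A sketch of that suggestion (the parameter subprocess actually grew is called pass_fds; since PEP 446 it also makes the listed descriptors inheritable in the child, so nothing beyond listing the fd is needed):

```python
import os
import subprocess
import sys

r, w = os.pipe()
# close_fds=True (the default since 3.2) closes everything above fd 2
# in the child, except descriptors listed in pass_fds:
proc = subprocess.Popen(
    [sys.executable, "-c",
     "import os, sys; os.write(int(sys.argv[1]), b'ok')", str(w)],
    pass_fds=(w,),
)
os.close(w)                      # parent's copy; the child still has its own
print(os.read(r, 2).decode())    # -> ok
proc.wait()
os.close(r)
```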
> My point was that the GnuPG usage looked like an example where fds other than > 0, 1 and 2 might be used by design in not-uncommonly-used programs. From a > discussion I had with Barry Warsaw a while ago, I seem to remember that there > was other software which relied on these features. See [1] for details. Yes, it might be used. But I maintain that it should be really rare, and even if it's not, since the "official" way to launch subprocesses is through the subprocess module, FDs > 2 are already closed by default since Python 3.2. And the failure will be immediate and obvious (EBADF). Note that I admit I may be completely wrong, that's why I suggested to Victor to bring this up on python-dev to gather as much feedback as possible. Something like "we never ever break backward compatibility intentionally, even in corner cases" or "this would break POSIX semantics" would be enough (but OTOH, the subprocess change did break those hypothetical rules). Another pet peeve of mine is the non-handling of EINTR by low-level syscall wrappers, which results in code like this spread all over the stdlib and user code: while True: try: return syscall(...) except OSError as e: if e.errno != errno.EINTR: raise (and if it's select()/poll()/etc, the code doesn't update the timeout in 90% of cases). It got a little better with the exception hierarchy rework (InterruptedError), but it's still a nuisance. From solipsis at pitrou.net Fri Jan 11 08:11:00 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 11 Jan 2013 08:11:00 +0100 Subject: [Python-Dev] cpython (3.3): #16919: test_crypt now works with unittest test discovery. Patch by Zachary References: <3Yj8Td4X49zR5K@mail.python.org> Message-ID: <20130111081100.4fade58e@pitrou.net> On Fri, 11 Jan 2013 04:20:21 +0100 (CET) ezio.melotti wrote: > > diff --git a/Lib/test/test_crypt.py b/Lib/test/test_crypt.py > --- a/Lib/test/test_crypt.py > +++ b/Lib/test/test_crypt.py > @@ -1,7 +1,11 @@ > from test import support > import unittest > > -crypt = support.import_module('crypt') > +def setUpModule(): > + # this import will raise unittest.SkipTest if _crypt doesn't exist, > + # so it has to be done in setUpModule for test discovery to work > + global crypt > + crypt = support.import_module('crypt') Yikes. Couldn't unittest support SkipTest being raised at import instead? setUpModule is an ugly way to do this. Regards Antoine. From benno at benno.id.au Fri Jan 11 08:25:44 2013 From: benno at benno.id.au (Ben Leslie) Date: Fri, 11 Jan 2013 18:25:44 +1100 Subject: [Python-Dev] fork or exec? In-Reply-To: References: <20130110103751.523ddb0a@pitrou.net> <20130110135241.088a6a24@pitrou.net> Message-ID: On Fri, Jan 11, 2013 at 5:30 PM, Charles-François Natali < cf.natali at gmail.com> wrote: > > That could always be overcome by passing close_fds=False explicitly to > > subprocess from my code, though, right? I'm not doing that now, but then > I'm not > > using the esoteric options in python-gnupg code, either. > > You could do that, or better explicitly support this option, and only > specify this file descriptor in the subprocess.Popen pass_fds argument. > > > My point was that the GnuPG usage looked like an example where fds other > than > > 0, 1 and 2 might be used by design in not-uncommonly-used programs. From > a > > discussion I had with Barry Warsaw a while ago, I seem to remember that > there > > was other software which relied on these features. See [1] for details. > >
Patch by Zachary References: <3Yj8Td4X49zR5K@mail.python.org> Message-ID: <20130111081100.4fade58e@pitrou.net> On Fri, 11 Jan 2013 04:20:21 +0100 (CET) ezio.melotti wrote: > > diff --git a/Lib/test/test_crypt.py b/Lib/test/test_crypt.py > --- a/Lib/test/test_crypt.py > +++ b/Lib/test/test_crypt.py > @@ -1,7 +1,11 @@ > from test import support > import unittest > > -crypt = support.import_module('crypt') > +def setUpModule(): > + # this import will raise unittest.SkipTest if _crypt doesn't exist, > + # so it has to be done in setUpModule for test discovery to work > + global crypt > + crypt = support.import_module('crypt') Yikes. Couldn't unittest support SkipTest being raised at import instead? setUpModule is an ugly way to do this. Regards Antoine. From benno at benno.id.au Fri Jan 11 08:25:44 2013 From: benno at benno.id.au (Ben Leslie) Date: Fri, 11 Jan 2013 18:25:44 +1100 Subject: [Python-Dev] fork or exec? In-Reply-To: References: <20130110103751.523ddb0a@pitrou.net> <20130110135241.088a6a24@pitrou.net> Message-ID: On Fri, Jan 11, 2013 at 5:30 PM, Charles-Fran?ois Natali < cf.natali at gmail.com> wrote: > > That could always be overcome by passing close_fds=False explicitly to > > subprocess from my code, though, right? I'm not doing that now, but then > I'm not > > using the esoteric options in python-gnupg code, either. > > You could do that, or better explicitly support this option, and only > specify this file descriptor in subprocess.Popen keep_fds argument. > > > My point was that the GnuPG usage looked like an example where fds other > than > > 0, 1 and 2 might be used by design in not-uncommonly-used programs. From > a > > discussion I had with Barry Warsaw a while ago, I seem to remember that > there > > was other software which relied on these features. See [1] for details. > > Yes, it might be used. 
But I maintain that it should be really rare, > and even if it's not, since the "official" way to launch subprocesses > is through the subprocess module, FDs > 2 are already closed by > default since Python 3.2. And the failure will be immediate and > obvious (EBADF). >

I suppose I should chime in. I think the pattern of "open some file descriptors, fork, exec" and pass the file descriptors to a child process is a very useful thing, and certainly something I use in my code. Although to be honest, it is usually a C program doing it. I do this for two reasons:

1) I can have a relatively small program that runs at a rather high privilege level (e.g: root), which acquires protected resources (e.g: binds to port 80, opens some key files that only a restricted user has access to), and then drops privileges, forks (or just execs) and passes those resources to a larger, less trusted program.

2) I have a monitoring program that opens a socket, and then passes the socket to a child process (via fork, exec). If that program crashes for some reason, the monitoring program can respawn a new child and pass it the exact same socket.

Now I'm not saying these are necessarily common use cases, but I think changing the default behaviour of file descriptors to be close-on-exec would violate the principle of least surprise for anyone familiar with and expecting UNIX semantics. If you are using fork/exec directly then I think you want these semantics.

That being said, I have no problem with subprocess closing file descriptors before exec()-ing, as I don't think a developer would necessarily expect the UNIX semantics on that interface.

Python is not UNIX, but I think if you are directly using the POSIX interfaces they should work (more or less) the same way they would if you were writing a C program. (Some of us still use Python to prototype things that will later be converted to C!)

Cheers,

Benno
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rosslagerwall at gmail.com Fri Jan 11 08:38:47 2013
From: rosslagerwall at gmail.com (Ross Lagerwall)
Date: Fri, 11 Jan 2013 07:38:47 +0000
Subject: [Python-Dev] fork or exec?
In-Reply-To: 
References: <20130110103751.523ddb0a@pitrou.net> <20130110135241.088a6a24@pitrou.net>
Message-ID: <20130111073847.GA697@hobo.wolfson.cam.ac.uk>

On Thu, Jan 10, 2013 at 09:57:30PM +0000, Vinay Sajip wrote: > My point was that the GnuPG usage looked like an example where fds other than > 0, 1 and 2 might be used by design in not-uncommonly-used programs. From a > discussion I had with Barry Warsaw a while ago, I seem to remember that there > was other software which relied on these features. See [1] for details. >

From the "dpkg" man page:

    --command-fd n
        Accept a series of commands on input file descriptor n. Note:
        additional options set on the command line, and through this file
        descriptor, are not reset for subsequent commands executed during
        the same run.

-- Ross Lagerwall

From chris.jerdonek at gmail.com Fri Jan 11 08:47:34 2013
From: chris.jerdonek at gmail.com (Chris Jerdonek)
Date: Thu, 10 Jan 2013 23:47:34 -0800
Subject: [Python-Dev] [Python-checkins] Cron /home/docs/build-devguide
In-Reply-To: 
References: 
Message-ID: 

On Thu, Jan 10, 2013 at 11:35 PM, Cron Daemon wrote: > /home/docs/devguide/documenting.rst:766: WARNING: term not in glossary: bytecode

FYI, this warning is spurious (second time at least). I made an issue about it here: http://bugs.python.org/issue16928

--Chris

From tjreedy at udel.edu Fri Jan 11 14:26:31 2013
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 11 Jan 2013 08:26:31 -0500
Subject: [Python-Dev] fork or exec?
In-Reply-To: 
References: <20130110103751.523ddb0a@pitrou.net> <20130110135241.088a6a24@pitrou.net>
Message-ID: 

On 1/11/2013 2:25 AM, Ben Leslie wrote: > Python is not UNIX, but I think if you are directly using the POSIX > interfaces they should > work (more or less) the same way they would if you were writing a C > program.
(Some of us > still use Python to prototype things that will later be converted to C!).

I believe it has been the intent that if xxx is a syscall, then os.xxx should be a thin wrapper with the same defaults.

-- Terry Jan Reedy

From rdmurray at bitdance.com Fri Jan 11 14:37:36 2013
From: rdmurray at bitdance.com (R. David Murray)
Date: Fri, 11 Jan 2013 08:37:36 -0500
Subject: [Python-Dev] cpython (3.3): #16919: test_crypt now works with unittest test discovery. Patch by Zachary
In-Reply-To: <20130111081100.4fade58e@pitrou.net>
References: <3Yj8Td4X49zR5K@mail.python.org> <20130111081100.4fade58e@pitrou.net>
Message-ID: <20130111133738.315AE250030@webabinitio.net>

On Fri, 11 Jan 2013 08:11:00 +0100, Antoine Pitrou wrote: > On Fri, 11 Jan 2013 04:20:21 +0100 (CET) > ezio.melotti wrote: > > > > diff --git a/Lib/test/test_crypt.py b/Lib/test/test_crypt.py > > --- a/Lib/test/test_crypt.py > > +++ b/Lib/test/test_crypt.py > > @@ -1,7 +1,11 @@ > > from test import support > > import unittest > > > > -crypt = support.import_module('crypt') > > +def setUpModule(): > > + # this import will raise unittest.SkipTest if _crypt doesn't exist, > > + # so it has to be done in setUpModule for test discovery to work > > + global crypt > > + crypt = support.import_module('crypt') > > Yikes. > Couldn't unittest support SkipTest being raised at import instead? > setUpModule is an ugly way to do this.

Indeed. Almost every use of support.import_module will have this issue, so fixing unittest is by far the better fix.

--David

From brett at python.org Fri Jan 11 15:43:34 2013
From: brett at python.org (Brett Cannon)
Date: Fri, 11 Jan 2013 09:43:34 -0500
Subject: [Python-Dev] cpython (3.3): #16919: test_crypt now works with unittest test discovery.
Patch by Zachary
In-Reply-To: <20130111133738.315AE250030@webabinitio.net>
References: <3Yj8Td4X49zR5K@mail.python.org> <20130111081100.4fade58e@pitrou.net> <20130111133738.315AE250030@webabinitio.net>
Message-ID: 

On Fri, Jan 11, 2013 at 8:37 AM, R. David Murray wrote: > On Fri, 11 Jan 2013 08:11:00 +0100, Antoine Pitrou wrote: >> On Fri, 11 Jan 2013 04:20:21 +0100 (CET) >> ezio.melotti wrote: >> > >> > diff --git a/Lib/test/test_crypt.py b/Lib/test/test_crypt.py >> > --- a/Lib/test/test_crypt.py >> > +++ b/Lib/test/test_crypt.py >> > @@ -1,7 +1,11 @@ >> > from test import support >> > import unittest >> > >> > -crypt = support.import_module('crypt') >> > +def setUpModule(): >> > + # this import will raise unittest.SkipTest if _crypt doesn't exist, >> > + # so it has to be done in setUpModule for test discovery to work >> > + global crypt >> > + crypt = support.import_module('crypt') >> >> Yikes. >> Couldn't unittest support SkipTest being raised at import instead? >> setUpModule is an ugly way to do this. > > Indeed. Almost every use of support.import_module will have this issue, > so fixing unittest is by far the better fix.

Bug filed: http://bugs.python.org/issue16935

From barry at python.org Fri Jan 11 15:56:33 2013
From: barry at python.org (Barry Warsaw)
Date: Fri, 11 Jan 2013 09:56:33 -0500
Subject: [Python-Dev] fork or exec?
In-Reply-To: 
References: <20130110103751.523ddb0a@pitrou.net> <20130110135241.088a6a24@pitrou.net>
Message-ID: <20130111095633.20c6c4f8@anarchist.wooz.org>

On Jan 11, 2013, at 06:25 PM, Ben Leslie wrote: >Python is not UNIX, but I think if you are directly using the POSIX >interfaces they should work (more or less) the same way they would if you were >writing a C program. (Some of us still use Python to prototype things that >will later be converted to C!).

+1

Or to put it another way: Python isn't POSIX, but where it exposes POSIX APIs, they should act like POSIX APIs. It's perfectly fine to provide alternative APIs that have different behavior.
I think it's exactly analogous to Python exposing Windows or OS X specific APIs.

-Barry

From tseaver at palladion.com Fri Jan 11 16:42:19 2013
From: tseaver at palladion.com (Tres Seaver)
Date: Fri, 11 Jan 2013 10:42:19 -0500
Subject: [Python-Dev] fork or exec?
In-Reply-To: 
References: <20130110103751.523ddb0a@pitrou.net> <20130110135241.088a6a24@pitrou.net>
Message-ID: 

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 01/10/2013 02:55 PM, Charles-François Natali wrote: > Indeed, it should be really rare.

*Lots* of applications make use of POSIX semantics for fork() / exec().

> There are far more programs that are bitten by FD inheritance upon > exec than programs relying on it, and whereas failures and security > issues in the first category are hard to debug and unpredictable > (especially in a multi-threaded program),

Can someone please point to a writeup of the security issues involved?

Tres.
- -- 
===================================================================
Tres Seaver +1 540-429-0999 tseaver at palladion.com
Palladion Software "Excellence by Design" http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with undefined - http://www.enigmail.net/

iEYEARECAAYFAlDwMtsACgkQ+gerLs4ltQ7e7QCfdc1yxE+1iF6Gf/O9reOxStIG
V8YAoKs79z5DHA3dyTDI+9yWkSSW0VD6
=9FVK
-----END PGP SIGNATURE-----

From christian at python.org Fri Jan 11 17:08:30 2013
From: christian at python.org (Christian Heimes)
Date: Fri, 11 Jan 2013 17:08:30 +0100
Subject: [Python-Dev] Daily reference leaks (aef7db0d3893): sum=287
In-Reply-To: 
References: 
Message-ID: <50F038FE.208@python.org>

Am 11.01.2013 07:09, schrieb Nick Coghlan: > On Fri, Jan 11, 2013 at 2:57 PM, wrote: >> results for aef7db0d3893 on branch "default" >> -------------------------------------------- >> >> test_dbm leaked [2, 0, 0] references, sum=2 >> test_dbm leaked [2, 2, 1] memory blocks, sum=5 > > Hmm, I'm starting to wonder if there's something to this one - it > seems to be
popping up a bit lately. > >> test_xml_etree_c leaked [56, 56, 56] references, sum=168 >> test_xml_etree_c leaked [36, 38, 38] memory blocks, sum=112 > > I'm gonna take a wild guess and suggest there may be a problem with > the recent pickling fix in the C extension :) It has more issues. Coverity has sent me some complains, see attachment. -------------- next part -------------- An embedded message was scrubbed... From: scan-admin at coverity.com Subject: New Defects reported by Coverity Scan for Python Date: Thu, 10 Jan 2013 16:35:29 -0800 Size: 4865 URL: From cf.natali at gmail.com Fri Jan 11 17:44:36 2013 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Fri, 11 Jan 2013 17:44:36 +0100 Subject: [Python-Dev] fork or exec? In-Reply-To: References: <20130110103751.523ddb0a@pitrou.net> <20130110135241.088a6a24@pitrou.net> Message-ID: > *Lots* of applications make use of POSIX semantics for fork() / exec(). This doesn't mean much. We're talking about inheritance of FDs > 2 upon exec, which is a very limited subset of "POSIX semantics for fork() / exec()". I personally think that there's been enough feedback to show that we should stick with the default POSIX behavior, however broken it is... > Can someone please point to a writeop of the security issues involved? I've posted sample codes earlier in this thread, but here's a writeup by Ulrich Drepper: http://udrepper.livejournal.com/20407.html From status at bugs.python.org Fri Jan 11 18:07:24 2013 From: status at bugs.python.org (Python tracker) Date: Fri, 11 Jan 2013 18:07:24 +0100 (CET) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20130111170724.511381CC82@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2013-01-04 - 2013-01-11) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. 
Issues counts and deltas: open 3825 ( +8) closed 24888 (+65) total 28713 (+73) Open issues with patches: 1668 Issues opened (41) ================== #12944: Accept arbitrary files for packaging's upload command http://bugs.python.org/issue12944 reopened by eric.araujo #13198: Remove duplicate definition of write_record_file http://bugs.python.org/issue13198 reopened by eric.araujo #15539: Fixing Tools/scripts/pindent.py http://bugs.python.org/issue15539 reopened by serhiy.storchaka #16076: xml.etree.ElementTree.Element is no longer pickleable http://bugs.python.org/issue16076 reopened by ezio.melotti #16823: Python crashes on running tkinter code with threads http://bugs.python.org/issue16823 reopened by terry.reedy #16864: sqlite3.Cursor.lastrowid isn't populated when executing a SQL http://bugs.python.org/issue16864 opened by jim_minter #16865: ctypes arrays >=2GB in length causes exception http://bugs.python.org/issue16865 opened by coderforlife #16866: libainstall doesn't create $(BINDIR) directory http://bugs.python.org/issue16866 opened by bennoleslie #16869: makesetup should support .S source files http://bugs.python.org/issue16869 opened by bennoleslie #16878: argparse: positional args with nargs='*' defaults to [] http://bugs.python.org/issue16878 opened by chris.jerdonek #16879: distutils.command.config uses fragile constant temporary file http://bugs.python.org/issue16879 opened by mgorny #16880: Importing "imp" will fail if dynamic loading not supported http://bugs.python.org/issue16880 opened by Jeffrey.Armstrong #16885: SQLite3 iterdump ordering http://bugs.python.org/issue16885 opened by Jamie.Spence #16886: Doctests in test_dictcomp depend on dict order http://bugs.python.org/issue16886 opened by fwierzbicki #16887: IDLE - tabify/untabify applied when clicking Cancel http://bugs.python.org/issue16887 opened by serwy #16889: facilitate log output starting at beginning of line http://bugs.python.org/issue16889 opened by chris.jerdonek #16891: Fix docs 
about module search order http://bugs.python.org/issue16891 opened by dmugtasimov #16892: Windows bug picking up stdin from a pipe http://bugs.python.org/issue16892 opened by oeffner #16893: Create IDLE help.txt from Doc/library/idle.rst http://bugs.python.org/issue16893 opened by zach.ware #16894: Function attribute access doesn't invoke methods in dict subcl http://bugs.python.org/issue16894 opened by dabeaz #16895: Batch file to mimic 'make' on Windows http://bugs.python.org/issue16895 opened by zach.ware #16899: Add support for C99 complex type (_Complex) as ctypes.c_comple http://bugs.python.org/issue16899 opened by rutsky #16901: In http.cookiejar.FileCookieJar() the .load() and .revert() me http://bugs.python.org/issue16901 opened by py.user #16902: Add OSS module support for Solaris http://bugs.python.org/issue16902 opened by brian-cameron-oracle #16903: subprocess.Popen.communicate with universal_newlines=True does http://bugs.python.org/issue16903 opened by serhiy.storchaka #16904: Avoid unnecessary and possibly unsafe code from http.client.HT http://bugs.python.org/issue16904 opened by sanyi #16907: Distutils fails to build extension in path with spaces http://bugs.python.org/issue16907 opened by klo.uo #16909: urlparse: add userinfo attribute http://bugs.python.org/issue16909 opened by olof #16914: timestamping in smtplib.py to help troubleshoot server timeout http://bugs.python.org/issue16914 opened by gac #16915: mode of socket.makefile is more limited than documentation sug http://bugs.python.org/issue16915 opened by Antoon.Pardon #16916: The Extended Iterable Unpacking (PEP-3132) for byte strings do http://bugs.python.org/issue16916 opened by marco.buttu #16921: Docs of subprocess.CREATE_NEW_CONSOLE are wrong http://bugs.python.org/issue16921 opened by torsten #16923: test_ssl kicks up a lot of ResourceWarnings http://bugs.python.org/issue16923 opened by benjamin.peterson #16926: setup.py register does not always respect --repository 
http://bugs.python.org/issue16926 opened by chris.jerdonek #16927: Separate built-in types from functions and group similar funct http://bugs.python.org/issue16927 opened by ezio.melotti #16930: mention limitations and/or alternatives to hg graft http://bugs.python.org/issue16930 opened by chris.jerdonek #16931: mention work-around to create diffs in default/non-git mode http://bugs.python.org/issue16931 opened by chris.jerdonek #16933: argparse: remove magic from examples http://bugs.python.org/issue16933 opened by guettli #16935: unittest should understand SkipTest at import time during test http://bugs.python.org/issue16935 opened by brett.cannon #16936: Documentation for stat.S_IFMT inconsistent http://bugs.python.org/issue16936 opened by lechten #16922: ElementTree.findtext() returns empty bytes object instead of e http://bugs.python.org/issue16922 opened by serhiy.storchaka Most recent 15 issues with no replies (15) ========================================== #16936: Documentation for stat.S_IFMT inconsistent http://bugs.python.org/issue16936 #16931: mention work-around to create diffs in default/non-git mode http://bugs.python.org/issue16931 #16930: mention limitations and/or alternatives to hg graft http://bugs.python.org/issue16930 #16927: Separate built-in types from functions and group similar funct http://bugs.python.org/issue16927 #16923: test_ssl kicks up a lot of ResourceWarnings http://bugs.python.org/issue16923 #16916: The Extended Iterable Unpacking (PEP-3132) for byte strings do http://bugs.python.org/issue16916 #16904: Avoid unnecessary and possibly unsafe code from http.client.HT http://bugs.python.org/issue16904 #16901: In http.cookiejar.FileCookieJar() the .load() and .revert() me http://bugs.python.org/issue16901 #16895: Batch file to mimic 'make' on Windows http://bugs.python.org/issue16895 #16892: Windows bug picking up stdin from a pipe http://bugs.python.org/issue16892 #16887: IDLE - tabify/untabify applied when clicking Cancel 
http://bugs.python.org/issue16887 #16885: SQLite3 iterdump ordering http://bugs.python.org/issue16885 #16879: distutils.command.config uses fragile constant temporary file http://bugs.python.org/issue16879 #16859: tarfile.TarInfo.fromtarfile does not check read() return value http://bugs.python.org/issue16859 #16845: warnings.simplefilter should validate input http://bugs.python.org/issue16845 Most recent 15 issues waiting for review (15) ============================================= #16936: Documentation for stat.S_IFMT inconsistent http://bugs.python.org/issue16936 #16922: ElementTree.findtext() returns empty bytes object instead of e http://bugs.python.org/issue16922 #16921: Docs of subprocess.CREATE_NEW_CONSOLE are wrong http://bugs.python.org/issue16921 #16915: mode of socket.makefile is more limited than documentation sug http://bugs.python.org/issue16915 #16914: timestamping in smtplib.py to help troubleshoot server timeout http://bugs.python.org/issue16914 #16909: urlparse: add userinfo attribute http://bugs.python.org/issue16909 #16902: Add OSS module support for Solaris http://bugs.python.org/issue16902 #16887: IDLE - tabify/untabify applied when clicking Cancel http://bugs.python.org/issue16887 #16886: Doctests in test_dictcomp depend on dict order http://bugs.python.org/issue16886 #16880: Importing "imp" will fail if dynamic loading not supported http://bugs.python.org/issue16880 #16878: argparse: positional args with nargs='*' defaults to [] http://bugs.python.org/issue16878 #16869: makesetup should support .S source files http://bugs.python.org/issue16869 #16866: libainstall doesn't create $(BINDIR) directory http://bugs.python.org/issue16866 #16860: Use O_CLOEXEC in the tempfile module http://bugs.python.org/issue16860 #16853: add a Selector to the select module http://bugs.python.org/issue16853 Top 10 most discussed issues (10) ================================= #16853: add a Selector to the select module http://bugs.python.org/issue16853 42 msgs 
#16748: Make CPython test package discoverable http://bugs.python.org/issue16748 19 msgs #16076: xml.etree.ElementTree.Element is no longer pickleable http://bugs.python.org/issue16076 16 msgs #5066: IDLE documentation for Unix obsolete/incorrect http://bugs.python.org/issue5066 14 msgs #16795: Patch: some changes to AST to make it more useful for static l http://bugs.python.org/issue16795 13 msgs #16902: Add OSS module support for Solaris http://bugs.python.org/issue16902 11 msgs #12107: TCP listening sockets created without FD_CLOEXEC flag http://bugs.python.org/issue12107 8 msgs #14468: Update cloning guidelines in devguide http://bugs.python.org/issue14468 8 msgs #16865: ctypes arrays >=2GB in length causes exception http://bugs.python.org/issue16865 8 msgs #16889: facilitate log output starting at beginning of line http://bugs.python.org/issue16889 8 msgs Issues closed (65) ================== #3073: Cookie.Morsel breaks in parsing cookie values with whitespace http://bugs.python.org/issue3073 closed by christian.heimes #3583: test_urllibnet.test_bad_address() fails when using OpenDNS http://bugs.python.org/issue3583 closed by brett.cannon #7378: unexpected truncation of traceback http://bugs.python.org/issue7378 closed by serhiy.storchaka #9242: unicodeobject.c: use of uninitialized values http://bugs.python.org/issue9242 closed by serhiy.storchaka #9685: tuples should remember their hash value http://bugs.python.org/issue9685 closed by benjamin.peterson #10895: Private stdlib API: getopt, getpass, glob, gzip, genericpath, http://bugs.python.org/issue10895 closed by brett.cannon #11461: UTF-16 incremental decoder doesn't support partial surrogate p http://bugs.python.org/issue11461 closed by serhiy.storchaka #13094: Need Programming FAQ entry for the behavior of closures http://bugs.python.org/issue13094 closed by ezio.melotti #13899: re pattern r"[\A]" should work like "A" but matches nothing. 
D http://bugs.python.org/issue13899 closed by ezio.melotti #13934: sqlite3 test typo http://bugs.python.org/issue13934 closed by r.david.murray #15109: sqlite3.Connection.iterdump() dies with encoding exception http://bugs.python.org/issue15109 closed by r.david.murray #15278: UnicodeDecodeError when readline in codecs.py http://bugs.python.org/issue15278 closed by serhiy.storchaka #15545: sqlite3.Connection.iterdump() does not work with row_factory = http://bugs.python.org/issue15545 closed by r.david.murray #15782: Compile error for a number of Mac modules with recent Xcode http://bugs.python.org/issue15782 closed by ned.deily #15845: Fixing some byte-to-string conversion warnings http://bugs.python.org/issue15845 closed by serhiy.storchaka #15972: wrong error message for os.path.getsize http://bugs.python.org/issue15972 closed by serhiy.storchaka #16143: Building with configure option "--without-doc-strings" crashes http://bugs.python.org/issue16143 closed by skrah #16154: Some minor doc fixes in Doc/library http://bugs.python.org/issue16154 closed by ezio.melotti #16320: Establish order in bytes/string dependencies http://bugs.python.org/issue16320 closed by serhiy.storchaka #16835: Update PEP 399 to allow for test discovery http://bugs.python.org/issue16835 closed by ezio.melotti #16836: configure script disables support for IPv6 on a system where I http://bugs.python.org/issue16836 closed by neologix #16843: sporadic test_sched failure http://bugs.python.org/issue16843 closed by serhiy.storchaka #16849: Element.{get,iter} doesn't handle keyword arguments when using http://bugs.python.org/issue16849 closed by eli.bendersky #16852: Fix test discovery for test_genericpath.py http://bugs.python.org/issue16852 closed by ezio.melotti #16854: usage() is not defined in Lib/test/regrtest.py http://bugs.python.org/issue16854 closed by chris.jerdonek #16861: Portion of code example not highlighted in collections doc http://bugs.python.org/issue16861 closed by 
ezio.melotti #16862: FAQ has outdated information about Stackless http://bugs.python.org/issue16862 closed by ezio.melotti #16867: setup.py fails if there are no extensions http://bugs.python.org/issue16867 closed by ned.deily #16868: Python Developer Guide: Include a reminder to "ping" bug repor http://bugs.python.org/issue16868 closed by ezio.melotti #16870: re fails to match ^ when start index is specified ? http://bugs.python.org/issue16870 closed by ned.deily #16871: Cleanup a few minor things http://bugs.python.org/issue16871 closed by ezio.melotti #16872: Metaclass conflict proposition http://bugs.python.org/issue16872 closed by benjamin.peterson #16873: increase epoll.poll() maxevents default value, and improve doc http://bugs.python.org/issue16873 closed by neologix #16874: setup.py upload option repeated in docs http://bugs.python.org/issue16874 closed by chris.jerdonek #16875: in 3.3.0 windows 64 idle editor http://bugs.python.org/issue16875 closed by ezio.melotti #16876: epoll: reuse epoll_event buffer instead of allocating a new on http://bugs.python.org/issue16876 closed by neologix #16877: Odd behavior of ~ in os.path.abspath and os.path.realpath http://bugs.python.org/issue16877 closed by r.david.murray #16881: Py_ARRAY_LENGTH macro incorrect with GCC < 3.1 http://bugs.python.org/issue16881 closed by christian.heimes #16882: Python 2.7 has 73 files with hard references to /usr/local whe http://bugs.python.org/issue16882 closed by ned.deily #16883: "--enable-shared" during configure forces "2.7.3" to build as http://bugs.python.org/issue16883 closed by ned.deily #16884: logging handler automatically added starting in 3.2+ http://bugs.python.org/issue16884 closed by python-dev #16888: Fix test discovery for test_array.py http://bugs.python.org/issue16888 closed by ezio.melotti #16890: minidom error http://bugs.python.org/issue16890 closed by ezio.melotti #16896: Fix test discovery for test_asyncore.py http://bugs.python.org/issue16896 closed by 
ezio.melotti #16897: Fix test discovery for test_bisect.py http://bugs.python.org/issue16897 closed by ezio.melotti #16898: Fix test discovery for test_bufio.py http://bugs.python.org/issue16898 closed by ezio.melotti #16900: SSL sockets don't give resource warnings http://bugs.python.org/issue16900 closed by python-dev #16905: Fix test discovery for test_warnings http://bugs.python.org/issue16905 closed by ezio.melotti #16906: Bug in _PyUnicode_ClearStaticStrings() method of unicodeobject http://bugs.python.org/issue16906 closed by python-dev #16908: Enhancing performance and memory usage http://bugs.python.org/issue16908 closed by georg.brandl #16910: Fix test discovery for bytes/string tests http://bugs.python.org/issue16910 closed by ezio.melotti #16911: Socket Documentation Error http://bugs.python.org/issue16911 closed by sghose #16912: Wrong float sum w/ 10.14+10.1 and 10.24+10.2 http://bugs.python.org/issue16912 closed by ezio.melotti #16913: ElementTree tostring error when method='text' http://bugs.python.org/issue16913 closed by eli.bendersky #16917: Misleading missing parenthesis syntax error http://bugs.python.org/issue16917 closed by ezio.melotti #16918: Fix test discovery for test_codecs.py http://bugs.python.org/issue16918 closed by ezio.melotti #16919: Fix test discovery for test_crypt.py http://bugs.python.org/issue16919 closed by ezio.melotti #16920: multiprocessing.connection listener gets MemoryError on recv http://bugs.python.org/issue16920 closed by sbt #16924: try: except: ordering error http://bugs.python.org/issue16924 closed by r.david.murray #16925: Fix test discovery for test_configparser.py http://bugs.python.org/issue16925 closed by ezio.melotti #16928: spurious Cron Daemon e-mails to docs at dinsdale.python.org http://bugs.python.org/issue16928 closed by georg.brandl #16929: poll()/epoll() are not thread-safe http://bugs.python.org/issue16929 closed by neologix #16932: urlparse fails at parsing "www.python.org:80/" 
http://bugs.python.org/issue16932 closed by georg.brandl #16934: qsort doesn't work for double arrays http://bugs.python.org/issue16934 closed by mark.dickinson #1694663: Overloading int.__pow__ does not work http://bugs.python.org/issue1694663 closed by serhiy.storchaka From brett at python.org Fri Jan 11 18:08:18 2013 From: brett at python.org (Brett Cannon) Date: Fri, 11 Jan 2013 12:08:18 -0500 Subject: [Python-Dev] [Python-checkins] cpython (merge 3.3 -> default): Issue #15539: Fix a number of bugs in Tools/scripts/pindent.py. In-Reply-To: <3YjLF866ftzS6g@mail.python.org> References: <3YjLF866ftzS6g@mail.python.org> Message-ID: This seems to have caused the Windows buildbots to fail. On Fri, Jan 11, 2013 at 5:40 AM, serhiy.storchaka wrote: > http://hg.python.org/cpython/rev/8452c23139c6 > changeset: 81407:8452c23139c6 > parent: 81399:5ec8daab477a > parent: 81406:01df1f7841b2 > user: Serhiy Storchaka > date: Fri Jan 11 12:12:32 2013 +0200 > summary: > Issue #15539: Fix a number of bugs in Tools/scripts/pindent.py. > Now pindent.py works with a "with" statement. pindent.py no longer produces > improper indentation. pindent.py now works with continued lines broken after > "class" or "def" keywords and with continuations at the start of line. Added > regression tests for pindent.py. Modernized pindent.py. 
> > files: > Lib/test/test_tools.py | 326 ++++++++++++++++++++++++++- > Misc/NEWS | 7 + > Tools/scripts/pindent.py | 166 +++++-------- > 3 files changed, 393 insertions(+), 106 deletions(-) > > > diff --git a/Lib/test/test_tools.py b/Lib/test/test_tools.py > --- a/Lib/test/test_tools.py > +++ b/Lib/test/test_tools.py > @@ -9,10 +9,13 @@ > import importlib.machinery > import unittest > from unittest import mock > +import shutil > +import subprocess > import sysconfig > import tempfile > +import textwrap > from test import support > -from test.script_helper import assert_python_ok > +from test.script_helper import assert_python_ok, temp_dir > > if not sysconfig.is_python_build(): > # XXX some installers do contain the tools, should we detect that > @@ -36,6 +39,327 @@ > self.assertGreater(err, b'') > > > +class PindentTests(unittest.TestCase): > + script = os.path.join(scriptsdir, 'pindent.py') > + > + def assertFileEqual(self, fn1, fn2): > + with open(fn1) as f1, open(fn2) as f2: > + self.assertEqual(f1.readlines(), f2.readlines()) > + > + def pindent(self, source, *args): > + with subprocess.Popen( > + (sys.executable, self.script) + args, > + stdin=subprocess.PIPE, stdout=subprocess.PIPE, > + universal_newlines=True) as proc: > + out, err = proc.communicate(source) > + self.assertIsNone(err) > + return out > + > + def lstriplines(self, data): > + return '\n'.join(line.lstrip() for line in data.splitlines()) + '\n' > + > + def test_selftest(self): > + with temp_dir() as directory: > + data_path = os.path.join(directory, '_test.py') > + with open(self.script) as f: > + closed = f.read() > + with open(data_path, 'w') as f: > + f.write(closed) > + > + rc, out, err = assert_python_ok(self.script, '-d', data_path) > + self.assertEqual(out, b'') > + self.assertEqual(err, b'') > + backup = data_path + '~' > + self.assertTrue(os.path.exists(backup)) > + with open(backup) as f: > + self.assertEqual(f.read(), closed) > + with open(data_path) as f: > + clean = f.read() > + 
compile(clean, '_test.py', 'exec') > + self.assertEqual(self.pindent(clean, '-c'), closed) > + self.assertEqual(self.pindent(closed, '-d'), clean) > + > + rc, out, err = assert_python_ok(self.script, '-c', data_path) > + self.assertEqual(out, b'') > + self.assertEqual(err, b'') > + with open(backup) as f: > + self.assertEqual(f.read(), clean) > + with open(data_path) as f: > + self.assertEqual(f.read(), closed) > + > + broken = self.lstriplines(closed) > + with open(data_path, 'w') as f: > + f.write(broken) > + rc, out, err = assert_python_ok(self.script, '-r', data_path) > + self.assertEqual(out, b'') > + self.assertEqual(err, b'') > + with open(backup) as f: > + self.assertEqual(f.read(), broken) > + with open(data_path) as f: > + indented = f.read() > + compile(indented, '_test.py', 'exec') > + self.assertEqual(self.pindent(broken, '-r'), indented) > + > + def pindent_test(self, clean, closed): > + self.assertEqual(self.pindent(clean, '-c'), closed) > + self.assertEqual(self.pindent(closed, '-d'), clean) > + broken = self.lstriplines(closed) > + self.assertEqual(self.pindent(broken, '-r', '-e', '-s', '4'), closed) > + > + def test_statements(self): > + clean = textwrap.dedent("""\ > + if a: > + pass > + > + if a: > + pass > + else: > + pass > + > + if a: > + pass > + elif: > + pass > + else: > + pass > + > + while a: > + break > + > + while a: > + break > + else: > + pass > + > + for i in a: > + break > + > + for i in a: > + break > + else: > + pass > + > + try: > + pass > + finally: > + pass > + > + try: > + pass > + except TypeError: > + pass > + except ValueError: > + pass > + else: > + pass > + > + try: > + pass > + except TypeError: > + pass > + except ValueError: > + pass > + finally: > + pass > + > + with a: > + pass > + > + class A: > + pass > + > + def f(): > + pass > + """) > + > + closed = textwrap.dedent("""\ > + if a: > + pass > + # end if > + > + if a: > + pass > + else: > + pass > + # end if > + > + if a: > + pass > + elif: > + pass > + else: > + 
pass > + # end if > + > + while a: > + break > + # end while > + > + while a: > + break > + else: > + pass > + # end while > + > + for i in a: > + break > + # end for > + > + for i in a: > + break > + else: > + pass > + # end for > + > + try: > + pass > + finally: > + pass > + # end try > + > + try: > + pass > + except TypeError: > + pass > + except ValueError: > + pass > + else: > + pass > + # end try > + > + try: > + pass > + except TypeError: > + pass > + except ValueError: > + pass > + finally: > + pass > + # end try > + > + with a: > + pass > + # end with > + > + class A: > + pass > + # end class A > + > + def f(): > + pass > + # end def f > + """) > + self.pindent_test(clean, closed) > + > + def test_multilevel(self): > + clean = textwrap.dedent("""\ > + def foobar(a, b): > + if a == b: > + a = a+1 > + elif a < b: > + b = b-1 > + if b > a: a = a-1 > + else: > + print 'oops!' > + """) > + closed = textwrap.dedent("""\ > + def foobar(a, b): > + if a == b: > + a = a+1 > + elif a < b: > + b = b-1 > + if b > a: a = a-1 > + # end if > + else: > + print 'oops!' 
> + # end if > + # end def foobar > + """) > + self.pindent_test(clean, closed) > + > + def test_preserve_indents(self): > + clean = textwrap.dedent("""\ > + if a: > + if b: > + pass > + """) > + closed = textwrap.dedent("""\ > + if a: > + if b: > + pass > + # end if > + # end if > + """) > + self.assertEqual(self.pindent(clean, '-c'), closed) > + self.assertEqual(self.pindent(closed, '-d'), clean) > + broken = self.lstriplines(closed) > + self.assertEqual(self.pindent(broken, '-r', '-e', '-s', '9'), closed) > + clean = textwrap.dedent("""\ > + if a: > + \tif b: > + \t\tpass > + """) > + closed = textwrap.dedent("""\ > + if a: > + \tif b: > + \t\tpass > + \t# end if > + # end if > + """) > + self.assertEqual(self.pindent(clean, '-c'), closed) > + self.assertEqual(self.pindent(closed, '-d'), clean) > + broken = self.lstriplines(closed) > + self.assertEqual(self.pindent(broken, '-r'), closed) > + > + def test_escaped_newline(self): > + clean = textwrap.dedent("""\ > + class\\ > + \\ > + A: > + def\ > + \\ > + f: > + pass > + """) > + closed = textwrap.dedent("""\ > + class\\ > + \\ > + A: > + def\ > + \\ > + f: > + pass > + # end def f > + # end class A > + """) > + self.assertEqual(self.pindent(clean, '-c'), closed) > + self.assertEqual(self.pindent(closed, '-d'), clean) > + > + def test_empty_line(self): > + clean = textwrap.dedent("""\ > + if a: > + > + pass > + """) > + closed = textwrap.dedent("""\ > + if a: > + > + pass > + # end if > + """) > + self.pindent_test(clean, closed) > + > + def test_oneline(self): > + clean = textwrap.dedent("""\ > + if a: pass > + """) > + closed = textwrap.dedent("""\ > + if a: pass > + # end if > + """) > + self.pindent_test(clean, closed) > + > + > class TestSundryScripts(unittest.TestCase): > # At least make sure the rest don't have syntax errors. When tests are > # added for a script it should be added to the whitelist below. 
> diff --git a/Misc/NEWS b/Misc/NEWS > --- a/Misc/NEWS > +++ b/Misc/NEWS > @@ -621,6 +621,8 @@ > Tests > ----- > > +- Issue #15539: Added regression tests for Tools/scripts/pindent.py. > + > - Issue #16836: Enable IPv6 support even if IPv6 is disabled on the build host. > > - Issue #16925: test_configparser now works with unittest test discovery. > @@ -777,6 +779,11 @@ > Tools/Demos > ----------- > > +- Issue #15539: Fix a number of bugs in Tools/scripts/pindent.py. Now > + pindent.py works with a "with" statement. pindent.py no longer produces > + improper indentation. pindent.py now works with continued lines broken after > + "class" or "def" keywords and with continuations at the start of line. > + > - Issue #11797: Add a 2to3 fixer that maps reload() to imp.reload(). > > - Issue #10966: Remove the concept of unexpected skipped tests. > diff --git a/Tools/scripts/pindent.py b/Tools/scripts/pindent.py > --- a/Tools/scripts/pindent.py > +++ b/Tools/scripts/pindent.py > @@ -79,8 +79,9 @@ > # Defaults > STEPSIZE = 8 > TABSIZE = 8 > -EXPANDTABS = 0 > +EXPANDTABS = False > > +import io > import re > import sys > > @@ -89,7 +90,8 @@ > next['while'] = next['for'] = 'else', 'end' > next['try'] = 'except', 'finally' > next['except'] = 'except', 'else', 'finally', 'end' > -next['else'] = next['finally'] = next['def'] = next['class'] = 'end' > +next['else'] = next['finally'] = next['with'] = \ > + next['def'] = next['class'] = 'end' > next['end'] = () > start = 'if', 'while', 'for', 'try', 'with', 'def', 'class' > > @@ -105,11 +107,11 @@ > self.expandtabs = expandtabs > self._write = fpo.write > self.kwprog = re.compile( > - r'^\s*(?P[a-z]+)' > - r'(\s+(?P[a-zA-Z_]\w*))?' > + r'^(?:\s|\\\n)*(?P[a-z]+)' > + r'((?:\s|\\\n)+(?P[a-zA-Z_]\w*))?' > r'[^\w]') > self.endprog = re.compile( > - r'^\s*#?\s*end\s+(?P[a-z]+)' > + r'^(?:\s|\\\n)*#?\s*end\s+(?P[a-z]+)' > r'(\s+(?P[a-zA-Z_]\w*))?' 
> r'[^\w]') > self.wsprog = re.compile(r'^[ \t]*') > @@ -125,7 +127,7 @@ > > def readline(self): > line = self.fpi.readline() > - if line: self.lineno = self.lineno + 1 > + if line: self.lineno += 1 > # end if > return line > # end def readline > @@ -143,27 +145,24 @@ > line2 = self.readline() > if not line2: break > # end if > - line = line + line2 > + line += line2 > # end while > return line > # end def getline > > - def putline(self, line, indent = None): > - if indent is None: > - self.write(line) > - return > + def putline(self, line, indent): > + tabs, spaces = divmod(indent*self.indentsize, self.tabsize) > + i = self.wsprog.match(line).end() > + line = line[i:] > + if line[:1] not in ('\n', '\r', ''): > + line = '\t'*tabs + ' '*spaces + line > # end if > - tabs, spaces = divmod(indent*self.indentsize, self.tabsize) > - i = 0 > - m = self.wsprog.match(line) > - if m: i = m.end() > - # end if > - self.write('\t'*tabs + ' '*spaces + line[i:]) > + self.write(line) > # end def putline > > def reformat(self): > stack = [] > - while 1: > + while True: > line = self.getline() > if not line: break # EOF > # end if > @@ -173,10 +172,9 @@ > kw2 = m.group('kw') > if not stack: > self.error('unexpected end') > - elif stack[-1][0] != kw2: > + elif stack.pop()[0] != kw2: > self.error('unmatched end') > # end if > - del stack[-1:] > self.putline(line, len(stack)) > continue > # end if > @@ -208,23 +206,23 @@ > def delete(self): > begin_counter = 0 > end_counter = 0 > - while 1: > + while True: > line = self.getline() > if not line: break # EOF > # end if > m = self.endprog.match(line) > if m: > - end_counter = end_counter + 1 > + end_counter += 1 > continue > # end if > m = self.kwprog.match(line) > if m: > kw = m.group('kw') > if kw in start: > - begin_counter = begin_counter + 1 > + begin_counter += 1 > # end if > # end if > - self.putline(line) > + self.write(line) > # end while > if begin_counter - end_counter < 0: > sys.stderr.write('Warning: input contained more end 
tags than expected\n') > @@ -234,17 +232,12 @@ > # end def delete > > def complete(self): > - self.indentsize = 1 > stack = [] > todo = [] > - thisid = '' > - current, firstkw, lastkw, topid = 0, '', '', '' > - while 1: > + currentws = thisid = firstkw = lastkw = topid = '' > + while True: > line = self.getline() > - i = 0 > - m = self.wsprog.match(line) > - if m: i = m.end() > - # end if > + i = self.wsprog.match(line).end() > m = self.endprog.match(line) > if m: > thiskw = 'end' > @@ -269,7 +262,9 @@ > thiskw = '' > # end if > # end if > - indent = len(line[:i].expandtabs(self.tabsize)) > + indentws = line[:i] > + indent = len(indentws.expandtabs(self.tabsize)) > + current = len(currentws.expandtabs(self.tabsize)) > while indent < current: > if firstkw: > if topid: > @@ -278,11 +273,11 @@ > else: > s = '# end %s\n' % firstkw > # end if > - self.putline(s, current) > + self.write(currentws + s) > firstkw = lastkw = '' > # end if > - current, firstkw, lastkw, topid = stack[-1] > - del stack[-1] > + currentws, firstkw, lastkw, topid = stack.pop() > + current = len(currentws.expandtabs(self.tabsize)) > # end while > if indent == current and firstkw: > if thiskw == 'end': > @@ -297,18 +292,18 @@ > else: > s = '# end %s\n' % firstkw > # end if > - self.putline(s, current) > + self.write(currentws + s) > firstkw = lastkw = topid = '' > # end if > # end if > if indent > current: > - stack.append((current, firstkw, lastkw, topid)) > + stack.append((currentws, firstkw, lastkw, topid)) > if thiskw and thiskw not in start: > # error > thiskw = '' > # end if > - current, firstkw, lastkw, topid = \ > - indent, thiskw, thiskw, thisid > + currentws, firstkw, lastkw, topid = \ > + indentws, thiskw, thiskw, thisid > # end if > if thiskw: > if thiskw in start: > @@ -326,7 +321,6 @@ > self.write(line) > # end while > # end def complete > - > # end class PythonIndenter > > # Simplified user interface > @@ -352,76 +346,34 @@ > pi.reformat() > # end def reformat_filter > > -class 
StringReader: > - def __init__(self, buf): > - self.buf = buf > - self.pos = 0 > - self.len = len(self.buf) > - # end def __init__ > - def read(self, n = 0): > - if n <= 0: > - n = self.len - self.pos > - else: > - n = min(n, self.len - self.pos) > - # end if > - r = self.buf[self.pos : self.pos + n] > - self.pos = self.pos + n > - return r > - # end def read > - def readline(self): > - i = self.buf.find('\n', self.pos) > - return self.read(i + 1 - self.pos) > - # end def readline > - def readlines(self): > - lines = [] > - line = self.readline() > - while line: > - lines.append(line) > - line = self.readline() > - # end while > - return lines > - # end def readlines > - # seek/tell etc. are left as an exercise for the reader > -# end class StringReader > - > -class StringWriter: > - def __init__(self): > - self.buf = '' > - # end def __init__ > - def write(self, s): > - self.buf = self.buf + s > - # end def write > - def getvalue(self): > - return self.buf > - # end def getvalue > -# end class StringWriter > - > def complete_string(source, stepsize = STEPSIZE, tabsize = TABSIZE, expandtabs = EXPANDTABS): > - input = StringReader(source) > - output = StringWriter() > + input = io.StringIO(source) > + output = io.StringIO() > pi = PythonIndenter(input, output, stepsize, tabsize, expandtabs) > pi.complete() > return output.getvalue() > # end def complete_string > > def delete_string(source, stepsize = STEPSIZE, tabsize = TABSIZE, expandtabs = EXPANDTABS): > - input = StringReader(source) > - output = StringWriter() > + input = io.StringIO(source) > + output = io.StringIO() > pi = PythonIndenter(input, output, stepsize, tabsize, expandtabs) > pi.delete() > return output.getvalue() > # end def delete_string > > def reformat_string(source, stepsize = STEPSIZE, tabsize = TABSIZE, expandtabs = EXPANDTABS): > - input = StringReader(source) > - output = StringWriter() > + input = io.StringIO(source) > + output = io.StringIO() > pi = PythonIndenter(input, output, stepsize, 
tabsize, expandtabs) > pi.reformat() > return output.getvalue() > # end def reformat_string > > def complete_file(filename, stepsize = STEPSIZE, tabsize = TABSIZE, expandtabs = EXPANDTABS): > - source = open(filename, 'r').read() > + with open(filename, 'r') as f: > + source = f.read() > + # end with > result = complete_string(source, stepsize, tabsize, expandtabs) > if source == result: return 0 > # end if > @@ -429,14 +381,16 @@ > try: os.rename(filename, filename + '~') > except OSError: pass > # end try > - f = open(filename, 'w') > - f.write(result) > - f.close() > + with open(filename, 'w') as f: > + f.write(result) > + # end with > return 1 > # end def complete_file > > def delete_file(filename, stepsize = STEPSIZE, tabsize = TABSIZE, expandtabs = EXPANDTABS): > - source = open(filename, 'r').read() > + with open(filename, 'r') as f: > + source = f.read() > + # end with > result = delete_string(source, stepsize, tabsize, expandtabs) > if source == result: return 0 > # end if > @@ -444,14 +398,16 @@ > try: os.rename(filename, filename + '~') > except OSError: pass > # end try > - f = open(filename, 'w') > - f.write(result) > - f.close() > + with open(filename, 'w') as f: > + f.write(result) > + # end with > return 1 > # end def delete_file > > def reformat_file(filename, stepsize = STEPSIZE, tabsize = TABSIZE, expandtabs = EXPANDTABS): > - source = open(filename, 'r').read() > + with open(filename, 'r') as f: > + source = f.read() > + # end with > result = reformat_string(source, stepsize, tabsize, expandtabs) > if source == result: return 0 > # end if > @@ -459,9 +415,9 @@ > try: os.rename(filename, filename + '~') > except OSError: pass > # end try > - f = open(filename, 'w') > - f.write(result) > - f.close() > + with open(filename, 'w') as f: > + f.write(result) > + # end with > return 1 > # end def reformat_file > > @@ -474,7 +430,7 @@ > -r : reformat a completed program (use #end directives) > -s stepsize: indentation step (default %(STEPSIZE)d) > -t 
tabsize : the worth in spaces of a tab (default %(TABSIZE)d) > --e : expand TABs into spaces (defailt OFF) > +-e : expand TABs into spaces (default OFF) > [file] ... : files are changed in place, with backups in file~ > If no files are specified or a single - is given, > the program acts as a filter (reads stdin, writes stdout). > @@ -517,7 +473,7 @@ > elif o == '-t': > tabsize = int(a) > elif o == '-e': > - expandtabs = 1 > + expandtabs = True > # end if > # end for > if not action: > > -- > Repository URL: http://hg.python.org/cpython > > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins > From agriff at tin.it Fri Jan 11 18:19:44 2013 From: agriff at tin.it (Andrea Griffini) Date: Fri, 11 Jan 2013 18:19:44 +0100 Subject: [Python-Dev] Daily reference leaks (aef7db0d3893): sum=287 In-Reply-To: <50F038FE.208@python.org> References: <50F038FE.208@python.org> Message-ID: On Fri, Jan 11, 2013 at 5:08 PM, Christian Heimes wrote: > It has more issues. Coverity has sent me some complains, see attachment. The second complaint seems a false positive; if self->extra is null then children is set to 0 and following code is not executed. From eliben at gmail.com Fri Jan 11 18:36:06 2013 From: eliben at gmail.com (Eli Bendersky) Date: Fri, 11 Jan 2013 09:36:06 -0800 Subject: [Python-Dev] [Python-checkins] Daily reference leaks (aef7db0d3893): sum=287 In-Reply-To: References: Message-ID: On Thu, Jan 10, 2013 at 10:09 PM, Nick Coghlan wrote: > On Fri, Jan 11, 2013 at 2:57 PM, wrote: > > results for aef7db0d3893 on branch "default" > > -------------------------------------------- > > > > test_dbm leaked [2, 0, 0] references, sum=2 > > test_dbm leaked [2, 2, 1] memory blocks, sum=5 > > Hmm, I'm starting to wonder if there's something to this one - it > seems to be popping up a bit lately. 
> > > test_xml_etree_c leaked [56, 56, 56] references, sum=168 > > test_xml_etree_c leaked [36, 38, 38] memory blocks, sum=112 > > I'm gonna take a wild guess and suggest there may be a problem with > the recent pickling fix in the C extension :) > Yep. We're on it :-) Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian at python.org Fri Jan 11 18:42:20 2013 From: christian at python.org (Christian Heimes) Date: Fri, 11 Jan 2013 18:42:20 +0100 Subject: [Python-Dev] Daily reference leaks (aef7db0d3893): sum=287 In-Reply-To: References: <50F038FE.208@python.org> Message-ID: <50F04EFC.80505@python.org> Am 11.01.2013 18:19, schrieb Andrea Griffini: > On Fri, Jan 11, 2013 at 5:08 PM, Christian Heimes wrote: >> It has more issues. Coverity has sent me some complains, see attachment. > > The second complaint seems a false positive; if self->extra is null > then children is set to 0 and following code is not executed. Yes, Coverity's results have lots of false positives. On the other hand it's able to detect hard to find resource leaks. You get used to the false positives and learn to appreciate the good stuff. :) From ncoghlan at gmail.com Sat Jan 12 03:55:25 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 12 Jan 2013 12:55:25 +1000 Subject: [Python-Dev] [Python-checkins] cpython: Issue #15031: Refactor some code in importlib pertaining to validating In-Reply-To: <3Yjfsp6PBWzSBY@mail.python.org> References: <3Yjfsp6PBWzSBY@mail.python.org> Message-ID: Nice improvement. Just a couple of minor cleanup suggestions. On Sat, Jan 12, 2013 at 9:09 AM, brett.cannon wrote: > + else: > + # To prevent having to make all messages have a conditional name. 
> + name = 'bytecode' For consistency with other default/implied names, I suggest wrapping this in angle brackets: "") > + if path is not None: > + exc_details['path'] = path > + magic = data[:4] > + raw_timestamp = data[4:8] > + raw_size = data[8:12] > + if magic != _MAGIC_BYTES: > + msg = 'bad magic number in {!r}: {!r}'.format(name, magic) > + raise ImportError(msg, **exc_details) > + elif len(raw_timestamp) != 4: > + message = 'bad timestamp in {!r}'.format(name) > + _verbose_message(message) > + raise EOFError(message) > + elif len(raw_size) != 4: > + message = 'bad size in {!r}'.format(name) > + _verbose_message(message) > + raise EOFError(message) For timestamp and size "incomplete" would probably be a better word than "bad" in the error messages (since we're only checking the length rather than the value). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From storchaka at gmail.com Sat Jan 12 13:07:27 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 12 Jan 2013 14:07:27 +0200 Subject: [Python-Dev] cpython (merge 3.3 -> default): Issue #15539: Fix a number of bugs in Tools/scripts/pindent.py. In-Reply-To: References: <3YjLF866ftzS6g@mail.python.org> Message-ID: On 11.01.13 19:08, Brett Cannon wrote: > This seems to have caused the Windows buildbots to fail. Yes, Ezio had already told me. I wrote too strong tests which caught yet one bug on Windows. Now it is fixed and all bots happy (except one). Is there a possibility to subscribe to email notifications about all buildbot's breaks in which I blamed? 
From eliben at gmail.com Sat Jan 12 14:46:33 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 12 Jan 2013 05:46:33 -0800 Subject: [Python-Dev] [Python-checkins] Daily reference leaks (aef7db0d3893): sum=287 In-Reply-To: <50F038FE.208@python.org> References: <50F038FE.208@python.org> Message-ID: On Fri, Jan 11, 2013 at 8:08 AM, Christian Heimes wrote: > Am 11.01.2013 07:09, schrieb Nick Coghlan: > > On Fri, Jan 11, 2013 at 2:57 PM, wrote: > >> results for aef7db0d3893 on branch "default" > >> -------------------------------------------- > >> > >> test_dbm leaked [2, 0, 0] references, sum=2 > >> test_dbm leaked [2, 2, 1] memory blocks, sum=5 > > > > Hmm, I'm starting to wonder if there's something to this one - it > > seems to be popping up a bit lately. > > > >> test_xml_etree_c leaked [56, 56, 56] references, sum=168 > >> test_xml_etree_c leaked [36, 38, 38] memory blocks, sum=112 > > > > I'm gonna take a wild guess and suggest there may be a problem with > > the recent pickling fix in the C extension :) > > It has more issues. Coverity has sent me some complains, see attachment. > The second report is indeed a false positive. Coverity doesn't know that PyList_GET_SIZE returns 0 for PyList_New(0). Maybe it can be taught some project/domain-specific information? The first report is legit, however. PyTuple_New(0) was called and its return value wasn't checked for NULL. I actually think Coverity is very useful for such cases because forgetting to check NULL returns from PyObject constructors is a common mistake and it's not something that would show up in tests. Anyway, this was fixed. Thanks for reporting Eli -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brett at python.org Sat Jan 12 14:57:39 2013 From: brett at python.org (Brett Cannon) Date: Sat, 12 Jan 2013 08:57:39 -0500 Subject: [Python-Dev] [Python-checkins] cpython: Issue #15031: Refactor some code in importlib pertaining to validating In-Reply-To: References: <3Yjfsp6PBWzSBY@mail.python.org> Message-ID: On Fri, Jan 11, 2013 at 9:55 PM, Nick Coghlan wrote: > Nice improvement. Just a couple of minor cleanup suggestions. > > On Sat, Jan 12, 2013 at 9:09 AM, brett.cannon > wrote: >> + else: >> + # To prevent having to make all messages have a conditional name. >> + name = 'bytecode' > > For consistency with other default/implied names, I suggest wrapping > this in angle brackets: "") Good suggestion. > >> + if path is not None: >> + exc_details['path'] = path >> + magic = data[:4] >> + raw_timestamp = data[4:8] >> + raw_size = data[8:12] >> + if magic != _MAGIC_BYTES: >> + msg = 'bad magic number in {!r}: {!r}'.format(name, magic) >> + raise ImportError(msg, **exc_details) >> + elif len(raw_timestamp) != 4: >> + message = 'bad timestamp in {!r}'.format(name) >> + _verbose_message(message) >> + raise EOFError(message) >> + elif len(raw_size) != 4: >> + message = 'bad size in {!r}'.format(name) >> + _verbose_message(message) >> + raise EOFError(message) > > For timestamp and size "incomplete" would probably be a better word > than "bad" in the error messages (since we're only checking the length > rather than the value). True. Those were the original messages and in hindsight not accurate. Since we don't consider exception messages a backwards-compatible thing I'll update them. From brett at python.org Sat Jan 12 15:08:45 2013 From: brett at python.org (Brett Cannon) Date: Sat, 12 Jan 2013 09:08:45 -0500 Subject: [Python-Dev] cpython (merge 3.3 -> default): Issue #15539: Fix a number of bugs in Tools/scripts/pindent.py. 
In-Reply-To: References: <3YjLF866ftzS6g@mail.python.org> Message-ID: On Sat, Jan 12, 2013 at 7:07 AM, Serhiy Storchaka wrote: > On 11.01.13 19:08, Brett Cannon wrote: >> >> This seems to have caused the Windows buildbots to fail. > > > Yes, Ezio had already told me. I wrote too strong tests which caught yet one > bug on Windows. Now it is fixed and all bots happy (except one). > > Is there a possibility to subscribe to email notifications about all > buildbot's breaks in which I blamed? Not that I know of. I usually leave the buildbot page up after a commit and just keep refreshing until a majority of the buildbots have completed building and are green. But hacking something together like that wouldn't be too hard. Could do that on App Engine on a free account as long as you didn't exhaust your free quota for fetching web pages, and with pubsubhubbub you probably wouldn't. Honestly the biggest pain I see is mapping commit real names to email addresses and maintaining the list of stable vs. unstable builders so people know how severe the breakage is (although I guess a break is a break and stable is more "always up" than "finicky failures"). And then from there you can do Chrome/Firefox extensions, build orbs, etc. It never ends! =) From dustin at v.igoro.us Sat Jan 12 16:03:40 2013 From: dustin at v.igoro.us (Dustin J. Mitchell) Date: Sat, 12 Jan 2013 10:03:40 -0500 Subject: [Python-Dev] [Python-ideas] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: On Wed, Jan 9, 2013 at 12:14 AM, Guido van Rossum wrote: > But which half? A socket is two independent streams, one in each > direction. Twisted uses half_close() for this concept but unless you > already know what this is for you are left wondering which half. Which > is why I like using 'write' in the name. FWIW, "half-closed" is, IMHO, a well-known term. It's not just a Twisted thing. Either name is better than "shutdown"! 
Dustin From victor.stinner at gmail.com Sat Jan 12 21:29:18 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Sat, 12 Jan 2013 21:29:18 +0100 Subject: [Python-Dev] fork or exec? In-Reply-To: References: <20130110103751.523ddb0a@pitrou.net> <20130110135241.088a6a24@pitrou.net> Message-ID: 2013/1/11 Ben Leslie : > Python is not UNIX, but I think if you are directly using the POSIX > interfaces they should > work (more or less) the same way the would if you were writing a C program. > (Some of us > still use Python to prototype things that will later be converted to C!). I completed the PEP to list advantages and drawbacks of the different alternatives. I moved the PEP to the PEP repository because I now consider it complete enough (for a first draft!): http://www.python.org/dev/peps/pep-0433/ I see 3 major alternatives: (a) add cloexec=False argument to some functions http://www.python.org/dev/peps/pep-0433/#proposition Drawbacks: http://www.python.org/dev/peps/pep-0433/#scope (b) always set close-on-exec flag and add cloexec=True argument to some functions (to be able to disable it) http://www.python.org/dev/peps/pep-0433/#always-set-close-on-exec-flag (c) add cloexec argument to some functions and a sys.setdefaultcloexec() function to change the default value of cloexec (will be False at startup) http://www.python.org/dev/peps/pep-0433/#add-a-function-to-set-close-on-exec-flag-by-default If you care about backward compatibility, (a) is the safest option. If you care about bugs and security, (b) is the best option. For example, (a) does not fix os.urandom(). (c) tries to be a compromise, but has its own drawbacks. 
Victor From victor.stinner at gmail.com Sun Jan 13 01:38:16 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Sun, 13 Jan 2013 01:38:16 +0100 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors Message-ID: HTML version: http://www.python.org/dev/peps/pep-0433/ *** PEP: 433 Title: Add cloexec argument to functions creating file descriptors Version: $Revision$ Last-Modified: $Date$ Author: Victor Stinner Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 10-January-2013 Python-Version: 3.4 Abstract ======== This PEP proposes to add a new optional argument ``cloexec`` to functions creating file descriptors in the Python standard library. If the argument is ``True``, the close-on-exec flag will be set on the new file descriptor. Rationale ========= On UNIX, subprocess closes file descriptors greater than 2 by default since Python 3.2 [#subprocess_close]_. All file descriptors created by the parent process are automatically closed. ``xmlrpc.server.SimpleXMLRPCServer`` sets the close-on-exec flag of the listening socket, whereas the parent class ``socketserver.BaseServer`` does not set this flag. There are other cases creating a subprocess or executing a new program where file descriptors are not closed: functions of the os.spawn*() family and third party modules calling ``exec()`` or ``fork()`` + ``exec()``. In this case, file descriptors are shared between the parent and the child processes, which is usually unexpected and causes various issues. This PEP proposes to continue the work started with the change in subprocess, to fix the issue in any code, and not just code using subprocess. Inherited file descriptors issues --------------------------------- Closing the file descriptor in the parent process does not close the related resource (file, socket, ...) because it is still open in the child process. 
The listening socket of TCPServer is not closed on ``exec()``: the child process is able to accept connections from new clients; if the parent closes the listening socket and creates a new listening socket on the same address, it would get an "address already in use" error. Not closing file descriptors can lead to resource exhaustion: even if the parent closes all files, creating a new file descriptor may fail with "too many files" because files are still open in the child process. Security -------- Leaking file descriptors is a major security vulnerability. An untrusted child process can read sensitive data like passwords and take control of the parent process through leaked file descriptors. It is, for example, a known vulnerability to escape from a chroot. Atomicity --------- Using ``fcntl()`` to set the close-on-exec flag is not safe in a multithreaded application. If a thread calls ``fork()`` and ``exec()`` between the creation of the file descriptor and the call to ``fcntl(fd, F_SETFD, new_flags)``, the file descriptor will be inherited by the child process. Modern operating systems offer functions to set the flag during the creation of the file descriptor, which avoids the race condition. Portability ----------- Python 3.2 added the ``socket.SOCK_CLOEXEC`` flag, Python 3.3 added the ``os.O_CLOEXEC`` flag and the ``os.pipe2()`` function. It is already possible in Python 3.3 to atomically set the close-on-exec flag when opening a file and creating a pipe or socket. The problem is that these flags and functions are not portable: only recent versions of operating systems support them. The ``O_CLOEXEC`` and ``SOCK_CLOEXEC`` flags are ignored by old Linux versions, so the ``FD_CLOEXEC`` flag must be checked using ``fcntl(fd, F_GETFD)``. If the kernel ignores the ``O_CLOEXEC`` or ``SOCK_CLOEXEC`` flag, a call to ``fcntl(fd, F_SETFD, flags)`` is required to set the close-on-exec flag. .. 
note:: OpenBSD older than 5.2 does not close the file descriptor with the close-on-exec flag set if ``fork()`` is used before ``exec()``, but it works correctly if ``exec()`` is called without ``fork()``. Scope ----- Applications still have to explicitly close file descriptors after a ``fork()``. The close-on-exec flag only closes file descriptors after ``exec()``, and so after ``fork()`` + ``exec()``. This PEP only changes the close-on-exec flag of file descriptors created by the Python standard library, or by modules using the standard library. Third party modules not using the standard library should be modified to conform to this PEP. The new ``os.set_cloexec()`` function can be used, for example. Impacted functions: * ``os.forkpty()`` * ``http.server.CGIHTTPRequestHandler.run_cgi()`` Impacted modules: * ``multiprocessing`` * ``socketserver`` * ``subprocess`` * ``tempfile`` * ``xmlrpc.server`` * Maybe: ``signal``, ``threading`` XXX Should ``subprocess.Popen`` set the close-on-exec flag on file XXX XXX descriptors passed via the constructor's ``pass_fds`` argument? XXX .. note:: See `Close file descriptors after fork`_ for a possible solution for ``fork()`` without ``exec()``. Proposal ======== This PEP proposes to add a new optional argument ``cloexec`` to functions creating file descriptors in the Python standard library. If the argument is ``True``, the close-on-exec flag will be set on the new file descriptor. 
Add a new function: * ``os.set_cloexec(fd: int, cloexec: bool)``: set or unset the close-on-exec flag of a file descriptor Add a new optional ``cloexec`` argument to: * ``open()``: ``os.fdopen()`` is indirectly modified * ``os.dup()``, ``os.dup2()`` * ``os.pipe()`` * ``socket.socket()``, ``socket.socketpair()``, ``socket.socket.accept()`` * Maybe also: ``os.open()``, ``os.openpty()`` * TODO: * ``select.devpoll()`` * ``select.poll()`` * ``select.epoll()`` * ``select.kqueue()`` * ``socket.socket.recvmsg()``: use ``MSG_CMSG_CLOEXEC``, or ``os.set_cloexec()`` The default value of the ``cloexec`` argument is ``False`` to keep backward compatibility. The close-on-exec flag will not be set on file descriptors 0 (stdin), 1 (stdout) and 2 (stderr), because these files are expected to be inherited. It would still be possible to set the close-on-exec flag explicitly using ``os.set_cloexec()``. Drawbacks: * Many functions of the Python standard library creating file descriptors cannot be changed by this proposal, because adding a ``cloexec`` optional argument would be surprising and too many functions would need it. For example, ``os.urandom()`` uses a temporary file on UNIX, but it calls a function of the Windows API on Windows. Adding a ``cloexec`` argument to ``os.urandom()`` would not make sense. See `Always set close-on-exec flag`_ for an incomplete list of functions creating file descriptors. * Checking if a module creates file descriptors is difficult. For example, ``os.urandom()`` creates a file descriptor on UNIX to read ``/dev/urandom`` (and closes it at exit), whereas it is implemented using a function call on Windows. It is not possible to control the close-on-exec flag of the file descriptor used by ``os.urandom()``, because the ``os.urandom()`` API does not allow it. Alternatives ============ Always set close-on-exec flag ----------------------------- Always set the close-on-exec flag on new file descriptors created by Python. 
This alternative just changes the default value of the new ``cloexec``
argument. If a file must be inherited by child processes, the
``cloexec=False`` argument can be used.

The ``subprocess.Popen`` constructor has a ``pass_fds`` argument to
specify which file descriptors must be inherited. The close-on-exec
flag of these file descriptors must be changed with
``os.set_cloexec()``.

Example of functions creating file descriptors which will be modified
to set the close-on-exec flag:

* ``os.urandom()`` (on UNIX)
* ``curses.window.getwin()``, ``curses.window.putwin()``
* ``mmap.mmap()`` (if ``MAP_ANONYMOUS`` is not defined)
* ``oss.open()``
* ``Modules/main.c``: ``RunStartupFile()``
* ``Python/pythonrun.c``: ``PyRun_SimpleFileExFlags()``
* ``Modules/getpath.c``: ``search_for_exec_prefix()``
* ``Modules/zipimport.c``: ``read_directory()``
* ``Modules/_ssl.c``: ``load_dh_params()``
* ``PC/getpathp.c``: ``calculate_path()``
* ``Python/errors.c``: ``PyErr_ProgramText()``
* ``Python/import.c``: ``imp_load_dynamic()``
* TODO: ``PC/_msi.c``

Many functions are impacted indirectly by this alternative. Examples:

* ``logging.FileHandler``

Advantages of setting the close-on-exec flag by default:

* There are far more programs that are bitten by FD inheritance upon
  exec (see `Inherited file descriptors issues`_ and `Security`_) than
  programs relying on it (see `Applications using inherance of file descriptors`_).

Drawbacks of setting the close-on-exec flag by default:

* The os module is written as a thin wrapper to system calls (to
  functions of the C standard library). If atomic flags to set the
  close-on-exec flag are not supported (see `Appendix: Operating system support`_),
  a single Python function call may require 2 or 3 system calls (see
  the `Performances`_ section).
* Extra system calls, if any, may slow down Python: see
  `Performances`_.
* It violates the principle of least surprise.
Developers using the os module may expect that Python respects the
POSIX standard, and so that the close-on-exec flag is not set by
default.

Backward compatibility: only a few programs rely on inheritance of
file descriptors, and they only pass a few file descriptors, usually
just one. These programs will fail immediately with an ``EBADF``
error, and it will be simple to fix them: add a ``cloexec=False``
argument or use ``os.set_cloexec(fd, False)``.

The ``subprocess`` module will be changed anyway to unset the
close-on-exec flag on file descriptors listed in the ``pass_fds``
argument of the Popen constructor. So it is possible that these
programs will not need any fix if they use the ``subprocess`` module.

Add a function to set close-on-exec flag by default
---------------------------------------------------

An alternative is to also add a function to globally change the
default behaviour. It would be possible to set the close-on-exec flag
for the whole application, including all modules and the Python
standard library. This alternative is based on the `Proposal`_ and
adds extra changes.

Add new functions:

* ``sys.getdefaultcloexec() -> bool``: get the default value of the
  close-on-exec flag for new file descriptors
* ``sys.setdefaultcloexec(cloexec: bool)``: enable or disable the
  close-on-exec flag; the state of the flag can be overridden in each
  function creating a file descriptor

The major change is that the default value of the ``cloexec`` argument
is ``sys.getdefaultcloexec()``, instead of ``False``.

When ``sys.setdefaultcloexec(True)`` is called to set close-on-exec by
default, we have the same drawbacks as the `Always set close-on-exec flag`_
alternative. There are additional drawbacks of having two behaviours
depending on the ``sys.getdefaultcloexec()`` value:

* It is no longer possible to know if the close-on-exec flag will be
  set or not just by reading the source code.
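The readability drawback just mentioned can be made concrete with a small sketch. The names ``setdefaultcloexec()``/``getdefaultcloexec()`` below are hypothetical module-level stand-ins for the proposed ``sys`` functions (they do not exist in Python); the ``pipe()`` wrapper is POSIX-only and applies the global default with ``fcntl()``:

```python
import fcntl
import os

# Hypothetical stand-ins for the proposed sys.setdefaultcloexec() /
# sys.getdefaultcloexec() pair -- these names do not exist in Python.
_default_cloexec = False

def setdefaultcloexec(cloexec):
    global _default_cloexec
    _default_cloexec = cloexec

def getdefaultcloexec():
    return _default_cloexec

def _apply_cloexec(fd, cloexec):
    # Set or clear FD_CLOEXEC with the two-syscall fcntl() method.
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    if cloexec:
        flags |= fcntl.FD_CLOEXEC
    else:
        flags &= ~fcntl.FD_CLOEXEC
    fcntl.fcntl(fd, fcntl.F_SETFD, flags)

def pipe(cloexec=None):
    """os.pipe() variant whose default comes from the global switch."""
    if cloexec is None:
        cloexec = getdefaultcloexec()
    r, w = os.pipe()
    _apply_cloexec(r, cloexec)
    _apply_cloexec(w, cloexec)
    return r, w

def has_cloexec(fd):
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)

# The same pipe() call now behaves differently depending on a distant,
# hidden global -- the drawback described above.
setdefaultcloexec(True)
r1, w1 = pipe()
setdefaultcloexec(False)
r2, w2 = pipe()
flag_default_true = has_cloexec(r1)
flag_default_false = has_cloexec(r2)
for fd in (r1, w1, r2, w2):
    os.close(fd)
```

Reading only the two ``pipe()`` call sites, nothing indicates that the resulting descriptors have different close-on-exec flags.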
Close file descriptors after fork
---------------------------------

This PEP does not fix issues with applications using ``fork()``
without ``exec()``. Python needs a generic mechanism to register
callbacks which would be called after a fork, see `Add an 'afterfork' module`_.
Such a registry could be used to close file descriptors just after a
``fork()``.

Drawbacks:

* This alternative does not solve the problem for programs using
  ``exec()`` without ``fork()``.
* A third party module may directly call the C function ``fork()``,
  which will not call the "atfork" callbacks.
* All functions creating file descriptors must be changed to register
  a callback and then unregister their callback when the file is
  closed. Or a list of *all* open file descriptors must be maintained.
* The operating system is a better place than Python to automatically
  close file descriptors. For example, it is not easy to avoid a race
  condition between closing the file and unregistering the callback
  closing the file.

open(): add "e" flag to mode
----------------------------

A new "e" mode would set the close-on-exec flag (best-effort).

This alternative only solves the problem for ``open()``.
``socket.socket()`` and ``os.pipe()`` do not have a ``mode`` argument,
for example.

Since version 2.7, the GNU libc supports the ``"e"`` flag for
``fopen()``. It uses ``O_CLOEXEC`` if available, or falls back to
``fcntl(fd, F_SETFD, FD_CLOEXEC)``. With Visual Studio, ``fopen()``
accepts an "N" flag which uses ``O_NOINHERIT``.

Applications using inherance of file descriptors
================================================

Most developers don't know that file descriptors are inherited by
default. Most programs do not rely on inheritance of file descriptors.
For example, ``subprocess.Popen`` was changed in Python 3.2 to close
all file descriptors greater than 2 in the child process by default.
No user complained about this behavior change.

Network servers using fork may want to pass the client socket to the
child process.
For example, on UNIX a CGI server passes the client socket through
file descriptors 0 (stdin) and 1 (stdout) using ``dup2()``. This
specific case is not impacted by this PEP because the close-on-exec
flag is never set on file descriptors smaller than 3.

To access a restricted resource, like creating a socket listening on a
TCP port lower than 1024 or reading a file containing sensitive data
like passwords, a common practice is: start as the root user, create a
file descriptor, create a child process, pass the file descriptor to
the child process and exit. Security is very important in such a use
case: leaking another file descriptor would be a critical security
vulnerability (see `Security`_). The root process may not exit but
monitor the child process instead, and restart a new child process and
pass the same file descriptor if the previous child process crashed.

Example of programs taking file descriptors from the parent process
using a command line option:

* gpg: ``--status-fd ``, ``--logger-fd ``, etc.
* openssl: ``-pass fd:``
* qemu: ``-add-fd ``
* valgrind: ``--log-fd=``, ``--input-fd=``, etc.
* xterm: ``-S ``

On Linux, it is possible to use a ``"/dev/fd/"`` filename to pass a
file descriptor to a program expecting a filename.

Performances
============

Setting the close-on-exec flag may require additional system calls for
each creation of new file descriptors. The number of additional system
calls depends on the method used to set the flag:

* ``O_NOINHERIT``: no additional system call
* ``O_CLOEXEC``: one additional system call, but only at the creation
  of the first file descriptor, to check if the flag is supported. If
  not, Python has to fall back to the next method.
* ``ioctl(fd, FIOCLEX)``: one additional system call per file
  descriptor
* ``fcntl(fd, F_SETFD, flags)``: two additional system calls per file
  descriptor, one to get the old flags and one to set the new flags

XXX Benchmark the overhead for these 4 methods.
XXX

Implementation
==============

os.set_cloexec(fd, cloexec)
---------------------------

Best-effort by definition. Pseudo-code::

    if os.name == 'nt':
        def set_cloexec(fd, cloexec=True):
            # HANDLE_FLAG_INHERIT set means "inheritable", so it must
            # be cleared when cloexec is requested.
            SetHandleInformation(fd, HANDLE_FLAG_INHERIT,
                                 int(not cloexec))
    else:
        ioctl = None
        fcntl = None
        try:
            import ioctl
        except ImportError:
            pass
        try:
            import fcntl
        except ImportError:
            pass
        if ioctl is not None and hasattr(ioctl, 'FIOCLEX'):
            def set_cloexec(fd, cloexec=True):
                if cloexec:
                    ioctl.ioctl(fd, ioctl.FIOCLEX)
                else:
                    ioctl.ioctl(fd, ioctl.FIONCLEX)
        elif fcntl is not None:
            def set_cloexec(fd, cloexec=True):
                flags = fcntl.fcntl(fd, fcntl.F_GETFD)
                if cloexec:
                    flags |= fcntl.FD_CLOEXEC
                else:
                    flags &= ~fcntl.FD_CLOEXEC
                fcntl.fcntl(fd, fcntl.F_SETFD, flags)
        else:
            def set_cloexec(fd, cloexec=True):
                raise NotImplementedError(
                    "close-on-exec flag is not supported "
                    "on your platform")

ioctl is preferred over fcntl because it requires only one syscall,
instead of two syscalls for fcntl.

.. note::
   ``fcntl(fd, F_SETFD, flags)`` only supports one flag
   (``FD_CLOEXEC``), so it would be possible to avoid
   ``fcntl(fd, F_GETFD)``. But skipping it may drop other flags added
   in the future, and so it is safer to keep the two function calls.

.. note::
   The ``fopen()`` function of the GNU libc ignores the error if
   ``fcntl(fd, F_SETFD, flags)`` failed.
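A runnable POSIX-only distillation of the pseudo-code above, taking just the ``fcntl`` branch (the ``ioctl`` module used in the pseudo-code is hypothetical and skipped here):

```python
import fcntl
import os

def set_cloexec(fd, cloexec=True):
    """Set or clear the close-on-exec flag on fd (best-effort)."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    if cloexec:
        flags |= fcntl.FD_CLOEXEC
    else:
        flags &= ~fcntl.FD_CLOEXEC
    fcntl.fcntl(fd, fcntl.F_SETFD, flags)

def get_cloexec(fd):
    """Return True if the close-on-exec flag is set on fd."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)

# Demonstrate on a fresh pipe: set the flag, then clear it again.
r, w = os.pipe()
set_cloexec(r, True)
flag_after_set = get_cloexec(r)
set_cloexec(r, False)
flag_after_clear = get_cloexec(r)
os.close(r)
os.close(w)
```

Note the window between ``F_GETFD`` and ``F_SETFD``: a thread calling ``fork()`` + ``exec()`` in between still inherits the descriptor, which is exactly why the atomic flags below are preferable when available.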
open()
------

* Windows: ``open()`` with the ``O_NOINHERIT`` flag [atomic]
* ``open()`` with the ``O_CLOEXEC`` flag [atomic]
* ``open()`` + ``os.set_cloexec(fd, True)`` [best-effort]

os.dup()
--------

* ``fcntl(fd, F_DUPFD_CLOEXEC)`` [atomic]
* ``dup()`` + ``os.set_cloexec(fd, True)`` [best-effort]

os.dup2()
---------

* ``dup3()`` with the ``O_CLOEXEC`` flag [atomic]
* ``dup2()`` + ``os.set_cloexec(fd, True)`` [best-effort]

os.pipe()
---------

* Windows: ``_pipe()`` with the ``O_NOINHERIT`` flag [atomic]
* ``pipe2()`` with the ``O_CLOEXEC`` flag [atomic]
* ``pipe()`` + ``os.set_cloexec(fd, True)`` [best-effort]

socket.socket()
---------------

* ``socket()`` with the ``SOCK_CLOEXEC`` flag [atomic]
* ``socket()`` + ``os.set_cloexec(fd, True)`` [best-effort]

socket.socketpair()
-------------------

* ``socketpair()`` with the ``SOCK_CLOEXEC`` flag [atomic]
* ``socketpair()`` + ``os.set_cloexec(fd, True)`` [best-effort]

socket.socket.accept()
----------------------

* ``accept4()`` with the ``SOCK_CLOEXEC`` flag [atomic]
* ``accept()`` + ``os.set_cloexec(fd, True)`` [best-effort]

Backward compatibility
======================

There is no backward incompatible change. The default behaviour is
unchanged: the close-on-exec flag is not set by default.

Appendix: Operating system support
==================================

Windows
-------

Windows has an ``O_NOINHERIT`` flag: "Do not inherit in child
processes". For example, it is supported by ``open()`` and
``_pipe()``.

The value of the flag can be modified using
``SetHandleInformation(fd, HANDLE_FLAG_INHERIT, 1)``.

``CreateProcess()`` has a ``bInheritHandles`` argument: if it is
FALSE, the handles are not inherited. It is used by
``subprocess.Popen`` with the ``close_fds`` option.

fcntl
-----

Functions:

* ``fcntl(fd, F_GETFD)``
* ``fcntl(fd, F_SETFD, flags | FD_CLOEXEC)``

Availability: AIX, Digital UNIX, FreeBSD, HP-UX, IRIX, Linux,
Mac OS X, OpenBSD, Solaris, SunOS, Unicos.
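The atomic-or-fallback pattern listed above for ``socket.socket()`` can be sketched with today's modules. ``SOCK_CLOEXEC`` only exists on some platforms (Linux 2.6.27+), hence the ``getattr`` guard; this is an illustrative sketch, not the implementation the PEP proposes:

```python
import fcntl
import socket

def socket_cloexec(family=socket.AF_INET, type=socket.SOCK_STREAM):
    """Create a socket with the close-on-exec flag set."""
    cloexec = getattr(socket, "SOCK_CLOEXEC", 0)  # Linux 2.6.27+
    if cloexec:
        # [atomic]: the flag is set by the socket() system call itself.
        return socket.socket(family, type | cloexec)
    # [best-effort]: a fork()+exec() in another thread between the two
    # calls below would still inherit the descriptor.
    sock = socket.socket(family, type)
    flags = fcntl.fcntl(sock.fileno(), fcntl.F_GETFD)
    fcntl.fcntl(sock.fileno(), fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
    return sock

sock = socket_cloexec()
flag_set = bool(fcntl.fcntl(sock.fileno(), fcntl.F_GETFD)
                & fcntl.FD_CLOEXEC)
sock.close()
```

The same two-branch shape applies to ``socketpair()`` and ``accept()``/``accept4()``.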
ioctl
-----

Functions:

* ``ioctl(fd, FIOCLEX, 0)`` sets the close-on-exec flag
* ``ioctl(fd, FIONCLEX, 0)`` unsets the close-on-exec flag

Availability: Linux, Mac OS X, QNX, NetBSD, OpenBSD, FreeBSD.

Atomic flags
------------

New flags:

* ``O_CLOEXEC``: available on Linux (2.6.23+), FreeBSD (8.3+),
  OpenBSD 5.0, QNX, BeOS, next NetBSD release (6.1?). This flag is
  part of POSIX.1-2008.
* ``socket()``: ``SOCK_CLOEXEC`` flag, available on Linux 2.6.27+,
  OpenBSD 5.2, NetBSD 6.0.
* ``fcntl()``: ``F_DUPFD_CLOEXEC`` flag, available on Linux 2.6.24+,
  OpenBSD 5.0, FreeBSD 9.1, NetBSD 6.0. This flag is part of
  POSIX.1-2008.
* ``recvmsg()``: ``MSG_CMSG_CLOEXEC``, available on Linux 2.6.23+,
  NetBSD 6.0.

On Linux older than 2.6.23, the ``O_CLOEXEC`` flag is simply ignored.
So we have to check that the flag is supported by calling ``fcntl()``.
If it is not, we have to set the flag using ``fcntl()``.

XXX What is the behaviour on Linux older than 2.6.27 XXX
XXX with SOCK_CLOEXEC?                               XXX

New functions:

* ``dup3()``: available on Linux 2.6.27+ (and glibc 2.9)
* ``pipe2()``: available on Linux 2.6.27+ (and glibc 2.9)
* ``accept4()``: available on Linux 2.6.28+ (and glibc 2.10)

If ``accept4()`` is called on Linux older than 2.6.28, ``accept4()``
returns ``-1`` (fail) and errno is set to ``ENOSYS``.
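The runtime probe described above (needed because old kernels accept ``O_CLOEXEC`` but silently ignore it) can be sketched as follows. POSIX-only; on recent Python versions the interpreter already applies the flag itself, so the probe is illustrative of the technique rather than a necessity:

```python
import fcntl
import os

def o_cloexec_supported():
    """Probe whether open() actually honours O_CLOEXEC.

    Linux older than 2.6.23 accepts the flag but silently ignores it,
    so the only reliable test is to open a file with the flag and read
    the descriptor flags back with fcntl().
    """
    if not hasattr(os, "O_CLOEXEC"):
        return False
    fd = os.open("/dev/null", os.O_RDONLY | os.O_CLOEXEC)
    try:
        return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)
    finally:
        os.close(fd)

supported = o_cloexec_supported()
```

A real implementation would run this probe once, cache the result, and fall back to the ``fcntl()`` method when it returns ``False``.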
Links
=====

Links:

* `Secure File Descriptor Handling `_ (Ulrich Drepper, 2008)
* `win32_support.py of the Tornado project `_: emulate
  fcntl(fd, F_SETFD, FD_CLOEXEC) using
  ``SetHandleInformation(fd, HANDLE_FLAG_INHERIT, 1)``

Python issues:

* `open() does not able to set flags, such as O_CLOEXEC `_
* `Add "e" mode to open(): close-and-exec (O_CLOEXEC) / O_NOINHERIT `_
* `TCP listening sockets created without FD_CLOEXEC flag `_
* `Use O_CLOEXEC in the tempfile module `_
* `Support accept4() for atomic setting of flags at socket creation `_
* `Add an 'afterfork' module `_

Ruby:

* `Set FD_CLOEXEC for all fds (except 0, 1, 2) `_
* `O_CLOEXEC flag missing for Kernel::open `_: `commit reverted `_
  later

Footnotes
=========

.. [#subprocess_close] On UNIX since Python 3.2, subprocess.Popen()
   closes all file descriptors by default: ``close_fds=True``. It
   closes file descriptors in range 3 inclusive to ``local_max_fd``
   exclusive, where ``local_max_fd`` is ``fcntl(0, F_MAXFD)`` on
   NetBSD, or ``sysconf(_SC_OPEN_MAX)`` otherwise. If the error pipe
   has a descriptor smaller than 3, ``ValueError`` is raised.

From victor.stinner at gmail.com  Sun Jan 13 01:41:07 2013
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sun, 13 Jan 2013 01:41:07 +0100
Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating
 file descriptors
In-Reply-To: 
References: 
Message-ID: 

> PEP: 433
> Title: Add cloexec argument to functions creating file descriptors
> Status: Draft

The PEP is still a draft. I'm sending it to python-dev to get a first
review.
The main question is the choice between the 3 different options:

* don't set close-on-exec flag by default
* always set close-on-exec flag
* add sys.setdefaultcloexec() to leave the choice to the application

Victor

From storchaka at gmail.com  Sun Jan 13 09:27:03 2013
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sun, 13 Jan 2013 10:27:03 +0200
Subject: [Python-Dev] cpython (3.3): clean trailing whitespace
In-Reply-To: <3Yk4yg41Z1zS5v@mail.python.org>
References: <3Yk4yg41Z1zS5v@mail.python.org>
Message-ID: 

On 12.01.13 17:45, eli.bendersky wrote:
> http://hg.python.org/cpython/rev/bbfc8f62cb67
> changeset: 81452:bbfc8f62cb67
> branch: 3.3
> parent: 81450:f9d1d120c19e
> user: Eli Bendersky
> date: Sat Jan 12 07:44:32 2013 -0800
> summary:
> clean trailing whitespace

Can you please clean trailing whitespace in Modules/_elementtree.c too?
It's very annoying to manually edit every patch for
Modules/_elementtree.c to remove unrelated changes (my editors remove
trailing spaces automatically).

From cf.natali at gmail.com  Sun Jan 13 11:40:34 2013
From: cf.natali at gmail.com (Charles-François Natali)
Date: Sun, 13 Jan 2013 11:40:34 +0100
Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating
 file descriptors
In-Reply-To: 
References: 
Message-ID: 

Hello,

> PEP: 433
> Title: Add cloexec argument to functions creating file descriptors

I'm not a native English speaker, but it seems to me that the correct
wording should be "parameter" (part of the function
definition/prototype, whereas "argument" refers to the actual value
supplied).

> This PEP proposes to add a new optional argument ``cloexec`` on
> functions creating file descriptors in the Python standard library. If
> the argument is ``True``, the close-on-exec flag will be set on the
> new file descriptor.

It would probably be useful to recap briefly what the close-on-exec
flag does.

Also, ISTM that Windows also supports this flag.
If it does, then "cloexec" might not be the best name, because it
refers to the execve() Unix system call. Maybe something like
"noinherit" would be clearer (although coming from a Unix background
"cloexec" is crystal-clear to me :-).

> On UNIX, subprocess closes file descriptors greater than 2 by default
> since Python 3.2 [#subprocess_close]_. All file descriptors created by
> the parent process are automatically closed.

("in the child process")

> ``xmlrpc.server.SimpleXMLRPCServer`` sets the close-on-exec flag of
> the listening socket, the parent class ``socketserver.BaseServer``
> does not set this flag.

As has been discussed earlier, the real issue is that the server
socket is not closed in the child process. Setting it cloexec would
only add an extra layer of security for multi-threaded programs.

> Inherited file descriptors issues
> ---------------------------------
>
> Closing the file descriptor in the parent process does not close the
> related resource (file, socket, ...) because it is still open in the
> child process.

You might want to go through the bug tracker to find examples of such
issues, and list them:
http://bugs.python.org/issue7213
http://bugs.python.org/issue12786
http://bugs.python.org/issue2320
http://bugs.python.org/issue3006

The list goes on. Some of those examples resulted in deadlocks.

> The listening socket of TCPServer is not closed on ``exec()``: the
> child process is able to get connection from new clients; if the
> parent closes the listening socket and create a new listening socket
> on the same address, it would get an "address already is used" error.

See above for the real cause.

> Not closing file descriptors can lead to resource exhaustion: even if
> the parent closes all files, creating a new file descriptor may fail
> with "too many files" because files are still open in the child
> process.

You might want to detail the course of events (a child is forked
before the parent gets a chance to close the file descriptors...
EMFILE).
> Leaking file descriptors is a major security vulnerability. An
> untrusted child process can read sensitive data like passwords and
> take control of the parent process though leaked file descriptors. It
> is for example a known vulnerability to escape from a chroot.

You might add a link to this:
https://www.securecoding.cert.org/confluence/display/seccode/FIO42-C.+Ensure+files+are+properly+closed+when+they+are+no+longer+needed

It can also result in DoS (if the child process highjacks the server
socket and accepts connections).

Examples of vulnerabilities:
http://www.openssh.com/txt/portable-keysign-rand-helper.adv
http://www.securityfocus.com/archive/1/348368
http://cwe.mitre.org/data/definitions/403.html

> The problem is that these flags and functions are not portable: only
> recent versions of operating systems support them. ``O_CLOEXEC`` and
> ``SOCK_CLOEXEC`` flags are ignored by old Linux versions and so
> ``FD_CLOEXEC`` flag must be checked using ``fcntl(fd, F_GETFD)``. If
> the kernel ignores ``O_CLOEXEC`` or ``SOCK_CLOEXEC`` flag, a call to
> ``fcntl(fd, F_SETFD, flags)`` is required to set close-on-exec flag.
>
> .. note::
>    OpenBSD older 5.2 does not close the file descriptor with
>    close-on-exec flag set if ``fork()`` is used before ``exec()``, but
>    it works correctly if ``exec()`` is called without ``fork()``.

That would be *really* surprising, are you sure your test case is
correct? Otherwise it could be a compilation issue, because I simply
can't believe OpenBSD would ignore the close-on-exec flag.

> This PEP only change the close-on-exec flag of file descriptors
> created by the Python standard library, or by modules using the
> standard library. Third party modules not using the standard library
> should be modified to conform to this PEP. The new
> ``os.set_cloexec()`` function can be used for example.
>
> Impacted functions:
>
> * ``os.forkpty()``
> * ``http.server.CGIHTTPRequestHandler.run_cgi()``

I've opened http://bugs.python.org/issue16945 to rewrite this to use
subprocess.

> Impacted modules:
>
> * ``multiprocessing``
> * ``socketserver``
> * ``subprocess``
> * ``tempfile``

Hum, I thought temporary files are already created with the
close-on-exec flag.

> * ``xmlrpc.server``
> * Maybe: ``signal``, ``threading``
>
> XXX Should ``subprocess.Popen`` set the close-on-exec flag on file XXX
> XXX descriptors of the constructor the ``pass_fds`` argument? XXX

What?
Setting them cloexec would prevent them from being inherited in the
child process!

> Add a new optional ``cloexec`` argument to:
>
> * Maybe also: ``os.open()``, ``os.openpty()``

open() can be passed O_CLOEXEC directly.

> * Many functions of the Python standard library creating file
>   descriptors are cannot be changed by this proposal, because adding
>   a ``cloexec`` optional argument would be surprising and too many
>   functions would need it. For example, ``os.urandom()`` uses a
>   temporary file on UNIX, but it calls a function of Windows API on
>   Windows. Adding a ``cloexec`` argument to ``os.urandom()`` would
>   not make sense. See `Always set close-on-exec flag`_ for an
>   incomplete list of functions creating file descriptors.
> * Checking if a module creates file descriptors is difficult. For
>   example, ``os.urandom()`` creates a file descriptor on UNIX to read
>   ``/dev/urandom`` (and closes it at exit), whereas it is implemented
>   using a function call on Windows. It is not possible to control
>   close-on-exec flag of the file descriptor used by ``os.urandom()``,
>   because ``os.urandom()`` API does not allow it.

I think that the rule of thumb should be simple: if a library opens a
file descriptor which is not exposed to the user, either because it's
opened and closed before returning (e.g. os.urandom()) or the file
descriptor is kept private (e.g. poll()), it should be set
close-on-exec.
Because the FD is not handed over to the user, there's no risk of
breaking applications: that's what the glibc does (e.g. for
getpwnam(), so you're sure you don't leak an open FD to /etc/passwd).

> Always set close-on-exec flag on new file descriptors created by
> Python. This alternative just changes the default value of the new
> ``cloexec`` argument.

In a perfect world, all FDs should have been cloexec by default. But
it's too late now, I don't think we can change the default, it's going
to break some applications, and would be a huge deviation from POSIX
(however broken this design decision is).

> Add a function to set close-on-exec flag by default
> ---------------------------------------------------
>
> An alternative is to add also a function to change globally the
> default behaviour. It would be possible to set close-on-exec flag for
> the whole application including all modules and the Python standard
> library. This alternative is based on the `Proposal`_ and adds extra
> changes.

> * It is not more possible to know if the close-on-exec flag will be
>   set or not just by reading the source code.

That's really a show stopper:

"""
s = socket.socket()
if os.fork() == 0:
    # child
    os.execv('myprog', ['myprog', 'arg1'])
else:
    # parent
    [...]
"""

It would be impossible to know if the socket is inherited, because the
behavior of the whole program is affected by an - hidden - global
variable. That's just too wrong.

Also, it has the same drawbacks as global variables: not thread-safe,
not library-safe (i.e. if two libraries set it to conflicting values,
you don't know which one is picked up).

> Close file descriptors after fork
> ---------------------------------
>
> This PEP does not fix issues with applications using ``fork()``
> without ``exec()``. Python needs a generic process to register
> callbacks which would be called after a fork, see `Add an 'afterfork'
> module`_. Such registry could be used to close file descriptors just
> after a ``fork()``.
An atfork() module would indeed be really useful, but I don't think it
should be used for closing file descriptors: file descriptors are a
scarce resource, so they should be closed as soon as possible,
explicitly. Also, you could end up actually leaking file descriptors
just by keeping a reference to them in the atfork module (that's why
it should probably use weakrefs, but that's another debate).

The atfork mechanism is useful for patterns where some resource is
acquired/opened in a library, and needs reinitialization in a child
process (e.g. locks to avoid deadlocks, or random seed), not to
encourage lazy programming.

> Applications using inherance of file descriptors
> ================================================
>
> Most developers don't know that file descriptors are inherited by
> default. Most programs do not rely on inherance of file descriptors.
> For example, ``subprocess.Popen`` was changed in Python 3.2 to close
> all file descriptors greater than 2 in the child process by default.
> No user complained about this behavior change.

"yet" ;-)

> Network servers using fork may want to pass the client socket to the
> child process. For example, on UNIX a CGI server pass the socket
> client through file descriptors 0 (stdin) and 1 (stdout) using
> ``dup2()``. This specific case is not impacted by this PEP because the
> close-on-exec flag is never set on file descriptors smaller than 3.

CGI servers, inetd servers, etc.

> Example of programs taking file descriptors from the parent process
> using a command line option:
>
> * gpg: ``--status-fd ``, ``--logger-fd ``, etc.
> * openssl: ``-pass fd:``
> * qemu: ``-add-fd ``
> * valgrind: ``--log-fd=``, ``--input-fd=``, etc.
> * xterm: ``-S ``

Hum, yes, but since the official way to start new processes is through
the subprocess module, I expect that all Python code wrapping those
programs is broken since the subprocess "close_fds" parameter was
changed to "True" in 3.2.
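A minimal sketch of the kind of 'afterfork' registry discussed in this thread (hypothetical API, POSIX-only; as noted above, a real implementation would also have to cope with C-level fork() calls that bypass the wrapper):

```python
import os

_child_callbacks = []

def register_after_fork(callback):
    """Register a callback to run in the child just after fork()."""
    _child_callbacks.append(callback)

def fork():
    """fork() wrapper that runs the registered callbacks in the child."""
    pid = os.fork()
    if pid == 0:
        for callback in _child_callbacks:
            callback()
    return pid

# Demo: the callback runs only in the child process.
r, w = os.pipe()
register_after_fork(lambda: os.write(w, b"child-cb"))
pid = fork()
if pid == 0:
    os._exit(0)
os.close(w)
ran_in_child = os.read(r, 100)
os.waitpid(pid, 0)
os.close(r)
```

Resource reinitialization (locks, random seeds) fits this pattern well; closing file descriptors from it runs into the reference-keeping and race-condition problems described above.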
From chris.jerdonek at gmail.com Sun Jan 13 11:49:07 2013 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Sun, 13 Jan 2013 02:49:07 -0800 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 2:40 AM, Charles-Fran?ois Natali wrote: > Hello, > >> PEP: 433 >> Title: Add cloexec argument to functions creating file descriptors > > I'm not a native English speaker, but it seems to me that the correct > wording should be "parameter" (part of the function > definition/prototype, whereas "argument" refers to the actual value > supplied). Yes, this distinction is now reflected in our glossary as of a month or two ago. Let's try to be consistent with that. :) --Chris From ncoghlan at gmail.com Sun Jan 13 12:13:42 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 13 Jan 2013 21:13:42 +1000 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 8:40 PM, Charles-Fran?ois Natali wrote: > Hello, > >> PEP: 433 >> Title: Add cloexec argument to functions creating file descriptors > > I'm not a native English speaker, but it seems to me that the correct > wording should be "parameter" (part of the function > definition/prototype, whereas "argument" refers to the actual value > supplied). This is correct (although it's a subtle distinction that even many native English speakers get wrong). For the PEP, I'd actually avoid assuming the solution in the title and instead say something like "Easier suppression of file descriptor inheritance" >> This PEP proposes to add a new optional argument ``cloexec`` on >> functions creating file descriptors in the Python standard library. If >> the argument is ``True``, the close-on-exec flag will be set on the >> new file descriptor. > > It would probably be useful to recap briefly what the close-on-exec flag does. 
> > Also, ISTM that Windows also supports this flag. If it does, then
> "cloexec" might not be the best name, because it refers to the
> execve() Unix system call. Maybe something like "noinherit" would be
> clearer (although coming from a Unix background "cloexec" is
> crystal-clear to me :-).

Indeed, this may be an area where following the underlying standards
too closely may not be a good idea. In particular, a *descriptive*
flag may be a better choice than an imperative one.

For example, if we make the flag "sensitive", then the programmer is
telling us "this file descriptor is sensitive" and then we get to
decide what that means in terms of the underlying OS behaviours like
"close-on-exec" and "no-inherit" (as well as deciding whether or not
file descriptors are considered sensitive by default).

It also means we're free to implement a mechanism that tries to close
all sensitive file descriptors in _PyOS_AfterFork.

Alternatively, you could flip the sense of the flag and use
"inherited" - the scope is then clearly limited to indicating whether
or not the file descriptor should be inherited by child processes.

Either way, if done with sufficient advance warning and clear advice
to users, the "It's a systematic fix to adopt secure-by-default
behaviour" excuse may actually let it get by, especially if the
appropriate stdlib APIs adopt "inherited-by-default" for the FDs where
it is needed (such as subprocess pass_fds).

>> Add a new optional ``cloexec`` argument to:
>>
>> * Maybe also: ``os.open()``, ``os.openpty()``
>
> Open can be passed O_CLOEXEC directly.

Indeed, people playing at the os module layer are already expected to
have to deal with platform specific behaviour. That's probably a
reasonable model to continue here - if people want cross-platform
handling of sensitive file descriptors, they either have to handle it
themselves or step up a layer of abstraction.

> Also, it has the same drawbacks as global variables: not thread-safe,
> not library-safe (i.e.
if two libraries set it to conflicting values, > you don't know which one is picked up). I think it makes sense to consider the two issues separately, while making sure the design in this PEP supports both "given inherited FD by default, mark individual ones that shouldn't be inherited" and "given non-inherited FD by default, mark individual ones for inheritance" Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sun Jan 13 12:15:12 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 13 Jan 2013 21:15:12 +1000 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 8:40 PM, Charles-Fran?ois Natali wrote: >> XXX Should ``subprocess.Popen`` set the close-on-exec flag on file XXX >> XXX descriptors of the constructor the ``pass_fds`` argument? XXX > > What? > Setting them cloexec would prevent them from being inherited in the > child process! Perhaps Victor meant "clear" rather than "set" in this comment? Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Sun Jan 13 12:15:18 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 13 Jan 2013 12:15:18 +0100 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors References: Message-ID: <20130113121518.7c1f134d@pitrou.net> On Sun, 13 Jan 2013 21:13:42 +1000 Nick Coghlan wrote: > > > > Also, ISTM that Windows also supports this flag. If it does, then > > "cloexec" might not be the best name, because it refers to the > > execve() Unix system call. Maybe something like "noinherit" would be > > clearer (although coming from a Unix background "cloexec" is > > crystal-clear to me :-). > > Indeed, this may be an area where following the underlying standards > too closely may not be a good idea. In particular, a *descriptive* > flag may be better choice than an imperative one. 
> > For example, if we make the flag "sensitive", then the programmer is > telling us "this file descriptor is sensitive" and then we get to > decide what that means in terms of the underlying OS behaviours like > "close-on-exec" and "no-inherit" (as well as deciding whether or not > file descriptors are considered sensitive by default). > > It also means we're free to implement a mechanism that tries to close > all sensitive file descriptors in _PyOS_AfterFork. Ouch! This actually shows that "noinherit" is a very bad name. The PEP is about closing fds after exec(), *not* after fork(). So "cloexec" is really the right, precise, non-ambiguous name here. Regards Antoine. From ncoghlan at gmail.com Sun Jan 13 12:33:30 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 13 Jan 2013 21:33:30 +1000 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors In-Reply-To: <20130113121518.7c1f134d@pitrou.net> References: <20130113121518.7c1f134d@pitrou.net> Message-ID: On Sun, Jan 13, 2013 at 9:15 PM, Antoine Pitrou wrote: >> It also means we're free to implement a mechanism that tries to close >> all sensitive file descriptors in _PyOS_AfterFork. > > Ouch! This actually shows that "noinherit" is a very bad name. The PEP > is about closing fds after exec(), *not* after fork(). So "cloexec" is > really the right, precise, non-ambiguous name here. No, 'cloexec' is a terrible name, because, aside from the cryptic opacity of it, it's also wrong on Windows, which doesn't have the fork() vs exec() distinction. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Sun Jan 13 12:43:19 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 13 Jan 2013 12:43:19 +0100 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors In-Reply-To: References: <20130113121518.7c1f134d@pitrou.net> Message-ID: <20130113124319.71746379@pitrou.net> On Sun, 13 Jan 2013 21:33:30 +1000 Nick Coghlan wrote: > On Sun, Jan 13, 2013 at 9:15 PM, Antoine Pitrou wrote: > >> It also means we're free to implement a mechanism that tries to close > >> all sensitive file descriptors in _PyOS_AfterFork. > > > > Ouch! This actually shows that "noinherit" is a very bad name. The PEP > > is about closing fds after exec(), *not* after fork(). So "cloexec" is > > really the right, precise, non-ambiguous name here. > > No, 'cloexec' is a terrible name, because, aside from the cryptic > opacity of it, it's also wrong on Windows, which doesn't have the > fork() vs exec() distinction. Just because Windows doesn't have the fork() vs exec() distinction doesn't mean "cloexec" is a bad description of the flag: it describes exactly what happens (the fd gets closed when executing an external program). As for the opacity, feel free to propose something better ("close_on_spawn", whatever). But I'm definitely and strongly -1 on "noinherit". Regards Antoine. From ncoghlan at gmail.com Sun Jan 13 13:44:06 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 13 Jan 2013 22:44:06 +1000 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors In-Reply-To: <20130113124319.71746379@pitrou.net> References: <20130113121518.7c1f134d@pitrou.net> <20130113124319.71746379@pitrou.net> Message-ID: On Sun, Jan 13, 2013 at 9:43 PM, Antoine Pitrou wrote: > As for the opacity, feel free to propose something better > ("close_on_spawn", whatever). But I'm definitely and strongly -1 > on "noinherit". 
That's the main reason I quite like "sensitive" as a term for this, since it decouples the user statement ("this file descriptor provides access to potentially sensitive information") from the steps the interpreter promises to take to protect that information (such as closing it before executing a different program or ensuring it isn't inherited by child processes). We can then define a glossary entry for "sensitive" that explains the consequences of flagging a descriptor as sensitive on the various operating systems (i.e. setting cloexec on POSIX and noinherit on Windows). As the platforms provide additional security mechanisms, we can provide them without needing to change the user facing APIs. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Sun Jan 13 14:22:33 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 13 Jan 2013 14:22:33 +0100 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors In-Reply-To: References: <20130113121518.7c1f134d@pitrou.net> <20130113124319.71746379@pitrou.net> Message-ID: <20130113142233.47aac171@pitrou.net> On Sun, 13 Jan 2013 22:44:06 +1000 Nick Coghlan wrote: > On Sun, Jan 13, 2013 at 9:43 PM, Antoine Pitrou wrote: > > As for the opacity, feel free to propose something better > > ("close_on_spawn", whatever). But I'm definitely and strongly -1 > > on "noinherit". > > That's the main reason I quite like "sensitive" as a term for this, > since it decouples the user statement ("this file descriptor provides > access to potentially sensitive information") from the steps the > interpreter promises to take to protect that information (such as > closing it before executing a different program or ensuring it isn't > inherited by child processes). This assumes that some file descriptors are not "sensitive", which sounds a bit weird to me (since a fd will by definition give access to a system resource). 
What should happen is that *no* file descriptors are inherited on exec(), except for those few ones which are necessary for proper operation of the exec()ed process. (it's not even just a security issue: letting a bound socket open and therefore being unable to re-use the same port is a bug even when security is not a concern) Regards Antoine. From ncoghlan at gmail.com Sun Jan 13 14:49:32 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 13 Jan 2013 23:49:32 +1000 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors In-Reply-To: <20130113142233.47aac171@pitrou.net> References: <20130113121518.7c1f134d@pitrou.net> <20130113124319.71746379@pitrou.net> <20130113142233.47aac171@pitrou.net> Message-ID: On Sun, Jan 13, 2013 at 11:22 PM, Antoine Pitrou wrote: > On Sun, 13 Jan 2013 22:44:06 +1000 > Nick Coghlan wrote: >> On Sun, Jan 13, 2013 at 9:43 PM, Antoine Pitrou wrote: >> > As for the opacity, feel free to propose something better >> > ("close_on_spawn", whatever). But I'm definitely and strongly -1 >> > on "noinherit". >> >> That's the main reason I quite like "sensitive" as a term for this, >> since it decouples the user statement ("this file descriptor provides >> access to potentially sensitive information") from the steps the >> interpreter promises to take to protect that information (such as >> closing it before executing a different program or ensuring it isn't >> inherited by child processes). > > This assumes that some file descriptors are not "sensitive", which > sounds a bit weird to me (since a fd will by definition give access > to a system resource). What should happen is that *no* file descriptors > are inherited on exec(), except for those few ones which are necessary > for proper operation of the exec()ed process. 
I agree it makes it obvious what the right default behaviour should be: flag every FD as sensitive by default, and pass an argument to say "sensitive=False" when you want to disable Python's automatic protections. However, there's a pragmatic question of whether we're willing to eat the backwards incompatibility of such a change in a feature release, or if it's something that has to wait until Python 4.0, or else be selected via a global flag (in line with the hash randomisation change). > (it's not even just a security issue: letting a bound socket open and > therefore being unable to re-use the same port is a bug even when > security is not a concern) Agreed, but it's the security implications that let us even contemplate the backwards compatibility break. We either let inexperienced users continue to write insecure software by default, or we close the loophole and tell experienced users "hey, to upgrade to Python 3.4, you will need to address this change in behaviour". The nice thing is that with enough advance warning, they should be able to update their code to forcibly clear the flag in a way that works even on earlier Python versions. A more conservative approach, based on the steps taken in introducing hash randomisation, would be to expose the setting as an environment variable in 3.4, and then switch the default behaviour in 3.5. So, 3.3: FDs non-sensitive by default, use O_CLOEXEC, noinherit, etc directly for sensitive ones 3.4: FDs non-sensitive by default, set "sensitive=True" to enable selectively, or set PYTHONSENSITIVEFDS to enable globally and "sensitive=False" to disable selectively 3.5: FDs sensitive by default, set "sensitive=False" to enable selectively, no PYTHONSENSITIVEFDS setting (To accelerate the adoption schedule, replace 3.5 with 3.4, and 3.4 with 3.3.1, but require that people set or clear the platform specific flags to work around the default behaviour) Cheers, Nick. 
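The "forcibly clear the flag in a way that works even on earlier Python versions" step above amounts to a small helper that states the program's intent explicitly for each descriptor. A hypothetical POSIX-only sketch (the helper name is illustrative, not an API from the PEP):

```python
import fcntl
import os

def set_cloexec(fd, cloexec=True):
    """Set or clear FD_CLOEXEC on fd.

    Works on any Python version that ships the fcntl module
    (i.e. POSIX platforms), regardless of which default a given
    3.x release applies when the descriptor is created.
    """
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    if cloexec:
        flags |= fcntl.FD_CLOEXEC
    else:
        flags &= ~fcntl.FD_CLOEXEC
    fcntl.fcntl(fd, fcntl.F_SETFD, flags)

# Demo: explicitly mark a pipe's read end close-on-exec and its
# write end inheritable, whatever the interpreter's default was.
r, w = os.pipe()
set_cloexec(r, True)
set_cloexec(w, False)
```

Code that calls such a helper on every descriptor it cares about keeps working unchanged whichever default a future release ends up shipping.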
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Sun Jan 13 14:53:50 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 13 Jan 2013 14:53:50 +0100 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors In-Reply-To: References: <20130113121518.7c1f134d@pitrou.net> <20130113124319.71746379@pitrou.net> <20130113142233.47aac171@pitrou.net> Message-ID: <20130113145350.581855f2@pitrou.net> On Sun, 13 Jan 2013 23:49:32 +1000 Nick Coghlan wrote: > > > (it's not even just a security issue: letting a bound socket open and > > therefore being unable to re-use the same port is a bug even when > > security is not a concern) > > Agreed, but it's the security implications that let us even > contemplate the backwards compatibility break. We either let > inexperienced users continue to write insecure software by default, or > we close the loophole and tell experienced users "hey, to upgrade to > Python 3.4, you will need to address this change in behaviour". > > The nice thing is that with enough advance warning, they should be > able to update their code to forcibly clear the flag in a way that > works even on earlier Python versions. > > A more conservative approach, based on the steps taken in introducing > hash randomisation, would be to expose the setting as an environment > variable in 3.4, and then switch the default behaviour in 3.5. The "more conservative approach" sounds good to me :-) Regards Antoine. From victor.stinner at gmail.com Sun Jan 13 15:01:55 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Sun, 13 Jan 2013 15:01:55 +0100 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors In-Reply-To: References: Message-ID: > Alternatives > ============ > > Always set close-on-exec flag > ----------------------------- Oops, this is completely wrong. 
This section is basically exactly the same as the PEP proposal, except that the default value of cloexec is set to True. The correct title is: "Set the close-on-exec flag by default" (I just fixed it). Victor From storchaka at gmail.com Sun Jan 13 16:21:22 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 13 Jan 2013 17:21:22 +0200 Subject: [Python-Dev] cpython: Cleanup the docs ElementTree a bit and describe the default_namespace In-Reply-To: <3YkgCP49RlzS4W@mail.python.org> References: <3YkgCP49RlzS4W@mail.python.org> Message-ID: On 13.01.13 16:28, eli.bendersky wrote: > http://hg.python.org/cpython/rev/f0e80c7404a5 > changeset: 81490:f0e80c7404a5 > user: Eli Bendersky > date: Sun Jan 13 06:27:51 2013 -0800 > summary: > Cleanup the docs ElementTree a bit and describe the default_namespace parameter. In the code, replace the old outdated Doxygen-ish comment above ElementTree.write by a proper docstring. I think that the wrong documentation of ElementTree.write is a bug and should be fixed in earlier Python versions too. From storchaka at gmail.com Sun Jan 13 16:31:37 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 13 Jan 2013 17:31:37 +0200 Subject: [Python-Dev] cpython: Cleanup the docs ElementTree a bit and describe the default_namespace In-Reply-To: References: <3YkgCP49RlzS4W@mail.python.org> Message-ID: And perhaps it would be better if the parameters were described in the same order as they are enumerated in the signature (now 'default_namespace' is described after 'method'). From dreamingforward at gmail.com Sun Jan 13 17:13:42 2013 From: dreamingforward at gmail.com (Mark Janssen) Date: Sun, 13 Jan 2013 10:13:42 -0600 Subject: [Python-Dev] [Visualpython-users] How VPython 6 differs from VPython 5 In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 1:15 AM, Bruce Sherwood wrote: > Here is detailed information on how VPython 6 differs from VPython 5, which > will be incorporated in the Help for upcoming releases of VPython 6. 
Note > that the fact that in a main program __name__ isn't '__main__' is an > unavoidable "feature", not a bug. That is, there doesn't seem to be a way to > make this work. Hi, Bruce. I think there are two ways of fixing this. One is to get down to the C language level, and share the C main() loop, which will percolate back up into the CPython loop. The other involves mending the split that I've been mentioning that happened over floating point between the VPython developers and the Python-dev group. Then perhaps Guido will accept VPython using/sharing the __main__ "loop". (I'm CC'ing him in this message.) So, either of these tracks should fix this problem. This is why I keep mentioning the importance of healing that split between the parties arguing in the two camps. Perhaps Tim Peters will be able to help bridge this gap. > Changes from VPython 5 > > The new version makes an essential change to the syntax of VPython programs. > Now, an animation loop MUST contain a rate or sleep statement, which limits > the number of loop iterations per second as before but also when appropriate > (about 30 times per second) updates the 3D scene and handles mouse and > keyboard events. Without a rate or sleep statement, the scene will not be > updated until and unless the loop is completed. I think this is perfectly acceptable and is just a necessary restriction whereby the OS *must* maintain ultimate control of I/O. Python sits in userspace surrounded by the larger computation environment, so it is only fair that an OS call is made so that it can keep things in "check". Naming it "sleep" is okay, but makes it sound more like a voluntary thing, and as you probably know, is the traditional name for relinquishing some control to the OS. (cf. Unix Systems Programming, Robbins, et al.) > You should use the new function sleep rather than time.sleep. 
The new > function periodically renders the scene and processes mouse events, making > it possible to continue using zoom and rotate, whereas time.sleep does not > do this. This is strange to me. I think this is where VPython must be ahead of the curve and not close enough to the Linux development communities. > For technical reasons, it is necessary for VPython 6 to do something rather > unusual. When you import visual (or vis), your own program is in turn > imported by the visual module. Again this is because of the faction that was created in 200?, regarding the VPython community vs. python-dev. Really, this should be mended. Anyway, I hope that any of this made sense. Thanks for your help! Mark From eliben at gmail.com Sun Jan 13 18:28:47 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 13 Jan 2013 09:28:47 -0800 Subject: [Python-Dev] cpython: Cleanup the docs ElementTree a bit and describe the default_namespace In-Reply-To: References: <3YkgCP49RlzS4W@mail.python.org> Message-ID: On Sun, Jan 13, 2013 at 7:31 AM, Serhiy Storchaka wrote: > And perhaps it would be better if the parameters were described in the same > order as they are enumerated in the signature (now 'default_namespace' is > described after 'method'). > > > Serhiy, both your comments make sense. But: 1. A better place to "park" them is an issue (the original issue for which the commits were made, or even a new issue). 2. Feel free to use your newly gained superpowers to help out with such fixes. Often a quick fix + commit doesn't take much longer than writing an email. Eli > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/eliben%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From storchaka at gmail.com Sun Jan 13 20:37:19 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 13 Jan 2013 21:37:19 +0200 Subject: [Python-Dev] cpython: Cleanup the docs ElementTree a bit and describe the default_namespace In-Reply-To: References: <3YkgCP49RlzS4W@mail.python.org> Message-ID: On 13.01.13 19:28, Eli Bendersky wrote: > 1. A better place to "park" them is an issue (the original issue for > which the commits were made, or even a new issue). I agree. Usually I do this, except for the cases when the issue number is not specified in the commit message. > 2. Feel free to use your newly gained superpowers to help out with such > fixes. Often a quick fix + commit doesn't take much longer than writing > an email. I didn't want to step on your toes. Now that you have approved it and left it for me, I'll do it. From greg.ewing at canterbury.ac.nz Sun Jan 13 23:06:48 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 14 Jan 2013 11:06:48 +1300 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors In-Reply-To: References: <20130113121518.7c1f134d@pitrou.net> <20130113124319.71746379@pitrou.net> <20130113142233.47aac171@pitrou.net> Message-ID: <50F32FF8.7000900@canterbury.ac.nz> Nick Coghlan wrote: > Agreed, but it's the security implications that let us even > contemplate the backwards compatibility break. I don't think that's a sufficient criterion for choosing a name. The backwards compatibility issue is a transitional thing, but we'll be stuck with the name forever. IMO the name should simply describe the mechanism, and be neutral as to what the reason might be for using the mechanism. 
-- Greg From dreamingforward at gmail.com Mon Jan 14 02:41:34 2013 From: dreamingforward at gmail.com (Mark Janssen) Date: Sun, 13 Jan 2013 19:41:34 -0600 Subject: [Python-Dev] [Visualpython-users] How VPython 6 differs from VPython 5 In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 12:14 PM, Bruce Sherwood wrote: > For the record, I do not know of any evidence whatsoever for a supposed > "split" between the tiny VPython community and the huge Python community > concerning floating point variables. Nor do I see anything in Python that > needs to be "fixed". Well, it was big enough that the Python community created a brand-new language construct called import __future__ -- something never considered before then and which actually changes the behavior of Python unlike any other module. And perhaps I've just felt it more because I was a big proponent of both 3-d graphics as a way to keep Python a draw for beginning programmers and also a big fan of scientific simulation. No one had anything near vpython back then. (But ultimately I need to stop mentioning this issue to this vpython list because it's really the Python group which needs to get back in sync.) > The new (currently experimental) version of VPython based on wxPython must, > in order to run in a Cocoa environment on the Mac, make the interact loop be > the primary thread, with the user's Python calculations at worst a secondary > thread, which is the opposite of the older VPython, where the > interval-driven rendering thread was secondary to the user's calculations. > By the unusual stratagem of having VPython import the user's program it has > been possible to make this work, and at the same time greatly simplify the > C++ component of VPython by eliminating threading, with additional important > simplification from eliminating essentially all platform-dependent code > thanks to the multiplatform character of wxPython. 
The result is that nearly all existing VPython programs will work without change, at the very small cost of a few marginal cases requiring minor tweaking. I should alter the documentation to make this important property of the new version more salient. I need to analyze this more carefully before commenting further.... mark From v+python at g.nevcal.com Mon Jan 14 03:19:47 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Sun, 13 Jan 2013 18:19:47 -0800 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors In-Reply-To: References: <20130113121518.7c1f134d@pitrou.net> <20130113124319.71746379@pitrou.net> <20130113142233.47aac171@pitrou.net> Message-ID: <50F36B43.4010205@g.nevcal.com> On 1/13/2013 5:49 AM, Nick Coghlan wrote: > I agree it makes it obvious what the right default behaviour should > be: flag every FD as sensitive by default, and pass an argument to say > "sensitive=False" when you want to disable Python's automatic > protections. "sensitive" is a bad name... because the data passed via an open descriptor that _should_ be passed may well be sensitive, so saying sensitive=False is false... it's just necessary to leave the descriptor open to make things work... From dreamingforward at gmail.com Mon Jan 14 04:06:19 2013 From: dreamingforward at gmail.com (Mark Janssen) Date: Sun, 13 Jan 2013 21:06:19 -0600 Subject: [Python-Dev] [Visualpython-users] How VPython 6 differs from VPython 5 In-Reply-To: References: Message-ID: > Since this was copied to the Python-Dev list, I want to go on record as > stating firmly that there is no evidence whatsoever to substantiate claims > that there has ever been some kind of conflict between VPython and Python. My apologies, Bruce, I didn't mean for that second message to go to the python-dev mailing list (and sorry python-dev list... :). 
> Since __future__ was also mentioned, I'll take the opportunity to say that > I've been very impressed by the way the Python community has handled the > difficult 2->3 transition. For example, it has been a big help to the > educational uses of VPython that we could tell students simply to start with > "from __future__ import division, print_function", put parens around print > arguments, and thereby make it irrelevant whether they used Python 2 or > Python 3. Many thanks to the Python development community! Yes, it is/was relatively seamless *syntactically*, but it hasn't been seamless *semantically*. from __future__ still does something very odd as far as the programming language definition goes -- it modifies the way the interpreter interprets a syntactic construct -- a sort of meta-linguistic construct. mark From solipsis at pitrou.net Mon Jan 14 10:08:14 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 14 Jan 2013 10:08:14 +0100 Subject: [Python-Dev] How VPython 6 differs from VPython 5 References: Message-ID: <20130114100814.2ae6d96d@pitrou.net> Le Sun, 13 Jan 2013 19:58:52 -0700, Bruce Sherwood a écrit : > Since this was copied to the Python-Dev list, I want to go on record > as stating firmly that there is no evidence whatsoever to > substantiate claims that there has ever been some kind of conflict > between VPython and Python. > > Since __future__ was also mentioned, I'll take the opportunity to say > that I've been very impressed by the way the Python community has > handled the difficult 2->3 transition. For example, it has been a big > help to the educational uses of VPython that we could tell students > simply to start with "from __future__ import division, > print_function", put parens around print arguments, and thereby make > it irrelevant whether they used Python 2 or Python 3. Many thanks to > the Python development community! 
You're welcome :) Best regards Antoine. > > Bruce Sherwood > From "ja...py" at farowl.co.uk Sun Jan 13 10:58:47 2013 From: "ja...py" at farowl.co.uk (Jeff Allen) Date: Sun, 13 Jan 2013 09:58:47 +0000 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors In-Reply-To: References: Message-ID: <50F28557.6010101@farowl.co.uk> On 13/01/2013 00:41, Victor Stinner wrote: >> PEP: 433 >> Title: Add cloexec argument to functions creating file descriptors >> Status: Draft > The PEP is still a draft. I'm sending it to python-dev to get a first review. > > The main question is the choice between the 3 different options: > > * don't set close-on-exec flag by default > * always set close-on-exec flag > * add sys.setdefaultcloexec() to leave the choice to the application > > Victor Nice clear explanation. I think io, meaning _io and _pyio really, would be amongst the impacted modules, and should perhaps be in the examples. (I am currently working on the Jython implementation of the _io module.) It seems to me that io.open, and probably all the constructors, such as _io.FileIO, would need the extra information as a mode or a boolean argument like closefd. This may be a factor in your choice above. Other things I noticed were minor, and I infer that they should wait until principles are settled. Jeff Allen From Bruce_Sherwood at ncsu.edu Sun Jan 13 19:14:14 2013 From: Bruce_Sherwood at ncsu.edu (Bruce Sherwood) Date: Sun, 13 Jan 2013 11:14:14 -0700 Subject: [Python-Dev] [Visualpython-users] How VPython 6 differs from VPython 5 In-Reply-To: References: Message-ID: For the record, I do not know of any evidence whatsoever for a supposed "split" between the tiny VPython community and the huge Python community concerning floating point variables. Nor do I see anything in Python that needs to be "fixed". 
The new (currently experimental) version of VPython based on wxPython must, in order to run in a Cocoa environment on the Mac, make the interact loop be the primary thread, with the user's Python calculations at worst a secondary thread, which is the opposite of the older VPython, where the interval-driven rendering thread was secondary to the user's calculations. By the unusual stratagem of having VPython import the user's program it has been possible to make this work, and at the same time greatly simplify the C++ component of VPython by eliminating threading, with additional important simplification from eliminating essentially all platform-dependent code thanks to the multiplatform character of wxPython. The result is that nearly all existing VPython programs will work without change, at the very small cost of a few marginal cases requiring minor tweaking. I should alter the documentation to make this important property of the new version more salient. Minor point: The meaning of the new VPython "sleep" is precisely to relinquish control to the OS, including the GUI. During sleep one wants the wxPython window controls to remain active, which will not be the case with time.sleep, which knows nothing about the wxPython interact loop. So I think the meaning of "sleep" is the usual meaning. Bruce Sherwood On Sun, Jan 13, 2013 at 9:13 AM, Mark Janssen wrote: > On Sun, Jan 13, 2013 at 1:15 AM, Bruce Sherwood > wrote: > > Here is detailed information on how VPython 6 differs from VPython 5, > which > > will be incorporated in the Help for upcoming releases of VPython 6. Note > > that the fact that in a main program __name__ isn't '__main__' is an > > unavoidable "feature", not a bug. That is, there doesn't seem to be a > way to > > make this work. > > Hi, Bruce. I think there are two ways of fixing this. One is to get > down to the C language level, and share the C main() loop, which will > percolate back up into the CPython loop. 
The other involves mending > the split that I've been mentioning that happened over floating point > between the VPython developers and the Python-dev group. Then perhaps > Guido will accept VPython using/sharing the __main__ "loop". (I'm > CC'ing him in this message.) So, either of these tracks should fix > this problem. This is why I keep mentioning the important of healing > that split between the parties arguing in the two camps. Perhaps Tim > Peters will be able help bridge this gap. > > > Changes from VPython 5 > > > > The new version makes an essential change to the syntax of VPython > programs. > > Now, an animation loop MUST contain a rate or sleep statement, which > limits > > the number of loop iterations per second as before but also when > appropriate > > (about 30 times per second) updates the 3D scene and handles mouse and > > keyboard events. Without a rate or sleep statement, the scene will not be > > updated until and unless the loop is completed. > > I think this is perfectly acceptible and is just a necessary > restriction wherefrom the OS *must* maintain ultimate control of I/O. > Python sits in userspace surrounded by the larger computation > environment, so this is only fair that an OS call is made so that it > can keep things in "check". Naming it "sleep" is okay, but makes it > sound more like a voluntary thing, and as you probably know, is the > traditional name for relinquishng some control to the OS. (cf. Unix > Systems Programming, Robbins, et al.) > > > You should use the new function sleep rather than time.sleep. The new > > function periodically renders the scene and processes mouse events, > making > > it possible to continue using zoom and rotate, whereas time.sleep does > not > > do this. > > This is strange to me. I think this is where VPython must be ahead of > the curve and not close enough to the Linux development communities. > > > For technical reasons, it is necessary for VPython 6 to do something > rather > > unusual. 
When you import visual (or vis), your own program is in turn > > imported by the visual module. > > Again this is because of the faction that was created by in 200?, > regarding the Vpython community vs. the python-dev. Really, this > should be mended. > > Anyway, I hope that any of this made sense. > > Thanks for your help! > > Mark > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Bruce_Sherwood at ncsu.edu Mon Jan 14 03:58:52 2013 From: Bruce_Sherwood at ncsu.edu (Bruce Sherwood) Date: Sun, 13 Jan 2013 19:58:52 -0700 Subject: [Python-Dev] [Visualpython-users] How VPython 6 differs from VPython 5 In-Reply-To: References: Message-ID: Since this was copied to the Python-Dev list, I want to go on record as stating firmly that there is no evidence whatsoever to substantiate claims that there has ever been some kind of conflict between VPython and Python. Since __future__ was also mentioned, I'll take the opportunity to say that I've been very impressed by the way the Python community has handled the difficult 2->3 transition. For example, it has been a big help to the educational uses of VPython that we could tell students simply to start with "from __future__ import division, print_function", put parens around print arguments, and thereby make it irrelevant whether they used Python 2 or Python 3. Many thanks to the Python development community! Bruce Sherwood On Sun, Jan 13, 2013 at 6:41 PM, Mark Janssen wrote: > On Sun, Jan 13, 2013 at 12:14 PM, Bruce Sherwood > wrote: > > For the record, I do not know of any evidence whatsoever for a supposed > > "split" between the tiny VPython community and the huge Python community > > concerning floating point variables. Nor do I see anything in Python that > > needs to be "fixed". 
> > Well it was bit enough that the python community created a brand-new > language construct called import __future__ -- something never > considered before then and which actually changes the behavior of > python unlike any other module. And perhaps I've just felt it more > because I was a big proponent of both 3-d graphics as a way to keep > python a draw for beginning programmers and also a big fan of > scientific simulation. No one had anything near vpython back then. > (But ultimately I need to stop mentioning this issue to this vpython > list because it's really the Python group which need to get back in > sync.) > > > The new (currently experimental) version of VPython based on wxPython > must, > > in order to run in a Cocoa environment on the Mac, make the interact > loop be > > the primary thread, with the user's Python calculations at worst a > secondary > > thread, which is the opposite of the older VPython, where the > > interval-driven rendering thread was secondary to the user's > calculations. > > By the unusual stratagem of having VPython import the user's program it > has > > been possible to make this work, and at the same time greatly simplify > the > > C++ component of VPython by eliminating threading, with additional > important > > simplification from eliminating essentially all platform-dependent code > > thanks to the multiplatform character of wxPython. The result is that > nearly > > all existing VPython programs will work without change, at the very small > > cost of a few marginal cases requiring minor tweaking. I should alter the > > documentation to make this important property of the new version more > > salient. > > I need to analyze this more carefully before commenting further.... > > mark > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Mon Jan 14 12:14:11 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 14 Jan 2013 21:14:11 +1000 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors In-Reply-To: <50F32FF8.7000900@canterbury.ac.nz> References: <20130113121518.7c1f134d@pitrou.net> <20130113124319.71746379@pitrou.net> <20130113142233.47aac171@pitrou.net> <50F32FF8.7000900@canterbury.ac.nz> Message-ID: On Mon, Jan 14, 2013 at 8:06 AM, Greg Ewing wrote: > Nick Coghlan wrote: >> >> Agreed, but it's the security implications that let us even >> contemplate the backwards compatibility break. > > > I don't think that's a sufficient criterion for choosing a > name. The backwards compatibility issue is a transitional > thing, but we'll be stuck with the name forever. > > IMO the name should simply describe the mechanism, and be > neutral as to what the reason might be for using the > mechanism. The problem is the mechanism is *not the same* on Windows and POSIX - the Windows mechanism (noinherit) means the file won't be accessible in child processes, the POSIX mechanism (cloexec) means it won't survive exec(). This difference really shows up with multiprocessing, as that uses a bare fork() without exec() on POSIX systems, meaning even flagged descriptors will still be inherited by the child processes. A flag like "sensitive" or "protected" or "private" tells the interpreter "keep this to yourself, as best you can", leaving Python implementors with a certain amount of freedom in the way they translate that into platform appropriate settings (e.g. does "cloexec" or "noinherit" even make sense in Java?) I agree with Glenn's point that "sensitive" in particular is a bad choice, but a more value neutral term like "protect" may still work. We can then explain what it means to protect a file descriptor on the various platforms, and potentially even change the meaning over time as the platforms provide additional capabilities. 
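On the POSIX side, the mechanism under discussion is a single ``fcntl`` flag; here is a minimal sketch (``set_cloexec`` is a hypothetical helper for illustration, not the API proposed by the PEP; on Windows the analogous operation toggles handle inheritance instead):

```python
import fcntl
import os

def set_cloexec(fd, cloexec=True):
    """Set or clear FD_CLOEXEC on a POSIX file descriptor.

    Hypothetical helper for illustration only -- not the PEP's proposed
    API. With the flag set, the descriptor is closed across exec().
    """
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    if cloexec:
        flags |= fcntl.FD_CLOEXEC
    else:
        flags &= ~fcntl.FD_CLOEXEC
    fcntl.fcntl(fd, fcntl.F_SETFD, flags)

r, w = os.pipe()
set_cloexec(w)
assert fcntl.fcntl(w, fcntl.F_GETFD) & fcntl.FD_CLOEXEC
os.close(r)
os.close(w)
```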
Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at gmail.com Mon Jan 14 12:23:13 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 14 Jan 2013 12:23:13 +0100 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors In-Reply-To: References: Message-ID: Hi, Thanks for your feedback, I already updated the PEP for most of your remarks. 2013/1/13 Charles-François Natali : > Also, ISTM that Windows also supports this flag. Yes it does, see Appendix: Operating system support > Windows. >> .. note:: >> OpenBSD older than 5.2 does not close the file descriptor with >> close-on-exec flag set if ``fork()`` is used before ``exec()``, but >> it works correctly if ``exec()`` is called without ``fork()``. > That would be *really* surprising, are you sure your test case is correct? > Otherwise it could be a compilation issue, because I simply can't > believe OpenBSD would ignore the close-on-exec flag. I had this issue with a short Python script. I will try to reproduce it with a C program to double check. >> Impacted modules: >> >> * ``multiprocessing`` >> * ``socketserver`` >> * ``subprocess`` >> * ``tempfile`` > > Hum, I thought temporary files are already created with the close-on-exec flag. "Impacted" is maybe not the best word. I mean that these modules should be modified or that their behaviour may change. It's more a TODO list to ensure that these modules are consistent with the PEP. >> XXX Should ``subprocess.Popen`` set the close-on-exec flag on file XXX >> XXX descriptors passed to the constructor via the ``pass_fds`` argument? XXX > > What? > Setting them cloexec would prevent them from being inherited in the > child process! Oops, it's just the opposite: pass_fds should (must?) *clear* the flag :-) (I'm not sure of what should be done here.) >> Add a new optional ``cloexec`` argument to: >> >> * Maybe also: ``os.open()``, ``os.openpty()`` > > Open can be passed O_CLOEXEC directly.
See the Portability section of the Rationale: O_CLOEXEC is not portable, even under Linux! It's tricky to use these atomic flags and to implement a fallback. It looks like most developers agree that we should expose a helper or something like that to access the "cloexec" feature for os.open(), os.pipe(), etc. I propose to add it directly in the os module because I don't want to add a new function just for that. OS functions (in the kernel/libc) don't support keyword arguments, so I consider cloexec=True explicit enough. The parameter may be a keyword-only argument; I chose a classic keyword parameter because it's simpler to use them in C. If you don't want to add extra syscalls to os functions using an optional keyword parameter, how do you propose to expose the cloexec feature for os.open(), os.pipe(), etc.? New functions? A new module? >> * Many functions of the Python standard library creating file >> descriptors cannot be changed by this proposal, because adding >> a ``cloexec`` optional argument would be surprising and too many >> functions would need it. For example, ``os.urandom()`` uses a >> temporary file on UNIX, but it calls a function of the Windows API on >> Windows. Adding a ``cloexec`` argument to ``os.urandom()`` would >> not make sense. See `Always set close-on-exec flag`_ for an >> incomplete list of functions creating file descriptors. >> * Checking if a module creates file descriptors is difficult. For >> example, ``os.urandom()`` creates a file descriptor on UNIX to read >> ``/dev/urandom`` (and closes it at exit), whereas it is implemented >> using a function call on Windows. It is not possible to control the >> close-on-exec flag of the file descriptor used by ``os.urandom()``, >> because the ``os.urandom()`` API does not allow it. > I think that the rule of thumb should be simple: > if a library opens a file descriptor which is not exposed to the user, > either because it's opened and closed before returning (e.g.
os.urandom()) or the file descriptor is kept private (e.g. poll()), it > should be set close-on-exec. Ok, it looks fair. I agree *but* only if the file descriptor is closed when the function exits. select.epoll() doesn't directly return a file descriptor, but an object having a fileno() method. A server may rely on the inheritance of this file descriptor, so we cannot set the close-on-exec flag in this case (if the default is False). It becomes unclear if a function returns an opaque object which contains a file descriptor, but the file descriptor is not accessible. I don't know if Python contains such a function. >> Always set close-on-exec flag on new file descriptors created by >> Python. This alternative just changes the default value of the new >> ``cloexec`` argument. > > In a perfect world, all FDs should have been cloexec by default. > But it's too late now, I don't think we can change the default, it's > going to break some applications, and would be a huge deviation from > POSIX (however broken this design decision is). I disagree, according to all emails exchanged: only a very few applications rely on file descriptor inheritance. Most of them are already using subprocess with pass_fds, and it should be easy to fix the last applications. I'm not sure that it's so different from the change on subprocess (close_fds=True by default on UNIX since Python 3.2). Do you think that it would break more applications? I disagree but I'm awaiting other votes :-) >> Add a function to set close-on-exec flag by default >> --------------------------------------------------- >> >> ... >> * It is no longer possible to know if the close-on-exec flag will be >> set or not just by reading the source code. > > That's really a show stopper: > > """ > s = socket.socket() > if os.fork() == 0: > # child > os.execv('myprog', ['myprog', 'arg1']) > else: > # parent > [...]
> """ > > It would be impossible to know if the socket is inherited, because the > behavior of the whole program is affected by an - hidden - global > variable. That's just too wrong. For most file descriptors, the application doesn't care about inheritance. For the few file descriptors that must be inherited, or not inherited, you can use the explicit flag: s = socket.socket(cloexec=True) The idea of a global option is to not slow down programs not using exec() or fork() at all. It would also allow users to not have to "fix" their applications if their applications are not "cloexec compliant". By the way, I don't know how this parameter would be specified in code that needs to be compatible with Python < 3.4. With a check on the Python version? > Also, it has the same drawbacks as global variables: not thread-safe, > not library-safe (i.e. if two libraries set it to conflicting values, > you don't know which one is picked up). What do you mean by "thread-safe"? "library-safe": only applications "should" use sys.setdefaultcloexec(). If a library uses this function, I guess that it would *enable* close-on-exec. If two libraries enable it, it works. If they disagree... something is wrong :-) But I agree that it's not the best solution. Setting cloexec to True (or False) by default is simpler ;-) > An atfork() module would indeed be really useful, but I don't think it > should be used for closing file descriptors: file descriptors are a > scarce resource, ... I agree but Antoine proposed something like that, and I would like to list all proposed alternatives. If I don't, someone will ask why the "atfork" solution was not taken :-) I will add your argument to explain why this alternative was not chosen.
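The explicit-inheritance model that eventually shipped as PEP 446 in Python 3.4 resolves exactly this tension: new descriptors are non-inheritable by default, and inheritance is opted into per descriptor rather than via a global switch. A minimal sketch with the modern ``os`` API:

```python
import os
import subprocess
import sys

r, w = os.pipe()
# Since PEP 446 (Python 3.4), new descriptors are non-inheritable by default.
assert not os.get_inheritable(w)

# Inheritance is an explicit per-descriptor opt-in, not a global default.
os.set_inheritable(w, True)
child = ("import os, sys; "
         "fd = int(sys.argv[1]); os.write(fd, b'hello'); os.close(fd)")
subprocess.run([sys.executable, "-c", child, str(w)], close_fds=False)
os.close(w)
print(os.read(r, 5))  # b'hello'
os.close(r)
```

The child process can only use the pipe because the parent flagged it inheritable and passed ``close_fds=False``; with the defaults, the write end would simply not exist in the child.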
Victor From ncoghlan at gmail.com Mon Jan 14 13:13:16 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 14 Jan 2013 22:13:16 +1000 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors In-Reply-To: References: Message-ID: On Mon, Jan 14, 2013 at 9:23 PM, Victor Stinner wrote: > I'm not sure that it's so different than the change on subprocess > (close_fds=True by default on UNIX since Python 3.2). Do you think > that it would break more applications? Don't ignore the possible explanation that the intersection of the sets "users with applications affected by that change" and "users migrating to Python 3.x" may currently be close to zero. > I disagree but I'm waiting other votes :-) I'm a fan of the conservative approach, with an environment variable and command line option to close FDs by default in 3.4 (similar to PYTHONHASHSEED and -R in the pre-3.3 security releases), and the cloexec/noinherit behaviour becoming the default (with no way to turn it off globally) in 3.5. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Mon Jan 14 13:15:49 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 14 Jan 2013 22:15:49 +1000 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors In-Reply-To: References: Message-ID: On Mon, Jan 14, 2013 at 9:23 PM, Victor Stinner wrote: >>> XXX Should ``subprocess.Popen`` set the close-on-exec flag on file XXX >>> XXX descriptors of the constructor the ``pass_fds`` argument? XXX >> >> What? >> Setting them cloexec would prevent them from being inherited in the >> child process! > > Oops, it's just the opposite: pass_fds should (must?) *clear* the flag > :-) (I'm not sure of what should be done here.) Turning off a security feature implicitly isn't a good idea. 
If someone passes such a descriptor, their child application will fail noisily - it's then up to the developer to decide if they passed the wrong file descriptor, or simply need to ensure the one they passed remains open in the child process. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From hrvoje.niksic at avl.com Mon Jan 14 13:32:53 2013 From: hrvoje.niksic at avl.com (Hrvoje Niksic) Date: Mon, 14 Jan 2013 13:32:53 +0100 Subject: [Python-Dev] [Python-checkins] Daily reference leaks (aef7db0d3893): sum=287 In-Reply-To: References: <50F038FE.208@python.org> Message-ID: <50F3FAF5.4040500@avl.com> On 01/12/2013 02:46 PM, Eli Bendersky wrote: > The first report is legit, however. PyTuple_New(0) was called and its > return value wasn't checked for NULL. The author might have been relying on Python caching the empty tuple. From victor.stinner at gmail.com Mon Jan 14 15:08:24 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 14 Jan 2013 15:08:24 +0100 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors In-Reply-To: References: Message-ID: 2013/1/14 Nick Coghlan : > I'm a fan of the conservative approach, with an environment variable > and command line option to close FDs by default in 3.4 (similar to > PYTHONHASHSEED and -R in the pre-3.3 security releases), and the > cloexec/noinherit behaviour becoming the default (with no way to turn > it off globally) in 3.5. Do you mean "environment variable and command line option" *instead of* a new sys.setdefaultcloexec() function? An environment variable and a command line option have an advantage over a function: libraries cannot modify the value at runtime (so 2 libraries cannot set different values :-)). 2013/1/14 Nick Coghlan : > Turning off a security feature implicitly isn't a good idea. 
If > someone passes such a descriptor, their child application will fail > noisily - it's then up to the developer to decide if they passed the > wrong file descriptor, or simply need to ensure the one they passed > remains open in the child process. For my subprocess/pass_fds comment: I wrote it initially while the PEP was proposing to set the close-on-exec flag by default. I will move this comment to the "Set close-on-exec flag by default" section. Victor From nad at acm.org Mon Jan 14 21:44:45 2013 From: nad at acm.org (Ned Deily) Date: Mon, 14 Jan 2013 12:44:45 -0800 Subject: [Python-Dev] cpython (3.3): fix for previous commit related to issue 10527 which didn't have the intended References: <3Ykxmg1NdXzSKN@mail.python.org> Message-ID: In article <3Ykxmg1NdXzSKN at mail.python.org>, giampaolo.rodola wrote: > http://hg.python.org/cpython/rev/88fadc0d7b20 > changeset: 81498:88fadc0d7b20 > branch: 3.3 > parent: 81495:e1c81ab5ad97 > user: Giampaolo Rodola' > date: Mon Jan 14 02:24:25 2013 +0100 > summary: > fix for previous commit related to issue 10527 which didn't have the > intended effect as per http://bugs.python.org/issue10527#msg179895 In article <3YkxmM0JwNzSJb at mail.python.org>, giampaolo.rodola wrote: > http://hg.python.org/cpython/rev/831f49cc00fc > changeset: 81497:831f49cc00fc > user: Giampaolo Rodola' > date: Mon Jan 14 02:24:05 2013 +0100 > summary: > fix for previous commit related to issue 10527 which didn't have the > intended effect as per http://bugs.python.org/issue10527#msg179895 > > files: > Lib/multiprocessing/connection.py | 48 +++++++++--------- > Lib/test/test_multiprocessing.py | 2 +- > 2 files changed, 26 insertions(+), 24 deletions(-) > It appears these two changesets were checked in independently rather than merging from branch 3.3 to default. As a result, the 3.3 branch is currently "open"; you'll see this as a +1 heads when doing a hg pull -u. A null merge needs to happen from 3.3 to default to close the 3.3 branch again.
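The null merge Ned asks for follows the usual CPython recipe of the time (a sketch of the standard procedure, to be run in the affected cpython clone):

```shell
# Null merge: join the 3.3 head into default without taking any of its
# changes, so the extra head disappears on the next pull.
hg update default
hg merge 3.3
hg revert --all --rev default   # discard everything the merge brought in
hg resolve --all --mark         # mark any conflicted files as resolved
hg commit -m "Null merge with 3.3"
```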
Giampaolo, can you take care of this? -- Ned Deily, nad at acm.org From greg.ewing at canterbury.ac.nz Mon Jan 14 22:44:12 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 15 Jan 2013 10:44:12 +1300 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors In-Reply-To: References: <20130113121518.7c1f134d@pitrou.net> <20130113124319.71746379@pitrou.net> <20130113142233.47aac171@pitrou.net> <50F32FF8.7000900@canterbury.ac.nz> Message-ID: <50F47C2C.1000404@canterbury.ac.nz> Nick Coghlan wrote: > The problem is the mechanism is *not the same* on Windows and POSIX - > the Windows mechanism (noinherit) means the file won't be accessible > in child processes, the POSIX mechanism (cloexec) means it won't > survive exec(). But since there is no such thing as a fork without exec on Windows, the term "cloexec" is still correct. The fd is closed during the exec part of the fork+exec implied by spawning a child process. > This difference really shows up with multiprocessing, > as that uses a bare fork() without exec() on POSIX systems, meaning > even flagged descriptors will still be inherited by the child > processes. There are a lot of other ways that this difference will be visible as well, so the difference in cloexec behaviour just needs to be documented in the multiprocessing module along with the rest. > I agree with Glenn's point that "sensitive" in particular is a bad > choice, but a more value neutral term like "protect" may still work. I think "protect" is *far* too vague. We don't want something so neutral that it gives no clue at all about its meaning. -- Greg From bugzilla-mail-box at yandex.ru Mon Jan 14 23:43:17 2013 From: bugzilla-mail-box at yandex.ru (lname name) Date: Tue, 15 Jan 2013 08:43:17 +1000 Subject: [Python-Dev] Issue about LoadError Message-ID: <1870851358203397@web25h.yandex.ru> http://bugs.python.org/issue16715 I assume that someone changes the LoadError class and goes into the revert function. 
How can he figure out that the cookies should return to the previous state if LoadError was raised? From dholth at gmail.com Tue Jan 15 03:46:58 2013 From: dholth at gmail.com (Daniel Holth) Date: Mon, 14 Jan 2013 21:46:58 -0500 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> Message-ID: There you go. Obsoleted-By (optional) ::::::::::::::::::::::: Indicates that this project is no longer being developed. The named project provides a substitute or replacement. A version declaration may be supplied and must follow the rules described in `Version Specifiers`_. The most common use of this field will be in case a project name changes. Examples:: Name: BadName Obsoleted-By: AcceptableName Obsoleted-By: AcceptableName (>=4.0.0) -------------- next part -------------- An HTML attachment was scrubbed... URL: From benno at benno.id.au Tue Jan 15 12:44:09 2013 From: benno at benno.id.au (Ben Leslie) Date: Tue, 15 Jan 2013 22:44:09 +1100 Subject: [Python-Dev] Should libpython.a go in eprefix? Message-ID: Hi all, This issue: http://bugs.python.org/issue9807 from a couple of years ago seems to have changed the LIBPL build variable from pointing to a directory inside EPREFIX, to a directory in PREFIX. This doesn't seem ideal as LIBPL is populated with platform-specific files such as libpython.a. I'm wondering if this was intentional, or an oversight? It seems to dramatically lessen the utility of providing an --exec-prefix to configure. Thanks, Ben -------------- next part -------------- An HTML attachment was scrubbed...
URL: From larry at hastings.org Tue Jan 15 19:59:06 2013 From: larry at hastings.org (Larry Hastings) Date: Tue, 15 Jan 2013 10:59:06 -0800 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors In-Reply-To: <50F47C2C.1000404@canterbury.ac.nz> References: <20130113121518.7c1f134d@pitrou.net> <20130113124319.71746379@pitrou.net> <20130113142233.47aac171@pitrou.net> <50F32FF8.7000900@canterbury.ac.nz> <50F47C2C.1000404@canterbury.ac.nz> Message-ID: <50F5A6FA.9040708@hastings.org> On 01/14/2013 01:44 PM, Greg Ewing wrote: > I think "protect" is *far* too vague. We don't want something > so neutral that it gives no clue at all about its meaning. My shoot-from-the-hip name suggestion: "sterile". (Does not produce offspring.) When a process and a file descriptor love each other *very much*... //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From regebro at gmail.com Tue Jan 15 22:07:10 2013 From: regebro at gmail.com (Lennart Regebro) Date: Tue, 15 Jan 2013 22:07:10 +0100 Subject: [Python-Dev] PEP-431 updated. Message-ID: OK, here is the third major incarnation of PEP-431. I think it addresses all the concerns brought to light so far. https://raw.github.com/regebro/tz-pep/master/pep-04tz.txt Is there a simpler way of doing this than this "cut and paste" updating? Is there a way to make pull requests to hg.python.org, for example? ------------ PEP: 431 Title: Time zone support improvements Version: $Revision$ Last-Modified: $Date$ Author: Lennart Regebro BDFL-Delegate: Barry Warsaw Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 11-Dec-2012 Post-History: 11-Dec-2012, 28-Dec-2012 Abstract ======== This PEP proposes the implementation of concrete time zone support in the Python standard library, and also improvements to the time zone API to deal with ambiguous time specifications during DST changes.
Proposal ======== Concrete time zone support -------------------------- The time zone support in Python has no concrete implementation in the standard library outside of a tzinfo baseclass that supports fixed offsets. To properly support time zones you need to include a database of all time zones, both current and historical, including daylight saving changes. But such information changes frequently, so even if we include the latest information in a Python release, that information would be outdated just a few months later. Time zone support has therefore only been available through two third-party modules, ``pytz`` and ``dateutil``, both of which include and wrap the "zoneinfo" database. This database, also called "tz" or "the Olson database", is the de-facto standard database of time zones, and it is included in most Unix and Unix-like operating systems, including OS X. This gives us the opportunity to include the code that supports the zoneinfo data in the standard library, but by default use the operating system's copy of the data, which typically will be kept updated by the updating mechanism of the operating system or distribution. For those who have an operating system that does not include the zoneinfo database, for example Windows, the Python source distribution will include a copy of the zoneinfo database, and a distribution containing the latest zoneinfo database will also be available at the Python Package Index, so it can be easily installed with the Python packaging tools such as ``easy_install`` or ``pip``. This could also be done on Unices that are no longer receiving updates and therefore have an outdated database. With such a mechanism Python would have full time zone support in the standard library on any platform, and a simple package installation would provide an updated time zone database on those platforms where the zoneinfo database isn't included, such as Windows, or on platforms where OS updates are no longer provided.
The time zone support will be implemented by making the ``datetime`` module into a package, and adding time zone support to ``datetime`` based on Stuart Bishop's ``pytz`` module. Getting the local time zone --------------------------- On Unix there is no standard way of finding the name of the time zone that is being used. All the information that is available is the time zone abbreviations, such as ``EST`` and ``PDT``, but many of those abbreviations are ambiguous and therefore you can't rely on them to figure out which time zone you are located in. There is however a standard for finding the compiled time zone information since it's located in ``/etc/localtime``. Therefore it is possible to create a local time zone object with the correct time zone information even though you don't know the name of the time zone. A function in ``datetime`` should be provided to return the local time zone. The support for this will be made by integrating Lennart Regebro's ``tzlocal`` module into the new ``datetime`` module. For Windows it will look up the local Windows time zone name, and use a mapping between Windows time zone names and zoneinfo time zone names provided by the Unicode consortium to convert that to a zoneinfo time zone. The mapping should be updated before each major or bugfix release; scripts for doing so will be provided in the ``Tools/`` directory. Ambiguous times --------------- When changing over from daylight savings time (DST) the clock is turned back one hour. This means that the times during that hour happen twice, once with DST and then once without DST. Similarly, when changing to daylight savings time, one hour goes missing. The current time zone API can not differentiate between the two ambiguous times during a change from DST. For example, in Stockholm the time of 2012-10-28 02:00:00 happens twice, both at UTC 2012-10-28 00:00:00 and also at 2012-10-28 01:00:00.
The current time zone API can not disambiguate this and therefore it's unclear which time should be returned:: # This could be either 00:00 or 01:00 UTC: >>> dt = datetime(2012, 10, 28, 2, 0, tzinfo=zoneinfo('Europe/Stockholm')) # But we can not specify which: >>> dt.astimezone(zoneinfo('UTC')) datetime.datetime(2012, 10, 28, 1, 0, tzinfo=<UTC>) ``pytz`` solved this problem by adding ``is_dst`` parameters to several methods of the tzinfo objects to make it possible to disambiguate times when this is desired. This PEP proposes to add these ``is_dst`` parameters to the relevant methods of the ``datetime`` API, and therefore add this functionality directly to ``datetime``. This is likely the hardest part of this PEP, as it involves updating the C version of the ``datetime`` library with this functionality; it means writing new code, and not just reorganizing existing external libraries. Implementation API ================== The zoneinfo database --------------------- The latest version of the zoneinfo database should exist in the ``Lib/tzdata`` directory of the Python source control system. This copy of the database should be updated before every Python feature and bug-fix release, but not for releases of Python versions that are in security-fix-only-mode. Scripts to update the database will be provided in ``Tools/``, and the release instructions will be updated to include this update. New configure options ``--enable-internal-timezone-database`` and ``--disable-internal-timezone-database`` will be implemented to enable and disable the installation of this database when installing from source. A source install will default to installing them. Binary installers for systems that have a system-provided zoneinfo database may skip installing the included database since it would never be used for these platforms. For other platforms, for example Windows, binary installers must install the included database.
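The ambiguity in the Stockholm example above was eventually addressed in later Pythons by the ``fold`` attribute (PEP 495, Python 3.6) and the stdlib ``zoneinfo`` module (PEP 615, Python 3.9), rather than by the ``is_dst`` parameter proposed here; the equivalent disambiguation with that API:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # stdlib since Python 3.9

stockholm = ZoneInfo("Europe/Stockholm")

# 2012-10-28 02:00 local time happened twice; fold selects the occurrence.
first = datetime(2012, 10, 28, 2, 0, tzinfo=stockholm)           # fold=0: CEST
second = datetime(2012, 10, 28, 2, 0, fold=1, tzinfo=stockholm)  # fold=1: CET

print(first.astimezone(timezone.utc))   # 2012-10-28 00:00:00+00:00
print(second.astimezone(timezone.utc))  # 2012-10-28 01:00:00+00:00
```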
Changes in the ``datetime``-module ---------------------------------- The public API of the new time zone support contains one new class, one new function, one new exception and four new collections. In addition to this, several methods on the datetime object get a new ``is_dst`` parameter. New class ``DstTzInfo`` ^^^^^^^^^^^^^^^^^^^^^^^^ This class provides a concrete implementation of the ``tzinfo`` base class that implements DST support. New function ``zoneinfo(name=None, db_path=None)`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This function takes a name string specifying a valid zoneinfo time zone, e.g. "US/Eastern", "Europe/Warsaw" or "Etc/GMT". If not given, the local time zone will be looked up. If an invalid zone name is given, or the local time zone can not be retrieved, the function raises `UnknownTimeZoneError`. The function also takes an optional path to the location of the zoneinfo database which should be used. If not specified, the function will look for databases in the following order: 1. Use the database in ``/usr/share/zoneinfo``, if it exists. 2. Check if the `tzdata-update` module is installed, and then use that database. 3. Use the Python-provided database in ``Lib/tzdata``. If no database is found an ``UnknownTimeZoneError`` or subclass thereof will be raised with a message explaining that no zoneinfo database can be found, but that you can install one with the ``tzdata-update`` package. New parameter ``is_dst`` ^^^^^^^^^^^^^^^^^^^^^^^^ A new ``is_dst`` parameter is added to several methods to handle time ambiguity during DST changeovers. * ``tzinfo.utcoffset(dt, is_dst=False)`` * ``tzinfo.dst(dt, is_dst=False)`` * ``tzinfo.tzname(dt, is_dst=False)`` * ``datetime.astimezone(tz, is_dst=False)`` The ``is_dst`` parameter can be ``False`` (default), ``True``, or ``None``. ``False`` will specify that the given datetime should be interpreted as not happening during daylight savings time, i.e.
that the time specified is after the change from DST. ``True`` will specify that the given datetime should be interpreted as happening during daylight savings time, i.e. that the time specified is before the change from DST. ``None`` will raise an ``AmbiguousTimeError`` exception if the time specified was during a DST change over. It will also raise a ``NonExistentTimeError`` if a time is specified during the "missing time" in a change to DST. New exceptions ^^^^^^^^^^^^^^ * ``UnknownTimeZoneError`` This exception is a subclass of KeyError and raised when giving a time zone specification that can't be found:: >>> datetime.Timezone('Europe/New_York') Traceback (most recent call last): ... UnknownTimeZoneError: There is no time zone called 'Europe/New_York' * ``InvalidTimeError`` This exception serves as a base for ``AmbiguousTimeError`` and ``NonExistentTimeError``, to enable you to trap these two separately. It will subclass from ValueError, so that you can catch these errors together with inputs like the 29th of February 2011. * ``AmbiguousTimeError`` This exception is raised when giving a datetime specification that is ambiguous while setting ``is_dst`` to None:: >>> datetime(2012, 10, 28, 2, 0, tzinfo=zoneinfo('Europe/Stockholm'), is_dst=None) Traceback (most recent call last): ... AmbiguousTimeError: 2012-10-28 02:00:00 is ambiguous in time zone Europe/Stockholm * ``NonExistentTimeError`` This exception is raised when giving a datetime specification that does not exist while setting ``is_dst`` to None:: >>> datetime(2012, 3, 25, 2, 0, tzinfo=zoneinfo('Europe/Stockholm'), is_dst=None) Traceback (most recent call last): ... NonExistentTimeError: 2012-03-25 02:00:00 does not exist in time zone Europe/Stockholm New collections ^^^^^^^^^^^^^^^ * ``all_timezones`` is the exhaustive list of the time zone names that can be used, listed alphabetically. * ``all_timezones_set`` is a set of the time zones in ``all_timezones``.
* ``common_timezones`` is a list of useful, current time zones, listed alphabetically. * ``common_timezones_set`` is a set of the time zones in ``common_timezones``. The ``tzdata-update``-package ----------------------------- The zoneinfo database will be packaged for easy installation with ``easy_install``/``pip``/``buildout``. This package will not install any Python code, and will not contain any Python code except that which is needed for installation. It will be kept updated with the same tools as the internal database, but released whenever the ``zoneinfo``-database is updated, and use the same version scheme. Differences from the ``pytz`` API ================================= * ``pytz`` has the functions ``localize()`` and ``normalize()`` to work around the fact that ``tzinfo`` doesn't have ``is_dst``. When ``is_dst`` is implemented directly in ``datetime.tzinfo`` they are no longer needed. * The ``timezone()`` function is called ``zoneinfo()`` to avoid clashing with the ``timezone`` class introduced in Python 3.2. * ``zoneinfo()`` will return the local time zone if called without arguments. * The class ``pytz.StaticTzInfo`` is there to provide the ``is_dst`` support for static time zones. When ``is_dst`` support is included in ``datetime.tzinfo`` it is no longer needed. * ``InvalidTimeError`` subclasses from ``ValueError``. Resources ========= * http://pytz.sourceforge.net/ * http://pypi.python.org/pypi/tzlocal * http://pypi.python.org/pypi/python-dateutil * http://unicode.org/cldr/data/common/supplemental/windowsZones.xml Copyright ========= This document has been placed in the public domain. From brett at python.org Tue Jan 15 23:43:45 2013 From: brett at python.org (Brett Cannon) Date: Tue, 15 Jan 2013 17:43:45 -0500 Subject: [Python-Dev] PEP-431 updated. In-Reply-To: References: Message-ID: On Tue, Jan 15, 2013 at 4:07 PM, Lennart Regebro wrote: > OK, here is the third major incarnation of PEP-431. > > I think it addresses all the concerns brought to light so far.
>
> https://raw.github.com/regebro/tz-pep/master/pep-04tz.txt
>
> Is there a simpler way of doing this than this "cut and paste"
> updating? Is there a way to make pull requests to hg.python.org, for
> example?

Easier how? If you mean mail python-dev, then no, since people need the
ability to do inline commenting on the mailing list. If you mean update the
PEP editors, clone the hg repo and send us a patch.

-Brett

>
> ------------
>
> PEP: 431
> Title: Time zone support improvements
> Version: $Revision$
> Last-Modified: $Date$
> Author: Lennart Regebro
> BDFL-Delegate: Barry Warsaw
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 11-Dec-2012
> Post-History: 11-Dec-2012, 28-Dec-2012
>
>
> Abstract
> ========
>
> This PEP proposes the implementation of concrete time zone support in the
> Python standard library, and also improvements to the time zone API to deal
> with ambiguous time specifications during DST changes.
>
>
> Proposal
> ========
>
> Concrete time zone support
> --------------------------
>
> The time zone support in Python has no concrete implementation in the
> standard library outside of a ``tzinfo`` base class that supports fixed
> offsets. To properly support time zones you need to include a database of
> all time zones, both current and historical, including daylight saving
> changes. But such information changes frequently, so even if we include the
> latest information in a Python release, that information would be outdated
> just a few months later.
>
> Time zone support has therefore only been available through two third-party
> modules, ``pytz`` and ``dateutil``, both of which include and wrap the
> "zoneinfo" database. This database, also called "tz" or "the Olson
> database", is the de-facto standard time zone database, and it is included
> in most Unix and Unix-like operating systems, including OS X.
>
> This gives us the opportunity to include the code that supports the zoneinfo
> data in the standard library, but by default use the operating system's copy
> of the data, which typically will be kept updated by the updating mechanism
> of the operating system or distribution.
>
> For those who have an operating system that does not include the zoneinfo
> database, for example Windows, the Python source distribution will include a
> copy of the zoneinfo database, and a distribution containing the latest
> zoneinfo database will also be available at the Python Package Index, so it
> can be easily installed with Python packaging tools such as
> ``easy_install`` or ``pip``. This could also be done on Unices that are no
> longer receiving updates and therefore have an outdated database.
>
> With such a mechanism Python would have full time zone support in the
> standard library on any platform, and a simple package installation would
> provide an updated time zone database on those platforms where the zoneinfo
> database isn't included, such as Windows, or on platforms where OS updates
> are no longer provided.
>
> The time zone support will be implemented by making the ``datetime`` module
> into a package, and adding time zone support to ``datetime`` based on Stuart
> Bishop's ``pytz`` module.
>
>
> Getting the local time zone
> ---------------------------
>
> On Unix there is no standard way of finding the name of the time zone that
> is being used. All the information that is available is the time zone
> abbreviations, such as ``EST`` and ``PDT``, but many of those abbreviations
> are ambiguous and therefore you can't rely on them to figure out which time
> zone you are located in.
>
> There is, however, a standard location for the compiled time zone
> information: ``/etc/localtime``. Therefore it is possible to create a local
> time zone object with the correct time zone information even though you
> don't know the name of the time zone.
> A function in ``datetime`` should be provided to return the local time zone.
>
> The support for this will be made by integrating Lennart Regebro's
> ``tzlocal`` module into the new ``datetime`` module.
>
> For Windows it will look up the local Windows time zone name, and use a
> mapping between Windows time zone names and zoneinfo time zone names
> provided by the Unicode consortium to convert that to a zoneinfo time zone.
>
> The mapping should be updated before each major or bugfix release; scripts
> for doing so will be provided in the ``Tools/`` directory.
>
>
> Ambiguous times
> ---------------
>
> When changing over from daylight saving time (DST) the clock is turned back
> one hour. This means that the times during that hour happen twice, once
> with DST and then once without DST. Similarly, when changing to daylight
> saving time, one hour goes missing.
>
> The current time zone API cannot differentiate between the two ambiguous
> times during a change from DST. For example, in Stockholm the time of
> 2012-10-28 02:00:00 happens twice, both at UTC 2012-10-28 00:00:00 and also
> at 2012-10-28 01:00:00.
>
> The current time zone API cannot disambiguate this and therefore it's
> unclear which time should be returned::
>
>     # This could be either 00:00 or 01:00 UTC:
>     >>> dt = datetime(2012, 10, 28, 2, 0, tzinfo=zoneinfo('Europe/Stockholm'))
>     # But we cannot specify which:
>     >>> dt.astimezone(zoneinfo('UTC'))
>     datetime.datetime(2012, 10, 28, 1, 0, tzinfo=<UTC>)
>
> ``pytz`` solved this problem by adding ``is_dst`` parameters to several
> methods of the tzinfo objects to make it possible to disambiguate times when
> this is desired.
>
> This PEP proposes to add these ``is_dst`` parameters to the relevant methods
> of the ``datetime`` API, and therefore add this functionality directly to
> ``datetime``.
> This is likely the hardest part of this PEP, as it involves updating the C
> version of the ``datetime`` library with this functionality: it means
> writing new code, and not just reorganizing existing external libraries.
>
>
> Implementation API
> ==================
>
> The zoneinfo database
> ---------------------
>
> The latest version of the zoneinfo database should exist in the
> ``Lib/tzdata`` directory of the Python source control system. This copy of
> the database should be updated before every Python feature and bug-fix
> release, but not for releases of Python versions that are in
> security-fix-only mode.
>
> Scripts to update the database will be provided in ``Tools/``, and the
> release instructions will be updated to include this update.
>
> New configure options ``--enable-internal-timezone-database`` and
> ``--disable-internal-timezone-database`` will be implemented to enable and
> disable the installation of this database when installing from source. A
> source install will default to installing it.
>
> Binary installers for systems that have a system-provided zoneinfo database
> may skip installing the included database since it would never be used for
> these platforms. For other platforms, for example Windows, binary installers
> must install the included database.
>
>
> Changes in the ``datetime``-module
> ----------------------------------
>
> The public API of the new time zone support contains one new class, one new
> function, four new exceptions and four new collections. In addition to
> this, several methods on the datetime object get a new ``is_dst`` parameter.
>
> New class ``DstTzInfo``
> ^^^^^^^^^^^^^^^^^^^^^^^
>
> This class provides a concrete implementation of the ``tzinfo`` base
> class that implements DST support.
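[Editor's note: a toy example may make the ``DstTzInfo`` idea concrete. The class below is *not* the proposed implementation — it hard-codes Stockholm's 2012 transitions instead of reading zoneinfo data — but it shows the three questions any concrete DST-aware ``tzinfo`` has to answer:]

```python
from datetime import datetime, timedelta, tzinfo

class StockholmSketch(tzinfo):
    """Toy DST-aware tzinfo, valid for 2012 only. A real DstTzInfo
    would derive the transition rules from the zoneinfo database."""

    def utcoffset(self, dt):
        # UTC+2 during summer time (CEST), UTC+1 otherwise (CET)
        return timedelta(hours=2) if self._isdst(dt) else timedelta(hours=1)

    def dst(self, dt):
        return timedelta(hours=1) if self._isdst(dt) else timedelta(0)

    def tzname(self, dt):
        return "CEST" if self._isdst(dt) else "CET"

    def _isdst(self, dt):
        # 2012 European DST: last Sunday of March 02:00 local until
        # last Sunday of October 03:00 local (wall-clock approximation).
        naive = dt.replace(tzinfo=None)
        return datetime(2012, 3, 25, 2) <= naive < datetime(2012, 10, 28, 3)

tz = StockholmSketch()
print(datetime(2012, 7, 1, tzinfo=tz).tzname())   # CEST
print(datetime(2012, 12, 1, tzinfo=tz).tzname())  # CET
```

Note how the ambiguous hour around the October transition cannot be represented here at all — which is exactly the gap the ``is_dst`` parameter below is meant to close.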
>
>
> New function ``zoneinfo(name=None, db_path=None)``
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> This function takes a ``name`` argument that must be a string specifying a
> valid zoneinfo time zone, e.g. "US/Eastern", "Europe/Warsaw" or "Etc/GMT".
> If not given, the local time zone will be looked up. If an invalid zone name
> is given, or the local time zone cannot be retrieved, the function raises
> ``UnknownTimeZoneError``.
>
> The function also takes an optional path to the location of the zoneinfo
> database which should be used. If not specified, the function will look for
> databases in the following order:
>
> 1. Use the database in ``/usr/share/zoneinfo``, if it exists.
>
> 2. Check if the ``tzdata-update`` module is installed, and then use that
>    database.
>
> 3. Use the Python-provided database in ``Lib/tzdata``.
>
> If no database is found an ``UnknownTimeZoneError`` or subclass thereof will
> be raised with a message explaining that no zoneinfo database can be found,
> but that you can install one with the ``tzdata-update`` package.
>
>
> New parameter ``is_dst``
> ^^^^^^^^^^^^^^^^^^^^^^^^
>
> A new ``is_dst`` parameter is added to several methods to handle time
> ambiguity during DST changeovers.
>
> * ``tzinfo.utcoffset(dt, is_dst=False)``
>
> * ``tzinfo.dst(dt, is_dst=False)``
>
> * ``tzinfo.tzname(dt, is_dst=False)``
>
> * ``datetime.astimezone(tz, is_dst=False)``
>
> The ``is_dst`` parameter can be ``False`` (default), ``True``, or ``None``.
>
> ``False`` will specify that the given datetime should be interpreted as not
> happening during daylight saving time, i.e. that the time specified is after
> the change from DST.
>
> ``True`` will specify that the given datetime should be interpreted as
> happening during daylight saving time, i.e. that the time specified is
> before the change from DST.
>
> ``None`` will raise an ``AmbiguousTimeError`` exception if the time
> specified was during a DST changeover.
> It will also raise a ``NonExistentTimeError`` if a time is specified during
> the "missing time" in a change to DST.
>
>
> New exceptions
> ^^^^^^^^^^^^^^
>
> * ``UnknownTimeZoneError``
>
>   This exception is a subclass of ``KeyError`` and raised when giving a time
>   zone specification that can't be found::
>
>     >>> zoneinfo('Europe/New_York')
>     Traceback (most recent call last):
>     ...
>     UnknownTimeZoneError: There is no time zone called 'Europe/New_York'
>
> * ``InvalidTimeError``
>
>   This exception serves as a base for ``AmbiguousTimeError`` and
>   ``NonExistentTimeError``, to enable you to trap these two separately. It
>   will subclass from ``ValueError``, so that you can catch these errors
>   together with inputs like the 29th of February 2011.
>
> * ``AmbiguousTimeError``
>
>   This exception is raised when giving a datetime specification that is
>   ambiguous while setting ``is_dst`` to ``None``::
>
>     >>> datetime(2012, 10, 28, 2, 0, tzinfo=zoneinfo('Europe/Stockholm'), is_dst=None)
>     Traceback (most recent call last):
>     ...
>     AmbiguousTimeError: 2012-10-28 02:00:00 is ambiguous in time zone Europe/Stockholm
>
> * ``NonExistentTimeError``
>
>   This exception is raised when giving a datetime specification for a time
>   that does not exist, while setting ``is_dst`` to ``None``::
>
>     >>> datetime(2012, 3, 25, 2, 0, tzinfo=zoneinfo('Europe/Stockholm'), is_dst=None)
>     Traceback (most recent call last):
>     ...
>     NonExistentTimeError: 2012-03-25 02:00:00 does not exist in time zone Europe/Stockholm
>
>
> New collections
> ^^^^^^^^^^^^^^^
>
> * ``all_timezones`` is the exhaustive list of the time zone names that can
>   be used, listed alphabetically.
>
> * ``all_timezones_set`` is a set of the time zones in ``all_timezones``.
>
> * ``common_timezones`` is a list of useful, current time zones, listed
>   alphabetically.
>
> * ``common_timezones_set`` is a set of the time zones in ``common_timezones``.
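[Editor's note: the proposed hierarchy is small enough to sketch directly. These class definitions are illustrative only — none of them exist in the stdlib; the names and base classes are taken from the PEP text above:]

```python
# Illustrative sketch of the PEP's proposed exception hierarchy.
class UnknownTimeZoneError(KeyError):
    """Raised for time zone specifications that can't be found."""

class InvalidTimeError(ValueError):
    """Base for the two is_dst=None errors; also catchable as ValueError."""

class AmbiguousTimeError(InvalidTimeError):
    """The wall time occurs twice (clock was turned back)."""

class NonExistentTimeError(InvalidTimeError):
    """The wall time never occurs (clock was turned forward)."""

# Because InvalidTimeError subclasses ValueError, one handler catches
# both DST problems and plainly invalid dates such as February 29, 2011:
try:
    raise AmbiguousTimeError("2012-10-28 02:00:00 is ambiguous")
except ValueError as exc:
    print("caught:", exc)
```

The ``ValueError`` parentage is the design point: callers validating user input get DST errors through the same ``except ValueError`` they already have.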
>
>
> The ``tzdata-update``-package
> -----------------------------
>
> The zoneinfo database will be packaged for easy installation with
> ``easy_install``/``pip``/``buildout``. This package will not install any
> Python code, and will not contain any Python code except that which is
> needed for installation.
>
> It will be kept updated with the same tools as the internal database, but
> released whenever the ``zoneinfo`` database is updated, and will use the
> same version scheme.
>
>
> Differences from the ``pytz`` API
> =================================
>
> * ``pytz`` has the functions ``localize()`` and ``normalize()`` to work
>   around the fact that ``tzinfo`` doesn't have ``is_dst``. When ``is_dst``
>   is implemented directly in ``datetime.tzinfo`` they are no longer needed.
>
> * The ``timezone()`` function is called ``zoneinfo()`` to avoid clashing
>   with the ``timezone`` class introduced in Python 3.2.
>
> * ``zoneinfo()`` will return the local time zone if called without arguments.
>
> * The class ``pytz.StaticTzInfo`` is there to provide the ``is_dst`` support
>   for static time zones. When ``is_dst`` support is included in
>   ``datetime.tzinfo`` it is no longer needed.
>
> * ``InvalidTimeError`` subclasses from ``ValueError``.
>
>
> Resources
> =========
>
> * http://pytz.sourceforge.net/
>
> * http://pypi.python.org/pypi/tzlocal
>
> * http://pypi.python.org/pypi/python-dateutil
>
> * http://unicode.org/cldr/data/common/supplemental/windowsZones.xml
>
> Copyright
> =========
>
> This document has been placed in the public domain.
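[Editor's note: the ambiguity the PEP keeps referring to can be demonstrated with nothing but today's stdlib. Below, two fixed-offset zones stand in for the proposed ``zoneinfo('Europe/Stockholm')``; the same Stockholm wall-clock reading maps to two different UTC instants:]

```python
from datetime import datetime, timedelta, timezone

# Stockholm, 2012-10-28: at 03:00 local the clock falls back from
# CEST (UTC+2) to CET (UTC+1), so the wall time 02:00 occurs twice.
CEST = timezone(timedelta(hours=2), "CEST")
CET = timezone(timedelta(hours=1), "CET")

wall = datetime(2012, 10, 28, 2, 0)
before_change = wall.replace(tzinfo=CEST).astimezone(timezone.utc)
after_change = wall.replace(tzinfo=CET).astimezone(timezone.utc)

print(before_change)  # 2012-10-28 00:00:00+00:00
print(after_change)   # 2012-10-28 01:00:00+00:00

# Two distinct instants, one wall-clock reading -- exactly the case
# the proposed is_dst parameter is meant to disambiguate.
assert before_change != after_change
```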
> _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org From pd at gmane.2013.dobrogost.net Wed Jan 16 00:21:57 2013 From: pd at gmane.2013.dobrogost.net (Piotr Dobrogost) Date: Tue, 15 Jan 2013 23:21:57 +0000 (UTC) Subject: [Python-Dev] DLLs folder on Windows Message-ID: Hi! I'm curious how dlls from the DLLs folder on Windows are being loaded? As they are not placed in the same folder where python.exe resides I guess they must be loaded by giving the path explicitly but I'm not sure. I'm asking because there's no DLLs folder being created when creating virtualenv and I suspect it should be. This is a followup to my question "Why doesn't virtualenv create DLLs folder?" (http://stackoverflow.com/q/6657541/95735) and to the virtualenv's issue 87 - "Problems creating virtualenv on Windows when Python is not installed for all users.". Regards, Piotr Dobrogost From tjreedy at udel.edu Wed Jan 16 02:21:07 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 15 Jan 2013 20:21:07 -0500 Subject: [Python-Dev] DLLs folder on Windows In-Reply-To: References: Message-ID: On 1/15/2013 6:21 PM, Piotr Dobrogost wrote: > I'm curious how dlls from the DLLs folder on Windows are being loaded? As they > are not placed in the same folder where python.exe resides I guess they must be > loaded by giving the path explicitly but I'm not sure. They are in .../DLLs, which is in sys.path before .../Lib. I don't know about virtualenv. (Usage questions should go to python-list.) 
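[Editor's note: Terry's answer is easy to check from any interpreter — imports simply walk ``sys.path`` in order, and on a Windows installation the ``DLLs`` directory is one of those entries, ahead of ``Lib``. A quick, platform-neutral inspection (the exact entries differ per system):]

```python
import importlib.util
import sys

# The import system tries each sys.path entry in order; on Windows the
# <prefix>\DLLs directory sits in this list before <prefix>\Lib, which is
# how the .pyd extension modules there get loaded without explicit paths.
for entry in sys.path:
    print(entry)

spec = importlib.util.find_spec("json")
print(spec.origin)  # the file that 'import json' would actually load
```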
-- Terry Jan Reedy From victor.stinner at gmail.com Thu Jan 17 14:02:45 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 17 Jan 2013 14:02:45 +0100 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors In-Reply-To: <50F28557.6010101@farowl.co.uk> References: <50F28557.6010101@farowl.co.uk> Message-ID: 2013/1/13 Jeff Allen : > I think io, meaning _io and _pyio really, would be amongst the impacted > modules, and should perhaps be in the examples. (I am currently working on > the Jython implementation of the _io module.) It seems to me that io.open, > and probably all the constructors, such as _io.FileIO, would need the extra > information as a mode or a boolean argument like closefd. This may be a > factor in your choice above. open() is listed in the PEP: open is io.open. I plan to add cloexec parameter to open() and FileIO constructor (and so io.open and _pyio.open). Examples: rawfile = io.FileIO("test.txt", "r", cloexec=True) textfile = open("text.txt", "r", cloexec=True) I started an implementation, you can already test FileIO and open(): http://hg.python.org/features/pep-433 Victor From "ja...py" at farowl.co.uk Fri Jan 18 00:40:32 2013 From: "ja...py" at farowl.co.uk (Jeff Allen) Date: Thu, 17 Jan 2013 23:40:32 +0000 Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors In-Reply-To: References: <50F28557.6010101@farowl.co.uk> Message-ID: <50F88BF0.6040907@farowl.co.uk> On 17/01/2013 13:02, Victor Stinner wrote: > 2013/1/13 Jeff Allen: >> I think io, meaning _io and _pyio really, would be amongst the impacted >> modules, and should perhaps be in the examples. (I am currently working on >> the Jython implementation of the _io module.) It seems to me that io.open, >> and probably all the constructors, such as _io.FileIO, would need the extra >> information as a mode or a boolean argument like closefd. This may be a >> factor in your choice above. 
> open() is listed in the PEP: open is io.open. I plan to add cloexec > parameter to open() and FileIO constructor (and so io.open and > _pyio.open). Examples: > > rawfile = io.FileIO("test.txt", "r", cloexec=True) > textfile = open("text.txt", "r", cloexec=True) Ok it fell under "too obvious to mention". And my ignorance of v3.x, sorry. (We're still working on 2.7 in Jython, where open is from __builtin__.) Thanks for the reply. Jeff Allen From status at bugs.python.org Fri Jan 18 18:07:28 2013 From: status at bugs.python.org (Python tracker) Date: Fri, 18 Jan 2013 18:07:28 +0100 (CET) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20130118170728.B97DC1C967@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2013-01-11 - 2013-01-18) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 3829 ( +4) closed 24942 (+54) total 28771 (+58) Open issues with patches: 1668 Issues opened (40) ================== #15031: Split .pyc parsing from module loading http://bugs.python.org/issue15031 reopened by brett.cannon #16932: urlparse fails at parsing "www.python.org:80/" http://bugs.python.org/issue16932 reopened by georg.brandl #16937: -u (unbuffered I/O) command line option documentation mismatch http://bugs.python.org/issue16937 opened by mjpieters #16938: pydoc confused by __dir__ http://bugs.python.org/issue16938 opened by ronaldoussoren #16942: urllib still doesn't support persistent connections http://bugs.python.org/issue16942 opened by C19 #16945: rewrite CGIHTTPRequestHandler to always use subprocess http://bugs.python.org/issue16945 opened by neologix #16946: subprocess: _close_open_fd_range_safe() does not set close-on- http://bugs.python.org/issue16946 opened by haypo #16948: email.mime.text.MIMEText: QP encoding broken with charset!=ISO http://bugs.python.org/issue16948 opened by jwilk #16953: select module compile errors with 
broken poll() http://bugs.python.org/issue16953 opened by Jeffrey.Armstrong #16954: Add docstrings for ElementTree module http://bugs.python.org/issue16954 opened by serhiy.storchaka #16956: Allow signed line number deltas in the code object's line num http://bugs.python.org/issue16956 opened by Mark.Shannon #16957: shutil.which() shouldn't look in working directory on unix-y s http://bugs.python.org/issue16957 opened by takluyver #16958: The sqlite3 context manager does not work with isolation_level http://bugs.python.org/issue16958 opened by r.david.murray #16959: rlcompleter doesn't work if __main__ can't be imported http://bugs.python.org/issue16959 opened by Aaron.Meurer #16961: No regression tests for -E and individual environment vars http://bugs.python.org/issue16961 opened by ncoghlan #16962: _posixsubprocess module uses outdated getdents system call http://bugs.python.org/issue16962 opened by riku-voipio #16964: Add 'm' format specifier for mon_grouping etc. http://bugs.python.org/issue16964 opened by skrah #16965: 2to3 should rewrite execfile() to open in 'rb' mode http://bugs.python.org/issue16965 opened by barry #16967: Keyword only argument default values are evaluated before othe http://bugs.python.org/issue16967 opened by Kay.Hayen #16968: Fix test discovery for test_concurrent_futures.py http://bugs.python.org/issue16968 opened by zach.ware #16969: test_urlwithfrag fail http://bugs.python.org/issue16969 opened by Ry #16970: argparse: bad nargs value raises misleading message http://bugs.python.org/issue16970 opened by chris.jerdonek #16971: Refleaks in charmap decoder http://bugs.python.org/issue16971 opened by serhiy.storchaka #16972: Useless function call in site.py http://bugs.python.org/issue16972 opened by x746e #16974: when "python -c command" does a traceback, it open the file "< http://bugs.python.org/issue16974 opened by ericlammerts #16975: Broken error handling in codecs.escape_decode() http://bugs.python.org/issue16975 opened by 
serhiy.storchaka #16976: Asyncore/asynchat hangs when used with ssl sockets http://bugs.python.org/issue16976 opened by Anthony.Lozano #16977: argparse: mismatch between choices parsing and usage/error mes http://bugs.python.org/issue16977 opened by chris.jerdonek #16978: fix grammar in 'threading' documentation http://bugs.python.org/issue16978 opened by tshepang #16979: Broken error handling in codecs.unicode_escape_decode() http://bugs.python.org/issue16979 opened by serhiy.storchaka #16980: SystemError in codecs.unicode_escape_decode() http://bugs.python.org/issue16980 opened by serhiy.storchaka #16981: ImportError hides real error when there too many open files du http://bugs.python.org/issue16981 opened by jinty #16983: header parsing could apply postel's law to encoded words insid http://bugs.python.org/issue16983 opened by r.david.murray #16985: Docs reference a concrete UTC tzinfo, but none exists http://bugs.python.org/issue16985 opened by jason.coombs #16986: ElementTree incorrectly parses strings with declared encoding http://bugs.python.org/issue16986 opened by serhiy.storchaka #16988: argparse: PARSER option for nargs not documented http://bugs.python.org/issue16988 opened by Robert #16989: allow distutils debug mode to be enabled more easily http://bugs.python.org/issue16989 opened by chris.jerdonek #16991: Add OrderedDict written in C http://bugs.python.org/issue16991 opened by eric.snow #16993: shutil.which() should preserve path case http://bugs.python.org/issue16993 opened by serhiy.storchaka #16994: collections.Counter.least_common http://bugs.python.org/issue16994 opened by davidcoallier Most recent 15 issues with no replies (15) ========================================== #16994: collections.Counter.least_common http://bugs.python.org/issue16994 #16993: shutil.which() should preserve path case http://bugs.python.org/issue16993 #16989: allow distutils debug mode to be enabled more easily http://bugs.python.org/issue16989 #16985: Docs reference a 
concrete UTC tzinfo, but none exists http://bugs.python.org/issue16985 #16983: header parsing could apply postel's law to encoded words insid http://bugs.python.org/issue16983 #16978: fix grammar in 'threading' documentation http://bugs.python.org/issue16978 #16972: Useless function call in site.py http://bugs.python.org/issue16972 #16965: 2to3 should rewrite execfile() to open in 'rb' mode http://bugs.python.org/issue16965 #16964: Add 'm' format specifier for mon_grouping etc. http://bugs.python.org/issue16964 #16962: _posixsubprocess module uses outdated getdents system call http://bugs.python.org/issue16962 #16961: No regression tests for -E and individual environment vars http://bugs.python.org/issue16961 #16953: select module compile errors with broken poll() http://bugs.python.org/issue16953 #16948: email.mime.text.MIMEText: QP encoding broken with charset!=ISO http://bugs.python.org/issue16948 #16946: subprocess: _close_open_fd_range_safe() does not set close-on- http://bugs.python.org/issue16946 #16945: rewrite CGIHTTPRequestHandler to always use subprocess http://bugs.python.org/issue16945 Most recent 15 issues waiting for review (15) ============================================= #16994: collections.Counter.least_common http://bugs.python.org/issue16994 #16993: shutil.which() should preserve path case http://bugs.python.org/issue16993 #16991: Add OrderedDict written in C http://bugs.python.org/issue16991 #16981: ImportError hides real error when there too many open files du http://bugs.python.org/issue16981 #16980: SystemError in codecs.unicode_escape_decode() http://bugs.python.org/issue16980 #16979: Broken error handling in codecs.unicode_escape_decode() http://bugs.python.org/issue16979 #16978: fix grammar in 'threading' documentation http://bugs.python.org/issue16978 #16977: argparse: mismatch between choices parsing and usage/error mes http://bugs.python.org/issue16977 #16975: Broken error handling in codecs.escape_decode() 
http://bugs.python.org/issue16975 #16972: Useless function call in site.py http://bugs.python.org/issue16972 #16971: Refleaks in charmap decoder http://bugs.python.org/issue16971 #16970: argparse: bad nargs value raises misleading message http://bugs.python.org/issue16970 #16968: Fix test discovery for test_concurrent_futures.py http://bugs.python.org/issue16968 #16962: _posixsubprocess module uses outdated getdents system call http://bugs.python.org/issue16962 #16957: shutil.which() shouldn't look in working directory on unix-y s http://bugs.python.org/issue16957 Top 10 most discussed issues (10) ================================= #16468: argparse only supports iterable choices http://bugs.python.org/issue16468 17 msgs #10527: multiprocessing.Pipe problem: "handle out of range in select() http://bugs.python.org/issue10527 11 msgs #16893: Create IDLE help.txt from Doc/library/idle.rst http://bugs.python.org/issue16893 11 msgs #12939: Add new io.FileIO using the native Windows API http://bugs.python.org/issue12939 9 msgs #14468: Update cloning guidelines in devguide http://bugs.python.org/issue14468 7 msgs #16968: Fix test discovery for test_concurrent_futures.py http://bugs.python.org/issue16968 7 msgs #16969: test_urlwithfrag fail http://bugs.python.org/issue16969 7 msgs #11844: Update json to upstream simplejson latest release http://bugs.python.org/issue11844 6 msgs #16974: when "python -c command" does a traceback, it open the file "< http://bugs.python.org/issue16974 6 msgs #11983: Inconsistent hash and comparison for code objects http://bugs.python.org/issue11983 5 msgs Issues closed (50) ================== #3876: multiprocessing does not compile on systems which do not defin http://bugs.python.org/issue3876 closed by skrah #5066: IDLE documentation for Unix obsolete/incorrect http://bugs.python.org/issue5066 closed by asvetlov #7340: Doc for sys.exc_info has warning that is no longer valid http://bugs.python.org/issue7340 closed by python-dev #7468: 
PyErr_Format documentation doesn't mention all format codes http://bugs.python.org/issue7468 closed by serhiy.storchaka #8712: Skip libpthread related test failures on OpenBSD http://bugs.python.org/issue8712 closed by skrah #8714: Delayed signals in the REPL on OpenBSD (possibly libpthread re http://bugs.python.org/issue8714 closed by skrah #9720: zipfile writes incorrect local file header for large files in http://bugs.python.org/issue9720 closed by serhiy.storchaka #11290: ttk.Combobox['values'] String Conversion to Tcl http://bugs.python.org/issue11290 closed by serhiy.storchaka #11441: compile() raises SystemError if called from except clause http://bugs.python.org/issue11441 closed by brett.cannon #11729: libffi assembler relocation check is not robust, fails with cl http://bugs.python.org/issue11729 closed by skrah #12812: libffi does not build with clang on amd64 http://bugs.python.org/issue12812 closed by skrah #14110: FreeBSD: test_os fails if user is in the wheel group http://bugs.python.org/issue14110 closed by skrah #14377: Modify serializer for xml.etree.ElementTree to allow forcing t http://bugs.python.org/issue14377 closed by python-dev #14850: The inconsistency of codecs.charmap_decode http://bugs.python.org/issue14850 closed by serhiy.storchaka #15442: Expand the list of default dirs filecmp.dircmp ignores http://bugs.python.org/issue15442 closed by python-dev #15861: ttk.Treeview "unmatched open brace in list" http://bugs.python.org/issue15861 closed by serhiy.storchaka #16216: Arithmetic operations with NULL http://bugs.python.org/issue16216 closed by serhiy.storchaka #16259: Replace exec() in test.regrtest with __import__ http://bugs.python.org/issue16259 closed by r.david.murray #16398: deque.rotate() could be much faster http://bugs.python.org/issue16398 closed by rhettinger #16422: Decimal constants should be the same for py & c module version http://bugs.python.org/issue16422 closed by skrah #16613: ChainMap.new_child could use improvement 
http://bugs.python.org/issue16613 closed by python-dev #16730: _fill_cache in _bootstrap.py crashes without directory execute http://bugs.python.org/issue16730 closed by brett.cannon #16762: test_subprocess failure on OpenBSD/NetBSD buildbots http://bugs.python.org/issue16762 closed by neologix #16829: IDLE on POSIX can't print filenames with spaces http://bugs.python.org/issue16829 closed by serhiy.storchaka #16886: Doctests in test_dictcomp depend on dict order http://bugs.python.org/issue16886 closed by python-dev #16916: The Extended Iterable Unpacking (PEP-3132) for byte strings do http://bugs.python.org/issue16916 closed by python-dev #16923: test_ssl kicks up a lot of ResourceWarnings http://bugs.python.org/issue16923 closed by pitrou #16933: argparse: remove magic from examples http://bugs.python.org/issue16933 closed by chris.jerdonek #16936: Documentation for stat.S_IFMT inconsistent http://bugs.python.org/issue16936 closed by python-dev #16939: Broken link in 14. Cryptographic Service http://bugs.python.org/issue16939 closed by chris.jerdonek #16940: argparse 15.4.5.1. Sub-commands documentation missing indentat http://bugs.python.org/issue16940 closed by ezio.melotti #16941: TkInter won't update display on OS X if delay is too small http://bugs.python.org/issue16941 closed by ned.deily #16943: seriously? FileCookieJar can't really save ? 
save method is N http://bugs.python.org/issue16943 closed by nadeem.vawda #16944: German number separators not working using format language and http://bugs.python.org/issue16944 closed by skrah #16947: Search for "sherpa" on pypi leads to gitflow http://bugs.python.org/issue16947 closed by r.david.murray #16949: removal of string exceptions is already done http://bugs.python.org/issue16949 closed by python-dev #16950: the old raise syntax is not legal in Python 3 http://bugs.python.org/issue16950 closed by python-dev #16951: expand on meaning of 'string literals that rely on significant http://bugs.python.org/issue16951 closed by georg.brandl #16952: test_kqueue failure on NetBSD/OpenBSD http://bugs.python.org/issue16952 closed by neologix #16955: multiprocessing.connection poll() always returns false http://bugs.python.org/issue16955 closed by sbt #16960: Fix PEP8 errors everywhere http://bugs.python.org/issue16960 closed by ezio.melotti #16963: module html.parser HTMLParser's strict mode don't work http://bugs.python.org/issue16963 closed by pysean #16966: Publishing multiprocessing listener code http://bugs.python.org/issue16966 closed by sbt #16973: Extending the datetime package by adding a workDayMath module http://bugs.python.org/issue16973 closed by ezio.melotti #16982: _ssl not built --without-threads http://bugs.python.org/issue16982 closed by skrah #16984: idle problem with dark color schemes in kde http://bugs.python.org/issue16984 closed by serwy #16987: HP-UX: SHLIB_EXT is not set http://bugs.python.org/issue16987 closed by skrah #16990: re: match of nongreedy regex not grouping right http://bugs.python.org/issue16990 closed by r.david.murray #16992: signal.set_wakeup_fd(400) crashes on Windows http://bugs.python.org/issue16992 closed by python-dev #16922: ElementTree.findtext() returns empty bytes object instead of e http://bugs.python.org/issue16922 closed by eli.bendersky From victor.stinner at gmail.com Fri Jan 18 23:16:17 2013 From: 
victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 18 Jan 2013 23:16:17 +0100
Subject: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors
In-Reply-To:
References:
Message-ID:

2013/1/13 Charles-François Natali :
>> .. note::
>>    OpenBSD older than 5.2 does not close the file descriptor with the
>>    close-on-exec flag set if ``fork()`` is used before ``exec()``, but
>>    it works correctly if ``exec()`` is called without ``fork()``.
>
> That would be *really* surprising, are you sure your test case is correct?
> Otherwise it could be a compilation issue, because I simply can't
> believe OpenBSD would ignore the close-on-exec flag.

I didn't write a C program yet, but you can test the following Python
script. On OpenBSD 4.9 it writes "OS BUG !!!".
--
USE_FORK = True

import fcntl, os, sys

fd = os.open("/etc/passwd", os.O_RDONLY)
flags = fcntl.fcntl(fd, fcntl.F_GETFD)
flags |= fcntl.FD_CLOEXEC
fcntl.fcntl(fd, fcntl.F_SETFD, flags)

code = """
import os, sys
fd = int(sys.argv[1])
try:
    os.fstat(fd)
except OSError:
    print("fd %s closed by exec (FD_CLOEXEC works)" % fd)
else:
    print("fd %s not closed by exec: FD_CLOEXEC doesn't work, OS BUG!!!" % fd)
"""

args = [sys.executable, '-c', code, str(fd)]

if USE_FORK:
    pid = os.fork()
    if pid:
        os.waitpid(pid, 0)
        sys.exit(0)

os.execv(args[0], args)
--
It works with USE_FORK = False.

Victor

From benjamin at python.org  Sat Jan 19 20:30:10 2013
From: benjamin at python.org (Benjamin Peterson)
Date: Sat, 19 Jan 2013 14:30:10 -0500
Subject: [Python-Dev] 2.7.4
Message-ID:

It's been almost a year since 2.7.3, so it's time for another 2.7
bugfix release.

2013-02-02 - 2.7.4 release branch created; rc released
2013-02-16 - 2.7.4 released

Does this work for you, Martin and Ned?
-- Regards, Benjamin From nad at acm.org Sat Jan 19 21:42:26 2013 From: nad at acm.org (Ned Deily) Date: Sat, 19 Jan 2013 12:42:26 -0800 Subject: [Python-Dev] 2.7.4 References: Message-ID: In article , Benjamin Peterson wrote: > It's been almost a year since 2.7.3, so it's time for another 2.7 > bugfix release. > > 2013-02-02 - 2.7.4 release branch created; rc released > 2013-02-16 - 2.7.4 released > > Does this work for you, Martin and Ned? That works for me. There are also several pending issues that I want to get into 2.7.4 (and 3.2.4). That should be enough time to finish them. -- Ned Deily, nad at acm.org From hodgestar+pythondev at gmail.com Sat Jan 19 21:43:52 2013 From: hodgestar+pythondev at gmail.com (Simon Cross) Date: Sat, 19 Jan 2013 22:43:52 +0200 Subject: [Python-Dev] 2.7.4 In-Reply-To: References: Message-ID: On Sat, Jan 19, 2013 at 9:30 PM, Benjamin Peterson wrote: > It's been almost a year since 2.7.3, so it's time for another 2.7 > bugfix release. > > 2013-02-02 - 2.7.4 release branch created; rc released > 2013-02-16 - 2.7.4 released The Cape Town Python User Group is having a Python Bug Day next weekend (26th & 27th January) [1]. If there any particular 2.7 bugs that need looking at please shout, otherwise we'll just work from the top of the list of 2.7 issues on the bug tracker. [1] http://ctpug.org.za/wiki/Sprint20130126 Schiavo Simon From chrism at plope.com Sat Jan 19 23:13:00 2013 From: chrism at plope.com (Chris McDonough) Date: Sat, 19 Jan 2013 17:13:00 -0500 Subject: [Python-Dev] 2.7.4 In-Reply-To: References: Message-ID: <1358633580.4470.11.camel@thinko> On Sat, 2013-01-19 at 14:30 -0500, Benjamin Peterson wrote: > It's been almost a year since 2.7.3, so it's time for another 2.7 > bugfix release. > > 2013-02-02 - 2.7.4 release branch created; rc released > 2013-02-16 - 2.7.4 released > > Does this work for you, Martin and Ned? 
I have a pet issue that has a patch that requires application to the 2.7
branch, if anyone would be kind enough to do it:

http://bugs.python.org/issue15881

It has already been applied to various 3.X branches.

- C

From peter.a.portante at gmail.com  Sun Jan 20 05:49:04 2013
From: peter.a.portante at gmail.com (Peter Portante)
Date: Sat, 19 Jan 2013 23:49:04 -0500
Subject: [Python-Dev] Modules/socketmodule.c: avoiding second fcntl() call worth the effort?
Message-ID: 

Hello folks,

I noticed while stracing a process that sock.setblocking() calls always
result in pairs of fcntl() calls on Linux. Checking 2.6.8, 2.7.3, and 3.3.0
Modules/socketmodule.c, the code seems to use the following (unless I have
missed something):

    delay_flag = fcntl(s->sock_fd, F_GETFL, 0);
    if (block)
        delay_flag &= (~O_NONBLOCK);
    else
        delay_flag |= O_NONBLOCK;
    fcntl(s->sock_fd, F_SETFL, delay_flag);

Perhaps a check to see whether the flags changed might be worth making?

    int orig_delay_flag = fcntl(s->sock_fd, F_GETFL, 0);
    if (block)
        delay_flag = orig_delay_flag & (~O_NONBLOCK);
    else
        delay_flag = orig_delay_flag | O_NONBLOCK;
    if (delay_flag != orig_delay_flag)
        fcntl(s->sock_fd, F_SETFL, delay_flag);

OpenStack Swift uses the Eventlet module, which sets the accepted socket
non-blocking, resulting in twice the number of fcntl() calls. Not a killer
on performance, but it seems simple enough to save a system call here.

Thanks for your consideration,

-peter
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From benjamin at python.org  Sun Jan 20 06:41:17 2013
From: benjamin at python.org (Benjamin Peterson)
Date: Sun, 20 Jan 2013 00:41:17 -0500
Subject: [Python-Dev] Modules/socketmodule.c: avoiding second fcntl() call worth the effort?
In-Reply-To: 
References: 
Message-ID: 

2013/1/19 Peter Portante :
> Hello folks,
>
> I noticed while stracing a process that sock.setblocking() calls always
> result in pairs of fcntl() calls on Linux.
Checking 2.6.8, 2.7.3, and 3.3.0 > Modules/socketmodule.c, the code seems to use the following (unless I have > missed something): > > delay_flag = fcntl(s->sock_fd, F_GETFL, 0); > if (block) > delay_flag &= (~O_NONBLOCK); > else > delay_flag |= O_NONBLOCK; > fcntl(s->sock_fd, F_SETFL, delay_flag); > > Perhaps a check to see the flags changed might be worth making? Considering most sockets are only set to blocking once, this doesn't seem very useful. -- Regards, Benjamin From guido at python.org Sun Jan 20 06:38:41 2013 From: guido at python.org (Guido van Rossum) Date: Sat, 19 Jan 2013 21:38:41 -0800 Subject: [Python-Dev] Modules/socketmodule.c: avoiding second fcntl() call worth the effort? In-Reply-To: References: Message-ID: On Sat, Jan 19, 2013 at 8:49 PM, Peter Portante wrote: > I noticed while stracing a process that sock.setblocking() calls always > result in pairs of fcntl() calls on Linux. Checking 2.6.8, 2.7.3, and 3.3.0 > Modules/socketmodule.c, the code seems to use the following (unless I have > missed something): > > delay_flag = fcntl(s->sock_fd, F_GETFL, 0); > if (block) > delay_flag &= (~O_NONBLOCK); > else > delay_flag |= O_NONBLOCK; > fcntl(s->sock_fd, F_SETFL, delay_flag); > > Perhaps a check to see the flags changed might be worth making? > > int orig_delay_flag = fcntl(s->sock_fd, F_GETFL, 0); > if (block) > delay_flag = orig_delay_flag & (~O_NONBLOCK); > else > delay_flag = orig_delay_flag | O_NONBLOCK; > if (delay_flag != orig_delay_flag) > fcntl(s->sock_fd, F_SETFL, delay_flag); > > OpenStack Swift using the Eventlet module, which sets the accepted socket > non-blocking, resulting in twice the number of fcntl() calls. Not a killer > on performance, but it seems simple enough to save a system call here. This would seem to be a simple enough fix, but it seems you are only fixing it if a *redundant* call to setblocking() is made (i.e. one that attempts to set the flag to the value it already has). Why would this be a common pattern? 
Even if it was, is the cost of one extra fcntl() call really worth making the code more complex? -- --Guido van Rossum (python.org/~guido) From peter.a.portante at gmail.com Sun Jan 20 07:31:29 2013 From: peter.a.portante at gmail.com (Peter Portante) Date: Sun, 20 Jan 2013 01:31:29 -0500 Subject: [Python-Dev] Modules/socketmodule.c: avoiding second fcntl() call worth the effort? In-Reply-To: References: Message-ID: I don't have a concrete case where a socket object's setblocking() method is called with a value in one module, handed off to another module (which does not know what the first did with it) which in turn also calls setblocking() with the same value. It certainly seems that that not is a common pattern, but perhaps one could argue a valid pattern, since the state of blocking/nonblocking is maintained in the kernel behind the fcntl() system calls. Here is what I am seeing concretely. This is the syscall pattern from eventlet/wsgi.py + eventlet/greenio.py (after removing redundants call to set_nonblocking (see https://bitbucket.org/portante/eventlet/commits/cc27508f4bbaaea566aecb51cf6c8b4629b083bd)). 
First, these are the call stacks for the three calls to the set_nonblocking() method, made in one HTTP request; the greenio.py:set_nonblocking() method wraps the socketmodule.c:setblocking() method: pid 1385 File "/usr/bin/swift-object-server", line 22, in run_wsgi(conf_file, 'object-server', default_port=6000, **options) File "/usr/lib/python2.6/site-packages/swift/common/wsgi.py", line 194, in run_wsgi run_server(max_clients) File "/usr/lib/python2.6/site-packages/swift/common/wsgi.py", line 158, in run_server wsgi.server(sock, app, NullLogger(), custom_pool=pool) File "/usr/lib/python2.6/site-packages/eventlet/wsgi.py", line 598, in server *client_socket = sock.accept()* File "/usr/lib/python2.6/site-packages/eventlet/greenio.py", line 163, in accept return type(self)(client), addr File "/usr/lib/python2.6/site-packages/eventlet/greenio.py", line 133, in __init__ *set_nonblocking(fd)* pid 1385 File "/usr/lib/python2.6/site-packages/eventlet/greenpool.py", line 80, in _spawn_n_impl func(*args, **kwargs) File "/usr/lib/python2.6/site-packages/eventlet/wsgi.py", line 516, in process_request proto = self.protocol(socket, address, self) File "/usr/lib64/python2.6/SocketServer.py", line 616, in __init__ self.setup() File "/usr/lib/python2.6/site-packages/eventlet/wsgi.py", line 174, in setup *self.rfile = conn.makefile('rb', self.rbufsize)* File "/usr/lib/python2.6/site-packages/eventlet/greenio.py", line 219, in makefile return _fileobject(self.dup(), *args, **kw) File "/usr/lib/python2.6/site-packages/eventlet/greenio.py", line 214, in dup newsock = type(self)(sock) File "/usr/lib/python2.6/site-packages/eventlet/greenio.py", line 133, in __init__ *set_nonblocking(fd)* pid 1385 File "/usr/lib/python2.6/site-packages/eventlet/greenpool.py", line 80, in _spawn_n_impl func(*args, **kwargs) File "/usr/lib/python2.6/site-packages/eventlet/wsgi.py", line 516, in process_request proto = self.protocol(socket, address, self) File "/usr/lib64/python2.6/SocketServer.py", line 
616, in __init__ self.setup() File "/usr/lib/python2.6/site-packages/eventlet/wsgi.py", line 175, in setup *self.wfile = conn.makefile('wb', self.wbufsize)* File "/usr/lib/python2.6/site-packages/eventlet/greenio.py", line 219, in makefile return _fileobject(self.dup(), *args, **kw) File "/usr/lib/python2.6/site-packages/eventlet/greenio.py", line 214, in dup newsock = type(self)(sock) File "/usr/lib/python2.6/site-packages/eventlet/greenio.py", line 133, in __init__ *set_nonblocking(fd)* The first one above is expected, the next two unexpectedly result in fcntl() calls on the same fd. The strace looks like: accept(8, {sa_family=AF_INET, sin_port=htons(54375), sin_addr=inet_addr("127.0.0.1")}, [16]) = 14 fcntl(14, F_GETFL) = 0x2 (flags O_RDWR) fcntl(14, F_SETFL, O_RDWR|O_NONBLOCK) = 0 # *self.rfile = conn.makefile('rb', self.rbufsize)* fcntl(14, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK) fcntl(14, F_SETFL, O_RDWR|O_NONBLOCK) = 0 # *self.wfile = conn.makefile('wb', self.wbufsize)* fcntl(14, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK) fcntl(14, F_SETFL, O_RDWR|O_NONBLOCK) = 0 recvfrom(14, "GET /sdb1/234456/AUTH_del0/gprfc"..., 8192, 0, NULL, NULL) = 353 getsockname(14, {sa_family=AF_INET, sin_port=htons(6010), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0 ... It appears that conn.makefile() is attempting to dup() the fd, but rfile and wfile end up with objects that share the same fd contained in conn. For eventlet/wsgi.py based webservers, OpenStack Swift is the one I am working with right now, handles millions of requests a day on our customer systems. Seems like these suggested code changes are trivial compared to the number of system calls that can be saved. Thanks for indulging on this topic, -peter On Sun, Jan 20, 2013 at 12:38 AM, Guido van Rossum wrote: > On Sat, Jan 19, 2013 at 8:49 PM, Peter Portante > wrote: > > I noticed while stracing a process that sock.setblocking() calls always > > result in pairs of fcntl() calls on Linux. 
Checking 2.6.8, 2.7.3, and > 3.3.0 > > Modules/socketmodule.c, the code seems to use the following (unless I > have > > missed something): > > > > delay_flag = fcntl(s->sock_fd, F_GETFL, 0); > > if (block) > > delay_flag &= (~O_NONBLOCK); > > else > > delay_flag |= O_NONBLOCK; > > fcntl(s->sock_fd, F_SETFL, delay_flag); > > > > Perhaps a check to see the flags changed might be worth making? > > > > int orig_delay_flag = fcntl(s->sock_fd, F_GETFL, 0); > > if (block) > > delay_flag = orig_delay_flag & (~O_NONBLOCK); > > else > > delay_flag = orig_delay_flag | O_NONBLOCK; > > if (delay_flag != orig_delay_flag) > > fcntl(s->sock_fd, F_SETFL, delay_flag); > > > > OpenStack Swift using the Eventlet module, which sets the accepted socket > > non-blocking, resulting in twice the number of fcntl() calls. Not a > killer > > on performance, but it seems simple enough to save a system call here. > > This would seem to be a simple enough fix, but it seems you are only > fixing it if a *redundant* call to setblocking() is made (i.e. one > that attempts to set the flag to the value it already has). Why would > this be a common pattern? Even if it was, is the cost of one extra > fcntl() call really worth making the code more complex? > > -- > --Guido van Rossum (python.org/~guido) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.brandl at gmx.net Sun Jan 20 08:25:10 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 20 Jan 2013 08:25:10 +0100 Subject: [Python-Dev] 2.7.4 In-Reply-To: References: Message-ID: Am 19.01.2013 20:30, schrieb Benjamin Peterson: > It's been almost a year since 2.7.3, so it's time for another 2.7 > bugfix release. > > 2013-02-02 - 2.7.4 release branch created; rc released > 2013-02-16 - 2.7.4 released > > Does this work for you, Martin and Ned? I would propose to sync this with 3.2.4 if there's no argument against it. 
Georg From peter.a.portante at gmail.com Sun Jan 20 22:40:42 2013 From: peter.a.portante at gmail.com (Peter Portante) Date: Sun, 20 Jan 2013 16:40:42 -0500 Subject: [Python-Dev] Modules/socketmodule.c: avoiding second fcntl() call worth the effort? In-Reply-To: References: Message-ID: FWIW, looking into Python's Lib/socket.py module, it would seem that the dup() method of a _sockobject() just transfers the underlying fd, not performing a new socket create. So a caller could know about that behavior and work to call setblocking(0) only when it has a socket not seen before. In eventlet's case, it would appear the code could be taught this fact, and adjusted accordingly. I'll propose a patch for eventlet that will eliminate most of these calls in their case. Thanks for your time, -peter On Sun, Jan 20, 2013 at 1:31 AM, Peter Portante wrote: > I don't have a concrete case where a socket object's setblocking() method > is called with a value in one module, handed off to another module (which > does not know what the first did with it) which in turn also calls > setblocking() with the same value. It certainly seems that that not is a > common pattern, but perhaps one could argue a valid pattern, since the > state of blocking/nonblocking is maintained in the kernel behind the > fcntl() system calls. > > Here is what I am seeing concretely. > > This is the syscall pattern from eventlet/wsgi.py + eventlet/greenio.py > (after removing redundants call to set_nonblocking (see > https://bitbucket.org/portante/eventlet/commits/cc27508f4bbaaea566aecb51cf6c8b4629b083bd)). 
> First, these are the call stacks for the three calls to the > set_nonblocking() method, made in one HTTP request; the > greenio.py:set_nonblocking() method wraps the socketmodule.c:setblocking() > method: > > pid 1385 > File "/usr/bin/swift-object-server", line 22, in > run_wsgi(conf_file, 'object-server', default_port=6000, **options) > File "/usr/lib/python2.6/site-packages/swift/common/wsgi.py", line 194, > in run_wsgi > run_server(max_clients) > File "/usr/lib/python2.6/site-packages/swift/common/wsgi.py", line 158, > in run_server > wsgi.server(sock, app, NullLogger(), custom_pool=pool) > File "/usr/lib/python2.6/site-packages/eventlet/wsgi.py", line 598, in > server > *client_socket = sock.accept()* > File "/usr/lib/python2.6/site-packages/eventlet/greenio.py", line 163, > in accept > return type(self)(client), addr > File "/usr/lib/python2.6/site-packages/eventlet/greenio.py", line 133, > in __init__ > *set_nonblocking(fd)* > > pid 1385 > File "/usr/lib/python2.6/site-packages/eventlet/greenpool.py", line 80, > in _spawn_n_impl > func(*args, **kwargs) > File "/usr/lib/python2.6/site-packages/eventlet/wsgi.py", line 516, in > process_request > proto = self.protocol(socket, address, self) > File "/usr/lib64/python2.6/SocketServer.py", line 616, in __init__ > self.setup() > File "/usr/lib/python2.6/site-packages/eventlet/wsgi.py", line 174, in > setup > *self.rfile = conn.makefile('rb', self.rbufsize)* > File "/usr/lib/python2.6/site-packages/eventlet/greenio.py", line 219, > in makefile > return _fileobject(self.dup(), *args, **kw) > File "/usr/lib/python2.6/site-packages/eventlet/greenio.py", line 214, > in dup > newsock = type(self)(sock) > File "/usr/lib/python2.6/site-packages/eventlet/greenio.py", line 133, > in __init__ > *set_nonblocking(fd)* > > pid 1385 > File "/usr/lib/python2.6/site-packages/eventlet/greenpool.py", line 80, > in _spawn_n_impl > func(*args, **kwargs) > File "/usr/lib/python2.6/site-packages/eventlet/wsgi.py", line 516, in > 
process_request > proto = self.protocol(socket, address, self) > File "/usr/lib64/python2.6/SocketServer.py", line 616, in __init__ > self.setup() > File "/usr/lib/python2.6/site-packages/eventlet/wsgi.py", line 175, in > setup > *self.wfile = conn.makefile('wb', self.wbufsize)* > File "/usr/lib/python2.6/site-packages/eventlet/greenio.py", line 219, > in makefile > return _fileobject(self.dup(), *args, **kw) > File "/usr/lib/python2.6/site-packages/eventlet/greenio.py", line 214, > in dup > newsock = type(self)(sock) > File "/usr/lib/python2.6/site-packages/eventlet/greenio.py", line 133, > in __init__ > *set_nonblocking(fd)* > > The first one above is expected, the next two unexpectedly result in > fcntl() calls on the same fd. The strace looks like: > > accept(8, {sa_family=AF_INET, sin_port=htons(54375), > sin_addr=inet_addr("127.0.0.1")}, [16]) = 14 > fcntl(14, F_GETFL) = 0x2 (flags O_RDWR) > fcntl(14, F_SETFL, O_RDWR|O_NONBLOCK) = 0 > # *self.rfile = conn.makefile('rb', self.rbufsize)* > fcntl(14, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK) > fcntl(14, F_SETFL, O_RDWR|O_NONBLOCK) = 0 > # *self.wfile = conn.makefile('wb', self.wbufsize)* > fcntl(14, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK) > fcntl(14, F_SETFL, O_RDWR|O_NONBLOCK) = 0 > recvfrom(14, "GET /sdb1/234456/AUTH_del0/gprfc"..., 8192, 0, NULL, NULL) = > 353 > getsockname(14, {sa_family=AF_INET, sin_port=htons(6010), > sin_addr=inet_addr("127.0.0.1")}, [16]) = 0 > ... > > It appears that conn.makefile() is attempting to dup() the fd, but rfile > and wfile end up with objects that share the same fd contained in conn. > > For eventlet/wsgi.py based webservers, OpenStack Swift is the one I am > working with right now, handles millions of requests a day on our customer > systems. Seems like these suggested code changes are trivial compared to > the number of system calls that can be saved. 
> > Thanks for indulging on this topic, > > -peter > > > On Sun, Jan 20, 2013 at 12:38 AM, Guido van Rossum wrote: > >> On Sat, Jan 19, 2013 at 8:49 PM, Peter Portante >> wrote: >> > I noticed while stracing a process that sock.setblocking() calls always >> > result in pairs of fcntl() calls on Linux. Checking 2.6.8, 2.7.3, and >> 3.3.0 >> > Modules/socketmodule.c, the code seems to use the following (unless I >> have >> > missed something): >> > >> > delay_flag = fcntl(s->sock_fd, F_GETFL, 0); >> > if (block) >> > delay_flag &= (~O_NONBLOCK); >> > else >> > delay_flag |= O_NONBLOCK; >> > fcntl(s->sock_fd, F_SETFL, delay_flag); >> > >> > Perhaps a check to see the flags changed might be worth making? >> > >> > int orig_delay_flag = fcntl(s->sock_fd, F_GETFL, 0); >> > if (block) >> > delay_flag = orig_delay_flag & (~O_NONBLOCK); >> > else >> > delay_flag = orig_delay_flag | O_NONBLOCK; >> > if (delay_flag != orig_delay_flag) >> > fcntl(s->sock_fd, F_SETFL, delay_flag); >> > >> > OpenStack Swift using the Eventlet module, which sets the accepted >> socket >> > non-blocking, resulting in twice the number of fcntl() calls. Not a >> killer >> > on performance, but it seems simple enough to save a system call here. >> >> This would seem to be a simple enough fix, but it seems you are only >> fixing it if a *redundant* call to setblocking() is made (i.e. one >> that attempts to set the flag to the value it already has). Why would >> this be a common pattern? Even if it was, is the cost of one extra >> fcntl() call really worth making the code more complex? >> >> -- >> --Guido van Rossum (python.org/~guido) >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Sun Jan 20 23:15:25 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Sun, 20 Jan 2013 23:15:25 +0100 Subject: [Python-Dev] Modules/socketmodule.c: avoiding second fcntl() call worth the effort? 
In-Reply-To: References: Message-ID: Since Linux 2.6.27, it's possible to set SOCK_NONBLOCK directly at the creation of the socket (using socket() and accept4()). So you don't need any extra call. I implemented something similar for SOCK_CLOEXEC flag to implement the PEP 433. See for example: http://hg.python.org/features/pep-433/file/1097ffc652f4/Modules/socketmodule.c#l3918 and: http://hg.python.org/features/pep-433/file/1097ffc652f4/Modules/socketmodule.c#l1950 The PEP 433 (Easier suppression of file descriptor inheritance): http://python.org/dev/peps/pep-0433/ But I added a new keyword argument to socket.socket() and socket.socket.accept() for that. I don't know if you can do something similar without adding a new argument. -- Instead of two calls to fcntl() (2 syscalls), you can also use "int opt = 1; ioctl(fd, FIONBIO, &opt);" (1 syscall). Victor From christian at python.org Mon Jan 21 11:29:45 2013 From: christian at python.org (Christian Heimes) Date: Mon, 21 Jan 2013 11:29:45 +0100 Subject: [Python-Dev] cpython (3.2): Issue #16335: Fix integer overflow in unicode-escape decoder. In-Reply-To: <3YqSZP3HzlzSNK@mail.python.org> References: <3YqSZP3HzlzSNK@mail.python.org> Message-ID: <50FD1899.6060505@python.org> Am 21.01.2013 10:46, schrieb serhiy.storchaka: > http://hg.python.org/cpython/rev/7625866f8127 > changeset: 81622:7625866f8127 > branch: 3.2 > parent: 81610:260a9afd999a > user: Serhiy Storchaka > date: Mon Jan 21 11:38:00 2013 +0200 > summary: > Issue #16335: Fix integer overflow in unicode-escape decoder. 
> > files: > Lib/test/test_ucn.py | 16 ++++++++++++++++ > Objects/unicodeobject.c | 3 ++- > 2 files changed, 18 insertions(+), 1 deletions(-) > > > diff --git a/Lib/test/test_ucn.py b/Lib/test/test_ucn.py > --- a/Lib/test/test_ucn.py > +++ b/Lib/test/test_ucn.py > @@ -8,6 +8,7 @@ > """#" > > import unittest > +import _testcapi > > from test import support > > @@ -141,6 +142,21 @@ > str, b"\\NSPACE", 'unicode-escape', 'strict' > ) > > + @unittest.skipUnless(_testcapi.INT_MAX < _testcapi.PY_SSIZE_T_MAX, > + "needs UINT_MAX < SIZE_MAX") > + def test_issue16335(self): > + # very very long bogus character name > + try: > + x = b'\\N{SPACE' + b'x' * (_testcapi.UINT_MAX + 1) + b'}' > + except MemoryError: > + raise unittest.SkipTest("not enough memory") > + self.assertEqual(len(x), len(b'\\N{SPACE}') + (_testcapi.UINT_MAX + 1)) > + self.assertRaisesRegex(UnicodeError, > + 'unknown Unicode character name', > + x.decode, 'unicode-escape' > + ) > + > + The test requires a lot of memory on my 64bit Linux box. Two of three times the test process was killed by the Kernel's out of memory manager although I have 8 GB physical RAM and 4 GB swap. Can you rewrite the test to use less memory? From solipsis at pitrou.net Mon Jan 21 19:13:37 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 21 Jan 2013 19:13:37 +0100 Subject: [Python-Dev] cpython (2.7): Fix memory error in test_ucn. 
References: <3YqVLn69qczMJG@mail.python.org>
Message-ID: <20130121191337.76c8c489@pitrou.net>

On Mon, 21 Jan 2013 12:06:25 +0100 (CET)
serhiy.storchaka wrote:
> diff --git a/Lib/test/test_ucn.py b/Lib/test/test_ucn.py
> --- a/Lib/test/test_ucn.py
> +++ b/Lib/test/test_ucn.py
> @@ -144,13 +144,14 @@
>          # very very long bogus character name
>          try:
>              x = b'\\N{SPACE' + b'x' * int(_testcapi.UINT_MAX + 1) + b'}'
> +            self.assertEqual(len(x), len(b'\\N{SPACE}') +
> +                             (_testcapi.UINT_MAX + 1))
> +            self.assertRaisesRegexp(UnicodeError,
> +                'unknown Unicode character name',
> +                x.decode, 'unicode-escape'
> +            )
>          except MemoryError:
>              raise unittest.SkipTest("not enough memory")
> -        self.assertEqual(len(x), len(b'\\N{SPACE}') + (_testcapi.UINT_MAX + 1))
> -        self.assertRaisesRegexp(UnicodeError,
> -            'unknown Unicode character name',
> -            x.decode, 'unicode-escape'
> -        )

You can't just do that. The test may end up swapping a lot and make the
machine grind to a halt, rather than raise MemoryError. This threatens
to make life miserable for anyone who hasn't enough RAM, but has enough
swap to fit the test's working set.

Really, you have to use a bigmem decorator.

Regards

Antoine.

From marcin.szewczyk at wodny.org  Tue Jan 22 13:16:52 2013
From: marcin.szewczyk at wodny.org (Marcin Szewczyk)
Date: Tue, 22 Jan 2013 13:16:52 +0100
Subject: [Python-Dev] Inconsistent behaviour of methods waiting for child process
Message-ID: <20130122121652.GC3664@magazyn-ziarno.chello.pl>

Hi,

I've previously asked about this issue on Python General but there was
no response and I hope this post qualifies for Python Devel.

I've done some experiments with:
1) multiprocessing.Process.join()
2) os.waitpid()
3) subprocess.Popen.wait()

These three methods behave completely differently when interrupted with
a signal, which I find disturbing. Reactions are:
1) returns, with no exception or special return code
2) OSError exception
3) quiet retry (no exit)

The 1) case is very impractical.
Is there any movement towards standardization of those 3?

Am I missing something, or is there a way to get more information from
Process.join()?

My environment is:

$ python --version
Python 2.7.3rc2
$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 7.0 (wheezy)
Release:        7.0
Codename:       wheezy
$ uname -a
Linux magazyn-ziarno 3.2.0-4-686-pae #1 SMP Debian 3.2.35-2 i686 GNU/Linux

--
Marcin Szewczyk                           http://wodny.org
mailto:Marcin.Szewczyk at wodny.borg      <- remove b / usuń b
xmpp:wodny at ubuntu.pl                   xmpp:wodny at jabster.pl

From shibturn at gmail.com  Tue Jan 22 18:35:06 2013
From: shibturn at gmail.com (Richard Oudkerk)
Date: Tue, 22 Jan 2013 17:35:06 +0000
Subject: [Python-Dev] Inconsistent behaviour of methods waiting for child process
In-Reply-To: <20130122121652.GC3664@magazyn-ziarno.chello.pl>
References: <20130122121652.GC3664@magazyn-ziarno.chello.pl>
Message-ID: 

On 22/01/2013 12:16pm, Marcin Szewczyk wrote:
> The 1) case is very impractical.
>
> Is there any movement towards standardization of those 3?
>
> Am I missing something, or is there a way to get more information from
> Process.join()?

With Process.join(), it looks like version 2.6 raises OSError but
later versions do not. Could you file an issue at bugs.python.org?

You can work around the problem by checking the exitcode attribute, e.g.

    while proc.exitcode is None:
        proc.join()

--
Richard

From solipsis at pitrou.net  Tue Jan 22 21:26:07 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 22 Jan 2013 21:26:07 +0100
Subject: [Python-Dev] hg.python.org Mercurial upgrade
Message-ID: <20130122212607.3e44d487@pitrou.net>

Hello,

I've upgraded the Mercurial version on hg.python.org. If there are any
problems, don't hesitate to post here.

(Apart from the connectivity problems we seem to have from time to time,
which shouldn't be related.)

Regards

Antoine.
From chris at python.org  Wed Jan 23 00:23:58 2013
From: chris at python.org (Chris Withers)
Date: Tue, 22 Jan 2013 23:23:58 +0000
Subject: [Python-Dev] slightly misleading Popen.poll() docs
In-Reply-To: <50BF811C.4070708@pearwood.info>
References: <50BF718E.5080604@simplistix.co.uk> <50BF811C.4070708@pearwood.info>
Message-ID: <50FF1F8E.8020400@python.org>

On 05/12/2012 17:15, Steven D'Aprano wrote:
> """
> Check if child process has terminated. Returns None while the child is
> still running, any non-None value means that the child has terminated.
> In either case, the return value is also available from the instance's
> returncode attribute.
> """

Do you want to make the commit or shall I?

Chris

--
Simplistix - Content Management, Batch Processing & Python Consulting
            - http://www.simplistix.co.uk

From marcin.s.szewczyk at gmail.com  Wed Jan 23 16:20:32 2013
From: marcin.s.szewczyk at gmail.com (Marcin Szewczyk)
Date: Wed, 23 Jan 2013 16:20:32 +0100
Subject: [Python-Dev] Inconsistent behaviour of methods waiting for child process
In-Reply-To: 
References: <20130122121652.GC3664@magazyn-ziarno.chello.pl>
Message-ID: <20130123152032.GA3838@magazyn-ziarno.chello.pl>

On Tue, Jan 22, 2013 at 05:35:06PM +0000, Richard Oudkerk wrote:
> With Process.join(), it looks like version 2.6 raises OSError but
> later versions do not. Could you file an issue at bugs.python.org?

Done. I've linked an associated bug report and the commit changing that
behaviour.

http://bugs.python.org/issue17018

Thanks for the hint on looping.

--
Marcin Szewczyk                           http://wodny.org
mailto:Marcin.Szewczyk at wodny.borg      <- remove b / usuń
b xmpp:wodny at ubuntu.pl xmpp:wodny at jabster.pl From amauryfa at gmail.com Wed Jan 23 20:41:11 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 23 Jan 2013 20:41:11 +0100 Subject: [Python-Dev] hg.python.org Mercurial upgrade In-Reply-To: <20130122212607.3e44d487@pitrou.net> References: <20130122212607.3e44d487@pitrou.net> Message-ID: 2013/1/22 Antoine Pitrou > I've upgraded the Mercurial version on hg.python.org. If there any > problems, don't hesitate to post here. > I've noticed a display glitch with the hg viewer: http://hg.python.org/cpython/rev/6df0b4ed8617#l2.8 There is a "[#14591]" link which causes the rest of the line to be shifted. -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Wed Jan 23 20:43:27 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 23 Jan 2013 20:43:27 +0100 Subject: [Python-Dev] hg.python.org Mercurial upgrade In-Reply-To: References: <20130122212607.3e44d487@pitrou.net> Message-ID: <20130123204327.27229755@pitrou.net> On Wed, 23 Jan 2013 20:41:11 +0100 "Amaury Forgeot d'Arc" wrote: > 2013/1/22 Antoine Pitrou > > > I've upgraded the Mercurial version on hg.python.org. If there any > > problems, don't hesitate to post here. > > > > I've noticed a display glitch with the hg viewer: > http://hg.python.org/cpython/rev/6df0b4ed8617#l2.8 > There is a "[#14591]" link which causes the rest of the line to be shifted. Indeed. This is not because of the upgrade, but because of a new regexp Ezio asked me to insert in the Web UI configuration :-) Regards Antoine. 
From amauryfa at gmail.com Wed Jan 23 21:36:21 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 23 Jan 2013 21:36:21 +0100 Subject: [Python-Dev] hg.python.org Mercurial upgrade In-Reply-To: <20130123204327.27229755@pitrou.net> References: <20130122212607.3e44d487@pitrou.net> <20130123204327.27229755@pitrou.net> Message-ID: 2013/1/23 Antoine Pitrou > On Wed, 23 Jan 2013 20:41:11 +0100 > "Amaury Forgeot d'Arc" wrote: > > 2013/1/22 Antoine Pitrou > > > > > I've upgraded the Mercurial version on hg.python.org. If there any > > > problems, don't hesitate to post here. > > > > > > > I've noticed a display glitch with the hg viewer: > > http://hg.python.org/cpython/rev/6df0b4ed8617#l2.8 > > There is a "[#14591]" link which causes the rest of the line to be > shifted. > > Indeed. This is not because of the upgrade, but because of a new regexp > Ezio asked me to insert in the Web UI configuration :-) Also I noticed that the table at the top (author, date, etc) is not nicely aligned; the first column should be wider and "change baseline" (btw, what does this mean?) should not wrap. Experiments in Chrome suggest to change a line in static/style-paper.css, for the "#changesetEntry th" rule: "width: 8em;" -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Wed Jan 23 21:41:24 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 23 Jan 2013 21:41:24 +0100 Subject: [Python-Dev] hg.python.org Mercurial upgrade In-Reply-To: References: <20130122212607.3e44d487@pitrou.net> <20130123204327.27229755@pitrou.net> Message-ID: <20130123214124.1083394c@pitrou.net> On Wed, 23 Jan 2013 21:36:21 +0100 "Amaury Forgeot d'Arc" wrote: > 2013/1/23 Antoine Pitrou > > > On Wed, 23 Jan 2013 20:41:11 +0100 > > "Amaury Forgeot d'Arc" wrote: > > > 2013/1/22 Antoine Pitrou > > > > > > > I've upgraded the Mercurial version on hg.python.org. If there any > > > > problems, don't hesitate to post here. 
> > > > > > > > > > I've noticed a display glitch with the hg viewer: > > > http://hg.python.org/cpython/rev/6df0b4ed8617#l2.8 > > > There is a "[#14591]" link which causes the rest of the line to be > > shifted. > > > > Indeed. This is not because of the upgrade, but because of a new regexp > > Ezio asked me to insert in the Web UI configuration :-) > > > Also I noticed that the table at the top (author, date, etc) is not nicely > aligned; the first column should be wider and "change baseline" (btw, what > does this mean?) should not wrap. I don't know, that's a new thing in hgweb apparently. The issue should be reported in http://bz.selenic.com/, since that's nothing python.org-specific AFAIR. Regards Antoine. From db3l.net at gmail.com Wed Jan 23 23:37:53 2013 From: db3l.net at gmail.com (David Bolen) Date: Wed, 23 Jan 2013 17:37:53 -0500 Subject: [Python-Dev] hg.python.org Mercurial upgrade References: <20130122212607.3e44d487@pitrou.net> Message-ID: Antoine Pitrou writes: > I've upgraded the Mercurial version on hg.python.org. If there are any > problems, don't hesitate to post here. > (apart from the connectivity problems we seem to have from time to time > and which shouldn't be related) I'm not sure if this is related to the upgrade specifically (or just some other hg issue), but yesterday my Windows 7 buildbot failed an hg pull operation on the 3.x branch and automatically removed its local repository. Ever since then an attempt to start fresh (with an hg clone) is dying during the clone operation. I'm running the clone operation manually now, and after sitting there for 10 minutes or so at the "adding changesets" stage, it died just like in the buildbot logs ("transaction abort!" followed by "rollback completed"). Nothing was happening on the buildbot during the adding changesets delay so I'm assuming that was all server-side.
-- David From ezio.melotti at gmail.com Thu Jan 24 03:16:29 2013 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Thu, 24 Jan 2013 04:16:29 +0200 Subject: [Python-Dev] hg.python.org Mercurial upgrade In-Reply-To: <20130123204327.27229755@pitrou.net> References: <20130122212607.3e44d487@pitrou.net> <20130123204327.27229755@pitrou.net> Message-ID: On Wed, Jan 23, 2013 at 9:43 PM, Antoine Pitrou wrote: > On Wed, 23 Jan 2013 20:41:11 +0100 > "Amaury Forgeot d'Arc" wrote: >> 2013/1/22 Antoine Pitrou >> >> > I've upgraded the Mercurial version on hg.python.org. If there any >> > problems, don't hesitate to post here. >> > >> >> I've noticed a display glitch with the hg viewer: >> http://hg.python.org/cpython/rev/6df0b4ed8617#l2.8 >> There is a "[#14591]" link which causes the rest of the line to be shifted. > > Indeed. This is not because of the upgrade, but because of a new regexp > Ezio asked me to insert in the Web UI configuration :-) > FWIW this was an attempt to fix the links to issues in http://hg.python.org/cpython/. AFAIU the interhg extension used here to turn "#12345" to links affects at least 3 places: 1) the description of each changeset in the "shortlog" page (e.g. http://hg.python.org/cpython/); 2) the description at the top of the "rev" page (e.g. http://hg.python.org/cpython/rev/6df0b4ed8617); 3) the code in the "diff"/"rev"/"annotate" and possibly other pages (e.g. http://hg.python.org/cpython/rev/6df0b4ed8617#l2.6); With the previous solution, case 1 was broken, but links for cases 2-3 worked fine. The problem is that in 1 the description is already a link, so the result ended up being something like 'Issue #12345 is now fixed'. With the new solution 1-2 work (the links are added/moved at the end), but it's glitched for case 3. 
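The interhg-style rewrite Ezio describes can be approximated in a few lines of Python. This is only a hedged sketch for illustration: the real extension is configured inside hgweb, and its exact regexp and templates are not shown in this thread.

```python
import re

# Approximation of the interhg-style rewrite: turn "#12345" into a
# link to the Python issue tracker.
ISSUE_RE = re.compile(r"#(\d{4,5})\b")

def linkify(text):
    """Replace issue references like '#12345' with tracker links."""
    return ISSUE_RE.sub(
        r'<a href="http://bugs.python.org/issue\1">#\1</a>', text)

# The breakage in case 1: when the description is itself already
# wrapped in a link, a naive substitution like this one nests an <a>
# element inside another <a>, producing invalid HTML.
print(linkify("Issue #14591 is now fixed"))
```

Running the last line shows the substitution on a plain description; the problem cases discussed above arise only when the input is already marked up.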
Unless interhg provides a way to limit the replacement only to specific places and/or use different replacements for different places, we will either have to live with these glitches or come up with a proper fix done at the right level. Best Regards, Ezio Melotti > Regards > > Antoine. From chris.jerdonek at gmail.com Thu Jan 24 03:29:24 2013 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Wed, 23 Jan 2013 18:29:24 -0800 Subject: [Python-Dev] hg.python.org Mercurial upgrade In-Reply-To: References: <20130122212607.3e44d487@pitrou.net> <20130123204327.27229755@pitrou.net> Message-ID: On Wed, Jan 23, 2013 at 6:16 PM, Ezio Melotti wrote: > On Wed, Jan 23, 2013 at 9:43 PM, Antoine Pitrou wrote: >> On Wed, 23 Jan 2013 20:41:11 +0100 >> "Amaury Forgeot d'Arc" wrote: >>> 2013/1/22 Antoine Pitrou >>> >>> > I've upgraded the Mercurial version on hg.python.org. If there any >>> > problems, don't hesitate to post here. >>> > >>> >>> I've noticed a display glitch with the hg viewer: >>> http://hg.python.org/cpython/rev/6df0b4ed8617#l2.8 >>> There is a "[#14591]" link which causes the rest of the line to be shifted. >> >> Indeed. This is not because of the upgrade, but because of a new regexp >> Ezio asked me to insert in the Web UI configuration :-) >> > > FWIW this was an attempt to fix the links to issues in > http://hg.python.org/cpython/. > AFAIU the interhg extension used here to turn "#12345" to links > affects at least 3 places: > 1) the description of each changeset in the "shortlog" page (e.g. > http://hg.python.org/cpython/); > 2) the description at the top of the "rev" page (e.g. > http://hg.python.org/cpython/rev/6df0b4ed8617); > 3) the code in the "diff"/"rev"/"annotate" and possibly other pages > (e.g. http://hg.python.org/cpython/rev/6df0b4ed8617#l2.6); > With the previous solution, case 1 was broken, but links for cases 2-3 > worked fine. 
The problem is that in 1 the description is already a > link, so the result ended up being something like ' href="rev/...">Issue #12345 is now > fixed'. > With the new solution 1-2 work (the links are added/moved at the end), > but it's glitched for case 3. > Unless interhg provides a way to limit the replacement only to > specific places and/or use different replacements for different > places, we will either have to live with these glitches or come up > with a proper fix done at the right level. How does the above relate to this issue? http://bugs.python.org/issue15919 --Chris From ezio.melotti at gmail.com Thu Jan 24 04:11:02 2013 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Thu, 24 Jan 2013 05:11:02 +0200 Subject: [Python-Dev] hg.python.org Mercurial upgrade In-Reply-To: References: <20130122212607.3e44d487@pitrou.net> <20130123204327.27229755@pitrou.net> Message-ID: On Thu, Jan 24, 2013 at 4:29 AM, Chris Jerdonek wrote: > On Wed, Jan 23, 2013 at 6:16 PM, Ezio Melotti wrote: >> On Wed, Jan 23, 2013 at 9:43 PM, Antoine Pitrou wrote: >>> On Wed, 23 Jan 2013 20:41:11 +0100 >>> "Amaury Forgeot d'Arc" wrote: >>>> 2013/1/22 Antoine Pitrou >>>> >>>> > I've upgraded the Mercurial version on hg.python.org. If there any >>>> > problems, don't hesitate to post here. >>>> > >>>> >>>> I've noticed a display glitch with the hg viewer: >>>> http://hg.python.org/cpython/rev/6df0b4ed8617#l2.8 >>>> There is a "[#14591]" link which causes the rest of the line to be shifted. >>> >>> Indeed. This is not because of the upgrade, but because of a new regexp >>> Ezio asked me to insert in the Web UI configuration :-) >>> >> >> FWIW this was an attempt to fix the links to issues in >> http://hg.python.org/cpython/. >> AFAIU the interhg extension used here to turn "#12345" to links >> affects at least 3 places: >> 1) the description of each changeset in the "shortlog" page (e.g. >> http://hg.python.org/cpython/); >> 2) the description at the top of the "rev" page (e.g. 
>> http://hg.python.org/cpython/rev/6df0b4ed8617); >> 3) the code in the "diff"/"rev"/"annotate" and possibly other pages >> (e.g. http://hg.python.org/cpython/rev/6df0b4ed8617#l2.6); >> With the previous solution, case 1 was broken, but links for cases 2-3 >> worked fine. The problem is that in 1 the description is already a >> link, so the result ended up being something like '> href="rev/...">Issue #12345 is now >> fixed'. >> With the new solution 1-2 work (the links are added/moved at the end), >> but it's glitched for case 3. >> Unless interhg provides a way to limit the replacement only to >> specific places and/or use different replacements for different >> places, we will either have to live with these glitches or come up >> with a proper fix done at the right level. > > How does the above relate to this issue? > > http://bugs.python.org/issue15919 > This is exactly the problem I fixed. I added a few more comments on the issue and closed it. Best Regards, Ezio Melotti > --Chris From solipsis at pitrou.net Thu Jan 24 11:28:53 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 24 Jan 2013 11:28:53 +0100 Subject: [Python-Dev] hg.python.org Mercurial upgrade References: <20130122212607.3e44d487@pitrou.net> Message-ID: <20130124112853.7a371da4@pitrou.net> Le Wed, 23 Jan 2013 17:37:53 -0500, David Bolen a écrit : > Antoine Pitrou writes: > > > I've upgraded the Mercurial version on hg.python.org. If there are any > > problems, don't hesitate to post here. > > (apart from the connectivity problems we seem to have from time to > > time and which shouldn't be related) > > I'm not sure if this is related to the upgrade specifically (or just > some other hg issue), but yesterday my Windows 7 buildbot failed an hg > pull operation on the 3.x branch and automatically removed its local > repository. Ever since then an attempt to start fresh (with an hg > clone) is dying during the clone operation.
> > I'm running the clone operation manually now, and after sitting there > for 10 minutes or so at the "adding changesets" stage, it died just > like in the buildbot logs ("transaction abort!" followed by "rollback > completed"). Nothing was happening on the buildbot during the adding > changesets delay so I'm assuming that was all server-side. Given the delay (10 minutes), I'm assuming this has more to do with the connectivity problem Noah is trying to investigate: http://mail.python.org/pipermail/catalog-sig/2013-January/004736.html Regards Antoine. From victor.stinner at gmail.com Fri Jan 25 00:20:28 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 25 Jan 2013 00:20:28 +0100 Subject: [Python-Dev] Implementation of the PEP 433 Message-ID: Hi, I implemented PEP 433 in the following repository: http://hg.python.org/features/pep-433 You can use it if you would like to experiment with and test PEP 433. PEP 433: "Easier suppression of file descriptor inheritance" http://www.python.org/dev/peps/pep-0433/ I added a PYTHONCLOEXEC environment variable, an "-e" option, and a sys.setdefaultcloexec() function. At startup, the default value of the cloexec parameter is False. I tested my implementation on Linux, Windows, Mac OS X, FreeBSD, OpenBSD and OpenIndiana. Victor From victor.stinner at gmail.com Fri Jan 25 00:58:13 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 25 Jan 2013 00:58:13 +0100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter Message-ID: Hi, PEP 433 proposes different options to add a new cloexec parameter: a) cloexec=False by default b) cloexec=True by default c) configurable default value I tried to list in PEP 433 the advantages and drawbacks of each option.
If I recorded opinions correctly, the different options have the following supporters: a) cloexec=False by default b) cloexec=True by default: Charles-François Natali c) configurable default value: Antoine Pitrou, Nick Coghlan, Guido van Rossum I don't know if it's enough to say that (c) is the chosen option. -- For the name of the new parameter, I still prefer "cloexec" for the reason given by Antoine: on UNIX, the flag only has an effect on exec(), not on fork(). So inherited=False can be misleading on UNIX. I updated the rationale of the PEP to describe the close-on-exec flag: http://www.python.org/dev/peps/pep-0433/#rationale -- FYI on Windows, cloexec=False (inherited=True) doesn't mean that the file descriptor will be inherited. It also depends on the bInheritHandles parameter of the CreateProcess() function. Extract from the PEP: "On Windows, the file descriptor is not inherited if the close-on-exec flag is set, the file descriptor is inherited by child processes if the flag is cleared and if CreateProcess() is called with the bInheritHandles parameter set to TRUE (when subprocess.Popen is created with close_fds=False for example)." Victor From cf.natali at gmail.com Fri Jan 25 09:56:52 2013 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Fri, 25 Jan 2013 09:56:52 +0100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: References: Message-ID: Hello, > I tried to list in PEP 433 the advantages and drawbacks of each option. > > If I recorded opinions correctly, the different options have the > following supporters: > > a) cloexec=False by default > b) cloexec=True by default: Charles-François Natali > c) configurable default value: Antoine Pitrou, Nick Coghlan, Guido van Rossum You can actually count me in the cloexec=False camp, and against the idea of a configurable default value.
Here's why: Why cloexec shouldn't be set by default: - While it's really tempting to fix one of Unix historical worst decisions, I don't think we can set file descriptors cloexec by default: this would break some applications (I don't think there would be too many of them, but still), but most notably, this would break POSIX semantics. If Python didn't expose POSIX syscalls and file descriptors, but only high-level file streams/sockets/etc, then we could probably go ahead, but now it's too late. Someone said earlier on python-dev that many people use Python for prototyping, and indeed, when using POSIX API, you expect POSIX semantics. Why the default value shouldn't be tunable: - I think it's useless: if the default cloexec behavior can be altered (either by a command-line flag, an environment variable or a sys module function), then libraries cannot rely on it and have to make file descriptors cloexec on an individual basis, since the default flag can be disabled. So it would basically be useless for the Python standard library, and any third-party library. So the only use case is for application writers that use raw exec() (since subprocess already closes file descriptors > 3, and AFAICT we don't expose a way to create processes "manually" on Windows), but there I think they fall into two categories: those who are aware of the problem of file descriptor inheritance, and who therefore set their FDs cloexec manually, and those who are not familiar with this issue, and who'll never look up a sys.setdefaultcloexec() tunable (and if they do, they might think: "Hey, if that's so nice, why isn't it on by default? Wait, it might break applications? I'll just leave the default then."). - But most importantly, I think such a tunable flag is a really wrong idea because it's a global tunable that alters the underlying operating system semantics. 
Consider this code: """ r, w = os.pipe() if os.fork() == 0: os.execv('myprog', ['myprog']) """ With a tunable flag, just by looking at this code, you have no way to know whether the file descriptor will be inherited by the child process. That would be introducing a hidden global variable silently changing the semantics of the underlying operating system, and that's just so wrong. Sure, we do have global tunables: """ sys.setcheckinterval() sys.setrecursionlimit() sys.setswitchinterval() hash_randomization """ But those alter "extralinguistic" behavior, i.e. they don't affect the semantics of the language or underlying operating system in a way that would break or change the behavior of a "conforming" program. Although it's not as bad, just to belabor the point, imagine we introduced a new method: """ sys.enable_integer_division(boolean) Depending on the value of this flag, the division of two integers will either yield a floating point or truncated integer value. """ Global variables are bad, hidden global variables are worse, and hidden global variables altering language/operating system semantics are evil :-) What I'd like to see: - Adding a "cloexec" parameter to file descriptor creating functions/classes is fine, it will make it easier for a library/application writer to create file descriptors cloexec, especially in an atomic way. - We should go over the standard library, and create FDs cloexec if they're not handed over to the caller, either because they're opened/closed before returning, or because the underlying file descriptor is kept private (no fileno() method, although it's relatively rare). That's the approach chosen by glibc, and it makes sense: if another thread forks() while a thread is in the middle of getpwnam(), you don't want to leak an open file descriptor to /etc/passwd (or /etc/shadow).
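The "atomic way" mentioned in the last paragraph can be sketched as follows. This is a POSIX-only illustration, not the PEP's actual implementation: os.pipe2() (Python 3.3+, Linux and some BSDs) sets the flag atomically at creation, while the fcntl() fallback leaves a window in which a concurrent fork()+exec() in another thread can still inherit the descriptors.

```python
import fcntl
import os

def pipe_cloexec():
    """Create a pipe with both ends close-on-exec (POSIX only).

    Prefers os.pipe2(O_CLOEXEC), which is atomic; falls back to
    setting FD_CLOEXEC with fcntl(), which is not.
    """
    if hasattr(os, "pipe2"):
        return os.pipe2(os.O_CLOEXEC)
    r, w = os.pipe()
    for fd in (r, w):
        flags = fcntl.fcntl(fd, fcntl.F_GETFD)
        fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
    return r, w

r, w = pipe_cloexec()
# Both ends now carry FD_CLOEXEC: they will be closed across exec().
assert fcntl.fcntl(r, fcntl.F_GETFD) & fcntl.FD_CLOEXEC
assert fcntl.fcntl(w, fcntl.F_GETFD) & fcntl.FD_CLOEXEC
os.close(r)
os.close(w)
```

The glibc approach described above amounts to using exactly this kind of helper for every descriptor the library keeps private.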
cf From ncoghlan at gmail.com Fri Jan 25 12:07:54 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 25 Jan 2013 21:07:54 +1000 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: References: Message-ID: On Fri, Jan 25, 2013 at 6:56 PM, Charles-François Natali wrote: > Hello, > >> I tried to list in PEP 433 the advantages and drawbacks of each option. >> >> If I recorded opinions correctly, the different options have the >> following supporters: >> >> a) cloexec=False by default >> b) cloexec=True by default: Charles-François Natali >> c) configurable default value: Antoine Pitrou, Nick Coghlan, Guido van Rossum > > You can actually count me in the cloexec=False camp, and against the > idea of a configurable default value. Here's why: > > Why cloexec shouldn't be set by default: > - While it's really tempting to fix one of Unix historical worst > decisions, I don't think we can set file descriptors cloexec by > default: this would break some applications (I don't think there would > be too many of them, but still), but most notably, this would break > POSIX semantics. If Python didn't expose POSIX syscalls and file > descriptors, but only high-level file streams/sockets/etc, then we > could probably go ahead, but now it's too late. Someone said earlier > on python-dev that many people use Python for prototyping, and indeed, > when using POSIX API, you expect POSIX semantics. > > Why the default value shouldn't be tunable: > - I think it's useless: if the default cloexec behavior can be altered > (either by a command-line flag, an environment variable or a sys > module function), then libraries cannot rely on it and have to make > file descriptors cloexec on an individual basis, since the default > flag can be disabled. So it would basically be useless for the Python > standard library, and any third-party library.
So the only use case is for application writers that use raw exec() (since subprocess already > closes file descriptors > 3, and AFAICT we don't expose a way to > create processes "manually" on Windows), but there I think they fall > into two categories: those who are aware of the problem of file > descriptor inheritance, and who therefore set their FDs cloexec > manually, and those who are not familiar with this issue, and who'll > never look up a sys.setdefaultcloexec() tunable (and if they do, they > might think: "Hey, if that's so nice, why isn't it on by default? > Wait, it might break applications? I'll just leave the default > then."). It's a configurable setting in the same way that -Q makes the behaviour of "/" configurable in Python 2 (so your hypothetical example isn't hypothetical at all - it's a description of the -Q option), and -R makes random hashing configurable in 2.7 and 3.2: it means we can change the default behaviour in a future version (perhaps Python 4) while allowing people to easily check if their code operates correctly in that state in the current version. I think the default behaviour needs to be configurable from the environment and the command line, but I don't believe it should be configurable from within the interpreter. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at gmail.com Fri Jan 25 12:24:15 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 25 Jan 2013 12:24:15 +0100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: References: Message-ID: 2013/1/25 Charles-François Natali : > You can actually count me in the cloexec=False camp, and against the > idea of a configurable default value. Oh ok.
> Why cloexec shouldn't be set by default: > - While it's really tempting to fix one of Unix historical worst > decisions, I don't think we can set file descriptors cloexec by > default: this would break some applications (I don't think there would > be too many of them, but still), but most notably, this would break > POSIX semantics. (...) Agreed. > Why the default value shouldn't be tunable: > - I think it's useless: if the default cloexec behavior can be altered > (either by a command-line flag, an environment variable or a sys > module function), then libraries cannot rely on it and have to make > file descriptors cloexec on an individual basis, since the default > flag can be disabled. In my experience, in most cases, the default value of cloexec just doesn't matter at all. If your program relies on the state of the close-on-exec flag, you have to make it explicit and specify the cloexec parameter. Example: os.pipe(cloexec=True). If you don't modify your application, it will just not work with the -e command line option (or the PYTHONCLOEXEC environment variable). But why would you enable cloexec by default if your application is not compatible?
The problem is that Python can be embedded in an application: the application can start a child process independently of Python. > - But most importantly, I think such a tunable flag is a really wrong > idea because it's a global tunable that alters the underlying > operating system semantics. Consider this code: > """ > r, w = os.pipe() > if os.fork() == 0: > os.execv('myprog', ['myprog']) > """ If this snippet doesn't work with cloexec enabled by default, you have to write: os.pipe(cloexec=False). > - We should go over the standard library, and create FDs cloexec if > they're not handed over to the caller, either because they're > opened/closed before returning, or because the underlying file > descriptor is kept private (no fileno() method, although it's > relatively rare). That's the approach chosen by glibc, and it makes > sense: if another thread forks() while a thread is in the middle of > getpwnam(), you don't want to leak an open file descriptor to > /etc/passwd (or /etc/shadow). I started to make cloexec explicit *everywhere* in the Python stdlib. I reverted my commit because I think that only a few applications start child processes, so doing extra work (additional syscalls to set cloexec) would slow down Python for no gain. See my revert commit to see how many functions need to be modified: http://hg.python.org/features/pep-433/rev/963e450fc24f That's why I really like the idea of being able to configure the default value of the cloexec parameter. By default: no overhead nor backward compatibility issue. Whereas you can set the close-on-exec flag *everywhere* if you are concerned by all the issues of inheriting file descriptors (cases listed in the PEP). In the stdlib, I only specified the cloexec parameter where it was required to ensure that it works with any default value. Only a few modules were modified (subprocess, multiprocessing, ...).
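Making the flag explicit, as Victor suggests, can be emulated today with fcntl. This is only a sketch: the os.pipe(cloexec=...) signature exists only in the PEP 433 feature branch, so here FD_CLOEXEC is toggled by hand after creation.

```python
import fcntl
import os

def set_cloexec(fd, cloexec):
    """Explicitly set or clear FD_CLOEXEC on fd (POSIX only).

    Emulates the effect of the PEP's cloexec parameter for code that
    must not depend on whatever the global default happens to be.
    """
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    if cloexec:
        flags |= fcntl.FD_CLOEXEC
    else:
        flags &= ~fcntl.FD_CLOEXEC
    fcntl.fcntl(fd, fcntl.F_SETFD, flags)

# A program that hands the write end to a child after exec() states
# the inheritance explicitly instead of relying on the default:
r, w = os.pipe()
set_cloexec(r, True)   # parent-only end: closed across exec()
set_cloexec(w, False)  # must survive exec() in the child
```

With this style, the snippet quoted above keeps working regardless of any process-wide default, which is exactly the "make it explicit" rule.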
Victor From victor.stinner at gmail.com Fri Jan 25 12:28:10 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 25 Jan 2013 12:28:10 +0100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: References: Message-ID: > I think the default behaviour needs to be configurable from the > environment and the command line, but I don't believe it should be > configurable from within the interpreter. sys.setdefaultcloexec() is convenient for unit tests, but it may also be used by modules. A web framework may want to enable the close-on-exec flag by default. The drawback of changing the default value after Python has started is that Python may have created file descriptors before, so you cannot guarantee that all existing file descriptors have the flag set. Well, I don't know if sys.setdefaultcloexec() is a good idea or not :-) Victor From solipsis at pitrou.net Fri Jan 25 12:36:03 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 25 Jan 2013 12:36:03 +0100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter References: Message-ID: <20130125123603.430ee61c@pitrou.net> Le Fri, 25 Jan 2013 12:28:10 +0100, Victor Stinner a écrit : > > I think the default behaviour needs to be configurable from the > > environment and the command line, but I don't believe it should be > > configurable from within the interpreter. > > sys.setdefaultcloexec() is convenient for unit tests, but it may also > be used by modules. A web framework may want to enable the close-on-exec > flag by default. > > The drawback of changing the default value after Python has started is > that Python may have created file descriptors before, so you cannot > guarantee that all existing file descriptors have the flag set. > > Well, I don't know if sys.setdefaultcloexec() is a good idea or > not :-) Both Charles-François and Nick have good points.
sys.setdefaultcloexec() is still useful if you want to force the policy from a Python script's toplevel (it's more practical than trying to fit a command-line option in the shebang line). Regards Antoine. From ncoghlan at gmail.com Fri Jan 25 13:07:46 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 25 Jan 2013 22:07:46 +1000 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: <20130125123603.430ee61c@pitrou.net> References: <20130125123603.430ee61c@pitrou.net> Message-ID: On Fri, Jan 25, 2013 at 9:36 PM, Antoine Pitrou wrote: > Le Fri, 25 Jan 2013 12:28:10 +0100, > Victor Stinner a écrit : >> > I think the default behaviour needs to be configurable from the >> > environment and the command line, but I don't believe it should be >> > configurable from within the interpreter. >> >> sys.setdefaultcloexec() is convenient for unit tests, but it may also >> be used by modules. A web framework may want to enable the close-on-exec >> flag by default. >> >> The drawback of changing the default value after Python has started is >> that Python may have created file descriptors before, so you cannot >> guarantee that all existing file descriptors have the flag set. >> >> Well, I don't know if sys.setdefaultcloexec() is a good idea or >> not :-) > > Both Charles-François and Nick have good points. > sys.setdefaultcloexec() is still useful if you want to force the > policy from a Python script's toplevel (it's more practical than > trying to fit a command-line option in the shebang line). Right, I'm only -0 on that aspect. It strikes me as somewhat dubious, but it's not obviously wrong the way a runtime equivalent of -Q or -R would be. Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Fri Jan 25 13:12:19 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 25 Jan 2013 22:12:19 +1000 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: References: <20130125123603.430ee61c@pitrou.net> Message-ID: On Fri, Jan 25, 2013 at 10:07 PM, Nick Coghlan wrote: > On Fri, Jan 25, 2013 at 9:36 PM, Antoine Pitrou wrote: >> Le Fri, 25 Jan 2013 12:28:10 +0100, >> Victor Stinner a écrit : >>> Well, I don't know if sys.setdefaultcloexec() is a good idea or >>> not :-) >> >> Both Charles-François and Nick have good points. >> sys.setdefaultcloexec() is still useful if you want to force the >> policy from a Python script's toplevel (it's more practical than >> trying to fit a command-line option in the shebang line). > > Right, I'm only -0 on that aspect. It strikes me as somewhat dubious, > but it's not obviously wrong the way a runtime equivalent of -Q or -R > would be. I just realised I could be converted to a +0 if the runtime switch could only be used to set the global default as "cloexec=True" and couldn't be used to switch it back again (for testing purposes, if you only want to switch it on temporarily, use a subprocess). That way, as soon as you saw "sys.setdefaultcloexec()" at the beginning of __main__, you'd know descriptors would only be inherited when cloexec=False was set explicitly. If the default flag can also be turned off globally (rather than being a one-way switch), then you have no idea what libraries might do behind your back. Cheers, Nick.
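The one-way switch Nick describes can be sketched in pure Python. The names below are purely illustrative, not the PEP's actual API: the point is only that the setter takes no argument, so the default can be enabled but never disabled again.

```python
# Hypothetical sketch of a "one-way" process-wide default: once
# enabled, cloexec-by-default stays on for the rest of the process.
_default_cloexec = False

def setdefaultcloexec():
    """Enable cloexec-by-default for the rest of the process."""
    global _default_cloexec
    # Takes no argument on purpose: there is no way to turn it back off.
    _default_cloexec = True

def getdefaultcloexec():
    """Return the current process-wide default."""
    return _default_cloexec

assert getdefaultcloexec() is False  # initial state
setdefaultcloexec()
assert getdefaultcloexec() is True   # and it stays True from here on
```

A test that needs the opposite default would run in a subprocess, as suggested above, rather than flipping the switch back.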
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at gmail.com Fri Jan 25 15:54:21 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 25 Jan 2013 15:54:21 +0100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: References: <20130125123603.430ee61c@pitrou.net> Message-ID: 2013/1/25 Nick Coghlan : > I just realised I could be converted to a +0 if the runtime > switch could only be used to set the global default as "cloexec=True" > and couldn't be used to switch it back again (for testing purposes, if > you only want to switch it on temporarily, use a subprocess). (...) Oh, I like this idea. It does simplify many things :-) (And I agree that subprocess can be used to run a test which requires cloexec to be True by default.) -- I tried to be future-proof. If we decide to enable the close-on-exec flag globally by default, how do you disable the flag globally? We may add an inverse command line option and another environment variable (ex: PYTHONNOCLOEXEC), but what about sys.setdefaultcloexec()? In a previous version, my implementation expected an argument for PYTHONCLOEXEC: PYTHONCLOEXEC=0 or PYTHONCLOEXEC=1. I realized that it's not how other options (like PYTHONDONTWRITEBYTECODE) are designed. But do we really want to enable close-on-exec in the future? Charles-François has really good arguments against such a choice :-) It may be better to consider that the default at startup will always be False. So we should only provide different ways to set the default to True.
Victor From status at bugs.python.org Fri Jan 25 18:07:13 2013 From: status at bugs.python.org (Python tracker) Date: Fri, 25 Jan 2013 18:07:13 +0100 (CET) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20130125170713.5A89E56914@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2013-01-18 - 2013-01-25) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 3837 ( +8) closed 24971 (+29) total 28808 (+37) Open issues with patches: 1673 Issues opened (29) ================== #16335: Integer overflow in unicode-escape decoder http://bugs.python.org/issue16335 reopened by serhiy.storchaka #16995: Add Base32 support for RFC4648 "Extended Hex" alphabet (patch http://bugs.python.org/issue16995 opened by matthaeus.wander #16996: Reuse shutil.which() in webbrowser module http://bugs.python.org/issue16996 opened by serhiy.storchaka #16997: subtests http://bugs.python.org/issue16997 opened by pitrou #16998: Lost updates with multiprocessing.Value http://bugs.python.org/issue16998 opened by lechten #17002: html.entities ImportError in distutils2/pypi/simple.py on Pyth http://bugs.python.org/issue17002 opened by goibhniu #17003: Unification of read()??and readline() argument names http://bugs.python.org/issue17003 opened by serhiy.storchaka #17004: Expand zipimport to include other compression methods http://bugs.python.org/issue17004 opened by rhettinger #17005: Add a topological sort algorithm http://bugs.python.org/issue17005 opened by rhettinger #17006: Warn users about hashing secrets? 
http://bugs.python.org/issue17006 opened by christian.heimes #17009: "Thread Programming With Python" should be removed http://bugs.python.org/issue17009 opened by nedbat #17010: Windows launcher ignores active virtual environment http://bugs.python.org/issue17010 opened by bryangeneolson #17011: ElementPath ignores different namespace mappings for the same http://bugs.python.org/issue17011 opened by scoder #17012: Differences between /usr/bin/which and shutil.which() http://bugs.python.org/issue17012 opened by serhiy.storchaka #17013: Allow waiting on a mock http://bugs.python.org/issue17013 opened by pitrou #17015: mock could be smarter and inspect the spec's signature http://bugs.python.org/issue17015 opened by pitrou #17016: _sre: avoid relying on pointer overflow http://bugs.python.org/issue17016 opened by Nickolai.Zeldovich #17018: Inconsistent behaviour of methods waiting for child process http://bugs.python.org/issue17018 opened by wodny #17020: random.random() generating values >= 1.0 http://bugs.python.org/issue17020 opened by klankschap #17023: Subprocess does not find executable on Windows if it is PATH w http://bugs.python.org/issue17023 opened by pekka.klarck #17024: cElementTree calls end() on parser target even if start() fails http://bugs.python.org/issue17024 opened by scoder #17025: reduce multiprocessing.Queue contention http://bugs.python.org/issue17025 opened by neologix #17026: pdb frames accessible after the termination occurs on uncaught http://bugs.python.org/issue17026 opened by xdegaye #17027: support for drag and drop of files to tkinter application http://bugs.python.org/issue17027 opened by Sarbjit.singh #17028: launcher http://bugs.python.org/issue17028 opened by theller #17029: h2py.py: search the multiarch include dir if it does exist http://bugs.python.org/issue17029 opened by doko #17031: fix running regen in cross builds http://bugs.python.org/issue17031 opened by doko #17032: Misleading error message: global name 'X' is not
defined http://bugs.python.org/issue17032 opened by cool-RR #17033: RPM spec file has old config_binsuffix value http://bugs.python.org/issue17033 opened by taavi-burns Most recent 15 issues with no replies (15) ========================================== #17031: fix running regen in cross builds http://bugs.python.org/issue17031 #17028: launcher http://bugs.python.org/issue17028 #17027: support for drag and drop of files to tkinter application http://bugs.python.org/issue17027 #17026: pdb frames accessible after the termination occurs on uncaught http://bugs.python.org/issue17026 #17024: cElementTree calls end() on parser target even if start() fails http://bugs.python.org/issue17024 #17023: Subprocess does not find executable on Windows if it is PATH w http://bugs.python.org/issue17023 #17018: Inconsistent behaviour of methods waiting for child process http://bugs.python.org/issue17018 #17013: Allow waiting on a mock http://bugs.python.org/issue17013 #17010: Windows launcher ignores active virtual environment http://bugs.python.org/issue17010 #16996: Reuse shutil.which() in webbrowser module http://bugs.python.org/issue16996 #16983: header parsing could apply postel's law to encoded words insid http://bugs.python.org/issue16983 #16965: 2to3 should rewrite execfile() to open in 'rb' mode http://bugs.python.org/issue16965 #16964: Add 'm' format specifier for mon_grouping etc.
http://bugs.python.org/issue16964 #16962: _posixsubprocess module uses outdated getdents system call http://bugs.python.org/issue16962 #16961: No regression tests for -E and individual environment vars http://bugs.python.org/issue16961 Most recent 15 issues waiting for review (15) ============================================= #17033: RPM spec file has old config_binsuffix value http://bugs.python.org/issue17033 #17031: fix running regen in cross builds http://bugs.python.org/issue17031 #17029: h2py.py: search the multiarch include dir if it does exist http://bugs.python.org/issue17029 #17028: launcher http://bugs.python.org/issue17028 #17025: reduce multiprocessing.Queue contention http://bugs.python.org/issue17025 #17016: _sre: avoid relying on pointer overflow http://bugs.python.org/issue17016 #17012: Differences between /usr/bin/which and shutil.which() http://bugs.python.org/issue17012 #17003: Unification of read() and readline() argument names http://bugs.python.org/issue17003 #16997: subtests http://bugs.python.org/issue16997 #16996: Reuse shutil.which() in webbrowser module http://bugs.python.org/issue16996 #16995: Add Base32 support for RFC4648 "Extended Hex" alphabet (patch http://bugs.python.org/issue16995 #16994: collections.Counter.least_common http://bugs.python.org/issue16994 #16991: Add OrderedDict written in C http://bugs.python.org/issue16991 #16989: allow distutils debug mode to be enabled more easily http://bugs.python.org/issue16989 #16981: ImportError hides real error when there too many open files du http://bugs.python.org/issue16981 Top 10 most discussed issues (10) ================================= #3982: support .format for bytes http://bugs.python.org/issue3982 32 msgs #16507: Patch selectmodule.c to support WSAPoll on Windows http://bugs.python.org/issue16507 25 msgs #14771: Occasional failure in test_ioctl when run parallel with test_g http://bugs.python.org/issue14771 14 msgs #16335: Integer overflow in unicode-escape decoder
http://bugs.python.org/issue16335 14 msgs #16997: subtests http://bugs.python.org/issue16997 13 msgs #17020: random.random() generating values >= 1.0 http://bugs.python.org/issue17020 9 msgs #16795: Patch: some changes to AST to make it more useful for static l http://bugs.python.org/issue16795 8 msgs #17012: Differences between /usr/bin/which and shutil.which() http://bugs.python.org/issue17012 8 msgs #16701: Docs missing the behavior of += (in-place add) for lists. http://bugs.python.org/issue16701 7 msgs #17004: Expand zipimport to include other compression methods http://bugs.python.org/issue17004 7 msgs Issues closed (25) ================== #7353: cporting docs recommend using Include/intobject.h, which was r http://bugs.python.org/issue7353 closed by skrah #9708: cElementTree iterparse does not support "parser" argument http://bugs.python.org/issue9708 closed by eli.bendersky #11379: Remove "lightweight" from minidom description http://bugs.python.org/issue11379 closed by ezio.melotti #12323: ElementPath 1.3 expressions http://bugs.python.org/issue12323 closed by eli.bendersky #13454: crash when deleting one pair from tee() http://bugs.python.org/issue13454 closed by serhiy.storchaka #15050: Python 3.2.3 fail to make http://bugs.python.org/issue15050 closed by ezio.melotti #15484: CROSS: use _PYTHON_PROJECT_BASE in distutils sysconfig http://bugs.python.org/issue15484 closed by doko #15919: hg.python.org: log page entries don't always link to revision http://bugs.python.org/issue15919 closed by ezio.melotti #16292: Cross compilation fixes (general) http://bugs.python.org/issue16292 closed by doko #16557: PEP 380 isn't reflected in the Functional Programming HOWTO http://bugs.python.org/issue16557 closed by ezio.melotti #16953: select module compile errors with broken poll() http://bugs.python.org/issue16953 closed by neologix #16957: shutil.which() shouldn't look in working directory on unix-y s http://bugs.python.org/issue16957 closed by serhiy.storchaka 
#16969: test_urlwithfrag fail http://bugs.python.org/issue16969 closed by ezio.melotti #16978: fix grammar in 'threading' documentation http://bugs.python.org/issue16978 closed by ezio.melotti #16985: Docs reference a concrete UTC tzinfo, but none exists http://bugs.python.org/issue16985 closed by asvetlov #16993: shutil.which() should preserve path case http://bugs.python.org/issue16993 closed by serhiy.storchaka #17001: Make uuid.UUID use functools.total_ordering http://bugs.python.org/issue17001 closed by rhettinger #17007: logging documentation clarifications http://bugs.python.org/issue17007 closed by python-dev #17008: Descriptor __get__() invoke is bypassed in the class context http://bugs.python.org/issue17008 closed by benjamin.peterson #17014: _getfinalpathname() no more used in 3.4 http://bugs.python.org/issue17014 closed by pitrou #17017: email.utils.getaddresses fails for certain addresses http://bugs.python.org/issue17017 closed by r.david.murray #17019: Invitation to connect on LinkedIn http://bugs.python.org/issue17019 closed by christian.heimes #17021: Invitation to connect on LinkedIn http://bugs.python.org/issue17021 closed by ezio.melotti #17022: Inline assignment uses the newly created object http://bugs.python.org/issue17022 closed by r.david.murray #17030: strange import visibility http://bugs.python.org/issue17030 closed by r.david.murray From victor.stinner at gmail.com Fri Jan 25 23:05:23 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 25 Jan 2013 23:05:23 +0100 Subject: [Python-Dev] Implementation of the PEP 433 In-Reply-To: References: Message-ID: I created the following issue to track and review the implementation of the PEP 433: http://bugs.python.org/issue17036 Last changes: - don't hold the GIL to ensure "atomicity" of the close-on-exec flag - apply the default value of cloexec on all C functions - sys.setdefaultcloexec() don't have any argument anymore (unit tests don't use sys.setdefaultcloexec() anymore) Victor 
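For context on the implementation notes above: the close-on-exec flag being wrapped is the ordinary FD_CLOEXEC bit, which today has to be managed per descriptor by hand. A minimal POSIX-only sketch of that manual dance (the boilerplate a cloexec parameter would absorb):

```python
import fcntl
import os

# The classic get-flags/set-flags sequence. Note that it is not atomic:
# a fork() in another thread can happen between the two fcntl() calls,
# which is one reason the PEP implementation avoids this pattern.
fd = os.open("/dev/null", os.O_RDONLY)
flags = fcntl.fcntl(fd, fcntl.F_GETFD)
fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)

# The bit is now set: this fd will be closed across exec().
print(bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC))  # prints: True
os.close(fd)
```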
From ncoghlan at gmail.com Sat Jan 26 08:38:59 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 26 Jan 2013 17:38:59 +1000 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: References: <20130125123603.430ee61c@pitrou.net> Message-ID: On Sat, Jan 26, 2013 at 12:54 AM, Victor Stinner wrote: > But do we really want to enable close-on-exec in the future? Charles > François has really good arguments against such choice :-) While many of Python's APIs are heavily inspired by POSIX, it still isn't POSIX. It isn't C either, especially in Python 3 (where C truncating division must be written as "//" and our internal text handling has made the migration to being fully Unicode based) > It's maybe > better to consider that the default at startup will always be False. > So we should only provide different ways to set the default to True. I think Charles François actually hit on a fairly good analogy by comparing this transition to the integer division one. To implement that: 1. In 2.x, the "-Q" option was introduced to allow the behaviour to be switched globally, while ensuring it remained consistent for the life of a given application 2. The "//" syntax was introduced to force the use of truncating integer division 3. "float(n) / d" could be used to force floating point division Significantly, code written using either option 2 or option 3 retains exactly the same semantics in Python 3, even though the default behaviour of "/" has now switched from truncating division to floating point division. Note also that in Python 3, there is *no* mechanism to request truncating division globally - if you want truncating division, you have to explicitly request it every time. So, if we agree that "cloexec-by-default" is a desirable goal, despite the inconsistency with POSIX (just as changing integer division was seen as desirable, despite the inconsistency with C), then a sensible transition plan becomes: 1.
Introduce a mechanism to switch the behaviour globally, while ensuring it remains consistent for the life of a given application 2. Introduce the "cloexec=None/True/False" 3-valued parameter as needed to allow people to choose between default/definitely-cloexec/definitely-not-cloexec. 3. At some point in the future (perhaps in 3.5, perhaps in 4.0) switch the default behaviour to cloexec=True and remove the ability to change the behaviour globally. The reason I'd be +0 on a "one-way switch", even at runtime, is that you can just make it a no-op after that behaviour becomes the default. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stefan at bytereef.org Sat Jan 26 11:55:12 2013 From: stefan at bytereef.org (Stefan Krah) Date: Sat, 26 Jan 2013 11:55:12 +0100 Subject: [Python-Dev] FreeBSD-9.0 bot running --without-doc-strings Message-ID: <20130126105512.GA25659@sleipnir.bytereef.org> Hi, I've subverted the build master authority on the FreeBSD-9.0 bot by exporting with_doc_strings=no. This is to test #16143 and #10156. Some tests assume that docstrings are present, so there will be a couple of failures. If someone wants to work on easy issues, this is your chance. If the failures get annoying just open a tracker issue, and I'll revert the change. Stefan Krah From victor.stinner at gmail.com Sat Jan 26 12:28:59 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Sat, 26 Jan 2013 12:28:59 +0100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: References: <20130125123603.430ee61c@pitrou.net> Message-ID: > So, if we agree that "cloexec-by-default" is a desirable goal, despite > the inconsistency with POSIX (just as changing integer division was > seen as desirable, despite the inconsistency with C) ... I don't plan to enable cloexec by default in a near future, nor in python 4. Someone else may change that later. Having a global flag should be enough. 
Enabling cloexec by default is interesting if *all* libraries respect the global flag (even C libraries), so subprocess can be used with close_fds=False. Closing all file descriptors is slow, especially on FreeBSD (it's now faster on linux, python lists /proc/pid/fd/). Victor -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Sat Jan 26 14:47:37 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 26 Jan 2013 15:47:37 +0200 Subject: [Python-Dev] FreeBSD-9.0 bot running --without-doc-strings In-Reply-To: <20130126105512.GA25659@sleipnir.bytereef.org> References: <20130126105512.GA25659@sleipnir.bytereef.org> Message-ID: On 26.01.13 12:55, Stefan Krah wrote: > Some tests assume that docstrings are present, so there will be a couple > of failures. If someone wants to work on easy issues, this is your chance. http://bugs.python.org/issue17040 From solipsis at pitrou.net Sat Jan 26 15:45:32 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 26 Jan 2013 15:45:32 +0100 Subject: [Python-Dev] FreeBSD-9.0 bot running --without-doc-strings References: <20130126105512.GA25659@sleipnir.bytereef.org> Message-ID: <20130126154532.70974b34@pitrou.net> Hello, On Sat, 26 Jan 2013 11:55:12 +0100 Stefan Krah wrote: > > I've subverted the build master authority on the FreeBSD-9.0 bot by > exporting with_doc_strings=no. This is to test #16143 and #10156. Well... Speaking personally, I'd much rather stop shipping and "supporting" such obscure build options. I'd like to hear about people using production Pythons --without-doc-strings. Regards Antoine. 
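Whether a given interpreter still carries docstrings is easy to probe from within. Note that the two knobs are different: -OO strips docstrings from pure-Python code at compile time, while --without-doc-strings compiles out the C-level ones, so (as I understand the build option) the two probes below can disagree:

```python
# On a default build both probes print True; under -OO the first one
# becomes False, and on a --without-doc-strings build the second one
# should become False (builtins lose their docstrings).
def probe():
    """A sample docstring."""

print("Python-level docstrings:", probe.__doc__ is not None)
print("C-level docstrings:", len.__doc__ is not None)
```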
From ncoghlan at gmail.com Sat Jan 26 16:07:14 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 27 Jan 2013 01:07:14 +1000 Subject: [Python-Dev] FreeBSD-9.0 bot running --without-doc-strings In-Reply-To: <20130126154532.70974b34@pitrou.net> References: <20130126105512.GA25659@sleipnir.bytereef.org> <20130126154532.70974b34@pitrou.net> Message-ID: On Sun, Jan 27, 2013 at 12:45 AM, Antoine Pitrou wrote: > > Hello, > > On Sat, 26 Jan 2013 11:55:12 +0100 > Stefan Krah wrote: >> >> I've subverted the build master authority on the FreeBSD-9.0 bot by >> exporting with_doc_strings=no. This is to test #16143 and #10156. > > Well... Speaking personally, I'd much rather stop shipping and > "supporting" such obscure build options. I'd like to hear about people > using production Pythons --without-doc-strings. I was going to suggest that folks like the PyMite developers might use it to create versions that run in severely constrained environments, but it looks like they wrote their interpreter from scratch instead. On the other hand, I've definitely heard at least a few people talking about using -OO to drop docstrings as a technique to reduce memory consumption. Some of these failures would likely show up there as well. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Sat Jan 26 16:10:50 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 26 Jan 2013 16:10:50 +0100 Subject: [Python-Dev] FreeBSD-9.0 bot running --without-doc-strings In-Reply-To: References: <20130126105512.GA25659@sleipnir.bytereef.org> <20130126154532.70974b34@pitrou.net> Message-ID: <20130126161050.1c33484d@pitrou.net> On Sun, 27 Jan 2013 01:07:14 +1000 Nick Coghlan wrote: > On Sun, Jan 27, 2013 at 12:45 AM, Antoine Pitrou wrote: > > > > Hello, > > > > On Sat, 26 Jan 2013 11:55:12 +0100 > > Stefan Krah wrote: > >> > >> I've subverted the build master authority on the FreeBSD-9.0 bot by > >> exporting with_doc_strings=no. 
This is to test #16143 and #10156. > > > > Well... Speaking personally, I'd much rather stop shipping and > > "supporting" such obscure build options. I'd like to hear about people > > using production Pythons --without-doc-strings. > > I was going to suggest that folks like the PyMite developers might use > it to create versions that run in severely constrained environments, > but it looks like they wrote their interpreter from scratch instead. > > On the other hand, I've definitely heard at least a few people talking > about using -OO to drop docstrings as a technique to reduce memory > consumption. Some of these failures would likely show up there as > well. Using -OO is far less extreme than building your own Python with docstrings disabled. And the test suite already passes with -OO: http://buildbot.python.org/all/builders/AMD64%20Mountain%20Lion%20Optimized%20%5BSB%5D%203.x Regards Antoine. From stefan at bytereef.org Sat Jan 26 16:47:39 2013 From: stefan at bytereef.org (Stefan Krah) Date: Sat, 26 Jan 2013 16:47:39 +0100 Subject: [Python-Dev] Anyone building Python --without-doc-strings? In-Reply-To: <20130126154532.70974b34@pitrou.net> References: <20130126105512.GA25659@sleipnir.bytereef.org> <20130126154532.70974b34@pitrou.net> Message-ID: <20130126154738.GA13795@sleipnir.bytereef.org> Antoine Pitrou wrote: > Well... Speaking personally, I'd much rather stop shipping and > "supporting" such obscure build options. I'd like to hear about people > using production Pythons --without-doc-strings. 
I'm not sure how accurate the output is for measuring these things, but according to ``ls'' and ``du'' the option is indeed quite worthless:

./configure CFLAGS="-Os -s" LDFLAGS="-s" && make

$ du -h build/lib.linux-x86_64-3.4/
24K  build/lib.linux-x86_64-3.4/__pycache__
3.9M build/lib.linux-x86_64-3.4/
$ ls -lh python
1.8M Jan 26 16:36 python
$ ls -lh libpython3.4m.a
9.6M Jan 26 16:36 libpython3.4m.a

===============================================================

./configure --without-doc-strings CFLAGS="-Os -s" LDFLAGS="-s" && make

$ du -h build/lib.linux-x86_64-3.4/
24K  build/lib.linux-x86_64-3.4/__pycache__
3.8M build/lib.linux-x86_64-3.4/
$ ls -lh python
1.6M Jan 26 16:33 python
$ ls -lh libpython3.4m.a
9.4M Jan 26 16:33 libpython3.4m.a

Stefan Krah From stefan at bytereef.org Sat Jan 26 17:03:59 2013 From: stefan at bytereef.org (Stefan Krah) Date: Sat, 26 Jan 2013 17:03:59 +0100 Subject: [Python-Dev] Anyone building Python --without-doc-strings? In-Reply-To: <20130126154738.GA13795@sleipnir.bytereef.org> References: <20130126105512.GA25659@sleipnir.bytereef.org> <20130126154532.70974b34@pitrou.net> <20130126154738.GA13795@sleipnir.bytereef.org> Message-ID: <20130126160359.GA14047@sleipnir.bytereef.org> Stefan Krah wrote: > I'm not sure how accurate the output is for measuring these things, but > according to ``ls'' and ``du'' the option is indeed quite worthless: > > ./configure CFLAGS="-Os -s" LDFLAGS="-s" && make > 1.8M Jan 26 16:36 python > ./configure --without-doc-strings CFLAGS="-Os -s" LDFLAGS="-s" && make > 1.6M Jan 26 16:33 python The original contribution *was* in fact aiming for "10% smaller", see: http://docs.python.org/release/2.3/whatsnew/node20.html So apparently people thought it was useful. Stefan Krah From solipsis at pitrou.net Sat Jan 26 17:19:32 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 26 Jan 2013 17:19:32 +0100 Subject: [Python-Dev] Anyone building Python --without-doc-strings?
References: <20130126105512.GA25659@sleipnir.bytereef.org> <20130126154532.70974b34@pitrou.net> <20130126154738.GA13795@sleipnir.bytereef.org> <20130126160359.GA14047@sleipnir.bytereef.org> Message-ID: <20130126171932.74e792e9@pitrou.net> On Sat, 26 Jan 2013 17:03:59 +0100 Stefan Krah wrote: > Stefan Krah wrote: > > I'm not sure how accurate the output is for measuring these things, but > > according to ``ls'' and ``du'' the option is indeed quite worthless: > > > > ./configure CFLAGS="-Os -s" LDFLAGS="-s" && make > > 1.8M Jan 26 16:36 python > > ./configure --without-doc-strings CFLAGS="-Os -s" LDFLAGS="-s" && make > > 1.6M Jan 26 16:33 python > > The original contribution *was* in fact aiming for "10% smaller", see: > > http://docs.python.org/release/2.3/whatsnew/node20.html > > So apparently people thought it was useful. After a bit of digging, I found the following discussions: http://mail.python.org/pipermail/python-dev/2001-November/018444.html http://mail.python.org/pipermail/python-dev/2002-January/019392.html http://bugs.python.org/issue505375 Another reason for accepting the patch seemed to be that it introduced the Py_DOCSTR() macros, which were viewed as helpful for other reasons (some people talked about localizing docstrings). I would point out that if 200 KB is really a big win for someone, then Python (and especially Python 3) is probably not the best language for them. It is also ironic how the executable size went up since then (from 0.6 to more than 1.5 MB) :-) Regards Antoine. From rdmurray at bitdance.com Sat Jan 26 17:37:37 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Sat, 26 Jan 2013 11:37:37 -0500 Subject: [Python-Dev] Anyone building Python --without-doc-strings? 
In-Reply-To: <20130126171932.74e792e9@pitrou.net> References: <20130126105512.GA25659@sleipnir.bytereef.org> <20130126154532.70974b34@pitrou.net> <20130126154738.GA13795@sleipnir.bytereef.org> <20130126160359.GA14047@sleipnir.bytereef.org> <20130126171932.74e792e9@pitrou.net> Message-ID: <20130126163738.8E7B6250BC5@webabinitio.net> On Sat, 26 Jan 2013 17:19:32 +0100, Antoine Pitrou wrote: > On Sat, 26 Jan 2013 17:03:59 +0100 > Stefan Krah wrote: > > Stefan Krah wrote: > > > I'm not sure how accurate the output is for measuring these things, but > > > according to ``ls'' and ``du'' the option is indeed quite worthless: > > > > > > ./configure CFLAGS="-Os -s" LDFLAGS="-s" && make > > > 1.8M Jan 26 16:36 python > > > ./configure --without-doc-strings CFLAGS="-Os -s" LDFLAGS="-s" && make > > > 1.6M Jan 26 16:33 python > > > > The original contribution *was* in fact aiming for "10% smaller", see: > > > > http://docs.python.org/release/2.3/whatsnew/node20.html > > > > So apparently people thought it was useful. > > After a bit of digging, I found the following discussions: > http://mail.python.org/pipermail/python-dev/2001-November/018444.html > http://mail.python.org/pipermail/python-dev/2002-January/019392.html > http://bugs.python.org/issue505375 > > Another reason for accepting the patch seemed to be that it introduced > the Py_DOCSTR() macros, which were viewed as helpful for other reasons > (some people talked about localizing docstrings). > > I would point out that if 200 KB is really a big win for someone, then > Python (and especially Python 3) is probably not the best language for > them. > > It is also ironic how the executable size went up since then (from 0.6 > to more than 1.5 MB) :-) 200K can make a difference. It does on the QNX platform, for example, where there is no virtual memory. 
It would be nice to reduce that executable size, too....but I'm not volunteering to try (at least not yet) :) --David From cs at zip.com.au Sun Jan 27 05:45:08 2013 From: cs at zip.com.au (Cameron Simpson) Date: Sun, 27 Jan 2013 15:45:08 +1100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: References: Message-ID: <20130127044508.GA1233@cskk.homeip.net> On 25Jan2013 12:24, Victor Stinner wrote: | 2013/1/25 Charles-François Natali : | > You can actually count me in the cloexec=False camp, and against the | > idea of a configurable default value. | | Oh ok. I'm leaning to this view myself also. | > Why the default value shouldn't be tunable: | > - I think it's useless: if the default cloexec behavior can be altered | > (either by a command-line flag, an environment variable or a sys | > module function), then libraries cannot rely on it and have to make | > file descriptors cloexec on an individual basis, since the default | > flag can be disabled. | | In my experience, in most cases, the default value of cloexec just | doesn't matter at all. If your program relies on the state of the | close-on-exec flag: you have to make it explicit, specify the cloexec | parameter. Example: os.pipe(cloexec=True). If you don't modify your | application, it will just not work using the -e command line option (or | PYTHONCLOEXEC environment variable). But why would you enable cloexec | by default if your application is not compatible? Because we're not all writing applications. I write a lot of small library routines and then use them in my apps. On a UNIX system I can plow on presuming cloexec=False, and will feel no desire to label it in any calls. And so I might be making pipes or whatever and expecting to pass them to subprocesses which will be told the file descriptor numbers on the command line. (Let me say up front that I haven't such an example in my code at present; generally that kind of thing happens in shell script for me.)
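Concretely, the pattern described above -- telling a child the descriptor number on its command line -- looks something like this with the Python 3 subprocess API, where close_fds=True is the default and inherited fds must be listed explicitly (the payload and fd plumbing are illustrative):

```python
import os
import subprocess
import sys

r, w = os.pipe()
# The child learns the fd number from argv; pass_fds keeps w open
# across the exec despite close_fds=True being the default.
child = subprocess.Popen(
    [sys.executable, "-c",
     "import os, sys; os.write(int(sys.argv[1]), b'hi')", str(w)],
    pass_fds=(w,),
)
child.wait()
os.close(w)            # drop our copy so the pipe can reach EOF
print(os.read(r, 16))  # b'hi'
os.close(r)
```

Code written this way keeps working however the cloexec default is chosen, precisely because the inheritance is explicit.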
Let someone change the global default and suddenly my library code doesn't work, and people will be tearing their hair out trying to figure out why; because the code was written correctly in the old regime. We have recently seen a mail thread where people were arguing (generally correctly IMO) that programs should not do chdir, because it changes a process wide global. The situation with cloexec is in my mind very analogous. So I am: +1 on offering a cloexec parameter, defaulting to False -0.5 on offering a global default that can be changed [...] | > - We should go over the standard library, and create FDs cloexec if | > they're not handed over to the caller, either because they're | > opened/closed before returning, or because the underlying file | > descriptor is kept private (no fileno() method, although it's | > relatively rare). That's the approach chosen by glibc, and it makes | > sense: if another thread forks() while a thread is in the middle of | > getpwnam(), you don't want to leak an open file descriptor to | > /etc/passwd (or /etc/shadow). | | I started to make cloexec explicit *everywhere* in the Python stdlib. | I reverted my commit because I think that only a few applications start | child processes, so doing extra work (additional syscalls to set | cloexec) would slow down Python for no gain. See my revert commit to | see how many functions need to be modified: | http://hg.python.org/features/pep-433/rev/963e450fc24f I see what you mean, it is quite an intrusive change. However, what you're mostly backing out is cloexec=True. There are therefore two changes coming here: - present the cloexec parameter - unilaterally make a global policy decision that in these many places in the stdlib, cloexec=True is now both correct/acceptable and _necessary_ If it were only the former, the change would be much smaller. | That's why I really like the idea of being able to configure the | default value of the cloexec parameter.
By default: no overhead nor | backward compatibility issue. Whereas you can set the close-on-exec flag | *everywhere* if you are concerned by all issues of inheriting file | descriptors (cases listed in the PEP). Quietly breaking libraries that relied on cloexec=False being the status quo... The global tunable still makes me feel uneasy. I am in agreement that the general leakage of open files because cloexec defaults to False is untidy and sometimes problematic; the tension here isn't lost on me :-) Cheers, -- Cameron Simpson Out on the road, feeling the breeze, passing the cars. - Bob Seger From cs at zip.com.au Sun Jan 27 05:48:04 2013 From: cs at zip.com.au (Cameron Simpson) Date: Sun, 27 Jan 2013 15:48:04 +1100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: References: Message-ID: <20130127044804.GA2058@cskk.homeip.net> On 25Jan2013 21:07, Nick Coghlan wrote: | It's a configurable setting in the same way that -Q makes the | behaviour of "/" configurable in Python 2 (so your hypothetical | example isn't hypothetical at all - it's a description of the -Q option), | and -R makes random hashing configurable in 2.7 and 3.2: it means we | can change the default behaviour in a future version (perhaps Python | 4) while allowing people to easily check if their code operates | correctly in that state in the current version. | | I think the default behaviour needs to be configurable from the | environment and the command line, but I don't believe it should be | configurable from within the interpreter. Hmm. This I can live with more happily, though I'm still uneasy. As an aside, I tend to feel that if something is tuneable it should be exposed within the interpreter. Maybe only in an exciting new module called shoot_self_in_foot or some similarly alarming name... Cheers, -- Cameron Simpson
I always wanted to write a little program that would pop up a Mac window to ask ``I'm going to amputate a limb at random from you now.'' to see how many people would instinctively click "OK". - Marc VanHeyningen From guido at python.org Sun Jan 27 06:10:43 2013 From: guido at python.org (Guido van Rossum) Date: Sat, 26 Jan 2013 21:10:43 -0800 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: <20130127044804.GA2058@cskk.homeip.net> References: <20130127044804.GA2058@cskk.homeip.net> Message-ID: On Sat, Jan 26, 2013 at 8:48 PM, Cameron Simpson wrote: > On 25Jan2013 21:07, Nick Coghlan wrote: > | It's a configurable setting in the same way that -Q makes the > | behaviour of "/" configurable in Python 2 (so your hypothetical > | example isn't hypothetical at all - it's a description the -Q option), > | and -R makes random hashing configurable in 2.7 and 3.2: it means we > | can change the default behaviour in a future version (perhaps Python > | 4) while allowing people to easily check if their code operates > | correctly in that state in the current version. > | > | I think the default behaviour needs to be configurable from the > | environment and the command line, but I don't believe it should be > | configurable from within the interpreter. > > Hmm. This I can live with more happily, though I'm still uneasy. > > As an aside, I tend to feel that if something is tuneable it should be > exposed within the interpreter. Maybe only in an exciting new module > called shoot_self_in_foot or some similarly alarming name... I had missed this detail. I agree that it should be exposed in the interpreter. To my mind it is more like PYTHONPATH (which corresponds roughly to sys.path manipulations) than like -R (which changes something that should never be changed again, otherwise the sanity of the interpreter be at risk). It would seem hard to unittest the feature if it cannot be changed from within. 
But I can also think of other use cases for changing it from within (e.g. a script that decides on how to set it using a computation based on its arguments). -- --Guido van Rossum (python.org/~guido) From ncoghlan at gmail.com Sun Jan 27 06:42:30 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 27 Jan 2013 15:42:30 +1000 Subject: [Python-Dev] [Python-checkins] cpython: Touch up exception messaging In-Reply-To: References: <3Yt8RJ6v3nzSLv@mail.python.org> Message-ID: On Sun, Jan 27, 2013 at 3:25 AM, Ezio Melotti wrote: > Hi, > I'm not sure these changes are an improvement. > > On Fri, Jan 25, 2013 at 8:49 PM, brett.cannon > wrote: >> http://hg.python.org/cpython/rev/792810303239 >> changeset: 81735:792810303239 >> user: Brett Cannon >> date: Fri Jan 25 13:49:19 2013 -0500 >> summary: >> Touch up exception messaging >> [...] >> magic = data[:4] >> raw_timestamp = data[4:8] >> raw_size = data[8:12] >> if magic != _MAGIC_BYTES: >> - msg = 'bad magic number in {!r}: {!r}'.format(name, magic) >> + msg = 'incomplete magic number in {!r}: {!r}'.format(name, magic) > Here 2 things could go wrong: > 1) magic is less than 4 bytes (so it could be "incomplete"); > 2) magic is 4 bytes, but it's different (so it's "invalid" or "bad"); > For this to be "incomplete" the size of the whole file should be less > than 4 bytes, and it's unlikely that in a 3-byte-long file the content > is an incomplete magic number. It's much more likely that it's not a > magic number at all, or that it is a bad/wrong/invalid 4-byte magic > number, so the previous error looked better to me. Oops, I missed that - this one wasn't supposed to change. It was only the other two that were misleading.
> >> raise ImportError(msg, **exc_details) >> elif len(raw_timestamp) != 4: >> - message = 'bad timestamp in {!r}'.format(name) >> + message = 'incomplete timestamp in {!r}'.format(name) >> _verbose_message(message) >> raise EOFError(message) >> elif len(raw_size) != 4: >> - message = 'bad size in {!r}'.format(name) >> + message = 'incomplete size in {!r}'.format(name) > > If we arrived here the magic number was right, so these are probably > the timestamp and size, but for this to fail the whole file should be > less than 8 or 12 bytes (unless I misread the code). > Something like "reached EOF while reading timestamp/size" would > probably be more informative. I don't mind one way or the other here. "bad" was wrong, since it implied this code actually checked these values against the expected ones, but it doesn't, it merely requires that they exist. "incomplete " or "reached EOF while reading " are both correct and a sufficient hint for anyone already familiar with how the bytecode cache format works. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sun Jan 27 06:43:40 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 27 Jan 2013 15:43:40 +1000 Subject: [Python-Dev] [Python-checkins] cpython (3.3): Fix a few typos and a double semicolon. Patch by Eitan Adler. In-Reply-To: <3Yv14R5h6lzSPg@mail.python.org> References: <3Yv14R5h6lzSPg@mail.python.org> Message-ID: On Sun, Jan 27, 2013 at 2:21 PM, ezio.melotti wrote: > http://hg.python.org/cpython/rev/07488c3c85f1 > changeset: 81772:07488c3c85f1 > branch: 3.3 > parent: 81770:a0756411c527 > user: Ezio Melotti > date: Sun Jan 27 06:20:14 2013 +0200 > summary: > Fix a few typos and a double semicolon. Patch by Eitan Adler. Misc/ACKS? Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From zeal_goswami at yahoo.com Sun Jan 27 06:42:22 2013 From: zeal_goswami at yahoo.com (abhishek goswami) Date: Sun, 27 Jan 2013 13:42:22 +0800 (SGT) Subject: [Python-Dev] 32 bit to 64 bit migration of C-API(Python) Message-ID: <1359265342.72980.YahooMailNeo@web193002.mail.sg3.yahoo.com> Hi, I am working on a project that embeds Python through the Python C API. The project has been migrated from a 32-bit machine to a 64-bit machine. Everything was working fine on the 32-bit machine, but on the 64-bit machine we get the following error: "Python int too large to convert to C long". When I debugged the code I found that the C code uses long variables, which are 8 bytes on the 64-bit machine. If I replace long with int32_t in the C code, the long values become garbage and I get an index-out-of-range error from Python. The values become garbage after the conversion because in many places we call Python APIs that return a long, and we use the result in further processing (such as writing/reading files). I have also tried replacing long with int, but that did not work. I am not sure this is really the only reason, but it may be one of the reasons for the above error. I have also searched Google but could not find the exact cause. After spending three days on this, I am not able to figure out a solution. Can anyone provide some insight? We are using Python 2.7.2 and GCC 4.7.0. Please let me know if I need to provide more information. An early response would help us a lot, as we need to fix this issue as soon as possible (it is blocking us). Thanks Abhishek Goswami Bangalore Phone No -07829580867 Skype : abhishekgoswami1 -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ncoghlan at gmail.com Sun Jan 27 07:16:40 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 27 Jan 2013 16:16:40 +1000 Subject: [Python-Dev] 32 bit to 64 bit migration of C-API(Python) In-Reply-To: <1359265342.72980.YahooMailNeo@web193002.mail.sg3.yahoo.com> References: <1359265342.72980.YahooMailNeo@web193002.mail.sg3.yahoo.com> Message-ID: On Sun, Jan 27, 2013 at 3:42 PM, abhishek goswami wrote: > Hi, > I am working in Python-C ApI (Integrated Project). The Project has migrated > to 64 bit machine from 32 bit machine. Every thing was working fine in 32 > bit machine. > But we getting following error in 64 bit machine. > " Python int too large to convert to C long". Hi Abhishek, python-dev is for the development *of* CPython, rather than development *with* CPython (including the C API). More appropriate places for this question are the general python-list mailing list or the C API special interest group. However, from the sounds of it, you are running into trouble with the fact that on a 64-bit machine, Python stores many internal values as "Py_ssize_t" (which is always 8 bytes on a 64 bit machine), rather than a C int or long. PEP 353 [1] describes the background for this change, which allows Python containers to exploit the full size of the address space on 64-bit systems. C API functions that may be of interest in that context include: http://docs.python.org/2/c-api/number.html#PyNumber_AsSsize_t http://docs.python.org/2/c-api/int.html#PyInt_FromSsize_t http://docs.python.org/2/c-api/int.html#PyInt_AsSsize_t http://docs.python.org/2/c-api/long.html#PyLong_FromSsize_t http://docs.python.org/2/c-api/long.html#PyLong_AsSsize_t As well as the "n" format character for argument processing http://docs.python.org/2/c-api/arg.html#parsing-arguments-and-building-values [1] http://www.python.org/dev/peps/pep-0353/ Regards, Nick. 
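Nick's point about integer widths can be made concrete without touching the C API: the struct module reproduces both failure modes — the "too large" complaint, and the silent garbage produced by a 32-bit truncation. This is a sketch of the general problem, not of the poster's code:

```python
import struct

big = 1 << 40  # needs more than 32 bits, fine for a 64-bit C long

# Forcing it into a 4-byte signed field fails, much like PyArg_Parse
# with the 'l' format fails where C long is only 32 bits wide:
try:
    struct.pack('<i', big)
except struct.error as exc:
    print('overflow:', exc)

# A blind 32-bit truncation (the int32_t cast described above) does
# not fail -- it silently produces garbage instead:
print(big & 0xFFFFFFFF)  # 0 -- the low 32 bits of 1 << 40

# An 8-byte field, like Py_ssize_t on a 64-bit build, round-trips it:
print(struct.unpack('<q', struct.pack('<q', big))[0])
```

The takeaway matches the advice above: widen the C variables to Py_ssize_t (or at least a 64-bit type) rather than narrowing them to int32_t.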
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From cf.natali at gmail.com Sun Jan 27 10:56:51 2013 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Sun, 27 Jan 2013 10:56:51 +0100 Subject: [Python-Dev] usefulness of "extension modules" section in Misc/NEWS Message-ID: Hi, What's exactly the guideline for choosing between the "Library" and "Extension modules" section when updating Misc/NEWS? Is it just the fact that the modified files live under Lib/ or Modules/? I've frequently made a mistake when updating Misc/NEWS, and when looking at it, I'm not the only one. Is there really a good reason for having distinct sections? If the intended audience for this file are end users, ISTM that the only thing that matters is that it's a library change, the fact that the modification impacted Python/C code isn't really relevant. Also, for example if you're rewriting a library from Python to C (or vice versa), should it appear under both sections? FWIW, the What's new documents don't have such a distinction. Cheers, cf From ncoghlan at gmail.com Sun Jan 27 11:50:33 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 27 Jan 2013 20:50:33 +1000 Subject: [Python-Dev] usefulness of "extension modules" section in Misc/NEWS In-Reply-To: References: Message-ID: On Sun, Jan 27, 2013 at 7:56 PM, Charles-François Natali wrote: > If the intended audience for this file are end users, ISTM that the > only thing that matters is that it's a library change, the fact that > the modification impacted Python/C code isn't really relevant. I have no objection to merging them. A separate section for "C API" may still make sense, though. Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Sun Jan 27 12:02:40 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 27 Jan 2013 12:02:40 +0100 Subject: [Python-Dev] usefulness of "extension modules" section in Misc/NEWS References: Message-ID: <20130127120240.09b869a7@pitrou.net> On Sun, 27 Jan 2013 10:56:51 +0100 Charles-François Natali wrote: > Hi, > > What's exactly the guideline for choosing between the "Library" and > "Extension modules" section when updating Misc/NEWS? > Is it just the fact that the modified files live under Lib/ or Modules/? > > I've frequently made a mistake when updating Misc/NEWS, and when > looking at it, I'm not the only one. > > Is there really a good reason for having distinct sections? > > If the intended audience for this file are end users, ISTM that the > only thing that matters is that it's a library change, the fact that > the modification impacted Python/C code isn't really relevant. Agreed. I think "we" should merge them (for some value of a motivated we :-)). Regards Antoine. From victor.stinner at gmail.com Sun Jan 27 12:23:15 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Sun, 27 Jan 2013 12:23:15 +0100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: References: <20130127044804.GA2058@cskk.homeip.net> Message-ID: 2013/1/27 Guido van Rossum : > I had missed this detail. I agree that it should be exposed in the > interpreter. To my mind it is more like PYTHONPATH (which corresponds > roughly to sys.path manipulations) than like -R (which changes > something that should never be changed again, otherwise the sanity of > the interpreter would be at risk). It would seem hard to unittest the > feature if it cannot be changed from within. But I can also think of > other use cases for changing it from within (e.g. a script that > decides on how to set it using a computation based on its arguments).
sys.path is usually only used to add a new path, not to remove path from other libraries. I'm not sure that it's the best example to compare it to sys.setdefaultcloexec(). If sys.setdefaultcloexec() accepts an argument (so allow sys.setdefaultcloexec(False)), problems happen when two libraries, or an application and a library, disagree. Depending how/when the library is loaded, the flag may be True or False. I prefer to have a simple sys.setdefaultcloexec() which always set the flag to True. It's also simpler to explain how the default value is computed (it's less surprising). -- Unit tests can workaround the limitation using subprocesses. My implementation doesn't use sys.setdefaultcloexec() anymore, it just ensures that functions respect the current default value. I ran tests manually to test both default values (True and False). Tests may be improved later to test both default values using subprocess. Victor From solipsis at pitrou.net Sun Jan 27 12:29:00 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 27 Jan 2013 12:29:00 +0100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter References: <20130127044804.GA2058@cskk.homeip.net> Message-ID: <20130127122900.469ec8d9@pitrou.net> On Sun, 27 Jan 2013 12:23:15 +0100 Victor Stinner wrote: > 2013/1/27 Guido van Rossum : > > I had missed this detail. I agree that it should be exposed in the > > interpreter. To my mind it is more like PYTHONPATH (which corresponds > > roughly to sys.path manipulations) than like -R (which changes > > something that should never be changed again, otherwise the sanity of > > the interpreter would be at risk). It would seem hard to unittest the > > feature if it cannot be changed from within. But I can also think of > > other use cases for changing it from within (e.g. a script that > > decides on how to set it using a computation based on its arguments).
> > sys.path is usually only used to add a new path, not to remove path > from other libraries. I'm not sure that it's the best example to > compare it to sys.setdefaultcloexec(). > > If sys.setdefaultcloexec() accepts an argument (so allow > sys.setdefaultcloexec(False)), problems happen when two libraries, or > an application and a library, disagree. Depending how/when the library > is loaded, the flag may be True or False. > > I prefer to have a simple sys.setdefaultcloexec() which always set the > flag to True. It's also simpler to explain how the default value is > computed (it's less surprising). I don't think such limitations are very useful in practice. Users calling sys.setdefaultexec() will have to be sufficiently knowledgeable to understand the implications, anyway. Regards Antoine. From ncoghlan at gmail.com Sun Jan 27 12:45:11 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 27 Jan 2013 21:45:11 +1000 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: <20130127122900.469ec8d9@pitrou.net> References: <20130127044804.GA2058@cskk.homeip.net> <20130127122900.469ec8d9@pitrou.net> Message-ID: On Sun, Jan 27, 2013 at 9:29 PM, Antoine Pitrou wrote: > I don't think such limitations are very useful in practice. Users > calling sys.setdefaultexec() will have to be sufficiently knowledgeable > to understand the implications, anyway. I've yet to hear a use case for being able to turn it off globally if the application developer has indicated they want it on. A global flag that can be turned off programmatically is worse than no global flag at all. If we're never going to migrate to cloexec-on-by-default, then there simply shouldn't be a global flag - the option should just default to False. 
If we're going to migrate to cloexec-on-by-default some day, then the global flag should purely be a transition strategy to allow people to try out that future behaviour and adjust their application appropriately (by clearing the flag on descriptors that really need to be inherited). The typical default that should be assumed by library code will still be cloexec-off-by-default. A completely flexible global flag is just a plain bad idea for all the reasons Charles-François gave. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From skip at pobox.com Sun Jan 27 14:58:24 2013 From: skip at pobox.com (Skip Montanaro) Date: Sun, 27 Jan 2013 07:58:24 -0600 Subject: [Python-Dev] Ctypes bug fix for Solaris - too late for 2.7.3? Message-ID: I hit a bug in the ctypes package on Solaris yesterday. Investigating, I found there is a patch: http://bugs.python.org/issue5289 I applied it and verified that when run as a main program ctypes/util.py now works. Is it too late to get this into 2.7.3? Given the nature of the patch (a new block of code only matched on Solaris, and the fact that find_library is quite broken now on that platform), it's hard to see how things would be worse after applying it. BTW, this will be a requirement for getting PyPy running on Solaris. (That's the context where I encountered it.) Thanks, Skip From benjamin at python.org Sun Jan 27 16:22:45 2013 From: benjamin at python.org (Benjamin Peterson) Date: Sun, 27 Jan 2013 10:22:45 -0500 Subject: [Python-Dev] Ctypes bug fix for Solaris - too late for 2.7.3? In-Reply-To: References: Message-ID: 2013/1/27 Skip Montanaro : > I hit a bug in the ctypes package on Solaris yesterday. > Investigating, I found there is a patch: > > http://bugs.python.org/issue5289 > > I applied it and verified that when run as a main program > ctypes/util.py now works. > > Is it too late to get this into 2.7.3? Yes, it's far too late for 2.7.3, since that was released last April.
:) I think it could go into 2.7.4, though. -- Regards, Benjamin From skip at pobox.com Sun Jan 27 16:31:46 2013 From: skip at pobox.com (Skip Montanaro) Date: Sun, 27 Jan 2013 09:31:46 -0600 Subject: [Python-Dev] Ctypes bug fix for Solaris - too late for 2.7.3? In-Reply-To: References: Message-ID: >> Is it too late to get this into 2.7.3? > > Yes, it's far too late for 2.7.3, since that was released last April. > :) I think it could go into 2.7.4, though. Whoops, sorry. I thought I had remembered some recent discussion of an upcoming 2.7 micro. Off-by-one error, or just brain freeze? :-) Skip From tseaver at palladion.com Sun Jan 27 17:09:59 2013 From: tseaver at palladion.com (Tres Seaver) Date: Sun, 27 Jan 2013 11:09:59 -0500 Subject: [Python-Dev] Ctypes bug fix for Solaris - too late for 2.7.3? In-Reply-To: References: Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 01/27/2013 10:31 AM, Skip Montanaro wrote: >>> Is it too late to get this into 2.7.3? >> >> Yes, it's far too late for 2.7.3, since that was released last >> April. :) I think it could go into 2.7.4, though. > > Whoops, sorry. I thought I had remembered some recent discussion of > an upcoming 2.7 micro. Off-by-one error, or just brain freeze? :-) The upcoming micro you recall is 2.7.4. The 2.7.3 train left the station already. ;) Tres. 
- -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iEUEARECAAYFAlEFUVcACgkQ+gerLs4ltQ6X3QCgscrJNOQo2mB5ylS97OYJ7EG/ tKYAl3YvSGLNd9NeZ6AKchreRcjNvYc= =yimz -----END PGP SIGNATURE----- From kristjan at ccpgames.com Sun Jan 27 17:18:28 2013 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_Valur_J=F3nsson?=) Date: Sun, 27 Jan 2013 16:18:28 +0000 Subject: [Python-Dev] Anyone building Python --without-doc-strings? In-Reply-To: <20130126163738.8E7B6250BC5@webabinitio.net> References: <20130126105512.GA25659@sleipnir.bytereef.org> <20130126154532.70974b34@pitrou.net> <20130126154738.GA13795@sleipnir.bytereef.org> <20130126160359.GA14047@sleipnir.bytereef.org> <20130126171932.74e792e9@pitrou.net> <20130126163738.8E7B6250BC5@webabinitio.net> Message-ID: We (CCP) are certainly compiling python without docstrings for our embedded platforms (that include the PS3) Anyone using python as an engine to be used by programs and not users will appreciate the deletion of unneeded memory. K -----Original Message----- From: Python-Dev [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] On Behalf Of R. David Murray Sent: 27. janúar 2013 00:38 To: python-dev at python.org Subject: Re: [Python-Dev] Anyone building Python --without-doc-strings?
On Sat, 26 Jan 2013 17:19:32 +0100, Antoine Pitrou wrote: > On Sat, 26 Jan 2013 17:03:59 +0100 > Stefan Krah wrote: > > Stefan Krah wrote: > > > I'm not sure how accurate the output is for measuring these > > > things, but according to ``ls'' and ``du'' the option is indeed quite worthless: > > > > > > ./configure CFLAGS="-Os -s" LDFLAGS="-s" && make > > > 1.8M Jan 26 16:36 python > > > ./configure --without-doc-strings CFLAGS="-Os -s" LDFLAGS="-s" && make > > > 1.6M Jan 26 16:33 python > > > > The original contribution *was* in fact aiming for "10% smaller", see: > > > > http://docs.python.org/release/2.3/whatsnew/node20.html > > > > So apparently people thought it was useful. > > After a bit of digging, I found the following discussions: > http://mail.python.org/pipermail/python-dev/2001-November/018444.html > http://mail.python.org/pipermail/python-dev/2002-January/019392.html > http://bugs.python.org/issue505375 > > Another reason for accepting the patch seemed to be that it introduced > the Py_DOCSTR() macros, which were viewed as helpful for other reasons > (some people talked about localizing docstrings). > > I would point out that if 200 KB is really a big win for someone, then > Python (and especially Python 3) is probably not the best language for > them. > > It is also ironic how the executable size went up since then (from 0.6 > to more than 1.5 MB) :-) 200K can make a difference. It does on the QNX platform, for example, where there is no virtual memory. 
It would be nice to reduce that executable size, too....but I'm not volunteering to try (at least not yet) :) --David _______________________________________________ Python-Dev mailing list Python-Dev at python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/kristjan%40ccpgames.com From guido at python.org Sun Jan 27 17:24:00 2013 From: guido at python.org (Guido van Rossum) Date: Sun, 27 Jan 2013 08:24:00 -0800 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: References: <20130127044804.GA2058@cskk.homeip.net> <20130127122900.469ec8d9@pitrou.net> Message-ID: On Sun, Jan 27, 2013 at 3:45 AM, Nick Coghlan wrote: > On Sun, Jan 27, 2013 at 9:29 PM, Antoine Pitrou wrote: >> I don't think such limitations are very useful in practice. Users >> calling sys.setdefaultexec() will have to be sufficiently knowledgeable >> to understand the implications, anyway. > > I've yet to hear a use case for being able to turn it off globally if > the application developer has indicated they want it on. A global flag > that can be turned off programmatically is worse than no global flag > at all. Out of context that last statement sounds pretty absurd. :-) > If we're never going to migrate to cloexec-on-by-default, then there > simply shouldn't be a global flag - the option should just default to > False. That doesn't seem to follow. We might never want to turn it on by default for reasons of compatibility with POSIX (whatever you say, POSIX semantics seep through Python in many places). But it might still be useful to be able to turn it on for specific programs (and doing that in code is a lot more convenient than having to say "run this app with python --cloexec", which sounds a real pain). 
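For readers unfamiliar with the mechanics being debated: the per-descriptor close-on-exec flag can already be toggled through fcntl on POSIX systems. This sketch shows the flag itself, not PEP 433's proposed cloexec parameter:

```python
import fcntl
import os

def set_cloexec(fd, cloexec=True):
    """Set or clear FD_CLOEXEC so fd does (not) survive an exec()."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    if cloexec:
        flags |= fcntl.FD_CLOEXEC
    else:
        flags &= ~fcntl.FD_CLOEXEC
    fcntl.fcntl(fd, fcntl.F_SETFD, flags)

r, w = os.pipe()
set_cloexec(r)  # r will be closed automatically across exec()
print(bool(fcntl.fcntl(r, fcntl.F_GETFD) & fcntl.FD_CLOEXEC))  # True
set_cloexec(r, cloexec=False)  # r is inherited by exec'ed children again
print(bool(fcntl.fcntl(r, fcntl.F_GETFD) & fcntl.FD_CLOEXEC))  # False
os.close(r)
os.close(w)
```

A global default would decide which way newly created descriptors start out; the debate above is over who, if anyone, may flip that default at runtime.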
> If we're going to migrate to cloexec-on-by-default some day, then the > global flag should purely be a transition strategy to allow people to > try out that future behaviour and adjust their application > appropriately (by clearing the flag on descriptors that really need to > be inherited). The typical default that should be assumed by library > code will still be cloexec-off-by-default. But typical library code won't care one way or another, will it? > A completely flexible global flag is just a plain bad idea for all the > reasons Charles-François gave. Honestly, what happened to the idea that we're all adults here? The likelihood that someone is clueless enough to misuse the flag and yet clueful enough to find out how to change it seems remote -- and they pretty much deserve what they get. It's like calling socket.settimeout(0.1) and then complaining that urllib.urlopen() raises exceptions, or calling sys.setrecursionlimit(1000000) when you are getting StackOverflowError. The only reason I can see why a flag should never be changeable after the fact is when it would end up violating the integrity of the interpreter. But I don't see that here. I just see paranoia about protecting users from all possible security issues. That seems a hopeless quest. That said, I'm sure I'll survive if you ignore my opinion, so I explicitly give you that option -- in the grand scheme of things I can't care that much about this and it seems a mostly theoretical issue. (And, an attacker who can exploit file descriptors given to it can probably find plenty of other attacks on a system they've broken into.) -- --Guido van Rossum (python.org/~guido) From storchaka at gmail.com Sun Jan 27 17:52:21 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 27 Jan 2013 18:52:21 +0200 Subject: [Python-Dev] Boost Software License Message-ID: Is Boost Software License [1] compatible with Python license? Can I steal some code from Boost library [2]?
[1] http://www.boost.org/LICENSE_1_0.txt [2] http://www.boost.org/ From phd at phdru.name Sun Jan 27 18:11:06 2013 From: phd at phdru.name (Oleg Broytman) Date: Sun, 27 Jan 2013 21:11:06 +0400 Subject: [Python-Dev] Boost Software License In-Reply-To: References: Message-ID: <20130127171106.GA20047@iskra.aviel.ru> On Sun, Jan 27, 2013 at 06:52:21PM +0200, Serhiy Storchaka wrote: > Is Boost Software License [1] compatible with Python license? Can I > steal some code from Boost library [2]? > > [1] http://www.boost.org/LICENSE_1_0.txt > [2] http://www.boost.org/ BSD-ish license? I think yes, it's compatible with Python license. Also I think you have to copy the license to the section "Licenses and Acknowledgements for Incorporated Software" in Doc/license.rst. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From brett at python.org Sun Jan 27 19:11:31 2013 From: brett at python.org (Brett Cannon) Date: Sun, 27 Jan 2013 13:11:31 -0500 Subject: [Python-Dev] usefulness of "extension modules" section in Misc/NEWS In-Reply-To: <20130127120240.09b869a7@pitrou.net> References: <20130127120240.09b869a7@pitrou.net> Message-ID: On Sun, Jan 27, 2013 at 6:02 AM, Antoine Pitrou wrote: > On Sun, 27 Jan 2013 10:56:51 +0100 > Charles-François Natali wrote: > > Hi, > > > > What's exactly the guideline for choosing between the "Library" and > > "Extension modules" section when updating Misc/NEWS? > > Is it just the fact that the modified files live under Lib/ or Modules/? > > > > I've frequently made a mistake when updating Misc/NEWS, and when > > looking at it, I'm not the only one. > > > > Is there really a good reason for having distinct sections? > > > > If the intended audience for this file are end users, ISTM that the > > only thing that matters is that it's a library change, the fact that > > the modification impacted Python/C code isn't really relevant. > > Agreed.
I think "we" should merge them (for some value of a motivated > we :-)). > "We" is "me" and it was a 2 line delete in 1fabff717ef4 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf at systemexit.de Sun Jan 27 19:42:59 2013 From: ralf at systemexit.de (Ralf Schmitt) Date: Sun, 27 Jan 2013 19:42:59 +0100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: (Guido van Rossum's message of "Sun, 27 Jan 2013 08:24:00 -0800") References: <20130127044804.GA2058@cskk.homeip.net> <20130127122900.469ec8d9@pitrou.net> Message-ID: <87ehh61n7w.fsf@myhost.lan> Guido van Rossum writes: > It's like calling socket.settimeout(0.1) and then complaining that > urllib.urlopen() raises exceptions but that's not what's happening. you'll see urllib.urlopen raising exceptions and only afterwards realize that you called into some third party library code that decided to change the timeout. From stefan_ml at behnel.de Sun Jan 27 21:30:23 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 27 Jan 2013 21:30:23 +0100 Subject: [Python-Dev] Boost Software License In-Reply-To: References: Message-ID: Serhiy Storchaka, 27.01.2013 17:52: > Is Boost Software License [1] compatible with Python license? Can I steal > some code from Boost library [2]? > > [1] http://www.boost.org/LICENSE_1_0.txt > [2] http://www.boost.org/ Depends on what you want to do with it after stealing it. Assuming you want to add it to CPython, two things to consider: - Isn't Boost C++ code? - Usually, people who contribute code to CPython must sign a contributors agreement. As far as I understand it, that would be those who wrote it, not those who "stole" it from them. Stefan From rdmurray at bitdance.com Sun Jan 27 21:42:08 2013 From: rdmurray at bitdance.com (R. 
David Murray) Date: Sun, 27 Jan 2013 15:42:08 -0500 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: <87ehh61n7w.fsf@myhost.lan> References: <20130127044804.GA2058@cskk.homeip.net> <20130127122900.469ec8d9@pitrou.net> <87ehh61n7w.fsf@myhost.lan> Message-ID: <20130127204209.34B80250BC3@webabinitio.net> On Sun, 27 Jan 2013 19:42:59 +0100, Ralf Schmitt wrote: > Guido van Rossum writes: > > > It's like calling socket.settimeout(0.1) and then complaining that > > urllib.urlopen() raises exceptions > > but that's not what's happening. you'll see urllib.urlopen raising > exceptions and only afterwards realize that you called into some third > party library code that decided to change the timeout. What is stopping some third party library code from calling socket.settimeout(0.1)? --David From ralf at systemexit.de Sun Jan 27 21:56:06 2013 From: ralf at systemexit.de (Ralf Schmitt) Date: Sun, 27 Jan 2013 21:56:06 +0100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: <20130127204209.34B80250BC3@webabinitio.net> (R. David Murray's message of "Sun, 27 Jan 2013 15:42:08 -0500") References: <20130127044804.GA2058@cskk.homeip.net> <20130127122900.469ec8d9@pitrou.net> <87ehh61n7w.fsf@myhost.lan> <20130127204209.34B80250BC3@webabinitio.net> Message-ID: <87a9ru1h21.fsf@myhost.lan> "R. David Murray" writes: > On Sun, 27 Jan 2013 19:42:59 +0100, Ralf Schmitt wrote: >> Guido van Rossum writes: >> >> > It's like calling socket.settimeout(0.1) and then complaining that >> > urllib.urlopen() raises exceptions >> >> but that's not what's happening. you'll see urllib.urlopen raising >> exceptions and only afterwards realize that you called into some third >> party library code that decided to change the timeout. > > What is stopping some third party library code from calling > socket.settimeout(0.1)? Nothing. That's the point.
You just wonder why urlopen fails when the global timeout has been changed by that third party library. From rdmurray at bitdance.com Sun Jan 27 22:26:46 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Sun, 27 Jan 2013 16:26:46 -0500 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: <87a9ru1h21.fsf@myhost.lan> References: <20130127044804.GA2058@cskk.homeip.net> <20130127122900.469ec8d9@pitrou.net> <87ehh61n7w.fsf@myhost.lan> <20130127204209.34B80250BC3@webabinitio.net> <87a9ru1h21.fsf@myhost.lan> Message-ID: <20130127212646.C756C250BC3@webabinitio.net> On Sun, 27 Jan 2013 21:56:06 +0100, Ralf Schmitt wrote: > "R. David Murray" writes: > > > On Sun, 27 Jan 2013 19:42:59 +0100, Ralf Schmitt wrote: > >> Guido van Rossum writes: > >> > >> > It's like calling socket.settimeout(0.1) and then complaining that > >> > urllib.urlopen() raises exceptions > >> > >> but that's not what's happening. you'll see urllib.urlopen raising > >> exceptions and only afterwards realize that you called into some third > >> party library code that decided to change the timeout. > > > > What is stopping some third party library code from calling > > socket.settimeout(0.1)? > > Nothing. That's the point. You just wonder why urlopen fails when the > global timeout has been changed by that third party library. Oh, you were agreeing with Guido? I guess I misunderstood. --David From victor.stinner at gmail.com Sun Jan 27 22:58:12 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Sun, 27 Jan 2013 22:58:12 +0100 Subject: [Python-Dev] Anyone building Python --without-doc-strings?
In-Reply-To: References: <20130126105512.GA25659@sleipnir.bytereef.org> <20130126154532.70974b34@pitrou.net> <20130126154738.GA13795@sleipnir.bytereef.org> <20130126160359.GA14047@sleipnir.bytereef.org> <20130126171932.74e792e9@pitrou.net> <20130126163738.8E7B6250BC5@webabinitio.net> Message-ID: Why don't you compile using python -OO and distribute only .pyo code? Victor 2013/1/27 Kristján Valur Jónsson : > We (CCP) are certainly compiling python without docstrings for our embedded platforms (that include the PS3) > Anyone using python as an engine to be used by programs and not users will appreciate the deletion of unneeded memory. > K > > -----Original Message----- > From: Python-Dev [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] On Behalf Of R. David Murray > Sent: 27. janúar 2013 00:38 > To: python-dev at python.org > Subject: Re: [Python-Dev] Anyone building Python --without-doc-strings? > > On Sat, 26 Jan 2013 17:19:32 +0100, Antoine Pitrou wrote: >> On Sat, 26 Jan 2013 17:03:59 +0100 >> Stefan Krah wrote: >> > Stefan Krah wrote: >> > > I'm not sure how accurate the output is for measuring these >> > > things, but according to ``ls'' and ``du'' the option is indeed quite worthless: >> > > >> > > ./configure CFLAGS="-Os -s" LDFLAGS="-s" && make >> > > 1.8M Jan 26 16:36 python >> > > ./configure --without-doc-strings CFLAGS="-Os -s" LDFLAGS="-s" && make >> > > 1.6M Jan 26 16:33 python >> > >> > The original contribution *was* in fact aiming for "10% smaller", see: >> > >> > http://docs.python.org/release/2.3/whatsnew/node20.html >> > >> > So apparently people thought it was useful.
>> >> After a bit of digging, I found the following discussions: >> http://mail.python.org/pipermail/python-dev/2001-November/018444.html >> http://mail.python.org/pipermail/python-dev/2002-January/019392.html >> http://bugs.python.org/issue505375 >> >> Another reason for accepting the patch seemed to be that it introduced >> the Py_DOCSTR() macros, which were viewed as helpful for other reasons >> (some people talked about localizing docstrings). >> >> I would point out that if 200 KB is really a big win for someone, then >> Python (and especially Python 3) is probably not the best language for >> them. >> >> It is also ironic how the executable size went up since then (from 0.6 >> to more than 1.5 MB) :-) > > 200K can make a difference. It does on the QNX platform, for example, where there is no virtual memory. It would be nice to reduce that executable size, too....but I'm not volunteering to try (at least not > yet) :) > > --David > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/kristjan%40ccpgames.com > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com From brett at python.org Mon Jan 28 00:30:24 2013 From: brett at python.org (Brett Cannon) Date: Sun, 27 Jan 2013 18:30:24 -0500 Subject: [Python-Dev] Anyone building Python --without-doc-strings? 
In-Reply-To: References: <20130126105512.GA25659@sleipnir.bytereef.org> <20130126154532.70974b34@pitrou.net> <20130126154738.GA13795@sleipnir.bytereef.org> <20130126160359.GA14047@sleipnir.bytereef.org> <20130126171932.74e792e9@pitrou.net> <20130126163738.8E7B6250BC5@webabinitio.net> Message-ID: On Sun, Jan 27, 2013 at 4:58 PM, Victor Stinner wrote: > Why don't you compile using python -OO and distribute only .pyo code? > Because .pyo files can be much larger than necessary, e.g. using my mnfy project on decimal (with --safe-transforms) compared to -OO yields:: 224K Lib/decimal.py 200K Lib/__pycache__/decimal.cpython-34.pyc 120K Lib/__pycache__/decimal.cpython-34.pyo 80K decimal-mnfy.py And before you ask, the bytecode is still larger on the minified source. So if you truly want to shrink your binary plus overall memory usage footprint, you want to go beyond -OO sometimes (although it is a cheap, fast way to save if you are not explicitly striving to eke out every byte). -Brett > > Victor > > 2013/1/27 Kristján Valur Jónsson : > > We (CCP) are certainly compiling python without docstrings for our > embedded platforms (that include the PS3) > > Anyone using python as an engine to be used by programs and not users > will appreciate the deletion of unneeded memory. > > K > > > > -----Original Message----- > > From: Python-Dev [mailto:python-dev-bounces+kristjan= > ccpgames.com at python.org] On Behalf Of R. David Murray > > Sent: 27. janúar 2013 00:38 > > To: python-dev at python.org > > Subject: Re: [Python-Dev] Anyone building Python --without-doc-strings?
> > > > On Sat, 26 Jan 2013 17:19:32 +0100, Antoine Pitrou > wrote: > >> On Sat, 26 Jan 2013 17:03:59 +0100 > >> Stefan Krah wrote: > >> > Stefan Krah wrote: > >> > > I'm not sure how accurate the output is for measuring these > >> > > things, but according to ``ls'' and ``du'' the option is indeed > quite worthless: > >> > > > >> > > ./configure CFLAGS="-Os -s" LDFLAGS="-s" && make > >> > > 1.8M Jan 26 16:36 python > >> > > ./configure --without-doc-strings CFLAGS="-Os -s" LDFLAGS="-s" && > make > >> > > 1.6M Jan 26 16:33 python > >> > > >> > The original contribution *was* in fact aiming for "10% smaller", see: > >> > > >> > http://docs.python.org/release/2.3/whatsnew/node20.html > >> > > >> > So apparently people thought it was useful. > >> > >> After a bit of digging, I found the following discussions: > >> http://mail.python.org/pipermail/python-dev/2001-November/018444.html > >> http://mail.python.org/pipermail/python-dev/2002-January/019392.html > >> http://bugs.python.org/issue505375 > >> > >> Another reason for accepting the patch seemed to be that it introduced > >> the Py_DOCSTR() macros, which were viewed as helpful for other reasons > >> (some people talked about localizing docstrings). > >> > >> I would point out that if 200 KB is really a big win for someone, then > >> Python (and especially Python 3) is probably not the best language for > >> them. > >> > >> It is also ironic how the executable size went up since then (from 0.6 > >> to more than 1.5 MB) :-) > > > > 200K can make a difference. It does on the QNX platform, for example, > where there is no virtual memory. 
It would be nice to reduce that > executable size, too....but I'm not volunteering to try (at least not > > yet) :) > > > > --David > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > http://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/kristjan%40ccpgames.com > > > > > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > http://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Mon Jan 28 10:44:16 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 28 Jan 2013 10:44:16 +0100 Subject: [Python-Dev] Boost Software License References: Message-ID: <20130128104416.706887c9@pitrou.net> Le Sun, 27 Jan 2013 21:30:23 +0100, Stefan Behnel a ?crit : > Serhiy Storchaka, 27.01.2013 17:52: > > Is Boost Software License [1] compatible with Python license? Can I > > steal some code from Boost library [2]? > > > > [1] http://www.boost.org/LICENSE_1_0.txt > > [2] http://www.boost.org/ > > Depends on what you want to do with it after stealing it. > > Assuming you want to add it to CPython, two things to consider: > > - Isn't Boost C++ code? > > - Usually, people who contribute code to CPython must sign a > contributors agreement. As far as I understand it, that would be > those who wrote it, not those who "stole" it from them. That's the only potentially contentious point. 
Otherwise, the boost license looks like a fairly ordinary non-copyleft free license, and therefore should be compatible with the PSF license. That said, we already ship "non-contributed" code such as zlib, expat or libffi; and even in the core you can find such code as David Gay's dtoa.c. Regards Antoine. From regebro at gmail.com Mon Jan 28 11:01:05 2013 From: regebro at gmail.com (Lennart Regebro) Date: Mon, 28 Jan 2013 11:01:05 +0100 Subject: [Python-Dev] PEP 431 Updates Message-ID: I sent this out on the 15th, but it seems to have gotten lost, since I have reports that it didn't show up and I got no feedback. So here goes again, a new version of PEP 431. http://www.python.org/dev/peps/pep-0431/ //Lennart

PEP: 431 Title: Time zone support improvements Version: $Revision$ Last-Modified: $Date$ Author: Lennart Regebro BDFL-Delegate: Barry Warsaw Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 11-Dec-2012 Post-History: 11-Dec-2012, 28-Dec-2012

Abstract
========

This PEP proposes the implementation of concrete time zone support in the Python standard library, and also improvements to the time zone API to deal with ambiguous time specifications during DST changes.

Proposal
========

Concrete time zone support
--------------------------

The time zone support in Python has no concrete implementation in the standard library outside of a tzinfo base class that supports fixed offsets. To properly support time zones you need to include a database of all time zones, both current and historical, including daylight saving changes. But such information changes frequently, so even if we include the latest information in a Python release, that information would be outdated just a few months later.

Time zone support has therefore only been available through two third-party modules, ``pytz`` and ``dateutil``, both of which include and wrap the "zoneinfo" database.
This database, also called "tz" or "the Olson database", is the de-facto standard database of time zones, and it is included in most Unix and Unix-like operating systems, including OS X. This gives us the opportunity to include the code that supports the zoneinfo data in the standard library, but by default use the operating system's copy of the data, which typically will be kept updated by the updating mechanism of the operating system or distribution.

For those who have an operating system that does not include the zoneinfo database, for example Windows, the Python source distribution will include a copy of the zoneinfo database, and a distribution containing the latest zoneinfo database will also be available at the Python Package Index, so it can be easily installed with Python packaging tools such as ``easy_install`` or ``pip``. This could also be done on Unices that are no longer receiving updates and therefore have an outdated database.

With such a mechanism Python would have full time zone support in the standard library on any platform, and a simple package installation would provide an updated time zone database on those platforms where the zoneinfo database isn't included, such as Windows, or on platforms where OS updates are no longer provided.

The time zone support will be implemented by making the ``datetime`` module into a package, and adding time zone support to ``datetime`` based on Stuart Bishop's ``pytz`` module.

Getting the local time zone
---------------------------

On Unix there is no standard way of finding the name of the time zone that is being used. All the information that is available is the time zone abbreviations, such as ``EST`` and ``PDT``, but many of those abbreviations are ambiguous, so you can't rely on them to figure out which time zone you are located in. There is, however, a standard for finding the compiled time zone information, which is located in ``/etc/localtime``.
Therefore it is possible to create a local time zone object with the correct time zone information even though you don't know the name of the time zone. A function in ``datetime`` should be provided to return the local time zone. The support for this will be made by integrating Lennart Regebro's ``tzlocal`` module into the new ``datetime`` module.

For Windows it will look up the local Windows time zone name, and use a mapping between Windows time zone names and zoneinfo time zone names provided by the Unicode consortium to convert that to a zoneinfo time zone. The mapping should be updated before each major or bugfix release; scripts for doing so will be provided in the ``Tools/`` directory.

Ambiguous times
---------------

When changing over from daylight savings time (DST) the clock is turned back one hour. This means that the times during that hour happen twice, once with DST and then once without DST. Similarly, when changing to daylight savings time, one hour goes missing.

The current time zone API cannot differentiate between the two ambiguous times during a change from DST. For example, in Stockholm the time 2012-10-28 02:00:00 happens twice, both at UTC 2012-10-28 00:00:00 and again at UTC 2012-10-28 01:00:00. The current time zone API cannot disambiguate this, and therefore it's unclear which time should be returned::

    # This could be either 00:00 or 01:00 UTC:
    >>> dt = datetime(2012, 10, 28, 2, 0, tzinfo=zoneinfo('Europe/Stockholm'))
    # But we cannot specify which:
    >>> dt.astimezone(zoneinfo('UTC'))
    datetime.datetime(2012, 10, 28, 1, 0, tzinfo=<UTC>)

``pytz`` solved this problem by adding ``is_dst`` parameters to several methods of the tzinfo objects to make it possible to disambiguate times when this is desired. This PEP proposes to add these ``is_dst`` parameters to the relevant methods of the ``datetime`` API, and therefore add this functionality directly to ``datetime``.
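The ambiguity itself can be demonstrated with nothing more than the fixed-offset ``timezone`` class from Python 3.2. This is only an illustrative sketch: Stockholm's UTC+1 (CET) and UTC+2 (CEST) offsets are hard-coded here rather than looked up from a zoneinfo database.

```python
from datetime import datetime, timedelta, timezone

CET = timezone(timedelta(hours=1))   # Stockholm in winter
CEST = timezone(timedelta(hours=2))  # Stockholm in summer (DST)

# A naive wall-clock time falling inside the repeated hour:
wall = datetime(2012, 10, 28, 2, 30)

# The same wall time corresponds to two different UTC instants,
# depending on whether DST is considered still in effect:
before_change = wall.replace(tzinfo=CEST).astimezone(timezone.utc)
after_change = wall.replace(tzinfo=CET).astimezone(timezone.utc)

print(before_change)  # 2012-10-28 00:30:00+00:00
print(after_change)   # 2012-10-28 01:30:00+00:00
```

Without some extra flag such as ``is_dst``, an API that only sees the wall time has no way to choose between the two instants.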
This is likely the hardest part of this PEP, as it involves updating the C version of the ``datetime`` library with this functionality; that means writing new code, not just reorganizing existing external libraries.

Implementation API
==================

The zoneinfo database
---------------------

The latest version of the zoneinfo database should exist in the ``Lib/tzdata`` directory of the Python source control system. This copy of the database should be updated before every Python feature and bug-fix release, but not for releases of Python versions that are in security-fix-only mode. Scripts to update the database will be provided in ``Tools/``, and the release instructions will be updated to include this update.

New configure options ``--enable-internal-timezone-database`` and ``--disable-internal-timezone-database`` will be implemented to enable and disable the installation of this database when installing from source. A source install will default to installing it.

Binary installers for systems that have a system-provided zoneinfo database may skip installing the included database since it would never be used for these platforms. For other platforms, for example Windows, binary installers must install the included database.

Changes in the ``datetime``-module
----------------------------------

The public API of the new time zone support contains one new class, one new function, several new exceptions and four new collections. In addition to this, several methods on the datetime object get a new ``is_dst`` parameter.

New class ``DstTzInfo``
^^^^^^^^^^^^^^^^^^^^^^^^

This class provides a concrete implementation of the ``zoneinfo`` base class that implements DST support.

New function ``zoneinfo(name=None, db_path=None)``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This function takes a name string specifying a valid zoneinfo time zone, e.g. "US/Eastern", "Europe/Warsaw" or "Etc/GMT".
If not given, the local time zone will be looked up. If an invalid zone name is given, or the local time zone can not be retrieved, the function raises `UnknownTimeZoneError`. The function also takes an optional path to the location of the zoneinfo database which should be used. If not specified, the function will look for databases in the following order: 1. Use the database in ``/usr/share/zoneinfo``, if it exists. 2. Check if the `tzdata-update` module is installed, and then use that database. 3. Use the Python-provided database in ``Lib/tzdata``. If no database is found an ``UnknownTimeZoneError`` or subclass thereof will be raised with a message explaining that no zoneinfo database can be found, but that you can install one with the ``tzdata-update`` package. New parameter ``is_dst`` ^^^^^^^^^^^^^^^^^^^^^^^^ A new ``is_dst`` parameter is added to several methods to handle time ambiguity during DST changeovers. * ``tzinfo.utcoffset(dt, is_dst=False)`` * ``tzinfo.dst(dt, is_dst=False)`` * ``tzinfo.tzname(dt, is_dst=False)`` * ``datetime.astimezone(tz, is_dst=False)`` The ``is_dst`` parameter can be ``False`` (default), ``True``, or ``None``. ``False`` will specify that the given datetime should be interpreted as not happening during daylight savings time, i.e. that the time specified is after the change from DST. ``True`` will specify that the given datetime should be interpreted as happening during daylight savings time, i.e. that the time specified is before the change from DST. ``None`` will raise an ``AmbiguousTimeError`` exception if the time specified was during a DST change over. It will also raise a ``NonExistentTimeError`` if a time is specified during the "missing time" in a change to DST. New exceptions ^^^^^^^^^^^^^^ * ``UnknownTimeZoneError`` This exception is a subclass of KeyError and raised when giving a time zone specification that can't be found:: >>> datetime.Timezone('Europe/New_York') Traceback (most recent call last): ... 
UnknownTimeZoneError: There is no time zone called 'Europe/New_York'

* ``InvalidTimeError``

This exception serves as a base for ``AmbiguousTimeError`` and ``NonExistentTimeError``, to enable you to trap these two separately. It will subclass from ValueError, so that you can catch these errors together with inputs like the 29th of February 2011.

* ``AmbiguousTimeError``

This exception is raised when giving a datetime specification that is ambiguous while setting ``is_dst`` to None::

    >>> datetime(2012, 10, 28, 2, 0, tzinfo=zoneinfo('Europe/Stockholm'), is_dst=None)
    Traceback (most recent call last):
    ...
    AmbiguousTimeError: 2012-10-28 02:00:00 is ambiguous in time zone Europe/Stockholm

* ``NonExistentTimeError``

This exception is raised when giving a datetime specification that does not exist while setting ``is_dst`` to None::

    >>> datetime(2012, 3, 25, 2, 0, tzinfo=zoneinfo('Europe/Stockholm'), is_dst=None)
    Traceback (most recent call last):
    ...
    NonExistentTimeError: 2012-03-25 02:00:00 does not exist in time zone Europe/Stockholm

New collections
^^^^^^^^^^^^^^^

* ``all_timezones`` is the exhaustive list of the time zone names that can be used, listed alphabetically.

* ``all_timezones_set`` is a set of the time zones in ``all_timezones``.

* ``common_timezones`` is a list of useful, current time zones, listed alphabetically.

* ``common_timezones_set`` is a set of the time zones in ``common_timezones``.

The ``tzdata-update``-package
-----------------------------

The zoneinfo database will be packaged for easy installation with ``easy_install``/``pip``/``buildout``. This package will not install any Python code, and will not contain any Python code except that which is needed for installation. It will be kept updated with the same tools as the internal database, but released whenever the ``zoneinfo``-database is updated, and use the same version schema.
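The tri-state ``is_dst`` behaviour specified above can be modelled in a few lines of plain Python. This is only an illustrative sketch, not the proposed implementation: the transition times for Europe/Stockholm in 2012 are hypothetical hard-coded constants, where a real implementation would read them from the zoneinfo database.

```python
from datetime import datetime, timedelta

class InvalidTimeError(ValueError): pass
class AmbiguousTimeError(InvalidTimeError): pass
class NonExistentTimeError(InvalidTimeError): pass

# Hard-coded 2012 transitions for Europe/Stockholm (for illustration only):
DST_START = datetime(2012, 3, 25, 2, 0)   # clocks jump 02:00 -> 03:00
DST_END = datetime(2012, 10, 28, 2, 0)    # clocks fall back 03:00 -> 02:00
HOUR = timedelta(hours=1)

def utcoffset(wall, is_dst=False):
    """UTC offset for a naive wall-clock time, PEP 431 style."""
    ambiguous = DST_END <= wall < DST_END + HOUR     # the repeated hour
    missing = DST_START <= wall < DST_START + HOUR   # the skipped hour
    if is_dst is None and ambiguous:
        raise AmbiguousTimeError(wall)
    if is_dst is None and missing:
        raise NonExistentTimeError(wall)
    if ambiguous or missing:
        # True/False lets the caller disambiguate instead of raising:
        return 2 * HOUR if is_dst else HOUR
    in_dst = DST_START + HOUR <= wall < DST_END
    return 2 * HOUR if in_dst else HOUR

print(utcoffset(datetime(2012, 10, 28, 2, 30), is_dst=True))   # 2:00:00
print(utcoffset(datetime(2012, 10, 28, 2, 30), is_dst=False))  # 1:00:00
```

With ``is_dst=None`` the same ambiguous input raises ``AmbiguousTimeError``, and a time in the skipped hour raises ``NonExistentTimeError``, matching the exception semantics described above.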
Differences from the ``pytz`` API ================================= * ``pytz`` has the functions ``localize()`` and ``normalize()`` to work around that ``tzinfo`` doesn't have is_dst. When ``is_dst`` is implemented directly in ``datetime.tzinfo`` they are no longer needed. * The ``timezone()`` function is called ``zoneinfo()`` to avoid clashing with the ``timezone`` class introduced in Python 3.2. * ``zoneinfo()`` will return the local time zone if called without arguments. * The class ``pytz.StaticTzInfo`` is there to provide the ``is_dst`` support for static time zones. When ``is_dst`` support is included in ``datetime.tzinfo`` it is no longer needed. * ``InvalidTimeError`` subclasses from ``ValueError``. Resources ========= * http://pytz.sourceforge.net/ * http://pypi.python.org/pypi/tzlocal * http://pypi.python.org/pypi/python-dateutil * http://unicode.org/cldr/data/common/supplemental/windowsZones.xml Copyright ========= This document has been placed in the public domain. From solipsis at pitrou.net Mon Jan 28 11:26:25 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 28 Jan 2013 11:26:25 +0100 Subject: [Python-Dev] PEP 431 Updates References: Message-ID: <20130128112625.763fff98@pitrou.net> Hi, Le Mon, 28 Jan 2013 11:01:05 +0100, Lennart Regebro a ?crit : > This function takes a name string that must be a string specifying a > valid zoneinfo time zone, i.e. "US/Eastern", "Europe/Warsaw" or > "Etc/GMT". Will non-ambiguous shorthands such as "Warsaw" and "GMT" be accepted? > The ``is_dst`` parameter can be ``False`` (default), ``True``, or > ``None``. > > ``False`` will specify that the given datetime should be interpreted > as not happening during daylight savings time, i.e. that the time > specified is after the change from DST. > > ``True`` will specify that the given datetime should be interpreted > as happening during daylight savings time, i.e. that the time > specified is before the change from DST. 
> > ``None`` will raise an ``AmbiguousTimeError`` exception if the time > specified was during a DST change over. It will also raise a > ``NonExistentTimeError`` if a time is specified during the "missing > time" in a change to DST. NonExistentTimeError can also be raised for is_dst=True and is_dst=False, no? Or am I misunderstanding the semantics? Regards Antoine. From regebro at gmail.com Mon Jan 28 12:53:26 2013 From: regebro at gmail.com (Lennart Regebro) Date: Mon, 28 Jan 2013 12:53:26 +0100 Subject: [Python-Dev] PEP 431 Updates In-Reply-To: <20130128112625.763fff98@pitrou.net> References: <20130128112625.763fff98@pitrou.net> Message-ID: On Mon, Jan 28, 2013 at 11:26 AM, Antoine Pitrou wrote: > Will non-ambiguous shorthands such as "Warsaw" and "GMT" be accepted? pytz accepts GMT, so that will work. Otherwise no. > NonExistentTimeError can also be raised for is_dst=True and > is_dst=False, no? Or am I misunderstanding the semantics? No, it will only be raised when is_dst=None. The flag is there to say what timezone the resulting time should be if it is ambiguous or missing. It is true that the times that don't exist will also not exist even if you set the flag. 2012-03-25 02:30 is a time that don't exist in Europe/Warsaw, flag or not. But it will not raise an error if you set the flag to False or True, for reasons of backwards compatibility as well as ease of handling. People don't expect that certain times don't exist, and will generally not check for those exceptions. Therefore, the normal behavior is to not raise an error, but happily keep dealing with the time as if it does exist. Only of you explicitly set it to "None" will you get an error. //Lennart From regebro at gmail.com Mon Jan 28 13:06:56 2013 From: regebro at gmail.com (Lennart Regebro) Date: Mon, 28 Jan 2013 13:06:56 +0100 Subject: [Python-Dev] PEP 431 Updates In-Reply-To: References: <20130128112625.763fff98@pitrou.net> Message-ID: On Sat, Jan 26, 2013 at 10:38 AM, Nick Coghlan wrote: > 2. 
Under "New class DstTzInfo" > > This will be a subclass of "tzinfo" rather than "zoneinfo" (which is > not a class). Given that this is a *concrete* subclass, you may want > to consider the name "DstTimezone", which would be slightly more > consistent with the existing "timezone" fixed offset subclass. > (Incidentally, *grumble* at all the existing classes in that module > without capitalized names...) Which probably mean it actually rather should be called tztimezone or something. > 6. Under "New collections" > > Why both lists and sets? Because pytz did it. But yes, you are right, an ordered set is a better solution. Baseing it on OrderedDict seems like a hack, though. I could implement a custom orderedset, of course. //Lennart From ncoghlan at gmail.com Mon Jan 28 13:31:29 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 28 Jan 2013 22:31:29 +1000 Subject: [Python-Dev] PEP 431 Updates In-Reply-To: References: <20130128112625.763fff98@pitrou.net> Message-ID: On Mon, Jan 28, 2013 at 10:06 PM, Lennart Regebro wrote: > On Sat, Jan 26, 2013 at 10:38 AM, Nick Coghlan wrote: >> 2. Under "New class DstTzInfo" >> >> This will be a subclass of "tzinfo" rather than "zoneinfo" (which is >> not a class). Given that this is a *concrete* subclass, you may want >> to consider the name "DstTimezone", which would be slightly more >> consistent with the existing "timezone" fixed offset subclass. >> (Incidentally, *grumble* at all the existing classes in that module >> without capitalized names...) > > Which probably mean it actually rather should be called tztimezone or something. "dsttimezone", "dst_timezone", "timezonedst" or "timezone_dst" are probably the possibilities most consistent with the current naming scheme. Of those "dst_timezone" is probably the best of a fairly bad bunch. >> 6. Under "New collections" >> >> Why both lists and sets? > > Because pytz did it. But yes, you are right, an ordered set is a > better solution. 
Baseing it on OrderedDict seems like a hack, though. > I could implement a custom orderedset, of course. Sets themselves have an honourable history of just being a thin wrapper around dictionaries with all the values set to None (although they're not implemented that way any more). Whether you create an actual OrderedSet class, or just expose the result of calling keys() on an OrderedDict instance is just an implementation detail, though. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Mon Jan 28 13:52:58 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 28 Jan 2013 13:52:58 +0100 Subject: [Python-Dev] PEP 431 Updates References: <20130128112625.763fff98@pitrou.net> Message-ID: <20130128135258.224526f5@pitrou.net> Le Mon, 28 Jan 2013 22:31:29 +1000, Nick Coghlan a ?crit : > > >> 6. Under "New collections" > >> > >> Why both lists and sets? > > > > Because pytz did it. But yes, you are right, an ordered set is a > > better solution. Baseing it on OrderedDict seems like a hack, > > though. I could implement a custom orderedset, of course. > > Sets themselves have an honourable history of just being a thin > wrapper around dictionaries with all the values set to None (although > they're not implemented that way any more). Whether you create an > actual OrderedSet class, or just expose the result of calling keys() > on an OrderedDict instance is just an implementation detail, though. Why the complication? Just expose a regular set and let users call sorted() if that's what they want. I'm -1 on datetime shipping its own container subclass just for the sake of looking clever. Regards Antoine. 
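For what it's worth, the expose-``keys()``-of-an-``OrderedDict`` approach quoted above can be sketched in a few lines. The zone names here are placeholder sample data; the real lists would come from the zoneinfo database.

```python
from collections import OrderedDict
from types import MappingProxyType

# Placeholder sample data, inserted in alphabetical order:
_zones = OrderedDict.fromkeys(
    ["Africa/Abidjan", "America/New_York", "Europe/Stockholm", "Europe/Warsaw"])

all_timezones = list(_zones)        # ordered, list-like access
all_timezones_set = _zones.keys()   # set-like view: O(1) membership, set ops
readonly = MappingProxyType(_zones) # read-only wrapper over the same data

print("Europe/Warsaw" in all_timezones_set)  # True
print(list(readonly) == all_timezones)       # True: iteration order preserved
```

A dict keys view already supports fast containment tests and the set operators, so no separate OrderedSet class is needed for this to give both ordered iteration and set semantics.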
From ncoghlan at gmail.com Mon Jan 28 14:12:00 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 28 Jan 2013 23:12:00 +1000 Subject: [Python-Dev] PEP 431 Updates In-Reply-To: <20130128135258.224526f5@pitrou.net> References: <20130128112625.763fff98@pitrou.net> <20130128135258.224526f5@pitrou.net> Message-ID: On Mon, Jan 28, 2013 at 10:52 PM, Antoine Pitrou wrote: > Le Mon, 28 Jan 2013 22:31:29 +1000, > Nick Coghlan a ?crit : >> >> >> 6. Under "New collections" >> >> >> >> Why both lists and sets? >> > >> > Because pytz did it. But yes, you are right, an ordered set is a >> > better solution. Baseing it on OrderedDict seems like a hack, >> > though. I could implement a custom orderedset, of course. >> >> Sets themselves have an honourable history of just being a thin >> wrapper around dictionaries with all the values set to None (although >> they're not implemented that way any more). Whether you create an >> actual OrderedSet class, or just expose the result of calling keys() >> on an OrderedDict instance is just an implementation detail, though. > > Why the complication? Just expose a regular set and let users call > sorted() if that's what they want. > I'm -1 on datetime shipping its own container subclass just for the > sake of looking clever. I was going by the fact that pytz decided to expose both, and it isn't clear whether the "fast containment test" or "lexically ordered iteration" should be the base behaviour if you're going to rely on sorted() or set() to convert between the two. Since OrderedDict can provide both, creating one internally and just exposing keys() makes sense to me (perhaps via types.MappingProxyType to make it read-only). If people really don't like using OrderedDict that way, I think the other two potentially sensible options are either using an ordinary set, or else adding collections.OrderedSet. I agree adding a new ordered set type hidden within datetime would be a poor idea. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Mon Jan 28 16:48:28 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 28 Jan 2013 07:48:28 -0800 Subject: [Python-Dev] Boost Software License In-Reply-To: <20130128104416.706887c9@pitrou.net> References: <20130128104416.706887c9@pitrou.net> Message-ID: I would check with the PSF's lawyers. That's what they're there for. Developers shouldn't be giving legal advice. On Mon, Jan 28, 2013 at 1:44 AM, Antoine Pitrou wrote: > Le Sun, 27 Jan 2013 21:30:23 +0100, > Stefan Behnel a ?crit : >> Serhiy Storchaka, 27.01.2013 17:52: >> > Is Boost Software License [1] compatible with Python license? Can I >> > steal some code from Boost library [2]? >> > >> > [1] http://www.boost.org/LICENSE_1_0.txt >> > [2] http://www.boost.org/ >> >> Depends on what you want to do with it after stealing it. >> >> Assuming you want to add it to CPython, two things to consider: >> >> - Isn't Boost C++ code? >> >> - Usually, people who contribute code to CPython must sign a >> contributors agreement. As far as I understand it, that would be >> those who wrote it, not those who "stole" it from them. > > That's the only potentially contentious point. > Otherwise, the boost license looks like a fairly ordinary non-copyleft > free license, and therefore should be compatible with the PSF license. > > That said, we already ship "non-contributed" code such as zlib, expat > or libffi; and even in the core you can find such code such as David > Gay's dtoa.c. > > Regards > > Antoine. 
> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org -- --Guido van Rossum (python.org/~guido) From zeal_goswami at yahoo.com Sun Jan 27 07:31:28 2013 From: zeal_goswami at yahoo.com (abhishek goswami) Date: Sun, 27 Jan 2013 14:31:28 +0800 (SGT) Subject: [Python-Dev] 32 bit to 64 bit migration of C-API(Python) In-Reply-To: References: <1359265342.72980.YahooMailNeo@web193002.mail.sg3.yahoo.com> Message-ID: <1359268288.10441.YahooMailNeo@web193001.mail.sg3.yahoo.com> Hi Nick, Thanks a lot for your quick responce ? Thanks Abhishek Goswami Bangalore Phone No -07829580867/9962270999 Skype : abhishekgoswami1 ________________________________ From: Nick Coghlan To: abhishek goswami Cc: "python-dev at python.org" Sent: Sunday, 27 January 2013 11:46 AM Subject: Re: [Python-Dev] 32 bit to 64 bit migration of C-API(Python) On Sun, Jan 27, 2013 at 3:42 PM, abhishek goswami wrote: > Hi, > I am working in Python-C ApI (Integrated Project). The Project has migrated > to 64 bit machine from 32 bit machine. Every thing was working fine in 32 > bit machine. > But we getting following error in 64 bit machine. > " Python int too large to convert to C long". Hi Abhishek, python-dev is for the development *of* CPython, rather than development *with* CPython (including the C API). More appropriate places for this question are the general python-list mailing list or the C API special interest group. However, from the sounds of it, you are running into trouble with the fact that on a 64-bit machine, Python stores many internal values as "Py_ssize_t" (which is always 8 bytes on a 64 bit machine), rather than a C int or long. PEP 353 [1] describes the background for this change, which allows Python containers to exploit the full size of the address space on 64-bit systems. 
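The size mismatch described above can be observed from pure Python with the ``struct`` module. This is a sketch that assumes a 64-bit build; the native ``'n'`` format code for Py_ssize_t requires Python 3.3+.

```python
import struct

# Native sizes on this build: Py_ssize_t ('n') is pointer-sized, while a
# C int ('i') is typically still 4 bytes even on 64-bit platforms.
print(struct.calcsize("i"), struct.calcsize("l"), struct.calcsize("n"))

big = 2 ** 40  # fits in a Py_ssize_t on 64-bit, but not in a 4-byte C int
try:
    struct.pack("i", big)
except struct.error as exc:
    print("C int overflow:", exc)

print(len(struct.pack("n", big)))  # packs fine into a Py_ssize_t-sized slot
```

Errors like "Python int too large to convert to C long" come from the same kind of narrowing conversion happening inside a C API call.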
C API functions that may be of interest in that context include: http://docs.python.org/2/c-api/number.html#PyNumber_AsSsize_t http://docs.python.org/2/c-api/int.html#PyInt_FromSsize_t http://docs.python.org/2/c-api/int.html#PyInt_AsSsize_t http://docs.python.org/2/c-api/long.html#PyLong_FromSsize_t http://docs.python.org/2/c-api/long.html#PyLong_AsSsize_t As well as the "n" format character for argument processing http://docs.python.org/2/c-api/arg.html#parsing-arguments-and-building-values [1] http://www.python.org/dev/peps/pep-0353/ Regards, Nick. -- Nick Coghlan? |? ncoghlan at gmail.com? |? Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf at systemexit.de Mon Jan 28 22:27:54 2013 From: ralf at systemexit.de (Ralf Schmitt) Date: Mon, 28 Jan 2013 22:27:54 +0100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: <20130127212646.C756C250BC3@webabinitio.net> (R. David Murray's message of "Sun, 27 Jan 2013 16:26:46 -0500") References: <20130127044804.GA2058@cskk.homeip.net> <20130127122900.469ec8d9@pitrou.net> <87ehh61n7w.fsf@myhost.lan> <20130127204209.34B80250BC3@webabinitio.net> <87a9ru1h21.fsf@myhost.lan> <20130127212646.C756C250BC3@webabinitio.net> Message-ID: <87y5fdm205.fsf@brainbot.com> "R. David Murray" writes: > On Sun, 27 Jan 2013 21:56:06 +0100, Ralf Schmitt wrote: >> "R. David Murray" writes: >> >> > On Sun, 27 Jan 2013 19:42:59 +0100, Ralf Schmitt wrote: >> >> Guido van Rossum writes: >> >> >> >> > It's like calling socket.settimeout(0.1) and then complaining that >> >> > urllib.urlopen() raises exceptions >> >> >> >> but that's not what's happening. you'll see urllib.urlopen raising >> >> exceptions and only afterwards realize that you called into some third >> >> party library code that decided to change the timeout. >> > >> > What is stopping some some third party library code from calling >> > socket.settimeout(0.1)? >> >> Nothing. 
That's the point. You just wonder why urlopen fails when the >> global timeout has been changed by that third party library. > > Oh, you were agreeing with Guido? I guess I misunderstood. no. I think it's rather surprising if your code depends on some global variable that might change by calling into some third party code. From guido at python.org Mon Jan 28 22:33:24 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 28 Jan 2013 13:33:24 -0800 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: <87y5fdm205.fsf@brainbot.com> References: <20130127044804.GA2058@cskk.homeip.net> <20130127122900.469ec8d9@pitrou.net> <87ehh61n7w.fsf@myhost.lan> <20130127204209.34B80250BC3@webabinitio.net> <87a9ru1h21.fsf@myhost.lan> <20130127212646.C756C250BC3@webabinitio.net> <87y5fdm205.fsf@brainbot.com> Message-ID: Yeah, so the answer to all this is that 3rd party libraries know better than to mess with global settings. On Mon, Jan 28, 2013 at 1:27 PM, Ralf Schmitt wrote: > "R. David Murray" writes: > > > On Sun, 27 Jan 2013 21:56:06 +0100, Ralf Schmitt > wrote: > >> "R. David Murray" writes: > >> > >> > On Sun, 27 Jan 2013 19:42:59 +0100, Ralf Schmitt > wrote: > >> >> Guido van Rossum writes: > >> >> > >> >> > It's like calling socket.settimeout(0.1) and then complaining that > >> >> > urllib.urlopen() raises exceptions > >> >> > >> >> but that's not what's happening. you'll see urllib.urlopen raising > >> >> exceptions and only afterwards realize that you called into some > third > >> >> party library code that decided to change the timeout. > >> > > >> > What is stopping some some third party library code from calling > >> > socket.settimeout(0.1)? > >> > >> Nothing. That's the point. You just wonder why urlopen fails when the > >> global timeout has been changed by that third party library. > > > > Oh, you were agreeing with Guido? I guess I misunderstood. > > no. 
I think it's rather surprising if your code depends on some global > variable that might change by calling into some third party code. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL:
From ralf at systemexit.de Mon Jan 28 22:56:06 2013 From: ralf at systemexit.de (Ralf Schmitt) Date: Mon, 28 Jan 2013 22:56:06 +0100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: (Guido van Rossum's message of "Mon, 28 Jan 2013 13:33:24 -0800") References: <20130127044804.GA2058@cskk.homeip.net> <20130127122900.469ec8d9@pitrou.net> <87ehh61n7w.fsf@myhost.lan> <20130127204209.34B80250BC3@webabinitio.net> <87a9ru1h21.fsf@myhost.lan> <20130127212646.C756C250BC3@webabinitio.net> <87y5fdm205.fsf@brainbot.com> Message-ID: <87txq1m0p5.fsf@brainbot.com> Guido van Rossum writes: > Yeah, so the answer to all this is that 3rd party libraries know better > than to mess with global settings. Right. But why make it configurable at runtime then? If you're changing the value, then you're the one probably breaking third party code. From steve at pearwood.info Mon Jan 28 23:17:46 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 29 Jan 2013 09:17:46 +1100 Subject: [Python-Dev] PEP 431 Updates In-Reply-To: <20130128135258.224526f5@pitrou.net> References: <20130128112625.763fff98@pitrou.net> <20130128135258.224526f5@pitrou.net> Message-ID: <5106F90A.60408@pearwood.info> On 28/01/13 23:52, Antoine Pitrou wrote: > Le Mon, 28 Jan 2013 22:31:29 +1000, > Nick Coghlan a écrit : >> >>>> 6. Under "New collections" >>>> >>>> Why both lists and sets? >>> >>> Because pytz did it. But yes, you are right, an ordered set is a >>> better solution. Baseing it on OrderedDict seems like a hack, >>> though. I could implement a custom orderedset, of course. >> >> Sets themselves have an honourable history of just being a thin >> wrapper around dictionaries with all the values set to None (although >> they're not implemented that way any more).
Whether you create an >> actual OrderedSet class, or just expose the result of calling keys() >> on an OrderedDict instance is just an implementation detail, though. > Why the complication? Just expose a regular set and let users call > sorted() if that's what they want. An OrderedSet is not a sorted set. An OrderedSet, like an OrderedDict, remembers *insertion order*, it does not automatically sort the keys. So if datetime needs an ordered set, and I have no opinion on whether or not it really does, calling sorted() on a regular set is not the solution. -- Steven From regebro at gmail.com Tue Jan 29 00:28:55 2013 From: regebro at gmail.com (Lennart Regebro) Date: Tue, 29 Jan 2013 00:28:55 +0100 Subject: [Python-Dev] PEP 431 Updates In-Reply-To: <5106F90A.60408@pearwood.info> References: <20130128112625.763fff98@pitrou.net> <20130128135258.224526f5@pitrou.net> <5106F90A.60408@pearwood.info> Message-ID: On Mon, Jan 28, 2013 at 11:17 PM, Steven D'Aprano wrote: > On 28/01/13 23:52, Antoine Pitrou wrote: >> >> Le Mon, 28 Jan 2013 22:31:29 +1000, >> Nick Coghlan a écrit : >>> >>> >>>>> 6. Under "New collections" >>>>> >>>>> Why both lists and sets? >>>> >>>> >>>> Because pytz did it. But yes, you are right, an ordered set is a >>>> better solution. Baseing it on OrderedDict seems like a hack, >>>> though. I could implement a custom orderedset, of course. >>> >>> >>> Sets themselves have an honourable history of just being a thin >>> wrapper around dictionaries with all the values set to None (although >>> they're not implemented that way any more). Whether you create an >>> actual OrderedSet class, or just expose the result of calling keys() >>> on an OrderedDict instance is just an implementation detail, though. >> >> >> Why the complication? Just expose a regular set and let users call >> sorted() if that's what they want. > > An OrderedSet is not a sorted set.
> > An OrderedSet, like an OrderedDict, remembers *insertion order*, it does > not automatically sort the keys. So if datetime needs an ordered set, and > I have no opinion on whether or not it really does, calling sorted() on a > regular set is not the solution. I think the use case for needing really quick checks of whether a string is in the list of recognized timezones is rather unusual. What is usually needed is a sorted list of valid time zone names. I'll drop the set. //Lennart From guido at python.org Tue Jan 29 00:30:29 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 28 Jan 2013 15:30:29 -0800 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: <87txq1m0p5.fsf@brainbot.com> References: <20130127044804.GA2058@cskk.homeip.net> <20130127122900.469ec8d9@pitrou.net> <87ehh61n7w.fsf@myhost.lan> <20130127204209.34B80250BC3@webabinitio.net> <87a9ru1h21.fsf@myhost.lan> <20130127212646.C756C250BC3@webabinitio.net> <87y5fdm205.fsf@brainbot.com> <87txq1m0p5.fsf@brainbot.com> Message-ID: Sigh. This is getting exasperating. There's other code that might want to change this besides 3rd party library code. E.g. app configuration code. On Mon, Jan 28, 2013 at 1:56 PM, Ralf Schmitt wrote: > Guido van Rossum writes: > > > Yeah, so the answer to all this is that 3rd party libraries know better > > than to mess with global settings. > > Right. But why make it configurable at runtime then? If you're changing > the value, then you're the one probably breaking third party code. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ralf at systemexit.de Tue Jan 29 01:17:35 2013 From: ralf at systemexit.de (Ralf Schmitt) Date: Tue, 29 Jan 2013 01:17:35 +0100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: (Guido van Rossum's message of "Mon, 28 Jan 2013 15:30:29 -0800") References: <20130127044804.GA2058@cskk.homeip.net> <20130127122900.469ec8d9@pitrou.net> <87ehh61n7w.fsf@myhost.lan> <20130127204209.34B80250BC3@webabinitio.net> <87a9ru1h21.fsf@myhost.lan> <20130127212646.C756C250BC3@webabinitio.net> <87y5fdm205.fsf@brainbot.com> <87txq1m0p5.fsf@brainbot.com> Message-ID: <87wquw7sgw.fsf@myhost.lan> Guido van Rossum writes: > > On Mon, Jan 28, 2013 at 1:56 PM, Ralf Schmitt wrote: > >> Guido van Rossum writes: >> >> > Yeah, so the answer to all this is that 3rd party libraries know better >> > than to mess with global settings. >> >> Right. But why make it configurable at runtime then? If you're changing >> the value, then you're the one probably breaking third party code. > > Sigh. This is getting exasperating. There's other code that might want to > change this besides 3rd party library code. E.g. app configuration code. So, third party library code should know better, while at the same time it's fine to mess with global settings from app configuration code. That does not make sense. 
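The settimeout confusion being argued over here is easy to reproduce. A minimal sketch (not from the thread; `library_make_socket` is a made-up name, and no connection is attempted): a library creates sockets without pinning a timeout, so its behaviour silently depends on whoever last touched the process-wide default.

```python
import socket

def library_make_socket():
    # Library code: creates a socket without pinning a timeout, so it
    # silently picks up the process-wide default at creation time.
    return socket.socket(socket.AF_INET, socket.SOCK_STREAM)

s1 = library_make_socket()
print(s1.gettimeout())          # None: the usual blocking default

# Elsewhere, unrelated third party code changes the global default...
socket.setdefaulttimeout(0.1)

s2 = library_make_socket()
print(s2.gettimeout())          # 0.1: the library's behaviour changed
s1.close()
s2.close()
socket.setdefaulttimeout(None)  # restore the process-wide default
```

This is exactly Ralf's complaint: nothing in the library's own code changed, yet its sockets now time out.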
From ethan at stoneleaf.us Tue Jan 29 02:17:16 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 28 Jan 2013 17:17:16 -0800 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: <87wquw7sgw.fsf@myhost.lan> References: <20130127044804.GA2058@cskk.homeip.net> <20130127122900.469ec8d9@pitrou.net> <87ehh61n7w.fsf@myhost.lan> <20130127204209.34B80250BC3@webabinitio.net> <87a9ru1h21.fsf@myhost.lan> <20130127212646.C756C250BC3@webabinitio.net> <87y5fdm205.fsf@brainbot.com> <87txq1m0p5.fsf@brainbot.com> <87wquw7sgw.fsf@myhost.lan> Message-ID: <5107231C.3000408@stoneleaf.us> On 01/28/2013 04:17 PM, Ralf Schmitt wrote: > Guido van Rossum writes: >> On Mon, Jan 28, 2013 at 1:56 PM, Ralf Schmitt wrote: >>> Guido van Rossum writes: >>> >>>> Yeah, so the answer to all this is that 3rd party libraries know better >>>> than to mess with global settings. >>> >>> Right. But why make it configurable at runtime then? If you're changing >>> the value, then you're the one probably breaking third party code. >> >> Sigh. This is getting exasperating. There's other code that might want to >> change this besides 3rd party library code. E.g. app configuration code. > > So, third party library code should know better, while at the same time > it's fine to mess with global settings from app configuration code. > > That does not make sense. Library code should not be relying on global settings that can change. Library code should be explicit in its calls so that the current value of a global setting is irrelevant. ~Ethan~ P.S. Unless, of course, the library is meant to work with that global, and can work correctly with all of the global's valid settings.
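Ethan's rule of thumb — be explicit so that the global becomes irrelevant — looks like this in the settimeout case (a sketch; `library_socket` is a made-up name):

```python
import socket

def library_socket(timeout=30.0):
    # Pin the timeout explicitly: the socket now behaves the same no
    # matter what socket.setdefaulttimeout() was set to elsewhere.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    return s

socket.setdefaulttimeout(0.1)   # a "hostile" global change elsewhere
s = library_socket()
print(s.gettimeout())           # 30.0, unaffected by the global default
s.close()
socket.setdefaulttimeout(None)  # restore the process-wide default
```

The cost, as Ralf notes later in the thread, is that every call site has to repeat the explicit value.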
From cf.natali at gmail.com Tue Jan 29 07:13:39 2013 From: cf.natali at gmail.com (Charles-François Natali) Date: Tue, 29 Jan 2013 07:13:39 +0100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: <5107231C.3000408@stoneleaf.us> References: <20130127044804.GA2058@cskk.homeip.net> <20130127122900.469ec8d9@pitrou.net> <87ehh61n7w.fsf@myhost.lan> <20130127204209.34B80250BC3@webabinitio.net> <87a9ru1h21.fsf@myhost.lan> <20130127212646.C756C250BC3@webabinitio.net> <87y5fdm205.fsf@brainbot.com> <87txq1m0p5.fsf@brainbot.com> <87wquw7sgw.fsf@myhost.lan> <5107231C.3000408@stoneleaf.us> Message-ID: > Library code should not be relying on global settings that can change. > Library code should be explicit in its calls so that the current value of a > global setting is irrelevant. That's one of the problems I've raised with this global flag since the beginning: it's useless for libraries, including the stdlib (and, as a reminder, this PEP started out of a bug report against socket inheritance in socketserver). And once again, it's a hidden global variable, so you won't be able to tell any more what this code does:
"""
r, w = os.pipe()
if os.fork() == 0:
    os.close(w)
    os.execv('myprog', ['myprog'])
"""
Furthermore, if the above code is part of a library, and relies upon 'r' FD inheritance, it will break if the user sets the global cloexec flag. And the fact that a library relies upon FD inheritance is an implementation detail, the users shouldn't have to wonder whether enabling a global flag (in their code, not in a library) will break a given library: the only alternative for such code to continue working would be to pass cloexec=False explicitly to os.pipe()... The global socket.settimeout() is IMO a bad idea, and shouldn't be emulated. So I'm definitely -1 against any form of tunable value (be it a sys.setdefaultcloexec(), an environment variable or command-line flag), and still against changing the default value.
But I promise that's the last time I'm bringing those arguments up, and I perfectly admit that some people want it as much as I don't want it :-) cf From solipsis at pitrou.net Tue Jan 29 08:12:21 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 29 Jan 2013 08:12:21 +0100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter References: <20130127044804.GA2058@cskk.homeip.net> <20130127122900.469ec8d9@pitrou.net> <87ehh61n7w.fsf@myhost.lan> <20130127204209.34B80250BC3@webabinitio.net> <87a9ru1h21.fsf@myhost.lan> <20130127212646.C756C250BC3@webabinitio.net> <87y5fdm205.fsf@brainbot.com> <87txq1m0p5.fsf@brainbot.com> <87wquw7sgw.fsf@myhost.lan> Message-ID: <20130129081221.6c7120bb@pitrou.net> On Tue, 29 Jan 2013 01:17:35 +0100 Ralf Schmitt wrote: > Guido van Rossum writes: > > > > > On Mon, Jan 28, 2013 at 1:56 PM, Ralf Schmitt wrote: > > > >> Guido van Rossum writes: > >> > >> > Yeah, so the answer to all this is that 3rd party libraries know better > >> > than to mess with global settings. > >> > >> Right. But why make it configurable at runtime then? If you're changing > >> the value, then you're the one probably breaking third party code. > > > > Sigh. This is getting exasperating. There's other code that might want to > > change this besides 3rd party library code. E.g. app configuration code. > > So, third party library code should know better, while at the same time > it's fine to mess with global settings from app configuration code. Yes, it's fine, because an application developer can often control (or at least study) the behaviour of all the code involved. It's the same story as with logging configuration and similar process-wide settings. Libraries shouldn't mess with it but the top-level application definitely can (and should, even). (and if you think many third-party libraries call fork()+exec() as part of their normal duty, then I've got a bridge to sell you) Regards Antoine.
From victor.stinner at gmail.com Tue Jan 29 09:20:34 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 29 Jan 2013 09:20:34 +0100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: <5107231C.3000408@stoneleaf.us> References: <20130127044804.GA2058@cskk.homeip.net> <20130127122900.469ec8d9@pitrou.net> <87ehh61n7w.fsf@myhost.lan> <20130127204209.34B80250BC3@webabinitio.net> <87a9ru1h21.fsf@myhost.lan> <20130127212646.C756C250BC3@webabinitio.net> <87y5fdm205.fsf@brainbot.com> <87txq1m0p5.fsf@brainbot.com> <87wquw7sgw.fsf@myhost.lan> <5107231C.3000408@stoneleaf.us> Message-ID: > Library code should not be relying on global settings that can change. Did you try my implementation of PEP 433 on your project? Did you review my patch implementing PEP 433? http://hg.python.org/features/pep-433 http://bugs.python.org/issue17036 I expected more changes, whereas only very few modules of the stdlib need changes to support both modes (cloexec=False or cloexec=True by default). I made 5 different types of changes. Only the change (3) gives you an idea of how many modules must be fixed to support cloexec=True by default: 4 modules (http.server, multiprocessing, pty, subprocess) on a total of something like 200 modules (stdlib). And I'm not sure that all these modules need to be modified, I have to check again. By the way, I chose to not consider file descriptors 0, 1 and 2 as special: the developer must disable cloexec explicitly for standard streams. If we consider them as special, fewer (or no) modules would require changes. -- (1) add a cloexec parameter, modified modules: * io * asyncore * socket (2) explicitly enable cloexec: usually for security reasons, don't leak a file descriptor in a child process => it's not directly related to this PEP, I should maybe do this in a different commit. Said differently: code works with cloexec enabled or disabled, but I consider that enabled is safer.
* cgi * getpass * importlib * logging * os * pty * pydoc * shutil (3) explicitly disable cloexec: code doesn't work if cloexec is set (code relies on file descriptor inheritance); modified modules: * http.server: CGI uses dup2() to replace stdin and stdout * multiprocessing: stdin is replaced with /dev/null * pty: dup2() to replace stdin, stdout, stderr * subprocess: dup2() to replace stdin, stdout, stderr (4) refactoring to use the new cloexec parameter: * tempfile (5) apply the default value of the cloexec parameter (in C modules): * curses * mmap * oss * select * random Victor From ralf at systemexit.de Tue Jan 29 09:35:40 2013 From: ralf at systemexit.de (Ralf Schmitt) Date: Tue, 29 Jan 2013 09:35:40 +0100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: <20130129081221.6c7120bb@pitrou.net> (Antoine Pitrou's message of "Tue, 29 Jan 2013 08:12:21 +0100") References: <20130127044804.GA2058@cskk.homeip.net> <20130127122900.469ec8d9@pitrou.net> <87ehh61n7w.fsf@myhost.lan> <20130127204209.34B80250BC3@webabinitio.net> <87a9ru1h21.fsf@myhost.lan> <20130127212646.C756C250BC3@webabinitio.net> <87y5fdm205.fsf@brainbot.com> <87txq1m0p5.fsf@brainbot.com> <87wquw7sgw.fsf@myhost.lan> <20130129081221.6c7120bb@pitrou.net> Message-ID: <87y5fcidyb.fsf@myhost.lan> Antoine Pitrou writes: > Yes, it's fine, because an application developer can often control (or > at least study) the behaviour of all the code involved. I'd rather not have to check if some library messes with that global setting and work around it if it does! The fact that you can control and work around a global setting that may change isn't a reason to introduce that global setting and the increased complexity that comes with it. > > It's the same story as with logging configuration and similar > process-wide settings. Libraries shouldn't mess with it but the > top-level application definitely can (and should, even).
That's a bad comparison, because the situation is quite different with the logging module. Configuration of the logging module does not change the behaviour for code calling into the logging module (well, besides log output being produced somewhere, but that doesn't matter for the caller). From solipsis at pitrou.net Tue Jan 29 10:48:56 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 29 Jan 2013 10:48:56 +0100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter References: <20130127044804.GA2058@cskk.homeip.net> <20130127122900.469ec8d9@pitrou.net> <87ehh61n7w.fsf@myhost.lan> <20130127204209.34B80250BC3@webabinitio.net> <87a9ru1h21.fsf@myhost.lan> <20130127212646.C756C250BC3@webabinitio.net> <87y5fdm205.fsf@brainbot.com> <87txq1m0p5.fsf@brainbot.com> <87wquw7sgw.fsf@myhost.lan> <20130129081221.6c7120bb@pitrou.net> <87y5fcidyb.fsf@myhost.lan> Message-ID: <20130129104856.20ea5bf8@pitrou.net> Le Tue, 29 Jan 2013 09:35:40 +0100, Ralf Schmitt a écrit : > Antoine Pitrou writes: > > > Yes, it's fine, because an application developer can often control > > (or at least study) the behaviour of all the code involved. > > I'd rather not have to check if some library messes with that global > setting and work around it if it does! Then just don't try changing the flag. Problem solved. Regards Antoine.
From ralf at systemexit.de Tue Jan 29 11:24:49 2013 From: ralf at systemexit.de (Ralf Schmitt) Date: Tue, 29 Jan 2013 11:24:49 +0100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: <20130129104856.20ea5bf8@pitrou.net> (Antoine Pitrou's message of "Tue, 29 Jan 2013 10:48:56 +0100") References: <20130127044804.GA2058@cskk.homeip.net> <20130127122900.469ec8d9@pitrou.net> <87ehh61n7w.fsf@myhost.lan> <20130127204209.34B80250BC3@webabinitio.net> <87a9ru1h21.fsf@myhost.lan> <20130127212646.C756C250BC3@webabinitio.net> <87y5fdm205.fsf@brainbot.com> <87txq1m0p5.fsf@brainbot.com> <87wquw7sgw.fsf@myhost.lan> <20130129081221.6c7120bb@pitrou.net> <87y5fcidyb.fsf@myhost.lan> <20130129104856.20ea5bf8@pitrou.net> Message-ID: <8738xkjngu.fsf@myhost.lan> Antoine Pitrou writes: > Le Tue, 29 Jan 2013 09:35:40 +0100, > Ralf Schmitt a écrit : >> >> I'd rather not have to check if some library messes with that global >> setting and work around it if it does! > > Then just don't try changing the flag. Problem solved. I was talking about some library changing the flag. Not changing the flag myself doesn't help in that case. I'll probably have to explicitly pass a cloexec argument to each fd-creating function I'm calling in order to not be dependent on the global setting. Please just acknowledge that having a global configurable setting may lead to problems. Why is sys.setdefaultencoding being hidden in Python 2?
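What Ralf describes — explicitly pinning the flag on every descriptor — can only be spelled with fcntl in Python 3.3, since the PEP's proposed os.set_cloexec() does not exist yet. A UNIX-only sketch of such a helper (`set_cloexec` here is a stand-in for the proposed function, not a real stdlib API):

```python
import fcntl
import os

def set_cloexec(fd, cloexec=True):
    # Read-modify-write FD_CLOEXEC. Note this is the non-atomic
    # spelling: another thread could fork()+exec() in between.
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    if cloexec:
        flags |= fcntl.FD_CLOEXEC
    else:
        flags &= ~fcntl.FD_CLOEXEC
    fcntl.fcntl(fd, fcntl.F_SETFD, flags)

r, w = os.pipe()
set_cloexec(r)                  # pinned: immune to any global default
set_cloexec(w, cloexec=False)   # explicitly inheritable
print(bool(fcntl.fcntl(r, fcntl.F_GETFD) & fcntl.FD_CLOEXEC))  # True
print(bool(fcntl.fcntl(w, fcntl.F_GETFD) & fcntl.FD_CLOEXEC))  # False
os.close(r)
os.close(w)
```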
From solipsis at pitrou.net Tue Jan 29 11:53:01 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 29 Jan 2013 11:53:01 +0100 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter References: <20130127044804.GA2058@cskk.homeip.net> <20130127122900.469ec8d9@pitrou.net> <87ehh61n7w.fsf@myhost.lan> <20130127204209.34B80250BC3@webabinitio.net> <87a9ru1h21.fsf@myhost.lan> <20130127212646.C756C250BC3@webabinitio.net> <87y5fdm205.fsf@brainbot.com> <87txq1m0p5.fsf@brainbot.com> <87wquw7sgw.fsf@myhost.lan> <20130129081221.6c7120bb@pitrou.net> <87y5fcidyb.fsf@myhost.lan> <20130129104856.20ea5bf8@pitrou.net> <8738xkjngu.fsf@myhost.lan> Message-ID: <20130129115301.2d6c8d4a@pitrou.net> Le Tue, 29 Jan 2013 11:24:49 +0100, Ralf Schmitt a écrit : > Please just acknowledge that having a global configurable setting may > lead to problems. Ralf, I won't "acknowledge" anything just so that it makes you feel better. Your point has been made and has also been rebutted. Let's agree to disagree and move along. Regards Antoine. From ncoghlan at gmail.com Tue Jan 29 12:46:09 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 29 Jan 2013 21:46:09 +1000 Subject: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter In-Reply-To: References: <20130127044804.GA2058@cskk.homeip.net> <20130127122900.469ec8d9@pitrou.net> <87ehh61n7w.fsf@myhost.lan> <20130127204209.34B80250BC3@webabinitio.net> <87a9ru1h21.fsf@myhost.lan> <20130127212646.C756C250BC3@webabinitio.net> <87y5fdm205.fsf@brainbot.com> <87txq1m0p5.fsf@brainbot.com> <87wquw7sgw.fsf@myhost.lan> <5107231C.3000408@stoneleaf.us> Message-ID: On Tue, Jan 29, 2013 at 4:13 PM, Charles-François Natali wrote: > So I'm definitely -1 against any form of tunable value (be it a > sys.setdefaultcloexec(), an environment variable or command-line > flag), and still against changing the default value.
I now think the conservative option is to initially implement the design in 3.4 without an option to change the global default. The "cloexec=True/False" model at least *supports* a tunable default, as well as eventually *changing* the default if we choose to do so, but I don't think there's any hurry to proceed to those steps. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at gmail.com Tue Jan 29 18:00:44 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 29 Jan 2013 18:00:44 +0100 Subject: [Python-Dev] PEP 433: second try Message-ID: Hi, Here is a new version of my PEP 433. The default value of the cloexec parameter is now configurable (cmdline option, env var, sys.setdefaultcloexec), it can be disabled using sys.setdefaultcloexec(False). HTML version: http://python.org/dev/peps/pep-0433/ Implementation: http://bugs.python.org/issue17036 My last question is on the implementation: should subprocess.Popen clear the close-on-exec flag on file descriptors passed to pass_fds? It makes sense if cloexec is True by default. -- PEP: 433 Title: Easier suppression of file descriptor inheritance Version: $Revision$ Last-Modified: $Date$ Author: Victor Stinner Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 10-January-2013 Python-Version: 3.4 Abstract ======== Add a new optional *cloexec* parameter on functions creating file descriptors, add different ways to change default values of this parameter, and add four new functions: * ``os.get_cloexec(fd)`` * ``os.set_cloexec(fd, cloexec=True)`` * ``sys.getdefaultcloexec()`` * ``sys.setdefaultcloexec(cloexec=True)`` Rationale ========= A file descriptor has a close-on-exec flag which indicates if the file descriptor will be inherited or not. On UNIX, if the close-on-exec flag is set, the file descriptor is not inherited: it will be closed at the execution of child processes; otherwise the file descriptor is inherited by child processes.
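The UNIX behaviour described above can be observed directly. A sketch using os.set_inheritable(), the spelling that eventually shipped in Python 3.4 rather than this PEP's cloexec parameter; POSIX-only, and the child simply probes which descriptor numbers survived exec():

```python
import os
import subprocess
import sys

r1, w1 = os.pipe()
r2, w2 = os.pipe()
os.set_inheritable(r1, True)    # close-on-exec cleared: survives exec()
os.set_inheritable(r2, False)   # close-on-exec set: closed at exec()

# The child checks which descriptors are still open after exec().
child = (
    "import os, sys\n"
    "def alive(fd):\n"
    "    try:\n"
    "        os.fstat(fd)\n"
    "        return True\n"
    "    except OSError:\n"
    "        return False\n"
    "print(alive(int(sys.argv[1])), alive(int(sys.argv[2])))\n"
)
out = subprocess.check_output(
    [sys.executable, "-c", child, str(r1), str(r2)],
    close_fds=False)            # let the close-on-exec flags decide
print(out.decode().strip())     # expected: "True False"
for fd in (r1, w1, r2, w2):
    os.close(fd)
```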
On Windows, if the close-on-exec flag is set, the file descriptor is not inherited; the file descriptor is inherited by child processes if the close-on-exec flag is cleared and if ``CreateProcess()`` is called with the *bInheritHandles* parameter set to ``TRUE`` (when ``subprocess.Popen`` is created with ``close_fds=False`` for example). Windows does not have a "close-on-exec" flag but an inheritance flag, which is just the opposite value. For example, setting the close-on-exec flag means clearing the ``HANDLE_FLAG_INHERIT`` flag of a handle. Status in Python 3.3 -------------------- On UNIX, the subprocess module closes file descriptors greater than 2 by default since Python 3.2 [#subprocess_close]_. All file descriptors created by the parent process are automatically closed in the child process. ``xmlrpc.server.SimpleXMLRPCServer`` sets the close-on-exec flag of the listening socket, the parent class ``socketserver.TCPServer`` does not set this flag. There are other cases creating a subprocess or executing a new program where file descriptors are not closed: functions of the ``os.spawn*()`` and the ``os.exec*()`` families and third party modules calling ``exec()`` or ``fork()`` + ``exec()``. In this case, file descriptors are shared between the parent and the child processes, which is usually unexpected and causes various issues. This PEP proposes to continue the work started with the change in the subprocess in Python 3.2, to fix the issue in any code, and not just code using subprocess. Inherited file descriptors issues --------------------------------- Closing the file descriptor in the parent process does not close the related resource (file, socket, ...) because it is still open in the child process.
The listening socket of TCPServer is not closed on ``exec()``: the child process is able to get connections from new clients; if the parent closes the listening socket and creates a new listening socket on the same address, it would get an "address already in use" error. Not closing file descriptors can lead to resource exhaustion: even if the parent closes all files, creating a new file descriptor may fail with "too many files" because files are still open in the child process. See also the following issues: * `Issue #2320: Race condition in subprocess using stdin `_ (2008) * `Issue #3006: subprocess.Popen causes socket to remain open after close `_ (2008) * `Issue #7213: subprocess leaks open file descriptors between Popen instances causing hangs `_ (2009) * `Issue #12786: subprocess wait() hangs when stdin is closed `_ (2011) Security -------- Leaking file descriptors is a major security vulnerability. An untrusted child process can read sensitive data like passwords and take control of the parent process through leaked file descriptors. It is for example a known vulnerability to escape from a chroot. See also the CERT recommendation: `FIO42-C. Ensure files are properly closed when they are no longer needed `_. Examples of vulnerabilities: * `OpenSSH Security Advisory: portable-keysign-rand-helper.adv `_ (April 2011) * `CWE-403: Exposure of File Descriptor to Unintended Control Sphere `_ (2008) * `Hijacking Apache https by mod_php `_ (Dec 2003) Atomicity --------- Using ``fcntl()`` to set the close-on-exec flag is not safe in a multithreaded application. If a thread calls ``fork()`` and ``exec()`` between the creation of the file descriptor and the call to ``fcntl(fd, F_SETFD, new_flags)``: the file descriptor will be inherited by the child process. Modern operating systems offer functions to set the flag during the creation of the file descriptor, which avoids the race condition.
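The race described in the Atomicity section is why the atomic flags exist. A sketch contrasting the two spellings (POSIX-only; ``os.O_CLOEXEC`` is new in Python 3.3, and a temporary file stands in for whatever is being opened):

```python
import fcntl
import os
import tempfile

with tempfile.NamedTemporaryFile() as scratch:
    # Non-atomic: between os.open() and fcntl() another thread may
    # fork()+exec(), leaking the descriptor to the child.
    fd = os.open(scratch.name, os.O_RDONLY)
    fcntl.fcntl(fd, fcntl.F_SETFD,
                fcntl.fcntl(fd, fcntl.F_GETFD) | fcntl.FD_CLOEXEC)
    os.close(fd)

    # Atomic: the kernel sets the flag in the same system call that
    # creates the descriptor, so no window exists.
    fd = os.open(scratch.name, os.O_RDONLY | os.O_CLOEXEC)
    print(bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC))  # True
    os.close(fd)
```

As the Portability section notes, on old kernels that silently ignore ``O_CLOEXEC`` the flag must still be verified with ``fcntl(fd, F_GETFD)`` afterwards.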
Portability ----------- Python 3.2 added the ``socket.SOCK_CLOEXEC`` flag, Python 3.3 added the ``os.O_CLOEXEC`` flag and the ``os.pipe2()`` function. It is already possible to atomically set the close-on-exec flag in Python 3.3 when opening a file and creating a pipe or socket. The problem is that these flags and functions are not portable: only recent versions of operating systems support them. ``O_CLOEXEC`` and ``SOCK_CLOEXEC`` flags are ignored by old Linux versions and so the ``FD_CLOEXEC`` flag must be checked using ``fcntl(fd, F_GETFD)``. If the kernel ignores the ``O_CLOEXEC`` or ``SOCK_CLOEXEC`` flag, a call to ``fcntl(fd, F_SETFD, flags)`` is required to set the close-on-exec flag. .. note:: OpenBSD older than 5.2 does not close the file descriptor with close-on-exec flag set if ``fork()`` is used before ``exec()``, but it works correctly if ``exec()`` is called without ``fork()``. Try `openbsd_bug.py `_. Scope ----- Applications still have to explicitly close file descriptors after a ``fork()``. The close-on-exec flag only closes file descriptors after ``exec()``, and so after ``fork()`` + ``exec()``. This PEP only changes the close-on-exec flag of file descriptors created by the Python standard library, or by modules using the standard library. Third party modules not using the standard library should be modified to conform to this PEP. The new ``os.set_cloexec()`` function can be used for example. XXX Should ``subprocess.Popen`` clear the close-on-exec flag on the file XXX descriptors passed to the constructor's ``pass_fds`` parameter? .. note:: See `Close file descriptors after fork`_ for a possible solution for ``fork()`` without ``exec()``. Proposal ======== Add a new optional *cloexec* parameter on functions creating file descriptors and different ways to change default values of this parameter. Add new functions: * ``os.get_cloexec(fd:int) -> bool``: get the close-on-exec flag of a file descriptor. Not available on all platforms.
* ``os.set_cloexec(fd:int, cloexec:bool=True)``: set or clear the close-on-exec flag on a file descriptor. Not available on all platforms. * ``sys.getdefaultcloexec() -> bool``: get the current default value of the *cloexec* parameter * ``sys.setdefaultcloexec(cloexec: bool=True)``: set the default value of the *cloexec* parameter Add a new optional *cloexec* parameter to: * ``asyncore.dispatcher.create_socket()`` * ``io.FileIO`` * ``io.open()`` * ``open()`` * ``os.dup()`` * ``os.dup2()`` * ``os.fdopen()`` * ``os.open()`` * ``os.openpty()`` * ``os.pipe()`` * ``select.devpoll()`` * ``select.epoll()`` * ``select.kqueue()`` * ``socket.socket()`` * ``socket.socket.accept()`` * ``socket.socket.dup()`` * ``socket.socket.fromfd`` * ``socket.socketpair()`` The default value of the *cloexec* parameter is ``sys.getdefaultcloexec()``. Add a new command line option ``-e`` and an environment variable ``PYTHONCLOEXEC`` to set the close-on-exec flag by default. All functions creating file descriptors in the standard library must respect the default *cloexec* parameter (``sys.getdefaultcloexec()``). File descriptors 0 (stdin), 1 (stdout) and 2 (stderr) are expected to be inherited, but Python does not handle them differently. When ``os.dup2()`` is used to replace standard streams, ``cloexec=False`` must be specified explicitly. Drawbacks of the proposal: * It is no longer possible to know if the close-on-exec flag will be set or not on a newly created file descriptor just by reading the source code. * If the inheritance of a file descriptor matters, the *cloexec* parameter must now be specified explicitly, or the library or the application will not work depending on the default value of the *cloexec* parameter. Alternatives ============ Enable inheritance by default and no configurable default ---------------------------------------------------------- Add a new optional parameter *cloexec* on functions creating file descriptors.
The default value of the *cloexec* parameter is ``False``, and this
default cannot be changed. Inheriting file descriptors by default is
also the default behaviour on POSIX and on Windows, so this alternative
is the most conservative option.

This option does not solve the issues listed in the `Rationale`_
section; it only provides a helper to fix them. All functions creating
file descriptors would have to be modified to set *cloexec=True* in each
module used by an application to fix all these issues.

Enable inheritance by default, default can only be set to True
--------------------------------------------------------------

This alternative is based on the proposal: the only difference is that
``sys.setdefaultcloexec()`` does not take any argument, so it can only
be used to set the default value of the *cloexec* parameter to ``True``.

Disable inheritance by default
------------------------------

This alternative is based on the proposal: the only difference is that
the default value of the *cloexec* parameter is ``True`` (instead of
``False``).

If a file must be inherited by child processes, the ``cloexec=False``
parameter can be used.

The ``subprocess.Popen`` constructor has a ``pass_fds`` parameter to
specify which file descriptors must be inherited. The close-on-exec flag
of these file descriptors must be changed with ``os.set_cloexec()``.

Advantages of setting the close-on-exec flag by default:

* There are far more programs that are bitten by file descriptor
  inheritance upon exec (see `Inherited file descriptors issues`_ and
  `Security`_) than programs relying on it (see `Applications using
  inheritance of file descriptors`_).

Drawbacks of setting the close-on-exec flag by default:

* It violates the principle of least surprise. Developers using the os
  module may expect that Python respects the POSIX standard and so that
  the close-on-exec flag is not set by default.
* The os module is written as a thin wrapper to system calls (to
  functions of the C standard library).
  If the atomic flags to set the close-on-exec flag are not supported
  (see `Appendix: Operating system support`_), a single Python function
  call may require 2 or 3 system calls (see the `Performances`_
  section).
* Extra system calls, if any, may slow down Python: see
  `Performances`_.

Backward compatibility: only a few programs rely on inheritance of file
descriptors, and they only pass a few file descriptors, usually just
one. These programs will fail immediately with an ``EBADF`` error, and
it will be simple to fix them: add the ``cloexec=False`` parameter or
use ``os.set_cloexec(fd, False)``.

The ``subprocess`` module will be changed anyway to clear the
close-on-exec flag on file descriptors listed in the ``pass_fds``
parameter of the ``Popen`` constructor. So it is possible that these
programs will not need any fix if they use the ``subprocess`` module.

Close file descriptors after fork
---------------------------------

This PEP does not fix issues with applications using ``fork()`` without
``exec()``. Python needs a generic mechanism to register callbacks which
would be called after a fork, see `#16500: Add an atfork module`_. Such
a registry could be used to close file descriptors just after a
``fork()``.

Drawbacks:

* It does not solve the problem on Windows: ``fork()`` does not exist
  on Windows.
* This alternative does not solve the problem for programs using
  ``exec()`` without ``fork()``.
* A third party module may directly call the C function ``fork()``,
  which will not call the "atfork" callbacks.
* All functions creating file descriptors must be changed to register a
  callback and then unregister it when the file is closed. Otherwise, a
  list of *all* open file descriptors must be maintained.
* The operating system is a better place than Python to automatically
  close file descriptors. For example, it is not easy to avoid a race
  condition between closing the file and unregistering the callback
  closing the file.
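The callback registry sketched above can be illustrated in a few lines. This is a hypothetical sketch, not the API proposed in #16500: the names ``register_after_fork`` and ``fork_with_callbacks`` are invented here, and as the drawbacks list notes, C code calling ``fork()`` directly would bypass such a wrapper entirely.

```python
import os

# Hypothetical "atfork" registry (names invented for illustration,
# not the API proposed in issue #16500).
_after_fork_callbacks = []

def register_after_fork(callback):
    """Register a callback to be run in the child process after fork()."""
    _after_fork_callbacks.append(callback)

def fork_with_callbacks():
    """fork() wrapper: run the registered callbacks in the child."""
    pid = os.fork()
    if pid == 0:  # child process
        for callback in _after_fork_callbacks:
            callback()
    return pid
```

A callback could, for example, close the file descriptors that an application registered as "close after fork"; the race between closing a file and unregistering its callback remains the hard part.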
open(): add "e" flag to mode
----------------------------

A new "e" mode would set the close-on-exec flag (best-effort).

This alternative only solves the problem for ``open()``:
``socket.socket()`` and ``os.pipe()`` do not have a ``mode`` parameter,
for example.

Since version 2.7, the GNU libc supports the ``"e"`` flag for
``fopen()``. It uses ``O_CLOEXEC`` if available, or
``fcntl(fd, F_SETFD, FD_CLOEXEC)`` otherwise. With Visual Studio,
``fopen()`` accepts an "N" flag which uses ``O_NOINHERIT``.

Bikeshedding on the name of the new parameter
---------------------------------------------

* ``inherit``, ``inherited``: closer to the Windows definition
* ``sensitive``
* ``sterile``: "Does not produce offspring."

Applications using inheritance of file descriptors
==================================================

Most developers don't know that file descriptors are inherited by
default. Most programs do not rely on inheritance of file descriptors.
For example, ``subprocess.Popen`` was changed in Python 3.2 to close all
file descriptors greater than 2 in the child process by default. No user
has complained about this behavior change yet.

Network servers using ``fork()`` may want to pass the client socket to
the child process. For example, on UNIX a CGI server passes the client
socket through file descriptors 0 (stdin) and 1 (stdout) using
``dup2()``.

To access a restricted resource, like creating a socket listening on a
TCP port lower than 1024 or reading a file containing sensitive data
like passwords, a common practice is: start as the root user, create a
file descriptor, create a child process, drop privileges (e.g. change
the current user), pass the file descriptor to the child process and
exit the parent process.

Security is very important in such a use case: leaking another file
descriptor would be a critical security vulnerability (see `Security`_).
The root process may not exit but monitor the child process instead,
restarting a new child process and passing it the same file descriptor
if the previous child process crashed.

Example of programs taking file descriptors from the parent process
using a command line option:

* gpg: ``--status-fd ``, ``--logger-fd ``, etc.
* openssl: ``-pass fd:``
* qemu: ``-add-fd ``
* valgrind: ``--log-fd=``, ``--input-fd=``, etc.
* xterm: ``-S ``

On Linux, it is possible to use a ``"/dev/fd/"`` filename to pass a
file descriptor to a program expecting a filename.

Performances
============

Setting the close-on-exec flag may require additional system calls for
each creation of a new file descriptor. The number of additional system
calls depends on the method used to set the flag:

* ``O_NOINHERIT``: no additional system call
* ``O_CLOEXEC``: one additional system call, but only at the creation of
  the first file descriptor, to check whether the flag is supported. If
  the flag is not supported, Python has to fall back to the next method.
* ``ioctl(fd, FIOCLEX)``: one additional system call per file descriptor
* ``fcntl(fd, F_SETFD, flags)``: two additional system calls per file
  descriptor, one to get the old flags and one to set the new flags

On Linux, setting the close-on-exec flag has a low overhead. Results of
`bench_cloexec.py `_ on Linux 3.6:

* close-on-exec flag not set: 7.8 us
* ``O_CLOEXEC``: 1% slower (7.9 us)
* ``ioctl()``: 3% slower (8.0 us)
* ``fcntl()``: 3% slower (8.0 us)

Implementation
==============

os.get_cloexec(fd)
------------------

Get the close-on-exec flag of a file descriptor.
Pseudo-code::

    if os.name == 'nt':
        def get_cloexec(fd):
            handle = _winapi._get_osfhandle(fd)
            flags = _winapi.GetHandleInformation(handle)
            return not (flags & _winapi.HANDLE_FLAG_INHERIT)
    else:
        try:
            import fcntl
        except ImportError:
            pass
        else:
            def get_cloexec(fd):
                flags = fcntl.fcntl(fd, fcntl.F_GETFD)
                return bool(flags & fcntl.FD_CLOEXEC)

os.set_cloexec(fd, cloexec=True)
--------------------------------

Set or clear the close-on-exec flag on a file descriptor. The flag is
set after the creation of the file descriptor, and so it is not atomic.

Pseudo-code::

    if os.name == 'nt':
        def set_cloexec(fd, cloexec=True):
            handle = _winapi._get_osfhandle(fd)
            mask = _winapi.HANDLE_FLAG_INHERIT
            if cloexec:
                flags = 0
            else:
                flags = mask
            _winapi.SetHandleInformation(handle, mask, flags)
    else:
        fcntl = None
        ioctl = None
        try:
            import ioctl
        except ImportError:
            try:
                import fcntl
            except ImportError:
                pass
        if ioctl is not None and hasattr(ioctl, 'FIOCLEX'):
            def set_cloexec(fd, cloexec=True):
                if cloexec:
                    ioctl.ioctl(fd, ioctl.FIOCLEX)
                else:
                    ioctl.ioctl(fd, ioctl.FIONCLEX)
        elif fcntl is not None:
            def set_cloexec(fd, cloexec=True):
                flags = fcntl.fcntl(fd, fcntl.F_GETFD)
                if cloexec:
                    flags |= fcntl.FD_CLOEXEC
                else:
                    flags &= ~fcntl.FD_CLOEXEC
                fcntl.fcntl(fd, fcntl.F_SETFD, flags)

ioctl is preferred over fcntl because it requires only one syscall,
instead of two syscalls for fcntl.

.. note::
   ``fcntl(fd, F_SETFD, flags)`` only supports one flag
   (``FD_CLOEXEC``), so it would be possible to avoid
   ``fcntl(fd, F_GETFD)``. But it may drop other flags in the future,
   and so it is safer to keep the two function calls.

.. note::
   The ``fopen()`` function of the GNU libc ignores the error if
   ``fcntl(fd, F_SETFD, flags)`` failed.
open()
------

* Windows: ``open()`` with ``O_NOINHERIT`` flag [atomic]
* ``open()`` with ``O_CLOEXEC`` flag [atomic]
* ``open()`` + ``os.set_cloexec(fd, True)`` [best-effort]

os.dup()
--------

* ``fcntl(fd, F_DUPFD_CLOEXEC)`` [atomic]
* ``dup()`` + ``os.set_cloexec(fd, True)`` [best-effort]

os.dup2()
---------

* ``dup3()`` with ``O_CLOEXEC`` flag [atomic]
* ``dup2()`` + ``os.set_cloexec(fd, True)`` [best-effort]

os.pipe()
---------

* Windows: ``CreatePipe()`` with
  ``SECURITY_ATTRIBUTES.bInheritHandle=FALSE``, or ``_pipe()`` with
  ``O_NOINHERIT`` flag [atomic]
* ``pipe2()`` with ``O_CLOEXEC`` flag [atomic]
* ``pipe()`` + ``os.set_cloexec(fd, True)`` [best-effort]

socket.socket()
---------------

* Windows: ``WSASocket()`` with ``WSA_FLAG_NO_HANDLE_INHERIT`` flag
  [atomic]
* ``socket()`` with ``SOCK_CLOEXEC`` flag [atomic]
* ``socket()`` + ``os.set_cloexec(fd, True)`` [best-effort]

socket.socketpair()
-------------------

* ``socketpair()`` with ``SOCK_CLOEXEC`` flag [atomic]
* ``socketpair()`` + ``os.set_cloexec(fd, True)`` [best-effort]

socket.socket.accept()
----------------------

* ``accept4()`` with ``SOCK_CLOEXEC`` flag [atomic]
* ``accept()`` + ``os.set_cloexec(fd, True)`` [best-effort]

Backward compatibility
======================

There is no backward incompatible change. The default behaviour is
unchanged: the close-on-exec flag is not set by default.

Appendix: Operating system support
==================================

Windows
-------

Windows has an ``O_NOINHERIT`` flag: "Do not inherit in child
processes". For example, it is supported by ``open()`` and ``_pipe()``.

The flag can be cleared using
``SetHandleInformation(fd, HANDLE_FLAG_INHERIT, 0)``.

``CreateProcess()`` has a ``bInheritHandles`` parameter: if it is
``FALSE``, the handles are not inherited. If it is ``TRUE``, handles
with the ``HANDLE_FLAG_INHERIT`` flag set are inherited.
``subprocess.Popen`` uses the ``close_fds`` option to define
``bInheritHandles``.
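The [atomic] / [best-effort] fallback pattern listed for ``os.pipe()`` above can be sketched on POSIX: try the atomic ``pipe2()`` with ``O_CLOEXEC`` first, and fall back to ``pipe()`` plus ``fcntl()``. The function name ``pipe_cloexec`` is illustrative, not a real stdlib function:

```python
import fcntl
import os

def pipe_cloexec():
    """Create a pipe with the close-on-exec flag set on both ends."""
    if hasattr(os, "pipe2") and hasattr(os, "O_CLOEXEC"):
        # Atomic path: pipe2() with O_CLOEXEC (Linux 2.6.27+, glibc 2.9+).
        return os.pipe2(os.O_CLOEXEC)
    # Best-effort fallback: a window remains between pipe() and fcntl()
    # during which another thread may fork() and exec().
    r, w = os.pipe()
    for fd in (r, w):
        flags = fcntl.fcntl(fd, fcntl.F_GETFD)
        fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
    return r, w
```

The same shape (atomic syscall if available, otherwise create-then-flag) applies to the other entries in the lists above.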
ioctl
-----

Functions:

* ``ioctl(fd, FIOCLEX, 0)``: set the close-on-exec flag
* ``ioctl(fd, FIONCLEX, 0)``: clear the close-on-exec flag

Availability: Linux, Mac OS X, QNX, NetBSD, OpenBSD, FreeBSD.

fcntl
-----

Functions:

* ``flags = fcntl(fd, F_GETFD); fcntl(fd, F_SETFD, flags | FD_CLOEXEC)``:
  set the close-on-exec flag
* ``flags = fcntl(fd, F_GETFD); fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC)``:
  clear the close-on-exec flag

Availability: AIX, Digital UNIX, FreeBSD, HP-UX, IRIX, Linux, Mac OS X,
OpenBSD, Solaris, SunOS, Unicos.

Atomic flags
------------

New flags:

* ``O_CLOEXEC``: available on Linux (2.6.23), FreeBSD (8.3), OpenBSD
  5.0, QNX, BeOS, next NetBSD release (6.1?). This flag is part of
  POSIX.1-2008.
* ``SOCK_CLOEXEC`` flag for ``socket()`` and ``socketpair()``, available
  on Linux 2.6.27, OpenBSD 5.2, NetBSD 6.0.
* ``WSA_FLAG_NO_HANDLE_INHERIT`` flag for ``WSASocket()``: supported on
  Windows 7 with SP1, Windows Server 2008 R2 with SP1, and later.
* ``fcntl()``: ``F_DUPFD_CLOEXEC`` flag, available on Linux 2.6.24,
  OpenBSD 5.0, FreeBSD 9.1, NetBSD 6.0. This flag is part of
  POSIX.1-2008.
* ``recvmsg()``: ``MSG_CMSG_CLOEXEC``, available on Linux 2.6.23,
  NetBSD 6.0.

On Linux older than 2.6.23, the ``O_CLOEXEC`` flag is simply ignored, so
we have to check that the flag is supported by calling ``fcntl()``. If
it does not work, we have to set the flag using ``ioctl()`` or
``fcntl()``.

On Linux older than 2.6.27, if the ``SOCK_CLOEXEC`` flag is set in the
socket type, ``socket()`` or ``socketpair()`` fail and ``errno`` is set
to ``EINVAL``.

New functions:

* ``dup3()``: available on Linux 2.6.27 (and glibc 2.9)
* ``pipe2()``: available on Linux 2.6.27 (and glibc 2.9)
* ``accept4()``: available on Linux 2.6.28 (and glibc 2.10)

If ``accept4()`` is called on Linux older than 2.6.28, ``accept4()``
returns ``-1`` (fail) and ``errno`` is set to ``ENOSYS``.
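Because old kernels silently ignore ``O_CLOEXEC``, a runtime probe is needed before trusting the atomic path. A hedged sketch of such a probe (the function name is invented here; on kernels that ignore the flag, ``open()`` succeeds but ``FD_CLOEXEC`` stays clear):

```python
import fcntl
import os

def o_cloexec_is_honored():
    """Return True if open() with O_CLOEXEC really sets FD_CLOEXEC.

    Linux kernels older than 2.6.23 accept the flag but silently ignore
    it, so the result must be verified with fcntl(fd, F_GETFD).
    """
    if not hasattr(os, "O_CLOEXEC"):
        return False
    fd = os.open(os.devnull, os.O_RDONLY | os.O_CLOEXEC)
    try:
        return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)
    finally:
        os.close(fd)
```

Such a check only needs to run once, on the first file descriptor created, which is why the ``O_CLOEXEC`` row in the `Performances`_ section counts a single extra syscall.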
Links
=====

Links:

* `Secure File Descriptor Handling `_ (Ulrich Drepper, 2008)
* `win32_support.py of the Tornado project `_: emulates
  ``fcntl(fd, F_SETFD, FD_CLOEXEC)`` using
  ``SetHandleInformation(fd, HANDLE_FLAG_INHERIT, 1)``

Python issues:

* `#10115: Support accept4() for atomic setting of flags at socket
  creation `_
* `#12105: open() is not able to set flags, such as O_CLOEXEC `_
* `#12107: TCP listening sockets created without FD_CLOEXEC flag `_
* `#16500: Add an atfork module `_
* `#16850: Add "e" mode to open(): close-on-exec (O_CLOEXEC) /
  O_NOINHERIT `_
* `#16860: Use O_CLOEXEC in the tempfile module `_
* `#17036: Implementation of the PEP 433 `_
* `#16946: subprocess: _close_open_fd_range_safe() does not set
  close-on-exec flag on Linux < 2.6.23 if O_CLOEXEC is defined `_
* `#17070: PEP 433: Use the new cloexec to improve security and avoid
  bugs `_

Ruby:

* `Set FD_CLOEXEC for all fds (except 0, 1, 2) `_
* `O_CLOEXEC flag missing for Kernel::open `_: the `commit was reverted
  later `_

Footnotes
=========

.. [#subprocess_close] On UNIX since Python 3.2, subprocess.Popen()
   closes all file descriptors by default: ``close_fds=True``. It closes
   file descriptors in the range 3 inclusive to ``local_max_fd``
   exclusive, where ``local_max_fd`` is ``fcntl(0, F_MAXFD)`` on NetBSD,
   or ``sysconf(_SC_OPEN_MAX)`` otherwise. If the error pipe has a
   descriptor smaller than 3, ``ValueError`` is raised.

From larry at hastings.org Wed Jan 30 02:00:47 2013 From: larry at hastings.org (Larry Hastings) Date: Tue, 29 Jan 2013 17:00:47 -0800 Subject: [Python-Dev] PEP 433: second try In-Reply-To: References: Message-ID: <510870BF.1040906@hastings.org> On 01/29/2013 09:00 AM, Victor Stinner wrote: > Hi, > > Here is a new version of my PEP 433. > [...]
> * ``os.get_cloexec(fd)`` > * ``os.set_cloexec(fd, cloexec=True)`` > * ``sys.getdefaultcloexec()`` > * ``sys.setdefaultcloexec(cloexec=True)`` Passing no judgment on the PEP otherwise, just a single observation: the "cloexec" parameter to "sys.setdefaultcloexec" should not have a default. //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Wed Jan 30 09:45:06 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 30 Jan 2013 09:45:06 +0100 Subject: [Python-Dev] PEP 433: second try In-Reply-To: <510870BF.1040906@hastings.org> References: <510870BF.1040906@hastings.org> Message-ID: 2013/1/30 Larry Hastings : > Here is a new version of my PEP 433. > [...] > * ``os.get_cloexec(fd)`` > * ``os.set_cloexec(fd, cloexec=True)`` > * ``sys.getdefaultcloexec()`` > * ``sys.setdefaultcloexec(cloexec=True)`` > > Passing no judgment on the PEP otherwise, just a single observation: the > "cloexec" parameter to "sys.setdefaultcloexec" should not have a default. Oh, it's a mistake: sys.setdefaultcloexec() has one parameter and it is mandatory: sys.setdefaultcloexec(cloexec) Victor From victor.stinner at gmail.com Wed Jan 30 13:00:52 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 30 Jan 2013 13:00:52 +0100 Subject: [Python-Dev] PEP 433: second try In-Reply-To: References: Message-ID: > Disable inheritance by default > (...) > * It violates the principle of least surprise. Developers using the > os module may expect that Python respects the POSIX standard and so > that close-on-exec flag is not set by default. Oh, I just saw that Perl is "violating POSIX" since Perl 1: the close-on-exec flag is set on newly created file descriptors if their number is greater than $SYSTEM_FD_MAX (which is usually 2). So Python would not be the only language providing a convenient default behaviour ;-) (And Ruby 2 will also set close-on-exec by default.)
Victor From hrvoje.niksic at avl.com Wed Jan 30 13:30:28 2013 From: hrvoje.niksic at avl.com (Hrvoje Niksic) Date: Wed, 30 Jan 2013 13:30:28 +0100 Subject: [Python-Dev] PEP 433: second try In-Reply-To: References: Message-ID: <51091264.8010800@avl.com> On 01/30/2013 01:00 PM, Victor Stinner wrote: >> Disable inheritance by default >> (...) >> * It violates the principle of least surprise. Developers using the >> os module may expect that Python respects the POSIX standard and so >> that close-on-exec flag is not set by default. > > Oh, I just saw that Perl is "violating POSIX" since Perl 1: > close-on-exec flag in set on new created file descriptors if their > number is greater than $SYSTEM_FD_MAX (which is usually 2). I haven't checked the source, but I suspect this applies only to file descriptors opened with open(), not to explicit POSIX::* calls. (The documentation of the latter doesn't mention close-on-exec at all.) Perl's open() contains functionality equivalent to Python's open() and subprocess.Popen(), the latter of which already closes on exec by default. From guido at python.org Wed Jan 30 22:26:08 2013 From: guido at python.org (Guido van Rossum) Date: Wed, 30 Jan 2013 13:26:08 -0800 Subject: [Python-Dev] Fwd: I was just thinking that os.path could use some love... In-Reply-To: References: Message-ID: Thoughts on os.path? What happened to the idea of a new path object? --Guido ---------- Forwarded message ---------- From: Talin Date: Wed, Jan 30, 2013 at 12:34 PM Subject: Re: I was just thinking that os.path could use some love... To: Guido van Rossum On Wed, Jan 30, 2013 at 11:33 AM, Guido van Rossum wrote: > > Hmm... Mind if I just forward to python-dev? IIRC there's been some > discussion about a newfangled path object, but I lost track of where > the discussion went. (Or you can post there yourself. :-) Sure - forward away. Is this the same as the path object idea that was floated 5 years ago? 
I've yet to see any set of convenience methods for paths that are so compelling as to be worth all of the time and energy needed to update all of the various APIs which now expect paths to be passed in as strings. > > > On Wed, Jan 30, 2013 at 11:13 AM, Talin wrote: > > I just realized that os.path hasn't changed in a long time. Here's a couple > > of ideas for additions: > > > > os.path.splitall(path) - splits all path components into a tuple - so for > > example, 'this/is/a/path' turns into ('this', 'is', 'a', 'path'). If there's > > a trailing slash, the last item in the tuple will be a zero-length string. > > The main reason for having this in os.path is so that we can remain > > separator-character-agnostic. > > Would it also return a leading empty string if the path starts with /? > What about Windows C: or //host/ prefixes??? > I would say that it should only split on directory separators, not any other kind of delimiter, so that each component is a valid filesystem identifier. Further, the operation should be reversible by calling os.path.join(*dirnames). So it's a little more complex than just string.split('/'). Part of the reason for wanting the splitall function is to implement the common prefix function - you take all the paths, bust them into tuples, look for the longest tuple prefix, and then join the result back into a path. This means that os.path.join(*os.path.splitall(path)) must reproduce the original path exactly. > > > An alternative would be to add an optional 'maxsplits' parameter to > > os.path.split. Default would be 1; 0 = unlimited. Main disadvantage would be > > that no one would ever use values 2, 3, etc... but it would be more > > compatible with other split apis. > > > > A version of os.path.commonprefix that always produces valid paths (and > > works by path component, not character-by-character). > > > > I can probably think of others if I look at some other path manipulation > > apis. llvm::sys::path has a bunch if I recall... 
> > > > -- > > -- Talin > > > > -- > --Guido van Rossum (python.org/~guido) -- -- Talin -- --Guido van Rossum (python.org/~guido) From victor.stinner at gmail.com Wed Jan 30 22:39:13 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 30 Jan 2013 22:39:13 +0100 Subject: [Python-Dev] Fwd: I was just thinking that os.path could use some love... In-Reply-To: References: Message-ID: > Thoughts on os.path? What happened to the idea of a new path object? I don't know if it's related, but there are two new interesting projects: pathlib and walkdir. http://pypi.python.org/pypi/pathlib https://pathlib.readthedocs.org/en/latest/ http://pypi.python.org/pypi/walkdir http://walkdir.readthedocs.org/en/latest/ http://bugs.python.org/issue13229 Victor From rdmurray at bitdance.com Wed Jan 30 22:43:50 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Wed, 30 Jan 2013 16:43:50 -0500 Subject: [Python-Dev] Fwd: I was just thinking that os.path could use some love... In-Reply-To: References: Message-ID: <20130130214351.6185E250BC7@webabinitio.net> http://bugs.python.org/issue11344 evolved into a patch for 'splitpath', similar to splitall. Antoine's pathlib (PEP 428) is also mentioned at the end, which is probably what Guido is thinking of. --David On Wed, 30 Jan 2013 13:26:08 -0800, Guido van Rossum wrote: > Thoughts on os.path? What happened to the idea of a new path object? > > --Guido > > > ---------- Forwarded message ---------- > From: Talin > Date: Wed, Jan 30, 2013 at 12:34 PM > Subject: Re: I was just thinking that os.path could use some love... > To: Guido van Rossum > > > On Wed, Jan 30, 2013 at 11:33 AM, Guido van Rossum wrote: > > > > Hmm... Mind if I just forward to python-dev? IIRC there's been some > > discussion about a newfangled path object, but I lost track of where > > the discussion went. (Or you can post there yourself. :-) > > > Sure - forward away. Is this the same as the path object idea that was > floated 5 years ago? 
I've yet to see any set of convenience methods > for paths that are so compelling as to be worth all of the time and > energy needed to update all of the various APIs which now expect paths > to be passed in as strings. > > > > > > On Wed, Jan 30, 2013 at 11:13 AM, Talin wrote: > > > I just realized that os.path hasn't changed in a long time. Here's a couple > > > of ideas for additions: > > > > > > os.path.splitall(path) - splits all path components into a tuple - so for > > > example, 'this/is/a/path' turns into ('this', 'is', 'a', 'path'). If there's > > > a trailing slash, the last item in the tuple will be a zero-length string. > > > The main reason for having this in os.path is so that we can remain > > > separator-character-agnostic. > > > > Would it also return a leading empty string if the path starts with /? > > What about Windows C: or //host/ prefixes??? > > > I would say that it should only split on directory separators, not any > other kind of delimiter, so that each component is a valid filesystem > identifier. Further, the operation should be reversible by calling > os.path.join(*dirnames). So it's a little more complex than just > string.split('/'). > > Part of the reason for wanting the splitall function is to implement > the common prefix function - you take all the paths, bust them into > tuples, look for the longest tuple prefix, and then join the result > back into a path. This means that > os.path.join(*os.path.splitall(path)) must reproduce the original path > exactly. > > > > > > An alternative would be to add an optional 'maxsplits' parameter to > > > os.path.split. Default would be 1; 0 = unlimited. Main disadvantage would be > > > that no one would ever use values 2, 3, etc... but it would be more > > > compatible with other split apis. > > > > > > A version of os.path.commonprefix that always produces valid paths (and > > > works by path component, not character-by-character). 
> > > > > > I can probably think of others if I look at some other path manipulation > > > apis. llvm::sys::path has a bunch if I recall... > > > > > > -- > > > -- Talin > > > > > > > > -- > > --Guido van Rossum (python.org/~guido) > > > > > -- > -- Talin > > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/rdmurray%40bitdance.com From skip at pobox.com Wed Jan 30 22:45:20 2013 From: skip at pobox.com (Skip Montanaro) Date: Wed, 30 Jan 2013 15:45:20 -0600 Subject: [Python-Dev] Fwd: I was just thinking that os.path could use some love... In-Reply-To: References: Message-ID: > Thoughts on os.path? What happened to the idea of a new path object? You old pot stirrer! I wonder about this from time-to-time as well, so was just interested enough to wander over to PyPI and see what I could dig up. There are a couple path module/package implementations on PyPI. Nothing jumped out at me as the One True Way. I think you are probably thinking of Jason Orendorff's path.py module: http://pypi.python.org/pypi/path.py There are also two PEPs: http://www.python.org/dev/peps/pep-0355/ http://www.python.org/dev/peps/pep-0428/ The former was rejected. The latter still seems to have legs, and an implementation: http://pypi.python.org/pypi/pathlib/0.7 Skip From cs at zip.com.au Wed Jan 30 23:01:41 2013 From: cs at zip.com.au (Cameron Simpson) Date: Thu, 31 Jan 2013 09:01:41 +1100 Subject: [Python-Dev] Fwd: I was just thinking that os.path could use some love... In-Reply-To: References: Message-ID: <20130130220140.GA19745@cskk.homeip.net> On 30Jan2013 13:26, Guido van Rossum wrote: | Thoughts on os.path? What happened to the idea of a new path object? [...] | From: Talin [...] | On Wed, Jan 30, 2013 at 11:33 AM, Guido van Rossum wrote: | > Hmm... 
Mind if I just forward to python-dev? IIRC there's been some | > discussion about a newfangled path object, but I lost track of where | > the discussion went. (Or you can post there yourself. :-) | | Sure - forward away. Is this the same as the path object idea that was | floated 5 years ago? I've yet to see any set of convenience methods | for paths that are so compelling as to be worth all of the time and | energy needed to update all of the various APIs which now expect paths | to be passed in as strings. Speaking for myself, I've been having some usefulness with making "URL" objects that are subclasses of str. That lets me pass them to all the things that already expect strs, while still having convenience methods. Combined with a little factory function to make a "URL" from a bare str should it be needed, it has worked fairly well. It is certainly not seamless all the time... Eg: U = URL(s) # returns s if already an instance of _URL len(U) # I'm still a string U.startswith("http:") U.scheme # eg "http" U.hrefs # fetch content, parse as HTML, return links Just a thought on supporting code expecting strings. Cheers, -- Cameron Simpson Outside of a dog, a book is a man's best friend. Inside of a dog, it's too dark to read. - Groucho Marx From glyph at twistedmatrix.com Wed Jan 30 23:20:05 2013 From: glyph at twistedmatrix.com (Glyph) Date: Wed, 30 Jan 2013 14:20:05 -0800 Subject: [Python-Dev] I was just thinking that os.path could use some love... In-Reply-To: <20130130220140.GA19745@cskk.homeip.net> References: <20130130220140.GA19745@cskk.homeip.net> Message-ID: <37332BC3-DAF0-4344-818F-E2663475CC57@twistedmatrix.com> On Jan 30, 2013, at 2:01 PM, Cameron Simpson wrote: > Speaking for myself, I've been having some usefulness with making "URL" > objects that are subclasses of str. That lets me pass them to all the > things that already expect strs, while still having convenience methods. str subclasses are problematic. 
One issue is that it will still allow for invalid manipulations. If you prohibit them, then manipulations that take multiple steps will be super inconvenient. If you allow them, then you end up with half-formed values that will error out sometimes, or generate corrupt data that shouldn't be allowed to exist (trivial example; a NUL character in the middle of a file path). Also, automatic coercion will sometimes surprise you and give you a value which is of the wrong type if you forget a method or two. Also URL and file paths have a common interface, but are not totally the same. Basically, everybody wants to say "composition is better than inheritance, except for *this* case, where inheritance seems super convenient". That's how it gets you! Inheritance _is_ super convenient, but it's also super confusing. Resist the temptation :-). Once again (I see my previous reply went straight to the sender, not the whole list) I recommend as an abstraction that has worked very well in a wide variety of situations. -glyph -------------- next part -------------- An HTML attachment was scrubbed... URL: From doko at ubuntu.com Thu Jan 31 00:49:38 2013 From: doko at ubuntu.com (Matthias Klose) Date: Thu, 31 Jan 2013 00:49:38 +0100 Subject: [Python-Dev] mingw32 port Message-ID: <5109B192.9050504@ubuntu.com> [No, I'm not interested in the port myself] patches for a mingw32 port are floating around on the web and the python bug tracker, although most of them as a set of patches in one issue addressing several things, and which maybe outdated for the trunk. at least for me re-reading a big patch in a new version is more work than having the patches in different issues. So proposing to break down the patches in independent ones, dealing with: - mingw32 support (excluding any cross-build support). tested with a native build with srcdir == builddir. The changes for distutils mingw32 support should be a separate patch. Who could review these? Asked Martin, but didn't get a reply yet. 
- patches to cross-build for mingw32. - patches to deal with a srcdir != builddir configuration, where the srcdir is read-only (please don't mix with the cross-build support). All work should be done on the tip/trunk. So ok to close issue16526, issue3871, issue3754 and suggest in the reports to start over with more granular changes? Matthias From rdmurray at bitdance.com Thu Jan 31 05:04:35 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Wed, 30 Jan 2013 23:04:35 -0500 Subject: [Python-Dev] mingw32 port In-Reply-To: <5109B192.9050504@ubuntu.com> References: <5109B192.9050504@ubuntu.com> Message-ID: <20130131040435.CD751250BC7@webabinitio.net> On Thu, 31 Jan 2013 00:49:38 +0100, Matthias Klose wrote: > So ok to close issue16526, issue3871, issue3754 and suggest in the reports to > start over with more granular changes? +1 from me. Over time I've suggested to at least two people, maybe more, who wanted to see some action on these that the way to make progress on it was to break everything down into the to smallest independent patches possible. So far no one has followed through on it, obviously. [And no, I'm not interested in the port myself, either :)] --David From solipsis at pitrou.net Thu Jan 31 09:55:21 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 31 Jan 2013 09:55:21 +0100 Subject: [Python-Dev] I was just thinking that os.path could use some love... References: Message-ID: <20130131095521.0d5f3fb6@pitrou.net> Le Wed, 30 Jan 2013 13:26:08 -0800, Guido van Rossum a ?crit : > Thoughts on os.path? What happened to the idea of a new path object? I plan to launch another round of discussions following the changes in PEP 428. Regards Antoine. > > --Guido > > > ---------- Forwarded message ---------- > From: Talin > Date: Wed, Jan 30, 2013 at 12:34 PM > Subject: Re: I was just thinking that os.path could use some love... > To: Guido van Rossum > > > On Wed, Jan 30, 2013 at 11:33 AM, Guido van Rossum > wrote: > > > > Hmm... 
> > Mind if I just forward to python-dev? IIRC there's been some
> > discussion about a newfangled path object, but I lost track of where
> > the discussion went. (Or you can post there yourself. :-)
> 
> Sure - forward away. Is this the same as the path object idea that was
> floated 5 years ago? I've yet to see any set of convenience methods
> for paths that are so compelling as to be worth all of the time and
> energy needed to update all of the various APIs which now expect paths
> to be passed in as strings.
> 
> 
> > On Wed, Jan 30, 2013 at 11:13 AM, Talin wrote:
> > > I just realized that os.path hasn't changed in a long time.
> > > Here's a couple of ideas for additions:
> > >
> > > os.path.splitall(path) - splits all path components into a tuple -
> > > so for example, 'this/is/a/path' turns into ('this', 'is', 'a',
> > > 'path'). If there's a trailing slash, the last item in the tuple
> > > will be a zero-length string. The main reason for having this in
> > > os.path is so that we can remain separator-character-agnostic.
> >
> > Would it also return a leading empty string if the path starts
> > with /? What about Windows C: or //host/ prefixes???
> 
> I would say that it should only split on directory separators, not any
> other kind of delimiter, so that each component is a valid filesystem
> identifier. Further, the operation should be reversible by calling
> os.path.join(*dirnames). So it's a little more complex than just
> string.split('/').
> 
> Part of the reason for wanting the splitall function is to implement
> the common prefix function - you take all the paths, bust them into
> tuples, look for the longest tuple prefix, and then join the result
> back into a path. This means that
> os.path.join(*os.path.splitall(path)) must reproduce the original path
> exactly.
> 
> 
> > > An alternative would be to add an optional 'maxsplits' parameter
> > > to os.path.split. Default would be 1; 0 = unlimited.
> > > Main disadvantage would be that no one would ever use values 2, 3,
> > > etc... but it would be more compatible with other split APIs.
> > >
> > > A version of os.path.commonprefix that always produces valid
> > > paths (and works by path component, not character-by-character).
> > >
> > > I can probably think of others if I look at some other path
> > > manipulation APIs. llvm::sys::path has a bunch if I recall...
> > >
> > > --
> > > -- Talin
> >
> > --
> > --Guido van Rossum (python.org/~guido)
> 
> 
> --
> -- Talin

From theller at ctypes.org Thu Jan 31 21:32:39 2013
From: theller at ctypes.org (Thomas Heller)
Date: Thu, 31 Jan 2013 21:32:39 +0100
Subject: [Python-Dev] importlib.find_loader
Message-ID: 

In Python 3.3, I thought that every loaded module has a __loader__ attribute. Apparently this is not the case for a few builtin modules:

Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:57:17) [MSC v.1600 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys, importlib
>>>
>>> def bug():
...     for name in list(sys.modules.keys()):
...         try:
...             importlib.find_loader(name)
...         except Exception as details:
...             print(name, type(details), details)
...
>>> bug()
_frozen_importlib <class 'AttributeError'> 'module' object has no attribute '__loader__'
builtins <class 'AttributeError'> 'module' object has no attribute '__loader__'
signal <class 'AttributeError'> 'module' object has no attribute '__loader__'
importlib._bootstrap <class 'AttributeError'> 'module' object has no attribute '__loader__'
>>>

However, the importlib docs talk about a ValueError instead of the AttributeError that I see above:

> Find the loader for a module, optionally within the specified path.
> If the module is in sys.modules, then sys.modules[name].__loader__
> is returned (unless the loader would be None, in which case
> ValueError is raised).

Is this a bug?
Thomas

From kristjan at ccpgames.com Thu Jan 31 23:18:06 2013
From: kristjan at ccpgames.com (Kristján Valur Jónsson)
Date: Thu, 31 Jan 2013 22:18:06 +0000
Subject: [Python-Dev] Anyone building Python --without-doc-strings?
In-Reply-To: 
References: <20130126105512.GA25659@sleipnir.bytereef.org>
 <20130126154532.70974b34@pitrou.net>
 <20130126154738.GA13795@sleipnir.bytereef.org>
 <20130126160359.GA14047@sleipnir.bytereef.org>
 <20130126171932.74e792e9@pitrou.net>
 <20130126163738.8E7B6250BC5@webabinitio.net>
Message-ID: 

We do that, of course, but compiling Python without the doc strings removes those from all built-in modules as well. That's quite a lot of static data.

K

-----Original Message-----
From: Victor Stinner [mailto:victor.stinner at gmail.com]
Sent: 27. janúar 2013 21:58
To: Kristján Valur Jónsson
Cc: R. David Murray; python-dev at python.org
Subject: Re: [Python-Dev] Anyone building Python --without-doc-strings?

Why don't you compile using python -OO and distribute only .pyo code?

Victor

2013/1/27 Kristján Valur Jónsson :
> We (CCP) are certainly compiling Python without docstrings for our
> embedded platforms (that include the PS3). Anyone using Python as an
> engine to be used by programs and not users will appreciate the
> deletion of unneeded memory.
> K
>
> -----Original Message-----
> From: Python-Dev
> [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] On Behalf
> Of R. David Murray
> Sent: 27. janúar 2013 00:38
> To: python-dev at python.org
> Subject: Re: [Python-Dev] Anyone building Python --without-doc-strings?
> 
> On Sat, 26 Jan 2013 17:19:32 +0100, Antoine Pitrou wrote:
>> On Sat, 26 Jan 2013 17:03:59 +0100
>> Stefan Krah wrote:
>> > Stefan Krah wrote:
>> > > I'm not sure how accurate the output is for measuring these
>> > > things, but according to ``ls'' and ``du'' the option is indeed
>> > > quite worthless:
>> > >
>> > > ./configure CFLAGS="-Os -s" LDFLAGS="-s" && make
>> > > 1.8M Jan 26 16:36 python
>> > > ./configure --without-doc-strings CFLAGS="-Os -s" LDFLAGS="-s" && make
>> > > 1.6M Jan 26 16:33 python
>> >
>> > The original contribution *was* in fact aiming for "10% smaller", see:
>> >
>> > http://docs.python.org/release/2.3/whatsnew/node20.html
>> >
>> > So apparently people thought it was useful.
>>
>> After a bit of digging, I found the following discussions:
>> http://mail.python.org/pipermail/python-dev/2001-November/018444.html
>> http://mail.python.org/pipermail/python-dev/2002-January/019392.html
>> http://bugs.python.org/issue505375
>>
>> Another reason for accepting the patch seemed to be that it
>> introduced the Py_DOCSTR() macros, which were viewed as helpful for
>> other reasons (some people talked about localizing docstrings).
>>
>> I would point out that if 200 KB is really a big win for someone,
>> then Python (and especially Python 3) is probably not the best
>> language for them.
>>
>> It is also ironic how the executable size went up since then (from
>> 0.6 to more than 1.5 MB) :-)
>
> 200K can make a difference. It does on the QNX platform, for example,
> where there is no virtual memory. It would be nice to reduce that
> executable size, too... but I'm not volunteering to try (at least not
> yet) :)
>
> --David
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/kristjan%40ccpgames.
> com
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com
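[Editor's note: the thread above distinguishes two mechanisms: `python -OO` strips docstrings from Python code objects at compile time, while `--without-doc-strings` removes the docstrings compiled into the interpreter binary itself — which is why Victor's suggestion does not address Kristján's use case. A minimal sketch demonstrating the difference, assuming a stock CPython build (i.e. one NOT configured with `--without-doc-strings`):]

```python
# Sketch: run a probe under -OO in a child interpreter. -OO strips the
# docstring of the Python-level function f, but len's docstring lives as
# static data in the binary and survives -OO; only a --without-doc-strings
# build would remove it.
import subprocess
import sys

probe = (
    "def f():\n"
    "    'python-level docstring'\n"
    "print(f.__doc__ is None, len.__doc__ is None)"
)

out = subprocess.check_output([sys.executable, "-OO", "-c", probe])
print(out.decode().strip())  # on a normal build: "True False"
```

The asymmetry in the output is the whole point of the thread: `-OO` only helps with the ~200 KB of C-level docstrings if the interpreter itself was also built without them.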