From ronaldoussoren at mac.com Thu Aug 1 09:03:33 2013 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Thu, 1 Aug 2013 09:03:33 +0200 Subject: [Python-Dev] [Python-checkins] cpython (merge 3.3 -> default): Silence warning about set but unused variable inside compile_atom() in In-Reply-To: <3c57bZ3BgyzPsJ@mail.python.org> References: <3c57bZ3BgyzPsJ@mail.python.org> Message-ID: <630FBD5A-ACEF-4826-AEF2-3F2C83A5CA35@mac.com> On 31 Jul, 2013, at 23:50, christian.heimes wrote: > http://hg.python.org/cpython/rev/0e09588a3bc2 > changeset: 84939:0e09588a3bc2 > parent: 84937:809a64ecd5f1 > parent: 84938:83a55ca935f0 > user: Christian Heimes > date: Wed Jul 31 23:48:04 2013 +0200 > summary: > Silence warning about set but unused variable inside compile_atom() in non-debug builds > > files: > Parser/pgen.c | 1 + > 1 files changed, 1 insertions(+), 0 deletions(-) > > > diff --git a/Parser/pgen.c b/Parser/pgen.c > --- a/Parser/pgen.c > +++ b/Parser/pgen.c > @@ -283,6 +283,7 @@ > > REQ(n, ATOM); > i = n->n_nchildren; > + (void)i; /* Don't warn about set but unused */ > REQN(i, 1); Why didn't you change this to "REQN(n->nchilderen, 1);" (and then remove variable "i")? 
Ronald > n = n->n_child; > if (n->n_type == LPAR) { > > -- > Repository URL: http://hg.python.org/cpython > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins From christian at python.org Thu Aug 1 09:49:32 2013 From: christian at python.org (Christian Heimes) Date: Thu, 01 Aug 2013 09:49:32 +0200 Subject: [Python-Dev] [Python-checkins] cpython (merge 3.3 -> default): Silence warning about set but unused variable inside compile_atom() in In-Reply-To: <630FBD5A-ACEF-4826-AEF2-3F2C83A5CA35@mac.com> References: <3c57bZ3BgyzPsJ@mail.python.org> <630FBD5A-ACEF-4826-AEF2-3F2C83A5CA35@mac.com> Message-ID: Am 01.08.2013 09:03, schrieb Ronald Oussoren: > > On 31 Jul, 2013, at 23:50, christian.heimes wrote: > >> http://hg.python.org/cpython/rev/0e09588a3bc2 >> changeset: 84939:0e09588a3bc2 >> parent: 84937:809a64ecd5f1 >> parent: 84938:83a55ca935f0 >> user: Christian Heimes >> date: Wed Jul 31 23:48:04 2013 +0200 >> summary: >> Silence warning about set but unused variable inside compile_atom() in non-debug builds >> >> files: >> Parser/pgen.c | 1 + >> 1 files changed, 1 insertions(+), 0 deletions(-) >> >> >> diff --git a/Parser/pgen.c b/Parser/pgen.c >> --- a/Parser/pgen.c >> +++ b/Parser/pgen.c >> @@ -283,6 +283,7 @@ >> >> REQ(n, ATOM); >> i = n->n_nchildren; >> + (void)i; /* Don't warn about set but unused */ >> REQN(i, 1); > > Why didn't you change this to "REQN(n->nchilderen, 1);" (and then remove variable "i")? > > Ronald > >> n = n->n_child; >> if (n->n_type == LPAR) { It doesn't work because a few lines later the code does: n = n->n_child; if (n->n_type == LPAR) { REQN(i, 3); n is no longer the right n and REQN(i, 3) would fail. 
Christian From ronaldoussoren at mac.com Thu Aug 1 10:26:25 2013 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Thu, 1 Aug 2013 10:26:25 +0200 Subject: [Python-Dev] [Python-checkins] cpython (merge 3.3 -> default): Silence warning about set but unused variable inside compile_atom() in In-Reply-To: References: <3c57bZ3BgyzPsJ@mail.python.org> <630FBD5A-ACEF-4826-AEF2-3F2C83A5CA35@mac.com> Message-ID: On 1 Aug, 2013, at 9:49, Christian Heimes wrote: > Am 01.08.2013 09:03, schrieb Ronald Oussoren: >> >> On 31 Jul, 2013, at 23:50, christian.heimes wrote: >> >>> http://hg.python.org/cpython/rev/0e09588a3bc2 >>> changeset: 84939:0e09588a3bc2 >>> parent: 84937:809a64ecd5f1 >>> parent: 84938:83a55ca935f0 >>> user: Christian Heimes >>> date: Wed Jul 31 23:48:04 2013 +0200 >>> summary: >>> Silence warning about set but unused variable inside compile_atom() in non-debug builds >>> >>> files: >>> Parser/pgen.c | 1 + >>> 1 files changed, 1 insertions(+), 0 deletions(-) >>> >>> >>> diff --git a/Parser/pgen.c b/Parser/pgen.c >>> --- a/Parser/pgen.c >>> +++ b/Parser/pgen.c >>> @@ -283,6 +283,7 @@ >>> >>> REQ(n, ATOM); >>> i = n->n_nchildren; >>> + (void)i; /* Don't warn about set but unused */ >>> REQN(i, 1); >> >> Why didn't you change this to "REQN(n->nchilderen, 1);" (and then remove variable "i")? >> >> Ronald >> >>> n = n->n_child; >>> if (n->n_type == LPAR) { > > It doesn't work because a few lines later the code does: > > n = n->n_child; > if (n->n_type == LPAR) { > REQN(i, 3); > > n is no longer the right n and REQN(i, 3) would fail. I overlooked that one. 
Thanks for the explanation, Ronald > > Christian > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ronaldoussoren%40mac.com From ncoghlan at gmail.com Thu Aug 1 14:44:12 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 1 Aug 2013 22:44:12 +1000 Subject: [Python-Dev] PEP 8 modernisation Message-ID: With feedback from Guido, Barry, Raymond and others, I have updated PEP 8 to better describe our current development practices. It started as an update to describe the different between public and internal interfaces and to advise against using wildcard imports, but became substantially more :) For those that want full details, the relevant commit and tracker issue are here: http://hg.python.org/peps/rev/fb24c80e9afb http://bugs.python.org/issue18472 If you're responsible for a coding standard that includes PEP 8 by reference, you probably want to take a look at these :) For everyone else, here are the highlights: 1. Made it clear this is a living document (using the approach of creating a tracker issue for major updates and adding a new footnote referencing that issue) 2. Added more specific points to the "foolish consistency" section to help out those folks resisting pointless PEP 8 compliance for code that predates the existence of the PEP's recommendations. 3. Stopped being wishy-washy about tabs vs spaces. Use spaces :) 4. Lines up to 99 characters are now permitted (but 79 is still the preferred limit) 5. The encodings section is now emphatically in favour of UTF-8 (latin-1 is no longer even mentioned) 6. While absolute imports are still favoured, explicit relative imports are deemed acceptable 7. Wildcard imports are strongly discouraged for most cases (with dynamic republishing the only acceptable use case, since PEP 8 doesn't apply at all for the interactive prompt) 8. 
New section explaining the distinction between public and internal interfaces (and how to tell which is which) 9. Explicit guideline not to assign lambdas to names (use def, that's what it's for) 10. Various tweaks to the exception raising and handling guidelines 11. Explicit recommendation to use a decorator in conjunction with annotations in third party experiments Cheers, Nick. P.S. It's possible this should also be published through python-announce and other channels... -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Thu Aug 1 15:10:20 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 1 Aug 2013 15:10:20 +0200 Subject: [Python-Dev] PEP 8 modernisation References: Message-ID: <20130801151020.2628a690@pitrou.net> Le Thu, 1 Aug 2013 22:44:12 +1000, Nick Coghlan a ?crit : > 4. Lines up to 99 characters are now permitted (but 79 is still the > preferred limit) Something magic about 99? cheers Antoine. From fred at fdrake.net Thu Aug 1 15:16:13 2013 From: fred at fdrake.net (Fred Drake) Date: Thu, 1 Aug 2013 09:16:13 -0400 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: <20130801151020.2628a690@pitrou.net> References: <20130801151020.2628a690@pitrou.net> Message-ID: On Thu, Aug 1, 2013 at 9:10 AM, Antoine Pitrou wrote: > Something magic about 99? I'm guessing it's short enough you can say you tried, but long enough to annoy traditionalists anyway. I'm annoyed already. :-) -Fred -- Fred L. Drake, Jr. "A storm broke loose in my mind." --Albert Einstein From ncoghlan at gmail.com Thu Aug 1 15:21:49 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 1 Aug 2013 23:21:49 +1000 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: <20130801151020.2628a690@pitrou.net> References: <20130801151020.2628a690@pitrou.net> Message-ID: On 1 August 2013 23:10, Antoine Pitrou wrote: > Le Thu, 1 Aug 2013 22:44:12 +1000, > Nick Coghlan a ?crit : >> 4. 
Lines up to 99 characters are now permitted (but 79 is still the >> preferred limit) > > Something magic about 99? One less than 100, same as 79 is one less than 80. The "100" came from Guido :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rdmurray at bitdance.com Thu Aug 1 16:21:42 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Thu, 01 Aug 2013 10:21:42 -0400 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: <20130801151020.2628a690@pitrou.net> Message-ID: <20130801142142.A3C942500B9@webabinitio.net> On Thu, 01 Aug 2013 09:16:13 -0400, Fred Drake wrote: > On Thu, Aug 1, 2013 at 9:10 AM, Antoine Pitrou wrote: > > Something magic about 99? > > I'm guessing it's short enough you can say you tried, but long > enough to annoy traditionalists anyway. > > I'm annoyed already. :-) +1 :) My terminal windows are usually wider than 80 chars, but I still find it far far better to limit myself to 79 columns, because it gives me the flexibility to narrow the windows at need (eg: :vsplit in vi to see several files side-by-side). The (small) improvement in readability of longer lines is far less significant to me than the loss of readability when I want narrower windows (or run into them in code review tools, as mentioned). But of course this is just my opinion :) :) --David From steve at pearwood.info Thu Aug 1 16:31:18 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 02 Aug 2013 00:31:18 +1000 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: Message-ID: <51FA7136.8060909@pearwood.info> On 01/08/13 22:44, Nick Coghlan wrote: > 4. Lines up to 99 characters are now permitted (but 79 is still the > preferred limit) Coincidentally, there was a discussion about line length on python-list over the last couple of days. 
I think the two most relevant comments are by Skip Montanaro: http://mail.python.org/pipermail/python-list/2013-July/652977.html http://mail.python.org/pipermail/python-list/2013-July/653046.html If I may be permitted to paraphrase: - publishers and printers have been dealing with readability of text for an awfully long time, and they pretty much all use a de facto standard of 70-80 characters per line; - most lines of code are short, stretching the max out to 100 characters when most lines are around 60 just ends up wasting screen real estate if your editor window is wide enough to deal with the max. To that last point, I add: it's even worse if you keep the editor relatively narrow, since now you have a few lines that require horizontal scrolling, which is awful, or line-wrapping, neither of which are palatable. -- Steven From kxepal at gmail.com Thu Aug 1 16:34:27 2013 From: kxepal at gmail.com (Alexander Shorin) Date: Thu, 1 Aug 2013 18:34:27 +0400 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: Message-ID: Hi Nick, On Thu, Aug 1, 2013 at 4:44 PM, Nick Coghlan wrote: > 9. Explicit guideline not to assign lambdas to names (use def, that's > what it's for) Even for propose to fit chars-per-line limit and/or to remove duplicates (especially for sorted groupby case)? -- ,,,^..^,,, From ronaldoussoren at mac.com Thu Aug 1 16:41:46 2013 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Thu, 1 Aug 2013 16:41:46 +0200 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: Message-ID: On 1 Aug, 2013, at 16:34, Alexander Shorin wrote: > Hi Nick, > > On Thu, Aug 1, 2013 at 4:44 PM, Nick Coghlan wrote: >> 9. Explicit guideline not to assign lambdas to names (use def, that's >> what it's for) > > Even for propose to fit chars-per-line limit and/or to remove > duplicates (especially for sorted groupby case)? When you do "name = lambda ..." 
you've created a named function, when you do that your better of using def statement for the reasons Nick mentioned in the PEP. Ronald > > -- > ,,,^..^,,, > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ronaldoussoren%40mac.com From ldlandis at gmail.com Thu Aug 1 16:43:03 2013 From: ldlandis at gmail.com (LD 'Gus' Landis) Date: Thu, 1 Aug 2013 08:43:03 -0600 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: <51FA7136.8060909@pearwood.info> References: <51FA7136.8060909@pearwood.info> Message-ID: On Thu, Aug 1, 2013 at 8:31 AM, Steven D'Aprano wrote: > On 01/08/13 22:44, Nick Coghlan wrote: > > 4. Lines up to 99 characters are now permitted (but 79 is still the >> preferred limit) >> > > Coincidentally, there was a discussion about line length on python-list > over the last couple of days. I think the two most relevant comments are by > Skip Montanaro: > > http://mail.python.org/**pipermail/python-list/2013-**July/652977.html > http://mail.python.org/**pipermail/python-list/2013-**July/653046.html > > I believe there may be a relationship to the 7 plus or minus 2 (times 10) of human conceptual limits. Personally I find it very difficult to read text with long lines. Historically two or three column (newspaper/book) with a barrier margin was used to get much more text on the page, but still the reader had much shorter "chunks" to absorb. > If I may be permitted to paraphrase: > > - publishers and printers have been dealing with readability of text for > an awfully long time, and they pretty much all use a de facto standard of > 70-80 characters per line; > > - most lines of code are short, stretching the max out to 100 characters > when most lines are around 60 just ends up wasting screen real estate if > your editor window is wide enough to deal with the max. 
> > To that last point, I add: it's even worse if you keep the editor > relatively narrow, since now you have a few lines that require horizontal > scrolling, which is awful, or line-wrapping, neither of which are palatable. > > > > -- > Steven > > ______________________________**_________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/**mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/**mailman/options/python-dev/** > ldlandis%40gmail.com > -- --- NOTE: If it is important CALL ME - I may miss email, which I do NOT normally check on weekends nor on a regular basis during any other day. --- LD Landis - N0YRQ - de la tierra del encanto 3960 Schooner Loop, Las Cruces, NM 88012 651-340-4007 N32 21'48.28" W106 46'5.80" -------------- next part -------------- An HTML attachment was scrubbed... URL: From kxepal at gmail.com Thu Aug 1 16:48:38 2013 From: kxepal at gmail.com (Alexander Shorin) Date: Thu, 1 Aug 2013 18:48:38 +0400 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: Message-ID: Hi Ronald, I understand this, but I'm a bit confused about fate of lambdas with such guideline since I see no more reasons to use them with p.9 statement: long lines, code duplicate, no mock and well tests etc. - all these problems could be solved with assigning lambda to some name, but now they are looks useless (or useful only for very trivial cases) -- ,,,^..^,,, On Thu, Aug 1, 2013 at 6:41 PM, Ronald Oussoren wrote: > > On 1 Aug, 2013, at 16:34, Alexander Shorin wrote: > >> Hi Nick, >> >> On Thu, Aug 1, 2013 at 4:44 PM, Nick Coghlan wrote: >>> 9. Explicit guideline not to assign lambdas to names (use def, that's >>> what it's for) >> >> Even for propose to fit chars-per-line limit and/or to remove >> duplicates (especially for sorted groupby case)? > > When you do "name = lambda ..." 
you've created a named function, when you > do that your better of using def statement for the reasons Nick mentioned > in the PEP. > > Ronald > >> >> -- >> ,,,^..^,,, >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: http://mail.python.org/mailman/options/python-dev/ronaldoussoren%40mac.com > From ronaldoussoren at mac.com Thu Aug 1 16:53:16 2013 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Thu, 1 Aug 2013 16:53:16 +0200 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: Message-ID: On 1 Aug, 2013, at 16:48, Alexander Shorin wrote: > Hi Ronald, > > I understand this, but I'm a bit confused about fate of lambdas with > such guideline since I see no more reasons to use them with p.9 > statement: long lines, code duplicate, no mock and well tests etc. - > all these problems could be solved with assigning lambda to some name, > but now they are looks useless (or useful only for very trivial cases) That sounds about right :-) Note that: f = lambda x: x ** 2 And: def f(x): return x ** 2 Are functionally equivalent and use the same byte code. The only differences are that the lambda saves two characters in typing, and the "def" variant has a more useful value in its __name__ attribute. IMHO The lambda variant also looks uglier (even with the def variant on a single line). Ronald > -- > ,,,^..^,,, > > > On Thu, Aug 1, 2013 at 6:41 PM, Ronald Oussoren wrote: >> >> On 1 Aug, 2013, at 16:34, Alexander Shorin wrote: >> >>> Hi Nick, >>> >>> On Thu, Aug 1, 2013 at 4:44 PM, Nick Coghlan wrote: >>>> 9. Explicit guideline not to assign lambdas to names (use def, that's >>>> what it's for) >>> >>> Even for propose to fit chars-per-line limit and/or to remove >>> duplicates (especially for sorted groupby case)? >> >> When you do "name = lambda ..." 
you've created a named function, when you >> do that your better of using def statement for the reasons Nick mentioned >> in the PEP. >> >> Ronald >> >>> >>> -- >>> ,,,^..^,,, >>> _______________________________________________ >>> Python-Dev mailing list >>> Python-Dev at python.org >>> http://mail.python.org/mailman/listinfo/python-dev >>> Unsubscribe: http://mail.python.org/mailman/options/python-dev/ronaldoussoren%40mac.com >> From steve at pearwood.info Thu Aug 1 16:57:24 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 02 Aug 2013 00:57:24 +1000 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: Message-ID: <51FA7754.7040208@pearwood.info> On 01/08/13 22:44, Nick Coghlan wrote: > With feedback from Guido, Barry, Raymond and others, I have updated > PEP 8 to better describe our current development practices. It started > as an update to describe the different between public and internal > interfaces and to advise against using wildcard imports, but became > substantially more :) Before this entire thread be buried in a mountain of controversy over the 79-99 line length issue, let me say thanks Nick and the others for your work on this. -- Steven From martin at v.loewis.de Thu Aug 1 17:03:03 2013 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 01 Aug 2013 17:03:03 +0200 Subject: [Python-Dev] PEP 442 aftermath: module globals at shutdown In-Reply-To: <20130730233223.1e472996@fsol> References: <20130730204200.6d77df8b@fsol> <20130730233223.1e472996@fsol> Message-ID: <51FA78A7.3090905@v.loewis.de> Am 30.07.13 23:32, schrieb Antoine Pitrou: > - it is held alive by a C extension: the main example is the locale > module, which is held alive by _io and in turn keeps alive other > Python modules (such as collections or re). If the _locale module would use PEP 3121 (issue15662), this problem should go away. 
Regards, Martin From kxepal at gmail.com Thu Aug 1 17:03:12 2013 From: kxepal at gmail.com (Alexander Shorin) Date: Thu, 1 Aug 2013 19:03:12 +0400 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: Message-ID: ...and, if so, why lambda's?(: Without backward compatibility point I see that they are getting "unofficially" deprecated and their usage is dishonoured. -- ,,,^..^,,, On Thu, Aug 1, 2013 at 6:53 PM, Ronald Oussoren wrote: > > On 1 Aug, 2013, at 16:48, Alexander Shorin wrote: > >> Hi Ronald, >> >> I understand this, but I'm a bit confused about fate of lambdas with >> such guideline since I see no more reasons to use them with p.9 >> statement: long lines, code duplicate, no mock and well tests etc. - >> all these problems could be solved with assigning lambda to some name, >> but now they are looks useless (or useful only for very trivial cases) > > That sounds about right :-) > > Note that: > > f = lambda x: x ** 2 > > And: > > def f(x): return x ** 2 > > Are functionally equivalent and use the same byte code. The only differences > are that the lambda saves two characters in typing, and the "def" variant has > a more useful value in its __name__ attribute. > > IMHO The lambda variant also looks uglier (even with the def variant on a single line). > > Ronald > >> -- >> ,,,^..^,,, >> >> >> On Thu, Aug 1, 2013 at 6:41 PM, Ronald Oussoren wrote: >>> >>> On 1 Aug, 2013, at 16:34, Alexander Shorin wrote: >>> >>>> Hi Nick, >>>> >>>> On Thu, Aug 1, 2013 at 4:44 PM, Nick Coghlan wrote: >>>>> 9. Explicit guideline not to assign lambdas to names (use def, that's >>>>> what it's for) >>>> >>>> Even for propose to fit chars-per-line limit and/or to remove >>>> duplicates (especially for sorted groupby case)? >>> >>> When you do "name = lambda ..." you've created a named function, when you >>> do that your better of using def statement for the reasons Nick mentioned >>> in the PEP. 
>>> >>> Ronald >>> >>>> >>>> -- >>>> ,,,^..^,,, >>>> _______________________________________________ >>>> Python-Dev mailing list >>>> Python-Dev at python.org >>>> http://mail.python.org/mailman/listinfo/python-dev >>>> Unsubscribe: http://mail.python.org/mailman/options/python-dev/ronaldoussoren%40mac.com >>> > From skip at pobox.com Thu Aug 1 17:05:10 2013 From: skip at pobox.com (Skip Montanaro) Date: Thu, 1 Aug 2013 10:05:10 -0500 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: <51FA7136.8060909@pearwood.info> References: <51FA7136.8060909@pearwood.info> Message-ID: > http://mail.python.org/pipermail/python-list/2013-July/653046.html One correspondent objected that I was artificial biasing my histogram because wrapped lines are, more-or-less by definition, going to be < 80 characters. Off-list I responded with a modified version of my graph where I eliminated all lines which ended in my preferred continuation characters (open paren-like things and commas). The resulting histogram is attached (count as a function of line length). This makes the "wasted space" argument even stronger. Generally, when I wrap a line, I wrap it fairly near the limit, so by eliminating them, the shorter lines stand out more clearly. Skip -------------- next part -------------- A non-text attachment was scrubbed... Name: square2.png Type: image/png Size: 20390 bytes Desc: not available URL: From steve at pearwood.info Thu Aug 1 17:06:18 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 02 Aug 2013 01:06:18 +1000 Subject: [Python-Dev] Lambda [was Re: PEP 8 modernisation] In-Reply-To: References: Message-ID: <51FA796A.50001@pearwood.info> Hi Alexander, On 02/08/13 00:48, Alexander Shorin wrote: > Hi Ronald, > > I understand this, but I'm a bit confused about fate of lambdas with > such guideline since I see no more reasons to use them with p.9 > statement: long lines, code duplicate, no mock and well tests etc. 
- > all these problems could be solved with assigning lambda to some name, > but now they are looks useless (or useful only for very trivial cases) Lambda is still useful for the reason lambda has always been useful: it is an expression, not a statement, so you can embed it directly where needed. # Preferred: sorted(data, key=lambda value: value['spam'].casefold()) # Allowed: def f(value): return value['spam'].casefold() sorted(data, key=f) # Prohibited: f = lambda value: value['spam'].casefold() sorted(data, key=f) # SyntaxError: sorted(data, key=def f(value): value['spam'].casefold()) -- Steven From rdmurray at bitdance.com Thu Aug 1 17:07:48 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Thu, 01 Aug 2013 11:07:48 -0400 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: Message-ID: <20130801150749.33E5E25003E@webabinitio.net> On Thu, 01 Aug 2013 16:53:16 +0200, Ronald Oussoren wrote: > On 1 Aug, 2013, at 16:48, Alexander Shorin wrote: > > I understand this, but I'm a bit confused about fate of lambdas with > > such guideline since I see no more reasons to use them with p.9 > > statement: long lines, code duplicate, no mock and well tests etc. - > > all these problems could be solved with assigning lambda to some name, > > but now they are looks useless (or useful only for very trivial cases) > > That sounds about right :-) I don't understand the cases being mentioned in the question, but there are certainly places where lambdas are useful. The most obvious is as arguments to functions that expect functions as arguments. But yes, even in those cases if a lambda isn't fairly trivial, it probably shouldn't be a lambda. 
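[Editor's note: the lambda-vs-def distinction discussed in this thread can be sketched concretely. This is an illustrative snippet, not text from PEP 8 itself; the names `f` and `g` are placeholders:]

```python
# Two functionally equivalent ways to define a squaring function.
f = lambda x: x ** 2      # binding a lambda to a name -- discouraged by PEP 8

def g(x):                 # preferred: def gives the function a real name
    return x ** 2

# Both behave identically when called:
print(f(3), g(3))         # 9 9

# ...but only the def form carries a useful __name__, which shows up
# in tracebacks, repr() output and profiler reports.
print(f.__name__)         # <lambda>
print(g.__name__)         # g
```

The lambda expression itself remains fine when passed inline, e.g. `sorted(data, key=lambda v: v['spam'].casefold())`, exactly as in Steven's "Preferred" example.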
--David From solipsis at pitrou.net Thu Aug 1 17:09:48 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 1 Aug 2013 17:09:48 +0200 Subject: [Python-Dev] PEP 8 modernisation References: <20130801151020.2628a690@pitrou.net> Message-ID: <20130801170948.00c37446@pitrou.net> Le Thu, 1 Aug 2013 23:21:49 +1000, Nick Coghlan a écrit : > On 1 August 2013 23:10, Antoine Pitrou wrote: > > Le Thu, 1 Aug 2013 22:44:12 +1000, > > Nick Coghlan a écrit : > >> 4. Lines up to 99 characters are now permitted (but 79 is still the > >> preferred limit) > > > > Something magic about 99? > > One less than 100, same as 79 is one less than 80. The "100" came > from Guido :) Yes, I've heard about those spiffy BCD computers in the powerful datacenters of American companies :-) (and after all, BCD == ABC + 1) Regards Antoine. From ronaldoussoren at mac.com Thu Aug 1 17:11:00 2013 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Thu, 1 Aug 2013 17:11:00 +0200 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: Message-ID: On 1 Aug, 2013, at 17:03, Alexander Shorin wrote: > ...and, if so, why lambda's?(: Without backward compatibility point I > see that they are getting "unofficially" deprecated and their usage is > dishonoured. They are still useful for simple functions that you use in one place, such as the key argument to sorted. By the time you assign a name to the function and give it unittests you may as well use a def-statement and let the function know its own name. Ronald > > -- > ,,,^..^,,, > > > On Thu, Aug 1, 2013 at 6:53 PM, Ronald Oussoren wrote: >> >> On 1 Aug, 2013, at 16:48, Alexander Shorin wrote: >> >>> Hi Ronald, >>> >>> I understand this, but I'm a bit confused about fate of lambdas with >>> such guideline since I see no more reasons to use them with p.9 >>> statement: long lines, code duplicate, no mock and well tests etc.
- >>> all these problems could be solved with assigning lambda to some name, >>> but now they are looks useless (or useful only for very trivial cases) >> >> That sounds about right :-) >> >> Note that: >> >> f = lambda x: x ** 2 >> >> And: >> >> def f(x): return x ** 2 >> >> Are functionally equivalent and use the same byte code. The only differences >> are that the lambda saves two characters in typing, and the "def" variant has >> a more useful value in its __name__ attribute. >> >> IMHO The lambda variant also looks uglier (even with the def variant on a single line). >> >> Ronald >> >>> -- >>> ,,,^..^,,, >>> >>> >>> On Thu, Aug 1, 2013 at 6:41 PM, Ronald Oussoren wrote: >>>> >>>> On 1 Aug, 2013, at 16:34, Alexander Shorin wrote: >>>> >>>>> Hi Nick, >>>>> >>>>> On Thu, Aug 1, 2013 at 4:44 PM, Nick Coghlan wrote: >>>>>> 9. Explicit guideline not to assign lambdas to names (use def, that's >>>>>> what it's for) >>>>> >>>>> Even for propose to fit chars-per-line limit and/or to remove >>>>> duplicates (especially for sorted groupby case)? >>>> >>>> When you do "name = lambda ..." you've created a named function, when you >>>> do that your better of using def statement for the reasons Nick mentioned >>>> in the PEP. >>>> >>>> Ronald >>>> >>>>> >>>>> -- >>>>> ,,,^..^,,, >>>>> _______________________________________________ >>>>> Python-Dev mailing list >>>>> Python-Dev at python.org >>>>> http://mail.python.org/mailman/listinfo/python-dev >>>>> Unsubscribe: http://mail.python.org/mailman/options/python-dev/ronaldoussoren%40mac.com >>>> >> From alexander.belopolsky at gmail.com Thu Aug 1 17:05:18 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 1 Aug 2013 11:05:18 -0400 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: <20130801142142.A3C942500B9@webabinitio.net> References: <20130801151020.2628a690@pitrou.net> <20130801142142.A3C942500B9@webabinitio.net> Message-ID: On Thu, Aug 1, 2013 at 10:21 AM, R. 
David Murray wrote: > > I'm guessing it's short enough you can say you tried, but long > > enough to annoy traditionalists anyway. > > > > I'm annoyed already. :-) > > +1 :) +1 :) I recently gave up and reset the default auto-wrap margin to 120 locally. This change had little effect on code because most line breaks in code are inserted manually anyway. However, docstrings are beginning to suffer. The "short description" line is not that short anymore, and multi-paragraph prose filled between the 4- and 120-character margins is hard to read. I will start experimenting with the 100-char limit, but I think it is still too wide for auto-wrapped text. Maybe we should have a stronger recommendation to keep the 80-char limit for docstrings and other embedded text. It is OK to have an occasional long line in code, but readability suffers when you have every line close to 100 chars. Another observation is that long lines in code are usually heavily indented. This makes them still readable because non-white characters still fit within the field of view. Again, this is not the case for docstrings, comments or other embedded prose. From solipsis at pitrou.net Thu Aug 1 17:11:49 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 1 Aug 2013 17:11:49 +0200 Subject: [Python-Dev] PEP 442 aftermath: module globals at shutdown References: <20130730204200.6d77df8b@fsol> <20130730233223.1e472996@fsol> <51FA78A7.3090905@v.loewis.de> Message-ID: <20130801171149.42415087@pitrou.net> Le Thu, 01 Aug 2013 17:03:03 +0200, "Martin v. Löwis" a écrit : > On 30.07.13 23:32, Antoine Pitrou wrote: > > - it is held alive by a C extension: the main example is the locale > > module, which is held alive by _io and in turn keeps alive other > > Python modules (such as collections or re). > > If the _locale module would use PEP 3121 (issue15662), this problem > should go away.
Not really: I'm talking about the pure Python locale module. However, I've got another solution for this one (using weakrefs, unsurprisingly): http://bugs.python.org/issue18608 cheers Antoine. From barry at python.org Thu Aug 1 17:35:38 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 1 Aug 2013 11:35:38 -0400 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: <20130801151020.2628a690@pitrou.net> <20130801142142.A3C942500B9@webabinitio.net> Message-ID: <20130801113538.2172611f@anarchist> On Aug 01, 2013, at 11:05 AM, Alexander Belopolsky wrote: >I will start experimenting with 100-char limit, but I think it is still too >wide for auto-wrapped text. Maybe we should have a stronger >recommendation to keep 80-char limit for docstrings and other embedded >text. It is OK to have an occasional long line in code, but readability >suffers when you have every line close to 100 chars. In general, long lines are a smell that the code is trying to express something too complex or is being too clever. Using various strategies judiciously usually leads to better, more readable code (e.g. use a local variable, wrap the line after open parens, don't chain too many calls, etc.) I'm not counting exceptions of course, it's PEP 8 after all! So I would greatly prefer that stdlib files be kept to the 79 character limit. I see most violations of this in the library documents, but especially there, paragraphs should be wrapped to 79 characters, and can easily be done without losing expressability. Cheers, -Barry From alexander.belopolsky at gmail.com Thu Aug 1 17:35:54 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 1 Aug 2013 11:35:54 -0400 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: Message-ID: On Thu, Aug 1, 2013 at 11:03 AM, Alexander Shorin wrote: > ...and, if so, why lambda's?(: Without backward compatibility point I > see that they are getting "unofficially" deprecated and their usage is > dishonored. 
> > Here is one use-case where .. = lambda .. cannot be replaced with def .. op['add'] = lambda x,y: x+y op['mul'] = lambda x, y: x*y ..
From rdmurray at bitdance.com Thu Aug 1 17:52:10 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Thu, 01 Aug 2013 11:52:10 -0400 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: <20130801113538.2172611f@anarchist> References: <20130801151020.2628a690@pitrou.net> <20130801142142.A3C942500B9@webabinitio.net> <20130801113538.2172611f@anarchist> Message-ID: <20130801155210.96DB525003E@webabinitio.net> On Thu, 01 Aug 2013 11:35:38 -0400, Barry Warsaw wrote: > So I would greatly prefer that stdlib files be kept to the 79 character > limit. I see most violations of this in the library documents, but especially > there, paragraphs should be wrapped to 79 characters, and can easily be done > without losing expressibility. The documentation often has line lengths longer than 80 chars for two reasons: (1) the original translation from TeX was done by a script, and the script had a bug in it that made the lines slightly too long, and no one noticed in time, and (2) until relatively recently Sphinx didn't support wrapping prototype lines (it now does). So as we edit the docs, we re-wrap. Just like we do with the legacy code :) The code examples in the docs are a bit trickier, since if you wrap the source to 79 you wind up with even-shorter-than-79 wrapping in the actual code lines, which can look odd when the text is rendered. So there it's a judgement call... but I still generally try to wrap the source to 79, sometimes refactoring the example to make that more elegant. Which, as you point out, often makes it better as well :).
--David From barry at python.org Thu Aug 1 17:59:49 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 1 Aug 2013 11:59:49 -0400 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: <20130801155210.96DB525003E@webabinitio.net> References: <20130801151020.2628a690@pitrou.net> <20130801142142.A3C942500B9@webabinitio.net> <20130801113538.2172611f@anarchist> <20130801155210.96DB525003E@webabinitio.net> Message-ID: <20130801115949.1c8629a2@anarchist> On Aug 01, 2013, at 11:52 AM, R. David Murray wrote: >So as we edit the docs, we re-wrap. Just like we do with the legacy >code :) +1! -Barry From kxepal at gmail.com Thu Aug 1 18:58:07 2013 From: kxepal at gmail.com (Alexander Shorin) Date: Thu, 1 Aug 2013 20:58:07 +0400 Subject: [Python-Dev] Lambda [was Re: PEP 8 modernisation] In-Reply-To: <51FA796A.50001@pearwood.info> References: <51FA796A.50001@pearwood.info> Message-ID: Hi Steven, On Thu, Aug 1, 2013 at 7:06 PM, Steven D'Aprano wrote: > Hi Alexander, > > On 02/08/13 00:48, Alexander Shorin wrote: >> >> Hi Ronald, >> >> I understand this, but I'm a bit confused about fate of lambdas with >> such guideline since I see no more reasons to use them with p.9 >> statement: long lines, code duplicate, no mock and well tests etc. - >> all these problems could be solved with assigning lambda to some name, >> but now they are looks useless (or useful only for very trivial cases) > > > Lambda is still useful for the reason lambda has always been useful: it is > an expression, not a statement, so you can embed it directly where needed. > > # Preferred: > sorted(data, key=lambda value: value['spam'].casefold()) > > # Allowed: > def f(value): return value['spam'].casefold() > sorted(data, key=f) > > # Prohibited: > f = lambda value: value['spam'].casefold() > sorted(data, key=f) > > # SyntaxError: > sorted(data, key=def f(value): value['spam'].casefold()) The case: items = [[0, 'foo'], [3, 'baz'], [2, 'foo'], [1, 'bar']] Need to group by second item. 
Quite common task: >>> from itertools import groupby >>> >>> for key, items in groupby(items, key=lambda i: i[1]): >>> print(key, ':', list(items)) foo : [[0, 'foo']] baz : [[3, 'baz']] foo : [[2, 'foo']] bar : [[1, 'bar']] oops, that failed; we need to sort things first by this item, and it looks like we have to duplicate the grouping function: fun = lambda i: i[1] for key, items in groupby(sorted(items, key=fun), key=fun): print(key, ':', list(items)) Ok, the PEP suggests using defs, so we add 3 more lines (before and after the def + return) to the code: def fun(i): return i[1] for key, items in groupby(sorted(items, key=fun), key=fun): print(key, ':', list(items)) so that's the question: what is the rationale for this if lambdas successfully solve the problem with a minimal amount of typing, code and thinking? I thought there should be only one way to do something, but this PEP-8 statement conflicts with the PEP-20 one: > There should be one-- and preferably only one --obvious way to do it. It's really not obvious why lambdas couldn't be assigned to some name, especially in light of the fact that if they are passed to some function as an argument, they will be assigned to some name.
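[Editor's note: for this particular pattern the standard library already provides a named, reusable key function, operator.itemgetter, which sidesteps the lambda-vs-def question entirely. A minimal runnable sketch of the sort-then-group idiom discussed above:]

```python
from itertools import groupby
from operator import itemgetter

items = [[0, 'foo'], [3, 'baz'], [2, 'foo'], [1, 'bar']]
key = itemgetter(1)  # same as lambda i: i[1], but named and reusable

# Sort first so equal keys are adjacent, then group with the same key.
for k, group in groupby(sorted(items, key=key), key=key):
    print(k, ':', list(group))
```

This prints the bar, baz and foo groups in sorted key order, with the two 'foo' rows merged into a single group.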
-- ,,,^..^,,, From rosuav at gmail.com Thu Aug 1 19:01:39 2013 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 1 Aug 2013 18:01:39 +0100 Subject: [Python-Dev] Lambda [was Re: PEP 8 modernisation] In-Reply-To: References: <51FA796A.50001@pearwood.info> Message-ID: On Thu, Aug 1, 2013 at 5:58 PM, Alexander Shorin wrote: > fun = lambda i: i[1] > for key, items in groupby(sorted(items, key=fun), key=fun): > print(key, ':', list(items)) I'd do a direct translation to def here: def fun(i): return i[1] for key, items in groupby(sorted(items, key=fun), key=fun): print(key, ':', list(items)) ChrisA From brett at python.org Thu Aug 1 21:33:54 2013 From: brett at python.org (Brett Cannon) Date: Thu, 1 Aug 2013 15:33:54 -0400 Subject: [Python-Dev] Lambda [was Re: PEP 8 modernisation] In-Reply-To: References: <51FA796A.50001@pearwood.info> Message-ID: On Thu, Aug 1, 2013 at 12:58 PM, Alexander Shorin wrote: > Hi Steven, > > On Thu, Aug 1, 2013 at 7:06 PM, Steven D'Aprano > wrote: > > Hi Alexander, > > > > On 02/08/13 00:48, Alexander Shorin wrote: > >> > >> Hi Ronald, > >> > >> I understand this, but I'm a bit confused about fate of lambdas with > >> such guideline since I see no more reasons to use them with p.9 > >> statement: long lines, code duplicate, no mock and well tests etc. - > >> all these problems could be solved with assigning lambda to some name, > >> but now they are looks useless (or useful only for very trivial cases) > > > > > > Lambda is still useful for the reason lambda has always been useful: it > is > > an expression, not a statement, so you can embed it directly where > needed. 
> > > > # Preferred: > > sorted(data, key=lambda value: value['spam'].casefold()) > > > > # Allowed: > > def f(value): return value['spam'].casefold() > > sorted(data, key=f) > > > > # Prohibited: > > f = lambda value: value['spam'].casefold() > > sorted(data, key=f) > > > > # SyntaxError: > > sorted(data, key=def f(value): value['spam'].casefold()) > > The case: > > items = [[0, 'foo'], [3, 'baz'], [2, 'foo'], [1, 'bar']] > > Need to group by second item. Quite common task: > > >>> from itertools import groupby > >>> > >>> for key, items in groupby(items, key=lambda i: i[1]): > >>> print(key, ':', list(items)) > foo : [[0, 'foo']] > baz : [[3, 'baz']] > foo : [[2, 'foo']] > bar : [[1, 'bar']] > > oops, failed, we need to sort things first by this item and it looks > we have to duplicate grouping function: > > fun = lambda i: i[1] > for key, items in groupby(sorted(items, key=fun), key=fun): > print(key, ':', list(items)) > > Ok, PEP suggests to use defs, so we adds 3 more lines (before and > after def + return) to code: > > def fun(i): > return i[1] > > for key, items in groupby(sorted(items, key=fun), key=fun): > print(key, ':', list(items)) > > so that's the question: what is the rationale of this if lambdas > successfully solves the problem with minimal amount of typing, code > and thinking? I thought there should be only one way to do something, > but this PEP-8 statement conflicts with PEP-20 one: > > > There should be one-- and preferably only one --obvious way to do it. > > It's really not oblivious why lambdas couldn't be assignment to some > name, especially in the light of fact that if they are been passed to > some function as argument, they will be assignee to some name. > Just because you can doesn't mean you should. This guideline is all about being explicit over implicit, not about saving typing. If you want to bind a function to a name then you should use a def to specify that fact; you also lose some things otherwise (e.g. __name__ is not set). 
Lambdas should be thought of as one-off functions you write inline because doing so expresses the intent of the code just as well. Assigning a lambda to a variable is in no way more beneficial than using def, so this guideline suggests you use def to make the code at least as clear, if not clearer, and to gain benefits such as __name__ being set (which helps with debugging, etc.).
From stephen at xemacs.org Thu Aug 1 21:41:19 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 02 Aug 2013 04:41:19 +0900 Subject: [Python-Dev] Lambda [was Re: PEP 8 modernisation] In-Reply-To: References: <51FA796A.50001@pearwood.info> Message-ID: <87ob9hmc68.fsf@uwakimon.sk.tsukuba.ac.jp> Chris Angelico writes: > On Thu, Aug 1, 2013 at 5:58 PM, Alexander Shorin wrote: > > fun = lambda i: i[1] > > for key, items in groupby(sorted(items, key=fun), key=fun): > > print(key, ':', list(items)) > > I'd do a direct translation to def here: > > def fun(i): return i[1] > for key, items in groupby(sorted(items, key=fun), key=fun): > print(key, ':', list(items)) As long as it's about readability, why not make it readable? def second(pair): return pair[1] for key, items in groupby(sorted(items, key=second), key=second): print(key, ':', list(items)) I realize it's somewhat unfair (for several reasons) to compare that to Alexander's "fun = lambda i: i[1]", but I can't help feeling that in another sense it is fair.
From nd at perlig.de Thu Aug 1 21:49:43 2013 From: nd at perlig.de (André Malo) Date: Thu, 1 Aug 2013 21:49:43 +0200 Subject: [Python-Dev] Lambda [was Re: PEP 8 modernisation] In-Reply-To: <87ob9hmc68.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87ob9hmc68.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <201308012149.43596@news.perlig.de> * Stephen J.
Turnbull wrote: > Chris Angelico writes: > > On Thu, Aug 1, 2013 at 5:58 PM, Alexander Shorin wrote: > > > fun = lambda i: i[1] > > > for key, items in groupby(sorted(items, key=fun), key=fun): > > > print(key, ':', list(items)) > > > > I'd do a direct translation to def here: > > > > def fun(i): return i[1] > > for key, items in groupby(sorted(items, key=fun), key=fun): > > print(key, ':', list(items)) > > As long as it's about readability, why not make it readable? > > def second(pair): return pair[1] > for key, items in groupby(sorted(items, key=second), key=second): > print(key, ':', list(items)) > > I realize it's somewhat unfair (for several reasons) to compare that > to Alexander's "fun = lambda i: i[1]", but I can't help feeling that > in another sense it is fair. Seems to run OT somewhat, but "second" is probably a bad name here. If the key changes, you have to rename it in several places (or worse, you DON'T rename it, and then the readability is gone). Usually I'm using a name with "key" in it - describing what it's for, not how it's done. The minimal distance to its usage is supporting that, too. nd -- "Das Verhalten von Gates hatte mir bewiesen, dass ich auf ihn und seine beiden Gefährten nicht zu zählen brauchte" -- Karl May, "Winnetou III" Im Westen was neues:
From tjreedy at udel.edu Thu Aug 1 22:29:07 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 01 Aug 2013 16:29:07 -0400 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: Message-ID: On 8/1/2013 10:34 AM, Alexander Shorin wrote: > Hi Nick, > > On Thu, Aug 1, 2013 at 4:44 PM, Nick Coghlan wrote: >> 9. Explicit guideline not to assign lambdas to names (use def, that's >> what it's for) > > Even for propose to fit chars-per-line limit def f(x): return 2*x f = lambda x: 2*x Three spaces is seldom a crucial difference. If the expression is so long it goes past the limit (whatever we decide it is), it can be wrapped.
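[Editor's note: Brett's point upthread about __name__ is easy to demonstrate; the three characters saved by the lambda assignment cost the function its own name in reprs and tracebacks. A quick sketch:]

```python
def f(x): return 2 * x   # bound via a def statement
g = lambda x: 2 * x      # bound via assignment (the style PEP 8 rejects)

print(f.__name__)  # f
print(g.__name__)  # <lambda>
```

Both calls behave identically (f(3) == g(3) == 6); only the introspection and debugging experience differs.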
> and/or to remove duplicates (especially for sorted groupby case)? I do not understand this. -- Terry Jan Reedy From tjreedy at udel.edu Thu Aug 1 22:35:10 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 01 Aug 2013 16:35:10 -0400 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: Message-ID: On 8/1/2013 10:48 AM, Alexander Shorin wrote: > I understand this, but I'm a bit confused about fate of lambdas with > such guideline since I see no more reasons to use them with p.9 > statement: long lines, code duplicate, no mock and well tests etc. - > all these problems could be solved with assigning lambda to some name, > but now they are looks useless (or useful only for very trivial cases) I do not understand most of that, but... The guideline is not meant to cover passing a function by parameter name. mylist.sort(key=lambda x: x[0]) is still ok. Does "Always use a def statement instead of assigning a lambda expression to a name." need 'in an assignment statement' added? -- Terry Jan Reedy From tjreedy at udel.edu Thu Aug 1 22:36:34 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 01 Aug 2013 16:36:34 -0400 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: Message-ID: On 8/1/2013 11:03 AM, Alexander Shorin wrote: > ...and, if so, why lambda's?(: Without backward compatibility point I > see that they are getting "unofficially" deprecated and their usage is > dishonoured. Please stop both the top-posting and the FUD. -- Terry Jan Reedy From brian at python.org Thu Aug 1 22:44:15 2013 From: brian at python.org (Brian Curtin) Date: Thu, 1 Aug 2013 15:44:15 -0500 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: Message-ID: On Thu, Aug 1, 2013 at 3:36 PM, Terry Reedy wrote: > On 8/1/2013 11:03 AM, Alexander Shorin wrote: >> >> ...and, if so, why lambda's?(: Without backward compatibility point I >> see that they are getting "unofficially" deprecated and their usage is >> dishonoured. 
> > Please stop both the top-posting and the FUD. Top posting doesn't matter. The end.
From alexander.belopolsky at gmail.com Thu Aug 1 22:52:19 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 1 Aug 2013 16:52:19 -0400 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: Message-ID: On Thu, Aug 1, 2013 at 4:29 PM, Terry Reedy wrote: > def f(x): return 2*x > f = lambda x: 2*x > Am I the only one who finds the second line above much more readable than the first? The def statement is not intended to be written in one line. The readability suffers because the argument is separated from the value expression by the return keyword. When the def statement is written traditionally: def f(x): return 2*x it is easy to run the eyes over the right margin and recognize a function that in a math paper would be written as "f: x -> 2 x". The same is true about the lambda expression. While the C# syntax "f = (x => 2*x)" is probably closest to mathematical notation, "f = lambda x: 2*x" is close enough. One can mentally focus on the "x: 2*x" part and ignore the rest.
From tjreedy at udel.edu Thu Aug 1 22:56:55 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 01 Aug 2013 16:56:55 -0400 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: Message-ID: On 8/1/2013 11:35 AM, Alexander Belopolsky wrote: > Here is one use-case where .. = lambda .. cannot be replaced with def .. > > op['add'] = lambda x,y: x+y > op['mul'] = lambda x, y: x*y Yes, you are binding the functions to named slots, not to names, so not covered by the PEP. One might still want to replace the expressions themselves, at the cost of more typing, for the advantage of better representations.
op = { 'add': lambda x,y: x*y, 'mul': lambda x, y: x+y} print(op) # no apparent problem # {'add': <function <lambda> at 0x000000000227F730>, # 'mul': <function <lambda> at 0x00000000033867B8>} def add(x, y): return x + y def mul(x, y): return x * y # These can be unittested individually op = {'add': mul, 'mul': add} # mistake easily seen in original code print(op) # {'add': <function mul at 0x...>, # 'mul': <function add at 0x...>} # problem apparent to a user who imports this object and prints it when code fails If op has 20 such functions, names become even more of an advantage -- Terry Jan Reedy
From brian at python.org Thu Aug 1 22:57:12 2013 From: brian at python.org (Brian Curtin) Date: Thu, 1 Aug 2013 15:57:12 -0500 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: Message-ID: On Thu, Aug 1, 2013 at 3:44 PM, Brian Curtin wrote: > On Thu, Aug 1, 2013 at 3:36 PM, Terry Reedy wrote: >> On 8/1/2013 11:03 AM, Alexander Shorin wrote: >>> >>> ...and, if so, why lambda's?(: Without backward compatibility point I >>> see that they are getting "unofficially" deprecated and their usage is >>> dishonoured. >> >> >> Please stop both the top-posting and the FUD. > > Top posting doesn't matter. > > The end. Actually, quick expansion on this before moving along: if you're going to call someone out for top posting, you can't ignore the many high-profile people who do it every time and single out the newcomer. That's why I said something. Sorry for the OT.
From ncoghlan at gmail.com Thu Aug 1 23:48:07 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 2 Aug 2013 07:48:07 +1000 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: <20130801151020.2628a690@pitrou.net> <20130801142142.A3C942500B9@webabinitio.net> Message-ID: On 2 Aug 2013 01:18, "Alexander Belopolsky" wrote: > > > On Thu, Aug 1, 2013 at 10:21 AM, R. David Murray wrote: >> >> > I'm guessing it's short enough you can say you tried, but long >> > enough to annoy traditionalists anyway. >> > >> > I'm annoyed already.
:-) >> >> +1 :) > > +1 :) > > I recently gave up and reset default auto-wrap margin to 120 locally. This change had little effect on code because most line breaks in code are inserted manually anyways. However, docstrings are beginning to suffer. The "short description" line is not that short anymore and multi-paragraph prose filled between 4- and 120-characters margin is hard to read. > > I will start experimenting with 100-char limit, but I think it is still too wide for auto-wrapped text. Maybe we should have a stronger recommendation to keep 80-char limit for docstrings and other embedded text. It is OK to have an occasional long line in code, but readability suffers when you have every line close to 100 chars. 1. The recommended length limit for flowed text is still *72* (since it doesn't have the structural constraints code does). 2. The preferred length limit for code is still 79. The "up to 99 if it improves readability" escape clause was added because Guido deemed the occasional long line a lesser evil than the contortions he has seen people apply to code to stay within the 79 character limit (most notably, using cryptic variable names because they're shorter). That entire section of the PEP was completely rewritten - we didn't just s/79/99/ with the old content. Cheers, Nick. > > Another observation is that long lines in code are usually heavily indented. This makes them still readable because non-white characters still fit within the field of view. Again, this is not the case for docstrings, comments or other embedded prose.
From victor.stinner at gmail.com Fri Aug 2 01:59:18 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 2 Aug 2013 01:59:18 +0200 Subject: [Python-Dev] PEP 446: Open issues/questions In-Reply-To: References: <20130728114341.23ec940f@fsol> <20130729101130.3b12d0fb@pitrou.net> Message-ID: 2013/7/30 Richard Oudkerk : > The documentation for STARTUPINFO says this about STARTF_USESTDHANDLES: > > If this flag is specified when calling one of the process creation > functions, the handles must be inheritable and the function's > bInheritHandles parameter must be set to TRUE. > > So, as I said, if you redirect the streams then you inherit all inheritable > handles. Ouch! It means that all Python applications redirecting at least one standard stream inherit almost all open handles, including open files. The situation on Windows is worse than what I expected. If I understood correctly, making new handles and new file descriptors non-inheritable by default on Windows would improve the situation because they will not stay "open" (they are not inheritable anymore) in child processes. On Windows, a file cannot be removed if at least one process opened it. If you create a temporary file, run a program, and delete the temporary file: the deletion fails if the program inherited the file and the program is not done before the deletion. Is this correct? I didn't check this use case on Windows, but it is similar to the following Haskell (GHC) issue: http://ghc.haskell.org/trac/ghc/ticket/2650 Victor
From victor.stinner at gmail.com Fri Aug 2 02:21:19 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 2 Aug 2013 02:21:19 +0200 Subject: [Python-Dev] PEP 446: Open issues/questions In-Reply-To: References: <20130728114341.23ec940f@fsol> <20130729101130.3b12d0fb@pitrou.net> Message-ID: 2013/7/30 Victor Stinner : > It would be nice to have a "pass_handles" on Windows. I'm not sure that it's possible to implement this atomically.
It's probably better to leave the application to choose how the inheritance is defined. Example: for handle in handles: os.set_inheritable(handle, True) subprocess.call(...) for handle in handles: os.set_inheritable(handle, False) This example is safe if the application has a single thread (if a single thread spawns new programs). Making handles non-inheritable again may be useless. Victor
From victor.stinner at gmail.com Fri Aug 2 02:27:43 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 2 Aug 2013 02:27:43 +0200 Subject: [Python-Dev] PEP 446: Open issues/questions In-Reply-To: <20130728114341.23ec940f@fsol> References: <20130728114341.23ec940f@fsol> Message-ID: 2013/7/28 Antoine Pitrou : >> (A) How should we support platforms where os.set_inheritable() is not >> supported? Can we announce that os.set_inheritable() is always >> available or not? Does such a platform exist? > > FD_CLOEXEC is POSIX: > http://pubs.opengroup.org/onlinepubs/9699919799/functions/fcntl.html Ok, but this information does not help me. Does Python support non-POSIX platforms? (Windows has HANDLE_FLAG_INHERIT.) If we cannot answer my question, it's safer to leave os.get/set_inheritable() optional (need hasattr in tests for example). Victor
From victor.stinner at gmail.com Fri Aug 2 02:36:40 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 2 Aug 2013 02:36:40 +0200 Subject: [Python-Dev] PEP 446: Open issues/questions In-Reply-To: <20130730112912.53e108a7@pitrou.net> References: <20130728114341.23ec940f@fsol> <20130729101130.3b12d0fb@pitrou.net> <20130730112912.53e108a7@pitrou.net> Message-ID: 2013/7/30 Antoine Pitrou : > On Tue, 30 Jul 2013 09:09:38 +0200, > Charles-François Natali wrote: >> This looks more and more like PEP 433 :-) >> >> And honestly, when I think about it, I think that this whole mess is a >> solution looking for a problem. >> If we don't want to inherit file descriptors in child processes, the >> answer is simple: the subprocess module (this fact is not even >> mentioned in the PEP).
>> If we don't want to inherit file descriptors in child processes, the >> answer is simple: the subprocess module (this fact is not even >> mentioned in the PEP). > > This is a good point. Are there any reasons (other than fd inheritance) > not to use subprocess? If there are, perhaps we should try to eliminate > them by improving subprocess. On Windows, inheritable handles (including open files) are still inherited when a standard stream is overriden in the subprocess module (default value of close_fds is set to False in this case). This issue cannot be solved (at least, I don't see how): it is a limitation of Windows. bInheritedHandles must be set to FALSE (inherit *all* inheritable handles) when handles of standard streams are specified in the startup information of CreateProcess(). Victor From vito.detullio at gmail.com Thu Aug 1 19:55:03 2013 From: vito.detullio at gmail.com (Vito De Tullio) Date: Thu, 01 Aug 2013 19:55:03 +0200 Subject: [Python-Dev] Lambda [was Re: PEP 8 modernisation] References: <51FA796A.50001@pearwood.info> Message-ID: Steven D'Aprano wrote: > Lambda is still useful for the reason lambda has always been useful: it is > an expression, not a statement, so you can embed it directly where needed. are there some possibilities to change def to an expression? do I need to wait 'till python9k? yes, this brings to the possibility to write something like foo = def bar(): pass but at least should let the lambda to die in peace... -- ZeD From cf.natali at gmail.com Fri Aug 2 08:32:53 2013 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Fri, 2 Aug 2013 08:32:53 +0200 Subject: [Python-Dev] PEP 446: Open issues/questions In-Reply-To: References: <20130728114341.23ec940f@fsol> Message-ID: 2013/8/2 Victor Stinner : > 2013/7/28 Antoine Pitrou : >>> (A) How should we support support where os.set_inheritable() is not >>> supported? Can we announce that os.set_inheritable() is always >>> available or not? 
Does such a platform exist? >> >> FD_CLOEXEC is POSIX: >> http://pubs.opengroup.org/onlinepubs/9699919799/functions/fcntl.html > > Ok, but this information does not help me. Does Python support > non-POSIX platforms? (Windows has HANDLE_FLAG_INHERIT.) > > If we cannot answer my question, it's safer to leave > os.get/set_inheritable() optional (need hasattr in tests for example). On Unix platforms, you should always have FD_CLOEXEC. If there were such a platform without FD inheritance support, then it would probably make sense to make it a no-op anyway. cf
From cf.natali at gmail.com Fri Aug 2 08:44:37 2013 From: cf.natali at gmail.com (Charles-François Natali) Date: Fri, 2 Aug 2013 08:44:37 +0200 Subject: [Python-Dev] PEP 446: Open issues/questions In-Reply-To: References: <20130728114341.23ec940f@fsol> <20130729101130.3b12d0fb@pitrou.net> <20130730112912.53e108a7@pitrou.net> Message-ID: 2013/8/2 Victor Stinner : > On Windows, inheritable handles (including open files) are still > inherited when a standard stream is overridden in the subprocess module > (default value of close_fds is set to False in this case). This issue > cannot be solved (at least, I don't see how): it is a limitation of > Windows. bInheritHandles must be set to TRUE (which inherits *all* > inheritable handles) when handles of standard streams are specified in > the startup information of CreateProcess(). Then how about changing the default to creating file descriptors uninheritable on Windows (which is apparently the default)? Then you can implement keep_fds by setting them inheritable right before creation, and resetting them right after: sure there's a race in a multi-threaded program, but AFAICT that's already the case right now, and the Windows API doesn't leave us any other choice.
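[Editor's note: this design was ultimately accepted as PEP 446 for Python 3.4, where new file descriptors are created non-inheritable by default and os.get_inheritable()/os.set_inheritable() flip the flag explicitly. A minimal sketch on 3.4+:]

```python
import os

r, w = os.pipe()
print(os.get_inheritable(r))   # False: new descriptors are non-inheritable by default
os.set_inheritable(r, True)    # opt in explicitly before spawning a child
print(os.get_inheritable(r))   # True
os.close(r)
os.close(w)
```

On POSIX this maps to the FD_CLOEXEC flag discussed above; on Windows it maps to HANDLE_FLAG_INHERIT.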
Amusingly, they address this case by recommending putting process creation in a critical section: http://support.microsoft.com/kb/315939/en-us This way, we keep the default platform behavior on Unix and on Windows (so users using low-level syscalls/APIs won't be surprised), and we have a clean way to selectively inherit FDs in child processes through subprocess. cf
From kxepal at gmail.com Fri Aug 2 09:28:46 2013 From: kxepal at gmail.com (Alexander Shorin) Date: Fri, 2 Aug 2013 11:28:46 +0400 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: Message-ID: Hi Terry, On Fri, Aug 2, 2013 at 12:29 AM, Terry Reedy wrote: > def f(x): return 2*x > f = lambda x: 2*x > Three spaces is seldom a crucial difference. If the expression is so long it goes past the limit (whatever we decide it is), it can be wrapped. and if I have multiple lambda-like defs it will hit the PEP rule: > While sometimes it's okay to put an if/for/while with a small body on the same line, never do this for multi-clause statements. Also avoid folding such long lines! On Fri, Aug 2, 2013 at 12:29 AM, Terry Reedy wrote: >> and/or to remove duplicates (especially for the sorted groupby case)? > I do not understand this. See the [Python-Dev] Lambda [was Re: PEP 8 modernisation] thread for an example: http://mail.python.org/pipermail/python-dev/2013-August/127715.html On Fri, Aug 2, 2013 at 12:35 AM, Terry Reedy wrote: >> I understand this, but I'm a bit confused about the fate of lambdas with >> such a guideline since I see no more reasons to use them with the p.9 >> statement: long lines, code duplication, no mock and well tests etc. - >> all these problems could be solved by assigning a lambda to some name, >> but now they look useless (or useful only for very trivial cases) > >I do not understand most of that, but... >The guideline is not meant to cover passing a function by parameter name. mylist.sort(key=lambda x: x[0]) is still ok.
Does "Always use a def statement >instead of assigning a lambda expression to a name." need 'in an assignment statement' added? I wrote that lambdas' use cases have become too small to use them in real code. If they are discouraged, that needs to be written clearly, instead of limiting their use cases step by step until every Python dev thinks "Lambdas? Why? Remove them!". Using `dict` to store lambdas: > op = { 'add': lambda x,y: x*y, 'mul': lambda x, y: x+y} shows a hack to bypass the PEP 8 guides. Would you like to see the code above instead of: add = lambda x,y: x*y mul = lambda x, y: x+y Probably I wouldn't, since a dict is a black box and I have to check things first before using them. Disclaimer: I'm not trying to stand up for lambdas, and I'm not using them everywhere in my code, but I'd like to know the answer to the question "Why lambdas?". Currently, it is "handy shorthand functions - use them freely", but with the new PEP-8 statement I really have to think "Lambdas? Really, why?". P.S. On Thu, Aug 1, 2013 at 3:36 PM, Terry Reedy wrote: > Please stop both the top-posting and the FUD. Sorry, different ML, different rules. Do you know a mail client that allows per-address reply settings? I don't, but I would like to see your suggestions in a private answer. Thanks. -- ,,,^..^,,, On Fri, Aug 2, 2013 at 12:56 AM, Terry Reedy wrote: > On 8/1/2013 11:35 AM, Alexander Belopolsky wrote: > >> Here is one use-case where .. = lambda .. cannot be replaced with def .. >> >> op['add'] = lambda x,y: x+y >> op['mul'] = lambda x, y: x*y > > > Yes, you are binding the functions to named slots, not to names, so not > covered by the PEP. One might still want to replace the expressions > themselves, at the cost of more typing, for the advantage of better > representations.
> > op = { 'add': lambda x,y: x*y, 'mul': lambda x, y: x+y} > print(op) # no apparent problem > # {'add': <function <lambda> at 0x000000000227F730>, > # 'mul': <function <lambda> at 0x00000000033867B8>} > > def add(x, y): return x + y > def mul(x, y): return x * y > # These can be unittested individually > > op = {'add': mul, 'mul': add} # mistake easily seen in original code > print(op) > # {'add': <function mul at 0x...>, > # 'mul': <function add at 0x...>} > # problem apparent to user who import this object and prints it when code > fails > > If op has 20 such functions, names become even more of an advantage > > -- > Terry Jan Reedy
From shibturn at gmail.com Fri Aug 2 10:02:00 2013 From: shibturn at gmail.com (Richard Oudkerk) Date: Fri, 02 Aug 2013 09:02:00 +0100 Subject: [Python-Dev] PEP 446: Open issues/questions In-Reply-To: References: <20130728114341.23ec940f@fsol> <20130729101130.3b12d0fb@pitrou.net> <20130730112912.53e108a7@pitrou.net> Message-ID: On 02/08/2013 7:44am, Charles-François Natali wrote: > Then how about changing the default to creating file descriptors > uninheritable on Windows (which is apparently the default)? > Then you can implement keep_fds by setting them inheritable right > before creation, and resetting them right after: sure there's a race > in a multi-threaded program, but AFAICT that's already the case right > now, and the Windows API doesn't leave us any other choice. > Amusingly, they address this case by recommending putting process > creation in a critical section: > http://support.microsoft.com/kb/315939/en-us > > This way, we keep the default platform behavior on Unix and on Windows (so > users using low-level syscalls/APIs won't be surprised), and we have a > clean way to selectively inherit FDs in child processes through > subprocess.
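[Editor's note: the "selective inheritance through subprocess" mentioned above already has a hook in the API: Popen's pass_fds parameter (POSIX only, Python 3.2+) keeps just the listed descriptors open in the child while close_fds closes everything else. A minimal sketch, assuming a Unix host:]

```python
import os
import subprocess
import sys

r, w = os.pipe()
os.set_inheritable(r, True)  # explicit on 3.4+; subprocess also handles this for pass_fds

# The child inherits only fd r (plus the standard streams); all other fds are closed.
proc = subprocess.Popen(
    [sys.executable, "-c",
     "import os, sys; sys.stdout.write(os.read(int(sys.argv[1]), 5).decode())",
     str(r)],
    pass_fds=(r,),
    stdout=subprocess.PIPE)

os.write(w, b"hello")
os.close(w)
os.close(r)
out, _ = proc.communicate()
print(out.decode())  # hello
```

The child reads five bytes from the single inherited pipe end and echoes them back, demonstrating that the descriptor survived while the rest were closed.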
http://bugs.python.org/issue16500 is a proposal/patch for adding atfork.
But it also has a public recursive lock which is held when starting
processes using fork()/subprocess/multiprocessing. This is so that users
can safely manipulate fds while holding the lock, without these sorts of
race conditions. For example

    with atfork.getlock():
        fd = os.open("somefile", os.O_CREAT | os.O_WRONLY, 0o600)
        flags = fcntl.fcntl(fd, fcntl.F_GETFD)
        fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)

--
Richard

From shibturn at gmail.com Fri Aug 2 10:10:32 2013
From: shibturn at gmail.com (Richard Oudkerk)
Date: Fri, 02 Aug 2013 09:10:32 +0100
Subject: [Python-Dev] PEP 446: Open issues/questions
In-Reply-To:
References: <20130728114341.23ec940f@fsol> <20130729101130.3b12d0fb@pitrou.net>
Message-ID:

On 02/08/2013 12:59am, Victor Stinner wrote:
> On Windows, a file cannot be removed if at least one process opened
> it. If you create a temporary file, run a program, and delete the
> temporary file: the deletion fails if the program inherited the file
> and the program is not done before the deletion.

Is this correct? It depends whether the handles use FILE_SHARE_DELETE.
The Python module tempfile uses the O_TEMPORARY flag which implies
FILE_SHARE_DELETE. This means that the file *can* be deleted/moved while
there are open handles. Annoyingly the folder containing that file still
cannot be deleted while there are open handles.

--
Richard

From shibturn at gmail.com Fri Aug 2 10:16:07 2013
From: shibturn at gmail.com (Richard Oudkerk)
Date: Fri, 02 Aug 2013 09:16:07 +0100
Subject: [Python-Dev] PEP 446: Open issues/questions
In-Reply-To:
References: <20130728114341.23ec940f@fsol> <20130729101130.3b12d0fb@pitrou.net>
Message-ID:

On 02/08/2013 1:21am, Victor Stinner wrote:
> 2013/7/30 Victor Stinner :
>> It would be nice to have a "pass_handles" on Windows.
>
> I'm not sure that it's possible to implement this atomically.
> It's probably better to leave the application to choose how the
> inheritance is defined.
>
> Example:
>
> for handle in handles:
>     os.set_inheritable(handle, True)
> subprocess.call(...)
> for handle in handles:
>     os.set_inheritable(handle, False)
>
> This example is safe if the application has a single thread (if a
> single thread spawns new programs). Making handles non-inheritable
> again may be useless.

If we have a recursive lock which is always held when Python starts a
process then you could write:

    with subprocess.getlock():
        for handle in handles:
            os.set_inheritable(handle, True)
        subprocess.call(...)
        for handle in handles:
            os.set_inheritable(handle, False)

This should be used by fork() and multiprocessing too.

--
Richard

From ncoghlan at gmail.com Fri Aug 2 11:10:19 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 2 Aug 2013 19:10:19 +1000
Subject: [Python-Dev] PEP 8 modernisation
In-Reply-To:
References:
Message-ID:

On 2 Aug 2013 17:31, "Alexander Shorin" wrote:
>
> Hi Terry,
>
> On Fri, Aug 2, 2013 at 12:29 AM, Terry Reedy wrote:
> > def f(x): return 2*x
> > f = lambda x: 2*x
> > Three spaces is seldom a crucial difference. If the expression is so
> > long it goes past the limit (whatever we decide it is), it can be
> > wrapped.
>
> and if I have multiple lambda-like def`s it will hit the PEP rule:
> > While sometimes it's okay to put an if/for/while with a small body on
> > the same line, never do this for multi-clause statements. Also avoid
> > folding such long lines!
>
> On Fri, Aug 2, 2013 at 12:29 AM, Terry Reedy wrote:
> >> and/or to remove duplicates (especially for sorted groupby case)?
> > I do not understand this.
> > See [Python-Dev] Lambda [was Re: PEP 8 modernisation] thread for example:
> http://mail.python.org/pipermail/python-dev/2013-August/127715.html
>
> On Fri, Aug 2, 2013 at 12:35 AM, Terry Reedy wrote:
> >> I understand this, but I'm a bit confused about fate of lambdas with
> >> such guideline since I see no more reasons to use them with p.9
> >> statement: long lines, code duplicate, no mock and well tests etc. -
> >> all these problems could be solved with assigning lambda to some name,
> >> but now they are looks useless (or useful only for very trivial cases)
> >
> > I do not understand most of that, but...
> > The guideline is not meant to cover passing a function by parameter
> > name. mylist.sort(key=lambda x: x[0]) is still ok. Does "Always use a
> > def statement instead of assigning a lambda expression to a name."
> > need 'in an assignment statement' added?
>
> I wrote about that lambda`s use case become too small to use them in
> real code. If they are dishonoured - need to write so and clearly, but
> not limiting their use cases step by step till every Python devs will
> think like "Lambdas? Why? Remove them!".

Lambda was almost removed in Python 3.
All PEP 8 is now saying is that giving a lambda a name is to completely
misunderstand what they're for.

Cheers,
Nick.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From kxepal at gmail.com Fri Aug 2 13:02:01 2013
From: kxepal at gmail.com (Alexander Shorin)
Date: Fri, 2 Aug 2013 15:02:01 +0400
Subject: [Python-Dev] PEP 8 modernisation
In-Reply-To:
References:
Message-ID:

On Fri, Aug 2, 2013 at 1:10 PM, Nick Coghlan wrote:
> Lambda was almost removed in Python 3.
>
>> Using `dict` to store lambdas:
>>
>> > op = { 'add': lambda x,y: x*y, 'mul': lambda x, y: x+y}
>>
>> Shows the hack to bypass PEP8 guides. Do you like to see code above
>> instead of:
>>
>> add = lambda x,y: x*y
>> mul = lambda x, y: x+y
>>
>> Probably, I don't since dict is a blackbox and I have to check things
>> first before use them.
>
> People are free to write their own style guides that disagree with pep 8
> (a point which is now made explicitly in the PEP).
>
>> Disclaimer: I don't try to stand for lambdas, I'm not using them
>> everywhere in my code, but I'd like to know answer for the question
>> "Why lambdas?". Currently, it is "Handy shorthand functions - use them
>> free", but with new PEP-8 statement I really have to think like
>> "Lambdas? Really, why?".
>
> Use them for an anonymous function as an expression. All PEP 8 is now
> saying is that giving a lambda a name is to completely misunderstand
> what they're for.
>
> Cheers,
> Nick.

Thanks for explanations, Nick, I'd got the point.
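[Editorial sketch, not part of the thread: the distinction behind the PEP 8 rule — a `def` keeps its own name for reprs and tracebacks, while a lambda bound to a name stays anonymous — is easy to demonstrate:]

```python
# A def'd function knows its own name, which shows up in reprs and
# tracebacks; a lambda bound to a name does not.
def add(x, y):
    return x + y

mul = lambda x, y: x * y   # the style the updated PEP 8 advises against

assert add.__name__ == 'add'        # useful in reprs and tracebacks
assert mul.__name__ == '<lambda>'   # the binding name is lost
```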
--
,,,^..^,,,

From victor.stinner at gmail.com Fri Aug 2 13:23:11 2013
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 2 Aug 2013 13:23:11 +0200
Subject: [Python-Dev] PEP 446: Open issues/questions
In-Reply-To:
References: <20130728114341.23ec940f@fsol>
Message-ID:

Le 2 août 2013 08:32, "Charles-François Natali" a écrit :
>
> 2013/8/2 Victor Stinner :
> > 2013/7/28 Antoine Pitrou :
> >>> (A) How should we support platforms where os.set_inheritable() is not
> >>> supported? Can we announce that os.set_inheritable() is always
> >>> available or not? Does such platform exist?
> >>
> >> FD_CLOEXEC is POSIX:
> >> http://pubs.opengroup.org/onlinepubs/9699919799/functions/fcntl.html
> >
> > Ok, but this information does not help me. Does Python support
> > non-POSIX platforms? (Windows has HANDLE_FLAG_INHERIT.)
> >
> > If we cannot answer my question, it's safer to leave
> > os.get/set_inheritable() optional (need hasattr in tests for example).
>
> On Unix platforms, you should always have FD_CLOEXEC.
> If there were such a platform without FD inheritance support, then it
> would probably make sense to make it a no-op

Ok, and os.get_inheritable() can also always return False. I would
prefer to fail with a compiler error if the platform is unknown. A
platform might support FD inheritance, but with something different
than fcntl() or ioctl() (ex: Windows).

Victor
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From victor.stinner at gmail.com Fri Aug 2 13:27:05 2013
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 2 Aug 2013 13:27:05 +0200
Subject: [Python-Dev] PEP 446: Open issues/questions
In-Reply-To:
References: <20130728114341.23ec940f@fsol> <20130729101130.3b12d0fb@pitrou.net> <20130730112912.53e108a7@pitrou.net>
Message-ID:

> http://support.microsoft.com/kb/315939/en-us

Ah yes, we may implement pass_handles on Windows using a critical
section to inherit *handles*.
File descriptors cannot be inherited using CreateProcess(), only using
spawn(). Or can we rely on the undocumented fields used by spawn()?

Victor
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From victor.stinner at gmail.com Fri Aug 2 13:30:56 2013
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 2 Aug 2013 13:30:56 +0200
Subject: [Python-Dev] PEP 446: Open issues/questions
In-Reply-To:
References: <20130728114341.23ec940f@fsol> <20130729101130.3b12d0fb@pitrou.net> <20130730112912.53e108a7@pitrou.net>
Message-ID:

Is it possible to implement atfork on Windows?

A Python lock would be ignored by other C threads. It is unsafe if
Python is embedded.

Victor
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From shibturn at gmail.com Fri Aug 2 14:07:53 2013
From: shibturn at gmail.com (Richard Oudkerk)
Date: Fri, 02 Aug 2013 13:07:53 +0100
Subject: [Python-Dev] PEP 446: Open issues/questions
In-Reply-To:
References: <20130728114341.23ec940f@fsol> <20130729101130.3b12d0fb@pitrou.net> <20130730112912.53e108a7@pitrou.net>
Message-ID:

On 02/08/2013 12:30pm, Victor Stinner wrote:
> Is it possible to implement atfork on Windows?

On Windows the patch does expose atfork.getlock() and uses it in
subprocess. (It should also modify os.spawn?(), os.startfile() etc.)
But atfork.atfork() is Unix only.

> A Python lock would be ignored by other C threads. It is unsafe if
> Python is embedded.

True.

--
Richard

From solipsis at pitrou.net Fri Aug 2 14:37:08 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 2 Aug 2013 14:37:08 +0200
Subject: [Python-Dev] PEP 446: Open issues/questions
References: <20130728114341.23ec940f@fsol> <20130729101130.3b12d0fb@pitrou.net> <20130730112912.53e108a7@pitrou.net>
Message-ID: <20130802143708.0c300ee5@pitrou.net>

Le Fri, 2 Aug 2013 13:30:56 +0200,
Victor Stinner a écrit :
> Is it possible to implement atfork on Windows?
>
> A Python lock would be ignored by other C threads. It is unsafe if
> Python is embedded.

It is unsafe if Python is embedded *and* the embedding application uses
fork() + exec().

Regards

Antoine.

From solipsis at pitrou.net Fri Aug 2 14:38:31 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 2 Aug 2013 14:38:31 +0200
Subject: [Python-Dev] PEP 446: Open issues/questions
References: <20130728114341.23ec940f@fsol>
Message-ID: <20130802143831.38104927@pitrou.net>

Le Fri, 2 Aug 2013 02:27:43 +0200,
Victor Stinner a écrit :
> 2013/7/28 Antoine Pitrou :
> >> (A) How should we support platforms where os.set_inheritable() is not
> >> supported? Can we announce that os.set_inheritable() is always
> >> available or not? Does such platform exist?
> >
> > FD_CLOEXEC is POSIX:
> > http://pubs.opengroup.org/onlinepubs/9699919799/functions/fcntl.html
>
> Ok, but this information does not help me. Does Python support
> non-POSIX platforms? (Windows has HANDLE_FLAG_INHERIT.)

Python works under POSIX and Windows. Not sure what else you are
thinking about :-)

Regards

Antoine.

From matthewlmcclure at gmail.com Fri Aug 2 13:51:23 2013
From: matthewlmcclure at gmail.com (Matt McClure)
Date: Fri, 2 Aug 2013 07:51:23 -0400
Subject: [Python-Dev] unittest.TestSuite holding references to unittest.TestCase instances too long
Message-ID:

It seems unittest.TestSuite holds references to unittest.TestCase
instances after the test runs, until the test suite finishes. In a large
suite, where the TestCase instances consume memory during execution,
that can lead to exhausting all available memory and the OS killing the
test process.

What do you think of a change like this?
$ hg diff
diff -r 3bd55ec317a7 Lib/unittest/suite.py
--- a/Lib/unittest/suite.py Thu Aug 01 23:57:21 2013 +0200
+++ b/Lib/unittest/suite.py Fri Aug 02 07:42:22 2013 -0400
@@ -90,7 +90,12 @@
         if getattr(result, '_testRunEntered', False) is False:
             result._testRunEntered = topLevel = True

-        for test in self:
+        while True:
+            try:
+                test = self._tests.pop(0)
+            except IndexError:
+                break
+
             if result.shouldStop:
                 break

See also the conversation on django-developers[1] that led me here.

[1]: https://groups.google.com/forum/#!topic/django-developers/XUMetDSGVT0

--
Matt McClure
http://matthewlmcclure.com
http://www.mapmyfitness.com/profile/matthewlmcclure
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From fuzzyman at voidspace.org.uk Fri Aug 2 17:13:13 2013
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Fri, 2 Aug 2013 18:13:13 +0300
Subject: [Python-Dev] unittest.TestSuite holding references to unittest.TestCase instances too long
In-Reply-To:
References:
Message-ID:

Sent from my iPhone

On 2 Aug 2013, at 14:51, Matt McClure wrote:

> It seems unittest.TestSuite holds references to unittest.TestCase
> instances after the test runs, until the test suite finishes. In a
> large suite, where the TestCase instances consume memory during
> execution, that can lead to exhausting all available memory and the
> OS killing the test process.

Well individual tests not releasing resources could be seen as a bug in
those tests.

That aside, there's an open bug for this with some discussion and a
proposed fix:

http://bugs.python.org/issue11798

The agreed on approach just needs doing.

Michael

>
> What do you think of a change like this?
>
> $ hg diff
> diff -r 3bd55ec317a7 Lib/unittest/suite.py
> --- a/Lib/unittest/suite.py Thu Aug 01 23:57:21 2013 +0200
> +++ b/Lib/unittest/suite.py Fri Aug 02 07:42:22 2013 -0400
> @@ -90,7 +90,12 @@
>          if getattr(result, '_testRunEntered', False) is False:
>              result._testRunEntered = topLevel = True
>
> -        for test in self:
> +        while True:
> +            try:
> +                test = self._tests.pop(0)
> +            except IndexError:
> +                break
> +
>              if result.shouldStop:
>                  break
>
> See also the conversation on django-developers[1] that led me here.
>
> [1]: https://groups.google.com/forum/#!topic/django-developers/XUMetDSGVT0
>
> --
> Matt McClure
> http://matthewlmcclure.com
> http://www.mapmyfitness.com/profile/matthewlmcclure
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From status at bugs.python.org Fri Aug 2 18:07:37 2013
From: status at bugs.python.org (Python tracker)
Date: Fri, 2 Aug 2013 18:07:37 +0200 (CEST)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20130802160737.5748A56A59@psf.upfronthosting.co.za>

ACTIVITY SUMMARY (2013-07-26 - 2013-08-02)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.
Issues counts and deltas: open 4128 (+14) closed 26274 (+54) total 30402 (+68) Open issues with patches: 1856 Issues opened (53) ================== #10241: gc fixes for module m_copy attribute http://bugs.python.org/issue10241 reopened by pitrou #18562: Regex howto: revision pass http://bugs.python.org/issue18562 opened by akuchling #18563: No unit test for yiq to rgb and rgb to yiq converting functio http://bugs.python.org/issue18563 opened by vajrasky #18564: Integer overflow in socketmodule http://bugs.python.org/issue18564 opened by maker #18566: In unittest.TestCase docs for setUp() and tearDown() don't men http://bugs.python.org/issue18566 opened by py.user #18567: Python 2.7.5 CENTOS6 Error building dbm using bdb http://bugs.python.org/issue18567 opened by Denise.Mauldin #18570: OverflowError in division: wrong message http://bugs.python.org/issue18570 opened by marco.buttu #18571: Implementation of the PEP 446: non-inheriable file descriptors http://bugs.python.org/issue18571 opened by haypo #18572: Remove redundant note about surrogates in string escape doc http://bugs.python.org/issue18572 opened by stevenjd #18574: BaseHTTPRequestHandler.handle_expect_100() sends invalid respo http://bugs.python.org/issue18574 opened by Nikratio #18575: Fixing tarfile._mode when using gzip via ":gz" http://bugs.python.org/issue18575 opened by edulix #18576: Rename and document test.script_helper as test.support.script_ http://bugs.python.org/issue18576 opened by ncoghlan #18577: lru_cache enhancement: lru_timestamp helper function http://bugs.python.org/issue18577 opened by peter at psantoro.net #18578: Rename and document test.bytecode_helper as test.support.bytec http://bugs.python.org/issue18578 opened by ncoghlan #18579: Dereference after NULL check in listobject.c merge_hi() http://bugs.python.org/issue18579 opened by christian.heimes #18581: Duplicate test and missing class test in test_abc.py http://bugs.python.org/issue18581 opened by vajrasky #18582: PBKDF2 
support http://bugs.python.org/issue18582 opened by christian.heimes #18583: Idle: enhance FormatParagraph http://bugs.python.org/issue18583 opened by terry.reedy #18585: Add a text truncation function http://bugs.python.org/issue18585 opened by pitrou #18586: Allow running benchmarks for Python 3 from same directory http://bugs.python.org/issue18586 opened by brett.cannon #18588: timeit examples should be consistent http://bugs.python.org/issue18588 opened by claymation #18590: 'Search' and 'Replace' dialogs don't work on quoted text in Wi http://bugs.python.org/issue18590 opened by Sarah #18591: threading.Thread.run returning a result http://bugs.python.org/issue18591 opened by James.Lu #18592: IDLE: Unit test for SearchDialogBase.py http://bugs.python.org/issue18592 opened by philwebster #18594: C accelerator for collections.Counter is slow http://bugs.python.org/issue18594 opened by scoder #18595: zipfile: symlinks etc. http://bugs.python.org/issue18595 opened by ronaldoussoren #18596: enable usage of AddressSanitizer in CPython [PATCH] http://bugs.python.org/issue18596 opened by halfie #18597: On Windows sys.stdin.readline() doesn't handle Ctrl-C properly http://bugs.python.org/issue18597 opened by Drekin #18598: Importlib, more verbosity please http://bugs.python.org/issue18598 opened by Lukáš.Němec #18600: email.policy doc example passes 'policy' to as_string, but tha http://bugs.python.org/issue18600 opened by r.david.murray #18602: "_io" module names itself "io" http://bugs.python.org/issue18602 opened by pitrou #18603: PyOS_mystricmp unused and no longer available http://bugs.python.org/issue18603 opened by christian.heimes #18604: Consolidate gui available checks in test.support http://bugs.python.org/issue18604 opened by terry.reedy #18605: 2.7: test_threading hangs on Solaris 9 http://bugs.python.org/issue18605 opened by automatthias #18606: Add statistics module to standard library http://bugs.python.org/issue18606 opened by stevenjd #18609:
test_ctypes failure on AIX in PyEval_CallObjectWithKeywords http://bugs.python.org/issue18609 opened by David.Edelsohn #18610: wsgiref.validate expects wsgi.input read to give exactly one a http://bugs.python.org/issue18610 opened by Robin.Schoonover #18611: Mac: Some Python Launcher issues http://bugs.python.org/issue18611 opened by ronaldoussoren #18612: More elaborate documentation on how list comprehensions and ge http://bugs.python.org/issue18612 opened by uglemat #18614: Enhanced \N{} escapes for Unicode strings http://bugs.python.org/issue18614 opened by stevenjd #18615: sndhdr.whathdr could return a namedtuple http://bugs.python.org/issue18615 opened by Claudiu.Popa #18616: enable more ssl socket options with get_server_certificate http://bugs.python.org/issue18616 opened by underrun #18617: TLS and Intermediate Certificates http://bugs.python.org/issue18617 opened by dstufft #18620: multiprocessing page leaves out important part of Pool example http://bugs.python.org/issue18620 opened by Chris.Curvey #18621: site.py keeps too much stuff alive when it patches builtins http://bugs.python.org/issue18621 opened by pitrou #18622: reset_mock on mock created by mock_open causes infinite recurs http://bugs.python.org/issue18622 opened by michael.foord #18623: Factor out the _SuppressCoreFiles context manager http://bugs.python.org/issue18623 opened by pitrou #18624: Add alias for iso-8859-8-i which is the same as iso-8859-8 http://bugs.python.org/issue18624 opened by r.david.murray #18625: ks_c-5601-1987 is used by microsoft when it really means cp949 http://bugs.python.org/issue18625 opened by r.david.murray #18626: Make "python -m inspect " dump the source of a module http://bugs.python.org/issue18626 opened by ncoghlan #18628: Better index entry for encoding declarations http://bugs.python.org/issue18628 opened by terry.reedy #18629: future division breaks timedelta division by integer http://bugs.python.org/issue18629 opened by exarkun #18618: Need an 
atexit.register equivalent that also works in subinter http://bugs.python.org/issue18618 opened by pitrou Most recent 15 issues with no replies (15) ========================================== #18629: future division breaks timedelta division by integer http://bugs.python.org/issue18629 #18621: site.py keeps too much stuff alive when it patches builtins http://bugs.python.org/issue18621 #18620: multiprocessing page leaves out important part of Pool example http://bugs.python.org/issue18620 #18616: enable more ssl socket options with get_server_certificate http://bugs.python.org/issue18616 #18615: sndhdr.whathdr could return a namedtuple http://bugs.python.org/issue18615 #18611: Mac: Some Python Launcher issues http://bugs.python.org/issue18611 #18610: wsgiref.validate expects wsgi.input read to give exactly one a http://bugs.python.org/issue18610 #18603: PyOS_mystricmp unused and no longer available http://bugs.python.org/issue18603 #18602: "_io" module names itself "io" http://bugs.python.org/issue18602 #18600: email.policy doc example passes 'policy' to as_string, but tha http://bugs.python.org/issue18600 #18595: zipfile: symlinks etc. 
http://bugs.python.org/issue18595 #18592: IDLE: Unit test for SearchDialogBase.py http://bugs.python.org/issue18592 #18588: timeit examples should be consistent http://bugs.python.org/issue18588 #18586: Allow running benchmarks for Python 3 from same directory http://bugs.python.org/issue18586 #18583: Idle: enhance FormatParagraph http://bugs.python.org/issue18583 Most recent 15 issues waiting for review (15) ============================================= #18621: site.py keeps too much stuff alive when it patches builtins http://bugs.python.org/issue18621 #18616: enable more ssl socket options with get_server_certificate http://bugs.python.org/issue18616 #18615: sndhdr.whathdr could return a namedtuple http://bugs.python.org/issue18615 #18614: Enhanced \N{} escapes for Unicode strings http://bugs.python.org/issue18614 #18610: wsgiref.validate expects wsgi.input read to give exactly one a http://bugs.python.org/issue18610 #18596: enable usage of AddressSanitizer in CPython [PATCH] http://bugs.python.org/issue18596 #18592: IDLE: Unit test for SearchDialogBase.py http://bugs.python.org/issue18592 #18590: 'Search' and 'Replace' dialogs don't work on quoted text in Wi http://bugs.python.org/issue18590 #18588: timeit examples should be consistent http://bugs.python.org/issue18588 #18585: Add a text truncation function http://bugs.python.org/issue18585 #18582: PBKDF2 support http://bugs.python.org/issue18582 #18575: Fixing tarfile._mode when using gzip via ":gz" http://bugs.python.org/issue18575 #18574: BaseHTTPRequestHandler.handle_expect_100() sends invalid respo http://bugs.python.org/issue18574 #18571: Implementation of the PEP 446: non-inheriable file descriptors http://bugs.python.org/issue18571 #18564: Integer overflow in socketmodule http://bugs.python.org/issue18564 Top 10 most discussed issues (10) ================================= #18585: Add a text truncation function http://bugs.python.org/issue18585 9 msgs #17449: dev guide appears not to cover the 
benchmarking suite http://bugs.python.org/issue17449 8 msgs #10241: gc fixes for module m_copy attribute http://bugs.python.org/issue10241 6 msgs #18257: Two copies of python-config http://bugs.python.org/issue18257 6 msgs #16248: Security bug in tkinter allows for untrusted, arbitrary code e http://bugs.python.org/issue16248 5 msgs #18558: Iterable glossary entry needs clarification http://bugs.python.org/issue18558 5 msgs #18570: OverflowError in division: wrong message http://bugs.python.org/issue18570 5 msgs #1666318: shutil.copytree doesn't give control over directory permission http://bugs.python.org/issue1666318 5 msgs #5845: rlcompleter should be enabled automatically http://bugs.python.org/issue5845 4 msgs #15893: Py_FrozenMain() resource leak and missing malloc checks http://bugs.python.org/issue15893 4 msgs Issues closed (47) ================== #3099: On windows, "import nul" always succeed http://bugs.python.org/issue3099 closed by tim.golden #5302: Allow package_data specs/globs to match directories http://bugs.python.org/issue5302 closed by larry #7443: test.support.unlink issue on Windows platform http://bugs.python.org/issue7443 closed by tim.golden #9035: os.path.ismount on windows doesn't support windows mount point http://bugs.python.org/issue9035 closed by tim.golden #11571: Turtle window pops under the terminal on OSX http://bugs.python.org/issue11571 closed by belopolsky #13266: Add inspect.unwrap(f) to easily unravel "__wrapped__" chains http://bugs.python.org/issue13266 closed by python-dev #13463: Fix parsing of package_data http://bugs.python.org/issue13463 closed by larry #15415: Add temp_dir() and change_cwd() to test.support http://bugs.python.org/issue15415 closed by python-dev #15699: PEP 3121, 384 Refactoring applied to readline module http://bugs.python.org/issue15699 closed by pitrou #15892: _PyImport_GetDynLoadFunc() doesn't check return value of fstat http://bugs.python.org/issue15892 closed by christian.heimes #16635: 
Interpreter not closing stdout/stderr on exit http://bugs.python.org/issue16635 closed by neologix #17350: Use STAF call python script will case 1124861 issue in 2.7.2 v http://bugs.python.org/issue17350 closed by terry.reedy #17557: test_getgroups of test_posix can fail on OS X 10.8 if more tha http://bugs.python.org/issue17557 closed by ned.deily #17616: wave.Wave_read and wave.Wave_write can be context managers http://bugs.python.org/issue17616 closed by r.david.murray #17899: os.listdir() leaks FDs if invoked on FD pointing to a non-dire http://bugs.python.org/issue17899 closed by larry #18023: msi product code for 2.7.5150 not in Tools/msi/uuids.py http://bugs.python.org/issue18023 closed by loewis #18071: Extension module builds fail on OS X with TypeError if Xcode c http://bugs.python.org/issue18071 closed by ned.deily #18112: PEP 442 implementation http://bugs.python.org/issue18112 closed by pitrou #18214: Stop purging modules which are garbage collected before shutdo http://bugs.python.org/issue18214 closed by pitrou #18325: test_kqueue fails in OpenBSD http://bugs.python.org/issue18325 closed by neologix #18441: Idle: Make test.support.requires('gui') skip when it should. 
http://bugs.python.org/issue18441 closed by terry.reedy #18472: Update PEP 8 to encourage modern conventions http://bugs.python.org/issue18472 closed by ncoghlan #18481: lcov report http://bugs.python.org/issue18481 closed by christian.heimes #18517: "xxlimited" extension declared incorrectly in setup.py http://bugs.python.org/issue18517 closed by ned.deily #18539: Idle 2.7: Calltip wrong if def contains float default value http://bugs.python.org/issue18539 closed by terry.reedy #18552: obj2ast_object() doesn't check return value of PyArena_AddPyOb http://bugs.python.org/issue18552 closed by christian.heimes #18555: type_set_bases() doesn't check return value of PyArg_UnpackTu http://bugs.python.org/issue18555 closed by christian.heimes #18559: _pickle: NULL ptr dereference when PyLong_FromSsize_t() fails http://bugs.python.org/issue18559 closed by christian.heimes #18560: builtin_sum() doesn't check return value of PyLong_FromLong() http://bugs.python.org/issue18560 closed by christian.heimes #18561: ctypes _build_callargs() doesn't check name for NULL http://bugs.python.org/issue18561 closed by christian.heimes #18565: Test for closing delegating generator with cleared frame (Issu http://bugs.python.org/issue18565 closed by python-dev #18568: Support \e escape code in strings http://bugs.python.org/issue18568 closed by stevenjd #18569: Set PATHEXT in the Windows installer http://bugs.python.org/issue18569 closed by loewis #18573: In unittest.TestCase.assertWarns doc there is some text about http://bugs.python.org/issue18573 closed by terry.reedy #18580: distutils compilers are unicode strings on OS X since Python 2 http://bugs.python.org/issue18580 closed by ned.deily #18584: examples in email.policy doc are fu'd http://bugs.python.org/issue18584 closed by r.david.murray #18587: urllib raises exception with string in 'errno' attribute http://bugs.python.org/issue18587 closed by r.david.murray #18589: cross-referencing doesn't work between the extending guide and 
http://bugs.python.org/issue18589 closed by pitrou #18593: Typo in Lib/multiprocessing/heap.py http://bugs.python.org/issue18593 closed by eli.bendersky #18599: _sha1module report "SHA" as its name http://bugs.python.org/issue18599 closed by christian.heimes #18601: Example "command-line interface to difflib" has typographical http://bugs.python.org/issue18601 closed by r.david.murray #18607: struct.unpack http://bugs.python.org/issue18607 closed by Andres.Adjimann #18608: Avoid keeping a strong reference to locale in the _io module http://bugs.python.org/issue18608 closed by pitrou #18613: wrong float plus/minus op result http://bugs.python.org/issue18613 closed by brett.cannon #18619: atexit leaks callbacks in subinterpreters http://bugs.python.org/issue18619 closed by pitrou #18627: Typo in Modules/hashlib.h http://bugs.python.org/issue18627 closed by python-dev #812369: module shutdown procedure based on GC http://bugs.python.org/issue812369 closed by pitrou

From solipsis at pitrou.net Fri Aug 2 18:19:33 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 2 Aug 2013 18:19:33 +0200
Subject: [Python-Dev] unittest.TestSuite holding references to unittest.TestCase instances too long
References:
Message-ID: <20130802181933.7044834d@pitrou.net>

Le Fri, 2 Aug 2013 18:13:13 +0300,
Michael Foord a écrit :
>
> On 2 Aug 2013, at 14:51, Matt McClure
> wrote:
>
> > It seems unittest.TestSuite holds references to unittest.TestCase
> > instances after the test runs, until the test suite finishes. In a
> > large suite, where the TestCase instances consume memory during
> > execution, that can lead to exhausting all available memory and the
> > OS killing the test process.
>
> Well individual tests not releasing resources could be seen as a bug
> in those tests.
>
> That aside, there's an open bug for this with some discussion and a
> proposed fix:
>
> http://bugs.python.org/issue11798
>
> The agreed on approach just needs doing.
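[Editorial sketch, not part of the thread: the idea under discussion — having the suite drop its reference to each test as soon as it has run, so finished TestCase instances can be garbage-collected — can be illustrated with a small TestSuite subclass. This is not the actual patch from issue 11798, and `ReferenceDroppingSuite` and `Demo` are hypothetical names.]

```python
import unittest

class ReferenceDroppingSuite(unittest.TestSuite):
    """Pop each test off the suite before running it, so the TestCase
    instance (and whatever it holds) becomes collectable as soon as the
    test finishes, rather than at the end of the whole run."""
    def run(self, result):
        while self._tests:
            if result.shouldStop:
                break
            test = self._tests.pop(0)  # drop our reference up front
            test(result)
        return result

class Demo(unittest.TestCase):
    def test_a(self):
        self.assertTrue(True)
    def test_b(self):
        self.assertEqual(1 + 1, 2)

suite = ReferenceDroppingSuite(map(Demo, ['test_a', 'test_b']))
result = unittest.TestResult()
suite.run(result)
assert result.testsRun == 2 and list(suite) == []  # references released
```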
The patch is basically ready for commit, except for a possible doc addition, no? Regards Antoine. From fuzzyman at voidspace.org.uk Fri Aug 2 18:47:04 2013 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Fri, 2 Aug 2013 19:47:04 +0300 Subject: [Python-Dev] unittest.TestSuite holding references to unittest.TestCase instances too long In-Reply-To: <20130802181933.7044834d@pitrou.net> References: <20130802181933.7044834d@pitrou.net> Message-ID: <81B4BAA8-4824-4390-A6A9-9C14D5A7D64C@voidspace.org.uk> Sent from my iPhone On 2 Aug 2013, at 19:19, Antoine Pitrou wrote: > Le Fri, 2 Aug 2013 18:13:13 +0300, > Michael Foord a écrit : >> >> On 2 Aug 2013, at 14:51, Matt McClure >> wrote: >> >>> It seems unittest.TestSuite holds references to unittest.TestCase >>> instances after the test runs, until the test suite finishes. In a >>> large suite, where the TestCase instances consume memory during >>> execution, that can lead to exhausting all available memory and the >>> OS killing the test process. >> >> Well individual tests not releasing resources could be seen as a bug >> in those tests. >> >> That aside, there's an open bug for this with some discussion and a >> proposed fix: >> >> http://bugs.python.org/issue11798 >> >> The agreed on approach just needs doing. > > The patch is basically ready for commit, except for a possible doc > addition, no? Looks to be the case, reading the patch it looks fine. I'm currently on holiday until Monday. If anyone is motivated to do the docs too and commit that would be great. Otherwise I'll get to it on my return. Michael > > Regards > > Antoine. 
> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk From matthewlmcclure at gmail.com Fri Aug 2 19:27:04 2013 From: matthewlmcclure at gmail.com (Matt McClure) Date: Fri, 2 Aug 2013 13:27:04 -0400 Subject: [Python-Dev] unittest.TestSuite holding references to unittest.TestCase instances too long In-Reply-To: References: Message-ID: On Fri, Aug 2, 2013 at 11:13 AM, Michael Foord wrote: > There's an open bug for this with some discussion and a proposed fix: > > http://bugs.python.org/issue11798 > > The agreed on approach just needs doing. > Thanks for the link. I hadn't found that yet. I'll see if I can contribute there. -- Matt McClure http://matthewlmcclure.com http://www.mapmyfitness.com/profile/matthewlmcclure -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Aug 3 04:26:03 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 3 Aug 2013 12:26:03 +1000 Subject: [Python-Dev] [Python-checkins] peps: Use Guido's preferred wording re: line length In-Reply-To: <51FC499F.3070703@udel.edu> References: <3c64955v4yz7Llj@mail.python.org> <51FC499F.3070703@udel.edu> Message-ID: On 3 Aug 2013 11:07, "Terry Reedy" wrote: > > On 8/2/2013 6:19 AM, nick.coghlan wrote: > >> +The Python standard library is conservative and requires limiting >> +lines to 79 characters (and docstrings/comments to 72). > > > If you (and Guido) mean that as a hard limit, then patchcheck should check line lengths as well as trailing whitespace. That raises issues when modifying existing non-compliant files, because it removes the human judgement on whether a non-compliance is worth fixing or not. Cheers, Nick. 
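The advisory check discussed here is small enough to sketch. The helper below is hypothetical (it is not what Tools/scripts/patchcheck.py actually does): it only reports overlong lines, leaving the keep-or-fix judgement to a human, which is the behaviour both sides of this exchange seem to want:

```python
def report_long_lines(path, limit=79):
    """Return (line_number, length) pairs for lines longer than limit.

    Hypothetical advisory helper in the spirit of the patchcheck
    discussion: it informs about overlong lines, it does not reject.
    """
    offenders = []
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            length = len(line.rstrip("\n"))
            if length > limit:
                offenders.append((lineno, length))
    return offenders
```

Wiring something like this into patchcheck's summary output, or into a warning option on reindent.py, would be a separate decision.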
> _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Aug 3 04:51:46 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 3 Aug 2013 12:51:46 +1000 Subject: [Python-Dev] [Python-checkins] peps: Use Guido's preferred wording re: line length In-Reply-To: <51FC6EBC.5090209@udel.edu> References: <3c64955v4yz7Llj@mail.python.org> <51FC499F.3070703@udel.edu> <51FC6EBC.5090209@udel.edu> Message-ID: On 3 Aug 2013 12:45, "Terry Reedy" wrote: > > > > On 8/2/2013 10:26 PM, Nick Coghlan wrote: >> >> >> On 3 Aug 2013 11:07, "Terry Reedy" > > wrote: >> > >> > On 8/2/2013 6:19 AM, nick.coghlan wrote: >> > >> >> +The Python standard library is conservative and requires limiting >> >> +lines to 79 characters (and docstrings/comments to 72). >> > >> > >> > If you (and Guido) mean that as a hard limit, then patchcheck should >> check line lengths as well as trailing whitespace. >> >> That raises issues when modifying existing non-compliant files, because >> it removes the human judgement on whether a non-compliance is worth >> fixing or not. > > > I meant tools/scripts/patchcheck.py, not the pre-commit hook. The check would inform (especially for old files) or remind (for new files) so that judgment could be applied. Ah, right. Yeah, that may be reasonable. A warning option on reindent.py may be a place to start if someone wanted to implement it. Whether or not patchcheck used that option would likely depend on the initial results of running it manually :) Cheers, Nick. > > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From eliben at gmail.com Sat Aug 3 05:47:20 2013 From: eliben at gmail.com (Eli Bendersky) Date: Fri, 2 Aug 2013 20:47:20 -0700 Subject: [Python-Dev] objections to renaming enumobject.h/c in 3.4? Message-ID: Hi all, I was looking around the Objects directory and noticed that we have enumobject.h/c with the enumobject structure for "enumerate" and "reversed". This is somewhat confusing now with Lib/enum.py and will be doubly confusing if we ever decide to have a C implementation of enums. Any objections to renaming the files and the internal structure & static functions with s/enum/enumerate/ ? This would more accurately reflect the use of the code, and avoid confusion with enums. These structures/types are not part of the stable ABI defined by PEP 384. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Aug 3 07:50:56 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 3 Aug 2013 15:50:56 +1000 Subject: [Python-Dev] objections to renaming enumobject.h/c in 3.4? In-Reply-To: References: Message-ID: On 3 Aug 2013 13:50, "Eli Bendersky" wrote: > > Hi all, > > I was looking around the Objects directory and noticed that we have enumobject.h/c with the enumobject structure for "enumerate" and "reversed". This is somewhat confusing now with Lib/enum.py and will be doubly confusing if we ever decide to have a C implementation of enums. > > Any objections to renaming the files and the internal structure & static functions with s/enum/enumerate/ ? This would more accurately reflect the use of the code, and avoid confusion with enums. These structures/types are not part of the stable ABI defined by PEP 384. Sounds good to me. Cheers, Nick. 
> > Eli > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From raymond.hettinger at gmail.com Sat Aug 3 08:20:29 2013 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Fri, 2 Aug 2013 23:20:29 -0700 Subject: [Python-Dev] objections to renaming enumobject.h/c in 3.4? In-Reply-To: References: Message-ID: <9B9D8018-7157-4370-AF25-83D9D013246E@gmail.com> On Aug 2, 2013, at 8:47 PM, Eli Bendersky wrote: > I was looking around the Objects directory and noticed that we have enumobject.h/c with the enumobject structure for "enumerate" and "reversed". This is somewhat confusing now with Lib/enum.py and will be doubly confusing if we ever decide to have a C implementation of enums. > > Any objections to renaming the files and the internal structure & static functions with s/enum/enumerate/ ? This would more accurately reflect the use of the code, and avoid confusion with enums. These structures/types are not part of the stable ABI defined by PEP 384. I wouldn't mind renaming enumobject.c/h to enumerateobject.c/h, but I think it is going overboard to rename all the internal structures and static functions. The latter is entirely unnecessary. The C language itself has enums and there has never been any confusion with the enumerate iterator. Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Aug 3 08:35:24 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 3 Aug 2013 16:35:24 +1000 Subject: [Python-Dev] objections to renaming enumobject.h/c in 3.4? 
In-Reply-To: <9B9D8018-7157-4370-AF25-83D9D013246E@gmail.com> References: <9B9D8018-7157-4370-AF25-83D9D013246E@gmail.com> Message-ID: On 3 August 2013 16:20, Raymond Hettinger wrote: > > On Aug 2, 2013, at 8:47 PM, Eli Bendersky wrote: > > I was looking around the Objects directory and noticed that we have > enumobject.h/c with the enumobject structure for "enumerate" and "reversed". > This is somewhat confusing now with Lib/enum.py and will be doubly confusing > if we ever decide to have a C implementation of enums. > > Any objections to renaming the files and the internal structure & static > functions with s/enum/enumerate/ ? This would more accurately reflect the > use of the code, and avoid confusion with enums. These structures/types are > not part of the stable ABI defined by PEP 384. > > > I wouldn't mind renaming enumobject.c/h to enumerateobject.c/h, but I think > it is going overboard to rename all the internal structures and static > functions. The latter is entirely unnecessary. The C language itself has > enums and there has never been any confusion with the enumerate iterator. Oops, I missed the part about renaming things in the code. I'm with Raymond on that part (i.e. not worth it) - I was just agreeing to renaming the files. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From eliben at gmail.com Sat Aug 3 14:39:56 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 3 Aug 2013 05:39:56 -0700 Subject: [Python-Dev] objections to renaming enumobject.h/c in 3.4? In-Reply-To: <9B9D8018-7157-4370-AF25-83D9D013246E@gmail.com> References: <9B9D8018-7157-4370-AF25-83D9D013246E@gmail.com> Message-ID: On Fri, Aug 2, 2013 at 11:20 PM, Raymond Hettinger wrote: > > On Aug 2, 2013, at 8:47 PM, Eli Bendersky wrote: > > I was looking around the Objects directory and noticed that we have > enumobject.h/c with the enumobject structure for "enumerate" and "reversed". 
> This is somewhat confusing now with Lib/enum.py and will be doubly confusing > if we ever decide to have a C implementation of enums. > > Any objections to renaming the files and the internal structure & static > functions with s/enum/enumerate/ ? This would more accurately reflect the > use of the code, and avoid confusion with enums. These structures/types are > not part of the stable ABI defined by PEP 384. > > > I wouldn't mind renaming enumobject.c/h to enumerateobject.c/h, but I think > it is going overboard to rename all the internal structures and static > functions. The latter is entirely unnecessary. The C language itself has > enums and there has never been any confusion with the enumerate iterator. > My reasoning is this: Objects/enumobject.c currently has functions like enum_new, enum_dealloc, etc. If we ever implement enums in C, we're going to either have to find creative names for them, or have two sets of same-named static functions in two different files. While valid formally, it's confusing for code navigation and similar reasons. However, this can really be delayed until we actually do decide to implement enums in C. For now, just renaming the files should solve most of the problem. Eli From ncoghlan at gmail.com Sat Aug 3 15:17:59 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 3 Aug 2013 23:17:59 +1000 Subject: [Python-Dev] objections to renaming enumobject.h/c in 3.4? In-Reply-To: References: <9B9D8018-7157-4370-AF25-83D9D013246E@gmail.com> Message-ID: On 3 August 2013 22:39, Eli Bendersky wrote: > On Fri, Aug 2, 2013 at 11:20 PM, Raymond Hettinger > wrote: >> >> On Aug 2, 2013, at 8:47 PM, Eli Bendersky wrote: >> >> I was looking around the Objects directory and noticed that we have >> enumobject.h/c with the enumobject structure for "enumerate" and "reversed". >> This is somewhat confusing now with Lib/enum.py and will be doubly confusing >> if we ever decide to have a C implementation of enums. 
>> >> Any objections to renaming the files and the internal structure & static >> functions with s/enum/enumerate/ ? This would more accurately reflect the >> use of the code, and avoid confusion with enums. These structures/types are >> not part of the stable ABI defined by PEP 384. >> >> >> I wouldn't mind renaming enumobject.c/h to enumerateobject.c/h, but I think >> it is going overboard to rename all the internal structures and static >> functions. The latter is entirely unnecessary. The C language itself has >> enums and there has never been any confusion with the enumerate iterator. >> > > My reasoning is this: Objects/enumobject.c currently has functions > like enum_new, enum_dealloc, etc. If we ever implement enums in C, > we're going to either have to find creative names for them, or have > two sets of same-named static functions in two different files. While > valid formally, it's confusing for code navigation and similar > reasons. > > However, this can really be delayed until we actually do decide to > implement enums in C. For now, just renaming the files should solve > most of the problem. Yep, this is an area where laziness is definitely a virtue - if work is only needed to handle a hypothetical future change, then it can be deferred and handled as part of that change. :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From matthewlmcclure at gmail.com Sat Aug 3 16:27:30 2013 From: matthewlmcclure at gmail.com (Matt McClure) Date: Sat, 3 Aug 2013 10:27:30 -0400 Subject: [Python-Dev] unittest.TestSuite holding references to unittest.TestCase instances too long In-Reply-To: References: Message-ID: Michael Foord voidspace.org.uk> writes: > On 2 Aug 2013, at 19:19, Antoine Pitrou pitrou.net> wrote: > > The patch is basically ready for commit, except for a possible doc > > addition, no? > > Looks to be the case, reading the patch it looks fine. I'm currently on holiday until Monday. 
> If anyone is motivated to do the docs too and commit that would be great. Otherwise I'll > get to it on my return. It looks like the patch is based on what will become 3.4. Would backporting it to 2.7 be feasible? What's involved in doing so? I took a crack at the docs. # HG changeset patch # User Matt McClure # Date 1375538965 14400 # Node ID d748d70201929288c230862da4dbdba33d61ae9f # Parent bf43956356ffe93e75ffdd5a7a8164fc68cf14ae [11798] Document TestSuite.{__iter__, run} changes diff --git a/Doc/library/unittest.rst b/Doc/library/unittest.rst --- a/Doc/library/unittest.rst +++ b/Doc/library/unittest.rst @@ -1470,15 +1470,24 @@ Tests grouped by a :class:`TestSuite` are always accessed by iteration. Subclasses can lazily provide tests by overriding :meth:`__iter__`. Note - that this method maybe called several times on a single suite - (for example when counting tests or comparing for equality) - so the tests returned must be the same for repeated iterations. + that this method may be called several times on a single suite (for + example when counting tests or comparing for equality) so the tests + returned by repeated iterations before :meth:`TestSuite.run` must be the + same for each call iteration. After :meth:`TestSuite.run`, callers should + not rely on the tests returned by this method unless the caller uses a + subclass that overrides :meth:`TestSuite._removeTestAtIndex` to preserve + test references. .. versionchanged:: 3.2 In earlier versions the :class:`TestSuite` accessed tests directly rather than through iteration, so overriding :meth:`__iter__` wasn't sufficient for providing tests. + .. versionchanged:: 3.4 + In earlier versions the :class:`TestSuite` held references to each + :class:`TestCase` after :meth:`TestSuite.run`. Subclasses can restore + that behavior by overriding :meth:`TestSuite._removeTestAtIndex`. 
+ In the typical usage of a :class:`TestSuite` object, the :meth:`run` method is invoked by a :class:`TestRunner` rather than by the end-user test harness. diff --git a/Lib/unittest/suite.py b/Lib/unittest/suite.py --- a/Lib/unittest/suite.py +++ b/Lib/unittest/suite.py @@ -65,6 +65,7 @@ return result def _removeTestAtIndex(self, index): + """Stop holding a reference to the TestCase at index.""" try: self._tests[index] = None except TypeError: -- Matt McClure http://matthewlmcclure.com http://www.mapmyfitness.com/profile/matthewlmcclure -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdmurray at bitdance.com Sat Aug 3 18:07:24 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Sat, 03 Aug 2013 12:07:24 -0400 Subject: [Python-Dev] unittest.TestSuite holding references to unittest.TestCase instances too long In-Reply-To: References: Message-ID: <20130803160725.1D1602500CA@webabinitio.net> On Sat, 03 Aug 2013 10:27:30 -0400, Matt McClure wrote: > Michael Foord voidspace.org.uk> writes: > > On 2 Aug 2013, at 19:19, Antoine Pitrou pitrou.net> wrote: > > > The patch is basically ready for commit, except for a possible doc > > > addition, no? > > > > Looks to be the case, reading the patch it looks fine. I'm currently on > holiday until Monday. > > If anyone is motivated to do the docs too and commit that would be great. > Otherwise I'll > > get to it on my return. > > It looks like the patch is based on what will become 3.4. Would backporting > it to 2.7 be feasible? What's involved in doing so? That depends on how likely Michael thinks it is that it might break things. > I took a crack at the docs. Thanks. Please post your patch to the issue, it will get lost here. 
--David From matthewlmcclure at gmail.com Sat Aug 3 20:01:15 2013 From: matthewlmcclure at gmail.com (Matt McClure) Date: Sat, 3 Aug 2013 14:01:15 -0400 Subject: [Python-Dev] unittest.TestSuite holding references to unittest.TestCase instances too long In-Reply-To: <20130803160725.1D1602500CA@webabinitio.net> References: <20130803160725.1D1602500CA@webabinitio.net> Message-ID: <98D08752-5AD6-42E1-834B-3CB2CE491666@gmail.com> On Aug 3, 2013, at 12:07 PM, "R. David Murray" wrote: > Thanks. Please post your patch to the issue, it will get lost here. I'm trying to register, but I'm not receiving a confirmation email to complete the registration. -- http://matthewlmcclure.com http://about.mapmyfitness.com From fuzzyman at voidspace.org.uk Sat Aug 3 21:27:49 2013 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Sat, 3 Aug 2013 22:27:49 +0300 Subject: [Python-Dev] unittest.TestSuite holding references to unittest.TestCase instances too long In-Reply-To: <20130803160725.1D1602500CA@webabinitio.net> References: <20130803160725.1D1602500CA@webabinitio.net> Message-ID: Sent from my iPhone On 3 Aug 2013, at 19:07, "R. David Murray" wrote: > On Sat, 03 Aug 2013 10:27:30 -0400, Matt McClure wrote: >> Michael Foord voidspace.org.uk> writes: >>> On 2 Aug 2013, at 19:19, Antoine Pitrou pitrou.net> wrote: >>>> The patch is basically ready for commit, except for a possible doc >>>> addition, no? >>> >>> Looks to be the case, reading the patch it looks fine. I'm currently on >> holiday until Monday. >>> If anyone is motivated to do the docs too and commit that would be great. >> Otherwise I'll >>> get to it on my return. >> >> It looks like the patch is based on what will become 3.4. Would backporting >> it to 2.7 be feasible? What's involved in doing so? > > That depends on how likely Michale thinks it is that it might break > things. > It smells to me like a new feature rather than a bugfix, and it's a moderately big change. 
I don't think it can be backported to 2.7 other than through unittest2. Michael >> I took a crack at the docs. > > Thanks. Please post your patch to the issue, it will get lost here. > > --David From alexander.belopolsky at gmail.com Sat Aug 3 22:04:17 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 3 Aug 2013 16:04:17 -0400 Subject: [Python-Dev] objections to renaming enumobject.h/c in 3.4? In-Reply-To: References: <9B9D8018-7157-4370-AF25-83D9D013246E@gmail.com> Message-ID: On Sat, Aug 3, 2013 at 9:17 AM, Nick Coghlan wrote: > Yep, this is an area where laziness is definitely a virtue - if work > is only needed to handle a hypothetical future change, then it can be > deferred and handled as part of that change. :) > I would say that even renaming the files can wait until we actually have a conflict. Note that reimplementing enum.py in C will not cause a conflict because that code will likely go to Modules/enum.c or Modules/_enum.c and not in Objects/enumobject.c. If any renaming in Objects/ directory is in order, I would start with longobject.c and unicodeobject.c rather than enumobject.c. It is fairly obvious to look for enumerate code next to range code and tab-completion gets me to the right file fairly quickly. On the other hand, I've been trying to find intobject.c in 3.x code on more than one occasion. -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Sun Aug 4 01:36:50 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 3 Aug 2013 16:36:50 -0700 Subject: [Python-Dev] Allowing to run certain regression tests in subprocesses Message-ID: Hi All, Today the issue of cross-test global env dependencies showed its ugly head again for me. I recall a previous discussion (http://mail.python.org/pipermail/python-dev/2013-January/123409.html) but there were many more over the years. 
The core problem is that some tests modify the global env (particularly importing modules) and this sometimes has adverse effects on other tests, because test.regrtest runs all tests in a single process. In the discussion linked above, the particular culprit test__all__ was judged as a candidate to be moved to a subprocess. I want to propose adding a capability to our test harness to run specific tests in subprocesses. Each test will have some simple way of asking to be run in a subprocess, and regrtest will concur (even when running -j1). test__all__ can go there, and it can help solve other problems. My particular case is trying to write a test for http://bugs.python.org/issue14988 - wherein I have to simulate a situation of non-existent pyexpat. It's not hard to write a test for it, but when run in tandem with other tests (where C extensions loaded pyexpat) it becomes seemingly impossible to set up. This should not be the case - there's nothing wrong with wanting to simulate this case, and there's nothing wrong in Python and the stdlib - it's purely an artifact of the way our regression suite works. Thoughts? Eli From eliben at gmail.com Sun Aug 4 01:47:37 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 3 Aug 2013 16:47:37 -0700 Subject: [Python-Dev] Allowing to run certain regression tests in subprocesses In-Reply-To: References: Message-ID: On Sat, Aug 3, 2013 at 4:36 PM, Eli Bendersky wrote: > Hi All, > > Today the issue of cross-test global env dependencies showed its ugly > head again for me. I recall a previous discussion > (http://mail.python.org/pipermail/python-dev/2013-January/123409.html) > but there were many more over the years. > > The core problem is that some tests modify the global env > (particularly importing modules) and this sometimes has adverse > effects on other tests, because test.regrtest runs all tests in a > single process. 
In the discussion linked above, the particular culprit > test__all__ was judged as a candidate to be moved to a subprocess. > > I want to propose adding a capability to our test harness to run > specific tests in subprocesses. Each test will have some simple way of > asking to be run in a subprocess, and regrtest will concur (even when > running -j1). test__all__ can go there, and it can help solve other > problems. > > My particular case is trying to write a test for > http://bugs.python.org/issue14988 - wherein I have to simulate a > situation of non-existent pyexpat. It's not hard to write a test for > it, but when run in tandem with other tests (where C extensions loaded > pyexpat) it becomes seemingly impossible to set up. This should not be > the case - there's nothing wrong with wanting to simulate this case, > and there's nothing wrong in Python and the stdlib - it's purely an > artifact of the way our regression suite works. > > Thoughts? > > Eli FWIW the problem is also discussed here: http://bugs.python.org/issue1674555, w.r.t. test_site Eli From alexander.belopolsky at gmail.com Sun Aug 4 03:30:51 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 3 Aug 2013 21:30:51 -0400 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: Message-ID: On Thu, Aug 1, 2013 at 8:44 AM, Nick Coghlan wrote: > > 9. Explicit guideline not to assign lambdas to names (use def, that's > what it's for) Would you consider changing the formatting in the recommended example from

    def f(x): return 2*x

to

    def f(x):
        return 2*x

? What is the modern view on single-line def? The "Other Recommendations" section allows but discourages single-line if/for/while, but is silent about def. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdmurray at bitdance.com Sun Aug 4 03:57:52 2013 From: rdmurray at bitdance.com (R. 
David Murray) Date: Sat, 03 Aug 2013 21:57:52 -0400 Subject: [Python-Dev] Allowing to run certain regression tests in subprocesses In-Reply-To: References: Message-ID: <20130804015753.1E77525014C@webabinitio.net> On Sat, 03 Aug 2013 16:47:37 -0700, Eli Bendersky wrote: > On Sat, Aug 3, 2013 at 4:36 PM, Eli Bendersky wrote: > > Hi All, > > > > Today the issue of cross-test global env dependencies showed its ugly > > head again for me. I recall a previous discussion > > (http://mail.python.org/pipermail/python-dev/2013-January/123409.html) > > but there were many more over the years. > > > > The core problem is that some tests modify the global env > > (particularly importing modules) and this sometimes has adverse > > effects on other tests, because test.regrtest runs all tests in a > > single process. In the discussion linked above, the particular culprit > > test__all__ was judged as a candidate to be moved to a subprocess. > > > > I want to propose adding a capability to our test harness to run > > specific tests in subprocesses. Each test will have some simple way of > > asking to be run in a subprocess, and regrtest will concur (even when > > running -j1). test__all__ can go there, and it can help solve other > > problems. > > > > My particular case is trying to write a test for > > http://bugs.python.org/issue14988 - wherein I have to simulate a > > situation of non-existent pyexpat. It's not hard to write a test for > > it, but when run in tandem with other tests (where C extensions loaded > > pyexpat) it becomes seemingly impossible to set up. This should not be > > the case - there's nothing wrong with wanting to simulate this case, > > and there's nothing wrong in Python and the stdlib - it's purely an > > artifact of the way our regression suite works. > > > > Thoughts? > > > > Eli > > FWIW the problem is also discussed here: > http://bugs.python.org/issue1674555, w.r.t. 
test_site Can't you just launch a subprocess from the test itself using script_helpers? --David From ncoghlan at gmail.com Sun Aug 4 03:59:06 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 4 Aug 2013 11:59:06 +1000 Subject: [Python-Dev] Allowing to run certain regression tests in subprocesses In-Reply-To: References: Message-ID: On 4 Aug 2013 09:43, "Eli Bendersky" wrote: > > Hi All, > > Today the issue of cross-test global env dependencies showed its ugly > head again for me. I recall a previous discussion > (http://mail.python.org/pipermail/python-dev/2013-January/123409.html) > but there were many more over the years. > > The core problem is that some tests modify the global env > (particularly importing modules) and this sometimes has adverse > effects on other tests, because test.regrtest runs all tests in a > single process. In the discussion linked above, the particular culprit > test__all__ was judged as a candidate to be moved to a subprocess. > > I want to propose adding a capability to our test harness to run > specific tests in subprocesses. Each test will have some simple way of > asking to be run in a subprocess, and regrtest will concur (even when > running -j1). test__all__ can go there, and it can help solve other > problems. > > My particular case is trying to write a test for > http://bugs.python.org/issue14988 - wherein I have to simulate a > situation of non-existent pyexpat. It's not hard to write a test for > it, but when run in tandem with other tests (where C extensions loaded > pyexpat) it becomes seemingly impossible to set up. This should not be > the case - there's nothing wrong with wanting to simulate this case, > and there's nothing wrong in Python and the stdlib - it's purely an > artifact of the way our regression suite works. I'm not actively opposed to the suggested idea, but is there a specific reason "test.support.import_fresh_module" doesn't work for this test? Cheers, Nick. > > Thoughts? 
> > Eli > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Sun Aug 4 04:03:15 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 3 Aug 2013 19:03:15 -0700 Subject: [Python-Dev] Allowing to run certain regression tests in subprocesses In-Reply-To: References: Message-ID: On Sat, Aug 3, 2013 at 6:59 PM, Nick Coghlan wrote: > > On 4 Aug 2013 09:43, "Eli Bendersky" wrote: >> >> Hi All, >> >> Today the issue of cross-test global env dependencies showed its ugly >> head again for me. I recall a previous discussion >> (http://mail.python.org/pipermail/python-dev/2013-January/123409.html) >> but there were many more over the years. >> >> The core problem is that some tests modify the global env >> (particularly importing modules) and this sometimes has adverse >> effects on other tests, because test.regrtest runs all tests in a >> single process. In the discussion linked above, the particular culprit >> test__all__ was judged as a candidate to be moved to a subprocess. >> >> I want to propose adding a capability to our test harness to run >> specific tests in subprocesses. Each test will have some simple way of >> asking to be run in a subprocess, and regrtest will concur (even when >> running -j1). test__all__ can go there, and it can help solve other >> problems. >> >> My particular case is trying to write a test for >> http://bugs.python.org/issue14988 - wherein I have to simulate a >> situation of non-existent pyexpat. It's not hard to write a test for >> it, but when run in tandem with other tests (where C extensions loaded >> pyexpat) it becomes seemingly impossible to set up. 
This should not be >> the case - there's nothing wrong with wanting to simulate this case, >> and there's nothing wrong in Python and the stdlib - it's purely an >> artifact of the way our regression suite works. > > I'm not actively opposed to the suggested idea, but is there a specific > reason "test.support.import_fresh_module" doesn't work for this test? I'm not an expert on this topic, but I believe there's a problem unloading code that was loaded by C extensions. import_fresh_module is thus powerless here (which also appears to be the case empirically). Eli From eliben at gmail.com Sun Aug 4 04:04:59 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 3 Aug 2013 19:04:59 -0700 Subject: [Python-Dev] Fwd: Allowing to run certain regression tests in subprocesses In-Reply-To: References: <20130804015753.1E77525014C@webabinitio.net> Message-ID: On Sat, Aug 3, 2013 at 6:57 PM, R. David Murray wrote: > On Sat, 03 Aug 2013 16:47:37 -0700, Eli Bendersky wrote: >> On Sat, Aug 3, 2013 at 4:36 PM, Eli Bendersky wrote: >> > Hi All, >> > >> > Today the issue of cross-test global env dependencies showed its ugly >> > head again for me. I recall a previous discussion >> > (http://mail.python.org/pipermail/python-dev/2013-January/123409.html) >> > but there were many more over the years. >> > >> > The core problem is that some tests modify the global env >> > (particularly importing modules) and this sometimes has adverse >> > effects on other tests, because test.regrtest runs all tests in a >> > single process. In the discussion linked above, the particular culprit >> > test__all__ was judged as a candidate to be moved to a subprocess. >> > >> > I want to propose adding a capability to our test harness to run >> > specific tests in subprocesses. Each test will have some simple way of >> > asking to be run in a subprocess, and regrtest will concur (even when >> > running -j1). test__all__ can go there, and it can help solve other >> > problems. 
>> > >> > My particular case is trying to write a test for >> > http://bugs.python.org/issue14988 - wherein I have to simulate a >> > situation of non-existent pyexpat. It's not hard to write a test for >> > it, but when run in tandem with other tests (where C extensions loaded >> > pyexpat) it becomes seemingly impossible to set up. This should not be >> > the case - there's nothing wrong with wanting to simulate this case, >> > and there's nothing wrong in Python and the stdlib - it's purely an >> > artifact of the way our regression suite works. >> > >> > Thoughts? >> > >> > Eli >> >> FWIW the problem is also discussed here: >> http://bugs.python.org/issue1674555, w.r.t. test_site > > Can't you just launch a subprocess from the test itself using script_helpers? > [sorry, sent privately by mistake; forwarding to pydev] I can, but such launching will be necessarily duplicated across all tests that need this functionality (test_site, test___all__, etc). Since regrtest already has functionality for launching whole test-suites in subprocesses, it makes sense to reuse it, no? Eli From ncoghlan at gmail.com Sun Aug 4 04:08:13 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 4 Aug 2013 12:08:13 +1000 Subject: [Python-Dev] Allowing to run certain regression tests in subprocesses In-Reply-To: References: Message-ID: On 4 Aug 2013 12:03, "Eli Bendersky" wrote: > > On Sat, Aug 3, 2013 at 6:59 PM, Nick Coghlan wrote: > > > > On 4 Aug 2013 09:43, "Eli Bendersky" wrote: > >> > >> Hi All, > >> > >> Today the issue of cross-test global env dependencies showed its ugly > >> head again for me. I recall a previous discussion > >> (http://mail.python.org/pipermail/python-dev/2013-January/123409.html) > >> but there were many more over the years. > >> > >> The core problem is that some tests modify the global env > >> (particularly importing modules) and this sometimes has adverse > >> effects on other tests, because test.regrtest runs all tests in a > >> single process. 
In the discussion linked above, the particular culprit > >> test__all__ was judged as a candidate to be moved to a subprocess. > >> > >> I want to propose adding a capability to our test harness to run > >> specific tests in subprocesses. Each test will have some simple way of > >> asking to be run in a subprocess, and regrtest will concur (even when > >> running -j1). test__all__ can go there, and it can help solve other > >> problems. > >> > >> My particular case is trying to write a test for > >> http://bugs.python.org/issue14988 - wherein I have to simulate a > >> situation of non-existent pyexpat. It's not hard to write a test for > >> it, but when run in tandem with other tests (where C extensions loaded > >> pyexpat) it becomes seemingly impossible to set up. This should not be > >> the case - there's nothing wrong with wanting to simulate this case, > >> and there's nothing wrong in Python and the stdlib - it's purely an > >> artifact of the way our regression suite works. > > > > I'm not actively opposed to the suggested idea, but is there a specific > > reason "test.support.import_fresh_module" doesn't work for this test? > > I'm not an expert on this topic, but I believe there's a problem > unloading code that was loaded by C extensions. import_fresh_module is > thus powerless here (which also appears to be the case empirically). Sure, it's just unusual to have a case where "importing is blocked by adding None to sys.modules" differs from "not actually available", so I'd like to understand the situation better. Cheers, Nick. > > Eli From ncoghlan at gmail.com Sun Aug 4 04:03:05 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 4 Aug 2013 12:03:05 +1000 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: Message-ID: On 4 Aug 2013 11:30, "Alexander Belopolsky" wrote: > > > On Thu, Aug 1, 2013 at 8:44 AM, Nick Coghlan wrote: > > > > 9.
Explicit guideline not to assign lambdas to names (use def, that's > > what it's for) > > > Would you consider changing the formatting in the recommended example from > > def f(x): return 2*x > > to > > def f(x): > return 2*x > > ? I consider a single line def acceptable when replacing an equivalent lambda. Restricting it to a single line makes it solely about the spelling of the assignment operation, without any vertical whitespace considerations. Cheers, Nick. > > What is the modern view on single-line def? The "Other Recommendations" section allows but discourages single-line if/for/while, but is silent about def. From steve at pearwood.info Sun Aug 4 04:44:51 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 04 Aug 2013 12:44:51 +1000 Subject: [Python-Dev] PEP 8 modernisation In-Reply-To: References: Message-ID: <51FDC023.90202@pearwood.info> On 02/08/13 06:52, Alexander Belopolsky wrote: > On Thu, Aug 1, 2013 at 4:29 PM, Terry Reedy wrote: > >> def f(x): return 2*x >> f = lambda x: 2*x >> > > Am I the only one who finds the second line above much more readable than > the first? The def statement is not intended to be written in one line. > The readability suffers because the argument is separated from the value > expression by return keyword. You are not the only one. I will continue to write "f = lambda ..." at the interactive interpreter without shame, although I rarely (never?) use it in code. -- Steven From larry at hastings.org Sun Aug 4 08:22:14 2013 From: larry at hastings.org (Larry Hastings) Date: Sat, 03 Aug 2013 23:22:14 -0700 Subject: [Python-Dev] [RELEASED] Python 3.4.0a1 Message-ID: <51FDF316.4020107@hastings.org> On behalf of the Python development team, I'm pleased to announce the first alpha release of Python 3.4. This is a preview release, and its use is not recommended for production settings.
Python 3.4 includes a range of improvements of the 3.x series, including hundreds of small improvements and bug fixes. Major new features and changes in the 3.4 release series so far include: * PEP 435, a standardized "enum" module * PEP 442, improved semantics for object finalization * PEP 443, adding single-dispatch generic functions to the standard library * PEP 445, a new C API for implementing custom memory allocators To download Python 3.4.0a1 visit: http://www.python.org/download/releases/3.4.0/ Please consider trying Python 3.4.0a1 with your code and reporting any issues you notice to: http://bugs.python.org/ Enjoy! -- Larry Hastings, Release Manager larry at hastings.org (on behalf of the entire python-dev team and 3.4's contributors) From larry at hastings.org Sun Aug 4 08:48:26 2013 From: larry at hastings.org (Larry Hastings) Date: Sat, 03 Aug 2013 23:48:26 -0700 Subject: [Python-Dev] [RELEASED] Python 3.4.0a1 In-Reply-To: <51FDF316.4020107@hastings.org> References: <51FDF316.4020107@hastings.org> Message-ID: <51FDF93A.9090009@hastings.org> On 08/03/2013 11:22 PM, Larry Hastings wrote: > * PEP 435, a standardized "enum" module > * PEP 442, improved semantics for object finalization > * PEP 443, adding single-dispatch generic functions to the standard > library > * PEP 445, a new C API for implementing custom memory allocators Whoops, looks like I missed a couple here. I was in a hurry and just went off what I could find in Misc/NEWS. I'll have a more complete list in the release schedule PEP in a minute, and in the announcements for alpha 2. If you want to make sure your PEP is mentioned next time, by all means email me and rattle my cage. My apologies to those I overlooked, //arry/
From stefan_ml at behnel.de Sun Aug 4 09:23:41 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 04 Aug 2013 09:23:41 +0200 Subject: [Python-Dev] PEP 442 clarification for type hierarchies Message-ID: Hi, I'm currently catching up on PEP 442, which managed to fly completely below my radar so far. It's a really helpful change that could end up fixing a major usability problem that Cython was suffering from: user provided deallocation code now has a safe execution environment (well, at least in Py3.4+). That makes Cython a prime candidate for testing this, and I've just started to migrate the implementation. One thing that I found to be missing from the PEP is inheritance handling. The current implementation doesn't seem to care about base types at all, so it appears to be the responsibility of the type to call its super type finalisation function. Is that really intended? Couldn't the super type call chain be made a part of the protocol? Another bit is the exception handling. According to the documentation, tp_finalize() is supposed to first save the current exception state, then do the cleanup, then call WriteUnraisable() if necessary, then restore the exception state. http://docs.python.org/3.4/c-api/typeobj.html#PyTypeObject.tp_finalize Is there a reason why this is left to the user implementation, rather than doing it generically right in PyObject_CallFinalizer() ? That would also make it more efficient to call through the super type hierarchy, I guess. I don't see a need to repeat this exception state swapping at each level. So, essentially, I'm wondering whether PyObject_CallFinalizer() couldn't just set up the execution environment and then call all finalisers of the type hierarchy in bottom-up order. Stefan From rdmurray at bitdance.com Sun Aug 4 14:44:51 2013 From: rdmurray at bitdance.com (R.
David Murray) Date: Sun, 04 Aug 2013 08:44:51 -0400 Subject: [Python-Dev] Allowing to run certain regression tests in subprocesses In-Reply-To: References: <20130804015753.1E77525014C@webabinitio.net> Message-ID: <20130804124452.6088B25017D@webabinitio.net> On Sat, 03 Aug 2013 19:04:21 -0700, Eli Bendersky wrote: > On Sat, Aug 3, 2013 at 6:57 PM, R. David Murray wrote: > > On Sat, 03 Aug 2013 16:47:37 -0700, Eli Bendersky wrote: > >> On Sat, Aug 3, 2013 at 4:36 PM, Eli Bendersky wrote: > >> > Hi All, > >> > > >> > Today the issue of cross-test global env dependencies showed its ugly > >> > head again for me. I recall a previous discussion > >> > (http://mail.python.org/pipermail/python-dev/2013-January/123409.html) > >> > but there were many more over the years. > >> > > >> > The core problem is that some tests modify the global env > >> > (particularly importing modules) and this sometimes has adverse > >> > effects on other tests, because test.regrtest runs all tests in a > >> > single process. In the discussion linked above, the particular culprit > >> > test__all__ was judged as a candidate to be moved to a subprocess. > >> > > >> > I want to propose adding a capability to our test harness to run > >> > specific tests in subprocesses. Each test will have some simple way of > >> > asking to be run in a subprocess, and regrtest will concur (even when > >> > running -j1). test__all__ can go there, and it can help solve other > >> > problems. > >> > > >> > My particular case is trying to write a test for > >> > http://bugs.python.org/issue14988 - wherein I have to simulate a > >> > situation of non-existent pyexpat. It's not hard to write a test for > >> > it, but when run in tandem with other tests (where C extensions loaded > >> > pyexpat) it becomes seemingly impossible to set up. 
This should not be > >> > the case - there's nothing wrong with wanting to simulate this case, > >> > and there's nothing wrong in Python and the stdlib - it's purely an > >> > artifact of the way our regression suite works. > >> > > >> > Thoughts? > >> > > >> > Eli > >> > >> FWIW the problem is also discussed here: > >> http://bugs.python.org/issue1674555, w.r.t. test_site > > > > Can't you just launch a subprocess from the test itself using script_helpers? > > > > I can, but such launching will be necessarily duplicated across all > tests that need this functionality (test_site, test___all__, etc). > Since regrtest already has functionality for launching whole > test-suites in subprocesses, it makes sense to reuse it, no? In the case of test_site and test___all___ we are talking about running the entire test file in a subprocess. It sounds like you are only talking about running one individual test function in a subprocess, for which using script_helpers seems the more natural solution. --David From eliben at gmail.com Sun Aug 4 14:54:04 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 4 Aug 2013 05:54:04 -0700 Subject: [Python-Dev] Allowing to run certain regression tests in subprocesses In-Reply-To: <20130804124452.6088B25017D@webabinitio.net> References: <20130804015753.1E77525014C@webabinitio.net> <20130804124452.6088B25017D@webabinitio.net> Message-ID: On Sun, Aug 4, 2013 at 5:44 AM, R. David Murray wrote: > On Sat, 03 Aug 2013 19:04:21 -0700, Eli Bendersky wrote: >> On Sat, Aug 3, 2013 at 6:57 PM, R. David Murray wrote: >> > On Sat, 03 Aug 2013 16:47:37 -0700, Eli Bendersky wrote: >> >> On Sat, Aug 3, 2013 at 4:36 PM, Eli Bendersky wrote: >> >> > Hi All, >> >> > >> >> > Today the issue of cross-test global env dependencies showed its ugly >> >> > head again for me. I recall a previous discussion >> >> > (http://mail.python.org/pipermail/python-dev/2013-January/123409.html) >> >> > but there were many more over the years. 
>> >> > >> >> > The core problem is that some tests modify the global env >> >> > (particularly importing modules) and this sometimes has adverse >> >> > effects on other tests, because test.regrtest runs all tests in a >> >> > single process. In the discussion linked above, the particular culprit >> >> > test__all__ was judged as a candidate to be moved to a subprocess. >> >> > >> >> > I want to propose adding a capability to our test harness to run >> >> > specific tests in subprocesses. Each test will have some simple way of >> >> > asking to be run in a subprocess, and regrtest will concur (even when >> >> > running -j1). test__all__ can go there, and it can help solve other >> >> > problems. >> >> > >> >> > My particular case is trying to write a test for >> >> > http://bugs.python.org/issue14988 - wherein I have to simulate a >> >> > situation of non-existent pyexpat. It's not hard to write a test for >> >> > it, but when run in tandem with other tests (where C extensions loaded >> >> > pyexpat) it becomes seemingly impossible to set up. This should not be >> >> > the case - there's nothing wrong with wanting to simulate this case, >> >> > and there's nothing wrong in Python and the stdlib - it's purely an >> >> > artifact of the way our regression suite works. >> >> > >> >> > Thoughts? >> >> > >> >> > Eli >> >> >> >> FWIW the problem is also discussed here: >> >> http://bugs.python.org/issue1674555, w.r.t. test_site >> > >> > Can't you just launch a subprocess from the test itself using script_helpers? >> > >> >> I can, but such launching will be necessarily duplicated across all >> tests that need this functionality (test_site, test___all__, etc). >> Since regrtest already has functionality for launching whole >> test-suites in subprocesses, it makes sense to reuse it, no? > > In the case of test_site and test___all___ we are talking about running > the entire test file in a subprocess. 
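The script_helper-style approach David suggests can be sketched roughly like this (this is an illustrative sketch, not the actual test code under discussion): the import-sensitive check runs in a fresh child interpreter, so C extensions already loaded in the parent process cannot interfere.

```python
import subprocess
import sys
import textwrap

# Run the import check in a brand-new interpreter.  The child has no
# previously loaded C extensions, so blocking pyexpat there behaves
# as if the module were genuinely unavailable.
code = textwrap.dedent("""
    import sys
    sys.modules['pyexpat'] = None   # simulate a missing pyexpat
    try:
        import pyexpat
        print('available')
    except ImportError:
        print('blocked')
""")
result = subprocess.run([sys.executable, '-c', code],
                        capture_output=True, text=True)
print(result.stdout.strip())  # 'blocked'
```

This is essentially what launching the whole test file in a subprocess buys: a clean module state per run.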
It sounds like you are only > talking about running one individual test function in a subprocess, > for which using script_helpers seems the more natural solution. I was actually planning to split this into a separate test file to make the process separation more apparent. And regardless, the question sent to the list is about the generic approach, not my particular problem. Issues of folks struggling with inter-test dependencies through global state modification come up very often. Eli From stefan_ml at behnel.de Sun Aug 4 15:24:26 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 04 Aug 2013 15:24:26 +0200 Subject: [Python-Dev] PEP 442 clarification for type hierarchies In-Reply-To: References: Message-ID: Stefan Behnel, 04.08.2013 09:23: > I'm currently catching up on PEP 442, which managed to fly completely below > my radar so far. It's a really helpful change that could end up fixing a > major usability problem that Cython was suffering from: user provided > deallocation code now has a safe execution environment (well, at least in > Py3.4+). That makes Cython a prime candidate for testing this, and I've > just started to migrate the implementation. > > One thing that I found to be missing from the PEP is inheritance handling. > The current implementation doesn't seem to care about base types at all, so > it appears to be the responsibility of the type to call its super type > finalisation function. Is that really intended? Couldn't the super type > call chain be made a part of the protocol? > > Another bit is the exception handling. According to the documentation, > tp_finalize() is supposed to first save the current exception state, then > do the cleanup, then call WriteUnraisable() if necessary, then restore the > exception state. > > http://docs.python.org/3.4/c-api/typeobj.html#PyTypeObject.tp_finalize > > Is there a reason why this is left to the user implementation, rather than > doing it generically right in PyObject_CallFinalizer() ? 
That would also > make it more efficient to call through the super type hierarchy, I guess. I > don't see a need to repeat this exception state swapping at each level. > > So, essentially, I'm wondering whether PyObject_CallFinalizer() couldn't > just set up the execution environment and then call all finalisers of the > type hierarchy in bottom-up order. I continued my implementation and found that calling up the base type hierarchy is essentially the same code as calling up the hierarchy for tp_dealloc(), so that was easy to adapt to in Cython and is also more efficient than a generic loop (because it can usually benefit from inlining). So I'm personally ok with leaving the super type calling code to the user side, even though manual implementers may not be entirely happy. I think it should get explicitly documented how subtypes should deal with a tp_finalize() in (one of the) super types. It's not entirely trivial because the tp_finalize slot is not guaranteed to be filled for a super type IIUC, as opposed to tp_dealloc. I assume the recursive invariant that PyType_Ready() copies it would still hold, though. For reference, my initial implementation in Cython is here: https://github.com/cython/cython/commit/6fdb49bd84192089c7e742d46594b59ad6431b31 I'm currently running Cython's tests suite against it to see if everything broke along the way. Will report back as soon as I got everything working. 
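Stefan's chaining question has a Python-level analogue, since __del__ corresponds to tp_finalize under PEP 442: a subclass finalizer does not automatically invoke the base class's, so the subclass must chain explicitly. A small illustrative sketch (the class names are made up):

```python
import gc

log = []

class Base:
    def __del__(self):
        log.append('Base')

class Sub(Base):
    def __del__(self):
        log.append('Sub')
        # Nothing chains automatically: without this explicit call,
        # Base.__del__ would never run, mirroring the situation Stefan
        # describes for tp_finalize() at the C level.
        super().__del__()

obj = Sub()
del obj          # refcount drops to zero, finalizers run immediately in CPython
gc.collect()     # not strictly needed here, but harmless
print(log)  # ['Sub', 'Base']
```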
Stefan From eliben at gmail.com Sun Aug 4 15:40:15 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 4 Aug 2013 06:40:15 -0700 Subject: [Python-Dev] Allowing to run certain regression tests in subprocesses In-Reply-To: References: Message-ID: On Sat, Aug 3, 2013 at 7:08 PM, Nick Coghlan wrote: > > On 4 Aug 2013 12:03, "Eli Bendersky" wrote: >> >> On Sat, Aug 3, 2013 at 6:59 PM, Nick Coghlan wrote: >> > >> > On 4 Aug 2013 09:43, "Eli Bendersky" wrote: >> >> >> >> Hi All, >> >> >> >> Today the issue of cross-test global env dependencies showed its ugly >> >> head again for me. I recall a previous discussion >> >> (http://mail.python.org/pipermail/python-dev/2013-January/123409.html) >> >> but there were many more over the years. >> >> >> >> The core problem is that some tests modify the global env >> >> (particularly importing modules) and this sometimes has adverse >> >> effects on other tests, because test.regrtest runs all tests in a >> >> single process. In the discussion linked above, the particular culprit >> >> test__all__ was judged as a candidate to be moved to a subprocess. >> >> >> >> I want to propose adding a capability to our test harness to run >> >> specific tests in subprocesses. Each test will have some simple way of >> >> asking to be run in a subprocess, and regrtest will concur (even when >> >> running -j1). test__all__ can go there, and it can help solve other >> >> problems. >> >> >> >> My particular case is trying to write a test for >> >> http://bugs.python.org/issue14988 - wherein I have to simulate a >> >> situation of non-existent pyexpat. It's not hard to write a test for >> >> it, but when run in tandem with other tests (where C extensions loaded >> >> pyexpat) it becomes seemingly impossible to set up. This should not be >> >> the case - there's nothing wrong with wanting to simulate this case, >> >> and there's nothing wrong in Python and the stdlib - it's purely an >> >> artifact of the way our regression suite works. 
>> > >> > I'm not actively opposed to the suggested idea, but is there a specific >> > reason "test.support.import_fresh_module" doesn't work for this test? >> >> I'm not an expert on this topic, but I believe there's a problem >> unloading code that was loaded by C extensions. import_fresh_module is >> thus powerless here (which also appears to be the case empirically). > > Sure, it's just unusual to have a case where "importing is blocked by adding > None to sys.modules" differs from "not actually available", so I'd like to > understand the situation better. I must admit I'm confused by the behavior of import_fresh_module too. Snippet #1 raises the expected ImportError: sys.modules['pyexpat'] = None import _elementtree However, snippet #2 succeeds importing: ET = import_fresh_module('_elementtree', blocked=['pyexpat']) print(ET) I believe this happens because import_fresh_module does an import of the 'name' it's given before even looking at the blocked list. Then, it assigns None to sys.modules for the blocked names and re-imports the module. So in essence, this is somewhat equivalent to snippet #3: modname = '_elementtree' __import__(modname) del sys.modules[modname] for m in sys.modules: if modname == m or m.startswith(modname + '.'): del sys.modules[m] sys.modules['pyexpat'] = None ET = importlib.import_module(modname) print(ET) Which also succeeds. I fear I'm not familiar enough with the logic of importing to understand what's going on, but it has been my impression that this problem is occasionally encountered with import_fresh_module and C code that imports stuff (the import of pyexpat is done by C code in this case). CC'ing Brett. 
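The blocking trick that snippet #1 relies on can be shown in isolation with a made-up module name: a None entry in sys.modules makes a Python-level import raise ImportError.

```python
import importlib
import sys

# 'some_blocked_mod' is an illustrative name, not a real module.
# A None entry in sys.modules halts any import of that name.
sys.modules['some_blocked_mod'] = None
try:
    importlib.import_module('some_blocked_mod')
    outcome = 'imported'
except ImportError:
    outcome = 'blocked'
finally:
    del sys.modules['some_blocked_mod']  # clean up the interpreter state

print(outcome)  # 'blocked'
```

The surprising part in the thread is precisely that snippet #2 does not end up in this situation.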
Eli From eliben at gmail.com Sun Aug 4 16:01:49 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 4 Aug 2013 07:01:49 -0700 Subject: [Python-Dev] [RELEASED] Python 3.4.0a1 In-Reply-To: <51FDF93A.9090009@hastings.org> References: <51FDF316.4020107@hastings.org> <51FDF93A.9090009@hastings.org> Message-ID: On Sat, Aug 3, 2013 at 11:48 PM, Larry Hastings wrote: > On 08/03/2013 11:22 PM, Larry Hastings wrote: > > * PEP 435, a standardized "enum" module > * PEP 442, improved semantics for object finalization > * PEP 443, adding single-dispatch generic functions to the standard library > * PEP 445, a new C API for implementing custom memory allocators > > > Whoops, looks like I missed a couple here. I was in a hurry and just went > off what I could find in Misc/NEWS. I'll have a more complete list in the > release schedule PEP in a minute, and in the announcements for alpha 2. > > If you want to make sure your PEP is mentioned next time, by all means email > me and rattle my cage. > Larry, if there are other things you're going to add, update the web page http://www.python.org/download/releases/3.4.0/ as well - it's the one being linked in the inter-webs now. Eli From ncoghlan at gmail.com Sun Aug 4 16:26:44 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 5 Aug 2013 00:26:44 +1000 Subject: [Python-Dev] Allowing to run certain regression tests in subprocesses In-Reply-To: References: Message-ID: On 4 August 2013 23:40, Eli Bendersky wrote: > On Sat, Aug 3, 2013 at 7:08 PM, Nick Coghlan wrote: >> Sure, it's just unusual to have a case where "importing is blocked by adding >> None to sys.modules" differs from "not actually available", so I'd like to >> understand the situation better. > > I must admit I'm confused by the behavior of import_fresh_module too. 
> > Snippet #1 raises the expected ImportError: > > sys.modules['pyexpat'] = None > import _elementtree > > However, snippet #2 succeeds importing: > > ET = import_fresh_module('_elementtree', blocked=['pyexpat']) > print(ET) /me goes and looks That function was much simpler when it was first created :P Still, I'm fairly confident the complexity of that dance isn't relevant to the problem you're seeing. > I believe this happens because import_fresh_module does an import of > the 'name' it's given before even looking at the blocked list. Then, > it assigns None to sys.modules for the blocked names and re-imports > the module. So in essence, this is somewhat equivalent to snippet #3: > > modname = '_elementtree' > __import__(modname) > del sys.modules[modname] > for m in sys.modules: > if modname == m or m.startswith(modname + '.'): > del sys.modules[m] > sys.modules['pyexpat'] = None > ET = importlib.import_module(modname) > print(ET) > > Which also succeeds. > > I fear I'm not familiar enough with the logic of importing to > understand what's going on, but it has been my impression that this > problem is occasionally encountered with import_fresh_module and C > code that imports stuff (the import of pyexpat is done by C code in > this case). I had missed it was a C module doing the import. Looking into the _elementtree.c source, the problem in this case is the fact that a shared library that doesn't use PEP 3121 style per-module state is only loaded and initialised once, so reimporting it gets the same module back (from the extension loading cache), even if the Python level reference has been removed from sys.modules. Non PEP 3121 C extension modules thus don't work properly with test.support.import_fresh_module (as there's an extra level of caching involved that *can't* be cleared from Python, because it would break things). 
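The caching difference Nick describes can be seen by contrast with a pure-Python module, where removing the sys.modules entry really does produce a fresh copy on reimport. A small sketch using the stdlib string module:

```python
import importlib
import sys

# What import_fresh_module relies on: dropping the sys.modules entry
# makes the next import build a brand-new module object.  For a C
# extension without PEP 3121 per-module state, the extension-loading
# cache hands back the *same* module instead, so no fresh copy is made.
import string
original = sys.modules['string']
del sys.modules['string']
fresh = importlib.import_module('string')
print(fresh is original)  # False: a genuinely fresh module object

sys.modules['string'] = original  # restore the interpreter's usual state
```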
To fix this, _elementtree would need to move the pyexpat C API pointer to per-module state, rather than using a static variable (see http://docs.python.org/3/c-api/module.html#initializing-c-modules). With per-module state defined, the import machine should rerun the init function when the fresh import happens, thus creating a new copy of the module. However, this isn't an entirely trivial change for _elementtree, since: 1. Getting from the XMLParser instance back to the module where it was defined in order to retrieve the capsule pointer via PyModule_GetState() isn't entirely trivial in C. You'd likely do it once in the init method, store the result in an XMLParser attribute, and then tweak the EXPAT() using functions to include an appropriate local variable definition at the start of the method implementation. 2. expat_set_error would need to be updated to accept the pyexpat capsule pointer as a function parameter Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stefan_ml at behnel.de Sun Aug 4 17:59:57 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 04 Aug 2013 17:59:57 +0200 Subject: [Python-Dev] PEP 442 clarification for type hierarchies In-Reply-To: References: Message-ID: Stefan Behnel, 04.08.2013 15:24: > Stefan Behnel, 04.08.2013 09:23: >> I'm currently catching up on PEP 442, which managed to fly completely below >> my radar so far. It's a really helpful change that could end up fixing a >> major usability problem that Cython was suffering from: user provided >> deallocation code now has a safe execution environment (well, at least in >> Py3.4+). That makes Cython a prime candidate for testing this, and I've >> just started to migrate the implementation. >> >> One thing that I found to be missing from the PEP is inheritance handling. >> The current implementation doesn't seem to care about base types at all, so >> it appears to be the responsibility of the type to call its super type >> finalisation function.
Is that really intended? Couldn't the super type >> call chain be made a part of the protocol? >> >> Another bit is the exception handling. According to the documentation, >> tp_finalize() is supposed to first save the current exception state, then >> do the cleanup, then call WriteUnraisable() if necessary, then restore the >> exception state. >> >> http://docs.python.org/3.4/c-api/typeobj.html#PyTypeObject.tp_finalize >> >> Is there a reason why this is left to the user implementation, rather than >> doing it generically right in PyObject_CallFinalizer() ? That would also >> make it more efficient to call through the super type hierarchy, I guess. I >> don't see a need to repeat this exception state swapping at each level. >> >> So, essentially, I'm wondering whether PyObject_CallFinalizer() couldn't >> just set up the execution environment and then call all finalisers of the >> type hierarchy in bottom-up order. > > I continued my implementation and found that calling up the base type > hierarchy is essentially the same code as calling up the hierarchy for > tp_dealloc(), so that was easy to adapt to in Cython and is also more > efficient than a generic loop (because it can usually benefit from > inlining). So I'm personally ok with leaving the super type calling code to > the user side, even though manual implementers may not be entirely happy. > > I think it should get explicitly documented how subtypes should deal with a > tp_finalize() in (one of the) super types. It's not entirely trivial > because the tp_finalize slot is not guaranteed to be filled for a super > type IIUC, as opposed to tp_dealloc. I assume the recursive invariant that > PyType_Ready() copies it would still hold, though. Hmm, it seems to me by now that the only safe way of handling this is to let each tp_dealloc() level in the hierarchy call tp_finalize() through PyObject_CallFinalizerFromDealloc(), instead of calling up the stack in tp_finalize(). 
Otherwise, it's a bit fragile for arbitrary tp_dealloc() functions in base types and subtypes. However, that appears like a rather cumbersome and inefficient design. It also somewhat counters the advantage of having a finalisation step before deallocation, if the finalisers are only called after (partially) cleaning up the subtypes. ISTM that this feature hasn't been fully thought out... Stefan From larry at hastings.org Sun Aug 4 20:38:32 2013 From: larry at hastings.org (Larry Hastings) Date: Sun, 04 Aug 2013 11:38:32 -0700 Subject: [Python-Dev] [RELEASED] Python 3.4.0a1 In-Reply-To: References: <51FDF316.4020107@hastings.org> <51FDF93A.9090009@hastings.org> Message-ID: <51FE9FA8.3020503@hastings.org> On 08/04/2013 07:01 AM, Eli Bendersky wrote: > Larry, if there are other things you're going to add, update the web > page http://www.python.org/download/releases/3.4.0/ as well - it's the > one being linked in the inter-webs now. Good thinking! I'll do that today. //arry/ From eliben at gmail.com Mon Aug 5 01:06:33 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 4 Aug 2013 16:06:33 -0700 Subject: [Python-Dev] Allowing to run certain regression tests in subprocesses In-Reply-To: References: Message-ID: On Sun, Aug 4, 2013 at 7:26 AM, Nick Coghlan wrote: > On 4 August 2013 23:40, Eli Bendersky wrote: >> On Sat, Aug 3, 2013 at 7:08 PM, Nick Coghlan wrote: >>> Sure, it's just unusual to have a case where "importing is blocked by adding >>> None to sys.modules" differs from "not actually available", so I'd like to >>> understand the situation better. >> >> I must admit I'm confused by the behavior of import_fresh_module too.
>> >> Snippet #1 raises the expected ImportError: >> >> sys.modules['pyexpat'] = None >> import _elementtree >> >> However, snippet #2 succeeds importing: >> >> ET = import_fresh_module('_elementtree', blocked=['pyexpat']) >> print(ET) > > /me goes and looks > > That function was much simpler when it was first created :P > > Still, I'm fairly confident the complexity of that dance isn't > relevant to the problem you're seeing. > >> I believe this happens because import_fresh_module does an import of >> the 'name' it's given before even looking at the blocked list. Then, >> it assigns None to sys.modules for the blocked names and re-imports >> the module. So in essence, this is somewhat equivalent to snippet #3: >> >> modname = '_elementtree' >> __import__(modname) >> del sys.modules[modname] >> for m in sys.modules: >> if modname == m or m.startswith(modname + '.'): >> del sys.modules[m] >> sys.modules['pyexpat'] = None >> ET = importlib.import_module(modname) >> print(ET) >> >> Which also succeeds. >> >> I fear I'm not familiar enough with the logic of importing to >> understand what's going on, but it has been my impression that this >> problem is occasionally encountered with import_fresh_module and C >> code that imports stuff (the import of pyexpat is done by C code in >> this case). > > I had missed it was a C module doing the import. Looking into the > _elementtree.c source, the problem in this case is the fact that a > shared library that doesn't use PEP 3121 style per-module state is > only loaded and initialised once, so reimporting it gets the same > module back (from the extension loading cache), even if the Python > level reference has been removed from sys.modules. Non PEP 3121 C > extension modules thus don't work properly with > test.support.import_fresh_module (as there's an extra level of caching > involved that *can't* be cleared from Python, because it would break > things). 
> > To fix this, _elementtree would need to move the pyexpat C API pointer > to per-module state, rather than using a static variable (see > http://docs.python.org/3/c-api/module.html#initializing-c-modules). > With per-module state defined, the import machinery should rerun the > init function when the fresh import happens, thus creating a new copy > of the module. However, this isn't an entirely trivial change for > _elementtree, since: > > 1. Getting from the XMLParser instance back to the module where it was > defined in order to retrieve the capsule pointer via > PyModule_GetState() isn't entirely trivial in C. You'd likely do it > once in the init method, store the result in an XMLParser attribute, > and then tweak the EXPAT()-using functions to include an appropriate > local variable definition at the start of the method implementation. > > 2. expat_set_error would need to be updated to accept the pyexpat > capsule pointer as a function parameter Thanks Nick; I suspected something of the sort was going on here, but you've provided some interesting leads to look at. I'll probably open an issue to track this at some point. Eli From larry at hastings.org Mon Aug 5 10:48:29 2013 From: larry at hastings.org (Larry Hastings) Date: Mon, 05 Aug 2013 01:48:29 -0700 Subject: [Python-Dev] The Return Of Argument Clinic Message-ID: <51FF66DD.1020403@hastings.org> It's time to discuss Argument Clinic again. I think the implementation is ready for public scrutiny. (It was actually ready a week ago, but I lost a couple of days to "make distclean" corrupting my hg data store--yes, I hadn't upped my local clinic branch in a while. Eventually I gave up on repairing it and just brute-forced it. Anyway...) My Clinic test branch is here: https://bitbucket.org/larry/python-clinic/ And before you ask, no, the above branch should never ever ever be merged back into trunk. We'll start clean once Clinic is ready for merging and do a nice neat job.
___________________________________________________________________ There's no documentation, apart from the PEP. But you can see plenty of test cases of using Clinic, just grep for the string "clinic" in */*.c. But for reference here's the list:

    Modules/_cursesmodule.c
    Modules/_datetimemodule.c
    Modules/_dbmmodule.c
    Modules/posixmodule.c
    Modules/unicodedata.c
    Modules/_weakref.c
    Modules/zlibmodule.c
    Objects/dictobject.c
    Objects/unicodeobject.c

I haven't reimplemented every PyArg_ParseTuple "format unit" in the retooled Clinic, so it's not ready to try with every single builtin yet. The syntax is as Guido dictated it during our meeting after the Language Summit at PyCon US 2013. The implementation has been retooled, several times, and is now both nicer and more easily extensible. The internals are just a little messy, but the external interfaces are all ready for critique. ___________________________________________________________________ Here are the external interfaces as I foresee them. If you add your own data types, you'll subclass "Converter" and maybe "ReturnConverter". Take a look at the existing subclasses to get a feel for what that's like. If you implemented your own DSL, you'd make something that quacked like "PythonParser" (implementing __init__ and parse methods), and you'd deal with "Block", "Module", "Class", "Function", and "Parameter" objects a lot. What do you think? ___________________________________________________________________ What follows are six questions I'd like to put to the community, ranked oddly enough in order of how little to how much I care about the answer. BTW, by convention, every time I need an arbitrary sample function I use "os.stat". (Please quote the question line in your responses, otherwise I fear we'll get lost in the sea of text.) ___________________________________________________________________ Question 0: How should we integrate Clinic into the build process?
Clinic presents a catch-22: you want it as part of the build process, but it needs Python to be built before it'll run. Currently it requires Python 3.3 or newer; it might work in 3.2, I've never tried it. We can't depend on Python 3 being available when we build. This complicates the build process somewhat. I imagine it's a solvable problem on UNIX... with the right wizardry. I have no idea how one'd approach it on Windows, but obviously we need to solve the problem there too. ___________________________________________________________________ Question 1: Which C function nomenclature? Argument Clinic generates two functions prototypes per Python function: one specifying one of the traditional signatures for builtins, whose code is generated completely by Clinic, and the other with a custom-generated signature for just that call whose code is written by the user. Currently the former doesn't have any specific name, though I have been thinking of it as the "parse" function. The latter is definitely called the "impl" (pronounced IM-pull), short for "implementation". When Clinic generates the C code, it uses the name of the Python function to create the C functions' names, with underscores in place of dots. Currently the "parse" function gets the base name ("os_stat"), and the "impl" function gets an "_impl" added to the end ("os_stat_impl"). Argument Clinic is agnostic about the names of these functions. It's possible it'd be nicer to name these the other way around, say "os_stat_parse" for the parse function and "os_stat" for the impl. Anyone have a strong opinion one way or the other? I don't much care; all I can say is that the "obvious" way to do it when I started was to add "_impl" to the impl, as it is the new creature under the sun. ___________________________________________________________________ Question 2: Emit code for modules and classes? Argument Clinic now understands the structure of the modules and classes it works with. 
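Whichever naming wins, the shape of the generated pair is the same: a "parse" function with one of the generic builtin signatures that unpacks and converts arguments, then forwards to a typed "impl". A rough Python model of that split, using the thread's os.stat convention (all names and the signature here are invented for illustration; real Clinic output is C):

```python
# Hypothetical model of Clinic's parse/impl split, with os.stat as the
# running example.  The parse half owns argument handling; the impl
# half sees only already-converted values.

def os_stat_impl(path, dir_fd=None, follow_symlinks=True):
    # Stand-in body: real code would call into the OS here.
    return {"path": path, "dir_fd": dir_fd,
            "follow_symlinks": follow_symlinks}

def os_stat(args, kwargs=None):
    # Generic entry point, analogous to (PyObject *args, PyObject *kwargs).
    kwargs = kwargs or {}
    (path,) = args  # argument parsing/conversion happens here
    return os_stat_impl(path, **kwargs)
```

The naming question then reduces to which of these two functions deserves the bare name and which one gets the suffix.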
You declare them like so: module os class os.ImaginaryClassHere def os.ImaginaryClassHere.stat(...): ... Currently it does very little with the information; right now it mainly just gets baked into the documentation. In the future I expect it to get used in the introspection metadata, and it'll definitely be relevant to external consumers of the Argument Clinic information (IDEs building per-release databases, other implementations building metadata for library interface conformance testing). Another way we could use this metadata: have Argument Clinic generate more of the boilerplate for a class or module. For example, it could kick out all the PyMethodDef structures for the class or module. If we grew Argument Clinic some, and taught it about the data members of classes and modules, it could also generate the PyModuleDef and PyTypeObject structures, and even generate a function that initialized them at runtime for you. (Though that does seem like mission creep to me.) There are some complications to this, one of which I'll discuss next. But I put it to you, gentle reader: how much boilerplate should Argument Clinic undertake to generate, and how much more class and module metadata should be wired in to it? ___________________________________________________________________ Question 3: #ifdef support for functions? Truth be told, I did experiment with having Argument Clinic generate more of the boilerplate associated with modules. Clinic already generates a macro per function defining that function's PyMethodDef structure, for example: #define OS_STAT_METHODDEF \ {"stat", (PyCFunction)os_stat, \ METH_VARARGS|METH_KEYWORDS, os_stat__doc__} For a while I had it generating the PyMethodDef structures, like so: /*[clinic] generate_method_defs os [clinic]*/ #define OS_METHODDEFS \ OS_STAT_METHODDEF, \ OS_ACCESS_METHODDEF, \ OS_TTYNAME_METHODDEF, \ static PyMethodDef os_methods[] = { OS_METHODDEFS /* existing methoddefs here... 
*/ NULL } But I ran into trouble with os.ttyname(), which is only created and exposed if the platform defines HAVE_TTYNAME. Initially I'd just thrown all the Clinic stuff relevant to os.ttyname in the #ifdef block. But Clinic pays no attention to #ifdef statements--so it would still add OS_TTYNAME_METHODDEF, to OS_METHODDEFS. And kablooey! Right now I've backed out of this--I had enough to do without getting off into extra credit like this. But I'd like to return to it. It just seems natural to have Clinic generate this nasty boilerplate. Four approaches suggest themselves to me, listed below in order of least- to most-preferable in my opinion: 0) Don't have Clinic participate in populating the PyMethodDefs. 1) Teach Clinic to understand simple C preprocessor statements, just enough so it implicitly understands that os.ttyname was defined inside an #ifdef HAVE_TTYNAME block. It would then intelligently generate the code to take this into account. 2) Explicitly tell Clinic that os.ttyname must have HAVE_TTYNAME defined in order to be active. Clinic then generates the code intelligently taking this into account, handwave handwave. 3) Change the per-function methoddef macro to have the trailing comma:

    #define OS_STAT_METHODDEF \
        {"stat", (PyCFunction)os_stat, \
         METH_VARARGS|METH_KEYWORDS, os_stat__doc__},

and suppress it in the macro Clinic generates:

    /*[clinic] generate_method_defs os [clinic]*/
    #define OS_METHODDEFS \
        OS_STAT_METHODDEF \
        OS_ACCESS_METHODDEF \
        OS_TTYNAME_METHODDEF \

And then the code surrounding os.ttyname can look like this:

    #ifdef HAVE_TTYNAME
    // ... real os.ttyname stuff here
    #else
    #define OS_TTYNAME_METHODDEF
    #endif

And I think that would work great, actually. But I haven't tried it. Do you agree that Argument Clinic should generate this information, and it should use the approach in 3) ? ___________________________________________________________________ Question 4: Return converters returning success/failure?
With the addition of the "return converter", we have the lovely feature of being able to *return* a C type and have it converted back into a Python type. Your C extensions have never been more readable! The problem is that the PyObject * returned by a C builtin function serves two simultaneous purposes: it contains the return value on success, but also it is NULL if the function threw an exception. We can probably still do that for all pointer-y return types (I'm not sure, I haven't played with it yet). But if the impl function now returns "int", or some decidedly other non-pointer-y type, there's no longer a magic return value we can use to indicate "we threw an exception". This isn't the end of the world; I can detect that the impl threw an exception by calling PyErr_Occurred(). But I've been chided before for calling this unnecessarily; it's ever-so slightly expensive, in that it has to dereference TLS, and does so with an atomic operation. Not to mention that it's a function call! The impl should know whether or not it failed. So it's the interface we're defining that forces it to throw away that information. If we provided a way for it to return that information, we could shave off some cycles. The problem is, how do we do that in a way that doesn't suck? Four approaches suggest themselves to me, and sadly I think they all suck to one degree or another. In order of sucking least to most: 0) Return the real type and detect the exception with PyErr_Occurred(). This is by far the loveliest option, but it incurs runtime overhead. 1) Have the impl take an extra parameter, "int *failed". If the function fails, it sets that to a true value and returns whatever. 2) Have the impl return its calculated return value through an extra pointer-y parameter ("int *return_value"), and its actual return value is an int indicating success or failure. 3) Have the impl return a structure containing both the real return value and a success/failure integer. 
Then its return lines would look like this: return {-1, 0}; or maybe return {-3, PY_HORRIBLE_CLINIC_INTERFACE__SUCCESS}; Can we live with PyErr_Occurred() here? ___________________________________________________________________ Question 5: Keep too-magical class decorator Converter.wrap? Converter is the base class for converter objects, the objects that handle the details of converting a Python object into its C equivalent. The signature for Converter.__init__ has become complicated: def __init__(self, name, function, default=unspecified, *, doc_default=None, required=False) "name" is the name of the function ("stat"), "function" is an object representing the function for which this Converter is handling an argument (duck-type compatible with inspect.Signature), and default is the default (Python) value if any. "doc_default" is a string that overrides repr(default) in the documentation, handy if repr(default) is too ugly or you just want to mislead the user. "required", if True specifies that the parameter should be considered required, even if it has a default value. Complicating the matter further, converter subclasses may take extra (keyword-only and optional) parameters to configure exotic custom behavior. For example, the "Py_buffer" converter takes "zeroes" and "nullable"; the "path_t" converter implemented in posixmodule.c takes "allow_fd" and "nullable". This means that converter subclasses have to define a laborious __init__, including three parameters with defaults, then turn right around and pass most of the parameters back into super().__init__. This interface has changed several times during the development of Clinic, and I got tired of fixing up all my existing prototypes and super calls. So I made a class decorator that did it for me. 
Shield your eyes from the sulfurous dark wizardry of Converter.wrap:

    @staticmethod
    def wrap(cls):
        class WrappedConverter(cls, Converter):
            def __init__(self, name, function, default=unspecified, *,
                         doc_default=None, required=False, **kwargs):
                super(cls, self).__init__(name, function, default,
                    doc_default=doc_default, required=required)
                cls.__init__(self, **kwargs)
        return functools.update_wrapper(WrappedConverter, cls, updated=())

When you decorate your class with Converter.wrap, you only define in your __init__ your custom arguments. All the arguments Converter.__init__ cares about are taken care of for you (aka hidden from you). As an example, here are the relevant bits of path_t_converter from posixmodule.c:

    @Converter.wrap
    class path_t_converter(Converter):
        def __init__(self, *, allow_fd=False, nullable=False):
            ...

So on the one hand I admit it's smelly. On the other hand it hides a lot of stuff that the user needn't care about, and it makes the code simpler and easier to read. And it means we can change the required arguments for Converter.__init__ without breaking any code (as I have already happily done once or twice). I'd like to keep it in, and anoint it as the preferred way of declaring Converter subclasses. Anybody else have a strong opinion on this either way? (I don't currently have an equivalent mechanism for return converters--their interface is a lot simpler, and I just haven't needed it so far.) ___________________________________________________________________ Well! That's quite enough for now. //arry/ -------------- next part -------------- An HTML attachment was scrubbed...
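Stripped of Clinic's specifics, the trick generalizes: the decorator interposes a subclass whose __init__ peels off the base-class arguments and hands only the leftovers to the decorated class. A self-contained sketch of the same pattern (Base, wrap, and PathConverter are invented names for illustration, not Clinic's API):

```python
import functools

class Base:
    def __init__(self, name, default=None):
        self.name = name
        self.default = default

def wrap(cls):
    """Let subclasses of Base declare only their custom keyword args."""
    class Wrapped(cls, Base):
        def __init__(self, name, default=None, **kwargs):
            # Handle the base-class arguments here...
            Base.__init__(self, name, default)
            # ...and forward only the custom keywords to the subclass.
            cls.__init__(self, **kwargs)
    # Copy __name__, __qualname__, __doc__ etc. from the decorated class.
    return functools.update_wrapper(Wrapped, cls, updated=())

@wrap
class PathConverter(Base):
    def __init__(self, *, allow_fd=False):
        self.allow_fd = allow_fd
```

The subclass never mentions name or default, so the base signature can change without touching any subclass, which is exactly the maintenance win Larry describes.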
URL: From arigo at tunes.org Mon Aug 5 11:42:02 2013 From: arigo at tunes.org (Armin Rigo) Date: Mon, 5 Aug 2013 11:42:02 +0200 Subject: [Python-Dev] The Return Of Argument Clinic In-Reply-To: <51FF66DD.1020403@hastings.org> References: <51FF66DD.1020403@hastings.org> Message-ID: Hi Larry, On Mon, Aug 5, 2013 at 10:48 AM, Larry Hastings wrote: > Question 4: Return converters returning success/failure? The option generally used elsewhere is: if we throw an exception, we return some special value; but the special value doesn't necessarily mean by itself that an exception was set. It's a reasonable solution because the caller only needs to call PyErr_Occurred() for one special value, rather than every time. See for example any call to PyFloat_AsDouble(). A bientôt, Armin. From ncoghlan at gmail.com Mon Aug 5 11:55:22 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 5 Aug 2013 19:55:22 +1000 Subject: [Python-Dev] The Return Of Argument Clinic In-Reply-To: <51FF66DD.1020403@hastings.org> References: <51FF66DD.1020403@hastings.org> Message-ID: On 5 August 2013 18:48, Larry Hastings wrote: > Question 0: How should we integrate Clinic into the build process? > > Clinic presents a catch-22: you want it as part of the build process, > but it needs Python to be built before it'll run. Currently it > requires Python 3.3 or newer; it might work in 3.2, I've never > tried it. > > We can't depend on Python 3 being available when we build. > This complicates the build process somewhat. I imagine it's a > solvable problem on UNIX... with the right wizardry. I have no > idea how one'd approach it on Windows, but obviously we need to > solve the problem there too. Isn't solving the bootstrapping problem the reason for checking in the clinic-generated output? If there's no Python available, we build what we have (without the clinic step), then we build it again *with* the clinic step.
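Nick's two-pass answer can be sketched as a small decision helper -- all names here are invented for illustration, since the actual Makefile/Windows integration was still an open question in this thread:

```python
import os

def clinic_step(source, generated, python_available):
    """Decide what the build should do for one Clinic-annotated file.

    Sketch of the bootstrap logic: with no Python on hand, trust the
    checked-in generated code; with one, regenerate when it is stale.
    """
    if not python_available:
        # Bootstrap pass: build from the checked-in Clinic output.
        return "use-checked-in"
    if not os.path.exists(generated):
        return "regenerate"
    if os.path.getmtime(source) > os.path.getmtime(generated):
        return "regenerate"
    return "up-to-date"
```

A real build would then run this once before the bootstrap compile and once after, which is the "build it again *with* the clinic step" part of Nick's suggestion.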
> ___________________________________________________________________ > Question 1: Which C function nomenclature? > Anyone have a strong opinion one way or the other? I don't much > care; all I can say is that the "obvious" way to do it when I > started was to add "_impl" to the impl, as it is the new creature > under the sun. Consider this from the client side, and I believe it answers itself: other code in the module will be expecting the existing signature, so that signature needs to stay with the existing name, while the new C implementation function gets the new name. > ___________________________________________________________________ > Question 2: Emit code for modules and classes? > > There are some complications to this, one of which I'll > discuss next. But I put it to you, gentle reader: how > much boilerplate should Argument Clinic undertake to > generate, and how much more class and module metadata > should be wired in to it? I strongly recommend deferring this. Incremental development is good, and getting this bootstrapped at all is going to be challenging enough without trying to do everything at once. > ___________________________________________________________________ > Question 3: #ifdef support for functions? > > > Do you agree that Argument Clinic should generate this > information, and it should use the approach in 3) ? I think you should postpone anything related to modules and classes until the basic function support is in and working. > ___________________________________________________________________ > Question 4: Return converters returning success/failure? > > Can we live with PyErr_Occurred() here? Armin's suggestion of a valid return value (say, -1) that indicates "error may have occurred" sounds good to me. > ___________________________________________________________________ > Question 5: Keep too-magical class decorator Converter.wrap?
> > I'd like to keep it in, and anoint it as the preferred way > of declaring Converter subclasses. Anybody else have a strong > opinion on this either way? Can't you get the same effect without the magic by having a separate "custom_init" method that the main __init__ method promises to call with the extra keyword args after finishing the other parts of the initialization? Then a custom converter would just look like:

    class path_t_converter(Converter):
        def custom_init(self, *, allow_fd=False, nullable=False):
            ...

Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at gmail.com Mon Aug 5 14:56:06 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 5 Aug 2013 14:56:06 +0200 Subject: [Python-Dev] Inherance of file descriptor and handles on Windows (PEP 446) In-Reply-To: References: <20130726142525.79448c82@pitrou.net> <20130726232432.3651f1d5@fsol> <20130727002359.2631ea8a@fsol> Message-ID: 2013/7/27 Nick Coghlan : >> Do we even need a new PEP, or should we just do it? Or can we adapt >> Victor's PEP 446? > > Adapting the PEP sounds good - while I agree with switching to a sane > default, I think the daemonisation thread suggests there may need to > be a supporting API to help force FDs created by nominated logging > handlers to be inherited. Why would a Python logging handler be used in a child process? If the child process is a fresh Python process, it starts with the default logging handlers (no handler). Files opened by the logging module must be closed on exec().
Victor From victor.stinner at gmail.com Mon Aug 5 14:52:25 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 5 Aug 2013 14:52:25 +0200 Subject: [Python-Dev] Inherance of file descriptor and handles on Windows (PEP 446) In-Reply-To: <51F05F42.6070200@trueblade.com> References: <20130724235733.3505fab0@fsol> <51F05F42.6070200@trueblade.com> Message-ID: I checked the python-daemon module: it closes all open file descriptors except 0, 1, 2. It has a files_preserve attribute to keep some FD opens. It redirects stdin, stdout and stderr to /dev/null and keep these file descriptors open. If python-daemon is used to execute a new program, the files_preserve list can be used to mark these file descriptors as inherited. The zdaemon.zdrun module closes all open file descriptors except 0, 1, 2. It uses also dup2() to redirect stdout and stderr to the write end of a pipe. Victor 2013/7/25 Eric V. Smith : > On 7/24/2013 6:25 PM, Guido van Rossum wrote: >>>> To reduce the need for 3rd party subprocess creation code, we should >>>> have better daemon creation code in the stdlib -- I wrote some damn >>>> robust code for this purpose in my previous job, but it never saw the >>>> light of day. >>> >>> What do you call "daemon"? An actual Unix-like daemon? >> >> Yeah, a background process with parent PID 1 and not associated with >> any terminal group. > > There's PEP 3143 and https://pypi.python.org/pypi/python-daemon. I've > used it often, with great success. > > -- > Eric. 
> _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com From ncoghlan at gmail.com Mon Aug 5 17:12:50 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 6 Aug 2013 01:12:50 +1000 Subject: [Python-Dev] Inherance of file descriptor and handles on Windows (PEP 446) In-Reply-To: References: <20130724235733.3505fab0@fsol> <51F05F42.6070200@trueblade.com> Message-ID: On 5 August 2013 22:52, Victor Stinner wrote: > I checked the python-daemon module: it closes all open file > descriptors except 0, 1, 2. It has a files_preserve attribute to keep > some FD opens. It redirects stdin, stdout and stderr to /dev/null and > keep these file descriptors open. If python-daemon is used to execute > a new program, the files_preserve list can be used to mark these file > descriptors as inherited. > > The zdaemon.zdrun module closes all open file descriptors except 0, 1, > 2. It uses also dup2() to redirect stdout and stderr to the write end > of a pipe. So closed by default, and directing people towards subprocess and python-daemon if they need to keep descriptors open sounds really promising. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From v+python at g.nevcal.com Mon Aug 5 19:41:57 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Mon, 05 Aug 2013 10:41:57 -0700 Subject: [Python-Dev] The Return Of Argument Clinic In-Reply-To: <51FF66DD.1020403@hastings.org> References: <51FF66DD.1020403@hastings.org> Message-ID: <51FFE3E5.1060302@g.nevcal.com> On 8/5/2013 1:48 AM, Larry Hastings wrote: > > The impl should know whether or not it failed. So it's the > interface we're defining that forces it to throw away that > information. If we provided a way for it to return that > information, we could shave off some cycles. 
The problem > is, how do we do that in a way that doesn't suck? > ... > Can we live with PyErr_Occurred() here? Isn't there another option? To have the impl call a special "failed" clinic API, prior to returning failure? And if that wasn't called, then the return is success. Or does that require the same level of overhead as PyErr_Occurred? Reducing the chances of PyErr_Occurred per Armin's suggestion seems good if the above is not an improvement. -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Mon Aug 5 20:44:25 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 5 Aug 2013 20:44:25 +0200 Subject: [Python-Dev] Inherance of file descriptor and handles on Windows (PEP 446) References: <20130726142525.79448c82@pitrou.net> <20130726232432.3651f1d5@fsol> <20130727002359.2631ea8a@fsol> Message-ID: <20130805204425.35fbcfd1@fsol> On Mon, 5 Aug 2013 14:56:06 +0200 Victor Stinner wrote: > 2013/7/27 Nick Coghlan : > >> Do we even need a new PEP, or should we just do it? Or can we adapt > >> Victor's PEP 446? > > > > Adapting the PEP sounds good - while I agree with switching to a sane > > default, I think the daemonisation thread suggests there may need to > > be a supporting API to help force FDs created by nominated logging > > handlers to be inherited. > > Why would a Python logging handler be used in a child process? If the > child process is a fresh Python process, it starts with the default > logging handlers (no handler). > > Files opened by the logging module must be closed on exec(). I agree with this. It is only on fork()-without-exec() that the behaviour of python-daemon is actively anti-social. Regards Antoine. 
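For reference, the default being discussed here is what PEP 446 ultimately specified for Python 3.4: file descriptors are non-inheritable by default, and a descriptor that must survive exec() has to be opted in explicitly. A minimal demonstration using the os APIs that shipped with the PEP:

```python
import os

# Under PEP 446 (Python 3.4+), new descriptors are non-inheritable by
# default, so a fresh exec'ed child never sees them.
r, w = os.pipe()
try:
    default_flag = os.get_inheritable(w)   # False by default
    os.set_inheritable(w, True)            # opt this one fd in
    opted_in_flag = os.get_inheritable(w)
finally:
    os.close(r)
    os.close(w)
```

This is also the mechanism behind subprocess's pass_fds argument, which marks the listed descriptors inheritable for the child it spawns.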
From solipsis at pitrou.net Mon Aug 5 20:51:09 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 5 Aug 2013 20:51:09 +0200 Subject: [Python-Dev] PEP 442 clarification for type hierarchies References: Message-ID: <20130805205109.4cc384aa@fsol> Hi, On Sun, 04 Aug 2013 09:23:41 +0200 Stefan Behnel wrote: > > I'm currently catching up on PEP 442, which managed to fly completely below > my radar so far. It's a really helpful change that could end up fixing a > major usability problem that Cython was suffering from: user provided > deallocation code now has a safe execution environment (well, at least in > Py3.4+). That makes Cython a prime candidate for testing this, and I've > just started to migrate the implementation. That's great to hear. "Safe execution environment" for finalization code is exactly what the PEP is about. > One thing that I found to be missing from the PEP is inheritance handling. > The current implementation doesn't seem to care about base types at all, so > it appears to be the responsibility of the type to call its super type > finalisation function. Is that really intended? Yes, it is intended that users have to call super().__del__() in their __del__ implementation, if they want to call the upper-level finalizer. This is exactly the same as in __init__() and (most?) other special functions. > Another bit is the exception handling. According to the documentation, > tp_finalize() is supposed to first save the current exception state, then > do the cleanup, then call WriteUnraisable() if necessary, then restore the > exception state. > > http://docs.python.org/3.4/c-api/typeobj.html#PyTypeObject.tp_finalize > > Is there a reason why this is left to the user implementation, rather than > doing it generically right in PyObject_CallFinalizer() ? That would also > make it more efficient to call through the super type hierarchy, I guess. I > don't see a need to repeat this exception state swapping at each level. 
I didn't give much thought to this detail. Originally I was simply copying this bit of semantics from tp_dealloc and tp_del, but indeed we could do better. Do you want to open an issue about it? Regards Antoine. From solipsis at pitrou.net Mon Aug 5 20:56:29 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 5 Aug 2013 20:56:29 +0200 Subject: [Python-Dev] PEP 442 clarification for type hierarchies References: Message-ID: <20130805205629.5869ab83@fsol> On Sun, 04 Aug 2013 17:59:57 +0200 Stefan Behnel wrote: > > I continued my implementation and found that calling up the base type > > hierarchy is essentially the same code as calling up the hierarchy for > > tp_dealloc(), so that was easy to adapt to in Cython and is also more > > efficient than a generic loop (because it can usually benefit from > > inlining). So I'm personally ok with leaving the super type calling code to > > the user side, even though manual implementers may not be entirely happy. > > > > I think it should get explicitly documented how subtypes should deal with a > > tp_finalize() in (one of the) super types. It's not entirely trivial > > because the tp_finalize slot is not guaranteed to be filled for a super > > type IIUC, as opposed to tp_dealloc. I assume the recursive invariant that > > PyType_Ready() copies it would still hold, though. Not only could it be NULL (if no upper type has a finalizer), but it could also not exist at all (if Py_TPFLAGS_HAVE_FINALIZE isn't in tp_flags). If an API is needed to make this easier then why not. But I'm not sure anyone other than Cython really has such concerns. Usually, the class hierarchy for C extension types is known at compile-time and therefore you know exactly which upper finalizers to call. > Hmm, it seems to me by now that the only safe way of handling this is to > let each tp_dealloc() level in the hierarchy call tp_finalize() through > PyObject_CallFinalizerFromDealloc(), instead of calling up the stack in > tp_finalize().
Otherwise, it's a bit fragile for arbitrary tp_dealloc() > functions in base types and subtypes. I'm not following you. Why is it "a bit fragile" to call the base tp_finalize from a derived tp_finalize? It should actually be totally safe, since tp_finalize is a regular function called in a safe environment (unlike tp_dealloc and tp_del). Regards Antoine. From stefan_ml at behnel.de Mon Aug 5 21:03:33 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 05 Aug 2013 21:03:33 +0200 Subject: [Python-Dev] PEP 442 clarification for type hierarchies In-Reply-To: <20130805205109.4cc384aa@fsol> References: <20130805205109.4cc384aa@fsol> Message-ID: Hi, I was just continuing in my monologue when you replied, so I'll just drop my response below. Antoine Pitrou, 05.08.2013 20:51: > On Sun, 04 Aug 2013 09:23:41 +0200 > Stefan Behnel wrote: >> I'm currently catching up on PEP 442, which managed to fly completely below >> my radar so far. It's a really helpful change that could end up fixing a >> major usability problem that Cython was suffering from: user provided >> deallocation code now has a safe execution environment (well, at least in >> Py3.4+). That makes Cython a prime candidate for testing this, and I've >> just started to migrate the implementation. > > That's great to hear. "Safe execution environment" for finalization > code is exactly what the PEP is about. > >> One thing that I found to be missing from the PEP is inheritance handling. >> The current implementation doesn't seem to care about base types at all, so >> it appears to be the responsibility of the type to call its super type >> finalisation function. Is that really intended? > > Yes, it is intended that users have to call super().__del__() in their > __del__ implementation, if they want to call the upper-level finalizer. > This is exactly the same as in __init__() and (most?) other special > functions. That's the Python side of things. 
However, if a subtype overwrites tp_finalize(), then there should be a protocol for making sure the super type's tp_finalize() is called, and that it's being called in the right kind of execution environment. >> Another bit is the exception handling. According to the documentation, >> tp_finalize() is supposed to first save the current exception state, then >> do the cleanup, then call WriteUnraisable() if necessary, then restore the >> exception state. >> >> http://docs.python.org/3.4/c-api/typeobj.html#PyTypeObject.tp_finalize >> >> Is there a reason why this is left to the user implementation, rather than >> doing it generically right in PyObject_CallFinalizer() ? That would also >> make it more efficient to call through the super type hierarchy, I guess. I >> don't see a need to repeat this exception state swapping at each level. > > I didn't give much thought to this detail. Originally I was simply > copying this bit of semantics from tp_dealloc and tp_del, but indeed we > could do better. Do you want to open an issue about it? I think the main problem I have with the PEP is this part: """ The PEP doesn't change the semantics of: * C extension types with a custom tp_dealloc function. """ Meaning, it was designed to explicitly ignore this use case. That's a mistake, IMHO. If we are to add a new finalisation protocol, why not make it work in the general case (i.e. fix the problem once and for all), instead of restricting it to a special case and leaving the rest to each user to figure out again? Separating the finalisation from the deallocation is IMO a good idea. It fixes cyclic garbage collection, that's excellent. And it removes the differences between GC and normal refcounting cleanup by clarifying in what states finalisation and deallocation are executed (one safe, one not). I think what's missing is the following. 
* split deallocation into a distinct finalisation and deallocation phase

* make finalisation run recursively (or iteratively) through all
  inheritance levels, in a well defined execution environment (e.g. after
  saving away the exception state)

* after successful finalisation, run the deallocation, as before.

My guess is that a recursive finalisation phase where subtype code calls into the supertype is generally more efficient, so I think I'd prefer that.

I think it's a mistake that the current implementation calls the finalisation from tp_dealloc(). Instead, both the finalisation and the deallocation should be called externally and independently from the cleanup mechanism behind Py_DECREF(). (There is no problem in CS that can't be solved by adding another level of indirection...)

An obvious open question is how to deal with exceptions during finalisation. Any break in the execution chain would mean that a part of the type wouldn't be finalised. One way to handle this could be to simply assume that the deallocation phase would still clean up anything that's left over. Or the protocol could dictate that each level must swallow its own exceptions and call the super type finaliser with a clean exception state. This might suggest that an external iterative call loop through the finaliser hierarchy has a usability advantage over recursive calls.

Just dropping my idea here.

Stefan

From solipsis at pitrou.net Mon Aug 5 21:26:30 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 5 Aug 2013 21:26:30 +0200
Subject: [Python-Dev] PEP 442 clarification for type hierarchies
References: <20130805205109.4cc384aa@fsol>
Message-ID: <20130805212630.2472c0c6@fsol>

On Mon, 05 Aug 2013 21:03:33 +0200
Stefan Behnel wrote:
> I think the main problem I have with the PEP is this part:
>
> """
> The PEP doesn't change the semantics of:
> * C extension types with a custom tp_dealloc function.
> """
>
> Meaning, it was designed to explicitly ignore this use case.
It doesn't ignore it. It lets you fix your C extension type to use tp_finalize for resource finalization. It also provides the PyObject_CallFinalizerFromDealloc() API function to make it easier to call tp_finalize from your tp_dealloc. What the above sentence means is that, if you don't change your legacy tp_dealloc, your type will not take advantage of the new facilities.

(you can take a look at the _io module; it was modified to take advantage of tp_finalize)

> * make finalisation run recursively (or iteratively) through all
> inheritance levels, in a well defined execution environment (e.g. after
> saving away the exception state)

__init__ and other methods only let the user recurse explicitly. __del__ would be a weird exception if it recursed implicitly. Also, it would break backwards compatibility for existing uses of __del__.

> I think it's a mistake that the current implementation calls the
> finalisation from tp_dealloc(). Instead, both the finalisation and the
> deallocation should be called externally and independently from the cleanup
> mechanism behind Py_DECREF(). (There is no problem in CS that can't be
> solved by adding another level of indirection...)

Why not, but I'm not sure that would solve anything on your side. If it does, would you like to cook a patch? I wonder if there's some unexpected issue with doing what you're proposing.

> An obvious open question is how to deal with exceptions during
> finalisation. Any break in the execution chain would mean that a part of
> the type wouldn't be finalised.

Let's come back to pure Python:

class A:
    def __del__(self):
        1/0

class B(A):
    def __del__(self):
        super().__del__()
        self.cleanup_resources()

If you want cleanup_resources() to be called always (despite A.__del__() raising), you have to use a try/finally block. There's no magic here.

Letting the user call the upper finalizer explicitly lets them choose their preferred form of exception handling.

Regards

Antoine.
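The try/finally variant Antoine alludes to can be sketched concretely (the ``log`` list and the body of ``cleanup_resources()`` are illustrative additions, not code from the thread): the subclass cleanup runs even though the base ``__del__()`` raises, and CPython reports the escaping exception as unraisable instead of propagating it to the caller.

```python
import gc

log = []

class A:
    def __del__(self):
        log.append("A.__del__")
        1 / 0  # a failing base finalizer

class B(A):
    def cleanup_resources(self):
        log.append("B.cleanup")

    def __del__(self):
        # try/finally: cleanup runs even though A.__del__ raises;
        # the ZeroDivisionError is then reported as "unraisable".
        try:
            super().__del__()
        finally:
            self.cleanup_resources()

b = B()
del b          # CPython: refcount hits zero, B.__del__ runs immediately
gc.collect()   # belt-and-braces for implementations without refcounting
print(log)     # ['A.__del__', 'B.cleanup']
```

Running this prints an "Exception ignored in" traceback on stderr for the ZeroDivisionError, but the log shows that both finalisation steps ran.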
From stefan_ml at behnel.de Mon Aug 5 21:32:54 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 05 Aug 2013 21:32:54 +0200 Subject: [Python-Dev] PEP 442 clarification for type hierarchies In-Reply-To: <20130805205629.5869ab83@fsol> References: <20130805205629.5869ab83@fsol> Message-ID: Antoine Pitrou, 05.08.2013 20:56: > On Sun, 04 Aug 2013 17:59:57 +0200 > Stefan Behnel wrote: >>> I continued my implementation and found that calling up the base type >>> hierarchy is essentially the same code as calling up the hierarchy for >>> tp_dealloc(), so that was easy to adapt to in Cython and is also more >>> efficient than a generic loop (because it can usually benefit from >>> inlining). So I'm personally ok with leaving the super type calling code to >>> the user side, even though manual implementers may not be entirely happy. >>> >>> I think it should get explicitly documented how subtypes should deal with a >>> tp_finalize() in (one of the) super types. It's not entirely trivial >>> because the tp_finalize slot is not guaranteed to be filled for a super >>> type IIUC, as opposed to tp_dealloc. I assume the recursive invariant that >>> PyType_Ready() copies it would still hold, though. > > Not only it could be NULL (if no upper type has a finalizer), but it > could also not exist at all (if Py_TPFLAGS_HAVE_FINALIZE isn't in > tp_flags). If an API is needed to make this easier then why not. But > I'm not sure anyone else than Cython really has such concerns. Usually, > the class hierarchy for C extension types is known at compile-time and > therefore you know exactly which upper finalizers to call. Well, you shouldn't have to, though. Otherwise, it would be practically impossible to insert a new finaliser into an existing hierarchy once other people/projects have started inheriting from it. And, sure, Cython makes these things so easy that people actually do them. The Sage math system has type hierarchies that go up to 10 levels deep IIRC. 
That's a lot of space for future changes.

>> Hmm, it seems to me by now that the only safe way of handling this is to
>> let each tp_dealloc() level in the hierarchy call tp_finalize() through
>> PyObject_CallFinalizerFromDealloc(), instead of calling up the stack in
>> tp_finalize(). Otherwise, it's a bit fragile for arbitrary tp_dealloc()
>> functions in base types and subtypes.

I think I got confused here. PyObject_CallFinalizerFromDealloc() works on the object, not the type. So it can't be used to call anything but the bottom-most tp_finalize().

> I'm not following you. Why is it "a bit fragile" to call the base
> tp_finalize from a derived tp_finalize? It should actually be totally
> safe, since tp_finalize is a regular function called in a safe
> environment (unlike tp_dealloc and tp_del).

As long as there is no OWTDI, you can't really make safe assumptions about the way a super type's tp_finalize() and tp_dealloc() play together. The details definitely need to be spelled out here.

Stefan

From solipsis at pitrou.net Mon Aug 5 21:44:05 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 5 Aug 2013 21:44:05 +0200
Subject: [Python-Dev] PEP 442 clarification for type hierarchies
References: <20130805205629.5869ab83@fsol>
Message-ID: <20130805214405.6df5b0c1@fsol>

On Mon, 05 Aug 2013 21:32:54 +0200
Stefan Behnel wrote:
>
> >> Hmm, it seems to me by now that the only safe way of handling this is to
> >> let each tp_dealloc() level in the hierarchy call tp_finalize() through
> >> PyObject_CallFinalizerFromDealloc(), instead of calling up the stack in
> >> tp_finalize(). Otherwise, it's a bit fragile for arbitrary tp_dealloc()
> >> functions in base types and subtypes.
>
> I think I got confused here. PyObject_CallFinalizerFromDealloc() works on
> the object, not the type. So it can't be used to call anything but the
> bottom-most tp_finalize().

Well, the bottom-most tp_finalize() is responsible for calling the upper ones, if it wants to.
> > I'm not following you. Why is it "a bit fragile" to call the base > > tp_finalize from a derived tp_finalize? It should actually be totally > > safe, since tp_finalize is a regular function called in a safe > > environment (unlike tp_dealloc and tp_del). > > As long as there is not OWTDI, you can't really make safe assumption about > the way a super type's tp_finalize() and tp_dealloc() play together. The > details definitely need to be spelled out here. I'd be glad to make the spec more explicit if needed, but first you need to tell me if the current behaviour is ok, or if you need something else (within the boundaries of backwards compatibility and reasonable expectations, though: i.e. no implicit recursion through the __mro__). Regards Antoine. From stefan_ml at behnel.de Mon Aug 5 22:30:29 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 05 Aug 2013 22:30:29 +0200 Subject: [Python-Dev] PEP 442 clarification for type hierarchies In-Reply-To: <20130805212630.2472c0c6@fsol> References: <20130805205109.4cc384aa@fsol> <20130805212630.2472c0c6@fsol> Message-ID: Antoine Pitrou, 05.08.2013 21:26: > On Mon, 05 Aug 2013 21:03:33 +0200 > Stefan Behnel wrote: >> I think the main problem I have with the PEP is this part: >> >> """ >> The PEP doesn't change the semantics of: >> * C extension types with a custom tp_dealloc function. >> """ >> >> Meaning, it was designed to explicitly ignore this use case. > > It doesn't ignore it. It lets you fix your C extension type to use > tp_finalize for resource finalization. It also provides the > PyObject_CallFinalizerFromDealloc() API function to make it easier to > call tp_finalize from your tp_dealloc. > > What the above sentence means is that, if you don't change your legacy > tp_dealloc, your type will not take advantage of the new facilities. 
> > (you can take a look at the _io module; it was modified to take > advantage of tp_finalize) Oh, I'm aware of the backwards compatibility, but I totally *want* to take advantage of the new feature. >> * make finalisation run recursively (or iteratively) through all >> inheritance levels, in a well defined execution environment (e.g. after >> saving away the exception state) > > __init__ and other methods only let the user recurse explicitly. > __del__ would be a weird exception if it recursed implicitly. Also, it > would break backwards compatibility for existing uses of __del__. Hmm, it's a bit unfortunate that tp_finalize() maps so directly to __del__(), but I think this can be fixed. In any case, each tp_finalize() function must only ever be called once, so if a subtype inherited the tp_finalize() slot from its parent, it mustn't be called again. Instead, the hierarchy would be followed upwards to search for the next tp_finalize() that's different from the current one, i.e. the function pointer differs. That means that only the top-most super type would need to call __del__(), after all tp_finalize() functions in subtypes have run. >> I think it's a mistake that the current implementation calls the >> finalisation from tp_dealloc(). Instead, both the finalisation and the >> deallocation should be called externally and independently from the cleanup >> mechanism behind Py_DECREF(). (There is no problem in CS that can't be >> solved by adding another level of indirection...) > > Why not, but I'm not sure that would solve anything on your side. Well, it would untangle both phases and make it clearer what needs to call what. If I call my super type's tp_dealloc(), does it need to call tp_finalize() or not? And if it does: the type's one or just its own one? If both phases are split, then the answer is simply: it doesn't and mustn't, because that's already been taken care of. All it has to care about is what it's there for: deallocation. 
> If it does, would you like to cook a patch? I wonder if there's some
> unexpected issue with doing what you're proposing.

I was somehow expecting this question. There's still the open issue of module initialisation, though. Not sure which is more important. Given that this feature has already been merged, I guess it's better to fix it up before people start making actual use of it, instead of putting work into a new feature that's not even started yet.

>> An obvious open question is how to deal with exceptions during
>> finalisation. Any break in the execution chain would mean that a part of
>> the type wouldn't be finalised.
>
> Let's come back to pure Python:
>
> class A:
>     def __del__(self):
>         1/0
>
> class B(A):
>     def __del__(self):
>         super().__del__()
>         self.cleanup_resources()

What makes you think it's a good idea to call the parent type's finaliser before doing the local finalisation, and not the other way round? What if the subtype needs access to parts of the super type for its cleanup? In other words, which makes more sense (at the C level):

    try:
        super().tp_finalize()
    finally:
        local_cleanup()

or

    try:
        local_cleanup()
    finally:
        super().tp_finalize()

Should that order be part of the protocol or not? (well, not for __del__() I guess, but maybe for tp_finalize()?)

Coming back to the __del__() vs. tp_finalize() story, if tp_finalize() first recursed into the super types, the top-most one of which then calls __del__() and returns, we'd get an execution order that runs Python-level __del__() methods before C-level tp_finalize() functions, but lose the subtype-before-supertype execution order for tp_finalize() functions.
That might call for a three-step cleanup:

1) run all Python __del__() methods recursively
2) run all tp_finalize() functions recursively
3) run tp_dealloc() recursively

This would allow all three call chains to run in subtype-before-supertype order, and execute the more sensitive Python methods before the low-level "I know what I'm doing" C finalisers.

> If you want cleanup_resources() to be called always (despite
> A.__del__() raising), you have to use a try/finally block. There's no
> magic here.
>
> Letting the user call the upper finalizer explicitly lets them choose
> their preferred form of exception handling.

I'm ok with that. I'd just prefer it if each level didn't have to execute some kind of boilerplate setup+teardown code. It's a different situation if the finaliser is being called from a subtype or if it's being called as the main entry point for finalisation. I guess that's mostly just an optimisation, though. At least Cython could do that internally by wrapping the finaliser in a boilerplate function before sticking it into the tp_finalize slot, and otherwise call it directly if it's known at compile time.

Stefan

From matthewlmcclure at gmail.com Mon Aug 5 23:43:45 2013
From: matthewlmcclure at gmail.com (Matt McClure)
Date: Mon, 5 Aug 2013 17:43:45 -0400
Subject: [Python-Dev] unittest.TestSuite holding references to unittest.TestCase instances too long
In-Reply-To:
References: <20130803160725.1D1602500CA@webabinitio.net>
Message-ID:

On Sat, Aug 3, 2013 at 3:27 PM, Michael Foord wrote:
> It smells to me like a new feature rather than a bugfix, and it's a
> moderately big change. I don't think it can be backported to 2.7 other than
> through unittest2.

Is http://hg.python.org/unittest2 the place to backport to unittest2?

-- 
Matt McClure
http://matthewlmcclure.com
http://www.mapmyfitness.com/profile/matthewlmcclure
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From larry at hastings.org Tue Aug 6 01:53:39 2013 From: larry at hastings.org (Larry Hastings) Date: Mon, 05 Aug 2013 16:53:39 -0700 Subject: [Python-Dev] The Return Of Argument Clinic In-Reply-To: References: <51FF66DD.1020403@hastings.org> Message-ID: <52003B03.6090204@hastings.org> On 08/05/2013 02:55 AM, Nick Coghlan wrote: > On 5 August 2013 18:48, Larry Hastings wrote: >> Question 0: How should we integrate Clinic into the build process? > Isn't solving the bootstrapping problem the reason for checking in the > clinic-generated output? If there's no Python available, we build what > we have (without the clinic step), then we build it again *with* the > clinic step. It solves the bootstrapping problem, but that's not the only problem Clinic presents to the development workflow. If you modify some Clinic DSL in a C file in the CPython tree, then run "make", should the Makefile re-run Clinic over that file? If you say "no", then there's no problem. If you say "yes", then we have the problem I described. >> ___________________________________________________________________ >> Question 1: Which C function nomenclature? > Consider this from the client side, and I believe it answers itself: > other code in the module will be expected the existing signature, so > that signature needs to stay with the existing name, while the new C > implementation function gets the new name. One vote for "os_stat_impl". Bringing the sum total of votes up to 1! ;-) >> ___________________________________________________________________ >> Question 2: Emit code for modules and classes? >> >> There are some complications to this, one of which I'll >> discuss next. But I put it to you, gentle reader: how >> much boilerplate should Argument Clinic undertake to >> generate, and how much more class and module metadata >> should be wired in to it? > I strongly recommend deferring this. 
> Incremental development is good,
> and getting this bootstrapped at all is going to be challenging enough
> without trying to do everything at once.

I basically agree. But you glossed over an important part of that question, "how much more class and module metadata should be wired in right now?". Originally Clinic didn't ask for full class and module information, you just specified the full dotted path and that was that. But that's ambiguous; Clinic wouldn't be able to infer what was a module vs what was a class. And in the future, if/when it generates module and class boilerplate, obviously it'll need to know the distinction. I figure, specifying the classes and modules doesn't add a lot of additional cost, but it'll very likely save us a lot of time in the long run, so I made it a requirement. (WAGNI!)

Anyway, I guess what I was really kind of trying to get at here was: a) are there any other obvious bits of metadata Clinic should require right now for functions, b) what other metadata might Clinic take in the future--not because I want to add it, but just so we can figure out the next question, c) to what degree can we future-proof Clinic 1.0 so extension authors can more easily straddle versions.

Thinking about it more with a fresh perspective, maybe all we need is a Clinic version number directive. This would declare the minimum Clinic version--which would really just track the Python version it shipped with--that you may use to process this file. Like so:

/*[clinic]
clinic 3.5
[clinic]*/

As long as the code Clinic generates is backwards compatible for Python 3.4, I think this has it covered. We may at times force developers to use fresher versions of Python to process Clinic stuff, but I don't think that's a big deal.

>> ___________________________________________________________________
>> Question 4: Return converters returning success/failure?
>>
>> Can we live with PyErr_Occurred() here?
> Armin's suggestion of a valid return value (say, -1) that indicates
> "error may have occurred" sounds good to me.

Yes indeed, and thanks Armin for pointing it out. This works perfectly in Clinic, as each individual return converter controls the code generated for cleanup afterwards. So it can be a completely local policy per-return-converter what the magic value is. Heck, you could have multiple int converters with different magic return values (not that that seems like a good idea).

>> ___________________________________________________________________
>> Question 5: Keep too-magical class decorator Converter.wrap?
>>
>> I'd like to keep it in, and anoint it as the preferred way
>> of declaring Converter subclasses. Anybody else have a strong
>> opinion on this either way?

> Can't you get the same effect without the magic by having a separate
> "custom_init" method that the main __init__ method promises to call
> with the extra keyword args after finishing the other parts of the
> initialization? Then a custom converter would just look like:
>
> class path_t_converter(Converter):
>     def custom_init(self, *, allow_fd=False, nullable=False):
>         ...

I can get the same effect without reusing the name __init__, but I wouldn't say I can do it "without the magic". The whole point of the decorator is magic.

Let's say I go with your proposal. What happens if someone makes a Converter, and wraps it with Converter.wrap, and defines their own __init__? It would never get called. Silently, by default, which is worse--though I could explicitly detect such an __init__ and throw an exception I guess. Still, now we have a class where you can't use the name __init__, you have to use this funny other name, for arbitrary "correctness" reasons.
My metaphor for why I prefer my approach is the set of "os" module functions that allow "specifying a file descriptor":

http://docs.python.org/3/library/os.html#path-fd

Taking the example of os.chdir(), yes, it would have been more correct to require specifying the file descriptor as a separate argument, like

os.chdir(None, fd=my_dir_fd)

But this would have meant that when using "fd" the first parameter would always be None. And the first parameter and the "fd" do the same thing, just with different types. So while normally we eschew polymorphic parameters (and with good reason) in this case I think practicality beat purity. And I think the same holds here.

Since class instance initialization functions in Python are called __init__, and this is a class instance initialization function, I think it should be called __init__. By decorating with Converter.wrap, you're signing a contract that says "yes it's a fake __init__, that's what I want". A real __init__ in this class would never be called, which is the whole point, so we might as well reuse the name for our slightly-fake __init__.

Let me put it this way: Which is more surprising to the person unfamiliar with the code? That this __init__ doesn't get all the parameters, and the base class __init__ is getting called automatically? Or that this funny function "custom_init" is what gets called, and this class is not allowed to have a function called __init__?

In case I didn't make it clear: the actual call site for constructing these objects is buried deep in clinic.py. Users don't create them directly, or at least I don't know why they'd ever need to do so. Instead, they're created inside Clinic preprocessor blocks in your source files, where they look like this:

parameter_name: Converter(argument=value, argument2=value) = default

Using the funny magic of Converter.wrap makes this and the implementation look a great deal more alike. So I remain a fan of Converter.wrap and calling the initialization function __init__.
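Nick's "custom_init" alternative can be sketched in isolation (this is an illustration of the pattern under discussion, not clinic.py's actual code; all names are hypothetical): the base ``__init__`` consumes the common arguments once and forwards any remaining keywords to a hook that subclasses override.

```python
class Converter:
    def __init__(self, *, default=None, **custom):
        self.default = default      # common argument, handled once here
        self.custom_init(**custom)  # hand unknown keywords to the subclass hook

    def custom_init(self):
        pass                        # subclasses override this, never __init__

class path_t_converter(Converter):
    def custom_init(self, *, allow_fd=False, nullable=False):
        self.allow_fd = allow_fd
        self.nullable = nullable

c = path_t_converter(allow_fd=True)
print(c.allow_fd, c.nullable, c.default)  # True False None
```

The trade-off Larry describes shows up here too: a subclass author who writes ``__init__`` out of habit silently bypasses the hook dispatch unless they remember to call the base class.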
//arry/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From victor.stinner at gmail.com Tue Aug 6 02:23:05 2013
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 6 Aug 2013 02:23:05 +0200
Subject: [Python-Dev] (New) PEP 446: Make newly created file descriptors non-inheritable
Message-ID:

Hi,

My second try (the old PEP 446) at changing how file descriptors are inherited was somehow rejected. Here is a third try (the new PEP 446) which should include all information from the recent discussions on python-dev, especially how file descriptors and handles are inherited on Windows. I added tables to summarize the status of Python 3.3 and the status of atomic flags. I rewrote the rationale to introduce and explain the inheritance of file descriptors, which is not a simple subject. I removed the O_NONBLOCK flag from the PEP; it is a different issue and a new PEP must be written.

Online HTML version:
http://www.python.org/dev/peps/pep-0446/

You may also want to read my first attempt:
http://www.python.org/dev/peps/pep-0433/

PEP: 446
Title: Make newly created file descriptors non-inheritable
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 5-August-2013
Python-Version: 3.4

Abstract
========

Leaking file descriptors in child processes causes various annoying issues and is a known major security vulnerability. This PEP proposes to make all file descriptors created by Python non-inheritable by default to reduce the risk of these issues. This PEP also fixes a race condition in multithreaded applications on operating systems supporting atomic flags to create non-inheritable file descriptors.

Rationale
=========

Inheritance of File Descriptors
-------------------------------

Each operating system handles the inheritance of file descriptors differently.
Windows creates non-inheritable file descriptors by default, whereas UNIX creates inheritable file descriptors by default. Python prefers the POSIX API over the native Windows API to have a single code base, and so it creates inheritable file descriptors.

There is one exception: ``os.pipe()`` creates non-inheritable pipes on Windows, whereas it creates inheritable pipes on UNIX. The reason is an implementation artifact: ``os.pipe()`` calls ``CreatePipe()`` on Windows (native API), whereas it calls ``pipe()`` on UNIX (POSIX API). The call to ``CreatePipe()`` was added in Python in 1994, before the introduction of ``pipe()`` in the POSIX API in Windows 98. The `issue #4708 `_ proposes to change ``os.pipe()`` on Windows to create inheritable pipes.

Inheritance of File Descriptors on Windows
------------------------------------------

On Windows, the native type of file objects is the handle (C type ``HANDLE``). These handles have a ``HANDLE_FLAG_INHERIT`` flag which defines if a handle can be inherited in a child process or not. For the POSIX API, the C runtime (CRT) also provides file descriptors (C type ``int``). The handle of a file descriptor can be obtained using the function ``_get_osfhandle(fd)``. A file descriptor can be created from a handle using the function ``_open_osfhandle(handle)``.

Using `CreateProcess() `_, handles are only inherited if their inheritable flag (``HANDLE_FLAG_INHERIT``) is set and the ``bInheritHandles`` parameter of ``CreateProcess()`` is ``TRUE``; all file descriptors except standard streams (0, 1, 2) are closed in the child process, even if ``bInheritHandles`` is ``TRUE``. Using the ``spawnv()`` function, all inheritable handles and all inheritable file descriptors are inherited in the child process. This function uses the undocumented fields *cbReserved2* and *lpReserved2* of the `STARTUPINFO `_ structure to pass an array of file descriptors.
To replace standard streams (stdin, stdout, stderr) using ``CreateProcess()``, the ``STARTF_USESTDHANDLES`` flag must be set in the *dwFlags* field of the ``STARTUPINFO`` structure and the *bInheritHandles* parameter of ``CreateProcess()`` must be set to ``TRUE``. So when at least one standard stream is replaced, all inheritable handles are inherited by the child process.

See also:

* `Handle Inheritance `_
* `Q315939: PRB: Child Inherits Unintended Handles During CreateProcess Call `_

Inheritance of File Descriptors on UNIX
---------------------------------------

POSIX provides a *close-on-exec* flag on file descriptors to automatically close a file descriptor when the C function ``execv()`` is called. File descriptors with the *close-on-exec* flag cleared are inherited in the child process; file descriptors with the flag set are closed in the child process.

The flag can be set in two syscalls (one to get the current flags, a second to set the new flags) using ``fcntl()``::

    int flags, res;
    flags = fcntl(fd, F_GETFD);
    if (flags == -1) { /* handle the error */ }
    flags |= FD_CLOEXEC;
    /* or "flags &= ~FD_CLOEXEC;" to clear the flag */
    res = fcntl(fd, F_SETFD, flags);
    if (res == -1) { /* handle the error */ }

FreeBSD, Linux, Mac OS X, NetBSD, OpenBSD and QNX also support setting the flag in a single syscall using ``ioctl()``::

    int res;
    res = ioctl(fd, FIOCLEX, 0);
    if (res == -1) { /* handle the error */ }

The *close-on-exec* flag has no effect on ``fork()``: all file descriptors are inherited by the child process. The `Python issue #16500 "Add an atfork module" `_ proposes to add a new ``atfork`` module to execute code at fork; it may be used to close file descriptors automatically.

Issues with Inheritable File Descriptors
----------------------------------------

Most of the time, inheritable file descriptors "leaked" in child processes are not noticed, because they don't cause major bugs. It does not mean that these bugs must not be fixed.
Two examples of common issues with inherited file descriptors:

* On Windows, a directory cannot be removed before all file handles open in the directory are closed. The same issue can be seen with files, except if the file was created with the ``FILE_SHARE_DELETE`` flag (``O_TEMPORARY`` mode for ``open()``).
* If a listening socket is leaked in a child process, the socket address cannot be reused before the parent and child processes have terminated. For example, if a web server spawns a new program to handle a request, and the server restarts while the program is not done: the server cannot start because the TCP port is still in use.

Leaking file descriptors is also a well-known security vulnerability: read `FIO42-C. Ensure files are properly closed when they are no longer needed `_ of the CERT. An untrusted child process can read sensitive data like passwords and take control of the parent process through leaked file descriptors. It is, for example, a way to escape from a chroot. With a leaked listening socket, a child process can accept new connections to read sensitive data.

Atomic Creation of non-inheritable File Descriptors
---------------------------------------------------

In a multithreaded application, an inheritable file descriptor can be created just before a new program is spawned, before the file descriptor is made non-inheritable. In this case, the file descriptor is leaked to the child process. This race condition could be avoided if the file descriptor is created directly non-inheritable.

FreeBSD, Linux, Mac OS X, Windows and many other operating systems support creating non-inheritable file descriptors with the inheritable flag cleared atomically at the creation of the file descriptor.

A new ``WSA_FLAG_NO_HANDLE_INHERIT`` flag for ``WSASocket()`` was added in Windows 7 SP1 and Windows Server 2008 R2 SP1 to create non-inheritable sockets. If this flag is used on an older Windows version (ex: Windows XP SP3), ``WSASocket()`` fails with ``WSAEPROTOTYPE``.
On UNIX, new flags were added for files and sockets:

* ``O_CLOEXEC``: available on Linux (2.6.23), FreeBSD (8.3), OpenBSD 5.0,
  Solaris 11, QNX, BeOS, next NetBSD release (6.1?). This flag is part of
  POSIX.1-2008.

* ``SOCK_CLOEXEC`` flag for ``socket()`` and ``socketpair()``, available
  on Linux 2.6.27, OpenBSD 5.2, NetBSD 6.0.

* ``fcntl()``: ``F_DUPFD_CLOEXEC`` flag, available on Linux 2.6.24,
  OpenBSD 5.0, FreeBSD 9.1, NetBSD 6.0, Solaris 11. This flag is part of
  POSIX.1-2008.

* ``fcntl()``: ``F_DUP2FD_CLOEXEC`` flag, available on FreeBSD 9.1 and
  Solaris 11.

* ``recvmsg()``: ``MSG_CMSG_CLOEXEC``, available on Linux 2.6.23,
  NetBSD 6.0.

On Linux older than 2.6.23, the ``O_CLOEXEC`` flag is simply ignored. So
``fcntl()`` must be called to check if the file descriptor is
non-inheritable: ``O_CLOEXEC`` is not supported if the ``FD_CLOEXEC``
flag is missing. On Linux older than 2.6.27, ``socket()`` or
``socketpair()`` fail with ``errno`` set to ``EINVAL`` if the
``SOCK_CLOEXEC`` flag is set in the socket type.

New functions:

* ``dup3()``: available on Linux 2.6.27 (and glibc 2.9)
* ``pipe2()``: available on Linux 2.6.27 (and glibc 2.9)
* ``accept4()``: available on Linux 2.6.28 (and glibc 2.10)

On Linux older than 2.6.28, ``accept4()`` fails with ``errno`` set to
``ENOSYS``.

Summary:

=========================== =============== ====================================
Operating System            Atomic File     Atomic Socket
=========================== =============== ====================================
FreeBSD                     8.3 (2012)      X
Linux                       2.6.23 (2007)   2.6.27 (2008)
Mac OS X                    10.8 (2012)     X
NetBSD                      6.1 (?)         6.0 (2012)
OpenBSD                     5.0 (2011)      5.2 (2012)
Solaris                     11 (2011)       X
Windows                     XP (2001)       Seven SP1 (2011), 2008 R2 SP1 (2011)
=========================== =============== ====================================

Legend:

* "Atomic File": first version of the operating system supporting the
  atomic creation of a non-inheritable file descriptor with ``open()``
* "Atomic Socket": first version of the operating system supporting the
  atomic creation of a non-inheritable socket
* "X": not supported yet

Status of Python 3.3
--------------------

Python 3.3 creates inheritable file descriptors on all platforms, except
``os.pipe()`` which creates non-inheritable file descriptors on Windows.

New constants and functions related to the atomic creation of
non-inheritable file descriptors were added to Python 3.3:
``os.O_CLOEXEC``, ``os.pipe2()`` and ``socket.SOCK_CLOEXEC``.

On UNIX, the ``subprocess`` module closes all file descriptors in the
child process by default, except standard streams (0, 1, 2) and the file
descriptors of the *pass_fds* parameter. If the *close_fds* parameter is
set to ``False``, all inheritable file descriptors are inherited in the
child process.

On Windows, the ``subprocess`` module closes all handles and file
descriptors in the child process by default. If at least one standard
stream (stdin, stdout or stderr) is replaced (ex: redirected into a
pipe), all inheritable handles are inherited in the child process.

All inheritable file descriptors are inherited by the child process when
using the functions of the ``os.execv*()`` and ``os.spawn*()`` families.

On UNIX, the ``multiprocessing`` module uses ``os.fork()``, so all file
descriptors are inherited by child processes. On Windows, all inheritable
handles are inherited by the child process when using the
``multiprocessing`` module; all file descriptors except standard streams
are closed.
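The UNIX ``subprocess`` behaviour described above can be exercised with a
short sketch (POSIX only; the child command and message are illustrative):

```python
import os
import subprocess
import sys

r, w = os.pipe()

# With the default close_fds=True, the child keeps only the standard
# streams (0, 1, 2) plus the descriptors listed in pass_fds.
child_code = "import os, sys; os.write(int(sys.argv[1]), b'hi')"
proc = subprocess.Popen([sys.executable, "-c", child_code, str(w)],
                        pass_fds=(w,))
os.close(w)            # the parent's copy is no longer needed
data = os.read(r, 2)   # read what the child wrote into the pipe
proc.wait()
os.close(r)
print(data)  # b'hi'
```

Without ``pass_fds=(w,)``, the child would not inherit the write end of
the pipe and the ``os.write()`` call in the child would fail.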
Summary:

=========================== ============= ================== =============
Module                      FD on UNIX    Handles on Windows FD on Windows
=========================== ============= ================== =============
subprocess, default         STD, pass_fds none               STD
subprocess, close_fds=False all           all                STD
os.execv(), os.spawn()      all           all                all
multiprocessing             all           all                STD
=========================== ============= ================== =============

Legend:

* "all": all *inheritable* file descriptors or handles are inherited in
  the child process
* "none": all handles are closed in the child process
* "STD": only file descriptors 0 (stdin), 1 (stdout) and 2 (stderr) are
  inherited in the child process
* "pass_fds": the file descriptors of the *pass_fds* parameter of the
  subprocess are inherited

Proposal
========

Non-inheritable File Descriptors
--------------------------------

The following functions are modified to make newly created file
descriptors non-inheritable by default:

* ``asyncore.dispatcher.create_socket()``
* ``io.FileIO``
* ``io.open()``
* ``open()``
* ``os.dup()``
* ``os.dup2()``
* ``os.fdopen()``
* ``os.open()``
* ``os.openpty()``
* ``os.pipe()``
* ``select.devpoll()``
* ``select.epoll()``
* ``select.kqueue()``
* ``socket.socket()``
* ``socket.socket.accept()``
* ``socket.socket.dup()``
* ``socket.socket.fromfd()``
* ``socket.socketpair()``

When available, atomic flags are used to make file descriptors
non-inheritable. The atomicity is not guaranteed because a fallback is
required when atomic flags are not available.

New Functions
-------------

* ``os.get_inheritable(fd: int)``: return ``True`` if the file descriptor
  can be inherited by child processes, ``False`` otherwise.
* ``os.set_inheritable(fd: int, inheritable: bool)``: clear or set the
  inheritable flag of the specified file descriptor.

These new functions are available on all platforms. On Windows, these
functions also accept file descriptors of sockets: the result of
``sockobj.fileno()``.
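Since Python 3.4 implemented this PEP, the two proposed functions can be
tried directly (a minimal sketch):

```python
import os

r, w = os.pipe()

# With this PEP, newly created descriptors are non-inheritable by default.
before = os.get_inheritable(r)

# The flag can be flipped explicitly, e.g. right before spawning a child
# process that must inherit the descriptor.
os.set_inheritable(r, True)
after = os.get_inheritable(r)

os.close(r)
os.close(w)
print(before, after)  # False True
```

On Python 3.3 and earlier, ``os.get_inheritable()`` does not exist and
the pipe descriptors would have been inheritable on UNIX.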
Other Changes
-------------

* On UNIX, subprocess makes the file descriptors of the *pass_fds*
  parameter inheritable. Each file descriptor is made inheritable in the
  child process after ``fork()`` and before ``execv()``, so the
  inheritable flag of file descriptors is unchanged in the parent
  process.

* ``os.dup2(fd, fd2)`` makes *fd2* inheritable if *fd2* is ``0`` (stdin),
  ``1`` (stdout) or ``2`` (stderr) and *fd2* is different from *fd*.

Backward Compatibility
======================

This PEP breaks applications relying on the inheritance of file
descriptors. Developers are encouraged to reuse the high-level Python
module ``subprocess``, which handles the inheritance of file descriptors
in a portable way.

Applications using the ``subprocess`` module with the *pass_fds*
parameter or using ``os.dup2()`` to redirect standard streams should not
be affected.

Python no longer conforms to POSIX, since file descriptors are now made
non-inheritable by default. Python was not designed to conform to POSIX,
but was designed to develop portable applications.

Related Work
============

The programming languages Go, Perl and Ruby make newly created file
descriptors non-inheritable by default: since Go 1.0 (2009), Perl 1.0
(1987) and Ruby 2.0 (2013).

The SCons project overrides the builtin functions ``file()`` and
``open()`` to make files non-inheritable on Windows: see `win32.py `_.

Rejected Alternatives
=====================

PEP 433
-------

PEP 433, entitled "Easier suppression of file descriptor inheritance", is
a previous attempt proposing various other alternatives, but no consensus
could be reached.

No special case for standard streams
------------------------------------

Functions handling file descriptors should not handle standard streams
(file descriptors ``0``, ``1``, ``2``) differently. This option does not
work on Windows.
On Windows, calling ``SetHandleInformation()`` to set or clear the
``HANDLE_FLAG_INHERIT`` flag on standard streams (0, 1, 2) fails with
Windows error 87 (invalid argument). If ``os.dup2(fd, fd2)`` always made
*fd2* non-inheritable, the function would raise an exception when used to
redirect standard streams. Another option is to add a new *inheritable*
parameter to ``os.dup2()``.

This PEP makes a special case for ``os.dup2()`` so as not to break
backward compatibility in applications that redirect standard streams
before calling the C function ``execv()``. Examples in the Python
standard library: ``CGIHTTPRequestHandler.run_cgi()`` and ``pty.fork()``
use ``os.dup2()`` to redirect stdin, stdout and stderr.

Links
=====

Python issues:

* `#10115: Support accept4() for atomic setting of flags at socket creation `_
* `#12105: open() does not able to set flags, such as O_CLOEXEC `_
* `#12107: TCP listening sockets created without FD_CLOEXEC flag `_
* `#16850: Add "e" mode to open(): close-and-exec (O_CLOEXEC) / O_NOINHERIT `_
* `#16860: Use O_CLOEXEC in the tempfile module `_
* `#16946: subprocess: _close_open_fd_range_safe() does not set close-on-exec flag on Linux < 2.6.23 if O_CLOEXEC is defined `_
* `#17070: Use the new cloexec to improve security and avoid bugs `_
* `#18571: Implementation of the PEP 446: non-inheriable file descriptors `_

Other links:

* `Secure File Descriptor Handling `_ (Ulrich Drepper, 2008)
* `Ghosts of Unix past, part 2: Conflated designs `_ (Neil Brown, 2010)
  explains the history of the ``O_CLOEXEC`` and ``O_NONBLOCK`` flags

Copyright
=========

This document has been placed into the public domain.
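The ``os.dup2()`` special case for standard streams described in this PEP
can be observed on Python 3.4+ (a sketch; the redirect target is
illustrative):

```python
import os

devnull = os.open(os.devnull, os.O_WRONLY)  # non-inheritable by default
saved_stdout = os.dup(1)

# Duplicating onto a standard stream leaves the target inheritable, so a
# program spawned after the redirection still gets a working stdout.
os.dup2(devnull, 1)
stdout_inheritable = os.get_inheritable(1)

os.dup2(saved_stdout, 1)  # restore the original stdout
os.close(saved_stdout)
os.close(devnull)
print(stdout_inheritable)  # True
```

Duplicating onto a non-standard descriptor instead would leave the
duplicate non-inheritable unless requested otherwise.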
From shibturn at gmail.com Tue Aug 6 02:46:53 2013 From: shibturn at gmail.com (Richard Oudkerk) Date: Tue, 06 Aug 2013 01:46:53 +0100 Subject: [Python-Dev] (New) PEP 446: Make newly created file descriptors non-inheritable In-Reply-To: References: Message-ID: On 06/08/2013 1:23am, Victor Stinner wrote: > Each operating system handles the inheritance of file descriptors > differently. Windows creates non-inheritable file descriptors by > default, whereas UNIX creates inheritable file descriptors by default. The Windows API creates non-inheritable *handles* by default. But the C runtime creates inheritable fds by default. Also the socket library creates sockets with inheritable handles by default. Apparently there isn't a reliable way to make sockets non-inheritable because anti-virus/firewall software can interfere: http://stackoverflow.com/questions/12058911/can-tcp-socket-handles-be-set-not-inheritable -- Richard From shibturn at gmail.com Tue Aug 6 02:48:18 2013 From: shibturn at gmail.com (Richard Oudkerk) Date: Tue, 06 Aug 2013 01:48:18 +0100 Subject: [Python-Dev] (New) PEP 446: Make newly created file descriptors non-inheritable In-Reply-To: References: Message-ID: <520047D2.7030808@gmail.com> On 06/08/2013 1:23am, Victor Stinner wrote: > Each operating system handles the inheritance of file descriptors > differently. Windows creates non-inheritable file descriptors by > default, whereas UNIX creates inheritable file descriptors by default. The Windows API creates non-inheritable *handles* by default. But the C runtime creates inheritable fds by default. Also the socket library creates sockets with inheritable handles by default. 
Apparently there isn't a reliable way to make sockets non-inheritable
because anti-virus/firewall software can interfere:

http://stackoverflow.com/questions/12058911/can-tcp-socket-handles-be-set-not-inheritable

--
Richard

From victor.stinner at gmail.com  Tue Aug  6 02:59:45 2013
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 6 Aug 2013 02:59:45 +0200
Subject: [Python-Dev] (New) PEP 446: Make newly created file descriptors non-inheritable
In-Reply-To: 
References: 
Message-ID: 

> On Windows, the ``subprocess`` closes all handles and file descriptors
> in the child process by default. If at least one standard stream (stdin,
> stdout or stderr) is replaced (ex: redirected into a pipe), all
> > Summary:
> >
> > =========================== ============= ================== =============
> > Module                      FD on UNIX    Handles on Windows FD on Windows
> > =========================== ============= ================== =============
> > subprocess, default         STD, pass_fds none               STD

Oh, the summary table is wrong for the "subprocess, default" line: all
inheritable handles are inherited if at least one standard stream is
replaced.

Victor

From ncoghlan at gmail.com  Tue Aug  6 06:59:37 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 6 Aug 2013 14:59:37 +1000
Subject: [Python-Dev] The Return Of Argument Clinic
In-Reply-To: <52003B03.6090204@hastings.org>
References: <51FF66DD.1020403@hastings.org> <52003B03.6090204@hastings.org>
Message-ID: 

On 6 August 2013 09:53, Larry Hastings wrote:
> On 08/05/2013 02:55 AM, Nick Coghlan wrote:
> On 5 August 2013 18:48, Larry Hastings wrote:
>
> Question 0: How should we integrate Clinic into the build process?
>
> Isn't solving the bootstrapping problem the reason for checking in the
> clinic-generated output? If there's no Python available, we build what
> we have (without the clinic step), then we build it again *with* the
> clinic step.
>
> It solves the bootstrapping problem, but that's not the only problem Clinic
> presents to the development workflow.
>
> If you modify some Clinic DSL in a C file in the CPython tree, then run
> "make", should the Makefile re-run Clinic over that file? If you say "no",
> then there's no problem. If you say "yes", then we have the problem I
> described.

Ah, I think I see the problem you mean: what the makefile defines as the
way to regenerate an object file from the C file. If the rule is "run
clinic, then run the compiler", you will get a dependency loop. If it
doesn't implicitly run clinic, we risk checking in inconsistent clinic
metadata.

I think the simplest answer may be to have "make clinic" as an explicit
command, along with a commit hook that checks for clinic metadata
consistency.
Then "make" doesn't have to change and there's no nasty bootstrapping
problem.

> ___________________________________________________________________
> Question 2: Emit code for modules and classes?
>
> There are some complications to this, one of which I'll
> discuss next. But I put it to you, gentle reader: how
> much boilerplate should Argument Clinic undertake to
> generate, and how much more class and module metadata
> should be wired in to it?
>
> I strongly recommend deferring this. Incremental development is good,
> and getting this bootstrapped at all is going to be challenging enough
> without trying to do everything at once.
>
> I basically agree. But you glossed over an important part of that question,
> "how much more class and module metadata should be wired in right now?".
>
> Originally Clinic didn't ask for full class and module information, you just
> specified the full dotted path and that was that. But that's ambiguous;
> Clinic wouldn't be able to infer what was a module vs what was a class. And
> in the future, if/when it generates module and class boilerplate, obviously
> it'll need to know the distinction. I figure, specifying the classes and
> modules doesn't add a lot of additional cost, but it'll very likely save us
> a lot of time in the long run, so I made it a requirement. (WAGNI!)

Note that setuptools entry point syntax solves the namespace ambiguity
problem by using ":" to separate the module name from the object's name
within the module (the nose test runner does the same thing). I'm
adopting that convention for the PEP 426 metadata, and it's probably
appropriate as a concise notation for clinic as well.

> As long as the code Clinic generates is backwards compatible for Python 3.4,
> I think this will have it covered. We may at times force developers to use
> fresher versions of Python to process Clinic stuff, but I don't think that's
> a big deal.
One of the nice things about explicitly versioned standards is that you
can set the "no version stated" default to the first version released
and avoid the boilerplate in the common case :)

> ___________________________________________________________________
> Question 5: Keep too-magical class decorator Converter.wrap?
>
> Let's say I go with your proposal. What happens if someone makes a
> Converter, and wraps it with Converter.wrap, and defines their own __init__?
> It would never get called. Silently, by default, which is worse--though I
> could explicitly detect such an __init__ and throw an exception I guess.
> Still, now we have a class where you can't use the name __init__, you have
> to use this funny other name, for arbitrary "correctness" reasons.

You misunderstand me: I believe a class decorator is the *wrong
solution*. I am saying Converter.wrap *shouldn't exist*, and that the
logic for what it does should be directly in Converter.__init__. The
additional initialisation method could be given a better name like
"process_custom_params" rather than "custom_init".

That is, instead of this hard to follow magic:

    @staticmethod
    def wrap(cls):
        class WrappedConverter(cls, Converter):
            def __init__(self, name, function, default=unspecified, *,
                         doc_default=None, required=False, **kwargs):
                super(cls, self).__init__(name, function, default,
                    doc_default=doc_default, required=required)
                cls.__init__(self, **kwargs)
        return functools.update_wrapper(WrappedConverter, cls, updated=())

You would just have the simple:

    class Converter:
        def __init__(self, name, function, default=unspecified, *,
                     doc_default=None, required=False, **kwargs):
            ...  # Existing arg processing
            self.process_custom_params(**kwargs)

        def process_custom_params(self):
            # Default to no custom parameters allowed
            pass

Those that just want to define custom parameters and leave the rest of
the logic alone can override "process_custom_params".
Those that want to completely control the initialisation can override
__init__ directly.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From stefan_ml at behnel.de  Tue Aug  6 07:02:30 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 06 Aug 2013 07:02:30 +0200
Subject: [Python-Dev] Make extension module initialisation more like Python module initialisation
In-Reply-To: 
References: 
Message-ID: 

Hi,

let me revive and summarize this old thread.

Stefan Behnel, 08.11.2012 13:47:
> I suspect that this will be put into a proper PEP at some point, but I'd
> like to bring this up for discussion first. This came out of issues 13429
> and 16392.
>
> http://bugs.python.org/issue13429
>
> http://bugs.python.org/issue16392
>
>
> The problem
> ===========
>
> Python modules and extension modules are not being set up in the same way.
> For Python modules, the module is created and set up first, then the module
> code is being executed. For extensions, i.e. shared libraries, the module
> init function is executed straight away and does both the creation and
> initialisation. This means that it knows neither the __file__ it is being
> loaded from nor its package (i.e. its FQMN). This hinders relative imports
> and resource loading. In Py3, it's also not being added to sys.modules,
> which means that a (potentially transitive) re-import of the module will
> really try to reimport it and thus run into an infinite loop when it
> executes the module init function again. And without the FQMN, it's not
> trivial to correctly add the module to sys.modules either.
>
> We specifically run into this for Cython generated modules, for which it's
> not uncommon that the module init code has the same level of complexity as
> that of any 'regular' Python module. Also, the lack of a FQMN and correct
> file path hinders the compilation of __init__.py modules, i.e. packages,
> especially when relative imports are being used at module init time.
The outcome of this discussion was that the extension module import
protocol needs to change in order to provide all necessary information
to the module init function.

Brett Cannon proposed to move the module object creation into the
extension module importer, i.e. outside of the user provided module init
function. CPython would then load the extension module, create and
initialise the module object (set __file__, __name__, etc.) and pass it
into the module init function.

I proposed to make the PyModuleDef struct the new entry point instead of
just a generic C function, as that would give the module importer all
necessary information about the module to create the module object. The
only missing bit is the entry point for the new module init function.

Nick Coghlan objected to the proposal of simply extending PyModuleDef
with an initialiser function, as the struct is part of the stable ABI.

Alternatives I see:

1) Expose a struct that points to the extension module's PyModuleDef
struct and the init function and expose that struct instead.

2) Expose both the PyModuleDef and the init function as public symbols.

3) Provide a public C function as entry point that returns both a
PyModuleDef pointer and a module init function pointer.

4) Change the m_init function pointer in PyModuleDef_base from
func(void) to func(PyObject*) iff the PyModuleDef struct is exposed as a
public symbol.

5) Duplicate PyModuleDef and adapt the new one as in 4).

Alternatives 1) and 2) only differ marginally by the number of public
symbols being exposed. 3) has the advantage of supporting more advanced
setups, e.g. heap allocation for the PyModuleDef struct. 4) is a hack
and has the disadvantage that the signature of the module init function
cannot be stored across reinitialisations (PyModuleDef has no "flags" or
"state" field to remember it). 5) would fix that, i.e. we could add a
proper pointer to the new module init function as well as a flags field
for future extensions.
A similar effect could be achieved by carefully designing the struct in 1).

I think 1-3 are all reasonable ways to do this, although I don't think
3) will be necessary. 5) would be a clean fix, but has the disadvantage
of duplicating an entire struct just to change one field in it.

I'm currently leaning towards 1), with a struct that points to
PyModuleDef, module init function and a flags field for future
extensions. I understand that this would need to become part of the
stable ABI, so explicit extensibility is important to keep up backwards
compatibility.

Opinions?

Stefan

From ncoghlan at gmail.com  Tue Aug  6 07:35:28 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 6 Aug 2013 15:35:28 +1000
Subject: [Python-Dev] Make extension module initialisation more like Python module initialisation
In-Reply-To: 
References: 
Message-ID: 

On 6 August 2013 15:02, Stefan Behnel wrote:
> Alternatives I see:
>
> 1) Expose a struct that points to the extension module's PyModuleDef struct
> and the init function and expose that struct instead.
>
> 2) Expose both the PyModuleDef and the init function as public symbols.
>
> 3) Provide a public C function as entry point that returns both a
> PyModuleDef pointer and a module init function pointer.
>
> 4) Change the m_init function pointer in PyModuleDef_base from func(void)
> to func(PyObject*) iff the PyModuleDef struct is exposed as a public symbol.
>
> 5) Duplicate PyModuleDef and adapt the new one as in 4).
>
> Alternatives 1) and 2) only differ marginally by the number of public
> symbols being exposed. 3) has the advantage of supporting more advanced
> setups, e.g. heap allocation for the PyModuleDef struct. 4) is a hack and
> has the disadvantage that the signature of the module init function cannot
> be stored across reinitialisations (PyModuleDef has no "flags" or "state"
> field to remember it). 5) would fix that, i.e. we could add a proper
> pointer to the new module init function as well as a flags field for future
> extensions. A similar effect could be achieved by carefully designing the
> struct in 1).
>
> I think 1-3 are all reasonable ways to do this, although I don't think 3)
> will be necessary. 5) would be a clean fix, but has the disadvantage of
> duplicating an entire struct just to change one field in it.
>
> I'm currently leaning towards 1), with a struct that points to PyModuleDef,
> module init function and a flags field for future extensions. I understand
> that this would need to become part of the stable ABI, so explicit
> extensibility is important to keep up backwards compatibility.
>
> Opinions?

I believe a better option would be to migrate module creation over to a
dynamic PyModule_Slot and PyModule_Spec approach in the stable ABI,
similar to the one that was defined for types in PEP 384.

A related topic is that over on import-sig, we're currently tinkering
with the idea of changing the way *Python* module imports happen to
include a separate "ImportSpec" object (exact name TBC). The spec would
contain preliminary info on all of the things that the import system can
figure out *without* actually importing the module. That list includes
all the special attributes that are currently set on modules:

    __loader__
    __name__
    __package__
    __path__
    __file__
    __cached__

(Note that the attributes on the spec *may not* be the same as those in
the module's own namespace - for example, __name__ and __spec__.name
would differ in a module executed with -m, and __path__ and
__spec__.path would end up differing in packages that directly
manipulated their __path__ attribute during __init__ execution)

The intent is to clean up some of the ad hoc hackery that was needed to
make PEP 420 work, and reduce the amount of duplicated functionality
needed in loader implementations.

If you wanted to reboot this thread on import-sig, that would probably
be a good thing :)

Cheers,
Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stefan_ml at behnel.de Tue Aug 6 08:03:43 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 06 Aug 2013 08:03:43 +0200 Subject: [Python-Dev] Make extension module initialisation more like Python module initialisation In-Reply-To: References: Message-ID: Nick Coghlan, 06.08.2013 07:35: > If you wanted to reboot this thread on import-sig, that would probably > be a good thing :) Sigh. Yet another list to know about and temporarily follow... The import-sig list doesn't seem to be mirrored on Gmane yet. Also, it claims to be dead w.r.t. Py3.4: """ The intent is that this SIG will be re-retired after Python 3.3 is released. """ -> http://www.python.org/community/sigs/current/import-sig/ """ Resurrected for landing PEP 382 in Python 3.3. """ -> http://mail.python.org/mailman/listinfo/import-sig Seriously, wouldn't python-dev be just fine for this? It's not like the import system is going to be rewritten for each minor release from now on. Stefan From ncoghlan at gmail.com Tue Aug 6 09:09:45 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 6 Aug 2013 17:09:45 +1000 Subject: [Python-Dev] Make extension module initialisation more like Python module initialisation In-Reply-To: References: Message-ID: On 6 August 2013 16:03, Stefan Behnel wrote: > Seriously, wouldn't python-dev be just fine for this? It's not like the > import system is going to be rewritten for each minor release from now on. We currently use it whenever we're doing a deep dive into import system arcana, so python-dev only needs to worry about the question once it's a clearly viable proposal. I think the other thread will be quite relevant to the topic you're interested in, since we hadn't even considered extension modules yet. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From fuzzyman at voidspace.org.uk Tue Aug 6 10:25:14 2013 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 6 Aug 2013 11:25:14 +0300 Subject: [Python-Dev] unittest.TestSuite holding references to unittest.TestCase instances too long In-Reply-To: References: <20130803160725.1D1602500CA@webabinitio.net> Message-ID: <43DD615E-258B-4FC0-BB80-7EB16B4C4A90@voidspace.org.uk> On 6 Aug 2013, at 00:43, Matt McClure wrote: > On Sat, Aug 3, 2013 at 3:27 PM, Michael Foord wrote: > It smells to me like a new feature rather than a bugfix, and it's a moderately big change. I don't think it can be backported to 2.7 other than through unittest2. > > Is http://hg.python.org/unittest2 the place to backport to unittest2? > It is, but... unittest itself has changed so extensively since the last release of unittest2 that I'm not sure whether a completely fresh start for unittest2 might be needed. (Although I intend to do another bugfix release of this version as well.) 
Making unittest2 will involve:

* Taking the Python 3 unittest and porting code plus tests to run on
  Python 2
* Running the new plus old tests (removing duplications) to ensure no
  functionality was lost (for example, string handling will be wildly
  different, so some tests may have been removed in Python 3 that are
  still relevant to Python 2)
* Fixing the failing tests and deciding whether new features that depend
  on later versions of Python (particularly around improvements to the
  inspect module and stack frames) even *can* be backported
* unittest2 classes all need to inherit from the unittest versions (and
  do some appropriate super calls because of the extra base class) -
  plus there are some tests for the differences
* there is a setuptools compatible test runner and the unit2 script
  additional to vanilla unittest
* documentation updates - new features and differences
* stuff I've forgotten

So it's a pretty big job, but not insurmountable :-)

A version that targets Python 3.2 would also be useful - it *may* be
possible to do this in a single codebase.

The current approach is to have two codebases (unittest2 and
unittest2py3k). The reason for this is that it's rather easier to
generate a Python 3 backport by applying a few patches than it is to
generate the Python 2 version - so a Python 3 backport can be much
simpler. It makes life harder for projects that use unittest2, though,
as which project they need as test runner depends on whether they are
being run on Python 2 or Python 3 - so a single codebase (or single
project anyway) would be better.

All the best,

Michael Foord

> --
> Matt McClure
> http://matthewlmcclure.com
> http://www.mapmyfitness.com/profile/matthewlmcclure

--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing
http://www.sqlite.org/different.html

From rdmurray at bitdance.com  Tue Aug  6 11:44:15 2013
From: rdmurray at bitdance.com (R. David Murray)
Date: Tue, 06 Aug 2013 05:44:15 -0400
Subject: [Python-Dev] The Return Of Argument Clinic
In-Reply-To: <52003B03.6090204@hastings.org>
References: <51FF66DD.1020403@hastings.org> <52003B03.6090204@hastings.org>
Message-ID: <20130806094415.79464250028@webabinitio.net>

On Mon, 05 Aug 2013 16:53:39 -0700, Larry Hastings wrote:
> Let me put it this way: Which is more surprising to the person
> unfamiliar with the code? That this __init__ doesn't get all the
> parameters, and the base class __init__ is getting called
> automatically? Or that this funny function "custom_init" is what gets
> called, and this class is not allowed to have a function called __init__?

Definitely the former is more surprising. Especially since, as Nick
points out, the last part of your statement isn't true: there can be a
function called __init__, it just has to replicate the superclass logic
if it exists, which is the way Python normally works.

I use this "call a hook method from __init__" pattern in the email
package's new header parsing code, by the way, for whatever that is
worth :)

--David

From solipsis at pitrou.net  Tue Aug  6 14:12:03 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 6 Aug 2013 14:12:03 +0200
Subject: [Python-Dev] PEP 442 clarification for type hierarchies
References: <20130805205109.4cc384aa@fsol> <20130805212630.2472c0c6@fsol>
Message-ID: <20130806141203.53d3cb71@pitrou.net>

Le Mon, 05 Aug 2013 22:30:29 +0200,
Stefan Behnel a écrit :
>
> Hmm, it's a bit unfortunate that tp_finalize() maps so directly to
> __del__(), but I think this can be fixed. In any case, each
> tp_finalize() function must only ever be called once, so if a subtype
> inherited the tp_finalize() slot from its parent, it mustn't be
> called again.

This is already dealt with by a custom bit in the GC header (cf.
_PyGC_IS_FINALIZED, IIRC).

> >> An obvious open question is how to deal with exceptions during
> >> finalisation. Any break in the execution chain would mean that a
> >> part of the type wouldn't be finalised.
> >
> > Let's come back to pure Python:
> >
> > class A:
> >     def __del__(self):
> >         1/0
> >
> > class B(A):
> >     def __del__(self):
> >         super().__del__()
> >         self.cleanup_resources()
>
> What makes you think it's a good idea to call the parent type's
> finaliser before doing the local finalisation, and not the other way
> round? What if the subtype needs access to parts of the super type
> for its cleanup?

I'm not saying it's a good idea. I'm just saying that to reason about
the C API, it is a good idea to reason about equivalent pure Python
code. Since exceptions aren't implicitly silenced in pure Python code,
they probably shouldn't be in C code.

> In other words, which makes more sense (at the C level):
>
>     try:
>         super().tp_finalize()
>     finally:
>         local_cleanup()
>
> or
>
>     try:
>         local_cleanup()
>     finally:
>         super().tp_finalize()
>
> Should that order be part of the protocol or not? (well, not for
> __del__() I guess, but maybe for tp_finalize()?)

No, it is left to the user's preference. Since tp_finalize() is meant
to be equivalent to __del__(), I think it's better if the protocols
aren't subtly different (to the extent to which it is possible, of
course).

> Coming back to the __del__() vs. tp_finalize() story, if tp_finalize()
> first recursed into the super types, the top-most one of which then
> calls __del__() and returns, we'd get an execution order that runs
> Python-level __del__() methods before C-level tp_finalize()
> functions, but lose the subtype-before-supertype execution order for
> tp_finalize() functions.

Well... to get that, you'd have to subclass a pure Python class with a
C extension type. Does that ever happen?
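The ordering question in the quoted exchange can be tried at the Python
level (a sketch; the class and method names are illustrative, and the
immediate finalisation relies on CPython's reference counting):

```python
calls = []

class A:
    def __del__(self):
        calls.append("A.__del__")

class B(A):
    def __del__(self):
        try:
            # subtype-first: release our own resources before the parent's
            self.cleanup_resources()
        finally:
            super().__del__()

    def cleanup_resources(self):
        calls.append("B.cleanup_resources")

b = B()
del b  # CPython's refcounting runs the finaliser immediately
print(calls)  # ['B.cleanup_resources', 'A.__del__']
```

Swapping the two statements inside ``B.__del__`` gives the
supertype-first order instead; the ``try``/``finally`` ensures the other
half of the cleanup still runs if one half raises.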
> That might call for a three-step cleanup: > > 1) run all Python __del__() methods recursively > 2) run all tp_finalize() functions recursively > 3) run tp_dealloc() recursively I don't see any reason why tp_finalize should be distinct from __del__, while e.g. __init__ and tp_init map to the exact same thing. (you might wonder why tp_finalize isn't called tp_del, but that's because there is already something named tp_del - something which is obsoleted by PEP 442, incidentally ;-)). Regards Antoine. From brett at python.org Tue Aug 6 17:13:08 2013 From: brett at python.org (Brett Cannon) Date: Tue, 6 Aug 2013 11:13:08 -0400 Subject: [Python-Dev] The Return Of Argument Clinic In-Reply-To: References: <51FF66DD.1020403@hastings.org> <52003B03.6090204@hastings.org> Message-ID: On Tue, Aug 6, 2013 at 12:59 AM, Nick Coghlan wrote: > On 6 August 2013 09:53, Larry Hastings wrote: > > On 08/05/2013 02:55 AM, Nick Coghlan wrote: > > On 5 August 2013 18:48, Larry Hastings wrote: > > > > Question 0: How should we integrate Clinic into the build process? > > > > Isn't solving the bootstrapping problem the reason for checking in the > > clinic-generated output? If there's no Python available, we build what > > we have (without the clinic step), then we build it again *with* the > > clinic step. > > > > It solves the bootstrapping problem, but that's not the only problem > Clinic > > presents to the development workflow. > > > > If you modify some Clinic DSL in a C file in the CPython tree, then run > > "make", should the Makefile re-run Clinic over that file? If you say > "no", > > then there's no problem. If you say "yes", then we have the problem I > > described. > > Ah, I think I see the problem you mean. What matters is what is defined in the > makefile as the way to regenerate an object file from the C file. If > it is "run clinic and then run the compiler", then you will get a > dependency loop. If it doesn't implicitly run clinic, then we risk > checking in inconsistent clinic metadata.
> > I think the simplest answer may be to have "make clinic" as an > explicit command, along with a commit hook that checks for clinic > metadata consistency. Then "make" doesn't have to change and there's > no nasty bootstrapping problem. Can't we just do what we already do for the generated AST code, or what we used to do for importlib's frozen code? We have the touch extension for hg integration for this kind of issue. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan_ml at behnel.de Tue Aug 6 17:18:59 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 06 Aug 2013 17:18:59 +0200 Subject: [Python-Dev] PEP 442 clarification for type hierarchies In-Reply-To: <20130806141203.53d3cb71@pitrou.net> References: <20130805205109.4cc384aa@fsol> <20130805212630.2472c0c6@fsol> <20130806141203.53d3cb71@pitrou.net> Message-ID: Antoine Pitrou, 06.08.2013 14:12: > Le Mon, 05 Aug 2013 22:30:29 +0200, > Stefan Behnel a écrit : >> >> Hmm, it's a bit unfortunate that tp_finalize() maps so directly to >> __del__(), but I think this can be fixed. In any case, each >> tp_finalize() function must only ever be called once, so if a subtype >> inherited the tp_finalize() slot from its parent, it mustn't be >> called again. > > This is already dealt with by a custom bit in the GC header (cf. > _PyGC_IS_FINALIZED, IIRC). But that's only at an instance level. If a type in the hierarchy inherited the slot function for tp_finalize() from its parent, then the child must skip its parent in the call chain to prevent calling the same slot function twice. No instance flag can help you here. >>>> An obvious open question is how to deal with exceptions during >>>> finalisation. Any break in the execution chain would mean that a >>>> part of the type wouldn't be finalised.
>>> >>> Let's come back to pure Python: >>> >>> class A: >>> def __del__(self): >>> 1/0 >>> >>> class B(A): >>> def __del__(self): >>> super().__del__() >>> self.cleanup_resources() >> >> What makes you think it's a good idea to call the parent type's >> finaliser before doing the local finalisation, and not the other way >> round? What if the subtype needs access to parts of the super type >> for its cleanup? > > I'm not saying it's a good idea. I'm just saying that to reason about > the C API, it is a good idea to reason about equivalent pure Python > code. Since exceptions aren't implicitly silenced in pure Python code, > they probably shouldn't in C code. > >> In other words, which makes more sense (at the C level): >> >> try: >> super().tp_finalize() >> finally: >> local_cleanup() >> >> or >> >> try: >> local_cleanup() >> finally: >> super().tp_finalize() >> >> Should that order be part of the protocol or not? (well, not for >> __del__() I guess, but maybe for tp_finalize()?) > > No, it is left to the user's preference. Since tp_finalize() is meant > to be equivalent to __del__(), I think it's better if the protocols > aren't subtly different (to the extent to which it is possible, of > course). Ok, fine with me. If the calls are done recursively anyway, then the child can decide when to call into its parent. >> Coming back to the __del__() vs. tp_finalize() story, if tp_finalize() >> first recursed into the super types, the top-most one of which then >> calls __del__() and returns, we'd get an execution order that runs >> Python-level __del__() methods before C-level tp_finalize() >> functions, but lose the subtype-before-supertype execution order for >> tp_finalize() functions. > > Well... to get that, you'd have to subclass a pure Python class with a > C extension type. Maybe I'm wrong here. It's the default implementation of tp_finalize() that calls __del__, right?
If a Python class with a __del__ inherits from an extension type that implements tp_finalize(), then whose tp_finalize() will be executed first? The one of the Python class or the one of the extension type? Stefan From solipsis at pitrou.net Tue Aug 6 17:49:08 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 6 Aug 2013 17:49:08 +0200 Subject: [Python-Dev] PEP 442 clarification for type hierarchies References: <20130805205109.4cc384aa@fsol> <20130805212630.2472c0c6@fsol> <20130806141203.53d3cb71@pitrou.net> Message-ID: <20130806174908.3e335c52@pitrou.net> Le Tue, 06 Aug 2013 17:18:59 +0200, Stefan Behnel a écrit : > Antoine Pitrou, 06.08.2013 14:12: > > Le Mon, 05 Aug 2013 22:30:29 +0200, > > Stefan Behnel a écrit : > >> > >> Hmm, it's a bit unfortunate that tp_finalize() maps so directly to > >> __del__(), but I think this can be fixed. In any case, each > >> tp_finalize() function must only ever be called once, so if a > >> subtype inherited the tp_finalize() slot from its parent, it > >> mustn't be called again. > > > > This is already dealt with by a custom bit in the GC header (cf. > > _PyGC_IS_FINALIZED, IIRC). > > But that's only at an instance level. If a type in the hierarchy > inherited the slot function for tp_finalize() from its parent, then > the child must skip its parent in the call chain to prevent calling > the same slot function twice. No instance flag can help you here. Ah, sorry. I had misunderstood what you were talking about. Yes, you're right, a tp_finalize implementation should avoid calling itself recursively. If there's some C API that can be added to ease it, I'm ok for adding it. > Maybe I'm wrong here. It's the default implementation of > tp_finalize() that calls __del__, right? Yes. > If a Python class with a > __del__ inherits from an extension type that implements > tp_finalize(), then whose tp_finalize() will be executed first? Then only the Python __del__ gets called.
It should call super().__del__() manually, to ensure the extension type's tp_finalize gets called. Regards Antoine. From rymg19 at gmail.com Tue Aug 6 17:02:50 2013 From: rymg19 at gmail.com (Ryan) Date: Tue, 06 Aug 2013 10:02:50 -0500 Subject: [Python-Dev] Make extension module initialisation more like Python module initialisation In-Reply-To: References: Message-ID: Nice idea, but some of those may break 3rd party libraries like Boost.Python that have their own equivalent of the Python/C API. Or even SWIG might experience trouble in one or two of those. Stefan Behnel wrote: >Hi, > >let me revive and summarize this old thread. > >Stefan Behnel, 08.11.2012 13:47: >> I suspect that this will be put into a proper PEP at some point, but >I'd >> like to bring this up for discussion first. This came out of issues >13429 >> and 16392. >> >> http://bugs.python.org/issue13429 >> >> http://bugs.python.org/issue16392 >> >> >> The problem >> =========== >> >> Python modules and extension modules are not being set up in the same >way. >> For Python modules, the module is created and set up first, then the >module >> code is being executed. For extensions, i.e. shared libraries, the >module >> init function is executed straight away and does both the creation >and >> initialisation. This means that it knows neither the __file__ it is >being >loaded from nor its package (i.e. its FQMN). This hinders relative >imports >and resource loading. In Py3, it's also not being added to >sys.modules, >which means that a (potentially transitive) re-import of the module >will >really try to reimport it and thus run into an infinite loop when it >> executes the module init function again. And without the FQMN, it's >not >trivial to correctly add the module to sys.modules either. >> >> We specifically run into this for Cython generated modules, for which >it's >not uncommon that the module init code has the same level of >complexity as >that of any 'regular' Python module.
Also, the lack of a FQMN and >correct >> file path hinders the compilation of __init__.py modules, i.e. >packages, >> especially when relative imports are being used at module init time. > >The outcome of this discussion was that the extension module import >protocol needs to change in order to provide all necessary information >to >the module init function. > >Brett Cannon proposed to move the module object creation into the >extension >module importer, i.e. outside of the user provided module init >function. >CPython would then load the extension module, create and initialise the >module object (set __file__, __name__, etc.) and pass it into the >module >init function. > >I proposed to make the PyModuleDef struct the new entry point instead >of >just a generic C function, as that would give the module importer all >necessary information about the module to create the module object. The >only missing bit is the entry point for the new module init function. > >Nick Coghlan objected to the proposal of simply extending PyModuleDef >with >an initialiser function, as the struct is part of the stable ABI. > >Alternatives I see: > >1) Expose a struct that points to the extension module's PyModuleDef >struct >and the init function and expose that struct instead. > >2) Expose both the PyModuleDef and the init function as public symbols. > >3) Provide a public C function as entry point that returns both a >PyModuleDef pointer and a module init function pointer. > >4) Change the m_init function pointer in PyModuleDef_base from >func(void) >to func(PyObject*) iff the PyModuleDef struct is exposed as a public >symbol. > >5) Duplicate PyModuleDef and adapt the new one as in 4). > >Alternatives 1) and 2) only differ marginally by the number of public >symbols being exposed. 3) has the advantage of supporting more advanced >setups, e.g. heap allocation for the PyModuleDef struct. 
4) is a hack >and >has the disadvantage that the signature of the module init function >cannot >be stored across reinitialisations (PyModuleDef has no "flags" or >"state" >field to remember it). 5) would fix that, i.e. we could add a proper >pointer to the new module init function as well as a flags field for >future >extensions. A similar effect could be achieved by carefully designing >the >struct in 1). > >I think 1-3 are all reasonable ways to do this, although I don't think >3) >will be necessary. 5) would be a clean fix, but has the disadvantage of >duplicating an entire struct just to change one field in it. > >I'm currently leaning towards 1), with a struct that points to >PyModuleDef, >module init function and a flags field for future extensions. I >understand >that this would need to become part of the stable ABI, so explicit >extensibility is important to keep up backwards compatibility. > >Opinions? > >Stefan > > >_______________________________________________ >Python-Dev mailing list >Python-Dev at python.org >http://mail.python.org/mailman/listinfo/python-dev >Unsubscribe: >http://mail.python.org/mailman/options/python-dev/rymg19%40gmail.com -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan_ml at behnel.de Tue Aug 6 17:59:06 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 06 Aug 2013 17:59:06 +0200 Subject: [Python-Dev] Make extension module initialisation more like Python module initialisation In-Reply-To: References: Message-ID: Ryan, 06.08.2013 17:02: > Nice idea, but some of those may break 3rd party libraries like Boost. > Python that have their own equivalent of the Python/C API. Or even SWIG > might experience trouble in one or two of those. The idea is that this will be an alternative way of initialising a module that CPython will only use if an extension module exports the corresponding symbol.
So it won't break existing code, neither source code nor binaries. Stefan From stefan_ml at behnel.de Tue Aug 6 18:38:51 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 06 Aug 2013 18:38:51 +0200 Subject: [Python-Dev] PEP 442 clarification for type hierarchies In-Reply-To: <20130806174908.3e335c52@pitrou.net> References: <20130805205109.4cc384aa@fsol> <20130805212630.2472c0c6@fsol> <20130806141203.53d3cb71@pitrou.net> <20130806174908.3e335c52@pitrou.net> Message-ID: Antoine Pitrou, 06.08.2013 17:49: > Le Tue, 06 Aug 2013 17:18:59 +0200, > Stefan Behnel a ?crit : >> If a Python class with a >> __del__ inherits from an extension type that implements >> tp_finalize(), then whose tp_finalize() will be executed first? > > Then only the Python __del__ gets called. It should call > super().__del__() manually, to ensure the extension type's > tp_finalize gets called. Ok, but then all I have to do in order to disable C level finalisation for a type is to inherit from it and provide an empty __del__ method. I think that disqualifies the feature for the use in Cython. Finalisation at the Python level is nice, but at the C level it's usually vital. I had originally read this PEP as a way to get better guarantees than what dealloc can provide, but your above statement makes it rather the opposite. 
Stefan From solipsis at pitrou.net Tue Aug 6 19:29:42 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 6 Aug 2013 19:29:42 +0200 Subject: [Python-Dev] PEP 442 clarification for type hierarchies References: <20130805205109.4cc384aa@fsol> <20130805212630.2472c0c6@fsol> <20130806141203.53d3cb71@pitrou.net> <20130806174908.3e335c52@pitrou.net> Message-ID: <20130806192942.5f4cd908@fsol> On Tue, 06 Aug 2013 18:38:51 +0200 Stefan Behnel wrote: > Antoine Pitrou, 06.08.2013 17:49: > > Le Tue, 06 Aug 2013 17:18:59 +0200, > > Stefan Behnel a ?crit : > >> If a Python class with a > >> __del__ inherits from an extension type that implements > >> tp_finalize(), then whose tp_finalize() will be executed first? > > > > Then only the Python __del__ gets called. It should call > > super().__del__() manually, to ensure the extension type's > > tp_finalize gets called. > > Ok, but then all I have to do in order to disable C level finalisation for > a type is to inherit from it and provide an empty __del__ method. > > I think that disqualifies the feature for the use in Cython. Finalisation > at the Python level is nice, but at the C level it's usually vital. I had > originally read this PEP as a way to get better guarantees than what > dealloc can provide, but your above statement makes it rather the opposite. Anything vital should probably be ensured by tp_dealloc. For example, you might close an fd early in tp_finalize, but also ensure it gets closed in tp_dealloc in the case tp_finalize wasn't called. (that said, you can also have fd leaks in pure Python...) Regards Antoine. 
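The pattern Antoine describes (finalize early, but keep the cleanup idempotent so a later backstop can repeat it harmlessly) has a straightforward pure-Python analogue. The class names below are invented for illustration; this is only a sketch of the idea, not code from the thread:

```python
import os

class Resource:
    """Owns a raw file descriptor; cleanup is written to be idempotent."""

    def __init__(self, path):
        self._fd = os.open(path, os.O_RDONLY)

    def close(self):
        # Safe to call more than once: the analogue of letting tp_dealloc
        # re-run the cleanup in case tp_finalize never ran.
        if self._fd is not None:
            os.close(self._fd)
            self._fd = None

    def __del__(self):
        # The analogue of tp_finalize: best-effort early cleanup.
        self.close()

class Wrapper(Resource):
    def __del__(self):
        try:
            pass  # subclass-local cleanup would go here
        finally:
            # Chain to the parent finaliser so the fd cannot leak;
            # the ordering is left to the subclass, as discussed above.
            super().__del__()
```

Whether `__del__` runs once, twice, or not at all, `close()` remains the safety net, which is the role the message above assigns to `tp_dealloc`.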
From matthewlmcclure at gmail.com Tue Aug 6 20:05:42 2013 From: matthewlmcclure at gmail.com (Matt McClure) Date: Tue, 6 Aug 2013 14:05:42 -0400 Subject: [Python-Dev] unittest.TestSuite holding references to unittest.TestCase instances too long In-Reply-To: <43DD615E-258B-4FC0-BB80-7EB16B4C4A90@voidspace.org.uk> References: <20130803160725.1D1602500CA@webabinitio.net> <43DD615E-258B-4FC0-BB80-7EB16B4C4A90@voidspace.org.uk> Message-ID: Hi Michael, On Tue, Aug 6, 2013 at 4:25 AM, Michael Foord wrote: > unittest itself has changed so extensively since the last release of > unittest2 that I'm not sure whether a completely fresh start for unittest2 > might be needed. (Although I intend to do another bugfix release of this > version as well.) > > Making unittest2 will involve: > > * Taking the Python 3 unittest and porting code plus tests to run > on python 2 > [ ... ] > I took a different approach and simply applied the patch of diffs[1] from the Python 3 issue to the unittest2 tip. There was a small amount of renaming "unittest" to "unittest2" required, but other than that, the patch applied pretty cleanly, and seems to pass the unit tests and avoid the ever-increasing memory problem in my private test suite. Do you think it's sufficient to port just this feature? Or if not, what am I missing that requires resyncing more of unittest2 with the changes from Python 3? Is Google Code[2] still the right place for unittest2 issues? I found that via PyPI[3]. It looks like there have been a lot of commits in the unittest2 repository since the last PyPI release (2010-07-12 -- 0.5.1). Would you plan to do another PyPI release of unittest2 with this feature? Or would you recommend using unittest2 from the repository to get it? Or am I missing a more recent packaged release somewhere else? 
[1]: https://bitbucket.org/matthewlmcclure/unittest2/compare/issue11798-tip..issue11798-base#diff [2]: https://code.google.com/p/unittest-ext/issues/detail?id=76&sort=-id [3]: https://pypi.python.org/pypi/unittest2 -- Matt McClure http://matthewlmcclure.com http://www.mapmyfitness.com/profile/matthewlmcclure -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Tue Aug 6 21:26:41 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 6 Aug 2013 21:26:41 +0200 Subject: [Python-Dev] Our failure at handling GSoC students Message-ID: <20130806212641.64b0e851@fsol> Hello, I would like to point out that we currently fail at handling GSoC projects and bringing them to completion. One cruel example is the set of PEP 3121 / PEP 384 refactorings done by Robin Schreiber: http://bugs.python.org/issue?%40columns=id%2Cactivity%2Ctitle%2Ccreator%2Cassignee%2Cstatus%2Ctype&%40sort=-activity&%40filter=status&%40action=searchid&ignore=file%3Acontent&%40search_text=pep+3121&submit=search&status=-1%2C1%2C3 Robin has produced many patches that seem to reach the stated goal (refactor C extension modules to take advantage of the latest PEPs about module initialization and extension types definition). Unfortunately, tackling both goals at the same time produces big patches with a lot of churn; and it is also not obvious the PEP 384 refactoring is useful for the stdlib (while the PEP 3121 refactoring definitely is). What didn't produce an alarm during Robin's work is that GSoC work is done in private. Therefore, other core developers than the mentor don't get to give an advice early, as would happen with any normal proposal done publicly (on the mailing-list or on the bug tracker). It is also likely that the mentor gets overworked after the GSoC period is over, is unable to finalize the patch and push it, and other core devs have a hard time catching up on the work and don't know what the shortcomings are. Regards Antoine. 
From eliben at gmail.com Tue Aug 6 21:43:40 2013 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 6 Aug 2013 12:43:40 -0700 Subject: [Python-Dev] Our failure at handling GSoC students In-Reply-To: <20130806212641.64b0e851@fsol> References: <20130806212641.64b0e851@fsol> Message-ID: On Tue, Aug 6, 2013 at 12:26 PM, Antoine Pitrou wrote: > > Hello, > > I would like to point out that we currently fail at handling GSoC > projects and bringing them to completion. > > One cruel example is the set of PEP 3121 / PEP 384 refactorings done by > Robin Schreiber: > > http://bugs.python.org/issue?%40columns=id%2Cactivity%2Ctitle%2Ccreator%2Cassignee%2Cstatus%2Ctype&%40sort=-activity&%40filter=status&%40action=searchid&ignore=file%3Acontent&%40search_text=pep+3121&submit=search&status=-1%2C1%2C3 > > Robin has produced many patches that seem to reach the stated goal > (refactor C extension modules to take advantage of the latest PEPs > about module initialization and extension types definition). > Unfortunately, tackling both goals at the same time produces big > patches with a lot of churn; and it is also not obvious the PEP 384 > refactoring is useful for the stdlib (while the PEP 3121 refactoring > definitely is). > > What didn't produce an alarm during Robin's work is that GSoC work is > done in private. Therefore, other core developers than the mentor don't > get to give an advice early, as would happen with any normal proposal > done publicly (on the mailing-list or on the bug tracker). It is also > likely that the mentor gets overworked after the GSoC period is over, > is unable to finalize the patch and push it, and other core devs have a > hard time catching up on the work and don't know what the shortcomings > are. > I would like to point out something that stands out in this list of issues: such a method of producing dozens of patches simultaneously is extremely unwise, unless there's a crucial piece of history I'm missing. 
It is much more prudent to start with one or two exemplary modules, and if those fully pass code review, send out patches for others. The reason is obvious - code review may turn up problems or requests for change. Going backwards to modify 57 patches is not something anyone would want to do. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From skip at pobox.com Tue Aug 6 21:45:50 2013 From: skip at pobox.com (Skip Montanaro) Date: Tue, 6 Aug 2013 14:45:50 -0500 Subject: [Python-Dev] Our failure at handling GSoC students In-Reply-To: <20130806212641.64b0e851@fsol> References: <20130806212641.64b0e851@fsol> Message-ID: > It is also likely that the mentor gets overworked after the GSoC period is over, > is unable to finalize the patch and push it... Given that Python development is done using a good DVCS now, it seems that if each manageable chunk of changes is done on a separate branch, the likelihood of acceptance of any one change goes way up (as it's much easier to analyze in isolation), and the likelihood that one small change nukes the entire collective patch goes way down. I don't know if that will address all concerns and improve the success rate, but I would personally find it easier to process 100 changes, each with 37 patch chunks than one change having 3700 chunks. Smaller haystacks make it easier to find the needles. In addition, there should be less pressure for someone to analyze the entire lot. If you get burned out at change 12, others should be there to pick up from change 13 without having to start over, re-analyzing changes 1 through 12. 
Skip From solipsis at pitrou.net Tue Aug 6 21:51:08 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 6 Aug 2013 21:51:08 +0200 Subject: [Python-Dev] Our failure at handling GSoC students In-Reply-To: References: <20130806212641.64b0e851@fsol> Message-ID: <20130806215108.57c684e9@fsol> On Tue, 6 Aug 2013 12:43:40 -0700 Eli Bendersky wrote: > On Tue, Aug 6, 2013 at 12:26 PM, Antoine Pitrou wrote: > > > > Hello, > > > > I would like to point out that we currently fail at handling GSoC > > projects and bringing them to completion. > > > > One cruel example is the set of PEP 3121 / PEP 384 refactorings done by > > Robin Schreiber: > > > > http://bugs.python.org/issue?%40columns=id%2Cactivity%2Ctitle%2Ccreator%2Cassignee%2Cstatus%2Ctype&%40sort=-activity&%40filter=status&%40action=searchid&ignore=file%3Acontent&%40search_text=pep+3121&submit=search&status=-1%2C1%2C3 > > > > Robin has produced many patches that seem to reach the stated goal > > (refactor C extension modules to take advantage of the latest PEPs > > about module initialization and extension types definition). > > Unfortunately, tackling both goals at the same time produces big > > patches with a lot of churn; and it is also not obvious the PEP 384 > > refactoring is useful for the stdlib (while the PEP 3121 refactoring > > definitely is). > > > > What didn't produce an alarm during Robin's work is that GSoC work is > > done in private. Therefore, other core developers than the mentor don't > > get to give an advice early, as would happen with any normal proposal > > done publicly (on the mailing-list or on the bug tracker). It is also > > likely that the mentor gets overworked after the GSoC period is over, > > is unable to finalize the patch and push it, and other core devs have a > > hard time catching up on the work and don't know what the shortcomings > > are. 
> > > > I would like to point out something that stands out in this list of issues: > such a method of producing dozens of patches simultaneously is extremely > unwise, unless there's a crucial piece of history I'm missing. It is much > more prudent to start with one or two exemplary modules, and if those fully > pass code review, send out patches for others. The reason is obvious - code > review may turn up problems or requests for change. Going backwards to > modify 57 patches is not something anyone would want to do. I definitely agree, but this is part of our failure too. A beginner contributor isn't supposed to know the best way to contribute if nobody tells him/her beforehand. Regards Antoine. From fred at fdrake.net Tue Aug 6 22:07:28 2013 From: fred at fdrake.net (Fred Drake) Date: Tue, 6 Aug 2013 16:07:28 -0400 Subject: [Python-Dev] Our failure at handling GSoC students In-Reply-To: <20130806215108.57c684e9@fsol> References: <20130806212641.64b0e851@fsol> <20130806215108.57c684e9@fsol> Message-ID: On Tue, Aug 6, 2013 at 3:51 PM, Antoine Pitrou wrote: > I definitely agree, but this is part of our failure too. I'd say this is strictly our failure, not the students'. This isn't really a new problem, I don't think, though the shape of this collection of patches makes it obvious. I haven't been active with GSoC the last couple of years, but if we don't have any sort of guide for mentors, we probably should, and this is an issue that should be mentioned as one that requires discussion with the students. That's our role as a community and as mentors when it comes to GSoC. -Fred -- Fred L. Drake, Jr. "A storm broke loose in my mind." 
--Albert Einstein From ethan at stoneleaf.us Tue Aug 6 22:42:18 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 06 Aug 2013 13:42:18 -0700 Subject: [Python-Dev] a Constant addition to enum Message-ID: <52015FAA.3090805@stoneleaf.us> A question came up on stackoverflow asking about the Planet example and the need to have the constant G defined in the method instead of at the class level: http://stackoverflow.com/q/17911188/208880 Since methods and descriptors are immune to enumeration my proposed solution created a Constant descriptor that could be used to keep class level constants at the class level. It's not complex, only about 7 lines. Should we have something like that included in the enum module? If we do include something like that, should it be constant, or should it be more like property? (The important differences from property being that class access still returns the value, not the property itself, and setting the class-level value changes the value but doesn't replace the property.) -- ~Ethan~ From fuzzyman at voidspace.org.uk Tue Aug 6 23:08:19 2013 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 7 Aug 2013 00:08:19 +0300 Subject: [Python-Dev] unittest.TestSuite holding references to unittest.TestCase instances too long In-Reply-To: References: <20130803160725.1D1602500CA@webabinitio.net> <43DD615E-258B-4FC0-BB80-7EB16B4C4A90@voidspace.org.uk> Message-ID: <584F5F70-5108-4B91-867D-A47E0709D97D@voidspace.org.uk> On 6 Aug 2013, at 21:05, Matt McClure wrote: > Hi Michael, > > On Tue, Aug 6, 2013 at 4:25 AM, Michael Foord wrote: > unittest itself has changed so extensively since the last release of unittest2 that I'm not sure whether a completely fresh start for unittest2 might be needed. (Although I intend to do another bugfix release of this version as well.) > > Making unittest2 will involve: > > * Taking the Python 3 unittest and porting code plus tests to run on python 2 > [ ... 
] > > I took a different approach and simply applied the patch of diffs[1] from the Python 3 issue to the unittest2 tip. > > There was a small amount of renaming "unittest" to "unittest2" required, but other than that, the patch applied pretty cleanly, and seems to pass the unit tests and avoid the ever-increasing memory problem in my private test suite. > > Do you think it's sufficient to port just this feature? Or if not, what am I missing that requires resyncing more of unittest2 with the changes from Python 3? > > Is Google Code[2] still the right place for unittest2 issues? I found that via PyPI[3]. > > It looks like there have been a lot of commits in the unittest2 repository since the last PyPI release (2010-07-12 -- 0.5.1). Would you plan to do another PyPI release of unittest2 with this feature? Or would you recommend using unittest2 from the repository to get it? Or am I missing a more recent packaged release somewhere else? > I plan to do a bugfix release which fixes bugs in unittest2 since the PyPI release. (Foolishly I don't think I tagged so I need to work out which revision corresponds to the released version.) I will also do a new release with *all* new features. I won't do a release with just this new feature no. For unittest2 specific issues, yes google code is still the correct issue tracker. Michael > [1]: https://bitbucket.org/matthewlmcclure/unittest2/compare/issue11798-tip..issue11798-base#diff > [2]: https://code.google.com/p/unittest-ext/issues/detail?id=76&sort=-id > [3]: https://pypi.python.org/pypi/unittest2 > > -- > Matt McClure > http://matthewlmcclure.com > http://www.mapmyfitness.com/profile/matthewlmcclure -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. 
-- the sqlite blessing http://www.sqlite.org/different.html From solipsis at pitrou.net Tue Aug 6 23:24:34 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 6 Aug 2013 23:24:34 +0200 Subject: [Python-Dev] The Return Of Argument Clinic References: <51FF66DD.1020403@hastings.org> <52003B03.6090204@hastings.org> Message-ID: <20130806232434.363fc00f@fsol> On Mon, 05 Aug 2013 16:53:39 -0700 Larry Hastings wrote: > > On 08/05/2013 02:55 AM, Nick Coghlan wrote: > > On 5 August 2013 18:48, Larry Hastings wrote: > >> Question 0: How should we integrate Clinic into the build process? > > Isn't solving the bootstrapping problem the reason for checking in the > > clinic-generated output? If there's no Python available, we build what > > we have (without the clinic step), then we build it again *with* the > > clinic step. > > It solves the bootstrapping problem, but that's not the only problem > Clinic presents to the development workflow. > > If you modify some Clinic DSL in a C file in the CPython tree, then run > "make", should the Makefile re-run Clinic over that file? If you say > "no", then there's no problem. If you say "yes", then we have the > problem I described. I say "yes" and I think best-effort is the solution. Usually, the current clinic should be good enough to compile future C changes. If it isn't, just revert your working copy and start again (save your changes and re-apply them if desired). importlib has the same theoretical problem but it works well enough in practice, even though it could be maddening at times when the code wasn't quite stabilized. Regards Antoine. 
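The commit-hook consistency check suggested earlier in the thread could be as small as re-hashing each generated block and comparing it to an embedded checksum. The marker syntax below is invented for illustration and is not the real Clinic format:

```python
import hashlib
import re

# Invented marker syntax, for illustration only.
BEGIN = re.compile(r"/\*\[clinic begin generated\]\*/")
END = re.compile(r"/\*\[clinic end generated: checksum=([0-9a-f]+)\]\*/")

def generated_blocks_consistent(text):
    """Return True iff every generated block's checksum matches its body."""
    pos = 0
    while True:
        begin = BEGIN.search(text, pos)
        if begin is None:
            return True  # no (more) generated blocks: consistent
        end = END.search(text, begin.end())
        if end is None:
            return False  # unterminated generated block
        body = text[begin.end():end.start()]
        if hashlib.sha1(body.encode()).hexdigest()[:8] != end.group(1):
            return False  # block edited without re-running the generator
        pos = end.end()
```

A hook like this rejects commits where the C file was edited without re-running the generator, without making "make" itself depend on Clinic.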
From eliben at gmail.com Tue Aug 6 23:36:27 2013 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 6 Aug 2013 14:36:27 -0700 Subject: [Python-Dev] a Constant addition to enum In-Reply-To: <52015FAA.3090805@stoneleaf.us> References: <52015FAA.3090805@stoneleaf.us> Message-ID: On Tue, Aug 6, 2013 at 1:42 PM, Ethan Furman wrote: > A question came up on stackoverflow asking about the Planet example and > the need to have the constant G defined in the method instead of at the > class level: > > http://stackoverflow.com/q/17911188/208880 > > Since methods and descriptors are immune to enumeration my proposed > solution created a Constant descriptor that could be used to keep class > level constants at the class level. It's not complex, only about 7 lines. > Should we have something like that included in the enum module? > > If we do include something like that, should it be constant, or should it > be more like property? (The important differences from property being that > class access still returns the value, not the property itself, and setting > the class-level value changes the value but doesn't replace the property.) > Personally, I dislike all non-simple uses of Enums. One such use is adding behavior to them. This can always be split to separate behavior from the Enum itself, and I would prefer that. We went to great lengths to ensure that things work in expected ways, but heaping additional features (even as separate decorators) is just aggravating things. So -1 from me. Finally, I suggest we exercise restraint in adding more capabilities to enums in 3.4; enums are a new creature for Python and it will be extremely useful to see them used in the wild for a while first. We can enhance them in 3.5, but premature enhancement is IMHO much more likely to do harm than good. Eli -------------- next part -------------- An HTML attachment was scrubbed...
URL: From solipsis at pitrou.net Tue Aug 6 23:45:37 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 6 Aug 2013 23:45:37 +0200 Subject: [Python-Dev] a Constant addition to enum References: <52015FAA.3090805@stoneleaf.us> Message-ID: <20130806234537.584da266@fsol> On Tue, 6 Aug 2013 14:36:27 -0700 Eli Bendersky wrote: > On Tue, Aug 6, 2013 at 1:42 PM, Ethan Furman wrote: > > > A question came up on stackoverflow asking about the Planet example and > > the need to have the constant G defined in the method instead of at the > > class level: > > > > http://stackoverflow.com/q/17911188/208880 > > > > Since methods and descriptors are immune to enumeration my proposed > > solution created a Constant descriptor that could be used to keep class > > level constants at the class level. It's not complex, only about 7 lines. > > Should we have something like that included in the enum module? > > > > If we do include something like that, should it be constant, or should it > > be more like property? (The important differences from property being that > > class access still returns the value, not the property itself, and setting > > the class-level value changes the value but doesn't replace the property.) > > > > Personally, I dislike all non-simple uses of Enums. One such use is adding > behavior to them. This can always be split to separate behavior from the > Enum itself, and I would prefer that. We went to great lengths to ensure > that things work in expected ways, but heaping additional features (even as > separate decorators) is just aggravating things. So -1 from me. Agreed. Also, putting constants inside Enum declarations is just confusing. (it doesn't help that the presented "use case" is the typical schoolbook example that doesn't correspond to any real-world situation, just like parsing Roman numerals and solving Tower of Hanoi puzzles) Regards Antoine.
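[Editorial note: Ethan's actual ~7-line descriptor is not shown in the thread. A rough sketch of what it might look like, relying on the fact quoted above that descriptors are immune to enumeration (the Constant name and Planet figures are illustrative, not the proposed code):]

```python
# Hypothetical sketch of the ~7-line Constant descriptor discussed in
# this thread.  Because Constant defines __get__, it is a descriptor,
# and EnumMeta's member scan skips descriptors, so G stays a plain
# class-level constant instead of becoming an enum member.
from enum import Enum

class Constant:
    def __init__(self, value):
        self.value = value
    def __get__(self, obj, objtype=None):
        return self.value

class Planet(Enum):
    MERCURY = (3.303e+23, 2.4397e6)
    EARTH = (5.976e+24, 6.37814e6)

    G = Constant(6.67300E-11)  # not enumerated; stays a constant

    def __init__(self, mass, radius):
        self.mass = mass
        self.radius = radius

    @property
    def surface_gravity(self):
        return self.G * self.mass / (self.radius ** 2)
```

Both class access (Planet.G) and member access (Planet.EARTH.G) return the raw value, and iterating Planet yields only MERCURY and EARTH.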
From tjreedy at udel.edu Wed Aug 7 00:34:04 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 06 Aug 2013 18:34:04 -0400 Subject: [Python-Dev] Our failure at handling GSoC students In-Reply-To: <20130806212641.64b0e851@fsol> References: <20130806212641.64b0e851@fsol> Message-ID: On 8/6/2013 3:26 PM, Antoine Pitrou wrote: > I would like to point out that we currently fail at handling GSoC > projects and bringing them to completion. > > One cruel example is the set of PEP 3121 / PEP 384 refactorings done by > Robin Schreiber: > http://bugs.python.org/issue?%40columns=id%2Cactivity%2Ctitle%2Ccreator%2Cassignee%2Cstatus%2Ctype&%40sort=-activity&%40filter=status&%40action=searchid&ignore=file%3Acontent&%40search_text=pep+3121&submit=search&status=-1%2C1%2C3 > > Robin has produced many patches that seem to reach the stated goal > (refactor C extension modules to take advantage of the latest PEPs > about module initialization and extension types definition). > Unfortunately, tackling both goals at the same time produces big > patches with a lot of churn; and it is also not obvious the PEP 384 > refactoring is useful for the stdlib (while the PEP 3121 refactoring > definitely is). > > What didn't produce an alarm during Robin's work is that GSoC work is > done in private. Therefore, other core developers than the mentor don't > get to give advice early, as would happen with any normal proposal > done publicly (on the mailing-list or on the bug tracker). It is also > likely that the mentor gets overworked after the GSoC period is over, > is unable to finalize the patch and push it, and other core devs have a > hard time catching up on the work and don't know what the shortcomings > are. There are two GSoC students working on Idle tests (mentored by Todd Rovito). Each file tested is a separate issue and separate patch.
I have fallen behind reviewing them because of unexpected issues first with Idle and then with buildbots, but have been able to make some comments and some commits. I plan to do more before they disappear, and to get to everything eventually. -- Terry Jan Reedy From rovitotv at gmail.com Wed Aug 7 00:36:10 2013 From: rovitotv at gmail.com (Todd V Rovito) Date: Tue, 6 Aug 2013 18:36:10 -0400 Subject: [Python-Dev] Our failure at handling GSoC students In-Reply-To: <20130806212641.64b0e851@fsol> References: <20130806212641.64b0e851@fsol> Message-ID: <09A269A0-4D52-4872-9508-AC70705853A6@gmail.com> On Aug 6, 2013, at 3:26 PM, Antoine Pitrou wrote: > I would like to point out that we currently fail at handling GSoC > projects and bringing them to completion. In the past I have noticed the same thing with IDLE. Students and mentors act outside of the standard Python development process, and then the final student products never get committed. > One cruel example is the set of PEP 3121 / PEP 384 refactorings done by > Robin Schreiber: > http://bugs.python.org/issue?%40columns=id%2Cactivity%2Ctitle%2Ccreator%2Cassignee%2Cstatus%2Ctype&%40sort=-activity&%40filter=status&%40action=searchid&ignore=file%3Acontent&%40search_text=pep+3121&submit=search&status=-1%2C1%2C3 I agree this is a sad example. > What didn't produce an alarm during Robin's work is that GSoC work is > done in private. Therefore, other core developers than the mentor don't > get to give advice early, as would happen with any normal proposal > done publicly (on the mailing-list or on the bug tracker). It is also > likely that the mentor gets overworked after the GSoC period is over, > is unable to finalize the patch and push it, and other core devs have a > hard time catching up on the work and don't know.... So for this year I designed an IDLE project that specifically forced the students to be like normal contributors and use the standard Python development model.
See this link for the project description: http://wiki.python.org/moin/SummerOfCode/2013/python-core From the project description: "Successful student proposals should not underestimate how long it takes to get code committed to CPython. A student must be able to concisely communicate and document the unit test framework's design to the Python community in order to get the framework committed to the CPython source tree. Do not underestimate how much time this communication and documentation will actually take in your proposal!!! Oftentimes it will take several passes and several code reviews for a patch to get committed into CPython. This project is approximately 40% coding and 60% communication. This project requires average Python coding skills with excellent communication skills and an unrelenting persistence to get this job done to the satisfaction of at least one Python Core Developer so the work will be committed into the CPython source tree." To date the students have gotten three commits completed and seven total issues opened. Here is a Google spreadsheet with the details: https://docs.google.com/spreadsheet/lv?key=0AqHo248BJw3RdFRnREo5TGtrQmxvQi1oem1HUS1PNGc It is too early to tell how effective the students have been. I do wish more unit tests were created but it all takes time to convince a core Python developer to make the commit (and with good reason). In this case Terry Reedy has been a huge help! I think the students are having fun and hopefully will stay involved for years to come. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From victor.stinner at gmail.com Wed Aug 7 01:54:16 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 7 Aug 2013 01:54:16 +0200 Subject: [Python-Dev] (New) PEP 446: Make newly created file descriptors non-inheritable In-Reply-To: References: Message-ID: 2013/8/6 Victor Stinner : > Oh, the summary table is wrong for the "subprocess, default" line: all > inheritable handles are inherited if at least one standard stream is > replaced. I updated the PEP: - add a new section "Performances of Closing All File Descriptors" - mention a previous attempt in 2007 to add open_noinherit - complete the summary table of the status of Python 3.3 to mention the "subprocess, replace stdout" case - Windows creates non-inheritable *handles* (not fds) by default See the history: http://hg.python.org/peps/log/tip/pep-0446.txt Victor From ncoghlan at gmail.com Wed Aug 7 03:39:12 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 7 Aug 2013 11:39:12 +1000 Subject: [Python-Dev] Our failure at handling GSoC students In-Reply-To: <20130806212641.64b0e851@fsol> References: <20130806212641.64b0e851@fsol> Message-ID: On 7 August 2013 05:26, Antoine Pitrou wrote: > > Hello, > > I would like to point out that we currently fail at handling GSoC > projects and bringing them to completion. Agreed. > What didn't produce an alarm during Robin's work is that GSoC work is > done in private. Therefore, other core developers than the mentor don't > get to give advice early, as would happen with any normal proposal > done publicly (on the mailing-list or on the bug tracker). This isn't the way GSoC is supposed to work. Mentors are supposed to nudge students towards the regular channels for the project. This may mean a sig (e.g. the import engine work a few years ago was discussed on import-sig.
That didn't end up being committed, since Greg's work revealed some fundamental problems with the proposed architecture, but the knowledge wasn't restricted to just myself and Greg), or else a more general channel like core-mentorship or python-ideas. Ideally (and this isn't going to be possible for every GSoC project), mentors will be able to help break the project down into reviewable chunks proposed as incremental issues, rather than producing one big patch at the end of the summer. > It is also > likely that the mentor gets overworked after the GSoC period is over, > is unable to finalize the patch and push it, and other core devs have a > hard time catching up on the work and don't know what the shortcomings > are. Indeed. I added some preliminary guidelines for mentors to the GSoC "Expectations" page: http://wiki.python.org/moin/SummerOfCode/Expectations#guidelines-for-mentors I also added a link to the expectations page from http://wiki.python.org/moin/SummerOfCode/2013#prospective-mentors Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Aug 7 07:05:26 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 7 Aug 2013 15:05:26 +1000 Subject: [Python-Dev] a Constant addition to enum In-Reply-To: References: <52015FAA.3090805@stoneleaf.us> Message-ID: On 7 August 2013 07:36, Eli Bendersky wrote: > On Tue, Aug 6, 2013 at 1:42 PM, Ethan Furman wrote: >> >> A question came up on stackoverflow asking about the Planet example and >> the need to have the constant G defined in the method instead of at the >> class level: >> >> http://stackoverflow.com/q/17911188/208880 >> >> Since methods and descriptors are immune to enumeration my proposed >> solution created a Constant descriptor that could be used to keep class >> level constants at the class level. It's not complex, only about 7 lines. >> Should we have something like that included in the enum module? 
>> If we do include something like that, should it be constant, or should it >> be more like property? (The important differences from property being that >> class access still returns the value, not the property itself, and setting >> the class-level value changes the value but doesn't replace the property.) > > Personally, I dislike all non-simple uses of Enums. One such use is adding > behavior to them. This can always be split to separate behavior from the > Enum itself, and I would prefer that. We went to great lengths to ensure > that things work in expected ways, but heaping additional features (even as > separate decorators) is just aggravating things. So -1 from me. > > Finally, I suggest we exercise restraint in adding more capabilities to > enums in 3.4; enums are a new creature for Python and it will be extremely > useful to see them used in the wild for a while first. We can enhance them > in 3.5, but premature enhancement is IMHO much more likely to do harm than > good. Agreed. I wouldn't be averse to taking those advanced examples out of the docs, too. Like metaclasses, you can do crazy things with enums. "Can" doesn't mean "should", however. We've had a lot of success with metaclasses by soft-pedalling them in the standard library, so people only explore them when they *really* need them. I think we'd be well advised to pursue a similar path with advanced Enum tricks. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stephen at xemacs.org Wed Aug 7 07:06:23 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 7 Aug 2013 14:06:23 +0900 Subject: [Python-Dev] Our failure at handling GSoC students In-Reply-To: References: <20130806212641.64b0e851@fsol> Message-ID: <20993.54735.5585.401998@uwakimon.sk.tsukuba.ac.jp> Nick Coghlan writes: > On 7 August 2013 05:26, Antoine Pitrou wrote: > > I would like to point out that we currently fail at handling GSoC > > projects and bringing them to completion.
> > Agreed. I have no opinion on that statement, having not looked at the projects. > > What didn't produce an alarm during Robin's work is that GSoC work is > > done in private. > > This isn't the way GSoC is supposed to work. Indeed. I've seen this in one of the orgs I mentor for, and we may ask that mentor to go elsewhere if we don't get a credible promise to shape up. (That's an indication of how seriously my orgs take "work in public", not a suggestion for Python action in any case.) > or else a more general channel like core-mentorship or > python-ideas. +1 for python-ideas if there is no better fit in a specialist list, moving to python-dev as usual if the mentor judges the student sufficiently mature.[1] If the student's posting becomes annoying on python-ideas, the mentor should provide netiquette guidance. IMO the project-specific mentoring will become an annoyance on core-mentorship since it continues for the whole summer, and changing "preliminary" venues midstream doesn't seem like a great idea to me. My orgs require (in only one case successfully :-) weekly progress reports to the developers' list (per above, that would be python-ideas, not python-dev). In that successful case, GSoC is essentially the only content posted to that dev list, though. I'm not sure if that matters. > Ideally (and this isn't going to be possible for every GSoC project), > mentors will be able to help break the project down into reviewable > chunks proposed as incremental issues, rather than producing one big > patch at the end of the summer. GSoC suggests (at about the level of an RFC SHOULD) that students be committing early and often to a publicly accessible branch. I don't see a good reason why that wouldn't work even for complex projects that can't be merged until end of summer. (I wonder whether such projects should be used as GSoC tasks, as well, but as I haven't actually looked at Python GSoC tasks, I'll leave that in parens.) 
Footnotes: [1] By that I mean that I often observe students mixing blue-sky design with hard-core implementation details and necessary design revision late in the summer. If the project and student are "mature," that won't happen. From regebro at gmail.com Wed Aug 7 07:20:30 2013 From: regebro at gmail.com (Lennart Regebro) Date: Wed, 7 Aug 2013 07:20:30 +0200 Subject: [Python-Dev] Our failure at handling GSoC students In-Reply-To: <20130806212641.64b0e851@fsol> References: <20130806212641.64b0e851@fsol> Message-ID: On Tue, Aug 6, 2013 at 9:26 PM, Antoine Pitrou wrote: > What didn't produce an alarm during Robin's work is that GSoC work is > done in private. Why is it done in private? //Lennart From martin at v.loewis.de Wed Aug 7 10:09:16 2013 From: martin at v.loewis.de (martin at v.loewis.de) Date: Wed, 07 Aug 2013 10:09:16 +0200 Subject: [Python-Dev] Our failure at handling GSoC students In-Reply-To: <20130806212641.64b0e851@fsol> References: <20130806212641.64b0e851@fsol> Message-ID: <20130807100916.Horde.WBSRoNB8I3iLqxw3ppgX3A5@webmail.df.eu> Zitat von Antoine Pitrou : > > One cruel example is the set of PEP 3121 / PEP 384 refactorings done by > Robin Schreiber: I personally dont consider it failed, yet. I still plan to integrate them, hopefully for 3.4. > Robin has produced many patches that seem to reach the stated goal > (refactor C extension modules to take advantage of the latest PEPs > about module initialization and extension types definition). > Unfortunately, tackling both goals at the same time produces big > patches with a lot of churn; and it is also not obvious the PEP 384 > refactoring is useful for the stdlib (while the PEP 3121 refactoring > definitely is). Choice of supporting PEP 384 was deliberate. It will change all types into heap types, which is useful for multiple-interpreter support and GC. > > What didn't produce an alarm during Robin's work is that GSoC work is > done in private. It wasn't really done in private. 
Robin posted to python-dev, anybody who would have been interested could have joined discussions. > It is also > likely that the mentor gets overworked after the GSoC period is over, > is unable to finalize the patch and push it, and other core devs have a > hard time catching up on the work and don't know what the shortcomings > are. It's indeed unfortunate that RL interfered with my Python contributions. I apologize for that. However, anybody who wanted to catch up could have contacted Robin or myself. As overworked as we all are, nobody did. Regards, Martin From martin at v.loewis.de Wed Aug 7 10:34:32 2013 From: martin at v.loewis.de (martin at v.loewis.de) Date: Wed, 07 Aug 2013 10:34:32 +0200 Subject: [Python-Dev] Our failure at handling GSoC students In-Reply-To: References: <20130806212641.64b0e851@fsol> Message-ID: <20130807103432.Horde.KOD7PDApQoVLxoBsAzPOCA5@webmail.df.eu> Zitat von Eli Bendersky : > > I would like to point out something that stands out in this list of issues: > such a method of producing dozens of patches simultaneously is extremely > unwise, unless there's a crucial piece of history I'm missing. It is much > more prudent to start with one or two exemplary modules, and if those fully > pass code review, send out patches for others. The reason is obvious - code > review may turn up problems or requests for change. Going backwards to > modify 57 patches is not something anyone would want to do. Robin did exactly that: submit a few patches first, receive feedback, submit more patches. At the end of the project,he submitted his entire work. 
Regards, Martin From alexander.belopolsky at gmail.com Wed Aug 7 10:41:48 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 7 Aug 2013 04:41:48 -0400 Subject: [Python-Dev] Our failure at handling GSoC students In-Reply-To: <20130807100916.Horde.WBSRoNB8I3iLqxw3ppgX3A5@webmail.df.eu> References: <20130806212641.64b0e851@fsol> <20130807100916.Horde.WBSRoNB8I3iLqxw3ppgX3A5@webmail.df.eu> Message-ID: On Wed, Aug 7, 2013 at 4:09 AM, wrote: > .. >> What didn't produce an alarm during Robin's work is that GSoC work is >> done in private. >> > > It wasn't really done in private. Robin posted to python-dev, anybody > who would have been interested could have joined discussions. True. In addition, Robin's work was posted at bugs.python.org and received reviews. > However, anybody who wanted to catch up could have > contacted Robin or myself. As overworked as we all are, > nobody did. > Not true. See . -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin at v.loewis.de Wed Aug 7 10:44:56 2013 From: martin at v.loewis.de (martin at v.loewis.de) Date: Wed, 07 Aug 2013 10:44:56 +0200 Subject: [Python-Dev] Our failure at handling GSoC students In-Reply-To: References: <20130806212641.64b0e851@fsol> Message-ID: <20130807104456.Horde.GPWFH8LaicL3lp9UcufaAw2@webmail.df.eu> Zitat von Lennart Regebro : > On Tue, Aug 6, 2013 at 9:26 PM, Antoine Pitrou wrote: >> What didn't produce an alarm during Robin's work is that GSoC work is >> done in private. > > Why is it done in private? It wasn't really done in private, not more than any other contribution. A PEP was accepted before the project even started. 
Regards, Martin From alexander.belopolsky at gmail.com Wed Aug 7 10:41:48 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 7 Aug 2013 04:41:48 -0400 Subject: [Python-Dev] Our failure at handling GSoC students In-Reply-To: <20130807100916.Horde.WBSRoNB8I3iLqxw3ppgX3A5@webmail.df.eu> References: <20130806212641.64b0e851@fsol> <20130807100916.Horde.WBSRoNB8I3iLqxw3ppgX3A5@webmail.df.eu> Message-ID: On Wed, Aug 7, 2013 at 4:09 AM, wrote: > .. >> What didn't produce an alarm during Robin's work is that GSoC work is >> done in private. >> > > It wasn't really done in private. Robin posted to python-dev, anybody > who would have been interested could have joined discussions. True. In addition, Robin's work was posted at bugs.python.org and received reviews. > However, anybody who wanted to catch up could have > contacted Robin or myself. As overworked as we all are, > nobody did. > Not true. See . -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin at v.loewis.de Wed Aug 7 10:44:56 2013 From: martin at v.loewis.de (martin at v.loewis.de) Date: Wed, 07 Aug 2013 10:44:56 +0200 Subject: [Python-Dev] Our failure at handling GSoC students In-Reply-To: References: <20130806212641.64b0e851@fsol> Message-ID: <20130807104456.Horde.GPWFH8LaicL3lp9UcufaAw2@webmail.df.eu> Quoting Lennart Regebro: > On Tue, Aug 6, 2013 at 9:26 PM, Antoine Pitrou wrote: >> What didn't produce an alarm during Robin's work is that GSoC work is >> done in private. > > Why is it done in private? It wasn't really done in private, not more than any other contribution. A PEP was accepted before the project even started.
It will change all > types into heap types, which is useful for multiple-interpreter > support and GC. If I'm not mistaken, static C types shouln't benefit much from GC, since they only reference C functions. Also, PyType_FromSpec() makes reference counting delicate when subclasses are allowed. > > What didn't produce an alarm during Robin's work is that GSoC work > > is done in private. > > It wasn't really done in private. Robin posted to python-dev, anybody > who would have been interested could have joined discussions. I'm sorry if I misremembered how things happened. However, it's clear that the produced patches (including their number) cause problems for reviewers, and very few of them have been integrated. Regards Antoine. From ethan at stoneleaf.us Wed Aug 7 19:12:33 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 07 Aug 2013 10:12:33 -0700 Subject: [Python-Dev] Our failure at handling GSoC students In-Reply-To: References: <20130806212641.64b0e851@fsol> <20130807103432.Horde.KOD7PDApQoVLxoBsAzPOCA5@webmail.df.eu> Message-ID: <52028001.10500@stoneleaf.us> On 08/07/2013 01:54 AM, Alexander Belopolsky wrote: > > That's not how the history looks on the tracker. Robin submitted ~50 patches before I suggested that "we should start > with the "xx" modules." Then he did submit patches to the example modules, but have never responded to my reviews. Dumb question, but does he know how to publish his responses? It took me a week to figure that out. Of course, it would be up to him to ask why his responses weren't being acknowledged. (I'm speaking of the reitvald tool.) -- ~Ethan~ From victor.stinner at gmail.com Wed Aug 7 18:55:51 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 7 Aug 2013 18:55:51 +0200 Subject: [Python-Dev] (New) PEP 446: Make newly created file descriptors non-inheritable In-Reply-To: References: Message-ID: > Also the socket library creates sockets with inheritable handles by default. 
Apparently there isn't a reliable way to make sockets non-inheritable because anti-virus/firewall software can interfere: > > http://stackoverflow.com/questions/12058911/can-tcp-socket-handles-be-set-not-inheritable Recent versions of Windows provide an atomic flag to create a non-inheritable socket. I hope that the falg is respected even with antivirus/firewall. For older versions of Windows, I don't see what Python can do. Is it a blocker point for the PEP? Victor -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Wed Aug 7 21:43:32 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 7 Aug 2013 15:43:32 -0400 Subject: [Python-Dev] Our failure at handling GSoC students In-Reply-To: <52028001.10500@stoneleaf.us> References: <20130806212641.64b0e851@fsol> <20130807103432.Horde.KOD7PDApQoVLxoBsAzPOCA5@webmail.df.eu> <52028001.10500@stoneleaf.us> Message-ID: On Wed, Aug 7, 2013 at 1:12 PM, Ethan Furman wrote: > > Dumb question, but does he know how to publish his responses? ... (I'm speaking of the reitvald tool.) The patches that I reviewed: #15390 (datetime), #15848 (xxsubtype), and #15849 (xxmodule) did not have Reitvald "review" links. I reviewed them in the tracker comments. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From stefan_ml at behnel.de Thu Aug 8 06:08:55 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 08 Aug 2013 06:08:55 +0200 Subject: [Python-Dev] PEP 442 clarification for type hierarchies In-Reply-To: References: <20130805205109.4cc384aa@fsol> <20130805212630.2472c0c6@fsol> <20130806141203.53d3cb71@pitrou.net> <20130806174908.3e335c52@pitrou.net> Message-ID: Stefan Behnel, 06.08.2013 18:38: > Antoine Pitrou, 06.08.2013 17:49: >> Le Tue, 06 Aug 2013 17:18:59 +0200, >> Stefan Behnel a ?crit : >>> If a Python class with a >>> __del__ inherits from an extension type that implements >>> tp_finalize(), then whose tp_finalize() will be executed first? >> >> Then only the Python __del__ gets called. It should call >> super().__del__() manually, to ensure the extension type's >> tp_finalize gets called. > > Ok, but then all I have to do in order to disable C level finalisation for > a type is to inherit from it and provide an empty __del__ method. Oh, and if the Python subtype calls super().__del__() twice, then there is no longer a guarantee that the finalisers only get executed once, right? I think it's time for at least a very visible warning in the docs that the behaviour is only 'guaranteed' for types that cannot be subtyped from Python, and that Python subtypes are free to break up the call chain in whatever way they like. Stefan From stefan_ml at behnel.de Thu Aug 8 06:33:42 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 08 Aug 2013 06:33:42 +0200 Subject: [Python-Dev] xml.etree.ElementTree.IncrementalParser (was: ElementTree iterparse string) In-Reply-To: <20130807080423.3ec89885@fsol> References: <260efc16-8404-493e-906d-8e51301c7540@email.android.com> <20130807080423.3ec89885@fsol> Message-ID: [from python-ideas] Antoine Pitrou, 07.08.2013 08:04: > Take a look at IncrementalParser: > http://docs.python.org/dev/library/xml.etree.elementtree.html#incremental-parsing Hmm, that seems to be a somewhat recent addition (April 2013). 
I would have preferred hearing about it before it got added. I don't like the fact that it adds a second interface to iterparse() that allows injecting arbitrary content into the parser. You can now run iterparse() to read from a file, and at an arbitrary iteration position, send it a byte string to parse from, before it goes reading more data from the file. Or take out some events before iteration continues. I think the implementation should be changed to make iterparse() return something that wraps an IncrementalParser, not something that is an IncrementalParser. Also, IMO it should mimic the interface of the TreeBuilder, which calls the data reception method "data()" and the termination method "close()". There is no reason to add yet another set of methods names just to do what others do already. Stefan From solipsis at pitrou.net Thu Aug 8 08:30:07 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 8 Aug 2013 08:30:07 +0200 Subject: [Python-Dev] PEP 442 clarification for type hierarchies References: <20130805205109.4cc384aa@fsol> <20130805212630.2472c0c6@fsol> <20130806141203.53d3cb71@pitrou.net> <20130806174908.3e335c52@pitrou.net> Message-ID: <20130808083007.78392e95@fsol> On Thu, 08 Aug 2013 06:08:55 +0200 Stefan Behnel wrote: > Stefan Behnel, 06.08.2013 18:38: > > Antoine Pitrou, 06.08.2013 17:49: > >> Le Tue, 06 Aug 2013 17:18:59 +0200, > >> Stefan Behnel a ?crit : > >>> If a Python class with a > >>> __del__ inherits from an extension type that implements > >>> tp_finalize(), then whose tp_finalize() will be executed first? > >> > >> Then only the Python __del__ gets called. It should call > >> super().__del__() manually, to ensure the extension type's > >> tp_finalize gets called. > > > > Ok, but then all I have to do in order to disable C level finalisation for > > a type is to inherit from it and provide an empty __del__ method. 
> > Oh, and if the Python subtype calls super().__del__() twice, then there is > no longer a guarantee that the finalisers only get executed once, right? The guarantee is that the *interpreter* will call __del__ once. You're free to call it many times yourself, it's just a method. (but super() itself is supposed to do the right thing, if you're using it properly) And, by the way, I'd like to stress again the parallel with __init__: tp_init can also be called several times if the user calls __init__ manually. Regards Antoine. From solipsis at pitrou.net Thu Aug 8 10:20:39 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 8 Aug 2013 10:20:39 +0200 Subject: [Python-Dev] xml.etree.ElementTree.IncrementalParser (was: ElementTree iterparse string) References: <260efc16-8404-493e-906d-8e51301c7540@email.android.com> <20130807080423.3ec89885@fsol> Message-ID: <20130808102039.4c6e1c92@pitrou.net> Hi, Le Thu, 08 Aug 2013 06:33:42 +0200, Stefan Behnel a ?crit : > [from python-ideas] > > Antoine Pitrou, 07.08.2013 08:04: > > Take a look at IncrementalParser: > > http://docs.python.org/dev/library/xml.etree.elementtree.html#incremental-parsing > > Hmm, that seems to be a somewhat recent addition (April 2013). I > would have preferred hearing about it before it got added. > > I don't like the fact that it adds a second interface to iterparse() > that allows injecting arbitrary content into the parser. > You can now > run iterparse() to read from a file, and at an arbitrary iteration > position, send it a byte string to parse from, before it goes reading > more data from the file. Or take out some events before iteration > continues. > > I think the implementation should be changed to make iterparse() > return something that wraps an IncrementalParser, not something that > is an IncrementalParser. That sounds reasonable. Do you want to post a patch? 
:-) > Also, IMO it should mimic the interface of the TreeBuilder, which > calls the data reception method "data()" and the termination method > "close()". There is no reason to add yet another set of method names > just to do what others do already. Well, the difference here is that after calling eof_received() you can still (and should) call events() once to get the last events. I think it would be weird if you could still do something useful with the object after calling close(). Also, the method names are not invented, they mimic the PEP 3156 stream protocols: http://www.python.org/dev/peps/pep-3156/#stream-protocols Regards Antoine. From larry at hastings.org Thu Aug 8 11:45:35 2013 From: larry at hastings.org (Larry Hastings) Date: Thu, 08 Aug 2013 02:45:35 -0700 Subject: [Python-Dev] The Return Of Argument Clinic In-Reply-To: References: <51FF66DD.1020403@hastings.org> <52003B03.6090204@hastings.org> Message-ID: <520368BF.2050600@hastings.org> On 08/05/2013 09:59 PM, Nick Coghlan wrote: >> ___________________________________________________________________ >> Question 2: Emit code for modules and classes? >> >> [...] Originally Clinic didn't ask for full class and module information, you just >> specified the full dotted path and that was that. But that's ambiguous; >> Clinic wouldn't be able to infer what was a module vs what was a class. And >> in the future, if/when it generates module and class boilerplate, obviously >> it'll need to know the distinction. [...] > Note that setuptools entry point syntax solves the namespace ambiguity > problem by using ":" to separate the module name from the object's > name within the module (the nose test runner does the same thing). I'm > adopting that convention for the PEP 426 metadata, and it's probably > appropriate as a concise notation for clinic as well.
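[The "module:attribute" convention Nick references can be sketched in a few lines. The helper name below is made up for illustration; it is not part of setuptools or Argument Clinic, it merely mirrors how entry-point strings resolve the module/attribute ambiguity that dotted paths cannot.]

----
def split_spec(spec):
    """Split "pkg.module:Class.method" into (module, attribute path).

    Hypothetical helper mirroring the setuptools entry-point syntax;
    everything before the ":" is importable, everything after is
    attribute access on the imported module.
    """
    module, sep, attrs = spec.partition(":")
    if not sep or not module or not attrs:
        raise ValueError("expected 'module:attribute' in %r" % spec)
    return module, attrs.split(".")

print(split_spec("xml.etree:ElementTree.dump"))
# ==> ('xml.etree', ['ElementTree', 'dump'])
print(split_spec("datetime:datetime.now"))
# ==> ('datetime', ['datetime', 'now'])
----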
So you're proposing that xml.etree.ElementTree.dump() be written as "xml.etree:ElementTree.dump", and datetime.datetime.now() be written as "datetime:datetime.now"? And presumably *not* specifying a colon as part of the name would be an error. >> ___________________________________________________________________ >> Question 5: Keep too-magical class decorator Converter.wrap? > You misunderstand me: I believe a class decorator is the *wrong > solution*. I am saying Converter.wrap *shouldn't exist*, and that the > logic for what it does should be directly in Converter.__init__. Well, nobody liked it, everybody hated it, so I'll go with what you proposed, though with the name converter_init() for the custom converter init function. //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Aug 8 17:52:00 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 9 Aug 2013 01:52:00 +1000 Subject: [Python-Dev] The Return Of Argument Clinic In-Reply-To: <520368BF.2050600@hastings.org> References: <51FF66DD.1020403@hastings.org> <52003B03.6090204@hastings.org> <520368BF.2050600@hastings.org> Message-ID: On 8 Aug 2013 02:48, "Larry Hastings" wrote: > > On 08/05/2013 09:59 PM, Nick Coghlan wrote: >>> >>> ___________________________________________________________________ >>> Question 2: Emit code for modules and classes? >>> >>> [...] Originally Clinic didn't ask for full class and module information, you just >>> specified the full dotted path and that was that. But that's ambiguous; >>> Clinic wouldn't be able to infer what was a module vs what was a class. And >>> in the future, if/when it generates module and class boilerplate, obviously >>> it'll need to know the distinction. [...] >> >> Note that setuptools entry point syntax solves the namespace ambiguity >> problem by using ":" to separate the module name from the object's >> name within the module (the nose test runner does the same thing).
I'm >> adopting that convention for the PEP 426 metadata, and it's probably >> appropriate as a concise notation for clinic as well. > > > So you're proposing that xml.etree.ElementTree.dump() be written as "xml.etree:ElementTree.dump", and datetime.datetime.now() be written as "datetime:datetime.now"? And presumably *not* specifying a colon as part of the name would be an error. Assuming there's no way to tell argument clinic all the functions and classes in a given C file belong to the same module, then yes, you would need the colon in every name to indicate the module portion. > > >>> ___________________________________________________________________ >>> Question 5: Keep too-magical class decorator Converter.wrap? >> >> You misunderstand me: I believe a class decorator is the *wrong >> >> solution*. I am saying Converter.wrap *shouldn't exist*, and that the >> logic for what it does should be directly in Converter.__init__. > > > Well, nobody liked it, everybody hated it, so I'll go with what you proposed, though with the name converter_init() for the custom converter init function. My future code-reading self thanks you :) Cheers, Nick. > > > /arry -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From stefan_ml at behnel.de Fri Aug 9 13:11:11 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 09 Aug 2013 13:11:11 +0200 Subject: [Python-Dev] xml.etree.ElementTree.IncrementalParser In-Reply-To: <20130808102039.4c6e1c92@pitrou.net> References: <260efc16-8404-493e-906d-8e51301c7540@email.android.com> <20130807080423.3ec89885@fsol> <20130808102039.4c6e1c92@pitrou.net> Message-ID: Antoine Pitrou, 08.08.2013 10:20: > Le Thu, 08 Aug 2013 06:33:42 +0200, > Stefan Behnel a écrit : >> Antoine Pitrou, 07.08.2013 08:04: >>> http://docs.python.org/dev/library/xml.etree.elementtree.html#incremental-parsing >> >> I don't like the fact that it adds a second interface to iterparse() >> that allows injecting arbitrary content into the parser. >> You can now >> run iterparse() to read from a file, and at an arbitrary iteration >> position, send it a byte string to parse from, before it goes reading >> more data from the file. Or take out some events before iteration >> continues. >> >> I think the implementation should be changed to make iterparse() >> return something that wraps an IncrementalParser, not something that >> is an IncrementalParser. > > That sounds reasonable. Do you want to post a patch? :-) I attached it to the ticket that seems to have been the source of this addition. http://bugs.python.org/issue17741 Please note that the tulip mailing list is not an appropriate place to discuss additions to the XML libraries, and ElementTree in particular. Is there a way to get automatic notification when the XML component is assigned to a ticket? (Not that it would have helped in this case, as the component was missing from the ticket.) >> Also, IMO it should mimic the interface of the TreeBuilder, which >> calls the data reception method "data()" Oops, sorry. It's actually called feed(). >> and the termination method >> "close()". There is no reason to add yet another set of method names >> just to do what others do already.
> > Well, the difference here is that after calling eof_received() you can > still (and should) call events() once to get the last events. I think > it would be weird if you could still do something useful with the object > after calling close(). > > Also, the method names are not invented, they mimic the PEP 3156 > stream protocols: > http://www.python.org/dev/peps/pep-3156/#stream-protocols I see your point about close(). I assume your reasoning was to make the IncrementalParser an arbitrary stream end-point. However, it doesn't really make all that much sense to connect an arbitrary data source to it, as the source wouldn't know that, in addition to passing in data, it would also have to ask for events from time to time. I mean, you could do it, but then it would just fill up the memory with parser events and lose the actual advantages of incremental parsing. So, in a way, the whole point of the class is to *not* be an arbitrary stream end-point. Anyway, given that there isn't really the One Obvious Way to do it, maybe you should just add a docstring to the class (ahem), reference the stream protocol as the base for its API, and then rename it to IncrementalStreamParser. That would at least make it clear why it doesn't really fit with the rest of the module API (which was designed some decade before PEP 3156) and instead uses its own naming scheme. Stefan From solipsis at pitrou.net Fri Aug 9 14:50:50 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 9 Aug 2013 14:50:50 +0200 Subject: [Python-Dev] xml.etree.ElementTree.IncrementalParser References: <260efc16-8404-493e-906d-8e51301c7540@email.android.com> <20130807080423.3ec89885@fsol> <20130808102039.4c6e1c92@pitrou.net> Message-ID: <20130809145050.25413bd0@pitrou.net> Le Fri, 09 Aug 2013 13:11:11 +0200, Stefan Behnel a écrit : > > I attached it to the ticket that seems to have been the source of this > addition.
> > http://bugs.python.org/issue17741 > > Please note that the tulip mailing list is not an appropriate place to > discuss additions to the XML libraries, and ElementTree in particular. Well, the bug tracker is the main point of discussion, except that few people bothered discussing it. > Is there a way to get automatic notification when the XML component is > assigned to a ticket? (Not that it would have helped in this case, as > the component was missing from the ticket.) You could ask to get included in the "experts" index: http://docs.python.org/devguide/experts.html (I doubt anyone would object to that) > Anyway, given that there isn't really the One Obvious Way to do it, > maybe you should just add a docstring to the class (ahem), reference > the stream protocol as the base for its API, and then rename it to > IncrementalStreamParser. I don't think there's any point in making the class name longer. Parsing XML incrementally is pretty much what it does. As for the docstring, uh, well, sure :-) (IMHO, IncrementalParser is the One Obvious Way to do incremental XML parsing in 3.4, but YMMV) Regards Antoine. From stefan_ml at behnel.de Fri Aug 9 15:24:22 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 09 Aug 2013 15:24:22 +0200 Subject: [Python-Dev] xml.etree.ElementTree.IncrementalParser In-Reply-To: <20130809145050.25413bd0@pitrou.net> References: <260efc16-8404-493e-906d-8e51301c7540@email.android.com> <20130807080423.3ec89885@fsol> <20130808102039.4c6e1c92@pitrou.net> <20130809145050.25413bd0@pitrou.net> Message-ID: Antoine Pitrou, 09.08.2013 14:50: > Le Fri, 09 Aug 2013 13:11:11 +0200, > Stefan Behnel a écrit :
> > Well, the bug tracker is the main point of discussion, except that few > people bothered discussing it. The bug tracker is usually not a very visible place to start discussing changes. This change is a particularly good example; I've certainly seen others. >> Is there a way to get automatic notification when the XML component is >> assigned to a ticket? (Not that it would have helped in this case, as >> the component was missing from the ticket.) > > You could ask to get included in the "experts" index: > http://docs.python.org/devguide/experts.html > (I doubt anyone would object to that) Ok, please add me for xml.etree then. I used to get added to the noisy list for ET tickets during the 3.3 release cycle, but that seems to have stopped a while back. Since it's easier to erase my name from the noisy list than to add myself to a bug I've never heard about, I'm ok with being added for anything that relates to ET, basically, be it bug or feature. >> Anyway, given that there isn't really the One Obvious Way to do it, >> maybe you should just add a docstring to the class (ahem), reference >> the stream protocol as the base for its API, and then rename it to >> IncrementalStreamParser. > > I don't think there's any point in making the class name longer. Agreed. It's not the class name that should be modified but the method names. I changed my mind and posted to the tracker. I also attached a new patch that changes the implementation to what I think it should look like. Stefan From status at bugs.python.org Fri Aug 9 18:07:38 2013 From: status at bugs.python.org (Python tracker) Date: Fri, 9 Aug 2013 18:07:38 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20130809160738.4211C56A31@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2013-08-02 - 2013-08-09) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message.
Issues counts and deltas: open 4148 (+20) closed 26321 (+47) total 30469 (+67) Open issues with patches: 1874 Issues opened (56) ================== #4322: function with modified __name__ uses original name when there' http://bugs.python.org/issue4322 reopened by benjamin.peterson #17741: event-driven XML parser http://bugs.python.org/issue17741 reopened by pitrou #18630: mingw: exclude unix only modules http://bugs.python.org/issue18630 opened by rpetrov #18631: mingw: setup msvcrt and _winapi modules http://bugs.python.org/issue18631 opened by rpetrov #18632: mingw: build extensions with GCC http://bugs.python.org/issue18632 opened by rpetrov #18633: mingw: use Mingw32CCompiler as default compiler for mingw* bu http://bugs.python.org/issue18633 opened by rpetrov #18634: mingw find import library http://bugs.python.org/issue18634 opened by rpetrov #18636: mingw: setup _ssl module http://bugs.python.org/issue18636 opened by rpetrov #18637: mingw: export _PyNode_SizeOf as PyAPI for parser module http://bugs.python.org/issue18637 opened by rpetrov #18638: mingw: generalization of posix build in sysconfig.py http://bugs.python.org/issue18638 opened by rpetrov #18639: mingw: avoid circular dependency from time module during nativ http://bugs.python.org/issue18639 opened by rpetrov #18640: mingw: generalization of posix build in distutils/sysconfig.py http://bugs.python.org/issue18640 opened by rpetrov #18641: mingw: customize site http://bugs.python.org/issue18641 opened by rpetrov #18643: implement socketpair() on Windows http://bugs.python.org/issue18643 opened by neologix #18644: Got ResourceWarning: unclosed file when using test function fr http://bugs.python.org/issue18644 opened by vajrasky #18645: Add a configure option for performance guided optimization http://bugs.python.org/issue18645 opened by rhettinger #18646: Improve tutorial entry on 'Lambda Forms'. 
http://bugs.python.org/issue18646 opened by terry.reedy #18647: re.error: nothing to repeat http://bugs.python.org/issue18647 opened by serhiy.storchaka #18648: FP Howto and the PEP 8 lambda guildline http://bugs.python.org/issue18648 opened by terry.reedy #18650: intermittent test_pydoc failure on 3.4.0a1 http://bugs.python.org/issue18650 opened by ned.deily #18651: test failures on KFreeBSD http://bugs.python.org/issue18651 opened by doko #18652: Add itertools.first_true (return first true item in iterable) http://bugs.python.org/issue18652 opened by hynek #18653: mingw-meta: build core modules http://bugs.python.org/issue18653 opened by rpetrov #18654: modernize mingw&cygwin compiler classes http://bugs.python.org/issue18654 opened by rpetrov #18655: GUI apps take long to launch on Windows http://bugs.python.org/issue18655 opened by netrick #18659: test_precision in test_format.py is not executed and has unuse http://bugs.python.org/issue18659 opened by vajrasky #18660: os.read behavior on Linux http://bugs.python.org/issue18660 opened by dugres #18663: In unittest.TestCase.assertAlmostEqual doc specify the delta d http://bugs.python.org/issue18663 opened by py.user #18664: occasional test_threading failure http://bugs.python.org/issue18664 opened by pitrou #18667: missing HAVE_FCHOWNAT http://bugs.python.org/issue18667 opened by salinger #18669: curses.chgat() moves cursor, documentation says it shouldn't http://bugs.python.org/issue18669 opened by productivememberofsociety666 #18670: Using read_mime_types function from mimetypes module gives res http://bugs.python.org/issue18670 opened by vajrasky #18672: Fix format specifiers for debug output in _sre.c http://bugs.python.org/issue18672 opened by serhiy.storchaka #18673: Add and use O_TMPFILE for Linux 3.11 http://bugs.python.org/issue18673 opened by christian.heimes #18674: Store weak references in modules_by_index http://bugs.python.org/issue18674 opened by pitrou #18675: Daemon Threads can seg fault 
http://bugs.python.org/issue18675 opened by guettli #18676: Queue: document that zero is accepted as timeout value http://bugs.python.org/issue18676 opened by zyluo #18677: Enhanced context managers with ContextManagerExit and None http://bugs.python.org/issue18677 opened by kristjan.jonsson #18678: Wrong struct members name for spwd module http://bugs.python.org/issue18678 opened by vajrasky #18679: include a codec to handle escaping only control characters but http://bugs.python.org/issue18679 opened by underrun #18680: JSONDecoder should document that it raises a ValueError for ma http://bugs.python.org/issue18680 opened by corey #18681: typo in imp.reload http://bugs.python.org/issue18681 opened by felloak #18682: [PATCH] remove bogus codepath from pprint._safe_repr http://bugs.python.org/issue18682 opened by mvyskocil #18683: Core dumps on CentOS http://bugs.python.org/issue18683 opened by schlamar #18684: Pointers point out of array bound in _sre.c http://bugs.python.org/issue18684 opened by serhiy.storchaka #18685: Restore re performance to pre-PEP393 level http://bugs.python.org/issue18685 opened by serhiy.storchaka #18686: Tkinter focus_get on menu results in KeyError http://bugs.python.org/issue18686 opened by jgoeders #18687: Lib/test/leakers/test_ctypes.py still mentions the need to upd http://bugs.python.org/issue18687 opened by iwontbecreative #18688: Document undocumented Unicode object API http://bugs.python.org/issue18688 opened by serhiy.storchaka #18689: add argument for formatter to logging.Handler and subclasses i http://bugs.python.org/issue18689 opened by underrun #18690: memoryview not considered a sequence http://bugs.python.org/issue18690 opened by sfeltman #18691: sqlite3.Cursor.execute expects sequence as second argument. 
http://bugs.python.org/issue18691 opened by Andrew.Myers #18693: help() not helpful with enum http://bugs.python.org/issue18693 opened by ethan.furman #18694: getxattr on Linux ZFS native filesystem happily returns partia http://bugs.python.org/issue18694 opened by larry #18695: os.statvfs() not working well with unicode paths http://bugs.python.org/issue18695 opened by giampaolo.rodola #18696: In unittest.TestCase.longMessage doc remove a redundant senten http://bugs.python.org/issue18696 opened by py.user Most recent 15 issues with no replies (15) ========================================== #18696: In unittest.TestCase.longMessage doc remove a redundant senten http://bugs.python.org/issue18696 #18695: os.statvfs() not working well with unicode paths http://bugs.python.org/issue18695 #18694: getxattr on Linux ZFS native filesystem happily returns partia http://bugs.python.org/issue18694 #18691: sqlite3.Cursor.execute expects sequence as second argument. http://bugs.python.org/issue18691 #18690: memoryview not considered a sequence http://bugs.python.org/issue18690 #18689: add argument for formatter to logging.Handler and subclasses i http://bugs.python.org/issue18689 #18688: Document undocumented Unicode object API http://bugs.python.org/issue18688 #18687: Lib/test/leakers/test_ctypes.py still mentions the need to upd http://bugs.python.org/issue18687 #18684: Pointers point out of array bound in _sre.c http://bugs.python.org/issue18684 #18681: typo in imp.reload http://bugs.python.org/issue18681 #18675: Daemon Threads can seg fault http://bugs.python.org/issue18675 #18672: Fix format specifiers for debug output in _sre.c http://bugs.python.org/issue18672 #18670: Using read_mime_types function from mimetypes module gives res http://bugs.python.org/issue18670 #18669: curses.chgat() moves cursor, documentation says it shouldn't http://bugs.python.org/issue18669 #18663: In unittest.TestCase.assertAlmostEqual doc specify the delta d http://bugs.python.org/issue18663 
Most recent 15 issues waiting for review (15) ============================================= #18696: In unittest.TestCase.longMessage doc remove a redundant senten http://bugs.python.org/issue18696 #18695: os.statvfs() not working well with unicode paths http://bugs.python.org/issue18695 #18694: getxattr on Linux ZFS native filesystem happily returns partia http://bugs.python.org/issue18694 #18687: Lib/test/leakers/test_ctypes.py still mentions the need to upd http://bugs.python.org/issue18687 #18685: Restore re performance to pre-PEP393 level http://bugs.python.org/issue18685 #18684: Pointers point out of array bound in _sre.c http://bugs.python.org/issue18684 #18682: [PATCH] remove bogus codepath from pprint._safe_repr http://bugs.python.org/issue18682 #18678: Wrong struct members name for spwd module http://bugs.python.org/issue18678 #18677: Enhanced context managers with ContextManagerExit and None http://bugs.python.org/issue18677 #18676: Queue: document that zero is accepted as timeout value http://bugs.python.org/issue18676 #18674: Store weak references in modules_by_index http://bugs.python.org/issue18674 #18672: Fix format specifiers for debug output in _sre.c http://bugs.python.org/issue18672 #18670: Using read_mime_types function from mimetypes module gives res http://bugs.python.org/issue18670 #18663: In unittest.TestCase.assertAlmostEqual doc specify the delta d http://bugs.python.org/issue18663 #18659: test_precision in test_format.py is not executed and has unuse http://bugs.python.org/issue18659 Top 10 most discussed issues (10) ================================= #18652: Add itertools.first_true (return first true item in iterable) http://bugs.python.org/issue18652 27 msgs #17741: event-driven XML parser http://bugs.python.org/issue17741 23 msgs #18264: enum.IntEnum is not compatible with JSON serialisation http://bugs.python.org/issue18264 22 msgs #18647: re.error: nothing to repeat http://bugs.python.org/issue18647 18 msgs #18606: Add statistics 
module to standard library http://bugs.python.org/issue18606 17 msgs #18629: future division breaks timedelta division by integer http://bugs.python.org/issue18629 16 msgs #18659: test_precision in test_format.py is not executed and has unuse http://bugs.python.org/issue18659 15 msgs #15651: PEP 3121, 384 refactoring applied to elementtree module http://bugs.python.org/issue15651 12 msgs #18677: Enhanced context managers with ContextManagerExit and None http://bugs.python.org/issue18677 11 msgs #16853: add a Selector to the select module http://bugs.python.org/issue16853 9 msgs Issues closed (45) ================== #3591: elementtree tests do not include bytes handling http://bugs.python.org/issue3591 closed by eli.bendersky #4885: mmap enhancement request http://bugs.python.org/issue4885 closed by pitrou #8860: Rounding in timedelta constructor is inconsistent with that in http://bugs.python.org/issue8860 closed by belopolsky #8998: add crypto routines to stdlib http://bugs.python.org/issue8998 closed by gregory.p.smith #10427: 24:00 Hour in DateTime http://bugs.python.org/issue10427 closed by belopolsky #10897: UNIX mmap unnecessarily dup() file descriptor http://bugs.python.org/issue10897 closed by neologix #13083: _sre: getstring() releases the buffer before using it http://bugs.python.org/issue13083 closed by serhiy.storchaka #13612: xml.etree.ElementTree says unknown encoding of a regular encod http://bugs.python.org/issue13612 closed by eli.bendersky #14323: Normalize math precision in RGB/YIQ conversion http://bugs.python.org/issue14323 closed by serhiy.storchaka #15301: os.chown: OverflowError: Python int too large to convert to C http://bugs.python.org/issue15301 closed by larry #15866: encode(..., 'xmlcharrefreplace') produces entities for surroga http://bugs.python.org/issue15866 closed by serhiy.storchaka #15966: concurrent.futures: Executor.submit keyword arguments may not http://bugs.python.org/issue15966 closed by mark.dickinson #16067: UAC prompt 
for installation shows temporary file name http://bugs.python.org/issue16067 closed by loewis #16741: `int()`, `float()`, etc think python strings are null-terminat http://bugs.python.org/issue16741 closed by serhiy.storchaka #17011: ElementPath ignores different namespace mappings for the same http://bugs.python.org/issue17011 closed by eli.bendersky #17046: test_subprocess test_executable_without_cwd fails when run wit http://bugs.python.org/issue17046 closed by ned.deily #17216: sparc linux build fails with "could not import runpy module" http://bugs.python.org/issue17216 closed by ned.deily #17372: provide pretty printer for xml.etree.ElementTree http://bugs.python.org/issue17372 closed by eli.bendersky #17478: Tkinter's split() inconsistent for bytes and unicode strings http://bugs.python.org/issue17478 closed by serhiy.storchaka #17902: Document that _elementtree C API cannot use custom TreeBuilder http://bugs.python.org/issue17902 closed by eli.bendersky #17934: Add a frame method to clear expensive details http://bugs.python.org/issue17934 closed by pitrou #18151: Idlelib: update to "with open ... 
except OSError" (in 2.7, le http://bugs.python.org/issue18151 closed by terry.reedy #18201: distutils write into symlinks instead of replacing them http://bugs.python.org/issue18201 closed by mgorny #18273: Simplify calling and discovery of json test package http://bugs.python.org/issue18273 closed by ezio.melotti #18357: add tests for dictview set difference operations http://bugs.python.org/issue18357 closed by ezio.melotti #18396: test_signal.test_issue9324() fails on buildbot AMD64 Windows7 http://bugs.python.org/issue18396 closed by python-dev #18443: Misc/Readme still documents TextMate http://bugs.python.org/issue18443 closed by ezio.melotti #18532: hashlib.HASH objects should officially expose the hash name http://bugs.python.org/issue18532 closed by christian.heimes #18563: No unit test for yiq to rgb and rgb to yiq converting functio http://bugs.python.org/issue18563 closed by serhiy.storchaka #18570: OverflowError in division: wrong message http://bugs.python.org/issue18570 closed by mark.dickinson #18581: Duplicate test and missing class test in test_abc.py http://bugs.python.org/issue18581 closed by ezio.melotti #18621: site.py keeps too much stuff alive when it patches builtins http://bugs.python.org/issue18621 closed by pitrou #18635: Enum sets _member_type_ to instantiated values but not the cla http://bugs.python.org/issue18635 closed by python-dev #18642: enhancement for operator 'assert' http://bugs.python.org/issue18642 closed by mark.dickinson #18649: list2cmdline function in subprocess module handles \" sequence http://bugs.python.org/issue18649 closed by sbt #18656: setting function.__name__ doesn't affect repr() http://bugs.python.org/issue18656 closed by benjamin.peterson #18657: Remove duplicate ACKS entries http://bugs.python.org/issue18657 closed by r.david.murray #18658: Mercurial CPython log ticket link is broken http://bugs.python.org/issue18658 closed by ned.deily #18661: Typo in grpmodule.c http://bugs.python.org/issue18661 closed 
by mark.dickinson #18662: re.escape should not escape the hyphen http://bugs.python.org/issue18662 closed by jjl #18665: Typos in frame object related code http://bugs.python.org/issue18665 closed by pitrou #18666: Unused variable in test_frame.py http://bugs.python.org/issue18666 closed by pitrou #18668: Properly document setting m_size in PyModuleDef http://bugs.python.org/issue18668 closed by python-dev #18671: enhance formatting in logging package http://bugs.python.org/issue18671 closed by vinay.sajip #18692: Connection change in compiled code http://bugs.python.org/issue18692 closed by r.david.murray From eliben at gmail.com Sun Aug 11 02:12:53 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 10 Aug 2013 17:12:53 -0700 Subject: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others) Message-ID: Hello, Recently as part of the effort of untangling the tests of ElementTree and general code improvements (e.g. http://bugs.python.org/issue15651), I ran into something strange about PEP 3121-compliant modules. I'll demonstrate with csv, just as an example. PEP 3121 mandates this function to look up the module-specific state in the current sub-interpreter: PyObject* PyState_FindModule(struct PyModuleDef*); This appears to make the following assumption: a given sub-interpreter only imports any C extension *once*. If it happens more than once, the assumption breaks in troubling ways. In normal code, it should never happen more than once because of the caching in sys.modules; however, many of our tests monkey-patch sys.modules (mainly by calling test.support.import_fresh_module) and hell breaks loose.
Here's a simple example:

----
import sys

csv = __import__('csv')
csv.register_dialect('unixpwd', delimiter=':', quoting=csv.QUOTE_NONE)
print(csv.list_dialects())
# ==> ['unixpwd', 'excel-tab', 'excel', 'unix']

del sys.modules['csv']  # FUN
del sys.modules['_csv']

some_other_csv = __import__('csv')
print(csv.list_dialects())
# ==> ['excel-tab', 'excel', 'unix']
----

Note how doing some sys.modules acrobatics and re-importing suddenly changes the internal state of a previously imported module. This happens because:

1. The first import of 'csv' (which then imports '_csv') creates module-specific state on the heap and associates it with the current sub-interpreter. The list of dialects, amongst other things, is in that state.
2. The 'del's wipe 'csv' and '_csv' from the cache.
3. The second import of 'csv' also creates/initializes a new '_csv' module because it's not in sys.modules. This *replaces* the per-sub-interpreter cached version of the module's state with the clean state of a new module.

So essentially, while PEP 3121 moves state from C-file globals to per-module state, the state is still global, and this fact can be exposed from pure Python code. The above is a toy example. Here's a more serious case I ran into with ET, but once again is demonstrated with 'csv' for simplicity:

----
import io
from test.support import import_fresh_module
import csv

csv_other = import_fresh_module('csv', fresh=['_csv', 'csv'])

f = io.StringIO('foo\x00,bar\nbaz,42')
reader = csv.reader(f)
try:
    for row in reader:
        print(row)
except csv.Error as e:
    print('Caught csv.error', e)
except Exception as e:
    print('Caught Exception', e)
----

In the above, the reader throws 'csv.Error' (because of the NULL byte) but the except clause does not catch it where expected, because it's a different exception class called `csv.Error`, due to the same problem demonstrated above (if the seemingly innocent import_fresh_module is removed, all is good). Any ideas/suggestions regarding this are welcome.
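[The uncaught csv.Error above comes down to exception identity: an except clause matches by class object (isinstance), not by qualified name. A stripped-down sketch of the same pitfall, with no csv or C extension involved; the class names are made up for illustration:]

----
# Two distinct classes can share the name "Error"; except clauses
# match against the class *object*, not the name.
class Error(Exception):
    pass

OldError = Error          # keep a reference, like the first 'csv' import

class Error(Exception):   # rebind the name, like a fresh re-import
    pass

try:
    raise OldError("NULL byte")   # raised by code holding the old class
except Error:                     # the *new* class: does not match
    caught = "new Error"
except OldError:                  # the old class object: matches
    caught = "old Error"

print(caught)
# ==> old Error
----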
This is quite an esoteric problem, but I believe it's serious. PEP 3121 is not used much (yet), but recently there was talk again about committing some of the patches created for converting Modules/*.c extensions to it during a GSoC project. I believe that we should understand the implications first. There can be a number of solutions, including modifying the PEP 3121 implementation machinery to really create/keep state "per module" and not just "per kind of module in a single sub-interpreter". Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Sun Aug 11 02:25:04 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 10 Aug 2013 20:25:04 -0400 Subject: [Python-Dev] Buildbot failure puzzle Message-ID: At least the following 3.4 buildbots have failed today with an error I do not understand: AMD64 FreeBSD, PPC64, x86Ubuntu, x86 WinServer 2003. Except for the Windows BB, it was the only failure and hence the only reason to not be green.

ERROR: test_xmlcharnamereplace (test.test_codeccallbacks.CodecCallbackTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/test/test_codeccallbacks.py", line 112, in test_xmlcharnamereplace
    self.assertEqual(sin.encode("ascii", "test.xmlcharnamereplace"), sout)
  File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/test/test_codeccallbacks.py", line 102, in xmlcharnamereplace
    l.append("&%s;" % html.entities.codepoint2name[ord(c)])
AttributeError: 'module' object has no attribute 'entities'

test_codeccallbacks.py: lines from 2008-05-17
line 002: import html.entities
...
line 102: l.append("&%s;" % html.entities.codepoint2name[ord(c)])

I checked with an editor and these are the only two appearances of 'html' (so it was not rebound to anything else) and the spellings in the file are the same.
Indeed, the same code has worked on at least some of the same machines. -- Terry Jan Reedy

From solipsis at pitrou.net Sun Aug 11 02:29:33 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 11 Aug 2013 02:29:33 +0200 Subject: [Python-Dev] Buildbot failure puzzle References: Message-ID: <20130811022933.06bc4901@fsol> On Sat, 10 Aug 2013 20:25:04 -0400 Terry Reedy wrote: > At least the following 3.4 buildbots have failed today with an error I > do not understand: AMD64 FreeBSD, PPC64, x86Ubuntu, x86 WinServer 2003. http://bugs.python.org/issue18706

From ncoghlan at gmail.com Sun Aug 11 02:47:09 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 10 Aug 2013 20:47:09 -0400 Subject: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others) In-Reply-To: References: Message-ID: In a similar vein, Antoine recently noted that the fact the per-module state isn't a real PyObject creates a variety of interesting lifecycle management challenges. I'm not seeing an easy solution, either, except to automatically skip reinitialization when the module has already been imported.

From eliben at gmail.com Sun Aug 11 03:06:02 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 10 Aug 2013 18:06:02 -0700 Subject: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others) In-Reply-To: References: Message-ID: On Sat, Aug 10, 2013 at 5:47 PM, Nick Coghlan wrote: > In a similar vein, Antoine recently noted that the fact the per-module > state isn't a real PyObject creates a variety of interesting lifecycle > management challenges. > > I'm not seeing an easy solution, either, except to automatically skip > reinitialization when the module has already been imported. > This solution has problems.
For example, in the case of ET it would preclude testing what happens when pyexpat is disabled (remember we were discussing this...). This is because there would be no real way to create new instances of such modules (they would all cache themselves in the init function - similarly to what ET now does in trunk, because otherwise some of its global-dependent crazy tests fail).

A more radical solution would be to *really* have multiple instances of state per sub-interpreter. Well, they already exist -- it's PyState_FindModule which is the problematic one because it only remembers the last one. But I see that it's only being used by extension modules themselves, to efficiently find modules they belong to. It feels a bit like a hack that was made to avoid rewriting lots of code, because in general a module's objects *can* know which module instance they came from. E.g. it can be saved as a private field in classes exported by the module.

So a more radical approach would be: PyState_FindModule can be deprecated, but still exist and be documented to return the state of the *last* module created in this sub-interpreter. stdlib extension modules that actually use this mechanism can be rewritten to just remember the module for real, and not rely on PyState_FindModule to fetch it from a global cache. I don't think this would be hard, and it would make the good intention of PEP 3121 more real - actual independent state per module instance.

Eli

From tjreedy at udel.edu Sun Aug 11 03:40:46 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 10 Aug 2013 21:40:46 -0400 Subject: [Python-Dev] Green buildbot failure.
Message-ID: This run recorded here shows a green test (it appears to have timed out) http://buildbot.python.org/all/builders/x86%20Windows7%203.x/builds/7017 but the corresponding log for this Windows bot http://buildbot.python.org/all/builders/x86%20Windows7%203.x/builds/7017/steps/test/logs/stdio has the expected os.chown failure. Are such green failures intended? -- Terry Jan Reedy From ncoghlan at gmail.com Sun Aug 11 08:58:13 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 11 Aug 2013 02:58:13 -0400 Subject: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others) In-Reply-To: References: Message-ID: On 10 Aug 2013 21:06, "Eli Bendersky" wrote: > > n Sat, Aug 10, 2013 at 5:47 PM, Nick Coghlan wrote: >> >> In a similar vein, Antoine recently noted that the fact the per-module state isn't a real PyObject creates a variety of interesting lifecycle management challenges. >> >> I'm not seeing an easy solution, either, except to automatically skip reinitialization when the module has already been imported. > > This solution has problems. For example, in the case of ET it would preclude testing what happens when pyexpat is disabled (remember we were discussing this...). This is because there would be no real way to create new instances of such modules (they would all cache themselves in the init function - similarly to what ET now does in trunk, because otherwise some of its global-dependent crazy tests fail). Right, it would still be broken, just in a less horrible way. > > A more radical solution would be to *really* have multiple instances of state per sub-interpreter. Well, they already exist -- it's PyState_FindModule which is the problematic one because it only remembers the last one. But I see that it's only being used by extension modules themselves, to efficiently find modules they belong to. 
It feels a bit like a hack that was made to avoid rewriting lots of code, because in general a module's objects *can* know which module instance they came from. E.g. it can be saved as a private field in classes exported by the module. > > So a more radical approach would be: > > PyState_FindModule can be deprecated, but still exist and be documented to return the state the *last* module created in this sub-interpreter. stdlib extension modules that actually use this mechanism can be rewritten to just remember the module for real, and not rely on PyState_FindModule to fetch it from a global cache. I don't think this would be hard, and it would make the good intention of PEP 3121 more real - actual intependent state per module instance. Sounds promising to me. I suspect handling exported functions will prove to be tricky, though - they may need to be redesigned to behave more like "module methods". > > Eli > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Sun Aug 11 11:58:26 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 11 Aug 2013 11:58:26 +0200 Subject: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others) In-Reply-To: References: Message-ID: <20130811115826.3560f8a0@fsol> On Sat, 10 Aug 2013 18:06:02 -0700 Eli Bendersky wrote: > This solution has problems. For example, in the case of ET it would > preclude testing what happens when pyexpat is disabled (remember we were > discussing this...). This is because there would be no real way to create > new instances of such modules (they would all cache themselves in the init > function - similarly to what ET now does in trunk, because otherwise some > of its global-dependent crazy tests fail). > > A more radical solution would be to *really* have multiple instances of > state per sub-interpreter. 
Well, they already exist -- it's > PyState_FindModule which is the problematic one because it only remembers > the last one.

I'm not sure I understand your diagnosis. modules_per_index (and PyState_FindModule) is per-interpreter so we already have a per-interpreter state here. Something else must be interfering.

Note that module state is just a field attached to the module object ("void *md_state" in PyModuleObject). It's really the extension modules which are per-interpreter, which is a good thing.

Regards

Antoine.

From solipsis at pitrou.net Sun Aug 11 12:00:18 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 11 Aug 2013 12:00:18 +0200 Subject: [Python-Dev] Green buildbot failure. References: Message-ID: <20130811120018.3c209b6b@fsol> On Sat, 10 Aug 2013 21:40:46 -0400 Terry Reedy wrote: > > This run recorded here shows a green test (it appears to have timed out) > http://buildbot.python.org/all/builders/x86%20Windows7%203.x/builds/7017 > but the corresponding log for this Windows bot > http://buildbot.python.org/all/builders/x86%20Windows7%203.x/builds/7017/steps/test/logs/stdio > has the expected os.chown failure. You've got the answer at the bottom: "program finished with exit code 0" So for some reason, the test suite crashed, but with a successful exit code. Buildbot thinks it ran fine. > Are such green failures intended? Not really, no. Regards Antoine.

From solipsis at pitrou.net Sun Aug 11 12:33:16 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 11 Aug 2013 12:33:16 +0200 Subject: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others) In-Reply-To: References: Message-ID: <20130811123316.313ab101@fsol> Hi Eli, On Sat, 10 Aug 2013 17:12:53 -0700 Eli Bendersky wrote: > > Note how doing some sys.modules acrobatics and re-importing suddenly > changes the internal state of a previously imported module. This happens > because: > > 1.
The first import of 'csv' (which then imports `_csv) creates > module-specific state on the heap and associates it with the current > sub-interpreter. The list of dialects, amongst other things, is in that > state. > 2. The 'del's wipe 'csv' and '_csv' from the cache. > 3. The second import of 'csv' also creates/initializes a new '_csv' module > because it's not in sys.modules. This *replaces* the per-sub-interpreter > cached version of the module's state with the clean state of a new module I would say this is pretty much expected. The converse would be a bug IMO (but perhaps Martin disagrees). PEP 3121's stated goal is not only subinterpreter support: "Extension module initialization currently has a few deficiencies. There is no cleanup for modules, the entry point name might give naming conflicts, the entry functions don't follow the usual calling convention, and multiple interpreters are not supported well." Re-initializing state when importing a module anew makes extension modules more like pure Python modules, which is a good thing. I think the piece of interpretation you offered yesterday on IRC may be the right explanation for the ET shenanigans: "Maybe the bug is that ParseError is kept in per-module state, and also exported from the module?" PEP 3121 doesn't offer any guidelines for using its API, and its example shows PyObject* fields in a module state. I'm starting to think that it might be a bad use of PEP 3121. PyObjects can, and therefore should be stored in the extension module dict where they will participate in normal resource management (i.e. garbage collection). If they are in the module dict, then they shouldn't be held alive by the module state too, otherwise the (currently tricky) lifetime management of extension modules can produce oddities. So, the PEP 3121 "module state" pointer (the optional opaque void* thing) should only be used to hold non-PyObjects. PyObjects should go to the module dict, like they do in normal Python modules. 
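As a point of comparison, "like they do in normal Python modules" can be made concrete: for a pure Python module, all state lives in the module dict, so a fresh initialisation starts clean while older references keep the older objects alive. A hypothetical sketch (module name and contents invented):

```python
import types

SOURCE = (
    "dialects = {}\n"
    "def register_dialect(name):\n"
    "    dialects[name] = object()\n"
)

def init_module():
    # Plays the role of a module init function: all state ends up
    # in the module dict, not in a hidden per-interpreter cache.
    mod = types.ModuleType('minicsv')
    exec(SOURCE, mod.__dict__)
    return mod

m1 = init_module()
m1.register_dialect('unixpwd')
m2 = init_module()             # "re-import": a clean module dict

print(sorted(m1.dialects))     # ==> ['unixpwd']  (old state survives)
print(sorted(m2.dialects))     # ==> []           (fresh state)
```

Nothing in m1 was mutated by creating m2 -- which is the behaviour the csv example at the top of this thread fails to exhibit.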
Now, the reason our PEP 3121 extension modules abuse the module state pointer to keep PyObjects is two-fold:

1. it's surprisingly easier (it's actually a one-liner if you don't handle errors - a rather bad thing, but all PEP 3121 extension modules currently don't handle a NULL return from PyState_FindModule...)

2. it protects the module from any module dict monkeypatching. It's not important if you are using a generic API on the PyObject, but it is if the PyObject is really a custom C type with well-defined fields.

Those two issues can be addressed if we offer an API for it. How about:

    PyObject *PyState_GetModuleAttr(struct PyModuleDef *def,
                                    const char *name,
                                    PyObject *restrict_type)

*def* is a pointer to the module definition.
*name* is the attribute to look up on the module dict.
*restrict_type*, if non-NULL, is a type object the looked-up attribute must be an instance of.

Look up an attribute in the current interpreter's extension module instance for the module definition *def*. Returns a *new* reference (!), or NULL if an error occurred. An error can be:

- no such module exists for the current interpreter (ImportError? RuntimeError? SystemError?)
- no such attribute exists in the module dict (AttributeError)
- the attribute doesn't conform to *restrict_type* (TypeError)

So code can be written like:

    PyObject *dialects = PyState_GetModuleAttr(
        &_csvmodule, "dialects", &PyDict_Type);
    if (dialects == NULL)
        return NULL;

Regards

Antoine.

From solipsis at pitrou.net Sun Aug 11 12:37:25 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 11 Aug 2013 12:37:25 +0200 Subject: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others) References: <20130811123316.313ab101@fsol> Message-ID: <20130811123725.5412a53b@fsol> On Sun, 11 Aug 2013 12:33:16 +0200 Antoine Pitrou wrote: > So, the PEP 3121 "module state" pointer (the optional opaque void* > thing) should only be used to hold non-PyObjects.
PyObjects should go > to the module dict, like they do in normal Python modules. Now, the > reason our PEP 3121 extension modules abuse the module state pointer to > keep PyObjects is two-fold: > > 1. it's surprisingly easier (it's actually a one-liner if you don't > handle errors - a rather bad thing, but all PEP 3121 extension modules > currently don't handle a NULL return from PyState_FindModule...) > > 2. it protects the module from any module dict monkeypatching. It's not > important if you are using a generic API on the PyObject, but it is if > the PyObject is really a custom C type with well-defined fields. I overlooked a third reason which is performance. But, those lookups are generally not performance-critical. Regards Antoine. From ncoghlan at gmail.com Sun Aug 11 13:04:40 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 11 Aug 2013 07:04:40 -0400 Subject: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others) In-Reply-To: <20130811123316.313ab101@fsol> References: <20130811123316.313ab101@fsol> Message-ID: On 11 August 2013 06:33, Antoine Pitrou wrote: > So code can be written like: > > PyObject *dialects = PyState_GetModuleAttr( > &_csvmodule, "dialects", &PyDict_Type); > if (dialects == NULL) > return NULL; This sounds like a good near term solution to me. Longer term, I think there may be value in providing a richer extension module initialisation API that lets extension modules be represented as module *subclasses* in sys.modules, since that would get us to a position where it is possible to have *multiple* instances of an extension module in the *same* subinterpreter by holding on to external references after removing them from sys.modules (which is what we do in the test suite for pure Python modules). Enabling that also ties into the question of passing info to the extension module about how it is being loaded (e.g. 
as a submodule of a larger package), as well as allowing extension modules to cleanly handle reload(). However, that's dependent on the ModuleSpec idea we're currently thrashing out on import-sig (and should be able to bring to python-dev soon), and I think getting that integrated at all will be ambitious enough for 3.4 - using it to improve extension module handling would then be a project for 3.5. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Sun Aug 11 13:48:26 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 11 Aug 2013 13:48:26 +0200 Subject: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others) In-Reply-To: References: <20130811123316.313ab101@fsol> Message-ID: <20130811134826.3b7d48ce@fsol> On Sun, 11 Aug 2013 07:04:40 -0400 Nick Coghlan wrote: > On 11 August 2013 06:33, Antoine Pitrou wrote: > > So code can be written like: > > > > PyObject *dialects = PyState_GetModuleAttr( > > &_csvmodule, "dialects", &PyDict_Type); > > if (dialects == NULL) > > return NULL; > > This sounds like a good near term solution to me. > > Longer term, I think there may be value in providing a richer > extension module initialisation API that lets extension modules be > represented as module *subclasses* in sys.modules, since that would > get us to a position where it is possible to have *multiple* instances > of an extension module in the *same* subinterpreter by holding on to > external references after removing them from sys.modules (which is > what we do in the test suite for pure Python modules). Either that, or add a "struct PyMemberDef *m_members" field to PyModuleDef, to enable looking up stuff in the m_state using regular attribute lookup. Unfortunately, doing so would probably break the ABI. Also, allowing for module subclasses is probably more flexible in the long term. 
We just need to devise a convenience API for that (perhaps by allowing to create both the subclass *and* instantiate it in a single call). > However, that's dependent on the ModuleSpec idea we're > currently thrashing out on import-sig (and should be able to bring to > python-dev soon), and I think getting that integrated at all will be > ambitious enough for 3.4 - using it to improve extension module > handling would then be a project for 3.5. Sounds reasonable. Regards Antoine. From stefan_ml at behnel.de Sun Aug 11 14:01:43 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 11 Aug 2013 14:01:43 +0200 Subject: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others) In-Reply-To: <20130811123316.313ab101@fsol> References: <20130811123316.313ab101@fsol> Message-ID: Antoine Pitrou, 11.08.2013 12:33: > On Sat, 10 Aug 2013 17:12:53 -0700 Eli Bendersky wrote: >> Note how doing some sys.modules acrobatics and re-importing suddenly >> changes the internal state of a previously imported module. This happens >> because: >> >> 1. The first import of 'csv' (which then imports `_csv) creates >> module-specific state on the heap and associates it with the current >> sub-interpreter. The list of dialects, amongst other things, is in that >> state. >> 2. The 'del's wipe 'csv' and '_csv' from the cache. >> 3. The second import of 'csv' also creates/initializes a new '_csv' module >> because it's not in sys.modules. This *replaces* the per-sub-interpreter >> cached version of the module's state with the clean state of a new module > > I would say this is pretty much expected. The converse would be a bug > IMO (but perhaps Martin disagrees). PEP 3121's stated goal is not only > subinterpreter support: > > "Extension module initialization currently has a few deficiencies. 
> There is no cleanup for modules, the entry point name might give > naming conflicts, the entry functions don't follow the usual calling > convention, and multiple interpreters are not supported well." > > Re-initializing state when importing a module anew makes extension > modules more like pure Python modules, which is a good thing. It's the same as defining a type or function in a loop, or inside of a closure. The whole point of reimporting is that you get a new module. However, it should not change the content of the old module, just create a new one. > So, the PEP 3121 "module state" pointer (the optional opaque void* > thing) should only be used to hold non-PyObjects. PyObjects should go > to the module dict, like they do in normal Python modules. Now, the > reason our PEP 3121 extension modules abuse the module state pointer to > keep PyObjects is two-fold: > > 1. it's surprisingly easier (it's actually a one-liner if you don't > handle errors - a rather bad thing, but all PEP 3121 extension modules > currently don't handle a NULL return from PyState_FindModule...) > > 2. it protects the module from any module dict monkeypatching. It's not > important if you are using a generic API on the PyObject, but it is if > the PyObject is really a custom C type with well-defined fields. Yes, it's a major safety problem if you can crash the interpreter by assigning None to a module attribute. > Those two issues can be addressed if we offer an API for it. How about: > > PyObject *PyState_GetModuleAttr(struct PyModuleDef *def, > const char *name, > PyObject *restrict_type) > > *def* is a pointer to the module definition. > *name* is the attribute to look up on the module dict. > *restrict_type*, if non-NULL, is a type object the looked up attribute > must be an instance of. > > Lookup an attribute in the current interpreter's extension module > instance for the module definition *def*. > Returns a *new* reference (!), or NULL if an error occurred. 
> An error can be: > - no such module exists for the current interpreter (ImportError? > RuntimeError? SystemError?) > - no such attribute exists in the module dict (AttributeError) > - the attribute doesn't conform to *restrict_type* (TypeError) > > So code can be written like: > > PyObject *dialects = PyState_GetModuleAttr( > &_csvmodule, "dialects", &PyDict_Type); > if (dialects == NULL) > return NULL; At least for Cython it's unlikely that it'll ever use this. It's just way too much overhead for looking up a global name. Plus, not all global names are visible in the module dict, e.g. it's common to have types that are only used internally to keep some kind of state. Those would still have to live in the internal per-module state. ISTM that this is not a proper solution for the problem, because it only covers the simple use cases. Rather, I'd prefer making the handling of names in the per-module instance state safer. Essentially, with PEP 3121, modules are just one form of an extension type. So what's wrong with giving them normal extension type fields? Functions are essentially methods of the module, global types are just inner classes. Both should keep the module alive (on the one side) and be tied to it (on the other side). If you reimport a module, you'd get a new set of everything, and the old module would just linger in the background until the last reference to it dies. In other words, I don't see why modules should be any special. 
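The "modules are just one form of an extension type" view has a direct pure-Python rendering: the exported functions are methods bound to the module instance (so they keep it alive and see only its own state), and exported classes are tied to their owning instance. A hypothetical sketch (all names invented):

```python
class MiniCSV:
    # A "module" rendered as an ordinary class: per-instance state,
    # no globals anywhere.
    def __init__(self):
        self.dialects = {}
        # An exported class, tied to its owning "module" instance.
        class Error(Exception):
            pass
        Error.owner = self
        self.Error = Error

    # An exported "module function" is just a bound method: it keeps
    # the module instance alive and only ever sees its own state.
    def register_dialect(self, name):
        self.dialects[name] = object()

a, b = MiniCSV(), MiniCSV()
a.register_dialect('unixpwd')

print(sorted(a.dialects))   # ==> ['unixpwd']
print(sorted(b.dialects))   # ==> []
print(a.Error is b.Error)   # ==> False
```

Here a "reimport" (creating b) yields a new set of everything, and the old instance lingers until its last reference dies -- the behaviour argued for above.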
Stefan From stefan_ml at behnel.de Sun Aug 11 14:16:10 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 11 Aug 2013 14:16:10 +0200 Subject: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others) In-Reply-To: <20130811134826.3b7d48ce@fsol> References: <20130811123316.313ab101@fsol> <20130811134826.3b7d48ce@fsol> Message-ID: Antoine Pitrou, 11.08.2013 13:48: > On Sun, 11 Aug 2013 07:04:40 -0400 Nick Coghlan wrote: >> On 11 August 2013 06:33, Antoine Pitrou wrote: >>> So code can be written like: >>> >>> PyObject *dialects = PyState_GetModuleAttr( >>> &_csvmodule, "dialects", &PyDict_Type); >>> if (dialects == NULL) >>> return NULL; >> >> This sounds like a good near term solution to me. >> >> Longer term, I think there may be value in providing a richer >> extension module initialisation API that lets extension modules be >> represented as module *subclasses* in sys.modules, since that would >> get us to a position where it is possible to have *multiple* instances >> of an extension module in the *same* subinterpreter by holding on to >> external references after removing them from sys.modules (which is >> what we do in the test suite for pure Python modules). > > Either that, or add a "struct PyMemberDef *m_members" field to > PyModuleDef, to enable looking up stuff in the m_state using regular > attribute lookup. Hmm, yes, it's unfortunate that the module state isn't just a public part of the object struct. > Unfortunately, doing so would probably break the ABI. Also, allowing > for module subclasses is probably more flexible in the long term. +1000 > We > just need to devise a convenience API for that (perhaps by allowing to > create both the subclass *and* instantiate it in a single call). Right. This conflicts somewhat with the simplified module creation. 
If the module loader passed the readily instantiated module instance into the module init function, then module subtypes don't fit into this scheme anymore. One more reason why modules shouldn't be special. Essentially, we need an m_new() and m_init() for them. And the lifetime of the module type would have to be linked to the (sub-)interpreter, whereas the lifetime of the module instance would be determined by whoever uses the module and/or decides to unload/reload it. Stefan From shibturn at gmail.com Sun Aug 11 14:27:55 2013 From: shibturn at gmail.com (Richard Oudkerk) Date: Sun, 11 Aug 2013 13:27:55 +0100 Subject: [Python-Dev] Green buildbot failure. In-Reply-To: <20130811120018.3c209b6b@fsol> References: <20130811120018.3c209b6b@fsol> Message-ID: On 11/08/2013 11:00am, Antoine Pitrou wrote: > You've got the answer at the bottom: > > "program finished with exit code 0" > > So for some reason, the test suite crashed, but with a successful exit > code. Buildbot thinks it ran fine. Was the test terminated because it took too long? TerminateProcess(handle, exitcode) sometimes makes the program exit with return code 0 instead of exitcode. At any rate, test_multiprocessing contains this disabled test: # XXX sometimes get p.exitcode == 0 on Windows ... #self.assertEqual(p.exitcode, -signal.SIGTERM) -- Richard From solipsis at pitrou.net Sun Aug 11 14:32:13 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 11 Aug 2013 14:32:13 +0200 Subject: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others) References: <20130811123316.313ab101@fsol> <20130811134826.3b7d48ce@fsol> Message-ID: <20130811143213.45c8523d@fsol> On Sun, 11 Aug 2013 14:16:10 +0200 Stefan Behnel wrote: > > > We > > just need to devise a convenience API for that (perhaps by allowing to > > create both the subclass *and* instantiate it in a single call). > > Right. This conflicts somewhat with the simplified module creation. 
If the > module loader passed the readily instantiated module instance into the > module init function, then module subtypes don't fit into this scheme anymore. > > One more reason why modules shouldn't be special. Essentially, we need an > m_new() and m_init() for them. And the lifetime of the module type would > have to be linked to the (sub-)interpreter, whereas the lifetime of the > module instance would be determined by whoever uses the module and/or > decides to unload/reload it. It may be simpler if the only strong reference to the module type is in the module instance itself. Successive module initializations would get different types, but that shouldn't be a problem in practice. Regards Antoine. From shibturn at gmail.com Sun Aug 11 14:41:20 2013 From: shibturn at gmail.com (Richard Oudkerk) Date: Sun, 11 Aug 2013 13:41:20 +0100 Subject: [Python-Dev] Green buildbot failure. In-Reply-To: References: <20130811120018.3c209b6b@fsol> Message-ID: http://stackoverflow.com/questions/2061735/42-passed-to-terminateprocess-sometimes-getexitcodeprocess-returns-0 -- Richard From stefan_ml at behnel.de Sun Aug 11 14:48:48 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 11 Aug 2013 14:48:48 +0200 Subject: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others) In-Reply-To: <20130811143213.45c8523d@fsol> References: <20130811123316.313ab101@fsol> <20130811134826.3b7d48ce@fsol> <20130811143213.45c8523d@fsol> Message-ID: Antoine Pitrou, 11.08.2013 14:32: > On Sun, 11 Aug 2013 14:16:10 +0200 Stefan Behnel wrote: >>> We >>> just need to devise a convenience API for that (perhaps by allowing to >>> create both the subclass *and* instantiate it in a single call). >> >> Right. This conflicts somewhat with the simplified module creation. If the >> module loader passed the readily instantiated module instance into the >> module init function, then module subtypes don't fit into this scheme anymore. 
>> >> One more reason why modules shouldn't be special. Essentially, we need an >> m_new() and m_init() for them. And the lifetime of the module type would >> have to be linked to the (sub-)interpreter, whereas the lifetime of the >> module instance would be determined by whoever uses the module and/or >> decides to unload/reload it. > > It may be simpler if the only strong reference to the module type is in > the module instance itself. Successive module initializations would get > different types, but that shouldn't be a problem in practice. Agreed. Then the module instance would just be the only instance of a new type that gets created each time the module initialised. Even if module subtypes were to become common place once they are generally supported (because they're the easiest way to store per-module state efficiently), module reinitialisation should be rare enough to just buy them with a new type for each. The size of the complete module state+dict will almost always outweigh the size of the one additional type by factors. Stefan From stefan_ml at behnel.de Sun Aug 11 14:53:59 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 11 Aug 2013 14:53:59 +0200 Subject: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others) In-Reply-To: References: <20130811123316.313ab101@fsol> <20130811134826.3b7d48ce@fsol> <20130811143213.45c8523d@fsol> Message-ID: Stefan Behnel, 11.08.2013 14:48: > Antoine Pitrou, 11.08.2013 14:32: >> On Sun, 11 Aug 2013 14:16:10 +0200 Stefan Behnel wrote: >>>> We >>>> just need to devise a convenience API for that (perhaps by allowing to >>>> create both the subclass *and* instantiate it in a single call). >>> >>> Right. This conflicts somewhat with the simplified module creation. If the >>> module loader passed the readily instantiated module instance into the >>> module init function, then module subtypes don't fit into this scheme anymore. 
>>> >>> One more reason why modules shouldn't be special. Essentially, we need an >>> m_new() and m_init() for them. And the lifetime of the module type would >>> have to be linked to the (sub-)interpreter, whereas the lifetime of the >>> module instance would be determined by whoever uses the module and/or >>> decides to unload/reload it. >> >> It may be simpler if the only strong reference to the module type is in >> the module instance itself. Successive module initializations would get >> different types, but that shouldn't be a problem in practice. > > Agreed. Then the module instance would just be the only instance of a new > type that gets created each time the module initialised. Even if module > subtypes were to become common place once they are generally supported > (because they're the easiest way to store per-module state efficiently), > module reinitialisation should be rare enough to just buy them with a new > type for each. The size of the complete module state+dict will almost > always outweigh the size of the one additional type by factors. BTW, this already suggests a simple module initialisation interface. The extension module would expose a function that returns a module type, and the loader/importer would then simply instantiate that. Nothing else is needed. 
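The proposed protocol -- the init function returns a type and the importer instantiates it -- can be mocked up for pure Python modules in a few lines (all names hypothetical):

```python
import sys
import types

def modinit_minicsv():
    # What the proposed init function would return: just a type.
    class MiniCSV(types.ModuleType):
        def __init__(self, name):
            super().__init__(name)
            self.dialects = {}   # per-instance module state
    return MiniCSV

def load(name, initfunc):
    # The importer's side of the protocol: instantiate and cache.
    mod = initfunc()(name)
    sys.modules[name] = mod
    return mod

m1 = load('minicsv', modinit_minicsv)
m2 = modinit_minicsv()('minicsv')   # a second, independent instance

print(m1 is sys.modules['minicsv']) # ==> True
print(m1.dialects is m2.dialects)   # ==> False
print(type(m1) is type(m2))         # ==> False
```

Note that each init call produces a fresh type as well as a fresh instance, matching the earlier observation that successive initialisations getting different types shouldn't be a problem in practice.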
Stefan From stefan_ml at behnel.de Sun Aug 11 14:58:48 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 11 Aug 2013 14:58:48 +0200 Subject: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others) In-Reply-To: References: <20130811123316.313ab101@fsol> <20130811134826.3b7d48ce@fsol> <20130811143213.45c8523d@fsol> Message-ID: Stefan Behnel, 11.08.2013 14:53: > Stefan Behnel, 11.08.2013 14:48: >> Antoine Pitrou, 11.08.2013 14:32: >>> On Sun, 11 Aug 2013 14:16:10 +0200 Stefan Behnel wrote: >>>>> We >>>>> just need to devise a convenience API for that (perhaps by allowing to >>>>> create both the subclass *and* instantiate it in a single call). >>>> >>>> Right. This conflicts somewhat with the simplified module creation. If the >>>> module loader passed the readily instantiated module instance into the >>>> module init function, then module subtypes don't fit into this scheme anymore. >>>> >>>> One more reason why modules shouldn't be special. Essentially, we need an >>>> m_new() and m_init() for them. And the lifetime of the module type would >>>> have to be linked to the (sub-)interpreter, whereas the lifetime of the >>>> module instance would be determined by whoever uses the module and/or >>>> decides to unload/reload it. >>> >>> It may be simpler if the only strong reference to the module type is in >>> the module instance itself. Successive module initializations would get >>> different types, but that shouldn't be a problem in practice. >> >> Agreed. Then the module instance would just be the only instance of a new >> type that gets created each time the module initialised. Even if module >> subtypes were to become common place once they are generally supported >> (because they're the easiest way to store per-module state efficiently), >> module reinitialisation should be rare enough to just buy them with a new >> type for each. 
The size of the complete module state+dict will almost >> always outweigh the size of the one additional type by factors. > > BTW, this already suggests a simple module initialisation interface. The > extension module would expose a function that returns a module type, and > the loader/importer would then simply instantiate that. Nothing else is needed. Actually, strike the word "module type" and replace it with "type". Is there really a reason why Python needs a module type at all? I mean, you can stick arbitrary objects in sys.modules, so why not allow arbitrary types to be returned by the module creation function? Stefan From eliben at gmail.com Sun Aug 11 15:03:19 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 11 Aug 2013 06:03:19 -0700 Subject: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others) In-Reply-To: <20130811115826.3560f8a0@fsol> References: <20130811115826.3560f8a0@fsol> Message-ID: On Sun, Aug 11, 2013 at 2:58 AM, Antoine Pitrou wrote: > On Sat, 10 Aug 2013 18:06:02 -0700 > Eli Bendersky wrote: > > This solution has problems. For example, in the case of ET it would > > preclude testing what happens when pyexpat is disabled (remember we were > > discussing this...). This is because there would be no real way to create > > new instances of such modules (they would all cache themselves in the init > > function - similarly to what ET now does in trunk, because otherwise some > > of its global-dependent crazy tests fail). > > > > A more radical solution would be to *really* have multiple instances of > > state per sub-interpreter. Well, they already exist -- it's > > PyState_FindModule which is the problematic one because it only remembers > > the last one. > > I'm not sure I understand your diagnosis. modules_by_index (and > PyState_FindModule) is per-interpreter so we already have a > per-interpreter state here. Something else must be interfering. 
> > Yes, it's per interpreter, but only one per interpreter is remembered in state->modules_by_index. What I'm trying to say is that currently two different instances of PyModuleObject *within the same interpreter* share the state if they get to it through PyState_FindModule, because they share the same PyModuleDef, and state->modules_by_index keeps only one module per PyModuleDef. > Note that module state is just a field attached to the module object > ("void *md_state" in PyModuleObject). It's really the extension modules > which are per-interpreter, which is a good thing. From ncoghlan at gmail.com Sun Aug 11 15:19:41 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 11 Aug 2013 09:19:41 -0400 Subject: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others) In-Reply-To: References: <20130811123316.313ab101@fsol> <20130811134826.3b7d48ce@fsol> <20130811143213.45c8523d@fsol> Message-ID: On 11 Aug 2013 09:02, "Stefan Behnel" wrote: > > Stefan Behnel, 11.08.2013 14:53: > > Stefan Behnel, 11.08.2013 14:48: > >> Antoine Pitrou, 11.08.2013 14:32: > >>> On Sun, 11 Aug 2013 14:16:10 +0200 Stefan Behnel wrote: > >>>>> We > >>>>> just need to devise a convenience API for that (perhaps by allowing to > >>>>> create both the subclass *and* instantiate it in a single call). > >>>> > >>>> Right. This conflicts somewhat with the simplified module creation. If the > >>>> module loader passed the readily instantiated module instance into the > >>>> module init function, then module subtypes don't fit into this scheme anymore. > >>>> > >>>> One more reason why modules shouldn't be special. Essentially, we need an > >>>> m_new() and m_init() for them. 
And the lifetime of the module type would > >>>> have to be linked to the (sub-)interpreter, whereas the lifetime of the > >>>> module instance would be determined by whoever uses the module and/or > >>>> decides to unload/reload it. > >>> > >>> It may be simpler if the only strong reference to the module type is in > >>> the module instance itself. Successive module initializations would get > >>> different types, but that shouldn't be a problem in practice. > >> > >> Agreed. Then the module instance would just be the only instance of a new > >> type that gets created each time the module initialised. Even if module > >> subtypes were to become common place once they are generally supported > >> (because they're the easiest way to store per-module state efficiently), > >> module reinitialisation should be rare enough to just buy them with a new > >> type for each. The size of the complete module state+dict will almost > >> always outweigh the size of the one additional type by factors. > > > > BTW, this already suggests a simple module initialisation interface. The > > extension module would expose a function that returns a module type, and > > the loader/importer would then simply instantiate that. Nothing else is needed. > > Actually, strike the word "module type" and replace it with "type". Is > there really a reason why Python needs a module type at all? I mean, you > can stick arbitrary objects in sys.modules, so why not allow arbitrary > types to be returned by the module creation function? That's exactly what I have in mind, but the way extension module imports currently work means we can't easily do it just yet. Fortunately, importlib means we now have some hope of fixing that :) Cheers, Nick. 
> > Stefan > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com From eliben at gmail.com Sun Aug 11 15:26:55 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 11 Aug 2013 06:26:55 -0700 Subject: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others) In-Reply-To: <20130811123316.313ab101@fsol> References: <20130811123316.313ab101@fsol> Message-ID: On Sun, Aug 11, 2013 at 3:33 AM, Antoine Pitrou wrote: > > Hi Eli, > > On Sat, 10 Aug 2013 17:12:53 -0700 > Eli Bendersky wrote: > > > > Note how doing some sys.modules acrobatics and re-importing suddenly > > changes the internal state of a previously imported module. This happens > > because: > > > > 1. The first import of 'csv' (which then imports `_csv`) creates > > module-specific state on the heap and associates it with the current > > sub-interpreter. The list of dialects, amongst other things, is in that > > state. > > 2. The 'del's wipe 'csv' and '_csv' from the cache. > > 3. The second import of 'csv' also creates/initializes a new '_csv' > module > > because it's not in sys.modules. This *replaces* the per-sub-interpreter > > cached version of the module's state with the clean state of a new module > > I would say this is pretty much expected. I'm struggling to see how it's expected. The two imported csv modules are different (i.e. different id() of members), and yet some state is shared between them. I think the root reason for it is that "PyModuleDef _csvmodule" is uniqued per interpreter, not per module instance. Even if dialects were not a PyObject, this would still be problematic, don't you think? 
And note that here, unlike the ET.ParseError case, I don't think the problem is exporting internal per-module state as a module attribute. The following two are irreconcilable, IMHO: 1. Wanting to have two instances of the same module in the same interpreter. 2. Using a global shared PyModuleDef between all instances of the same module in the same interpreter. > The converse would be a bug > IMO (but perhaps Martin disagrees). PEP 3121's stated goal is not only > subinterpreter support: > > "Extension module initialization currently has a few deficiencies. > There is no cleanup for modules, the entry point name might give > naming conflicts, the entry functions don't follow the usual calling > convention, and multiple interpreters are not supported well." > > Re-initializing state when importing a module anew makes extension > modules more like pure Python modules, which is a good thing. > > > I think the piece of interpretation you offered yesterday on IRC may be > the right explanation for the ET shenanigans: > > "Maybe the bug is that ParseError is kept in per-module state, and > also exported from the module?" > > PEP 3121 doesn't offer any guidelines for using its API, and its > example shows PyObject* fields in a module state. > > I'm starting to think that it might be a bad use of PEP 3121. PyObjects > can, and therefore should, be stored in the extension module dict where > they will participate in normal resource management (i.e. garbage > collection). If they are in the module dict, then they shouldn't be > held alive by the module state too, otherwise the (currently tricky) > lifetime management of extension modules can produce oddities. > > > So, the PEP 3121 "module state" pointer (the optional opaque void* > thing) should only be used to hold non-PyObjects. PyObjects should go > to the module dict, like they do in normal Python modules. 
> Now, the reason our PEP 3121 extension modules abuse the module state
> pointer to keep PyObjects is two-fold:
>
> 1. it's surprisingly easier (it's actually a one-liner if you don't
> handle errors - a rather bad thing, but all PEP 3121 extension modules
> currently don't handle a NULL return from PyState_FindModule...)
>
> 2. it protects the module from any module dict monkeypatching. It's not
> important if you are using a generic API on the PyObject, but it is if
> the PyObject is really a custom C type with well-defined fields.
>
> Those two issues can be addressed if we offer an API for it. How about:
>
>     PyObject *PyState_GetModuleAttr(struct PyModuleDef *def,
>                                     const char *name,
>                                     PyObject *restrict_type)
>
> *def* is a pointer to the module definition.
> *name* is the attribute to look up on the module dict.
> *restrict_type*, if non-NULL, is a type object the looked up attribute
> must be an instance of.
>
> Lookup an attribute in the current interpreter's extension module
> instance for the module definition *def*.
> Returns a *new* reference (!), or NULL if an error occurred.
> An error can be:
> - no such module exists for the current interpreter (ImportError?
>   RuntimeError? SystemError?)
> - no such attribute exists in the module dict (AttributeError)
> - the attribute doesn't conform to *restrict_type* (TypeError)
>
> So code can be written like:
>
>     PyObject *dialects = PyState_GetModuleAttr(
>         &_csvmodule, "dialects", &PyDict_Type);
>     if (dialects == NULL)
>         return NULL;
From solipsis at pitrou.net Sun Aug 11 15:40:42 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 11 Aug 2013 15:40:42 +0200 Subject: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others) In-Reply-To: References: <20130811123316.313ab101@fsol> Message-ID: <20130811154042.1e829010@fsol> On Sun, 11 Aug 2013 06:26:55 -0700 Eli Bendersky wrote: > On Sun, Aug 11, 2013 at 3:33 AM, Antoine Pitrou wrote: > > > > > Hi Eli, > > > > On Sat, 10 Aug 2013 17:12:53 -0700 > > Eli Bendersky wrote: > > > > > > Note how doing some sys.modules acrobatics and re-importing suddenly > > > changes the internal state of a previously imported module. This happens > > > because: > > > > > > 1. The first import of 'csv' (which then imports `_csv`) creates > > > module-specific state on the heap and associates it with the current > > > sub-interpreter. The list of dialects, amongst other things, is in that > > > state. > > > 2. The 'del's wipe 'csv' and '_csv' from the cache. > > > 3. The second import of 'csv' also creates/initializes a new '_csv' > > > module > > > because it's not in sys.modules. This *replaces* the per-sub-interpreter > > > cached version of the module's state with the clean state of a new module > > > > I would say this is pretty much expected. > > I'm struggling to see how it's expected. The two imported csv modules are > different (i.e. different id() of members), and yet some state is shared > between them. There are two csv modules, but there are not two _csv modules. Extension modules are currently immortal until the end of the interpreter:

>>> csv = __import__('csv')
>>> wcsv = weakref.ref(csv)
>>> w_csv = weakref.ref(sys.modules['_csv'])
>>> del sys.modules['csv']
>>> del sys.modules['_csv']
>>> del csv
>>> gc.collect()
50
>>> wcsv()
>>> w_csv()

So, "sharing" a state is pretty much expected, since you are re-initializing an existing module. 
(but the module does get re-initialized, which is the point of PEP 3121) > 1. Wanting to have two instances of the same module in the same interpterer. It could be nice, but really, that's not a common use case. And it's impossible for extension modules, currently. Regards Antoine. From stefan_ml at behnel.de Sun Aug 11 15:52:52 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 11 Aug 2013 15:52:52 +0200 Subject: [Python-Dev] redesigning the extension module initialisation protocol (was: Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)) In-Reply-To: References: <20130811123316.313ab101@fsol> <20130811134826.3b7d48ce@fsol> <20130811143213.45c8523d@fsol> Message-ID: Nick Coghlan, 11.08.2013 15:19: > On 11 Aug 2013 09:02, "Stefan Behnel" wrote: >>> BTW, this already suggests a simple module initialisation interface. The >>> extension module would expose a function that returns a module type, and >>> the loader/importer would then simply instantiate that. Nothing else is >>> needed. >> >> Actually, strike the word "module type" and replace it with "type". Is >> there really a reason why Python needs a module type at all? I mean, you >> can stick arbitrary objects in sys.modules, so why not allow arbitrary >> types to be returned by the module creation function? > > That's exactly what I have in mind, but the way extension module imports > currently work means we can't easily do it just yet. Fortunately, importlib > means we now have some hope of fixing that :) Well, what do we need? We don't need to care about existing code, as long as the current scheme is only deprecated and not deleted. That won't happen before Py4 anyway. New code would simply export a different symbol when compiling for a CPython that supports it, which points to the function that returns the type. Then, there's already the PyType_Copy() function, which can be used to create a heap type from a statically defined type. 
So extension modules can simply define an (arbitrary) additional type in any way they see fit, copy it to the heap, and return it. Next, we need to define a signature for the type's __init__() method. This can be done in a future proof way by allowing arbitrary keyword arguments to be added, i.e. such a type must have a signature like def __init__(self, currently, used, pos, args, **kwargs) and simply ignore kwargs for now. Actually, we may get away with not passing all too many arguments here if we allow the importer to add stuff to the type's dict in between, specifically __file__, __path__ and friends, so that they are available before the type gets instantiated. Not sure if this is a good idea, but it would at least relieve the user from having to copy these things over from some kind of context or whatever we might want to pass in. Alternatively, we could split the instantiation up between tp_new() and tp_init(), and let the importer set stuff on the instance dict in between the two. But given that this context won't actually change once the shared library is loaded, the only reason to prefer modifying the instance instead of the type would be to avoid requiring a tp_dict for the type. Open for discussion, I guess. Did I forget anything? Sounds simple enough to me so far. 
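[Editor's note: Stefan's "ignore kwargs for now" signature idea can be illustrated directly. The parameter names below are hypothetical — the actual set of arguments the importer would pass is exactly what is left open for discussion above.]

```python
class ModuleLike:
    # Accept the currently defined arguments plus arbitrary keywords, so
    # future revisions of the protocol can add parameters without breaking
    # module types written against today's signature.
    def __init__(self, name, loader=None, **kwargs):
        self.__name__ = name
        self._loader = loader
        self._ignored = kwargs        # tolerated, unused for now

# A future importer passing an extra, not-yet-specified keyword still works:
m = ModuleLike("spam", loader=None, future_flag=True)
print(m.__name__)                     # spam
```

This is the same forward-compatibility trick Python code uses for callback protocols: new keyword arguments land harmlessly in `**kwargs` until the type opts in to handling them.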
Stefan From eliben at gmail.com Sun Aug 11 17:49:56 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 11 Aug 2013 08:49:56 -0700 Subject: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others) In-Reply-To: <20130811154042.1e829010@fsol> References: <20130811123316.313ab101@fsol> <20130811154042.1e829010@fsol> Message-ID: On Sun, Aug 11, 2013 at 6:40 AM, Antoine Pitrou wrote: > On Sun, 11 Aug 2013 06:26:55 -0700 > Eli Bendersky wrote: > > On Sun, Aug 11, 2013 at 3:33 AM, Antoine Pitrou > wrote: > > > > > > > > Hi Eli, > > > > > > On Sat, 10 Aug 2013 17:12:53 -0700 > > > Eli Bendersky wrote: > > > > > > > > Note how doing some sys.modules acrobatics and re-importing suddenly > > > > changes the internal state of a previously imported module. This > happens > > > > because: > > > > > > > > 1. The first import of 'csv' (which then imports `_csv) creates > > > > module-specific state on the heap and associates it with the current > > > > sub-interpreter. The list of dialects, amongst other things, is in > that > > > > state. > > > > 2. The 'del's wipe 'csv' and '_csv' from the cache. > > > > 3. The second import of 'csv' also creates/initializes a new '_csv' > > > module > > > > because it's not in sys.modules. This *replaces* the > per-sub-interpreter > > > > cached version of the module's state with the clean state of a new > module > > > > > > I would say this is pretty much expected. > > > > I'm struggling to see how it's expected. The two imported csv modules are > > different (i.e. different id() of members), and yet some state is shared > > between them. > > There are two csv modules, but there are not two _csv modules. 
> Extension modules are currently immortal until the end of the > interpreter: > > >>> csv = __import__('csv') > >>> wcsv = weakref.ref(csv) > >>> w_csv = weakref.ref(sys.modules['_csv']) > >>> del sys.modules['csv'] > >>> del sys.modules['_csv'] > >>> del csv > >>> gc.collect() > 50 > >>> wcsv() > >>> w_csv() > '/home/antoine/cpython/default/build/lib.linux-x86_64-3.4-pydebug/_ > csv.cpython-34dm.so'> > > > So, "sharing" a state is pretty much expected, since you are > re-initializating an existing module. > (but the module does get re-initialized, which is the point of PEP 3121) > Yes, you're right - this is an oversight on my behalf. Indeed, the extensions dict in import.c keeps it alive once loaded, and only ever gets cleaned up in Py_Finalize. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Sun Aug 11 17:56:53 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 11 Aug 2013 17:56:53 +0200 Subject: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others) In-Reply-To: References: <20130811123316.313ab101@fsol> <20130811154042.1e829010@fsol> Message-ID: <20130811175653.1759db01@fsol> On Sun, 11 Aug 2013 08:49:56 -0700 Eli Bendersky wrote: > On Sun, Aug 11, 2013 at 6:40 AM, Antoine Pitrou wrote: > > > On Sun, 11 Aug 2013 06:26:55 -0700 > > Eli Bendersky wrote: > > > On Sun, Aug 11, 2013 at 3:33 AM, Antoine Pitrou > > wrote: > > > > > > > > > > > Hi Eli, > > > > > > > > On Sat, 10 Aug 2013 17:12:53 -0700 > > > > Eli Bendersky wrote: > > > > > > > > > > Note how doing some sys.modules acrobatics and re-importing suddenly > > > > > changes the internal state of a previously imported module. This > > happens > > > > > because: > > > > > > > > > > 1. The first import of 'csv' (which then imports `_csv) creates > > > > > module-specific state on the heap and associates it with the current > > > > > sub-interpreter. 
The list of dialects, amongst other things, is in > > that > > > > > state. > > > > > 2. The 'del's wipe 'csv' and '_csv' from the cache. > > > > > 3. The second import of 'csv' also creates/initializes a new '_csv' > > > > module > > > > > because it's not in sys.modules. This *replaces* the > > per-sub-interpreter > > > > > cached version of the module's state with the clean state of a new > > module > > > > > > > > I would say this is pretty much expected. > > > > > > I'm struggling to see how it's expected. The two imported csv modules are > > > different (i.e. different id() of members), and yet some state is shared > > > between them. > > > > There are two csv modules, but there are not two _csv modules. > > Extension modules are currently immortal until the end of the > > interpreter: > > > > >>> csv = __import__('csv') > > >>> wcsv = weakref.ref(csv) > > >>> w_csv = weakref.ref(sys.modules['_csv']) > > >>> del sys.modules['csv'] > > >>> del sys.modules['_csv'] > > >>> del csv > > >>> gc.collect() > > 50 > > >>> wcsv() > > >>> w_csv() > > > '/home/antoine/cpython/default/build/lib.linux-x86_64-3.4-pydebug/_ > > csv.cpython-34dm.so'> > > > > > > So, "sharing" a state is pretty much expected, since you are > > re-initializating an existing module. > > (but the module does get re-initialized, which is the point of PEP 3121) > > > > Yes, you're right - this is an oversight on my behalf. Indeed, the > extensions dict in import.c keeps it alive once loaded, and only ever gets > cleaned up in Py_Finalize. It's not the extensions dict in import.c, it's modules_by_index in the interpreter state. (otherwise it wouldn't be per-interpreter) The extensions dict holds the module *definition* (the struct PyModuleDef), not the module instance. Regards Antoine. 
From eliben at gmail.com Sun Aug 11 18:07:25 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 11 Aug 2013 09:07:25 -0700 Subject: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others) In-Reply-To: <20130811175653.1759db01@fsol> References: <20130811123316.313ab101@fsol> <20130811154042.1e829010@fsol> <20130811175653.1759db01@fsol> Message-ID: On Sun, Aug 11, 2013 at 8:56 AM, Antoine Pitrou wrote: > On Sun, 11 Aug 2013 08:49:56 -0700 > Eli Bendersky wrote: > > > On Sun, Aug 11, 2013 at 6:40 AM, Antoine Pitrou > wrote: > > > > > On Sun, 11 Aug 2013 06:26:55 -0700 > > > Eli Bendersky wrote: > > > > On Sun, Aug 11, 2013 at 3:33 AM, Antoine Pitrou > > > > wrote: > > > > > > > > > > > > > > Hi Eli, > > > > > > > > > > On Sat, 10 Aug 2013 17:12:53 -0700 > > > > > Eli Bendersky wrote: > > > > > > > > > > > > Note how doing some sys.modules acrobatics and re-importing > suddenly > > > > > > changes the internal state of a previously imported module. This > > > happens > > > > > > because: > > > > > > > > > > > > 1. The first import of 'csv' (which then imports `_csv) creates > > > > > > module-specific state on the heap and associates it with the > current > > > > > > sub-interpreter. The list of dialects, amongst other things, is > in > > > that > > > > > > state. > > > > > > 2. The 'del's wipe 'csv' and '_csv' from the cache. > > > > > > 3. The second import of 'csv' also creates/initializes a new > '_csv' > > > > > module > > > > > > because it's not in sys.modules. This *replaces* the > > > per-sub-interpreter > > > > > > cached version of the module's state with the clean state of a > new > > > module > > > > > > > > > > I would say this is pretty much expected. > > > > > > > > I'm struggling to see how it's expected. The two imported csv > modules are > > > > different (i.e. different id() of members), and yet some state is > shared > > > > between them. 
> > > > > > There are two csv modules, but there are not two _csv modules. > > > Extension modules are currently immortal until the end of the > > > interpreter: > > > > > > >>> csv = __import__('csv') > > > >>> wcsv = weakref.ref(csv) > > > >>> w_csv = weakref.ref(sys.modules['_csv']) > > > >>> del sys.modules['csv'] > > > >>> del sys.modules['_csv'] > > > >>> del csv > > > >>> gc.collect() > > > 50 > > > >>> wcsv() > > > >>> w_csv() > > > > > '/home/antoine/cpython/default/build/lib.linux-x86_64-3.4-pydebug/_ > > > csv.cpython-34dm.so'> > > > > > > > > > So, "sharing" a state is pretty much expected, since you are > > > re-initializating an existing module. > > > (but the module does get re-initialized, which is the point of PEP > 3121) > > > > > > > Yes, you're right - this is an oversight on my behalf. Indeed, the > > extensions dict in import.c keeps it alive once loaded, and only ever > gets > > cleaned up in Py_Finalize. > > It's not the extensions dict in import.c, it's modules_by_index in the > interpreter state. > (otherwise it wouldn't be per-interpreter) > > The extensions dict holds the module *definition* (the struct > PyModuleDef), not the module instance. > Thanks for the clarification. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Sun Aug 11 19:43:30 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 11 Aug 2013 10:43:30 -0700 Subject: [Python-Dev] redesigning the extension module initialisation protocol (was: Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)) In-Reply-To: References: <20130811123316.313ab101@fsol> <20130811134826.3b7d48ce@fsol> <20130811143213.45c8523d@fsol> Message-ID: On Sun, Aug 11, 2013 at 6:52 AM, Stefan Behnel wrote: > Nick Coghlan, 11.08.2013 15:19: > > On 11 Aug 2013 09:02, "Stefan Behnel" wrote: > >>> BTW, this already suggests a simple module initialisation interface. 
> The > >>> extension module would expose a function that returns a module type, > and > >>> the loader/importer would then simply instantiate that. Nothing else is > >>> needed. > >> > >> Actually, strike the word "module type" and replace it with "type". Is > >> there really a reason why Python needs a module type at all? I mean, you > >> can stick arbitrary objects in sys.modules, so why not allow arbitrary > >> types to be returned by the module creation function? > > > > That's exactly what I have in mind, but the way extension module imports > > currently work means we can't easily do it just yet. Fortunately, > importlib > > means we now have some hope of fixing that :) > > Well, what do we need? We don't need to care about existing code, as long > as the current scheme is only deprecated and not deleted. That won't happen > before Py4 anyway. New code would simply export a different symbol when > compiling for a CPython that supports it, which points to the function that > returns the type. > > Then, there's already the PyType_Copy() function, which can be used to > create a heap type from a statically defined type. So extension modules can > simply define an (arbitrary) additional type in any way they see fit, copy > it to the heap, and return it. > > Next, we need to define a signature for the type's __init__() method. This > can be done in a future proof way by allowing arbitrary keyword arguments > to be added, i.e. such a type must have a signature like > > def __init__(self, currently, used, pos, args, **kwargs) > > and simply ignore kwargs for now. > > Actually, we may get away with not passing all too many arguments here if > we allow the importer to add stuff to the type's dict in between, > specifically __file__, __path__ and friends, so that they are available > before the type gets instantiated. 
Not sure if this is a good idea, but it > would at least relieve the user from having to copy these things over from > some kind of context or whatever we might want to pass in. > > Alternatively, we could split the instantiation up between tp_new() and > tp_init(), and let the importer set stuff on the instance dict in between > the two. But given that this context won't actually change once the shared > library is loaded, the only reason to prefer modifying the instance instead > of the type would be to avoid requiring a tp_dict for the type. Open for > discussion, I guess. > > Did I forget anything? Sounds simple enough to me so far. > Out of curiosity - can we list actual use cases for this new design? The previous thread, admittedly, deals with an esoteric corner-case that comes up in overly-clever tests. If we plan to seriously consider these changes - and this appears to be worth a PEP - we need a list of actual advantages over the current approach. It's not that a more conceptually pure design is an insufficient reason, IMHO, but it would be interesting to hear about other implications. Eli From stefan_ml at behnel.de Sun Aug 11 19:51:26 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 11 Aug 2013 19:51:26 +0200 Subject: [Python-Dev] redesigning the extension module initialisation protocol In-Reply-To: References: <20130811123316.313ab101@fsol> <20130811134826.3b7d48ce@fsol> <20130811143213.45c8523d@fsol> Message-ID: Eli Bendersky, 11.08.2013 19:43: > Out of curiosity - can we list actual use cases for this new design? The > previous thread, admittedly, deals with an esoteric corner-case that comes > up in overly-clever tests. If we plan to seriously consider these changes - > and this appears to be worth a PEP - we need a list of actual advantages > over the current approach. 
It's not that a more conceptually pure design is > an insufficient reason, IMHO, but it would be interesting to hear about > other implications. http://mail.python.org/pipermail/python-dev/2012-November/122599.html http://bugs.python.org/issue13429 http://bugs.python.org/issue16392 Yes, it definitely needs a PEP. Stefan From storchaka at gmail.com Sun Aug 11 20:23:35 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 11 Aug 2013 21:23:35 +0300 Subject: [Python-Dev] Reaping threads and subprocesses Message-ID: Some tests use the following idiom:

    def test_main():
        try:
            test.support.run_unittest(...)
        finally:
            test.support.reap_children()

Other tests use the following idiom:

    def test_main():
        key = test.support.threading_setup()
        try:
            test.support.run_unittest(...)
        finally:
            test.support.threading_cleanup(*key)

or in other words:

    @test.support.reap_threads
    def test_main():
        test.support.run_unittest(...)

These tests are not discoverable. There are some ways to make them discoverable.

1. Create unittest.TestCase subclasses or mixins with an overloaded run() method.

    class ThreadReaped:
        def run(self, result):
            key = test.support.threading_setup()
            try:
                return super().run(result)
            finally:
                test.support.threading_cleanup(*key)

    class ChildReaped:
        def run(self, result):
            try:
                return super().run(result)
            finally:
                test.support.reap_children()

2. Create unittest.TestCase subclasses or mixins with overloaded setUpClass() and tearDownClass() methods.

    class ThreadReaped:
        @classmethod
        def setUpClass(cls):
            cls._threads = test.support.threading_setup()
        @classmethod
        def tearDownClass(cls):
            test.support.threading_cleanup(*cls._threads)

    class ChildReaped:
        @classmethod
        def tearDownClass(cls):
            test.support.reap_children()

3. Create unittest.TestCase subclasses or mixins with overloaded setUp() and tearDown() methods. 
    class ThreadReaped:
        def setUp(self):
            self._threads = test.support.threading_setup()
        def tearDown(self):
            test.support.threading_cleanup(*self._threads)

    class ChildReaped:
        def tearDown(self):
            test.support.reap_children()

4. Create unittest.TestCase subclasses or mixins using addCleanup() in the constructor.

    class ThreadReaped:
        def __init__(self):
            self.addCleanup(test.support.threading_cleanup,
                            *test.support.threading_setup())

    class ChildReaped:
        def __init__(self):
            self.addCleanup(test.support.reap_children)

Of course, instead of subclassing, we can use decorators which modify the test class. What method is better? Do you have other suggestions? The issue where this problem first occurred: http://bugs.python.org/issue16968. From db3l.net at gmail.com Sun Aug 11 23:10:58 2013 From: db3l.net at gmail.com (David Bolen) Date: Sun, 11 Aug 2013 17:10:58 -0400 Subject: [Python-Dev] Green buildbot failure. References: <20130811120018.3c209b6b@fsol> Message-ID: Richard Oudkerk writes: > On 11/08/2013 11:00am, Antoine Pitrou wrote: >> You've got the answer at the bottom: >> >> "program finished with exit code 0" >> >> So for some reason, the test suite crashed, but with a successful exit >> code. Buildbot thinks it ran fine. > > Was the test terminated because it took too long? Yes, it looks like it. This test (and one on the XP-4 buildbot in the same time frame) was terminated by an external watchdog script that kills python_d processes that have been running for more than 2 hours. I put the script in place (quite a while back) as a workaround for failures that would strand a python process, blocking future tests due to files remaining in use. It's a last-ditch, crude sledge-hammer. Historically, if this code ran, the buildbot had already itself timed out, so the exit code (which I can't control) wasn't very important. 
In this particular case it was a false alarm - the host was heavily loaded during this time frame, which I think prolonged the test time by an unusually large amount. -- David From victor.stinner at gmail.com Sun Aug 11 23:49:38 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Sun, 11 Aug 2013 23:49:38 +0200 Subject: [Python-Dev] Green buildbot failure. In-Reply-To: References: <20130811120018.3c209b6b@fsol> Message-ID: 2013/8/11 David Bolen : >> Was the test terminated because it took too long? > > Yes, it looks like it. > > This test (and one on the XP-4 buildbot in the same time frame) was > terminated by an external watchdog script that kills python_d > processes that have been running for more than 2 hours. I put the > script in place (quite a while back) as a workaround for failures that > would strand a python process, blocking future tests due to files > remaining in use. It's a last ditch, crude, sledge-hammer. test.regrtest uses faulthandler.dump_traceback_later() to stop the test after a timeout if the --timeout command line option is used. http://docs.python.org/dev/library/faulthandler.html#faulthandler.dump_traceback_later Do you pass this option? The timeout is not global but applies to a single function of a test file, so you can use a shorter timeout. It also has the advantage of dumping the traceback of all Python threads before exiting.
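Victor's suggestion is easy to try outside regrtest; a minimal sketch (the timeout value and the sleep standing in for real test work are arbitrary):

```python
import faulthandler
import sys
import time

# Schedule a dump of every thread's traceback -- and a hard exit -- if
# the process is still running when the timeout expires.
faulthandler.dump_traceback_later(timeout=5.0, exit=True, file=sys.stderr)

time.sleep(0.1)  # stands in for the actual test work

# The work finished in time, so cancel the pending dump.
faulthandler.cancel_dump_traceback_later()
print("finished before the watchdog fired")
```

If the work ran longer than the timeout, the tracebacks would be written to stderr and the process would exit immediately, which is the behaviour a buildbot watchdog wants.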
I didn't try this feature recently on Windows, but it is supposed to work :-) Victor From ncoghlan at gmail.com Mon Aug 12 00:41:40 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 11 Aug 2013 18:41:40 -0400 Subject: [Python-Dev] redesigning the extension module initialisation protocol (was: Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)) In-Reply-To: References: <20130811123316.313ab101@fsol> <20130811134826.3b7d48ce@fsol> <20130811143213.45c8523d@fsol> Message-ID: On 11 Aug 2013 09:55, "Stefan Behnel" wrote: > > Nick Coghlan, 11.08.2013 15:19: > > On 11 Aug 2013 09:02, "Stefan Behnel" wrote: > >>> BTW, this already suggests a simple module initialisation interface. The > >>> extension module would expose a function that returns a module type, and > >>> the loader/importer would then simply instantiate that. Nothing else is > >>> needed. > >> > >> Actually, strike the word "module type" and replace it with "type". Is > >> there really a reason why Python needs a module type at all? I mean, you > >> can stick arbitrary objects in sys.modules, so why not allow arbitrary > >> types to be returned by the module creation function? > > > > That's exactly what I have in mind, but the way extension module imports > > currently work means we can't easily do it just yet. Fortunately, importlib > > means we now have some hope of fixing that :) > > Well, what do we need? We don't need to care about existing code, as long > as the current scheme is only deprecated and not deleted. That won't happen > before Py4 anyway. New code would simply export a different symbol when > compiling for a CPython that supports it, which points to the function that > returns the type. > > Then, there's already the PyType_Copy() function, which can be used to > create a heap type from a statically defined type.
So extension modules can > simply define an (arbitrary) additional type in any way they see fit, copy > it to the heap, and return it. > > Next, we need to define a signature for the type's __init__() method. We need the "ModuleSpec" object to pass here, which is what we're currently working on in import-sig. We're not going to define something specifically for C extensions when other modules suffer related problems. Cheers, Nick. This > can be done in a future proof way by allowing arbitrary keyword arguments > to be added, i.e. such a type must have a signature like > > def __init__(self, currently, used, pos, args, **kwargs) > > and simply ignore kwargs for now. > > Actually, we may get away with not passing all too many arguments here if > we allow the importer to add stuff to the type's dict in between, > specifically __file__, __path__ and friends, so that they are available > before the type gets instantiated. Not sure if this is a good idea, but it > would at least relieve the user from having to copy these things over from > some kind of context or whatever we might want to pass in. > > Alternatively, we could split the instantiation up between tp_new() and > tp_init(), and let the importer set stuff on the instance dict in between > the two. But given that this context won't actually change once the shared > library is loaded, the only reason to prefer modifying the instance instead > of the type would be to avoid requiring a tp_dict for the type. Open for > discussion, I guess. > > Did I forget anything? Sounds simple enough to me so far. > > Stefan > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From db3l.net at gmail.com Mon Aug 12 00:49:45 2013 From: db3l.net at gmail.com (David Bolen) Date: Sun, 11 Aug 2013 18:49:45 -0400 Subject: [Python-Dev] Green buildbot failure. References: <20130811120018.3c209b6b@fsol> Message-ID: Victor Stinner writes: > test.regrtest uses faulthandler.dump_traceback_later() to stop the > test after a timeout if --timeout command line option is used. The slave doesn't actually control the test parameters, which come from build/Tools/buildbot/test.bat (which runs build/PCBuild/rt.bat) plus anything sent from the master. But no, it doesn't look like that flow is currently using --timeout, so the main timeout in place is that from the buildbot slave processing (currently 3900s and based on output activity by the process under test). Windows buildbots also have an additional "kill" path where the build scripts build and execute a separate kill_python_d executable (in PCBuild) to kill off any python_d process. It does have some sequencing issues (it runs during the build stage rather than clean) but no matter where it is used, being part of the build sequence risks it being skipped if the master/slave connection breaks mid-test. For some additional background, see email threads: http://mail.python.org/pipermail/python-dev/2010-November/105585.html http://mail.python.org/pipermail/python-dev/2010-December/106510.html http://mail.python.org/pipermail/python-dev/2011-January/107776.html Anyway, the termination in this particular case is completely separate from buildbot processing. It's a small script combining pslist/pskill from sysinternals (as pskill proved always able to kill the processes) and just looking for old python_d processes that just runs constantly in the background. My Windows buildbots have three additional layers of termination handling (beyond the standard buildbot timeout and kill_python in the test itself): 1. Modification to buildbot slave code to prevent Windows process and file dialogs. 2. 
Auto-it script in the background to acknowledge C RTL dialogs that the prior step doesn't block. (There have been past discussions about having Python itself disable RTL dialogs in test builds) 3. The external watchdog script as a fail-safe. The first two cases will definitely be recognized as test failures, since while the dialogs are suppressed/acknowledged, the triggering code will receive a failure result. The purpose of the watchdog script was to handle cases encountered for which the normal termination processing (buildbot or python itself) simply didn't seem to work. The buildbot slave/master thought the test ended or aborted, so started new tests, but a process remained stuck in memory from the prior test. The frequency of occurrence varied over time, but during some periods was a major pain in the neck adversely affecting buildbot stability. Not sure if faulthandler's approach to process termination would have more luck, or if it would even run if, for example, the process was stuck in the RTL or at the Win32 layer. I'd certainly be willing to retire the watchdog scripts (as long as I don't just end up firefighting stuck processes again), but I suspect the first challenge would be to figure out how to simulate an appropriately stuck process that would have required the watchdog script previously, given that it was never really obvious why they were hung. 
-- David From victor.stinner at gmail.com Mon Aug 12 02:11:17 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 12 Aug 2013 02:11:17 +0200 Subject: [Python-Dev] (New) PEP 446: Make newly created file descriptors non-inheritable In-Reply-To: References: Message-ID: Hi, I fixed various bugs in the implementation of the (new) PEP 446: http://hg.python.org/features/pep-446 At revision da685bd67524, the full test suite passes on: - Fedora 18 (Linux 3.9), x86_64 - FreeBSD 9.1, x86_64 - Windows 7 SP1, x86_64 - OpenIndiana (close to Solaris 11), x86_64 Some tests are failing, but these failures are unrelated to PEP 446 (the same tests are failing in the original Python): - Windows: test_signal, failure related to faulthandler (issue already fixed in default) - OpenIndiana: test_locale, test_uuid Victor From victor.stinner at gmail.com Mon Aug 12 03:12:17 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 12 Aug 2013 03:12:17 +0200 Subject: [Python-Dev] (New) PEP 446: Make newly created file descriptors non-inheritable In-Reply-To: References: Message-ID: 2013/8/12 Victor Stinner : > I fixed various bugs in the implementation of the (new) PEP 446: > http://hg.python.org/features/pep-446 > > At revision da685bd67524, the full test suite passes on: (...) I also checked the usage of atomic flags. There was a minor bug on Linux, it is now fixed (removed a useless call to fcntl to check if SOCK_CLOEXEC works). open(): On Linux, FreeBSD and Solaris 11, the O_CLOEXEC flag is used. fcntl(F_GETFD) is only called once for all file descriptors, to check if O_CLOEXEC works. On Windows, O_NOINHERIT is used. socket.socket(): On Linux, the SOCK_CLOEXEC flag is used, no extra syscall is required. os.pipe(): On Linux, pipe2() is used with O_CLOEXEC. On other platforms, os.set_inheritable() must be called to make the new file descriptors non-inheritable. On Windows, the atomic flag WSA_FLAG_NO_HANDLE_INHERIT is not used to create a socket.
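The behaviour the PEP describes is observable from Python code on a version that implements it (PEP 446 landed in Python 3.4); a small sketch using os.get_inheritable()/os.set_inheritable():

```python
import os

# Under PEP 446 (Python 3.4+), newly created descriptors are
# non-inheritable by default.
r, w = os.pipe()
print(os.get_inheritable(r))  # False

# A child process that should inherit the descriptor needs an explicit
# opt-in before the fork/exec.
os.set_inheritable(r, True)
print(os.get_inheritable(r))  # True

os.close(r)
os.close(w)
```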
I don't know Windows well enough to make such a change. My OpenIndiana VM seems to be older than Solaris 11: the O_CLOEXEC flag is missing. I regenerated the patch in the issue: http://bugs.python.org/issue18571 Victor From stefan_ml at behnel.de Mon Aug 12 06:51:47 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 12 Aug 2013 06:51:47 +0200 Subject: [Python-Dev] redesigning the extension module initialisation protocol In-Reply-To: References: <20130811123316.313ab101@fsol> <20130811134826.3b7d48ce@fsol> <20130811143213.45c8523d@fsol> Message-ID: Nick Coghlan, 12.08.2013 00:41: > On 11 Aug 2013 09:55, "Stefan Behnel" wrote: >>>>> this already suggests a simple module initialisation interface. >>>>> The >>>>> extension module would expose a function that returns a module type, >>>>> and >>>>> the loader/importer would then simply instantiate that. Nothing else >>>>> is needed. >>>> Actually, strike the word "module type" and replace it with "type". >> [...] >> Next, we need to define a signature for the type's __init__() method. > > We need the "ModuleSpec" object to pass here, which is what we're currently > working on in import-sig. Ok but that's just the very final step. All the rest is C-API specific. And for clarification: you want to let the importer create the ModuleSpec object and then pass it into the module's __init__ method? I guess it could also be passed into the type creation function then, right? Since it wouldn't harm to do that, I think it's a good idea to provide as much information to the extension module as possible, as early as we can, and that's the first time we talk to the shared library. I've started writing up a pre-PEP that describes this protocol. I think it makes sense to keep it separate from the ModuleSpec PEP as the latter can easily be accepted without changing anything at the C-API level, but it shouldn't happen the other way round.
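For context, the ModuleSpec object under discussion later landed as PEP 451; in Python 3.4+ the resulting API can be exercised from pure Python (the module name and origin below are made up):

```python
import importlib.machinery
import importlib.util

# A spec bundles the import metadata (name, loader, origin); the import
# system then builds the module object from it.
spec = importlib.machinery.ModuleSpec("demo_mod", loader=None,
                                      origin="built by hand")
module = importlib.util.module_from_spec(spec)
print(module.__name__)         # demo_mod
print(module.__spec__.origin)  # built by hand
```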
Stefan From arnaud.fontaine at nexedi.com Mon Aug 12 09:39:36 2013 From: arnaud.fontaine at nexedi.com (Arnaud Fontaine) Date: Mon, 12 Aug 2013 16:39:36 +0900 Subject: [Python-Dev] Dealing with import lock deadlock in Import Hooks Message-ID: <87a9knxsqv.fsf@duckcorp.org> [I initially posted this email to python-list but didn't get any reply, probably because it is too closely related to python core, so I'm posting it again here, hope that's ok...] Hello, I'm currently working on implementing Import Hooks (PEP 302) with Python 2.7 to be able to import modules whose code is in ZODB. However, I have stumbled upon a widely known issue about import deadlock[0][1] (note that this issue is not directly related to ZODB, but is a more general question about dealing with import lock deadlock for Import Hooks), basically: Thread 1 is trying to import a module 'foo.bar' (where 'foo' is a package containing dynamic modules) handled by Import Hooks I implemented, so the import lock is acquired before even running the hooks (Python/import.c:PyImport_ImportModuleLevel()). Then, these import hooks try to load objects from ZODB and a request is sent and handled by another thread (Thread 2) which itself tries to import another module. Of course, this causes a deadlock because the first thread still holds the import lock. I have thought about the following solutions: * Backport the patch applied in python 3.3 from issue 9260[0]. This would be the best option because it would mean that even when trying to import any module from package 'foo', other modules and packages can be imported, which would solve my issue. However, I'm not sure it could be released into python 2.7? * Within the hooks, protect the Import Hooks with a separate lock for the loader method. This would prevent any other thread from importing any modules from the 'foo' package but would still allow the finder method to be called (ignoring module fullnames not starting with 'foo') along with other finder methods, so that other ZODB modules can be imported.
Then, in the loader method, until the module is actually inserted into sys.modules and the other load_module() PEP 302 responsibilities are taken care of (such as executing the code), release the import lock so that Thread 2 can process requests and send objects back to Thread 1. About the finder method, I think that the separate lock is enough and releasing the import lock until the end of the method should be enough. However, even after trying to understand import.c, I'm not sure this is enough and that releasing the import lock would not have nasty side-effects; any thoughts about that? * Fix the ZODB code to avoid the import, but to me this seems like a dirty hack because it could happen again and I would prefer to fix this issue once and for all. Any thoughts or suggestions welcome, thanks! Regards, Arnaud Fontaine From victor.stinner at gmail.com Mon Aug 12 11:01:50 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 12 Aug 2013 11:01:50 +0200 Subject: [Python-Dev] Dealing with import lock deadlock in Import Hooks In-Reply-To: <87a9knxsqv.fsf@duckcorp.org> References: <87a9knxsqv.fsf@duckcorp.org> Message-ID: >I'm currently working on implementing Import Hooks (PEP302) with Python > 2.7 to be able to import modules whose code is in ZODB. However, I have > stumbled upon a widely known issue about import deadlock[0][1] (...) In Python 3.3, the import machinery has been rewritten (importlib is used by default) and the import lock is now per module, no longer global. Backporting such a huge change is difficult and risky. Upgrading to Python 3.3 is more future-proof and doesn't require hacking Python 2.7. Victor -------------- next part -------------- An HTML attachment was scrubbed...
URL: From arigo at tunes.org Mon Aug 12 11:09:45 2013 From: arigo at tunes.org (Armin Rigo) Date: Mon, 12 Aug 2013 11:09:45 +0200 Subject: [Python-Dev] Dealing with import lock deadlock in Import Hooks In-Reply-To: <87a9knxsqv.fsf@duckcorp.org> References: <87a9knxsqv.fsf@duckcorp.org> Message-ID: Hi Arnaud, On Mon, Aug 12, 2013 at 9:39 AM, Arnaud Fontaine wrote: > Thread 1 is trying to import a module 'foo.bar' (where 'foo' is a > package containing dynamic modules) handled by Import Hooks I > implemented, so import lock is acquired before even running the hooks > (Python/import.c:PyImport_ImportModuleLevel()). Then, these import > hooks try to load objects from ZODB and a request is sent and handled > by another thread (Thread 2) which itself tries to import another > module. A quick hack might be to call imp.release_lock() and imp.acquire_lock() explicitly, from your import hook code, around calls to ZODB. A bientôt, Armin. From arnaud.fontaine at nexedi.com Mon Aug 12 11:12:34 2013 From: arnaud.fontaine at nexedi.com (Arnaud Fontaine) Date: Mon, 12 Aug 2013 18:12:34 +0900 Subject: [Python-Dev] Dealing with import lock deadlock in Import Hooks In-Reply-To: (Victor Stinner's message of "Mon, 12 Aug 2013 11:01:50 +0200") References: <87a9knxsqv.fsf@duckcorp.org> Message-ID: <87wqnrw9vh.fsf@duckcorp.org> Victor Stinner writes: >>I'm currently working on implementing Import Hooks (PEP302) with Python >> 2.7 to be able to import modules whose code is in ZODB. However, I have >> stumbled upon a widely known issue about import deadlock[0][1] (...) > > In Python 3.3, the import machinery has been rewritten (importlib is used > by default) and the import lock is now per module, no more global. Yes, I saw the bug report and its patch implementing the import lock per module (mentioned in my initial email) and watched the presentation by Brett Cannon (BTW, I could not find the diagram explained during the presentation; does anyone know if it's available somewhere?).
> Backporting such a huge change is difficult and risky. > > Upgrading to Python 3.3 is more future-proof and doesn't require hacking > Python 2.7. I wish I could use Python 3.3 but unfortunately, Zope 2 does not support it. What about the other solution I suggested though? Regards, -- Arnaud Fontaine From solipsis at pitrou.net Mon Aug 12 14:10:30 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 12 Aug 2013 14:10:30 +0200 Subject: [Python-Dev] NULL allowed in PyErr_SetString and friends? Message-ID: <20130812141030.569ce249@pitrou.net> Hello, It seems NULL is allowed as the first argument of PyErr_Format, PyErr_SetString and PyErr_SetObject. Moreover, it means "clear the error indicator". However, this is not mentioned in the docs. I was wondering if we should officialize this behaviour or change it. Regards Antoine. From eliben at gmail.com Mon Aug 12 15:01:57 2013 From: eliben at gmail.com (Eli Bendersky) Date: Mon, 12 Aug 2013 06:01:57 -0700 Subject: [Python-Dev] NULL allowed in PyErr_SetString and friends? In-Reply-To: <20130812141030.569ce249@pitrou.net> References: <20130812141030.569ce249@pitrou.net> Message-ID: On Mon, Aug 12, 2013 at 5:10 AM, Antoine Pitrou wrote: > > Hello, > > It seems NULL is allowed as the first argument of PyErr_Format, > PyErr_SetString and PyErr_SetObject. Moreover, it means "clear the > error indicator". However, this is not mentioned in the docs. I was > wondering if we should officialize this behaviour or change it. > Since the same capability is available much more clearly through PyErr_Clear (and also through PyErr_Restore(NULL, NULL, NULL)), IMHO we should at least: 1. Document that NULL is not allowed in PyErr_Set{Object|String} 2. Switch all actual uses of that idiom in the stdlib to PyErr_Clear If we don't fear external code breakage that relies on this undocumented behavior, we can also add explicit handling of NULL in PyErr_Set{Object|String} (maybe even asserting).
Otherwise, we can just keep the behavior as is for now, though make it more correct: to do the reset it would call PyErr_Restore(NULL, value, tb) even though PyErr_Restore documents that all args should be NULL to actually clear the indicator. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Mon Aug 12 16:23:12 2013 From: brett at python.org (Brett Cannon) Date: Mon, 12 Aug 2013 10:23:12 -0400 Subject: [Python-Dev] Dealing with import lock deadlock in Import Hooks In-Reply-To: <87wqnrw9vh.fsf@duckcorp.org> References: <87a9knxsqv.fsf@duckcorp.org> <87wqnrw9vh.fsf@duckcorp.org> Message-ID: On Mon, Aug 12, 2013 at 5:12 AM, Arnaud Fontaine wrote: > Victor Stinner writes: > > >>I'm currently working on implementing Import Hooks (PEP302) with Python > >> 2.7 to be able to import modules whose code is in ZODB. However, I have > >> stumbled upon a widely known issue about import deadlock[0][1] (...) > > > > In Python 3.3, the import machinery has been rewritten (importlib is used > > by default) and the import lock is now per module, no more global. > > Yes, I saw the bug report and its patch implementing the import lock per > module (mentioned in my initial email) and watched the presentation by > Brett Cannon (BTW, I could not find the diagram explained during the > presentation, anyone knows if it's available somewhere?). > http://prezi.com/mqptpza9xbic/?utm_campaign=share&utm_medium=copy -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian at python.org Mon Aug 12 19:18:17 2013 From: christian at python.org (Christian Heimes) Date: Mon, 12 Aug 2013 19:18:17 +0200 Subject: [Python-Dev] SSL issues in Python stdlib and 3rd party code Message-ID: <520918D9.4040303@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 Hello, last week Ryan Sleevi of the Google Chrome Security Team has informed us about two issues in Python's SSL module.
I already knew about the cause of the first bug and suspected that our SSL module suffers from the second bug but I was unable to prove it. Both issues are security issues but their impact is limited if you trust only trustworthy root certification authorities. Any decent root CA should not sign a malicious cert with NULL bytes in a subjectAltName dNSName field or with wildcards like *.*.com. By the way, if you are using the cacert.pem from curl, please update your bundle ASAP. I have found a bug in its Mozilla certdata parser, too. bug #1: ssl.match_hostname() wildcard matching - ---------------------------------------------- ssl.match_hostname() doesn't implement RFC 6125 wildcard matching rules. Affected versions: - - Python 3.2 (< 3.2.5) - - Python 3.3 (< 3.3.3) - - Python 3.4a1 - - requests < 1.2.3 https://pypi.python.org/pypi/requests - - backports.ssl_match_hostname (<3.2a3) https://pypi.python.org/pypi/backports.ssl_match_hostname/ - - urllib3 < 1.6 https://github.com/shazow/urllib3 Bug reports: http://bugs.python.org/issue17997 https://github.com/kennethreitz/requests/issues/1528 https://bitbucket.org/brandon/backports.ssl_match_hostname/issue/2/match_hostname-doesnt-implement-rfc-6125 Patch: http://bugs.python.org/issue17997 has a preliminary patch. The handling of IDN A-labels is still a bit controversial, though. bug #2: failure to handle NULL bytes in subjectAltName - ----------------------------------------------------- It's basically the same issue as CVE-2013-4073. Python uses GENERAL_NAME_print() to turn a GENERAL_NAME entry into a C string. But GENERAL_NAME_print() doesn't handle embedded NULL bytes in ASN1_STRINGs correctly.
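The truncation Christian describes can be illustrated from pure Python (the hostnames are made up; this models the failure mode, it is not Python's actual certificate-handling code):

```python
# A CA reviews and signs the full byte string below, but any code that
# treats it as a NUL-terminated C string only sees the prefix.
san = b"www.good.example\x00.attacker.example"

# What a naive NUL-terminated view (such as a printf-style C API)
# effectively reports:
truncated = san.split(b"\x00", 1)[0]
print(truncated)  # b'www.good.example'

# A length-aware comparison of the whole value does not match the victim
# host, which is why silently truncating it is exploitable.
print(san == b"www.good.example")  # False
```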
You can read more about the issue at http://www.ruby-lang.org/en/news/2013/06/27/hostname-check-bypassing-vulnerability-in-openssl-client-cve-2013-4073/ Affected versions: - - Python 2.6 (< 2.6.8) - - Python 2.7 (< 2.7.5) - - Python 3.2 (< 3.2.5) - - Python 3.3 (< 3.3.3) - - Python 3.4a1 - - PyOpenSSL < 0.13 https://pypi.python.org/pypi/pyOpenSSL - - eGenix.com pyOpenSSL Distribution with PyOpenSSL < 0.13 http://www.egenix.com/products/python/pyOpenSSL/ - - M2Crypto < 0.21.1 https://pypi.python.org/pypi/M2Crypto Bug report: http://bugs.python.org/issue18709 Patches: http://bugs.python.org/issue18709 has patches for 2.7, 3.3 and default https://code.launchpad.net/~heimes/pyopenssl/pyopenssl/+merge/179673 Jean-Paul Calderone is going to release 0.13.1 soonish. It's going to contain just my fix for the issue. Marc-Andre Lemburg will build a new version of the eGenix.com pyOpenSSL Distribution shortly after. I'm not going to work on a patch for M2Crypto as I don't understand SWIG. I have contacted Heikki Toivonen for M2Crypto but haven't heard back from him yet. related issue: Mozilla's certdata.txt and CKT_NSS_MUST_VERIFY_TRUST - ------------------------------------------------------------------- Recently I found bugs in curl's mk-ca-bundle.pl script, its cacert.pem and in the CA bundle of the eGenix.com pyOpenSSL Distribution. Both failed to handle a new option in Mozilla's certdata.txt database correctly. As a consequence the root CA bundles contained additional, untrustworthy root certificates. I'm not sure about the severity of the issue. Curl has already fixed its script a week ago. Marc-Andre Lemburg is going to release a new distribution very soon.
https://github.com/bagder/curl/commit/51f0b798fa http://curl.haxx.se/docs/caextract.html Background information: https://www.imperialviolet.org/2012/01/30/mozillaroots.html http://lists.debian.org/debian-release/2012/11/msg00411.html http://p11-glue.freedesktop.org/doc/storing-trust-policy/storing-trust-existing.html I'd like to thank Ryan Sleevi (Google), Chris Palmer (Google), Marc-Andre Lemburg (eGenix.com, Python core dev), Jean-Paul Calderone (PyOpenSSL), Antoine Pitrou (Python core dev), Daniel Stenberg (curl), Günter Knauf (curl) and everybody else who was involved in reporting and fixing these issues. Regards, Christian -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iQIcBAEBCgAGBQJSCRjUAAoJEMeIxMHUVQ1F81UP/R8RqAFR3cODI4SgFzmvOSXa u5OHfdQcgVfQbs5+oeZgqBu5aQ2W4SHKd/wSqCCrB1CbsZDZu9Y5LRVo/MgKMuNs /c49FAyNKGUPc04RtBPfjSQ8GWlk4ziayW9PAuziBGD+ExMlSzXk1CwRwPJVPMkA 8xV48QThT4U5L1WRS4AH3XdgIpPWwJswNCuw9KjlIP/b1LZIrVvUg9rb4azVf0qu Am9IlCwIb2sMNQU1s+sADht6B3Ka4tC8ej8VoWHnTEh8T8RJgcG3j3P2GFPx0YzD 35ISU6k/Dg/dEIJjawI7Uk+dXqxhMfCWFz5Yoy7TaUtYTFAFISqys88+I/H3PxNV mewUNdRFO9ej2vyikI8s1FwGVaORYEIVtkOKzLRa7mc5ZEdh6MjZ2l3bH2GszO+J mRzDwQjipnp8NwOjn9eipdKNaywFoGnDmoydxf3w9Qq6MPsdSiqtBHPWRP5K+091 rM+49v0MAenvxtkb8IJsBbSzVMs66uwfRYl1KYyvXNpGb4TSH/GlrUqRqaX97NW2 x1NprclxVWri5/kWnV4YvqJ6OmDcigCVI780+rQFSoqMk4JKDUgUsl451KvB4ATL 5OfZ9VaVhXu6Ydrjb3bRZHuKGeRsSH7JQCURFzWiriQEGwQAg/9D19JLli6YpCBZ a5dEBzj4FAVglJEYlsZ2 =Jq/C -----END PGP SIGNATURE----- From solipsis at pitrou.net Mon Aug 12 20:06:47 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 12 Aug 2013 20:06:47 +0200 Subject: [Python-Dev] SSL issues in Python stdlib and 3rd party code References: <520918D9.4040303@python.org> Message-ID: <20130812200647.38b5c504@fsol> Hi, On Mon, 12 Aug 2013 19:18:17 +0200 Christian Heimes wrote: > related issue: Mozilla's certdata.txt and CKT_NSS_MUST_VERIFY_TRUST > - 
------------------------------------------------------------------- > > Recently I found bugs in curl's mk-ca-bundle.pl script, its cacert.pem > and in the CA bundle of eGenix.com pyOpenSSL Distribution. Both failed > to handle a new option in Mozilla's certdata.txt database correctly. > As a consequence the root CA bundles contained additionally and > untrustworthy root certificates. I'm not sure about the severity of > the issue. Which goes to show that not bundling our own set of CA certificates is the safest route. Regards Antoine. From brett at python.org Mon Aug 12 21:22:06 2013 From: brett at python.org (Brett Cannon) Date: Mon, 12 Aug 2013 15:22:06 -0400 Subject: [Python-Dev] Deprecating the formatter module Message-ID: At the PyCon CA sprint someone discovered the formatter module had somewhat low code coverage. We discovered this is because it's tested by test_sundry, i.e. it's tested by importing it and that's it. We then realized that it isn't really used by anyone (pydoc uses it but it should have been using textwrap). Looking at the history of the module it has just been a magnet for cleanup revisions and not actual usage or development since Guido added it back in 1995. I have created http://bugs.python.org/issue18716 to deprecate the formatter module for removal in Python 3.6 unless someone convinces me otherwise that deprecation and removal is the wrong move. -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Mon Aug 12 22:11:30 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 12 Aug 2013 21:11:30 +0100 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: Message-ID: On 12 August 2013 20:22, Brett Cannon wrote: > At the PyCon CA sprint someone discovered the formatter module had > somewhat low code coverage. We discovered this is because it's tested by > test_sundry, i.e. it's tested by importing it and that's it. 
> > We then realized that it isn't really used by anyone (pydoc uses it but it > should have been using textwrap). Looking at the history of the module it > has just been a magnet for cleanup revisions and not actual usage or > development since Guido added it back in 1995. > > I have created http://bugs.python.org/issue18716 to deprecate the > formatter module for removal in Python 3.6 unless someone convinces me > otherwise that deprecation and removal is the wrong move. > I can see no reason to object. But having looked at the module for the first time on the basis of this email, I have to say that if I'd stumbled across it by chance, my reaction would have been that it was another one of Python's "hidden gems" that I'd never been aware of. I would then, of course, have filed it for future reference "should I need it", and never actually used it. So I'm OK for it to go, but a little sad nevertheless :-) Paul -------------- next part -------------- An HTML attachment was scrubbed... URL: From larry at hastings.org Mon Aug 12 23:06:22 2013 From: larry at hastings.org (Larry Hastings) Date: Mon, 12 Aug 2013 17:06:22 -0400 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: Message-ID: <52094E4E.8090909@hastings.org> On 08/12/2013 04:11 PM, Paul Moore wrote: > [...] if I'd stumbled across it by chance, my reaction would have been > that it was another one of Python's "hidden gems" that I'd never been > aware of. Hidden "gem"? No. Hidden "paste diamond", maybe. YAGNI, //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Mon Aug 12 23:22:01 2013 From: eliben at gmail.com (Eli Bendersky) Date: Mon, 12 Aug 2013 14:22:01 -0700 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: Message-ID: On Mon, Aug 12, 2013 at 12:22 PM, Brett Cannon wrote: > At the PyCon CA sprint someone discovered the formatter module had > somewhat low code coverage. 
We discovered this is because it's tested by > test_sundry, i.e. it's tested by importing it and that's it. > > We then realized that it isn't really used by anyone (pydoc uses it but it > should have been using textwrap). Looking at the history of the module it > has just been a magnet for cleanup revisions and not actual usage or > development since Guido added it back in 1995. > > I have created http://bugs.python.org/issue18716 to deprecate the > formatter module for removal in Python 3.6 unless someone convinces me > otherwise that deprecation and removal is the wrong move. > I wish we had a way to collect real-world usage on such things. I tried a couple of code search engines, but this one is difficult to unravel because many Python packages have their own formatter module (for example Django, pygments) that probably does something different. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From ezio.melotti at gmail.com Mon Aug 12 23:42:25 2013 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Tue, 13 Aug 2013 00:42:25 +0300 Subject: [Python-Dev] [Python-checkins] cpython (merge 2.7 -> 2.7): Clean merge In-Reply-To: <3cDSP02w6Qz7LjM@mail.python.org> References: <3cDSP02w6Qz7LjM@mail.python.org> Message-ID: Hi, On Mon, Aug 12, 2013 at 10:51 PM, david.wolever wrote: > http://hg.python.org/cpython/rev/0f4d971b0cee > changeset: 85138:0f4d971b0cee > branch: 2.7 > parent: 85137:102b3e257dca > parent: 83899:ef037ad304c1 > user: David Wolever > date: Thu May 23 17:51:58 2013 -0400 > summary: > Clean merge > > files: > .hgtags | 1 + > Doc/c-api/exceptions.rst | 38 +- > Doc/c-api/intro.rst | 4 +- > Doc/faq/design.rst | 4 +- > Doc/faq/programming.rst | 86 + > Doc/glossary.rst | 8 + > Doc/howto/advocacy.rst | 355 ------- > Doc/howto/index.rst | 1 - > Doc/howto/sockets.rst | 8 +- > Doc/howto/urllib2.rst | 12 +- > Doc/library/codecs.rst | 172 ++- > Doc/library/collections.rst | 4 +- > Doc/library/compileall.rst | 2 +- > 
Doc/library/ctypes.rst | 2 +- > Doc/library/io.rst | 3 + > Doc/library/itertools.rst | 4 +- > Doc/library/numbers.rst | 8 +- > Doc/library/operator.rst | 47 +- > Doc/library/resource.rst | 21 +- > Doc/library/socket.rst | 16 +- > Doc/library/ssl.rst | 16 +- > Doc/library/stdtypes.rst | 28 +- > Doc/library/string.rst | 5 +- > Doc/library/unittest.rst | 2 + > Doc/library/urllib.rst | 7 + > Doc/library/urllib2.rst | 15 +- > Doc/reference/datamodel.rst | 9 +- > Doc/reference/expressions.rst | 15 +- > Doc/reference/simple_stmts.rst | 3 + > Doc/tutorial/inputoutput.rst | 23 +- > Doc/tutorial/modules.rst | 7 +- > Doc/using/mac.rst | 14 +- > Include/object.h | 16 +- > Include/patchlevel.h | 4 +- > Lib/_weakrefset.py | 6 + > Lib/collections.py | 2 - > Lib/ctypes/test/__init__.py | 2 +- > Lib/ctypes/test/test_wintypes.py | 43 + > Lib/ctypes/util.py | 2 +- > Lib/distutils/__init__.py | 2 +- > Lib/filecmp.py | 2 +- > Lib/gzip.py | 69 +- > Lib/idlelib/Bindings.py | 4 + > Lib/idlelib/EditorWindow.py | 31 +- > Lib/idlelib/PyShell.py | 1 - > Lib/idlelib/help.txt | 3 +- > Lib/idlelib/idlever.py | 2 +- > Lib/idlelib/run.py | 5 + > Lib/logging/handlers.py | 36 +- > Lib/mimetypes.py | 2 + > Lib/multiprocessing/pool.py | 2 + > Lib/multiprocessing/synchronize.py | 2 +- > Lib/multiprocessing/util.py | 5 +- > Lib/pickle.py | 2 +- > Lib/plistlib.py | 4 +- > Lib/pydoc_data/topics.py | 18 +- > Lib/sre_parse.py | 6 +- > Lib/ssl.py | 26 +- > Lib/tarfile.py | 12 +- > Lib/test/pickletester.py | 2 + > Lib/test/test_base64.py | 26 + > Lib/test/test_bz2.py | 31 +- > Lib/test/test_collections.py | 2 +- > Lib/test/test_dictviews.py | 5 + > Lib/test/test_gdb.py | 46 +- > Lib/test/test_gzip.py | 17 - > Lib/test/test_io.py | 4 +- > Lib/test/test_mimetypes.py | 2 + > Lib/test/test_multiprocessing.py | 32 +- > Lib/test/test_plistlib.py | 12 + > Lib/test/test_pydoc.py | 57 +- > Lib/test/test_re.py | 11 + > Lib/test/test_sax.py | 20 + > Lib/test/test_support.py | 9 + > Lib/test/test_tarfile.py | 8 + > 
Lib/test/test_tcl.py | 18 +- > Lib/test/test_weakset.py | 6 + > Lib/test/test_winreg.py | 12 +- > Lib/test/test_zipfile.py | 10 +- > Lib/test/testbz2_bigmem.bz2 | Bin > Lib/threading.py | 42 +- > Lib/xml/sax/saxutils.py | 8 +- > Misc/ACKS | 9 + > Misc/NEWS | 457 ++++++--- > Misc/RPM/python-2.7.spec | 2 +- > Modules/_ctypes/libffi/src/dlmalloc.c | 5 + > Modules/_multiprocessing/multiprocessing.c | 2 +- > Modules/_sqlite/cursor.c | 2 +- > Modules/_sqlite/util.c | 8 +- > Modules/_sqlite/util.h | 4 +- > Modules/_testcapimodule.c | 2 +- > Modules/cPickle.c | 10 +- > Modules/dbmmodule.c | 8 +- > Modules/operator.c | 14 +- > Modules/readline.c | 27 +- > Modules/selectmodule.c | 35 +- > Modules/signalmodule.c | 14 +- > Modules/sre.h | 4 +- > Objects/dictobject.c | 4 + > PCbuild/rt.bat | 4 +- > README | 2 +- > Tools/scripts/gprof2html.py | 2 +- > configure | 2 +- > configure.ac | 2 +- > setup.py | 8 +- > 105 files changed, 1301 insertions(+), 955 deletions(-) > > To avoid these big merges you can do: # check the two heads that you are going to merge and their csids hg heads . # update to the other head (the one you pulled, not the one you committed) hg up csid-of-the-other-head # merge your changes on with the ones you pulled hg merge This will merge the changes you just committed with the ones you pulled, and result in a shorter diff that is easier to read/review/merge. Otherwise pulling and updating before committing will avoid the problem entirely (unless you end up in a push-race). Best Regards, Ezio Melotti From solipsis at pitrou.net Mon Aug 12 23:49:23 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 12 Aug 2013 23:49:23 +0200 Subject: [Python-Dev] Deprecating the formatter module References: Message-ID: <20130812234923.4270bd93@fsol> On Mon, 12 Aug 2013 14:22:01 -0700 Eli Bendersky wrote: > On Mon, Aug 12, 2013 at 12:22 PM, Brett Cannon wrote: > > > At the PyCon CA sprint someone discovered the formatter module had > > somewhat low code coverage. 
We discovered this is because it's tested by > > test_sundry, i.e. it's tested by importing it and that's it. > > > > We then realized that it isn't really used by anyone (pydoc uses it but it > > should have been using textwrap). Looking at the history of the module it > > has just been a magnet for cleanup revisions and not actual usage or > > development since Guido added it back in 1995. > > > > I have created http://bugs.python.org/issue18716 to deprecate the > > formatter module for removal in Python 3.6 unless someone convinces me > > otherwise that deprecation and removal is the wrong move. > > > > I wish we had a way to collect real-world usage on such things. I tried a > couple of code search engines, but this one is difficult to unravel because > many Python packages have their own formatter module (for example Django, > pygments) that probably does something different. "Ohloh code search" shows a couple matches for AbstractFormatter in Python projects: http://code.ohloh.net/search?s=%22AbstractFormatter%22&pp=0&fl=Python&mp=1&ml=1&me=1&md=1&ff=1&filterChecked=true From solipsis at pitrou.net Mon Aug 12 23:50:30 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 12 Aug 2013 23:50:30 +0200 Subject: [Python-Dev] cpython (merge 2.7 -> 2.7): Clean merge References: <3cDSP02w6Qz7LjM@mail.python.org> Message-ID: <20130812235030.3ed3ec73@fsol> On Tue, 13 Aug 2013 00:42:25 +0300 Ezio Melotti wrote: > > To avoid these big merges you can do: > # check the two heads that you are going to merge and their csids > hg heads . > # update to the other head (the one you pulled, not the one you committed) > hg up csid-of-the-other-head > # merge your changes on with the ones you pulled > hg merge > > This will merge the changes you just committed with the ones you > pulled, and result in a shorter diff that is easier to > read/review/merge. > Otherwise pulling and updating before committing will avoid the > problem entirely (unless you end up in a push-race). 
Or, if you are working on a single branch and no-one is watching you, you can do "hg pull --rebase". Regards Antoine. From rymg19 at gmail.com Mon Aug 12 23:01:36 2013 From: rymg19 at gmail.com (Ryan) Date: Mon, 12 Aug 2013 16:01:36 -0500 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: Message-ID: I never realized it existed till now. Considering the usually erratic projects I like do, I can see that coming in use in several in which I had to do odd workarounds. Keep it, but put better documentation. It's needed. Brett Cannon wrote: >At the PyCon CA sprint someone discovered the formatter module had >somewhat >low code coverage. We discovered this is because it's tested by >test_sundry, i.e. it's tested by importing it and that's it. > >We then realized that it isn't really used by anyone (pydoc uses it but >it >should have been using textwrap). Looking at the history of the module >it >has just been a magnet for cleanup revisions and not actual usage or >development since Guido added it back in 1995. > >I have created http://bugs.python.org/issue18716 to deprecate the >formatter >module for removal in Python 3.6 unless someone convinces me otherwise >that >deprecation and removal is the wrong move. > > >------------------------------------------------------------------------ > >_______________________________________________ >Python-Dev mailing list >Python-Dev at python.org >http://mail.python.org/mailman/listinfo/python-dev >Unsubscribe: >http://mail.python.org/mailman/options/python-dev/rymg19%40gmail.com -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From arnaud.fontaine at nexedi.com Tue Aug 13 04:06:51 2013 From: arnaud.fontaine at nexedi.com (Arnaud Fontaine) Date: Tue, 13 Aug 2013 11:06:51 +0900 Subject: [Python-Dev] Dealing with import lock deadlock in Import Hooks In-Reply-To: (Armin Rigo's message of "Mon, 12 Aug 2013 11:09:45 +0200") References: <87a9knxsqv.fsf@duckcorp.org> Message-ID: <877gfqqr7o.fsf@duckcorp.org> Hi, Armin Rigo writes: > On Mon, Aug 12, 2013 at 9:39 AM, Arnaud Fontaine wrote: >> Thread 1 is trying to import a module 'foo.bar' (where 'foo' is a >> package containing dynamic modules) handled by Import Hooks I >> implemented, so import lock is acquired before even running the hooks >> (Python/import.c:PyImport_ImportModuleLevel()). Then, these import >> hooks try to load objects from ZODB and a request is sent and handled >> by another thread (Thread 2) which itself tries to import another >> module. > > A quick hack might be to call imp.release_lock() and > imp.acquire_lock() explicitly, from your import hook code, around > calls to ZODB. I suggested the same in my initial email, but I was wondering if there could be any issue by releasing the lock in find_module()/load_module() until the module is actually added to sys.modules. 
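For concreteness, here is a minimal sketch of what Armin's release-the-lock hack could look like around a PEP 302 hook. This is not Arnaud's actual code: the package name, the loader internals, and the fetch step are all hypothetical, and the loader is driven by hand at the end so the sketch stays self-contained (on Python 2.7 the slow fetch is where imp.release_lock()/imp.acquire_lock() would bracket the ZODB round-trip).

```python
import sys
import threading

# Private lock serializing loads of this package's dynamic modules,
# independent of the interpreter-wide import lock.
_foo_lock = threading.RLock()

class DynamicFinder(object):
    """Hypothetical PEP 302 finder/loader for a 'foo' package."""

    def find_module(self, fullname, path=None):
        return self if fullname.startswith("foo.") else None

    def load_module(self, fullname):
        with _foo_lock:
            if fullname in sys.modules:        # another thread got there first
                return sys.modules[fullname]
            # In the real hook, the global import lock would be released
            # around this slow call (the ZODB round-trip) and re-acquired
            # before sys.modules is touched again.
            source = self._fetch_source(fullname)
            module = type(sys)(fullname)       # a fresh module object
            sys.modules[fullname] = module
            try:
                exec(source, module.__dict__)
            except BaseException:
                del sys.modules[fullname]      # PEP 302: clean up on failure
                raise
            return module

    def _fetch_source(self, fullname):
        return "ORIGIN = %r" % fullname        # stand-in for the ZODB fetch

# Driven by hand here so the sketch is runnable on its own:
finder = DynamicFinder()
mod = finder.find_module("foo.bar").load_module("foo.bar")
print(mod.ORIGIN)  # prints foo.bar
```

The sys.modules check inside the private lock is what makes the window Arnaud worries about tolerable: a second thread that enters while the lock was dropped either blocks on the RLock or finds the finished module.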
Cheers, -- Arnaud Fontaine From arnaud.fontaine at nexedi.com Tue Aug 13 04:10:13 2013 From: arnaud.fontaine at nexedi.com (Arnaud Fontaine) Date: Tue, 13 Aug 2013 11:10:13 +0900 Subject: [Python-Dev] Dealing with import lock deadlock in Import Hooks In-Reply-To: (Brett Cannon's message of "Mon, 12 Aug 2013 10:23:12 -0400") References: <87a9knxsqv.fsf@duckcorp.org> <87wqnrw9vh.fsf@duckcorp.org> Message-ID: <87y586pchm.fsf@duckcorp.org> Brett Cannon writes: > On Mon, Aug 12, 2013 at 5:12 AM, Arnaud Fontaine > Yes, I saw the bug report and its patch implementing the import lock per >> module (mentioned in my initial email) and watched the presentation by >> Brett Cannon (BTW, I could not find the diagram explained during the >> presentation, anyone knows if it's available somewhere?). > > http://prezi.com/mqptpza9xbic/?utm_campaign=share&utm_medium=copy Thanks. Is the full diagram only available somewhere? (I mean as an image or PDF file, not within the presentation document itself) Cheers, -- Arnaud Fontaine From solipsis at pitrou.net Tue Aug 13 08:50:00 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 13 Aug 2013 08:50:00 +0200 Subject: [Python-Dev] Dealing with import lock deadlock in Import Hooks References: <87a9knxsqv.fsf@duckcorp.org> <877gfqqr7o.fsf@duckcorp.org> Message-ID: <20130813085000.6c686907@fsol> On Tue, 13 Aug 2013 11:06:51 +0900 Arnaud Fontaine wrote: > Hi, > > Armin Rigo writes: > > On Mon, Aug 12, 2013 at 9:39 AM, Arnaud Fontaine wrote: > >> Thread 1 is trying to import a module 'foo.bar' (where 'foo' is a > >> package containing dynamic modules) handled by Import Hooks I > >> implemented, so import lock is acquired before even running the hooks > >> (Python/import.c:PyImport_ImportModuleLevel()). Then, these import > >> hooks try to load objects from ZODB and a request is sent and handled > >> by another thread (Thread 2) which itself tries to import another > >> module. 
> > > > A quick hack might be to call imp.release_lock() and > > imp.acquire_lock() explicitly, from your import hook code, around > > calls to ZODB. > > I suggested the same in my initial email, but I was wondering if there > could be any issue by releasing the lock in find_module()/load_module() > until the module is actually added to sys.modules. Well, you are obviously on your own with such hacks. There is a reason the lock exists. Regards Antoine. From christian at python.org Tue Aug 13 11:06:27 2013 From: christian at python.org (Christian Heimes) Date: Tue, 13 Aug 2013 11:06:27 +0200 Subject: [Python-Dev] SSL issues in Python stdlib and 3rd party code In-Reply-To: <520918D9.4040303@python.org> References: <520918D9.4040303@python.org> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 CVE-2013-4238 has been signed to NULL bytes in subjectAltName issue. http://bugs.python.org/issue18709 http://www.openwall.com/lists/oss-security/2013/08/13/2 Should we assign a CVE to issue in ssl.match_hostname(), too? 
Even more projects have copied our code (bzr, tornado, pip, setuptools): http://bugs.python.org/issue17997 https://bugs.mageia.org/show_bug.cgi?id=10391 https://bugzilla.redhat.com/show_bug.cgi?id=963260#c11 Christian -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iQIcBAEBCgAGBQJSCfcLAAoJEMeIxMHUVQ1FOC0P/0bPHK67qHbLf6HkHiVGNoAe NUX5oT28bm00RyfmjU9ZPA3RWnPjFL9yiVXqP0mWzs4OzdPjGrHkw+uH285d/rFv Di/Bcckq1lz/wzzsBeF/vviPVaSdV3tjlABgl/M6b902XhqEhZGg3RtiWmOvn+tc 1uKnXM4kWr/nUDbKYC2mBqbZD0IvN+XBQcy2cikjEtYcZc4QO80Dq9pL6g+3c4jH 7PpcMDyffsqD+Cd/PKK+Aq2tJOSHdHnK7V3/kTpRd+jheKSnq6idZYwQDU9sOkHT NcVjqJtFkhGTzSD7u1/kNtD0UEleXn8sOxJwBLjcAqg+dV0BUEJk8uwuUn4Mi9Di MaZbCs7NU/gPFdrS9pVxujaKaANbM4BJJwravA1/YYgPOGt1MhWlREbTg6W69w2+ 57/PXs2Vt1nHISEyvCJLkIDVHeZx8ccm57YJ+zEMI2MKIBP7+21zY3Yq+86RwHs0 /h2mkzj8EQVcwvaVT4XfjezMp0A6Tbh/iwIQEbY6zUQ8OSBlbQ7FhF8VNXOqb5fh pSVv0B6j1nNB8IaAAlMC56wRX2cmT8LvejUfGUq0duP+yiDYScknuqnhPePM1PZz oPHSDbbfLI5s0Ab9d0encKKWatNmeoml/V7td5PUEAicDHJ1WnTB+FM9Qxv3qNQn 5J+eNhg2Bjj2en8PnbFo =NiC2 -----END PGP SIGNATURE----- From pelson.pub at gmail.com Tue Aug 13 11:17:06 2013 From: pelson.pub at gmail.com (Phil Elson) Date: Tue, 13 Aug 2013 10:17:06 +0100 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: Message-ID: On 12 August 2013 22:01, Ryan wrote: > Keep it, but put better documentation. It's needed. There are many a useful package outside of the standard library. If this is genuinely useful in some specialist use cases then I'm sure the code will find its way to a github repo and be maintained as a standalone package in itself. Who knows, being outside of the stdlib might breathe a new lease of life into the concept if the release cycles are not bound to Python releases. Personally, I'd say delete it, unless there is a good reason not to. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From storchaka at gmail.com Tue Aug 13 12:34:36 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 13 Aug 2013 13:34:36 +0300 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: Message-ID: 12.08.13 22:22, Brett Cannon wrote: > I have created http://bugs.python.org/issue18716 to deprecate the > formatter module for removal in Python 3.6 unless someone convinces me > otherwise that deprecation and removal is the wrong move. The formatter module doesn't look as buggy as the audioop module. If the formatter module was not removed in Python 3.0 I don't see why it can't wait until 4.0. From brett at python.org Tue Aug 13 15:24:04 2013 From: brett at python.org (Brett Cannon) Date: Tue, 13 Aug 2013 09:24:04 -0400 Subject: [Python-Dev] Dealing with import lock deadlock in Import Hooks In-Reply-To: <87y586pchm.fsf@duckcorp.org> References: <87a9knxsqv.fsf@duckcorp.org> <87wqnrw9vh.fsf@duckcorp.org> <87y586pchm.fsf@duckcorp.org> Message-ID: On Mon, Aug 12, 2013 at 10:10 PM, Arnaud Fontaine < arnaud.fontaine at nexedi.com> wrote: > Brett Cannon writes: > > On Mon, Aug 12, 2013 at 5:12 AM, Arnaud Fontaine < > arnaud.fontaine at nexedi.com wrote: > >> Yes, I saw the bug report and its patch implementing the import lock per > >> module (mentioned in my initial email) and watched the presentation by > >> Brett Cannon (BTW, I could not find the diagram explained during the > >> presentation, anyone knows if it's available somewhere?). > > > > http://prezi.com/mqptpza9xbic/?utm_campaign=share&utm_medium=copy > > Thanks. Is the full diagram only available somewhere? (I mean as an > image or PDF file, not within the presentation document itself) > Nope. I made the diagrams separately and then joined them directly in the presentation. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From brett at python.org Tue Aug 13 15:36:20 2013 From: brett at python.org (Brett Cannon) Date: Tue, 13 Aug 2013 09:36:20 -0400 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: Message-ID: On Tue, Aug 13, 2013 at 6:34 AM, Serhiy Storchaka wrote: > 12.08.13 22:22, Brett Cannon wrote: > > I have created http://bugs.python.org/issue18716 to deprecate the >> formatter module for removal in Python 3.6 unless someone convinces me >> otherwise that deprecation and removal is the wrong move. >> > > The formatter module doesn't look as buggy as the audioop module. If the > formatter module was not removed in Python 3.0 I don't see why it can't > wait until 4.0. It doesn't help that at PyCon CA we couldn't find a single person who had heard of the module (which included people like Alex Gaynor and Larry Hastings). I'm definitely deprecating it, but we can discuss postponing deletion. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Aug 13 16:35:52 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 13 Aug 2013 10:35:52 -0400 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: Message-ID: On 13 Aug 2013 09:39, "Brett Cannon" wrote: > > > > > On Tue, Aug 13, 2013 at 6:34 AM, Serhiy Storchaka wrote: >> >> 12.08.13 22:22, Brett Cannon wrote: >> >>> I have created http://bugs.python.org/issue18716 to deprecate the >>> formatter module for removal in Python 3.6 unless someone convinces me >>> otherwise that deprecation and removal is the wrong move. >> >> >> The formatter module doesn't look as buggy as the audioop module. If the formatter module was not removed in Python 3.0 I don't see why it can't wait until 4.0. > > > It doesn't help that at PyCon CA we couldn't find a single person who had heard of the module (which included people like Alex Gaynor and Larry Hastings). > > I'm definitely deprecating it, but we can discuss postponing deletion. Unless something is a serious bug magnet or is causing us maintenance hassles, we shouldn't remove it. Neither applies here, so an indefinite PendingDeprecation makes the most sense to me. Cheers, Nick. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From arnaud.fontaine at nexedi.com Tue Aug 13 10:28:42 2013 From: arnaud.fontaine at nexedi.com (Arnaud Fontaine) Date: Tue, 13 Aug 2013 17:28:42 +0900 Subject: [Python-Dev] Dealing with import lock deadlock in Import Hooks In-Reply-To: <20130813085000.6c686907@fsol> (Antoine Pitrou's message of "Tue, 13 Aug 2013 08:50:00 +0200") References: <87a9knxsqv.fsf@duckcorp.org> <877gfqqr7o.fsf@duckcorp.org> <20130813085000.6c686907@fsol> Message-ID: <87zjsmnged.fsf@duckcorp.org> Antoine Pitrou writes: > On Tue, 13 Aug 2013 11:06:51 +0900 Arnaud Fontaine wrote: >> I suggested the same in my initial email, but I was wondering if there >> could be any issue by releasing the lock in find_module()/load_module() >> until the module is actually added to sys.modules.
Access to load_module() of this package is protected by a RLock I defined, so that modules within foo cannot be imported in parallel. Until the module is added to sys.modules and then the code loaded, release the import lock. For find_module(), only filter the full module name and if this is a module from foo package, then acquired the same RLock defined for load_module() to access variables shared with find_module(). Also, if a patch backporting the features from Python 3.3 to import.c for Python 2.7 would be written, is there any chance it could be accepted? Regards, -- Arnaud Fontaine From solipsis at pitrou.net Tue Aug 13 17:31:47 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 13 Aug 2013 17:31:47 +0200 Subject: [Python-Dev] Dealing with import lock deadlock in Import Hooks References: <87a9knxsqv.fsf@duckcorp.org> <877gfqqr7o.fsf@duckcorp.org> <20130813085000.6c686907@fsol> <87zjsmnged.fsf@duckcorp.org> Message-ID: <20130813173147.1603e342@pitrou.net> Le Tue, 13 Aug 2013 17:28:42 +0900, Arnaud Fontaine a ?crit : > Antoine Pitrou writes: > > On Tue, 13 Aug 2013 11:06:51 +0900 Arnaud Fontaine > > wrote: > >> I suggested the same in my initial email, but I was wondering if > >> there could be any issue by releasing the lock in > >> find_module()/load_module() until the module is actually added to > >> sys.modules. > > > > Well, you are obviously on your own with such hacks. There is a > > reason the lock exists. > > Yes. Actually, I was thinking about implementing something similar to > what has been done in Python 3.3 but for Python 2.7 with a > corser-grain lock. From my understanding of import.c, it should work > but I was hoping that someone with more experience in import code > would confirm: It's probably possible, but it will be non-trivial and delicate to get right. > Also, if a patch backporting the features from Python 3.3 to import.c > for Python 2.7 would be written, is there any chance it could be > accepted? Definitely not. 
We generally don't backport any features, especially when the risk is high due to a feature's implementation complexity. Regards Antoine. From tjreedy at udel.edu Tue Aug 13 18:37:45 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 13 Aug 2013 12:37:45 -0400 Subject: [Python-Dev] SSL issues in Python stdlib and 3rd party code In-Reply-To: References: <520918D9.4040303@python.org> Message-ID: On 8/13/2013 5:06 AM, Christian Heimes wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA512 > > CVE-2013-4238 has been signed to NULL bytes in subjectAltName issue. assigned... > > http://bugs.python.org/issue18709 > http://www.openwall.com/lists/oss-security/2013/08/13/2 > > Should we assign a CVE to issue in ssl.match_hostname(), too? Even > more projects have copied our code (bzr, tornado, pip, setuptools): > > http://bugs.python.org/issue17997 > https://bugs.mageia.org/show_bug.cgi?id=10391 > https://bugzilla.redhat.com/show_bug.cgi?id=963260#c11 I personally thought that the CVE people did the assigning, or are you talking about asking them? What are the implications of 'yes' versus 'no'? If a number would get more attention, and you think that's needed, do it. -- Terry Jan Reedy From steve at pearwood.info Wed Aug 14 01:51:18 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 14 Aug 2013 09:51:18 +1000 Subject: [Python-Dev] Guidance regarding tests for the standard lib Message-ID: <520AC676.1080707@pearwood.info> Hi, I have raised a tracker item and PEP for adding a statistics module to the standard library: http://bugs.python.org/issue18606 http://www.python.org/dev/peps/pep-0450/ and I'm about to submit a patch containing my updated code and tests, but I've run into a problem with testing. My existing tests use unittest, and follow the basic boilerplate documented here: http://docs.python.org/3/library/test.html and when run directly they pass, but when I try to run them using Python 3.4a -m test -j3 they break.
My question is, is it acceptable to post the code and tests to the tracker as-is, and ask for a pronouncement on the PEP first, and then fix the test breakage later? If not, can somebody mentor me in understanding what I need to do here? To avoid all doubt, the tests pass if I call them like this: ./python Lib/test/test_statistics.py but raise errors when I call them like this: ./python -m test -j3 Thanks in advance, -- Steven From ethan at stoneleaf.us Wed Aug 14 02:14:30 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 13 Aug 2013 17:14:30 -0700 Subject: [Python-Dev] Guidance regarding tests for the standard lib In-Reply-To: <520AC676.1080707@pearwood.info> References: <520AC676.1080707@pearwood.info> Message-ID: <520ACBE6.2010006@stoneleaf.us> On 08/13/2013 04:51 PM, Steven D'Aprano wrote: > > I have raise a tracker item and PEP for adding a statistics module to the standard library: > > http://bugs.python.org/issue18606 > > http://www.python.org/dev/peps/pep-0450/ The bug-tracker doesn't think you've submitted a CLA yet. If you haven't you'll need to get that done. -- ~Ethan~ From ethan at stoneleaf.us Wed Aug 14 02:12:10 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 13 Aug 2013 17:12:10 -0700 Subject: [Python-Dev] Guidance regarding tests for the standard lib In-Reply-To: <520AC676.1080707@pearwood.info> References: <520AC676.1080707@pearwood.info> Message-ID: <520ACB5A.1080201@stoneleaf.us> On 08/13/2013 04:51 PM, Steven D'Aprano wrote: > > My question is, is it acceptable to post the code and tests to > the tracker as-is, and ask for a pronouncement on the PEP first, > and then fix the test breakage later? Certainly. 
-- ~Ethan~ From victor.stinner at gmail.com Wed Aug 14 02:37:39 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 14 Aug 2013 02:37:39 +0200 Subject: [Python-Dev] Guidance regarding tests for the standard lib In-Reply-To: <520AC676.1080707@pearwood.info> References: <520AC676.1080707@pearwood.info> Message-ID: Send the patch somewhere (ex: attach it to an email, or to the bug tracker, as you want), or give the error message, if you want some help. > Ask for a pronouncement on the PEP first, and then fix the test breakage later? Sometimes, it's possible to pronounce on a PEP without a working implementation. But it's easier to pronounce with a working implementation :-) Victor From ncoghlan at gmail.com Wed Aug 14 02:50:23 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 13 Aug 2013 20:50:23 -0400 Subject: [Python-Dev] Guidance regarding tests for the standard lib In-Reply-To: References: <520AC676.1080707@pearwood.info> Message-ID: On 13 Aug 2013 19:40, "Victor Stinner" wrote: > > Send the patch somewhere (ex: attach it to an email, or to the bug > tracker, as you want), or give the error message, if you want some > help. > > > Ask for a pronouncement on the PEP first, and then fix the test breakage later? > > Sometimes, it's possible to pronounce on a PEP without a working > implementation. But it's easier to pronounce with a working > implementation :-) It's more typical for reference implementations to be "proof of concept" quality code, though. It's very rare to have a full, ready to apply patch (although we certainly don't complain if that happens!) (To directly answer the original question: a test integration glitch isn't a blocker for PEP acceptance, just for actually committing the implementation) Cheers, Nick. 
> Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Wed Aug 14 04:22:23 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 14 Aug 2013 12:22:23 +1000 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: Message-ID: <520AE9DF.6090406@pearwood.info> On 13/08/13 23:36, Brett Cannon wrote: > On Tue, Aug 13, 2013 at 6:34 AM, Serhiy Storchaka wrote: > >> 12.08.13 22:22, Brett Cannon wrote: >> >> I have created http://bugs.python.org/issue18716 to deprecate the >>> formatter module for removal in Python 3.6 unless someone convinces me >>> otherwise that deprecation and removal is the wrong move. >>> >> >> The formatter module doesn't look as buggy as the audioop module. If the >> formatter module was not removed in Python 3.0 I don't see why it can't >> wait until 4.0. > > > It doesn't help that at PyCon CA we couldn't find a single person who had > heard of the module (which included people like Alex Gaynor and Larry > Hastings). You know, there may be one or two Python programmers who didn't go to PyCon CA... :-) I knew of formatter before this thread. I've looked at it for use in my own code, but the lack of useful documentation or clear examples put me off, so I put it in the pile of "things to investigate later", but it is definitely a module that interests me. The documentation for 2.7 claims that it is used by HTMLParser, but that doesn't appear to be true. The claim is removed from the 3.3 documentation. Over on Python-Ideas mailing list, there are periodic frenzies of requests to delete all sorts of things, e.g. a recent thread debating deleting various string methods in favour of using {} string formatting.
My answer there is the same as my answer here: unless the formatter module is actively harmful, then deprecation risks causing more pain than benefit. All those thousands of coders who don't use formatter? The benefit to them of deprecating it is next to zero. That one guy in Uberwald you've never heard of who does use it? It's going to cause him a lot of pain. -- Steven From tjreedy at udel.edu Wed Aug 14 06:00:58 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 14 Aug 2013 00:00:58 -0400 Subject: [Python-Dev] Guidance regarding tests for the standard lib In-Reply-To: <520AC676.1080707@pearwood.info> References: <520AC676.1080707@pearwood.info> Message-ID: <520B00FA.9010207@udel.edu> On 8/13/2013 7:51 PM, Steven D'Aprano wrote: > http://bugs.python.org/issue18606 Tests at end of statistics.patch. > and I'm about to submit a patch containing my updated code and tests, > but I've run into a problem with testing. My existing tests use > unittest, and follow the basic boilerplate documented here: > > http://docs.python.org/3/library/test.html +def test_main(): + # run_unittest() + unittest.main() + + +if __name__ == "__main__": + test_main() This is faulty, as explained below. It is not the boilerplate in http://docs.python.org/3/library/test.html#writing-unit-tests-for-the-test-package which is correct. Compress to just if __name__ == "__main__": unittest.main() > To avoid all doubt, the tests pass if I call them like this: > ./python Lib/test/test_statistics.py That is a faulty test as explained below. Patch applies cleanly on Win7, 32bit build. And indeed, F:\Python\dev\py34\PCbuild>python_d ../Lib/test/test_statistics.py works. (the progress dots will have to go before being applied ;-). The above should be equivalent to ...> python -m test.test_statistics. but it is not.
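The boilerplate Terry recommends, fleshed out into a runnable sketch. The test case itself is hypothetical, and exit=False appears only so the sketch can be exercised inline; the actual recommended boilerplate is a plain unittest.main() under the guard.

```python
import unittest

class TestStatisticsSketch(unittest.TestCase):
    """Hypothetical stand-in for one of the statistics test cases."""

    def test_mean(self):
        data = [1, 2, 3, 4]
        self.assertEqual(sum(data) / len(data), 2.5)

# The whole regrtest-friendly boilerplate is this two-line guard: with no
# module-level test_main(), regrtest falls back to the unittest loader,
# and the very same file still runs standalone.
if __name__ == "__main__":
    unittest.main(exit=False)  # real boilerplate: unittest.main()
```

Because there is no hand-written test_main(), the file behaves identically under `python -m test.test_foo`, under regrtest, and when executed directly.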
F:\Python\dev\py34\PCbuild>python_d -m test.test_statistics Traceback (most recent call last): File "F:\Python\dev\py34\lib\runpy.py", line 160, in _run_module_as_main "__main__", fname, loader, pkg_name) File "F:\Python\dev\py34\lib\runpy.py", line 73, in _run_code exec(code, run_globals) File "F:\Python\dev\py34\lib\test\test_statistics.py", line 16, in from test_statistics_approx import NumericTestCase ImportError: No module named 'test_statistics_approx' Your test *depends* on Lib/test being the current directory, at the beginning of the path. It should not. Your import must be from test.test_statistics_approx import NumericTestCase With this fixed, the above command works. Once the test runs from anywhere with unittest, run with regrtest, ...> python -m test test_statistics Initially, this gives a traceback with the same last three lines. With the revision, I get File "F:\Python\dev\py34\lib\unittest\loader.py", line 113, in loadTestsFromName parent, obj = obj, getattr(obj, part) AttributeError: 'module' object has no attribute 'test_statistics' This is related to the first fault I mentioned above. If regrtest finds the old style 'test_main', it uses the regrtest loader. If it does not, it uses the unittest loader. This is documented in the code ;-). With the proper 2-line boilerplate as a second change, the second line now works. > but raise errors when I call them like this: > ./python -m test -j3 LOL. In one jump, you changed the current directory, the test runner, the number of test files run, and the mode of testing. Try just one change at a time. When you run the suite, you run test_statistics_approx.py, including the test case imported into test_statistics. (Is it run twice?). Tested as part of the suite gives a bizarre message I do not pretend to understand.
[290/379/3] test_statistics_approx
Usage: regrtest.py [options]

regrtest.py: error: no such option: --slaveargs
test test_statistics_approx crashed -- Traceback (most recent call last):
  File "F:\Python\dev\py34\lib\optparse.py", line 1391, in parse_args
    stop = self._process_args(largs, rargs, values)
  File "F:\Python\dev\py34\lib\optparse.py", line 1431, in _process_args
    self._process_long_opt(rargs, values)
  File "F:\Python\dev\py34\lib\optparse.py", line 1484, in _process_long_opt
    opt = self._match_long_opt(opt)
  File "F:\Python\dev\py34\lib\optparse.py", line 1469, in _match_long_opt
    return _match_abbrev(opt, self._long_opt)
  File "F:\Python\dev\py34\lib\optparse.py", line 1674, in _match_abbrev
    raise BadOptionError(s)
optparse.BadOptionError: no such option: --slaveargs

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "F:\Python\dev\py34\lib\test\regrtest.py", line 1305, in runtest_inner
    test_runner()
  File "F:\Python\dev\py34\lib\test\test_statistics_approx.py", line 597, in test_main
    unittest.main()
  File "F:\Python\dev\py34\lib\unittest\main.py", line 124, in __init__
    self.parseArgs(argv)
  File "F:\Python\dev\py34\lib\unittest\main.py", line 148, in parseArgs
    options, args = parser.parse_args(argv[1:])
  File "F:\Python\dev\py34\lib\optparse.py", line 1393, in parse_args
    self.error(str(err))
  File "F:\Python\dev\py34\lib\optparse.py", line 1573, in error
    self.exit(2, "%s: error: %s\n" % (self.get_prog_name(), msg))
  File "F:\Python\dev\py34\lib\optparse.py", line 1563, in exit
    sys.exit(status)
SystemExit: 2

So back up and test it by itself first. It passes under unittest, but needs the revised boilerplate for regrtest. The two together work fine. When I rerun the suite, poof!, the mysterious traceback is gone and both pass. The following appears after the 2nd.

'''
..
----------------------------------------------------------------------
Ran 2 tests in 0.000s

OK
'''

Perhaps this is from the doctest.
If so, it probably should be suppressed.

One last thing: test order. Buildbots randomize the order of running files, so if your tests modify the environment in a way that affects another test, it might cause occasional failures.

-- Terry Jan Reedy

From arnaud.fontaine at nexedi.com Wed Aug 14 07:17:59 2013
From: arnaud.fontaine at nexedi.com (Arnaud Fontaine)
Date: Wed, 14 Aug 2013 14:17:59 +0900
Subject: [Python-Dev] Dealing with import lock deadlock in Import Hooks
In-Reply-To: <20130813173147.1603e342@pitrou.net> (Antoine Pitrou's message of "Tue, 13 Aug 2013 17:31:47 +0200")
References: <87a9knxsqv.fsf@duckcorp.org> <877gfqqr7o.fsf@duckcorp.org> <20130813085000.6c686907@fsol> <87zjsmnged.fsf@duckcorp.org> <20130813173147.1603e342@pitrou.net>
Message-ID: <87bo50onp4.fsf@duckcorp.org>

Antoine Pitrou writes:
> Le Tue, 13 Aug 2013 17:28:42 +0900, Arnaud Fontaine a écrit :
>> Yes. Actually, I was thinking about implementing something similar to
>> what has been done in Python 3.3 but for Python 2.7 with a
>> coarser-grain lock. From my understanding of import.c, it should work
>> but I was hoping that someone with more experience in import code
>> would confirm:
>
> It's probably possible, but it will be non-trivial and delicate to get
> right.

From my understanding of import.c source code, until something is added to sys.modules or the code loaded, there should be no side-effect to releasing the lock, right? (eg there is no global variables/data being shared for importing modules, meaning that releasing the lock should be safe as long as the modules loaded through import hooks are protected by a lock)

>> Also, if a patch backporting the features from Python 3.3 to import.c
>> for Python 2.7 would be written, is there any chance it could be
>> accepted?
>
> Definitely not. We generally don't backport any features, especially
> when the risk is high due to a feature's implementation complexity.

I see, thanks for your answer.
Regards,
-- Arnaud Fontaine

From solipsis at pitrou.net Wed Aug 14 10:15:29 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 14 Aug 2013 10:15:29 +0200
Subject: [Python-Dev] Dealing with import lock deadlock in Import Hooks
References: <87a9knxsqv.fsf@duckcorp.org> <877gfqqr7o.fsf@duckcorp.org> <20130813085000.6c686907@fsol> <87zjsmnged.fsf@duckcorp.org> <20130813173147.1603e342@pitrou.net> <87bo50onp4.fsf@duckcorp.org>
Message-ID: <20130814101529.07697e45@pitrou.net>

Le Wed, 14 Aug 2013 14:17:59 +0900, Arnaud Fontaine a écrit :
> Antoine Pitrou writes:
>
>> Le Tue, 13 Aug 2013 17:28:42 +0900, Arnaud Fontaine a écrit :
>>> Yes. Actually, I was thinking about implementing something similar
>>> to what has been done in Python 3.3 but for Python 2.7 with a
>>> coarser-grain lock. From my understanding of import.c, it should
>>> work but I was hoping that someone with more experience in import
>>> code would confirm:
>>
>> It's probably possible, but it will be non-trivial and delicate to
>> get right.
>
> From my understanding of import.c source code, until something is
> added to sys.modules or the code loaded, there should be no
> side-effect to releasing the lock, right? (eg there is no global
> variables/data being shared for importing modules, meaning that
> releasing the lock should be safe as long as the modules loaded
> through import hooks are protected by a lock)

Er, probably, but import.c is a nasty pile of code. It's true the import lock is there mainly to:

- avoid incomplete modules from being seen by other threads
- avoid a module from being executed twice

But that doesn't mean it can't have had any other - unintended - benefits ;-) (also, some import hooks might not be thread-safe, something which they haven't had to bother about until now)

Regards
Antoine.
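(Antoine's parenthetical caveat -- that import hooks may not be thread-safe on their own -- is exactly why Arnaud's plan needs each hook to carry its own lock. A minimal sketch of that pattern, written against today's importlib API rather than the 2.7 PEP 302 API under discussion; the finder class, module name, and attribute are all invented:)

```python
import sys
import threading
from importlib.abc import Loader, MetaPathFinder
from importlib.machinery import ModuleSpec

# The hook's own lock, independent of the interpreter's import lock.
_hook_lock = threading.Lock()

class ToyFinder(MetaPathFinder, Loader):
    """A toy import hook that serializes its own work, so it stays
    correct even if the global import lock is released around it."""

    def find_spec(self, fullname, path, target=None):
        if fullname == "toy_virtual_module":
            return ModuleSpec(fullname, self)
        return None

    def create_module(self, spec):
        return None  # fall back to the default module object

    def exec_module(self, module):
        with _hook_lock:  # only one thread populates the module at a time
            module.answer = 42

sys.meta_path.insert(0, ToyFinder())

import toy_virtual_module
print(toy_virtual_module.answer)  # prints 42
```

The point of the design is that correctness no longer depends on the interpreter holding its global lock across the hook's entire body.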
From brett at python.org Wed Aug 14 17:08:29 2013
From: brett at python.org (Brett Cannon)
Date: Wed, 14 Aug 2013 11:08:29 -0400
Subject: [Python-Dev] Deprecating the formatter module
In-Reply-To: <520AE9DF.6090406@pearwood.info>
References: <520AE9DF.6090406@pearwood.info>
Message-ID:

On Tue, Aug 13, 2013 at 10:22 PM, Steven D'Aprano wrote:
> On 13/08/13 23:36, Brett Cannon wrote:
>> On Tue, Aug 13, 2013 at 6:34 AM, Serhiy Storchaka wrote:
>>> 12.08.13 22:22, Brett Cannon wrote:
>>>> I have created http://bugs.python.org/issue18716 to deprecate the
>>>> formatter module for removal in Python 3.6 unless someone convinces me
>>>> otherwise that deprecation and removal is the wrong move.
>>>
>>> The formatter module doesn't look as buggy as the audioop module. If the
>>> formatter module was not removed in Python 3.0 I don't see why it can't
>>> wait to 4.0.
>>
>> It doesn't help that at PyCon CA we couldn't find a single person who had
>> heard of the module (which included people like Alex Gaynor and Larry
>> Hastings).
>
> You know, there may be one or two Python programmers who didn't go to
> PyCon CA... :-)

Sure, but you would assume at least *one* person would have known of the module in a room of sprinters.

> I knew of formatter before this thread. I've looked at it for use in my
> own code, but the lack of useful documentation or clear examples put me
> off, so I put it in the pile of "things to investigate later", but it is
> definitely a module that interests me.

Sure, but not to the extent to try and update the documentation, provide examples, etc. And totally lacking tests doesn't help with the argument that there is interest in it either.

> The documentation for 2.7 claims that it is used by HTMLParser, but that
> doesn't appear to be true. The claim is removed from the 3.3 documentation.
> Over on Python-Ideas mailing list, there are periodic frenzies of requests
> to delete all sorts of things, e.g. a recent thread debating deleting
> various string methods in favour of using {} string formatting. My answer
> there is the same as my answer here: unless the formatter module is
> actively harmful, then deprecation risks causing more pain than benefit.
> All those thousands of coders who don't use formatter? The benefit to them
> of deprecating it is next to zero. That one guy in Uberwald you've never
> heard of who does use it? It's going to cause him a lot of pain.

That's fine, but this is also about me and every other core developer who stands behind every module in the stdlib as being as bug-free as possible and generally useful enough to warrant people learning about them. I'm saying I don't want to maintain it and have to clean up its code every time there is a stdlib-wide cleanup or if there is ever a bug. Every bug takes time and effort away -- no matter how infrequently the module has them -- from other modules which people could be focusing their time on and which have a greater impact on the wider community. You can argue that some core dev might care enough to maintain it, but unless someone lists themselves on http://docs.python.org/devguide/experts.html for a module then the burden of maintenance falls on all of us.

There is also the issue of semantic overload in terms of simply trying to keep straight what is in the stdlib. It also means book authors need to decide whether to care about this module or not and use it in examples. Teachers need to be aware of the module to answer questions, etc. We take adding a module to the stdlib very seriously for all of these reasons, and yet people seem to forget that the exact same reasons apply to modules already in the stdlib, whether they would be added today or not (and in this instance I would argue not).
There is a balance to keeping the load of work for core devs at a level that is tenable given the level of quality we expect from ourselves, which means making sure we don't let cruft build up in the stdlib and overwhelm us.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From solipsis at pitrou.net Wed Aug 14 17:26:13 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 14 Aug 2013 17:26:13 +0200
Subject: [Python-Dev] Deprecating the formatter module
References: <520AE9DF.6090406@pearwood.info>
Message-ID: <20130814172613.3a0df3da@pitrou.net>

Le Wed, 14 Aug 2013 11:08:29 -0400, Brett Cannon a écrit :
>> You know, there may be one or two Python programmers who didn't go
>> to PyCon CA... :-)
>
> Sure, but you would assume at least *one* person would have known of
> the module in a room of sprinters.

Not necessarily. There are specialized modules which may be of use to only a certain category of people, and unknown to others.

>> Over on Python-Ideas mailing list, there are periodic frenzies of
>> requests to delete all sorts of things, e.g. a recent thread
>> debating deleting various string methods in favour of using {}
>> string formatting. My answer there is the same as my answer here:
>> unless the formatter module is actively harmful, then deprecation
>> risks causing more pain than benefit. All those thousands of coders
>> who don't use formatter? The benefit to them of deprecating it is
>> next to zero. That one guy in Uberwald you've never heard of who
>> does use it? It's going to cause him a lot of pain.
>
> That's fine, but this is also about me and every other core developer
> who stands behind every module in the stdlib as being as bug-free as
> possible and generally useful to warrant people learning about them.
> I'm saying I don't want to maintain it and have to clean up its code
> every time there is a stdlib-wide cleanup or if there is ever a bug.
Not all of us have to maintain it :-) I'm not gonna maintain it either, but that doesn't mean someone else won't step up. Regards Antoine. From ncoghlan at gmail.com Wed Aug 14 17:47:15 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 14 Aug 2013 11:47:15 -0400 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: <520AE9DF.6090406@pearwood.info> Message-ID: On 14 August 2013 11:08, Brett Cannon wrote: > We take adding a module to the stdlib very seriously for all of these > reasons and yet people seem to forget that the exact same reasons apply to > modules already in the stdlib, whether they would be added today or not (and > in this instance I would argue not). There is a balance to keeping the load > of work for core devs at a level that is tenable to the level of quality we > expect from ourselves which means making sure we don't let cruft build up in > the stdlib and overwhelm us. I've already suggested a solution to that at the language summit [1]: we create a "Legacy Modules" section in the docs index and dump all the modules that are in the "These are only in the standard library because they were added before PyPI existed, aren't really actively maintained, but we can't remove them due to backwards compatibility concerns" category there. Clear indication of their status for authors, educators, future users and us, with no risk of breaking currently working code. Cheers, Nick. 
[1] http://python-notes.curiousefficiency.org/en/latest/conferences/pyconus2013/20130313-language-summit.html#legacy-modules -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From brett at python.org Wed Aug 14 17:55:00 2013 From: brett at python.org (Brett Cannon) Date: Wed, 14 Aug 2013 11:55:00 -0400 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: <520AE9DF.6090406@pearwood.info> Message-ID: On Wed, Aug 14, 2013 at 11:47 AM, Nick Coghlan wrote: > On 14 August 2013 11:08, Brett Cannon wrote: > > We take adding a module to the stdlib very seriously for all of these > > reasons and yet people seem to forget that the exact same reasons apply > to > > modules already in the stdlib, whether they would be added today or not > (and > > in this instance I would argue not). There is a balance to keeping the > load > > of work for core devs at a level that is tenable to the level of quality > we > > expect from ourselves which means making sure we don't let cruft build > up in > > the stdlib and overwhelm us. > > I've already suggested a solution to that at the language summit [1]: > we create a "Legacy Modules" section in the docs index and dump all > the modules that are in the "These are only in the standard library > because they were added before PyPI existed, aren't really actively > maintained, but we can't remove them due to backwards compatibility > concerns" category there. > > Clear indication of their status for authors, educators, future users > and us, with no risk of breaking currently working code. > I view a deprecation as the same thing. If we leave the module in until Python 4 then I can live with that, but simply moving documentation around is not enough to communicate to those who didn't read the release notes to know modules they rely on are now essentially orphaned. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Wed Aug 14 18:09:39 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 14 Aug 2013 12:09:39 -0400 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: <520AE9DF.6090406@pearwood.info> Message-ID: On 14 August 2013 11:55, Brett Cannon wrote: > On Wed, Aug 14, 2013 at 11:47 AM, Nick Coghlan wrote: >> >> On 14 August 2013 11:08, Brett Cannon wrote: >> > We take adding a module to the stdlib very seriously for all of these >> > reasons and yet people seem to forget that the exact same reasons apply >> > to >> > modules already in the stdlib, whether they would be added today or not >> > (and >> > in this instance I would argue not). There is a balance to keeping the >> > load >> > of work for core devs at a level that is tenable to the level of quality >> > we >> > expect from ourselves which means making sure we don't let cruft build >> > up in >> > the stdlib and overwhelm us. >> >> I've already suggested a solution to that at the language summit [1]: >> we create a "Legacy Modules" section in the docs index and dump all >> the modules that are in the "These are only in the standard library >> because they were added before PyPI existed, aren't really actively >> maintained, but we can't remove them due to backwards compatibility >> concerns" category there. >> >> Clear indication of their status for authors, educators, future users >> and us, with no risk of breaking currently working code. > > > I view a deprecation as the same thing. If we leave the module in until > Python 4 then I can live with that, but simply moving documentation around > is not enough to communicate to those who didn't read the release notes to > know modules they rely on are now essentially orphaned. No, a deprecation isn't enough, because it doesn't help authors and educators to know "this is legacy, you can skip it". We need both. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From eliben at gmail.com Wed Aug 14 18:17:32 2013 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 14 Aug 2013 09:17:32 -0700 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: <520AE9DF.6090406@pearwood.info> Message-ID: On Wed, Aug 14, 2013 at 9:09 AM, Nick Coghlan wrote: > On 14 August 2013 11:55, Brett Cannon wrote: > > On Wed, Aug 14, 2013 at 11:47 AM, Nick Coghlan > wrote: > >> > >> On 14 August 2013 11:08, Brett Cannon wrote: > >> > We take adding a module to the stdlib very seriously for all of these > >> > reasons and yet people seem to forget that the exact same reasons > apply > >> > to > >> > modules already in the stdlib, whether they would be added today or > not > >> > (and > >> > in this instance I would argue not). There is a balance to keeping the > >> > load > >> > of work for core devs at a level that is tenable to the level of > quality > >> > we > >> > expect from ourselves which means making sure we don't let cruft build > >> > up in > >> > the stdlib and overwhelm us. > >> > >> I've already suggested a solution to that at the language summit [1]: > >> we create a "Legacy Modules" section in the docs index and dump all > >> the modules that are in the "These are only in the standard library > >> because they were added before PyPI existed, aren't really actively > >> maintained, but we can't remove them due to backwards compatibility > >> concerns" category there. > >> > >> Clear indication of their status for authors, educators, future users > >> and us, with no risk of breaking currently working code. > > > > > > I view a deprecation as the same thing. If we leave the module in until > > Python 4 then I can live with that, but simply moving documentation > around > > is not enough to communicate to those who didn't read the release notes > to > > know modules they rely on are now essentially orphaned. 
> > No, a deprecation isn't enough, because it doesn't help authors and > educators to know "this is legacy, you can skip it". We need both. > +1 for both and for leaving the module in until "Python 4". Nick, perhaps we can have this "legacy-zation" process for modules documented somewhere? Devguide? mini-PEP? Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Wed Aug 14 18:12:42 2013 From: brett at python.org (Brett Cannon) Date: Wed, 14 Aug 2013 12:12:42 -0400 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: <520AE9DF.6090406@pearwood.info> Message-ID: On Wed, Aug 14, 2013 at 12:09 PM, Nick Coghlan wrote: > On 14 August 2013 11:55, Brett Cannon wrote: > > On Wed, Aug 14, 2013 at 11:47 AM, Nick Coghlan > wrote: > >> > >> On 14 August 2013 11:08, Brett Cannon wrote: > >> > We take adding a module to the stdlib very seriously for all of these > >> > reasons and yet people seem to forget that the exact same reasons > apply > >> > to > >> > modules already in the stdlib, whether they would be added today or > not > >> > (and > >> > in this instance I would argue not). There is a balance to keeping the > >> > load > >> > of work for core devs at a level that is tenable to the level of > quality > >> > we > >> > expect from ourselves which means making sure we don't let cruft build > >> > up in > >> > the stdlib and overwhelm us. > >> > >> I've already suggested a solution to that at the language summit [1]: > >> we create a "Legacy Modules" section in the docs index and dump all > >> the modules that are in the "These are only in the standard library > >> because they were added before PyPI existed, aren't really actively > >> maintained, but we can't remove them due to backwards compatibility > >> concerns" category there. > >> > >> Clear indication of their status for authors, educators, future users > >> and us, with no risk of breaking currently working code. 
> > > > > > I view a deprecation as the same thing. If we leave the module in until > > Python 4 then I can live with that, but simply moving documentation > around > > is not enough to communicate to those who didn't read the release notes > to > > know modules they rely on are now essentially orphaned. > > No, a deprecation isn't enough, because it doesn't help authors and > educators to know "this is legacy, you can skip it". We need both. > That I'm fine with, your wording just suggested to me you were only thinking of the doc change. -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Wed Aug 14 18:23:32 2013 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 14 Aug 2013 17:23:32 +0100 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: <520AE9DF.6090406@pearwood.info> Message-ID: <520BAF04.6090109@mrabarnett.plus.com> On 14/08/2013 17:17, Eli Bendersky wrote: > > > > On Wed, Aug 14, 2013 at 9:09 AM, Nick Coghlan > wrote: > > On 14 August 2013 11:55, Brett Cannon > wrote: > > On Wed, Aug 14, 2013 at 11:47 AM, Nick Coghlan > > wrote: > >> > >> On 14 August 2013 11:08, Brett Cannon > wrote: > >> > We take adding a module to the stdlib very seriously for all > of these > >> > reasons and yet people seem to forget that the exact same > reasons apply > >> > to > >> > modules already in the stdlib, whether they would be added > today or not > >> > (and > >> > in this instance I would argue not). There is a balance to > keeping the > >> > load > >> > of work for core devs at a level that is tenable to the level > of quality > >> > we > >> > expect from ourselves which means making sure we don't let > cruft build > >> > up in > >> > the stdlib and overwhelm us. 
> >> > >> I've already suggested a solution to that at the language summit > [1]: > >> we create a "Legacy Modules" section in the docs index and dump all > >> the modules that are in the "These are only in the standard library > >> because they were added before PyPI existed, aren't really actively > >> maintained, but we can't remove them due to backwards compatibility > >> concerns" category there. > >> > >> Clear indication of their status for authors, educators, future > users > >> and us, with no risk of breaking currently working code. > > > > > > I view a deprecation as the same thing. If we leave the module in > until > > Python 4 then I can live with that, but simply moving > documentation around > > is not enough to communicate to those who didn't read the release > notes to > > know modules they rely on are now essentially orphaned. > > No, a deprecation isn't enough, because it doesn't help authors and > educators to know "this is legacy, you can skip it". We need both. > > > +1 for both and for leaving the module in until "Python 4". > > Nick, perhaps we can have this "legacy-zation" process for modules > documented somewhere? Devguide? mini-PEP? > What about also for certain features of modules, such as re's LOCALE flag? From skip at pobox.com Wed Aug 14 18:41:10 2013 From: skip at pobox.com (Skip Montanaro) Date: Wed, 14 Aug 2013 11:41:10 -0500 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: Message-ID: > We then realized that it isn't really used by anyone (pydoc uses it but it > should have been using textwrap). Looking at the history of the module it > has just been a magnet for cleanup revisions and not actual usage or > development since Guido added it back in 1995. 
Note that it is/was used in Grail, whose most recent release appears to have been 1999: http://grail.sourceforge.net/ I'm not suggesting this is an overriding reason to keep it, just noting that it has seen significant use at one time by a rather prominent Python developer (who was apparently not at your sprint). :-) Skip From ncoghlan at gmail.com Wed Aug 14 19:42:09 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 14 Aug 2013 13:42:09 -0400 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: <520AE9DF.6090406@pearwood.info> Message-ID: On 14 August 2013 12:17, Eli Bendersky wrote: > On Wed, Aug 14, 2013 at 9:09 AM, Nick Coghlan wrote: >> >> On 14 August 2013 11:55, Brett Cannon wrote: >> > I view a deprecation as the same thing. If we leave the module in until >> > Python 4 then I can live with that, but simply moving documentation >> > around >> > is not enough to communicate to those who didn't read the release notes >> > to >> > know modules they rely on are now essentially orphaned. >> >> No, a deprecation isn't enough, because it doesn't help authors and >> educators to know "this is legacy, you can skip it". We need both. > > > +1 for both and for leaving the module in until "Python 4". > > Nick, perhaps we can have this "legacy-zation" process for modules > documented somewhere? Devguide? mini-PEP? That would be PEP 4 :) PEPs 5 and 6 could do with some TLC, too :P Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rosuav at gmail.com Wed Aug 14 19:50:59 2013 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 14 Aug 2013 18:50:59 +0100 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: <520AE9DF.6090406@pearwood.info> Message-ID: On Wed, Aug 14, 2013 at 6:42 PM, Nick Coghlan wrote: > That would be PEP 4 :) What's the normal way to update a PEP? """"... 
proposals for deprecating modules MUST be made by providing a change to the text of this PEP, which SHOULD be a patch posted to SourceForge...""" Would that now be more accurately done as a Mercurial patch, or a tracker issue? ChrisA From brett at python.org Wed Aug 14 20:15:37 2013 From: brett at python.org (Brett Cannon) Date: Wed, 14 Aug 2013 14:15:37 -0400 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: <520AE9DF.6090406@pearwood.info> Message-ID: On Wed, Aug 14, 2013 at 1:50 PM, Chris Angelico wrote: > On Wed, Aug 14, 2013 at 6:42 PM, Nick Coghlan wrote: > > That would be PEP 4 :) > > What's the normal way to update a PEP? > > """"... proposals for deprecating modules MUST be made by providing a > change to the text of this PEP, which SHOULD be a patch posted to > SourceForge...""" > > Would that now be more accurately done as a Mercurial patch, or a tracker > issue? > Email here with proposed changes and/or email to PEP editors (skip python-dev if it's a simple typo fix). -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Wed Aug 14 20:37:07 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 14 Aug 2013 14:37:07 -0400 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: <520AE9DF.6090406@pearwood.info> Message-ID: On 8/14/2013 12:09 PM, Nick Coghlan wrote: > On 14 August 2013 11:55, Brett Cannon wrote: >> I view a deprecation as the same thing. If we leave the module in until >> Python 4 then I can live with that, but simply moving documentation around >> is not enough to communicate to those who didn't read the release notes to >> know modules they rely on are now essentially orphaned. > > No, a deprecation isn't enough, because it doesn't help authors and > educators to know "this is legacy, you can skip it". We need both. At least a couple of releases before deletion, we should put a 'legacy' package up on pypi. 
Then the deprecation message could say to use that as an alternative. -- Terry Jan Reedy From eliben at gmail.com Wed Aug 14 20:45:50 2013 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 14 Aug 2013 11:45:50 -0700 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: <520AE9DF.6090406@pearwood.info> Message-ID: On Wed, Aug 14, 2013 at 11:37 AM, Terry Reedy wrote: > On 8/14/2013 12:09 PM, Nick Coghlan wrote: > >> On 14 August 2013 11:55, Brett Cannon wrote: >> > > I view a deprecation as the same thing. If we leave the module in until >>> Python 4 then I can live with that, but simply moving documentation >>> around >>> is not enough to communicate to those who didn't read the release notes >>> to >>> know modules they rely on are now essentially orphaned. >>> >> >> No, a deprecation isn't enough, because it doesn't help authors and >> educators to know "this is legacy, you can skip it". We need both. >> > > At least a couple of releases before deletion, we should put a 'legacy' > package up on pypi. Then the deprecation message could say to use that as > an alternative. > To reiterate a point that was raised previously -- IMHO it would be a mistake to actually delete this (or other) modules before "Python 4". There's been enough breakage in Python 3 already. Some projects may only switch to Python 3.x when x is 4 or 5 or 9. Let's not make it even harder! I suggest we revisit this issue when the module in question becomes an actual maintenance burden. For the time being, if we feel bad this module isn't well documented/tested/understood, ISTM that moving it to "deprecated" status and to a "legacy/obsolete" section of the library documentation should help us handle those feelings of guilt. Eli -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From ethan at stoneleaf.us Thu Aug 15 00:07:58 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 14 Aug 2013 15:07:58 -0700
Subject: [Python-Dev] format, int, and IntEnum
Message-ID: <520BFFBE.3050501@stoneleaf.us>

From http://bugs.python.org/issue18738:

Ethan Furman commented:
> --> class Test(enum.IntEnum):
> ...     one = 1
> ...     two = 2
> ...
>
> --> '{}'.format(Test.one)
> 'Test.one'
>
> --> '{:d}'.format(Test.one)
> '1'
>
> --> '{:}'.format(Test.one)
> 'Test.one'
>
> --> '{:10}'.format(Test.one)
> '         1'

Eric V. Smith commented:
> The value of int is always used, except when the format string is empty.
> PEP 3101 explicitly requires this behavior. "For all built-in types, an
> empty format specification will produce the equivalent of str(value)."
> The "built-in type" here refers to int, since IntEnum is derived from int
> (at least I think it is: I haven't followed the metaclass and multiple
> inheritance completely).

Ethan Furman commented:
> So what you're saying is that '{:}' is empty, but '{:10}' is not?

Eric V. Smith commented:
> Yes, exactly. The part before the colon says which argument to .format()
> to use. The empty string there means "use the next one". The part after the
> colon is the format specifier. In the first example above, there's an empty
> string after the colon, and in the second example there's a "10" after the
> colon.
>
> Which is why it's really easier to use:
>     format(obj, '')
> and
>     format(obj, '10')
> instead of .format examples. By using the built-in format, you only need
> to write the format specifier, not the ''.format() "which argument am I
> processing" stuff with the braces and colons.

Eli Bendersky commented:
> Eric, I'd have to disagree with this part. Placing strictly formal
> interpretation of "empty" aside, it seems to me unacceptable that
> field-width affects the interpretation of the string. This appears more
> like a bug in the .format implementation than the original intention.
> I suspect that at this point it may be useful to take this discussion to
> pydev to get more opinions.

From storchaka at gmail.com Thu Aug 15 01:01:15 2013
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 15 Aug 2013 02:01:15 +0300
Subject: [Python-Dev] format and int subclasses (Was: format, int, and IntEnum)
In-Reply-To: <520BFFBE.3050501@stoneleaf.us>
References: <520BFFBE.3050501@stoneleaf.us>
Message-ID:

15.08.13 01:07, Ethan Furman wrote:
> From http://bugs.python.org/issue18738:

Actually the problem is not only in IntEnum, but in any int subclass.

Currently for an empty format specifier int.__format__(x, '') returns str(x). But __str__ can be overloaded in a subclass. I think that, to be less surprising, we can extend this behavior to format specifiers with a width, alignment and fill character but without the type char. I.e. int.__format__(x, '_<10') should return the same as format(str(x), '_<10').

The question remains what to do with the sign option. And with the '=' alignment.

From steve at pearwood.info Thu Aug 15 03:25:55 2013
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 15 Aug 2013 11:25:55 +1000
Subject: [Python-Dev] PEP 450 adding statistics module
Message-ID: <520C2E23.40405@pearwood.info>

Hi all,

I have raised a tracker item and PEP for adding a statistics module to the standard library:

http://bugs.python.org/issue18606
http://www.python.org/dev/peps/pep-0450/

There has been considerable discussion on python-ideas, which is now reflected by the PEP. I've signed the Contributor Agreement, and submitted a patch containing updated code and tests. The tests aren't yet integrated with the test runner but are runnable manually.

Can I request that people please look at this issue, with an aim to ruling on the PEP and (hopefully) adding the module to 3.4 before feature freeze? If it is accepted, I am willing to be primary maintainer for this module in the future.
Thanks, -- Steven From steve at pearwood.info Thu Aug 15 03:28:52 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 15 Aug 2013 11:28:52 +1000 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: <520AE9DF.6090406@pearwood.info> Message-ID: <520C2ED4.5010900@pearwood.info> On 15/08/13 01:08, Brett Cannon wrote: > On Tue, Aug 13, 2013 at 10:22 PM, Steven D'Aprano wrote: > >> On 13/08/13 23:36, Brett Cannon wrote: >> >>> On Tue, Aug 13, 2013 at 6:34 AM, Serhiy Storchaka >>> wrote: >>> >>> 12.08.13 22:22, Brett Cannon wrote: >>>> >>>> I have created http://bugs.python.org/issue18716 to deprecate the >>>> formatter module for removal in Python 3.6 unless someone convinces me >>>>> otherwise that deprecation and removal is the wrong move. [...] > There is a balance to keeping the > load of work for core devs at a level that is tenable to the level of > quality we expect from ourselves which means making sure we don't let cruft > build up in the stdlib and overwhelm us. These are all very good arguments, for both sides, and it is a balance between code churn and bit rot, but on balance I'm going to come down firmly in favour of Nick's earlier recommendation: PendingDeprecation (and/or a move to a "Legacy" section in the docs). In a couple more releases there may be concrete plans for an eventual Python 4, at which time we can discuss whether to delay deprecation until Python 4 or drop it sooner. And in the meantime, perhaps somebody will decide to give the module some love and attention. I'm not able to give a commitment to do so right now, but it is a module that interests me so maybe it will be me, he says optimistically.
-- Steven From tjreedy at udel.edu Thu Aug 15 03:49:42 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 14 Aug 2013 21:49:42 -0400 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <520C2E23.40405@pearwood.info> References: <520C2E23.40405@pearwood.info> Message-ID: On 8/14/2013 9:25 PM, Steven D'Aprano wrote: > The tests aren't > yet integrated with the test runner but are runnable manually. What do you mean? With the changes I gave you, they run fine as part of the test suite. -- Terry Jan Reedy From eliben at gmail.com Thu Aug 15 05:23:19 2013 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 14 Aug 2013 20:23:19 -0700 Subject: [Python-Dev] format and int subclasses (Was: format, int, and IntEnum) In-Reply-To: References: <520BFFBE.3050501@stoneleaf.us> Message-ID: On Wed, Aug 14, 2013 at 4:01 PM, Serhiy Storchaka wrote: > 15.08.13 01:07, Ethan Furman wrote: > >> From http://bugs.python.org/issue18738: >> > > Actually the problem is not only in IntEnum, but in any int subclass. > > Currently for empty format specifier int.__format__(x, '') returns str(x). > But __str__ can be overloaded in a subclass. I think that to be less > surprising we can extend this behavior to format specifiers with the width, > the alignment and the fill character but without the type char. I.e. > int.__format__(x, '_<10') should return the same as format(str(x), '_<10'). > > The question remains what to do with the sign option. And with the '=' > alignment. > Yes, the problem here is certainly not IntEnum-specific; it's just that IntEnum is the first "for real" use case of subclassing 'int' in the stdlib. Consider this toy example:

    class IntSubclass(int):
        def __str__(self):
            return 'foo'

    s = IntSubclass(42)
    print('{:}'.format(s))
    print('{:10}'.format(s))

This prints:

    foo
            42

Which is, IMHO, madness.
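Eli's toy example runs exactly as shown. One remedy in the direction Serhiy suggests is for a subclass to take over __format__ itself, routing any spec without an explicit integer type code through str(). The sketch below is purely illustrative — the class names and the naive last-character test are ours, not a proposed stdlib change:

```python
class IntSubclass(int):
    def __str__(self):
        return 'foo'


class StrFormatting(int):
    """Illustrative only: route type-less format specs via str()."""

    def __str__(self):
        return 'foo'

    def __format__(self, spec):
        # Naive check: no trailing integer presentation type -> str() path.
        if not spec or spec[-1] not in 'bcdnoxX':
            return format(str(self), spec)
        return int.__format__(self, spec)


print('{:10}'.format(IntSubclass(42)))    # int behavior: right-aligned 42
print('{:10}'.format(StrFormatting(42)))  # str behavior: left-aligned foo
print('{:d}'.format(StrFormatting(42)))   # explicit 'd' keeps int behavior
```

With this override, Serhiy's '_<10' example also behaves consistently: the fill and alignment apply to the str() form.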
In the issue, Eric pointed out that PEP 3101 says "For all built-in types, an empty format specification will produce the equivalent of str(value)", and that {:10} is not an "empty format specification", but this looks much more like a simple bug to me. Following the "format terminology", I consider the format specification empty when the representation type (i.e. 'd', 'x' for ints, or 's' for string) is empty. Things like field width are completely orthogonal to the interpretation of the value (in a similar vein to traditional "%10d" formatting). If the lack of the representation type is interpreted as 'd' (which seems to be the case), it should always be interpreted as 'd', even when width is given. A different question is *whether* it should be interptered as 'd'. Arguably, in the presence of subclasses of 'int', this may not be desirable. But this behavior seems to be officially documented so we may not have much to do about it. (*) Well, except making it logically consistent as specified above. Eli (*) So to get the str() for sure, user code will have to explicitly force the 's' representation type - {:s}, etc. It's not a big burden, and not really different from "%s" % obj. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Aug 15 06:27:36 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 15 Aug 2013 00:27:36 -0400 Subject: [Python-Dev] format, int, and IntEnum In-Reply-To: <520BFFBE.3050501@stoneleaf.us> References: <520BFFBE.3050501@stoneleaf.us> Message-ID: I think Eric is overinterpreting the spec, there. While that particular sentence requires that the empty format string will be equivalent to a plain str() operation for builtin types, it is only a recommendation for other types. For enums, I believe they should be formatted like their base types (so !s and !r will show the enum name, anything without coercion will show the value) . Cheers, Nick. 
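The distinction Nick draws is mechanical: the !s and !r conversion flags replace the object with str(obj) or repr(obj) before formatting, so the object's own __format__ hook is bypassed entirely; a field without a conversion goes straight to __format__. A self-contained demonstration (the Probe class and its return strings are illustrative inventions):

```python
class Probe:
    def __str__(self):
        return 'from-str'

    def __repr__(self):
        return 'from-repr'

    def __format__(self, spec):
        # Record which spec actually reached __format__.
        return 'from-format:' + spec


print('{!s}'.format(Probe()))   # conversion first -> 'from-str'
print('{!r}'.format(Probe()))   # conversion first -> 'from-repr'
print('{}'.format(Probe()))     # no conversion -> 'from-format:'
print('{:10}'.format(Probe()))  # spec passed through -> 'from-format:10'
```

Note that after !s or !r the format spec is applied to the resulting *string*, which is why coerced fields never see the value-based formatting at all.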
From v+python at g.nevcal.com Thu Aug 15 07:22:52 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Wed, 14 Aug 2013 22:22:52 -0700 Subject: [Python-Dev] format, int, and IntEnum In-Reply-To: References: <520BFFBE.3050501@stoneleaf.us> Message-ID: <520C65AC.3070107@g.nevcal.com> On 8/14/2013 9:27 PM, Nick Coghlan wrote: > > I think Eric is overinterpreting the spec, there. While that > particular sentence requires that the empty format string will be > equivalent to a plain str() operation for builtin types, it is only a > recommendation for other types. For enums, I believe they should be > formatted like their base types (so !s and !r will show the enum name, > anything without coercion will show the value). > > Cheers, > Nick. > I could agree with the above for IntEnum, but Enum doesn't have a "base type", it has a contained type, no? From storchaka at gmail.com Thu Aug 15 09:07:35 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 15 Aug 2013 10:07:35 +0300 Subject: [Python-Dev] format and int subclasses (Was: format, int, and IntEnum) In-Reply-To: References: <520BFFBE.3050501@stoneleaf.us> Message-ID: 15.08.13 06:23, Eli Bendersky wrote: > Yes, the problem here is certainly not IntEnum-specific; it's just > that IntEnum is the first "for real" use case of subclassing 'int' in > the stdlib. Even not the first.
>> '{}'.format(True) 'True' >>> '{:10}'.format(True) ' 1' From solipsis at pitrou.net Thu Aug 15 11:08:45 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 15 Aug 2013 11:08:45 +0200 Subject: [Python-Dev] Deprecating the formatter module References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> Message-ID: <20130815110845.78ccde98@fsol> On Thu, 15 Aug 2013 11:28:52 +1000 Steven D'Aprano wrote: > > These are all very good arguments, for both sides, and it is a balance between code churn and bit rot, but on balance I'm going to come down firmly in favour of Nick's earlier recommendation: PendingDeprecation (and/or a move to a "Legacy" section in the docs). In a couple more releases there may be concrete plans for an eventual Python 4, at which time we can discuss whether to delay deprecation until Python 4 or drop it sooner. We don't have any substantial change in store for an eventual "Python 4", so it's quite a remote hypothesis right now. Regards Antoine. From victor.stinner at gmail.com Thu Aug 15 11:16:20 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 15 Aug 2013 11:16:20 +0200 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: <20130815110845.78ccde98@fsol> References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> Message-ID: 2013/8/15 Antoine Pitrou : > We don't have any substantial change in store for an eventual "Python > 4", so it's quite a remote hypothesis right now. I prefered the transition between Linux 2 and Linux 3 (no major change, just a "normal" release except the version), rather than the transition between KDE 3 and KDE 4 (in short, everything was broken, the desktop was not usable). I prefer to not start a list of things that we will make the transition from Python 3 to Python 4 harder. Can't we do small changes between each Python release, even between major versions? 
Victor From solipsis at pitrou.net Thu Aug 15 11:22:14 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 15 Aug 2013 11:22:14 +0200 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> Message-ID: <20130815112214.0662e057@fsol> On Thu, 15 Aug 2013 11:16:20 +0200 Victor Stinner wrote: > 2013/8/15 Antoine Pitrou : > > We don't have any substantial change in store for an eventual "Python > > 4", so it's quite a remote hypothesis right now. > > I prefered the transition between Linux 2 and Linux 3 (no major > change, just a "normal" release except the version), rather than the > transition between KDE 3 and KDE 4 (in short, everything was broken, > the desktop was not usable). > > I prefer to not start a list of things that we will make the > transition from Python 3 to Python 4 harder. Can't we do small changes > between each Python release, even between major versions? That's exactly what I'm saying. But some changes cannot be made without breakage, e.g. the unicode transition. Then it makes sense to bundle all breaking changes in a single version change. Regards Antoine. From eric at trueblade.com Thu Aug 15 12:03:56 2013 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 15 Aug 2013 06:03:56 -0400 Subject: [Python-Dev] format, int, and IntEnum In-Reply-To: References: <520BFFBE.3050501@stoneleaf.us> Message-ID: <520CA78C.1030902@trueblade.com> On 8/15/2013 12:27 AM, Nick Coghlan wrote: > I think Eric is overinterpreting the spec, there. While that particular > sentence requires that the empty format string will be equivalent to a > plain str() operation for builtin types, it is only a recommendation for > other types. For enums, I believe they should be formatted like their > base types (so !s and !r will show the enum name, anything without > coercion will show the value) . 
I don't think I'm over-interpreting the spec (but of course I'd say that!). The spec is very precise on the meaning of "format specifier": it means the entire string (the second argument to __format__). I'll grant that in the sentence in question it uses "format specification", not "format specifier", though. I think this interpretation also meshes with builtin-in "format": with no format_spec argument, it uses an zero-length string as the default specifically to get the str(obj) behavior. Using bool as an example because it's easier to type: >>> format(True) 'True' >>> format(True, '10') ' 1' I think it was Guido who specifically wanted this behavior, although of course now I can't find the email about it. The closest I could find is Talin (PEP 3101 author) requesting it: http://mail.python.org/pipermail/python-3000/2007-August/010121.html. http://docs.python.org/3/library/string.html#format-specification-mini-language says 'A general convention is that an empty format string ("") produces the same result as if you had called str() on the value. A non-empty format string typically modifies the result.' Again, the wording is not very tight, but it is talking about format specifiers here. I still think the best thing to do is implement __format__ for IntEnum, and there implement whatever behavior is decided. I don't think changing the meaning of existing objects (specifically int here) is a good course of action. Whether we can implement code that breaks existing format specifiers that would work for "int" but won't work for "str" is open to debate. See http://bugs.python.org/msg195225. Personally, I think it's okay, but if you think IntEnum needs to be an exact replacement for int, maybe not. -- Eric. 
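Eric's reading can be checked directly with the built-in format(), which he recommends precisely because it exposes the format spec as an explicit second argument (bool used here, as in his example):

```python
# Empty format spec: the str(value) fallback PEP 3101 requires
# for built-in types.
print(format(True))        # 'True'
print(format(True, ''))    # 'True'

# Any non-empty spec: formatted as an int, so bool shows its value.
print(format(True, '10'))  # '         1' (right-aligned in width 10)

# str.format: the text after the colon is the spec handed to __format__,
# so '{}' and '{:}' pass an empty spec while '{:10}' passes '10'.
print('{}'.format(True))    # 'True'
print('{:10}'.format(True)) # '         1'
```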
From dickinsm at gmail.com Thu Aug 15 13:42:41 2013 From: dickinsm at gmail.com (Mark Dickinson) Date: Thu, 15 Aug 2013 12:42:41 +0100 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <520C2E23.40405@pearwood.info> References: <520C2E23.40405@pearwood.info> Message-ID: The PEP and code look generally good to me. I think the API for median and its variants deserves some wider discussion: the reference implementation has a callable 'median', and variant callables 'median.low', 'median.high', 'median.grouped'. The pattern of attaching the variant callables as attributes on the main callable is unusual, and isn't something I've seen elsewhere in the standard library. I'd like to see some explanation in the PEP for why it's done this way. (There was already some discussion of this on the issue, but that was more centered around the implementation than the API.) I'd propose two alternatives for this: either have separate functions 'median', 'median_low', 'median_high', etc., or have a single function 'median' with a "method" argument that takes a string specifying computation using a particular method. I don't see a really good reason to deviate from standard patterns here, and fear that users would find the current API surprising. Mark On Thu, Aug 15, 2013 at 2:25 AM, Steven D'Aprano wrote: > Hi all, > > I have raised a tracker item and PEP for adding a statistics module to the > standard library: > > http://bugs.python.org/issue18606 > > http://www.python.org/dev/peps/pep-0450/ > > There has been considerable discussion on python-ideas, which is now > reflected by the PEP. I've signed the Contributor Agreement, and submitted > a patch containing updated code and tests. The tests aren't yet integrated > with the test runner but are runnable manually. > > Can I request that people please look at this issue, with an aim to ruling > on the PEP and (hopefully) adding the module to 3.4 before feature freeze?
> If it is accepted, I am willing to be primary maintainer for this module in > the future. > > > Thanks, > > > -- > Steven > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/dickinsm%40gmail.com From steve at pearwood.info Thu Aug 15 13:55:54 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 15 Aug 2013 21:55:54 +1000 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: References: <520C2E23.40405@pearwood.info> Message-ID: <520CC1CA.90608@pearwood.info> On 15/08/13 11:49, Terry Reedy wrote: > On 8/14/2013 9:25 PM, Steven D'Aprano wrote: > >> The tests aren't >> yet integrated with the test runner but are runnable manually. > > What do you mean? With the changes I gave you, they run fine as part of the test suite. I'm sorry Terry, at the time I posted I hadn't seen your patch, my apologies. -- Steven From dickinsm at gmail.com Thu Aug 15 14:18:05 2013 From: dickinsm at gmail.com (Mark Dickinson) Date: Thu, 15 Aug 2013 13:18:05 +0100 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <520C2E23.40405@pearwood.info> References: <520C2E23.40405@pearwood.info> Message-ID: On Thu, Aug 15, 2013 at 2:25 AM, Steven D'Aprano wrote: > Can I request that people please look at this issue, with an aim to ruling > on the PEP and (hopefully) adding the module to 3.4 before feature freeze? > If it is accepted, I am willing to be primary maintainer for this module in > the future. > Bah. I seem to have forgotten how to not top-post. Apologies. Please ignore the previous message, and I'll try again... The PEP and code look generally good to me.
I think the API for median and its variants deserves some wider discussion: the reference implementation has a callable 'median', and variant callables 'median.low', 'median.high', 'median.grouped'. The pattern of attaching the variant callables as attributes on the main callable is unusual, and isn't something I've seen elsewhere in the standard library. I'd like to see some explanation in the PEP for why it's done this way. (There was already some discussion of this on the issue, but that was more centered around the implementation than the API.) I'd propose two alternatives for this: either have separate functions 'median', 'median_low', 'median_high', etc., or have a single function 'median' with a "method" argument that takes a string specifying computation using a particular method. I don't see a really good reason to deviate from standard patterns here, and fear that users would find the current API surprising. Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Thu Aug 15 14:23:52 2013 From: brett at python.org (Brett Cannon) Date: Thu, 15 Aug 2013 08:23:52 -0400 Subject: [Python-Dev] Deprecating the formatter module In-Reply-To: <20130815112214.0662e057@fsol> References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> Message-ID: On Thu, Aug 15, 2013 at 5:22 AM, Antoine Pitrou wrote: > On Thu, 15 Aug 2013 11:16:20 +0200 > Victor Stinner wrote: > > 2013/8/15 Antoine Pitrou : > > > We don't have any substantial change in store for an eventual "Python > > > 4", so it's quite a remote hypothesis right now. > > > > I prefered the transition between Linux 2 and Linux 3 (no major > > change, just a "normal" release except the version), rather than the > > transition between KDE 3 and KDE 4 (in short, everything was broken, > > the desktop was not usable). 
> > > > I prefer to not start a list of things that we will make the > > transition from Python 3 to Python 4 harder. Can't we do small changes > > between each Python release, even between major versions? > > That's exactly what I'm saying. > But some changes cannot be made without breakage, e.g. the unicode > transition. Then it makes sense to bundle all breaking changes in a > single version change. > Getting a little ahead of ourselves with defining what exactly Python 4 will be, but I have been thinking that if we take a "deprecated modules sit in Python 3 bitrotting until Python 4 for backwards-compatibility reasons" then I'm fine with that as that gives a long period of adjustment to the module going away. I just don't want any deprecated module to sit there forever. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdmurray at bitdance.com Thu Aug 15 14:29:35 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Thu, 15 Aug 2013 08:29:35 -0400 Subject: [Python-Dev] When to remove deprecated stuff (was: Deprecating the formatter module) In-Reply-To: <20130815112214.0662e057@fsol> References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> Message-ID: <20130815122936.805FF250168@webabinitio.net> On Thu, 15 Aug 2013 11:22:14 +0200, Antoine Pitrou wrote: > On Thu, 15 Aug 2013 11:16:20 +0200 > Victor Stinner wrote: > > 2013/8/15 Antoine Pitrou : > > > We don't have any substantial change in store for an eventual "Python > > > 4", so it's quite a remote hypothesis right now. > > > > I prefered the transition between Linux 2 and Linux 3 (no major > > change, just a "normal" release except the version), rather than the > > transition between KDE 3 and KDE 4 (in short, everything was broken, > > the desktop was not usable). > > > > I prefer to not start a list of things that we will make the > > transition from Python 3 to Python 4 harder. 
Can't we do small changes > > between each Python release, even between major versions? > > That's exactly what I'm saying. > But some changes cannot be made without breakage, e.g. the unicode > transition. Then it makes sense to bundle all breaking changes in a > single version change. A number of us (I don't know how many) have clearly been thinking about "Python 4" as the time when we remove cruft. This will not cause any backward compatibility issues for anyone who has paid heed to the deprecation warnings, but will for those who haven't. The question then becomes, is it better to "bundle" these removals into the Python 4 release, or do them incrementally? If we are going to do them incrementally we should make that decision soonish, so that we don't end up having a whole bunch happen at once and defeat the (theoretical) purpose of doing them incrementally. (I say theoretical because what is the purpose? To spread out the breakage pain over multiple releases, so that every release breaks something?) --David From solipsis at pitrou.net Thu Aug 15 14:36:03 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 15 Aug 2013 14:36:03 +0200 Subject: [Python-Dev] When to remove deprecated stuff (was: Deprecating the formatter module) References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> Message-ID: <20130815143603.3fe29f54@fsol> On Thu, 15 Aug 2013 08:29:35 -0400 "R. David Murray" wrote: > On Thu, 15 Aug 2013 11:22:14 +0200, Antoine Pitrou wrote: > > On Thu, 15 Aug 2013 11:16:20 +0200 > > Victor Stinner wrote: > > > 2013/8/15 Antoine Pitrou : > > > > We don't have any substantial change in store for an eventual "Python > > > > 4", so it's quite a remote hypothesis right now. 
> > > > > > I prefered the transition between Linux 2 and Linux 3 (no major > > > change, just a "normal" release except the version), rather than the > > > transition between KDE 3 and KDE 4 (in short, everything was broken, > > > the desktop was not usable). > > > > > > I prefer to not start a list of things that we will make the > > > transition from Python 3 to Python 4 harder. Can't we do small changes > > > between each Python release, even between major versions? > > > > That's exactly what I'm saying. > > But some changes cannot be made without breakage, e.g. the unicode > > transition. Then it makes sense to bundle all breaking changes in a > > single version change. > > A number of us (I don't know how many) have clearly been thinking about > "Python 4" as the time when we remove cruft. This will not cause any > backward compatibility issues for anyone who has paid heed to the > deprecation warnings, but will for those who haven't. Which is why we shouldn't silence deprecation warnings. Regards Antoine. From brett at python.org Thu Aug 15 14:38:50 2013 From: brett at python.org (Brett Cannon) Date: Thu, 15 Aug 2013 08:38:50 -0400 Subject: [Python-Dev] When to remove deprecated stuff (was: Deprecating the formatter module) In-Reply-To: <20130815122936.805FF250168@webabinitio.net> References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> Message-ID: On Thu, Aug 15, 2013 at 8:29 AM, R. David Murray wrote: > On Thu, 15 Aug 2013 11:22:14 +0200, Antoine Pitrou > wrote: > > On Thu, 15 Aug 2013 11:16:20 +0200 > > Victor Stinner wrote: > > > 2013/8/15 Antoine Pitrou : > > > > We don't have any substantial change in store for an eventual "Python > > > > 4", so it's quite a remote hypothesis right now. 
> > > > > > I prefered the transition between Linux 2 and Linux 3 (no major > > > change, just a "normal" release except the version), rather than the > > > transition between KDE 3 and KDE 4 (in short, everything was broken, > > > the desktop was not usable). > > > > > > I prefer to not start a list of things that we will make the > > > transition from Python 3 to Python 4 harder. Can't we do small changes > > > between each Python release, even between major versions? > > > > That's exactly what I'm saying. > > But some changes cannot be made without breakage, e.g. the unicode > > transition. Then it makes sense to bundle all breaking changes in a > > single version change. > > A number of us (I don't know how many) have clearly been thinking about > "Python 4" as the time when we remove cruft. This will not cause any > backward compatibility issues for anyone who has paid heed to the > deprecation warnings, but will for those who haven't. The question > then becomes, is it better to "bundle" these removals into the > Python 4 release, or do them incrementally? > > If we are going to do them incrementally we should make that decision > soonish, so that we don't end up having a whole bunch happen at once > and defeat the (theoretical) purpose of doing them incrementally. > > (I say theoretical because what is the purpose? To spread out the > breakage pain over multiple releases, so that every release breaks > something?) > Incremental has one benefit, and that's we get to completely stop caring sooner. =) By completely removing the module sooner lessens the chance of errant bug reports, etc. Doing the removal en-masse at massive release boundaries is that compatibility for the stdlib can be stated as "Python 3" instead of "Python 3.3 and older" for those that continue to rely on the older modules. I also don't know if we would want to extend this to intra-module objects as well (e.g. classes). 
In which case we would need to clearly state that anything deprecated is considered dead and under active bitrot and so do not expect it to necessarily work with the rest of the module (e.g. new APIs to work with the deprecated ones). -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Thu Aug 15 14:40:45 2013 From: brett at python.org (Brett Cannon) Date: Thu, 15 Aug 2013 08:40:45 -0400 Subject: [Python-Dev] When to remove deprecated stuff (was: Deprecating the formatter module) In-Reply-To: <20130815143603.3fe29f54@fsol> References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> <20130815143603.3fe29f54@fsol> Message-ID: On Thu, Aug 15, 2013 at 8:36 AM, Antoine Pitrou wrote: > On Thu, 15 Aug 2013 08:29:35 -0400 > "R. David Murray" wrote: > > > On Thu, 15 Aug 2013 11:22:14 +0200, Antoine Pitrou > wrote: > > > On Thu, 15 Aug 2013 11:16:20 +0200 > > > Victor Stinner wrote: > > > > 2013/8/15 Antoine Pitrou : > > > > > We don't have any substantial change in store for an eventual > "Python > > > > > 4", so it's quite a remote hypothesis right now. > > > > > > > > I prefered the transition between Linux 2 and Linux 3 (no major > > > > change, just a "normal" release except the version), rather than the > > > > transition between KDE 3 and KDE 4 (in short, everything was broken, > > > > the desktop was not usable). > > > > > > > > I prefer to not start a list of things that we will make the > > > > transition from Python 3 to Python 4 harder. Can't we do small > changes > > > > between each Python release, even between major versions? > > > > > > That's exactly what I'm saying. > > > But some changes cannot be made without breakage, e.g. the unicode > > > transition. Then it makes sense to bundle all breaking changes in a > > > single version change. 
> > > > A number of us (I don't know how many) have clearly been thinking about > > "Python 4" as the time when we remove cruft. This will not cause any > > backward compatibility issues for anyone who has paid heed to the > > deprecation warnings, but will for those who haven't. > > Which is why we shouldn't silence deprecation warnings. > What we should probably do is have unittest turn deprecations on by default when running your tests but leave them silent otherwise. I still think keeping them silent for the benefit of end-users is a good thing as long as we make it easier for developers to switch on warnings without thinking about it. -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Thu Aug 15 15:08:22 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 15 Aug 2013 23:08:22 +1000 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: References: <520C2E23.40405@pearwood.info> Message-ID: <520CD2C6.8060103@pearwood.info> On 15/08/13 21:42, Mark Dickinson wrote: > The PEP and code look generally good to me. > > I think the API for median and its variants deserves some wider discussion: > the reference implementation has a callable 'median', and variant callables > 'median.low', 'median.high', 'median.grouped'. The pattern of attaching > the variant callables as attributes on the main callable is unusual, and > isn't something I've seen elsewhere in the standard library. I'd like to > see some explanation in the PEP for why it's done this way. (There was > already some discussion of this on the issue, but that was more centered > around the implementation than the API.) > > I'd propose two alternatives for this: either have separate functions > 'median', 'median_low', 'median_high', etc., or have a single function > 'median' with a "method" argument that takes a string specifying > computation using a particular method. 
> I don't see a really good reason to > deviate from standard patterns here, and fear that users would find the > current API surprising. Alexander Belopolsky has convinced me (off-list) that my current implementation is better changed to a more conservative one of a callable singleton instance with methods implementing the alternative computations. I'll have something like:

    def _singleton(cls):
        return cls()

    @_singleton
    class median:
        def __call__(self, data):
            ...
        def low(self, data):
            ...
        ...

In my earlier stats module, I had a single median function that took an argument to choose between alternatives. I called it "scheme": median(data, scheme="low") R uses a parameter called "type" to choose between alternate calculations, not for median as we are discussing, but for quantiles: quantile(x, probs ... type = 7, ...). SAS also uses a similar system, but with different numeric codes. I rejected both "type" and "method" as the parameter name since it would cause confusion with the usual meanings of those words. I eventually decided against this system for two reasons: - Each scheme ended up needing to be a separate function, for ease of both implementation and testing. So I had four private median functions, which I put inside a class to act as namespace and avoid polluting the main namespace. Then I needed a "master function" to select which of the methods should be called, with all the additional testing and documentation that entailed. - The API doesn't really feel very Pythonic to me. For example, we write: mystring.rjust(width) and dict.items() rather than mystring.justify(width, "right") or dict.iterate("items"). So I think individual methods is a better API, and one which is more familiar to most Python users. The only innovation (if that's what it is) is to have median a callable object. As far as having four separate functions, median, median_low, etc., it just doesn't feel right to me.
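The callable-singleton sketch above can be fleshed out just enough to run. The median computations below are the textbook definitions, filled in here for illustration only; they are not the PEP 450 reference implementation:

```python
def _singleton(cls):
    # Replace the class with its sole instance.
    return cls()


@_singleton
class median:
    """A callable that carries its variants as methods:
    median(data), median.low(data), median.high(data)."""

    def __call__(self, data):
        data = sorted(data)
        n = len(data)
        if n % 2:
            return data[n // 2]
        return (data[n // 2 - 1] + data[n // 2]) / 2

    def low(self, data):
        # Even-sized data: the smaller of the two middle values.
        data = sorted(data)
        return data[(len(data) - 1) // 2]

    def high(self, data):
        # Even-sized data: the larger of the two middle values.
        data = sorted(data)
        return data[len(data) // 2]


print(median([1, 3, 5, 7]))       # 4.0
print(median.low([1, 3, 5, 7]))   # 3
print(median.high([1, 3, 5, 7]))  # 5
```

The variants read as methods on the one public name, which is the namespace argument Steven makes against median_low-style functions.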
It puts four slight variations of the same function into the main namespace, instead of keeping them together in a namespace. Names like median_low merely simulates a namespace with pseudo-methods separated with underscores instead of dots, only without the advantages of a real namespace. (I treat variance and std dev differently, and make the sample and population forms separate top-level functions rather than methods, simply because they are so well-known from scientific calculators that it is unthinkable to me to do differently. Whenever I use numpy, I am surprised all over again that it has only a single variance function.) -- Steven From ezio.melotti at gmail.com Thu Aug 15 15:15:50 2013 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Thu, 15 Aug 2013 16:15:50 +0300 Subject: [Python-Dev] When to remove deprecated stuff (was: Deprecating the formatter module) In-Reply-To: <20130815122936.805FF250168@webabinitio.net> References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> Message-ID: Hi, On Thu, Aug 15, 2013 at 3:29 PM, R. David Murray wrote: > On Thu, 15 Aug 2013 11:22:14 +0200, Antoine Pitrou wrote: >> On Thu, 15 Aug 2013 11:16:20 +0200 >> Victor Stinner wrote: >> > 2013/8/15 Antoine Pitrou : >> > > We don't have any substantial change in store for an eventual "Python >> > > 4", so it's quite a remote hypothesis right now. >> > >> > I prefered the transition between Linux 2 and Linux 3 (no major >> > change, just a "normal" release except the version), rather than the >> > transition between KDE 3 and KDE 4 (in short, everything was broken, >> > the desktop was not usable). >> > >> > I prefer to not start a list of things that we will make the >> > transition from Python 3 to Python 4 harder. Can't we do small changes >> > between each Python release, even between major versions? >> >> That's exactly what I'm saying. 
>> But some changes cannot be made without breakage, e.g. the unicode >> transition. Then it makes sense to bundle all breaking changes in a >> single version change. > > A number of us (I don't know how many) have clearly been thinking about > "Python 4" as the time when we remove cruft. This will not cause any > backward compatibility issues for anyone who has paid heed to the > deprecation warnings, but will for those who haven't. The question > then becomes, is it better to "bundle" these removals into the > Python 4 release, or do them incrementally? > A while ago I wrote an email to python-dev about our deprecation policy: http://mail.python.org/pipermail/python-dev/2011-October/114199.html My idea was to turn this into an informational PEP but I didn't receive much feedback. If people are interested I could still do it. Best Regards, Ezio Melotti > If we are going to do them incrementally we should make that decision > soonish, so that we don't end up having a whole bunch happen at once > and defeat the (theoretical) purpose of doing them incrementally. > > (I say theoretical because what is the purpose? To spread out the > breakage pain over multiple releases, so that every release breaks > something?) > > --David From eliben at gmail.com Thu Aug 15 16:59:21 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 15 Aug 2013 07:59:21 -0700 Subject: [Python-Dev] format, int, and IntEnum In-Reply-To: <520CA78C.1030902@trueblade.com> References: <520BFFBE.3050501@stoneleaf.us> <520CA78C.1030902@trueblade.com> Message-ID: On Thu, Aug 15, 2013 at 3:03 AM, Eric V. Smith wrote: > On 8/15/2013 12:27 AM, Nick Coghlan wrote: > > I think Eric is overinterpreting the spec, there. While that particular > > sentence requires that the empty format string will be equivalent to a > > plain str() operation for builtin types, it is only a recommendation for > > other types. 
For enums, I believe they should be formatted like their > > base types (so !s and !r will show the enum name, anything without > > coercion will show the value). > > I don't think I'm over-interpreting the spec (but of course I'd say > that!). The spec is very precise on the meaning of "format specifier": > it means the entire string (the second argument to __format__). I'll > grant that in the sentence in question it uses "format specification", > not "format specifier", though. > > I think this interpretation also meshes with the built-in "format": with > no format_spec argument, it uses a zero-length string as the default > specifically to get the str(obj) behavior. > > Using bool as an example because it's easier to type: > > >>> format(True) > 'True' > >>> format(True, '10') > '         1' > > Eric, whichever way you interpret the spec, the above violates the least-surprise principle; do you agree? It's easily one of those things that makes the "WTF, Python?" lists. Do you disagree? Unfortunately, I don't think there's a lot we can do about it now. It's a design mistake, locked with backwards compatibility until "Python 4". For IntEnum, being in control of __format__ and being a new class, I suppose we can create any behavior we want here. Can we do more? Is it even conceivable to rig the boolean sub-type to change this behavior to be more rational? I suspect not, but one can hope ;-) And in any case, the documentation has to be tightened a bit formally to express what we mean exactly, how it translates to the behavior of builtin types, and what is allowed for custom types. Eli -------------- next part -------------- An HTML attachment was scrubbed...
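The surprise Eli describes is easy to reproduce; a short sketch of the observable behaviour (these are the results CPython actually produces):

```python
# With an empty spec, format() falls back on str(), so we get the name.
assert format(True) == 'True'

# With any non-empty spec, True is formatted as the integer 1,
# right-justified in a field of width 10.
assert format(True, '10') == '         1'

# Pre-converting with str() gives string (left-justified) formatting.
assert format(str(True), '10') == 'True      '
```

The first two lines together are the "WTF" in question: adding a bare width flips the output from the name to the numeric value.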
URL: From ethan at stoneleaf.us Thu Aug 15 16:37:52 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 15 Aug 2013 07:37:52 -0700 Subject: [Python-Dev] When to remove deprecated stuff In-Reply-To: References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> <20130815143603.3fe29f54@fsol> Message-ID: <520CE7C0.8050705@stoneleaf.us> On 08/15/2013 05:40 AM, Brett Cannon wrote: > > What we should probably do is have unittest turn deprecations on by default when running your tests but leave them > silent otherwise. I still think keeping them silent for the benefit of end-users is a good thing as long as we make it > easier for developers to switch on warnings without thinking about it. +1 From ncoghlan at gmail.com Thu Aug 15 17:06:41 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 15 Aug 2013 10:06:41 -0500 Subject: [Python-Dev] format, int, and IntEnum In-Reply-To: <520CA78C.1030902@trueblade.com> References: <520BFFBE.3050501@stoneleaf.us> <520CA78C.1030902@trueblade.com> Message-ID: On 15 August 2013 05:03, Eric V. Smith wrote: > On 8/15/2013 12:27 AM, Nick Coghlan wrote: >> I think Eric is overinterpreting the spec, there. While that particular >> sentence requires that the empty format string will be equivalent to a >> plain str() operation for builtin types, it is only a recommendation for >> other types. For enums, I believe they should be formatted like their >> base types (so !s and !r will show the enum name, anything without >> coercion will show the value) . > > I don't think I'm over-interpreting the spec (but of course I'd say > that!). The spec is very precise on the meaning of "format specifier": > it means the entire string (the second argument to __format__). I'll > grant that in the sentence in question it uses "format specification", > not "format specifier", though. 
By overinterpreting, I meant interpreting that part to mean literally calling str() when the specifier was empty, but not calling it otherwise. That's not what it means: it means that "str(x)" and "format(x)" will typically produce the *same result*, not that format will actually call str. If a subclass overrides __str__ but not __format__ (or vice-versa), then they can and *should* diverge. So, ideally, we would be more consistent, and either *always* call str() for subclasses when no type specifier is given, or *never* call it and always use the value. In this case, for integers, a missing type specifier for numeric types is defined as meaning "d", so calling str() on subclasses is wrong - it should be using the value, not the string representation. So "no type specifier" + "int subclass" *should* have meant calling int(self) prior to formatting rather than str(self), and subtypes like bool would have needed to override both __str__ and __format__. The problem now is that changing this behaviour in the base class would break subclasses like bool that expect format(x) to produce the same result as str(x) even though they have overridden __str__ but not __format__. So, I think the path to consistency here needs to be that, if a builtin subclass overrides __str__ without overriding __format__, then the formatting result will *always* be "format(str(x), fmtspec)", regardless of whether fmtspec is empty or not. This will make bool formatting work as expected, even when using things like the field width specifiers.
There's a slight backwards compatibility risk even with that more conservative change, though - attempting to use numeric formatting codes with affected subtypes will now throw an exception, and for codes common to numbers and strings the output will change :(

>>> format(True, "10")
'         1'
>>> format(str(True), "10")
'True      '
>>> format(True, "10,")
'         1'
>>> format(str(True), "10,")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Cannot specify ',' with 's'.

So we may just have to live with the wart and tell people to always override both to ensure consistent behaviour :( > I still think the best thing to do is implement __format__ for IntEnum, > and there implement whatever behavior is decided. I don't think changing > the meaning of existing objects (specifically int here) is a good course > of action. I don't think changing the processing of int is proposed - just the handling of subclasses. However, since it seems to me that "make bool formatting work consistently" is a more conservative change (if we change the base class behaviour at all), then Enum subclasses will need __format__ defined regardless. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From eric at trueblade.com Thu Aug 15 17:15:58 2013 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 15 Aug 2013 11:15:58 -0400 Subject: [Python-Dev] format, int, and IntEnum In-Reply-To: References: <520BFFBE.3050501@stoneleaf.us> <520CA78C.1030902@trueblade.com> Message-ID: On Aug 15, 2013, at 10:59 AM, Eli Bendersky wrote: > > > > On Thu, Aug 15, 2013 at 3:03 AM, Eric V. Smith wrote: >> On 8/15/2013 12:27 AM, Nick Coghlan wrote: >> > I think Eric is overinterpreting the spec, there. While that particular >> > sentence requires that the empty format string will be equivalent to a >> > plain str() operation for builtin types, it is only a recommendation for >> > other types.
For enums, I believe they should be formatted like their >> > base types (so !s and !r will show the enum name, anything without >> > coercion will show the value) . >> >> I don't think I'm over-interpreting the spec (but of course I'd say >> that!). The spec is very precise on the meaning of "format specifier": >> it means the entire string (the second argument to __format__). I'll >> grant that in the sentence in question it uses "format specification", >> not "format specifier", though. >> >> I think this interpretation also meshes with builtin-in "format": with >> no format_spec argument, it uses an zero-length string as the default >> specifically to get the str(obj) behavior. >> >> Using bool as an example because it's easier to type: >> >> >>> format(True) >> 'True' >> >>> format(True, '10') >> ' 1' > > Eric, which-ever way you interpret the spec, the above violates the least-surprise principle; do you agree? It's easily one of those things that makes the "WTF, Python?" lists. Do you disagree? Oh, I completely agree that it doesn't make much sense, and is surprising. I was just trying to explain why we see the current behavior. > Unfortunately, I don't think there's a lot we can do about it now. It's a design mistake, locked with backwards compatibility until "Python 4". Agreed. > For IntEnum, being in control of __format__ and being a new class, I suppose we can create any behavior we want here. Right. That's the intent. I'd personally be okay with checking for int format codes (bdxX, etc.) and treating it as an int, otherwise a string. As I've pointed out, it's slightly fragile, and there are a few cases where a valid int format spec will give an error when treating it as a string, but that doesn't bother me. > Can we do more? Is it even conceivable to rig the boolean sub-type to change this behavior to be more rational? I suspect that no, but one can hope ;-) I don't think there's much we can do, unfortunately. 
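A minimal sketch of the heuristic Eric describes (sniffing the presentation type for an int code); the Color enum and the exact rule below are my own illustration of the idea, not the behaviour that was ultimately adopted:

```python
from enum import IntEnum

class Color(IntEnum):
    RED = 1
    GREEN = 2

    def __format__(self, spec):
        # Hypothetical heuristic from the thread: if the presentation
        # type is an integer code, format the numeric value; otherwise
        # format the member name as a string.
        if spec and spec[-1] in 'bcdoxXn':
            return format(int(self), spec)
        return format(self.name, spec)

assert format(Color.RED, '04d') == '0001'    # int codes use the value
assert format(Color.GREEN, 'x') == '2'
assert format(Color.GREEN, '>8') == '   GREEN'  # otherwise, the name
assert format(Color.RED, '') == 'RED'
```

The fragility Eric mentions is visible here: a spec whose *last* character merely happens to be one of the int codes would be misclassified, which is why a real implementation would need a proper spec parser.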
I think bool should work the same as the proposed IntEnum changes, but that's an incompatible change. > And in any case, the documentation has to be tightened a bit formally to express what we mean exactly, how it translates to the behavior of builtin types, and what is allowed for custom types. I'm okay with that. Eric. -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Thu Aug 15 17:22:33 2013 From: brett at python.org (Brett Cannon) Date: Thu, 15 Aug 2013 11:22:33 -0400 Subject: [Python-Dev] When to remove deprecated stuff (was: Deprecating the formatter module) In-Reply-To: References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> Message-ID: On Thu, Aug 15, 2013 at 9:15 AM, Ezio Melotti wrote: > Hi, > > On Thu, Aug 15, 2013 at 3:29 PM, R. David Murray > wrote: > > On Thu, 15 Aug 2013 11:22:14 +0200, Antoine Pitrou > wrote: > >> On Thu, 15 Aug 2013 11:16:20 +0200 > >> Victor Stinner wrote: > >> > 2013/8/15 Antoine Pitrou : > >> > > We don't have any substantial change in store for an eventual > "Python > >> > > 4", so it's quite a remote hypothesis right now. > >> > > >> > I prefered the transition between Linux 2 and Linux 3 (no major > >> > change, just a "normal" release except the version), rather than the > >> > transition between KDE 3 and KDE 4 (in short, everything was broken, > >> > the desktop was not usable). > >> > > >> > I prefer to not start a list of things that we will make the > >> > transition from Python 3 to Python 4 harder. Can't we do small changes > >> > between each Python release, even between major versions? > >> > >> That's exactly what I'm saying. > >> But some changes cannot be made without breakage, e.g. the unicode > >> transition. Then it makes sense to bundle all breaking changes in a > >> single version change. 
> > > > A number of us (I don't know how many) have clearly been thinking about > > "Python 4" as the time when we remove cruft. This will not cause any > > backward compatibility issues for anyone who has paid heed to the > > deprecation warnings, but will for those who haven't. The question > > then becomes, is it better to "bundle" these removals into the > > Python 4 release, or do them incrementally? > > > > A while ago I wrote an email to python-dev about our deprecation policy: > http://mail.python.org/pipermail/python-dev/2011-October/114199.html > > My idea was to turn this into an informational PEP but I didn't > receive much feedback. > If people are interested I could still do it. > Wouldn't hurt, but should probably be a part of PEP 4. -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Thu Aug 15 17:36:56 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 15 Aug 2013 08:36:56 -0700 Subject: [Python-Dev] format, int, and IntEnum In-Reply-To: References: <520BFFBE.3050501@stoneleaf.us> <520CA78C.1030902@trueblade.com> Message-ID: On Thu, Aug 15, 2013 at 8:15 AM, Eric V. Smith wrote: > On Aug 15, 2013, at 10:59 AM, Eli Bendersky wrote: > > > > > On Thu, Aug 15, 2013 at 3:03 AM, Eric V. Smith wrote: > >> On 8/15/2013 12:27 AM, Nick Coghlan wrote: >> > I think Eric is overinterpreting the spec, there. While that particular >> > sentence requires that the empty format string will be equivalent to a >> > plain str() operation for builtin types, it is only a recommendation for >> > other types. For enums, I believe they should be formatted like their >> > base types (so !s and !r will show the enum name, anything without >> > coercion will show the value) . >> >> I don't think I'm over-interpreting the spec (but of course I'd say >> that!). The spec is very precise on the meaning of "format specifier": >> it means the entire string (the second argument to __format__). 
I'll >> grant that in the sentence in question it uses "format specification", >> not "format specifier", though. >> >> I think this interpretation also meshes with builtin-in "format": with >> no format_spec argument, it uses an zero-length string as the default >> specifically to get the str(obj) behavior. >> >> Using bool as an example because it's easier to type: >> >> >>> format(True) >> 'True' >> >>> format(True, '10') >> ' 1' >> >> > Eric, which-ever way you interpret the spec, the above violates the > least-surprise principle; do you agree? It's easily one of those things > that makes the "WTF, Python?" lists. Do you disagree? > > > Oh, I completely agree that it doesn't make much sense, and is surprising. > I was just trying to explain why we see the current behavior. > > Unfortunately, I don't think there's a lot we can do about it now. It's a > design mistake, locked with backwards compatibility until "Python 4". > > > Agreed. > > For IntEnum, being in control of __format__ and being a new class, I > suppose we can create any behavior we want here. > > > Right. That's the intent. I'd personally be okay with checking for int > format codes (bdxX, etc.) and treating it as an int, otherwise a string. As > I've pointed out, it's slightly fragile, and there are a few cases where a > valid int format spec will give an error when treating it as a string, but > that doesn't bother me. > This got me thinking when we were discussing it in the issue. It's plausible that every subclass of builtin types will need to implement __format__ to act sanely. So maybe we can propose some sort of API (on the Python level) that makes parsing the format string easy and will not make code go stale? What do you think? > Can we do more? Is it even conceivable to rig the boolean sub-type to > change this behavior to be more rational? I suspect that no, but one can > hope ;-) > > > I don't think there's much we can do, unfortunately. 
I think bool should > work the same as the proposed IntEnum changes, but that's an incompatible > change. > :-( -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Thu Aug 15 17:21:46 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 15 Aug 2013 08:21:46 -0700 Subject: [Python-Dev] format, int, and IntEnum In-Reply-To: References: <520BFFBE.3050501@stoneleaf.us> <520CA78C.1030902@trueblade.com> Message-ID: <520CF20A.8030308@stoneleaf.us> Given that the !r and !s format codes can be used to get the repr and str of an IntEnum, would it be acceptable to have IntEnum's __format__ simply pass through to int's __format__? And likewise with all mix-in classes? -- ~Ethan~ From eric at trueblade.com Thu Aug 15 17:43:37 2013 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 15 Aug 2013 11:43:37 -0400 Subject: [Python-Dev] format, int, and IntEnum In-Reply-To: References: <520BFFBE.3050501@stoneleaf.us> <520CA78C.1030902@trueblade.com> Message-ID: <490024AF-C92B-446D-A720-46D28712E050@trueblade.com> On Aug 15, 2013, at 11:36 AM, Eli Bendersky wrote: > > > > On Thu, Aug 15, 2013 at 8:15 AM, Eric V. Smith wrote: >> On Aug 15, 2013, at 10:59 AM, Eli Bendersky wrote: >> >>> >>> >>> >>> On Thu, Aug 15, 2013 at 3:03 AM, Eric V. Smith wrote: >>>> On 8/15/2013 12:27 AM, Nick Coghlan wrote: >>>> > I think Eric is overinterpreting the spec, there. While that particular >>>> > sentence requires that the empty format string will be equivalent to a >>>> > plain str() operation for builtin types, it is only a recommendation for >>>> > other types. For enums, I believe they should be formatted like their >>>> > base types (so !s and !r will show the enum name, anything without >>>> > coercion will show the value) . >>>> >>>> I don't think I'm over-interpreting the spec (but of course I'd say >>>> that!). 
The spec is very precise on the meaning of "format specifier": >>>> it means the entire string (the second argument to __format__). I'll >>>> grant that in the sentence in question it uses "format specification", >>>> not "format specifier", though. >>>> >>>> I think this interpretation also meshes with builtin-in "format": with >>>> no format_spec argument, it uses an zero-length string as the default >>>> specifically to get the str(obj) behavior. >>>> >>>> Using bool as an example because it's easier to type: >>>> >>>> >>> format(True) >>>> 'True' >>>> >>> format(True, '10') >>>> ' 1' >>> >>> Eric, which-ever way you interpret the spec, the above violates the least-surprise principle; do you agree? It's easily one of those things that makes the "WTF, Python?" lists. Do you disagree? >> >> Oh, I completely agree that it doesn't make much sense, and is surprising. I was just trying to explain why we see the current behavior. >> >>> Unfortunately, I don't think there's a lot we can do about it now. It's a design mistake, locked with backwards compatibility until "Python 4". >> >> Agreed. >> >>> For IntEnum, being in control of __format__ and being a new class, I suppose we can create any behavior we want here. >> >> Right. That's the intent. I'd personally be okay with checking for int format codes (bdxX, etc.) and treating it as an int, otherwise a string. As I've pointed out, it's slightly fragile, and there are a few cases where a valid int format spec will give an error when treating it as a string, but that doesn't bother me. > > This got me thinking when we were discussing it in the issue. It's plausible that every subclass of builtin types will need to implement __format__ to act sanely. So maybe we can propose some sort of API (on the Python level) that makes parsing the format string easy and will not make code go stale? What do you think? I've proposed this in the past, primarily for Decimal. I'd be okay with it. 
It would need to done carefully to allow us to expand the format string, for example when we added ','. Maybe return a namedtuple or equivalent. But remember, not all types understand the same format strings. datetime being the classic case. > >> >>> Can we do more? Is it even conceivable to rig the boolean sub-type to change this behavior to be more rational? I suspect that no, but one can hope ;-) >> >> I don't think there's much we can do, unfortunately. I think bool should work the same as the proposed IntEnum changes, but that's an incompatible change. > > :-( I know. Me, too. Eric. -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Thu Aug 15 17:49:10 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 15 Aug 2013 08:49:10 -0700 Subject: [Python-Dev] format, int, and IntEnum In-Reply-To: <490024AF-C92B-446D-A720-46D28712E050@trueblade.com> References: <520BFFBE.3050501@stoneleaf.us> <520CA78C.1030902@trueblade.com> <490024AF-C92B-446D-A720-46D28712E050@trueblade.com> Message-ID: > This got me thinking when we were discussing it in the issue. It's > plausible that every subclass of builtin types will need to implement > __format__ to act sanely. So maybe we can propose some sort of API (on the > Python level) that makes parsing the format string easy and will not make > code go stale? What do you think? > > > I've proposed this in the past, primarily for Decimal. I'd be okay with > it. It would need to done carefully to allow us to expand the format > string, for example when we added ','. Maybe return a namedtuple or > equivalent. > > But remember, not all types understand the same format strings. datetime > being the classic case. > > Sure, this is why I specifically restricted it to subclasses of builtin types, because these should presumably understand the flags used for builtin types. Anyway, we'll see how much parsing will have to be done in practice for IntEnum - it can serve as a guinea pig. 
Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Thu Aug 15 19:04:07 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 15 Aug 2013 13:04:07 -0400 Subject: [Python-Dev] Issue 13248: 3.4 Removals? Message-ID: Related to the current deprecation discussion: http://bugs.python.org/issue13248 This is a master list of deprecated items scheduled for removal in 3.4. Anything that is going to be removed should be done now, before the next alpha, methinks. -- Terry Jan Reedy From tjreedy at udel.edu Thu Aug 15 19:34:12 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 15 Aug 2013 13:34:12 -0400 Subject: [Python-Dev] When to remove deprecated stuff In-Reply-To: <20130815122936.805FF250168@webabinitio.net> References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> Message-ID: On 8/15/2013 8:29 AM, R. David Murray wrote: > A number of us (I don't know how many) have clearly been thinking about > "Python 4" as the time when we remove cruft. This will not cause any > backward compatibility issues for anyone who has paid heed to the > deprecation warnings, but will for those who haven't. The question > then becomes, is it better to "bundle" these removals into the > Python 4 release, or do them incrementally? 4.0 will be at most 6 releases after the upcoming 3.4, which is 9 to 12 years, which is 7 to 10 years after any regular 2.7 maintenance ends. The deprecated unittest synonyms are documented as being removed in 4.0 and that already defines 4.0 as a future cruft-removal release. However, I would not want it defined as the only cruft-removal release and used as a reason or excuse to suspend removals until then. I would personally prefer to do little* removals incrementally, as was done before the decision to put off 2.x removals to 3.0.
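The machinery behind surfacing the deprecation warnings mentioned above is just the warnings module's filters; a minimal sketch (old_api is a stand-in for any deprecated call, not a real function):

```python
import warnings

def old_api():
    # Stand-in for a deprecated stdlib call.
    warnings.warn("old_api() is deprecated", DeprecationWarning,
                  stacklevel=2)
    return 42

# DeprecationWarning is silenced by default for end users; a test
# runner can force it on so developers see removals coming.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always", DeprecationWarning)
    result = old_api()

assert result == 42
assert any(issubclass(w.category, DeprecationWarning) for w in caught)
```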
So I would have 4.0 be an 'extra' or 'bigger' cruft removal release, but not the only one. * Removing one or two pure synonyms or little used features from a module. The unittest synonym removal is not 'little' because there are 13 synonyms and at least some were well used. > If we are going to do them incrementally we should make that decision > soonish, so that we don't end up having a whole bunch happen at once > and defeat the (theoretical) purpose of doing them incrementally. > > (I say theoretical because what is the purpose? To spread out the > breakage pain over multiple releases, so that every release breaks > something?) Little removals will usually break something, but not most things. Yes, I think it better to upset a few people with each release than lots of people all at once. I think enabling deprecation notices in unittest is a great idea. Among other reasons, it should spread the effect of bigger removals scheduled farther in the future over the extended deprecation period. Most deprecation notices should provide an alternative. (There might be an exception for things that should not be done ;-). For module removals, the alternative should be a legacy package on PyPI. -- Terry Jan Reedy From eric at trueblade.com Thu Aug 15 19:37:17 2013 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 15 Aug 2013 13:37:17 -0400 Subject: [Python-Dev] format, int, and IntEnum In-Reply-To: References: <520BFFBE.3050501@stoneleaf.us> <520CA78C.1030902@trueblade.com> Message-ID: <520D11CD.6060005@trueblade.com> On 08/15/2013 11:06 AM, Nick Coghlan wrote: > On 15 August 2013 05:03, Eric V. Smith wrote: >> On 8/15/2013 12:27 AM, Nick Coghlan wrote: >>> I think Eric is overinterpreting the spec, there. While that particular >>> sentence requires that the empty format string will be equivalent to a >>> plain str() operation for builtin types, it is only a recommendation for >>> other types.
For enums, I believe they should be formatted like their >>> base types (so !s and !r will show the enum name, anything without >>> coercion will show the value) . >> >> I don't think I'm over-interpreting the spec (but of course I'd say >> that!). The spec is very precise on the meaning of "format specifier": >> it means the entire string (the second argument to __format__). I'll >> grant that in the sentence in question it uses "format specification", >> not "format specifier", though. > > By overinterpreting, I meant interpreting that part to mean literally > calling str() when the specifier was empty, but not calling it > otherwise. That's not what it means: it means that "str(x)" and > "format(x)" will typically produce the *same result*, not that format > will actually call str. If a subclass overrides __str__ but not > __format__ (or vice-versa), then they can and *should* diverge. > > So, ideally, we would be more consistent, and either *always* call > str() for subclasses when no type specifier is given, or *never* call > it and always use the value. > > In this case, for integers, a missing type specifier for numeric types > is defined as meaning "d", so calling str() on subclasses is wrong - > it should be using the value, not the string representation. So "no > type specifier" + "int subclass" *should* have meant calling int(self) > prior to formatting rather than str(self), and subtypes look bool > would have needed to override both __str__ and __format__. I'm pretty sure that when I implemented this we went over this point in extreme detail, and that we specifically wanted it to apply to an empty "format specifier", not an empty "type specifier" (which the PEP and the documentation call a "presentation type"). Your interpretation might make more sense, but it's not what we discussed at the time (to my recollection), and I think it's too late to change now, unfortunately. 
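Nick's "always override both" advice can be shown concretely; Celsius below is a hypothetical float subclass of my own, not anything from the stdlib:

```python
class Celsius(float):
    """Hypothetical float subclass whose str() differs from its value."""

    def __str__(self):
        return '%gC' % float(self)

    def __format__(self, spec):
        # Without this override, format(x) would show str(x) but
        # format(x, '8') would format the numeric value -- the same
        # empty-vs-non-empty divergence bool exhibits.
        return format(str(self), spec)

assert str(Celsius(21.5)) == '21.5C'
assert format(Celsius(21.5)) == '21.5C'       # empty spec agrees...
assert format(Celsius(21.5), '>8') == '   21.5C'  # ...and so does a width
```

Overriding both methods makes the empty and non-empty spec cases consistent, at the cost of no longer accepting numeric-only codes like ','.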
> The problem now is that changing this behaviour in the base class > would break subclasses like bool that expect format(x) to produce the > same result as str(x) even though they have overridden __str__ but not > __format__. > > So, I think the path to consistency here needs to be that, if a > builtin subclass overrides __str__ without overriding __format__, then > the formatting result will *always* be "format(str(x), fmtspec)", > regardless of whether fmtspec is empty or not. This will make bool > formatting work as expected, even when using things like the field > width specifiers. > > There's a slight backwards compatibility risk even with that more > conservative change, though - attempting to use numeric formatting > codes with affected subtypes will now throw an exception, and for > codes common to numbers and strings the output will change :(
>
> >>> format(True, "10")
> '         1'
> >>> format(str(True), "10")
> 'True      '
> >>> format(True, "10,")
> '         1'
> >>> format(str(True), "10,")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: Cannot specify ',' with 's'.
>
> So we may just have to live with the wart and tell people to always > override both to ensure consistent behaviour :(

I think that's true.

>> I still think the best thing to do is implement __format__ for IntEnum, >> and there implement whatever behavior is decided. I don't think changing >> the meaning of existing objects (specifically int here) is a good course >> of action.

> I don't think changing the processing of int is proposed - just the > handling of subclasses. However, since it seems to me that "make bool > formatting work consistently" is a more conservative change (if we > change the base class behaviour at all), then Enum subclasses will > need __format__ defined regardless.

Agreed. And I think the real discussion here needs to be: what should __format__ for IntEnum (or maybe Enum) do? Is not being consistent with what int.__format__ accepts okay?
(For example, "+" or "10,".) Or does being a "drop-in replacement for int" mean that any format string for int must also work for IntEnum? As I've said, I think breaking a few format strings is okay, but of course it's debatable. Eric. From eric at trueblade.com Thu Aug 15 19:44:15 2013 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 15 Aug 2013 13:44:15 -0400 Subject: [Python-Dev] format, int, and IntEnum In-Reply-To: <520CF20A.8030308@stoneleaf.us> References: <520BFFBE.3050501@stoneleaf.us> <520CA78C.1030902@trueblade.com> <520CF20A.8030308@stoneleaf.us> Message-ID: <520D136F.2010408@trueblade.com> On 08/15/2013 11:21 AM, Ethan Furman wrote: > Given that the !r and !s format codes can be used to get the repr and > str of an IntEnum, would it be acceptable to have IntEnum's __format__ > simply pass through to int's __format__? And likewise with all mix-in > classes? That helps with str.format(), but not with built-in format(). There, you'd have to explicitly call str() or repr():

>>> '{:10}'.format(True)
'         1'
>>> format(True, '10')
'         1'
>>> '{!s:10}'.format(True)
'True      '
>>> format(str(True), '10')
'True      '

Eric. From dickinsm at gmail.com Thu Aug 15 19:58:05 2013 From: dickinsm at gmail.com (Mark Dickinson) Date: Thu, 15 Aug 2013 18:58:05 +0100 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <520CD2C6.8060103@pearwood.info> References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> Message-ID: On Thu, Aug 15, 2013 at 2:08 PM, Steven D'Aprano wrote: > > - Each scheme ended up needing to be a separate function, for ease of both > implementation and testing. So I had four private median functions, which I > put inside a class to act as namespace and avoid polluting the main > namespace. Then I needed a "master function" to select which of the methods > should be called, with all the additional testing and documentation that > entailed.
> That's just an implementation issue, though, and sounds like a minor inconvenience to the implementor rather than anything serious; I don't think that that should dictate the API that's used. - The API doesn't really feel very Pythonic to me. For example, we write: > And I guess this is subjective: conversely, the API you're proposing doesn't feel Pythonic to me. :-) I'd like the hear the opinion of other python-dev readers. Thanks for the detailed replies. Would it be possible to put some of this reasoning into the PEP? Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From rymg19 at gmail.com Thu Aug 15 19:48:37 2013 From: rymg19 at gmail.com (Ryan) Date: Thu, 15 Aug 2013 12:48:37 -0500 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <520CD2C6.8060103@pearwood.info> References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> Message-ID: For the naming, how about changing median(callable) to median.regular? That way, we don't have to deal with a callable namespace. Steven D'Aprano wrote: >On 15/08/13 21:42, Mark Dickinson wrote: >> The PEP and code look generally good to me. >> >> I think the API for median and its variants deserves some wider >discussion: >> the reference implementation has a callable 'median', and variant >callables >> 'median.low', 'median.high', 'median.grouped'. The pattern of >attaching >> the variant callables as attributes on the main callable is unusual, >and >> isn't something I've seen elsewhere in the standard library. I'd >like to >> see some explanation in the PEP for why it's done this way. (There >was >> already some discussion of this on the issue, but that was more >centered >> around the implementation than the API.) 
>> >> I'd propose two alternatives for this: either have separate >functions >> 'median', 'median_low', 'median_high', etc., or have a single >function >> 'median' with a "method" argument that takes a string specifying >> computation using a particular method. I don't see a really good >reason to >> deviate from standard patterns here, and fear that users would find >the >> current API surprising. > >Alexander Belopolsky has convinced me (off-list) that my current >implementation is better changed to a more conservative one of a >callable singleton instance with methods implementing the alternative >computations. I'll have something like: > > >def _singleton(cls): > return cls() > > >@_singleton >class median: > def __call__(self, data): > ... > def low(self, data): > ... > ... > > >In my earlier stats module, I had a single median function that took a >argument to choose between alternatives. I called it "scheme": > >median(data, scheme="low") > >R uses parameter called "type" to choose between alternate >calculations, not for median as we are discussing, but for quantiles: > >quantile(x, probs ... type = 7, ...). > >SAS also uses a similar system, but with different numeric codes. I >rejected both "type" and "method" as the parameter name since it would >cause confusion with the usual meanings of those words. I eventually >decided against this system for two reasons: > >- Each scheme ended up needing to be a separate function, for ease of >both implementation and testing. So I had four private median >functions, which I put inside a class to act as namespace and avoid >polluting the main namespace. Then I needed a "master function" to >select which of the methods should be called, with all the additional >testing and documentation that entailed. > >- The API doesn't really feel very Pythonic to me. For example, we >write: > >mystring.rjust(width) >dict.items() > >rather than mystring.justify(width, "right") or dict.iterate("items"). 
>So I think individual methods is a better API, and one which is more >familiar to most Python users. The only innovation (if that's what it >is) is to have median a callable object. > > >As far as having four separate functions, median, median_low, etc., it >just doesn't feel right to me. It puts four slight variations of the >same function into the main namespace, instead of keeping them together >in a namespace. Names like median_low merely simulates a namespace with >pseudo-methods separated with underscores instead of dots, only without >the advantages of a real namespace. > >(I treat variance and std dev differently, and make the sample and >population forms separate top-level functions rather than methods, >simply because they are so well-known from scientific calculators that >it is unthinkable to me to do differently. Whenever I use numpy, I am >surprised all over again that it has only a single variance function.) > > > >-- >Steven >_______________________________________________ >Python-Dev mailing list >Python-Dev at python.org >http://mail.python.org/mailman/listinfo/python-dev >Unsubscribe: >http://mail.python.org/mailman/options/python-dev/rymg19%40gmail.com -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dickinsm at gmail.com Thu Aug 15 20:09:41 2013 From: dickinsm at gmail.com (Mark Dickinson) Date: Thu, 15 Aug 2013 19:09:41 +0100 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> Message-ID: On Thu, Aug 15, 2013 at 6:48 PM, Ryan wrote: > For the naming, how about changing median(callable) to median.regular? > That way, we don't have to deal with a callable namespace. > Hmm. That sounds like a step backwards to me: whatever the API is, a simple "from statistics import median; m = median(my_data)" should still work in the simple case. 
Mark > > Steven D'Aprano wrote: > >> On 15/08/13 21:42, Mark Dickinson wrote: >> >>> The PEP and code look generally good to me. >>> >>> I think the API for median and its variants deserves some wider discussion: >>> the reference implementation has a callable 'median', and variant callables >>> 'median.low', 'median.high', 'median.grouped'. The pattern of attaching >>> the variant callables as attributes on the main callable is unusual, and >>> isn't something I've seen elsewhere in the standard library. I'd like to >>> see some explanation in the PEP for why it's done this way. (There was >>> already some discussion of this on the issue, but that was more centered >>> around the implementation than the API.) >>> >>> I'd propose two alternatives for this: either have separate functions >>> 'median', 'median_low', 'median_high', etc., or have a single function >>> 'median' with a "method" argument that takes a string specifying >>> computation using a particular method. I don't see a really good reason to >>> deviate from standard patterns here, and fear that users would find the >>> current API surprising. >> >> >> Alexander Belopolsky has convinced me (off-list) that my current implementation is better changed to a more conservative one of a callable singleton instance with methods implementing the alternative computations. I'll have something like: >> >> >> def _singleton(cls): >> return cls() >> >> >> @_singleton >> class median: >> def __call__(self, data): >> ... >> def low(self, data): >> ... >> ... >> >> >> In my earlier stats module, I had a single median function that took a argument to choose between alternatives. I called it "scheme": >> >> median(data, scheme="low") >> >> R uses parameter >> called "type" to choose between alternate calculations, not for median as we are discussing, but for quantiles: >> >> quantile(x, probs ... type = 7, ...). >> >> SAS also uses a similar system, but with different numeric codes. 
I rejected both "type" and "method" as the parameter name since it would cause confusion with the usual meanings of those words. I eventually decided against this system for two reasons: >> >> - Each scheme ended up needing to be a separate function, for ease of both implementation and testing. So I had four private median functions, which I put inside a class to act as namespace and avoid polluting the main namespace. Then I needed a "master function" to select which of the methods should be called, with all the additional testing and documentation that entailed. >> >> - The API doesn't really feel very Pythonic to me. For example, we write: >> >> mystring.rjust(width) >> dict.items() >> >> rather than mystring.justify(width, >> "right") or dict.iterate("items"). So I think individual methods is a better API, and one which is more familiar to most Python users. The only innovation (if that's what it is) is to have median a callable object. >> >> >> As far as having four separate functions, median, median_low, etc., it just doesn't feel right to me. It puts four slight variations of the same function into the main namespace, instead of keeping them together in a namespace. Names like median_low merely simulates a namespace with pseudo-methods separated with underscores instead of dots, only without the advantages of a real namespace. >> >> (I treat variance and std dev differently, and make the sample and population forms separate top-level functions rather than methods, simply because they are so well-known from scientific calculators that it is unthinkable to me to do differently. Whenever I use numpy, I am surprised all over again that it has only a single variance function.) >> >> >> > -- > Sent from my Android phone with K-9 Mail. Please excuse my brevity. 
> > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/dickinsm%40gmail.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Thu Aug 15 20:10:39 2013 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 15 Aug 2013 14:10:39 -0400 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> Message-ID: <520D199F.3000704@trueblade.com> On 08/15/2013 01:58 PM, Mark Dickinson wrote: > On Thu, Aug 15, 2013 at 2:08 PM, Steven D'Aprano > wrote: > > > - Each scheme ended up needing to be a separate function, for ease > of both implementation and testing. So I had four private median > functions, which I put inside a class to act as namespace and avoid > polluting the main namespace. Then I needed a "master function" to > select which of the methods should be called, with all the > additional testing and documentation that entailed. > > > That's just an implementation issue, though, and sounds like a minor > inconvenience to the implementor rather than anything serious; I don't > think that that should dictate the API that's used. > > - The API doesn't really feel very Pythonic to me. For example, we > write: > > > And I guess this is subjective: conversely, the API you're proposing > doesn't feel Pythonic to me. :-) I'd like the hear the opinion of other > python-dev readers. I agree with Mark: the proposed median, median.low, etc., doesn't feel right. Is there any example of doing this in the stdlib? I suggest just median(), median_low(), etc. If we do end up keeping it, simpler than the callable singleton is: >>> def median(): return 'median' ... >>> def _median_low(): return 'median.low' ... 
>>> median.low = _median_low >>> del _median_low >>> median() 'median' >>> median.low() 'median.low' Eric. From python at mrabarnett.plus.com Thu Aug 15 20:17:39 2013 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 15 Aug 2013 19:17:39 +0100 Subject: [Python-Dev] When to remove deprecated stuff In-Reply-To: <20130815122936.805FF250168@webabinitio.net> References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> Message-ID: <520D1B43.1030908@mrabarnett.plus.com> On 15/08/2013 13:29, R. David Murray wrote: > On Thu, 15 Aug 2013 11:22:14 +0200, Antoine Pitrou wrote: >> On Thu, 15 Aug 2013 11:16:20 +0200 >> Victor Stinner wrote: >> > 2013/8/15 Antoine Pitrou : >> > > We don't have any substantial change in store for an eventual "Python >> > > 4", so it's quite a remote hypothesis right now. >> > >> > I prefered the transition between Linux 2 and Linux 3 (no major >> > change, just a "normal" release except the version), rather than the >> > transition between KDE 3 and KDE 4 (in short, everything was broken, >> > the desktop was not usable). >> > >> > I prefer to not start a list of things that we will make the >> > transition from Python 3 to Python 4 harder. Can't we do small changes >> > between each Python release, even between major versions? >> >> That's exactly what I'm saying. >> But some changes cannot be made without breakage, e.g. the unicode >> transition. Then it makes sense to bundle all breaking changes in a >> single version change. > > A number of us (I don't know how many) have clearly been thinking about > "Python 4" as the time when we remove cruft. This will not cause any > backward compatibility issues for anyone who has paid heed to the > deprecation warnings, but will for those who haven't. The question > then becomes, is it better to "bundle" these removals into the > Python 4 release, or do them incrementally? 
> > If we are going to do them incrementally we should make that decision > soonish, so that we don't end up having a whole bunch happen at once > and defeat the (theoretical) purpose of doing them incrementally. > > (I say theoretical because what is the purpose? To spread out the > breakage pain over multiple releases, so that every release breaks > something?) > Talking of cruft, would that include these methods of the Thread class? getName() setName() isDaemon() setDaemon() From rdmurray at bitdance.com Thu Aug 15 20:24:50 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Thu, 15 Aug 2013 14:24:50 -0400 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <520D199F.3000704@trueblade.com> References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> <520D199F.3000704@trueblade.com> Message-ID: <20130815182451.329612500CB@webabinitio.net> On Thu, 15 Aug 2013 14:10:39 -0400, "Eric V. Smith" wrote: > On 08/15/2013 01:58 PM, Mark Dickinson wrote: > > On Thu, Aug 15, 2013 at 2:08 PM, Steven D'Aprano > > wrote: > > > > > > - Each scheme ended up needing to be a separate function, for ease > > of both implementation and testing. So I had four private median > > functions, which I put inside a class to act as namespace and avoid > > polluting the main namespace. Then I needed a "master function" to > > select which of the methods should be called, with all the > > additional testing and documentation that entailed. > > > > > > That's just an implementation issue, though, and sounds like a minor > > inconvenience to the implementor rather than anything serious; I don't > > think that that should dictate the API that's used. > > > > - The API doesn't really feel very Pythonic to me. For example, we > > write: > > > > > > And I guess this is subjective: conversely, the API you're proposing > > doesn't feel Pythonic to me. :-) I'd like the hear the opinion of other > > python-dev readers. 
> > I agree with Mark: the proposed median, median.low, etc., doesn't feel > > right. Is there any example of doing this in the stdlib? I suggest just > > median(), median_low(), etc. I too prefer the median_low naming rather than median.low. I'm not sure I can articulate why, but certainly the fact that the latter isn't used anywhere else in the stdlib that I can think of is probably a lot of it :) Perhaps the underlying thought is that we don't use classes as pure function namespaces: we expect classes to be something more than that. --David From rdmurray at bitdance.com Thu Aug 15 20:30:37 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Thu, 15 Aug 2013 14:30:37 -0400 Subject: [Python-Dev] When to remove deprecated stuff In-Reply-To: References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> Message-ID: <20130815183038.149242500CB@webabinitio.net> On Thu, 15 Aug 2013 13:34:12 -0400, Terry Reedy wrote: > On 8/15/2013 8:29 AM, R. David Murray wrote: > > A number of us (I don't know how many) have clearly been thinking about > > "Python 4" as the time when we remove cruft. This will not cause any > > backward compatibility issues for anyone who has paid heed to the > > deprecation warnings, but will for those who haven't. The question > > then becomes, is it better to "bundle" these removals into the > > Python 4 release, or do them incrementally? > > 4.0 will be at most 6 releases after the upcoming 3.4, which is 9 to 12 > years, which is 7 to 10 years after any regular 2.7 maintenance ends. > > The deprecated unittest synonyms are documented as being removed in 4.0 > and that already defines 4.0 as a future cruft-removal release. However, > I would not want it defined as the only cruft-removal release and used > as a reason or excuse to suspend removals until then. 
I would personally > prefer to do little* removals incrementally, as was done before the > decision to put off 2.x removals to 3.0. So I would have 4.0 be an > 'extra' or 'bigger' cruft removal release, but not the only one. > > * Removing one or two pure synonyms or little used features from a > module. The unittest synonym removal is not 'little' because there are > 13 synonyms and at least some were well used. Yes, by "removing cruft" I mostly had in mind the bigger cruft, like whole modules or stuff that is likely to break a lot of existing code. --David From solipsis at pitrou.net Thu Aug 15 20:31:32 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 15 Aug 2013 20:31:32 +0200 Subject: [Python-Dev] PEP 450 adding statistics module References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> <520D199F.3000704@trueblade.com> <20130815182451.329612500CB@webabinitio.net> Message-ID: <20130815203132.08206dd1@fsol> On Thu, 15 Aug 2013 14:24:50 -0400 "R. David Murray" wrote: > On Thu, 15 Aug 2013 14:10:39 -0400, "Eric V. Smith" wrote: > > On 08/15/2013 01:58 PM, Mark Dickinson wrote: > > > On Thu, Aug 15, 2013 at 2:08 PM, Steven D'Aprano > > > wrote: > > > > > > > > > - Each scheme ended up needing to be a separate function, for ease > > > of both implementation and testing. So I had four private median > > > functions, which I put inside a class to act as namespace and avoid > > > polluting the main namespace. Then I needed a "master function" to > > > select which of the methods should be called, with all the > > > additional testing and documentation that entailed. > > > > > > > > > That's just an implementation issue, though, and sounds like a minor > > > inconvenience to the implementor rather than anything serious; I don't > > > think that that should dictate the API that's used. > > > > > > - The API doesn't really feel very Pythonic to me. 
For example, we > > > write: > > > > > > > > > And I guess this is subjective: conversely, the API you're proposing > > > doesn't feel Pythonic to me. :-) I'd like the hear the opinion of other > > > python-dev readers. > > > > I agree with Mark: the proposed median, median.low, etc., doesn't feel > > right. Is there any example of doing this in the stdlib? I suggest just > > median(), median_low(), etc. > > I too prefer the median_low naming rather than median.low. I'm not > sure I can articulate why, but certainly the fact that that latter > isn't used anywhere else in the stdlib that I can think of is > probably a lot of it :) Count me in the Agreement Car, with Mark and RDM. Regards Antoine. From ethan at stoneleaf.us Thu Aug 15 19:55:45 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 15 Aug 2013 10:55:45 -0700 Subject: [Python-Dev] format, int, and IntEnum In-Reply-To: <520D136F.2010408@trueblade.com> References: <520BFFBE.3050501@stoneleaf.us> <520CA78C.1030902@trueblade.com> <520CF20A.8030308@stoneleaf.us> <520D136F.2010408@trueblade.com> Message-ID: <520D1621.6040608@stoneleaf.us> On 08/15/2013 10:44 AM, Eric V. Smith wrote: > On 08/15/2013 11:21 AM, Ethan Furman wrote: >> Given that the !r and !s format codes can be used to get the repr and >> str of an IntEnum, would it be acceptable to have IntEnum's __format__ >> simply pass through to int's __format__? And likewise with all mix-in >> classes? > > That helps with str.format(), but not with built-in format(). There, > you'd have to explicitly call str() or repr(): Given that !s and !r are explicitly asking for str or repr, I'm okay with that. 
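For concreteness, the delegation Ethan describes can be sketched directly. The Color enum below is a made-up example, and the explicit __format__ override is one of the options under discussion here, not necessarily what the stdlib IntEnum ends up doing:

```python
import enum

class Color(enum.IntEnum):
    RED = 1
    GREEN = 2

    # Hypothetical: delegate formatting to int, so numeric format
    # specs such as "10" or "+" keep working on the mixed-in type.
    def __format__(self, spec):
        return int.__format__(self, spec)

# Numeric format codes now behave exactly as they do for a plain int:
print(format(Color.RED, "10"))  # right-aligned in a field of 10
print(format(Color.RED, "+"))   # '+1'
```

With this delegation, format(Color.RED, "10,") also succeeds, at the cost of format(Color.RED) no longer matching str(Color.RED).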
-- ~Ethan~ From eliben at gmail.com Thu Aug 15 22:00:56 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 15 Aug 2013 13:00:56 -0700 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <20130815182451.329612500CB@webabinitio.net> References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> <520D199F.3000704@trueblade.com> <20130815182451.329612500CB@webabinitio.net> Message-ID: > > > > And I guess this is subjective: conversely, the API you're proposing > > > doesn't feel Pythonic to me. :-) I'd like the hear the opinion of > other > > > python-dev readers. > > > > I agree with Mark: the proposed median, median.low, etc., doesn't feel > > right. Is there any example of doing this in the stdlib? I suggest just > > median(), median_low(), etc. > > I too prefer the median_low naming rather than median.low. I'm not > sure I can articulate why, but certainly the fact that that latter > isn't used anywhere else in the stdlib that I can think of is > probably a lot of it :) > > Perhaps the underlying thought is that we don't use classes pure > function namespaces: we expect classes to be something more than > that. > Certainly. Python does not force the "everything is a class" philosophy of Java and Ruby. Classes have their uses, but namespacing isn't it. There are modules for namespaces. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Thu Aug 15 22:16:21 2013 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 15 Aug 2013 16:16:21 -0400 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <20130815182451.329612500CB@webabinitio.net> References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> <520D199F.3000704@trueblade.com> <20130815182451.329612500CB@webabinitio.net> Message-ID: <520D3715.5080601@trueblade.com> On 8/15/2013 2:24 PM, R. David Murray wrote: > On Thu, 15 Aug 2013 14:10:39 -0400, "Eric V. 
Smith" wrote: >> On 08/15/2013 01:58 PM, Mark Dickinson wrote: >>> On Thu, Aug 15, 2013 at 2:08 PM, Steven D'Aprano >> > wrote: >>> >>> >>> - Each scheme ended up needing to be a separate function, for ease >>> of both implementation and testing. So I had four private median >>> functions, which I put inside a class to act as namespace and avoid >>> polluting the main namespace. Then I needed a "master function" to >>> select which of the methods should be called, with all the >>> additional testing and documentation that entailed. >>> >>> >>> That's just an implementation issue, though, and sounds like a minor >>> inconvenience to the implementor rather than anything serious; I don't >>> think that that should dictate the API that's used. >>> >>> - The API doesn't really feel very Pythonic to me. For example, we >>> write: >>> >>> >>> And I guess this is subjective: conversely, the API you're proposing >>> doesn't feel Pythonic to me. :-) I'd like the hear the opinion of other >>> python-dev readers. >> >> I agree with Mark: the proposed median, median.low, etc., doesn't feel >> right. Is there any example of doing this in the stdlib? I suggest just >> median(), median_low(), etc. > > I too prefer the median_low naming rather than median.low. I'm not > sure I can articulate why, but certainly the fact that that latter > isn't used anywhere else in the stdlib that I can think of is > probably a lot of it :) Actually, there is one place I can think of: itertools.chain.from_iterable. But I think that was a mistake, too. As a recent discussion showed, it's not exactly discoverable. The fact that it's not mentioned in the list of functions at the top of the documentation doesn't help. And "chain" is documented as a "module function", and "chain.from_iterable" as a "classmethod" making it all the more confusing. I think itertools.combinations and itertools.combinations_with_replacement is the better example of related functions that should be followed. 
Not nested, no special parameters trying to differentiate them: just two different function names. -- Eric. From fuzzyman at voidspace.org.uk Thu Aug 15 22:28:39 2013 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Thu, 15 Aug 2013 23:28:39 +0300 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <520D199F.3000704@trueblade.com> References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> <520D199F.3000704@trueblade.com> Message-ID: <7B2971E1-4BD1-477E-B554-ABEC6954E7FA@voidspace.org.uk> On 15 Aug 2013, at 21:10, "Eric V. Smith" wrote: > On 08/15/2013 01:58 PM, Mark Dickinson wrote: >> On Thu, Aug 15, 2013 at 2:08 PM, Steven D'Aprano > > wrote: >> >> >> - Each scheme ended up needing to be a separate function, for ease >> of both implementation and testing. So I had four private median >> functions, which I put inside a class to act as namespace and avoid >> polluting the main namespace. Then I needed a "master function" to >> select which of the methods should be called, with all the >> additional testing and documentation that entailed. >> >> >> That's just an implementation issue, though, and sounds like a minor >> inconvenience to the implementor rather than anything serious; I don't >> think that that should dictate the API that's used. >> >> - The API doesn't really feel very Pythonic to me. For example, we >> write: >> >> >> And I guess this is subjective: conversely, the API you're proposing >> doesn't feel Pythonic to me. :-) I'd like the hear the opinion of other >> python-dev readers. > > I agree with Mark: the proposed median, median.low, etc., doesn't feel > right. Is there any example of doing this in the stdlib? I suggest just > median(), median_low(), etc. > > If we do end up keeping it, simpler than the callable singleton is: > >>>> def median(): return 'median' > ... >>>> def _median_low(): return 'median.low' > ... 
>>>> median.low = _median_low >>>> del _median_low >>>> median() > 'median' >>>> median.low() > 'median.low' There's the patch decorator in unittest.mock which provides: patch(...) patch.object(...) patch.dict(...) The implementation is exactly as you suggest. (e.g. patch.object = _patch_object) Michael > > Eric. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From tjreedy at udel.edu Thu Aug 15 23:10:38 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 15 Aug 2013 17:10:38 -0400 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <520D3715.5080601@trueblade.com> References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> <520D199F.3000704@trueblade.com> <20130815182451.329612500CB@webabinitio.net> <520D3715.5080601@trueblade.com> Message-ID: On 8/15/2013 4:16 PM, Eric V. Smith wrote: > itertools.chain.from_iterable. But I think that was a mistake, too. As a > recent discussion showed, it's not exactly discoverable. The fact that > it's not mentioned in the list of functions at the top of the > documentation doesn't help. And "chain" is documented as a "module > function", and "chain.from_iterable" as a "classmethod" making it all > the more confusing. > > I think itertools.combinations and > itertools.combinations_with_replacement is the better example of related > functions that should be followed. Not nested, no special parameters > trying to differentiate them: just two different function names. Great implied idea. 
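A runnable sketch of the attribute-on-callable pattern being debated (the same shape as chain.from_iterable and mock's patch.object). The median logic here is a simplified stand-in, not the PEP 450 reference implementation:

```python
def _singleton(cls):
    # Replace the class with its sole instance, so the name is callable.
    return cls()

@_singleton
class median:
    """Callable namespace: median(data) plus median.low(data), etc."""

    def __call__(self, data):
        # Conventional median: mean of the two middle values for even n.
        s = sorted(data)
        n = len(s)
        mid = n // 2
        if n % 2:
            return s[mid]
        return (s[mid - 1] + s[mid]) / 2

    def low(self, data):
        # "Low" median: always an actual data point (the smaller middle).
        s = sorted(data)
        return s[(len(s) - 1) // 2]

    def high(self, data):
        # "High" median: the larger middle value.
        s = sorted(data)
        return s[len(s) // 2]

print(median([1, 3, 5, 7]))       # 4.0
print(median.low([1, 3, 5, 7]))   # 3
print(median.high([1, 3, 5, 7]))  # 5
```

The separate-functions alternative is simply median(), median_low(), median_high() at module level; the two APIs are otherwise equivalent in behaviour.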
I opened http://bugs.python.org/issue18752 "Make chain.from_iterable an alias for a new chain_iterable." -- Terry Jan Reedy From tjreedy at udel.edu Thu Aug 15 23:27:01 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 15 Aug 2013 17:27:01 -0400 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <520C2E23.40405@pearwood.info> References: <520C2E23.40405@pearwood.info> Message-ID: On 8/14/2013 9:25 PM, Steven D'Aprano wrote: > Hi all, > > I have raised a tracker item and PEP for adding a statistics module to > the standard library: > > http://bugs.python.org/issue18606 > > http://www.python.org/dev/peps/pep-0450/ > > There has been considerable discussion on python-ideas, I have avoided this discussion, in spite of a decade+ experience as a statistician-programmer, because I am quite busy with Idle testing and there seem to be enough other knowledgeable people around. But I approve of the general idea. I once naively used the shortcut computing formula for variance, present in all too many statistics books, in a program I supplied to a couple of laboratories. After a few months, maybe even a year, of daily use, it crashed trying to take the square root of a negative variance*. Whoops. Fortunately, I was still around to quickly fix it. *As I remember, the three values were something like 10000, 10000, 10001 as single-precision floats. 
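The failure Terry describes is reproducible even in double precision with the one-pass "shortcut" formula. The offset data below is a standard textbook illustration, not his original numbers:

```python
def naive_variance(data):
    # Textbook shortcut: E[x^2] - E[x]^2, computed in one pass.
    # Catastrophic cancellation when the mean dwarfs the spread.
    n = len(data)
    s = sum(data)
    ss = 0.0
    for x in data:
        ss += x * x
    return (ss - s * s / n) / n

def two_pass_variance(data):
    # Numerically safer: subtract the mean first.
    n = len(data)
    m = sum(data) / n
    return sum((x - m) ** 2 for x in data) / n

data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
print(two_pass_variance(data))  # 22.5, the true population variance
print(naive_variance(data))     # negative! sqrt() of this would raise
```

The deviations from the mean are only single digits, while the squared values are near 1e18, so the subtraction in the naive formula cancels away all the significant digits and leaves pure rounding error.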
-- Terry Jan Reedy From fuzzyman at voidspace.org.uk Thu Aug 15 23:27:20 2013 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Fri, 16 Aug 2013 00:27:20 +0300 Subject: [Python-Dev] When to remove deprecated stuff (was: Deprecating the formatter module) In-Reply-To: References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> <20130815143603.3fe29f54@fsol> Message-ID: <070D1AE3-749E-427E-B58E-D8E8EBFA4742@voidspace.org.uk> On 15 Aug 2013, at 15:40, Brett Cannon wrote: > > > > On Thu, Aug 15, 2013 at 8:36 AM, Antoine Pitrou wrote: > On Thu, 15 Aug 2013 08:29:35 -0400 > "R. David Murray" wrote: > > > On Thu, 15 Aug 2013 11:22:14 +0200, Antoine Pitrou wrote: > > > On Thu, 15 Aug 2013 11:16:20 +0200 > > > Victor Stinner wrote: > > > > 2013/8/15 Antoine Pitrou : > > > > > We don't have any substantial change in store for an eventual "Python > > > > > 4", so it's quite a remote hypothesis right now. > > > > > > > > I prefered the transition between Linux 2 and Linux 3 (no major > > > > change, just a "normal" release except the version), rather than the > > > > transition between KDE 3 and KDE 4 (in short, everything was broken, > > > > the desktop was not usable). > > > > > > > > I prefer to not start a list of things that we will make the > > > > transition from Python 3 to Python 4 harder. Can't we do small changes > > > > between each Python release, even between major versions? > > > > > > That's exactly what I'm saying. > > > But some changes cannot be made without breakage, e.g. the unicode > > > transition. Then it makes sense to bundle all breaking changes in a > > > single version change. > > > > A number of us (I don't know how many) have clearly been thinking about > > "Python 4" as the time when we remove cruft. 
This will not cause any > > backward compatibility issues for anyone who has paid heed to the > > deprecation warnings, but will for those who haven't. > > Which is why we shouldn't silence deprecation warnings. > > What we should probably do is have unittest turn deprecations on by default when running your tests but leave them silent otherwise. Hmmm.... I thought we already did this. I guess not. Anyway, I concur. Michael > I still think keeping them silent for the benefit of end-users is a good thing as long as we make it easier for developers to switch on warnings without thinking about it. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From christian at python.org Fri Aug 16 01:08:10 2013 From: christian at python.org (Christian Heimes) Date: Fri, 16 Aug 2013 01:08:10 +0200 Subject: [Python-Dev] PEP 452 API for Cryptographic Hash Functions v2.0 Message-ID: <520D5F5A.7090803@python.org> Hello, I have written a revised version of PEP 247. It's heavily based on AMKs original version from 2001. Version 2.0 adds ``name`` and ``block_size`` as mandatory attributes. It defines that hashing objects operate only on byte-like objects in Python 3.x, too. I have also developed an abstract base class for cryptographic hashing algorithm [1]. Should I add it to the PEP and make it mandatory for Python 3.4+? Regards, Christian [1] http://bugs.python.org/issue18742 -------------- next part -------------- PEP: 452 Title: API for Cryptographic Hash Functions v2.0 Version: $Revision$ Last-Modified: $Date$ Author: A.M. 
Kuchling, Christian Heimes Status: Draft Type: Informational Created: 15-Aug-2013 Post-History: Replaces: 247

Abstract

There are several different modules available that implement cryptographic hashing algorithms such as MD5 or SHA. This document specifies a standard API for such algorithms, to make it easier to switch between different implementations.

Specification

All hashing modules should present the same interface. Additional methods or variables can be added, but those described in this document should always be present. Hash function modules define one function:

new([string]) (unkeyed hashes)
new([key], [string]) (keyed hashes)

Create a new hashing object and return it. The first form is for hashes that are unkeyed, such as MD5 or SHA. For keyed hashes such as HMAC, 'key' is a required parameter containing a string giving the key to use. In both cases, the optional 'string' parameter, if supplied, will be immediately hashed into the object's starting state, as if obj.update(string) was called.

After creating a hashing object, arbitrary bytes can be fed into the object using its update() method, and the hash value can be obtained at any time by calling the object's digest() method.

Although the parameter is called 'string', hashing objects operate on 8-bit data only. Both 'key' and 'string' must be bytes-like objects (bytes, bytearray...). A hashing object may support one-dimensional, contiguous buffers as argument, too. Text (unicode) is no longer supported in Python 3.x. Python 2.x implementations may take ASCII-only unicode as argument, but portable code should not rely on the feature.

Arbitrary additional keyword arguments can be added to this function, but if they're not supplied, sensible default values should be used. For example, 'rounds' and 'digest_size' keywords could be added for a hash function which supports a variable number of rounds and several different output sizes, and they should default to values believed to be secure.
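The constructor contract described above can be exercised with the standard library's hashlib and hmac modules standing in for a conforming module. This is an illustrative sketch by the editor, not part of the PEP text:

```python
import hashlib
import hmac

# Unkeyed hash: the optional 'string' argument is hashed immediately,
# exactly as if update() had been called on a freshly created object.
a = hashlib.new('sha256', b'abc')
b = hashlib.new('sha256')
b.update(b'abc')
assert a.digest() == b.digest()

# Keyed hash (HMAC): the required 'key' comes first, optional 'string' second.
m1 = hmac.new(b'secret-key', b'abc', digestmod='sha256')
m2 = hmac.new(b'secret-key', digestmod='sha256')
m2.update(b'abc')
assert m1.digest() == m2.digest()
```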
Hash function modules define one variable:

digest_size
    An integer value; the size of the digest produced by the hashing objects created by this module, measured in bytes. You could also obtain this value by creating a sample object and accessing its 'digest_size' attribute, but it can be convenient to have this value available from the module. Hashes with a variable output size will set this variable to None.

Hashing objects require the following attribute:

digest_size
    This attribute is identical to the module-level digest_size variable: the size of the digest produced by the hashing object, measured in bytes. If the hash has a variable output size, this output size must be chosen when the hashing object is created, and this attribute must contain the selected size. Therefore None is *not* a legal value for this attribute.

block_size
    An integer value or ``NotImplemented``; the internal block size of the hash algorithm in bytes. The block size is used by the HMAC module to pad the secret key to block_size, or to hash the secret key if it is longer than block_size. If no HMAC algorithm is standardized for the hash algorithm, return ``NotImplemented`` instead.

name
    A text string value; the canonical, lowercase name of the hashing algorithm. The name should be a suitable parameter for :func:`hashlib.new`.

Hashing objects require the following methods:

copy()
    Return a separate copy of this hashing object. An update to this copy won't affect the original object.

digest()
    Return the hash value of this hashing object as a bytes object containing 8-bit data. The object is not altered in any way by this function; you can continue updating the object after calling this function.

hexdigest()
    Return the hash value of this hashing object as a string containing hexadecimal digits. Lowercase letters should be used for the digits 'a' through 'f'. Like the .digest() method, this method mustn't alter the object.
update(string) Hash bytes-like 'string' into the current state of the hashing object. update() can be called any number of times during a hashing object's lifetime. Hashing modules can define additional module-level functions or object methods and still be compliant with this specification. Here's an example, using a module named 'MD5': >>> import hashlib >>> from Crypto.Hash import MD5 >>> m = MD5.new() >>> isinstance(m, hashlib.CryptoHash) True >>> m.name 'md5' >>> m.digest_size 16 >>> m.block_size 64 >>> m.update(b'abc') >>> m.digest() b'\x90\x01P\x98<\xd2O\xb0\xd6\x96?}(\xe1\x7fr' >>> m.hexdigest() '900150983cd24fb0d6963f7d28e17f72' >>> MD5.new(b'abc').digest() b'\x90\x01P\x98<\xd2O\xb0\xd6\x96?}(\xe1\x7fr' Rationale The digest size is measured in bytes, not bits, even though hash algorithm sizes are usually quoted in bits; MD5 is a 128-bit algorithm and not a 16-byte one, for example. This is because, in the sample code I looked at, the length in bytes is often needed (to seek ahead or behind in a file; to compute the length of an output string) while the length in bits is rarely used. Therefore, the burden will fall on the few people actually needing the size in bits, who will have to multiply digest_size by 8. It's been suggested that the update() method would be better named append(). However, that method is really causing the current state of the hashing object to be updated, and update() is already used by the md5 and sha modules included with Python, so it seems simplest to leave the name update() alone. The order of the constructor's arguments for keyed hashes was a sticky issue. It wasn't clear whether the key should come first or second. It's a required parameter, and the usual convention is to place required parameters first, but that also means that the 'string' parameter moves from the first position to the second. 
It would be possible to get confused and pass a single argument to a keyed hash, thinking that you're passing an initial string to an unkeyed hash, but it doesn't seem worth making the interface for keyed hashes more obscure to avoid this potential error.

Changes from Version 1.0 to Version 2.0

Version 2.0 of API for Cryptographic Hash Functions clarifies some aspects of the API and brings it up-to-date. It also formalizes aspects that were already de-facto standards and are provided by most implementations. Version 2.0 introduces the following new attributes:

name
    The name property was made mandatory by :issue:`18532`.

block_size
    The new version also specifies that the return value ``NotImplemented`` prevents HMAC support.

Version 2.0 takes the separation of binary and text data in Python 3.0 into account. The 'string' argument to new() and update() as well as the 'key' argument must be bytes-like objects. On Python 2.x a hashing object may also support ASCII-only unicode. The actual name of the argument is not changed as it is part of the public API. Code may depend on the fact that the argument is called 'string'.

Recommended names for common hashing algorithms

algorithm    variant     recommended name
----------   ---------   ----------------
MD5                      md5
RIPEMD-160               ripemd160
SHA-1                    sha1
SHA-2        SHA-224     sha224
             SHA-256     sha256
             SHA-384     sha384
             SHA-512     sha512
SHA-3        SHA-3-224   sha3_224
             SHA-3-256   sha3_256
             SHA-3-384   sha3_384
             SHA-3-512   sha3_512
WHIRLPOOL                whirlpool

Changes

2001-09-17: Renamed clear() to reset(); added digest_size attribute to objects; added .hexdigest() method.
2001-09-20: Removed reset() method completely.
2001-09-28: Set digest_size to None for variable-size hashes.
2013-08-15: Added block_size and name attributes; clarified that 'string' actually refers to bytes-like objects.

Acknowledgements

Thanks to Aahz, Andrew Archibald, Rich Salz, Itamar Shtull-Trauring, and the readers of the python-crypto list for their comments on this PEP.
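As a quick sanity check of the attribute requirements formalized in version 2.0, Python's own hashlib objects already conform. The concrete numbers below are SHA-256's, shown purely for illustration (not part of the PEP text):

```python
import hashlib

h = hashlib.sha256()
assert h.name == 'sha256'    # canonical lowercase name, valid for hashlib.new()
assert h.digest_size == 32   # measured in bytes, not bits
assert h.block_size == 64    # internal block size used for HMAC key padding
assert hashlib.new(h.name).digest_size == h.digest_size
```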
Copyright This document has been placed in the public domain. Local Variables: mode: indented-text indent-tabs-mode: nil End: -------------- next part -------------- A non-text attachment was scrubbed... Name: hashlib_abc.py Type: text/x-python Size: 1355 bytes Desc: not available URL: From ezio.melotti at gmail.com Fri Aug 16 01:23:24 2013 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Fri, 16 Aug 2013 02:23:24 +0300 Subject: [Python-Dev] When to remove deprecated stuff (was: Deprecating the formatter module) In-Reply-To: References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> <20130815143603.3fe29f54@fsol> Message-ID: On Thu, Aug 15, 2013 at 3:40 PM, Brett Cannon wrote: > On Thu, Aug 15, 2013 at 8:36 AM, Antoine Pitrou wrote: > >> > A number of us (I don't know how many) have clearly been thinking about >> > "Python 4" as the time when we remove cruft. This will not cause any >> > backward compatibility issues for anyone who has paid heed to the >> > deprecation warnings, but will for those who haven't. >> >> Which is why we shouldn't silence deprecation warnings. >> > > What we should probably do is have unittest turn deprecations on by > default when running your tests but leave them silent otherwise. > http://bugs.python.org/issue10535 (I put the keys of the time machine back at their usual place) Best Regards, Ezio Melotti > I still think keeping them silent for the benefit of end-users is a good > thing as long as we make it easier for developers to switch on warnings > without thinking about it. > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Fri Aug 16 01:30:23 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 15 Aug 2013 18:30:23 -0500 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <20130815203132.08206dd1@fsol> References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> <520D199F.3000704@trueblade.com> <20130815182451.329612500CB@webabinitio.net> <20130815203132.08206dd1@fsol> Message-ID: +1 for the PEP in general from me, but using the underscore-based pseudo-namespace for the median variants. The attribute approach isn't *wrong*, just surprising enough that I think independent functions with the "median_" prefix in their name are a better idea. Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdmurray at bitdance.com Fri Aug 16 01:30:34 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Thu, 15 Aug 2013 19:30:34 -0400 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <7B2971E1-4BD1-477E-B554-ABEC6954E7FA@voidspace.org.uk> References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> <520D199F.3000704@trueblade.com> <7B2971E1-4BD1-477E-B554-ABEC6954E7FA@voidspace.org.uk> Message-ID: <20130815233034.98B962500C7@webabinitio.net> On Thu, 15 Aug 2013 23:28:39 +0300, Michael Foord wrote: > > On 15 Aug 2013, at 21:10, "Eric V. Smith" wrote: > > > On 08/15/2013 01:58 PM, Mark Dickinson wrote: > >> On Thu, Aug 15, 2013 at 2:08 PM, Steven D'Aprano > >> > wrote: > >> > >> > >> - Each scheme ended up needing to be a separate function, for ease > >> of both implementation and testing. So I had four private median > >> functions, which I put inside a class to act as namespace and avoid > >> polluting the main namespace. Then I needed a "master function" to > >> select which of the methods should be called, with all the > >> additional testing and documentation that entailed.
> >> > >> > >> That's just an implementation issue, though, and sounds like a minor > >> inconvenience to the implementor rather than anything serious; I don't > >> think that that should dictate the API that's used. > >> > >> - The API doesn't really feel very Pythonic to me. For example, we > >> write: > >> > >> > >> And I guess this is subjective: conversely, the API you're proposing > >> doesn't feel Pythonic to me. :-) I'd like the hear the opinion of other > >> python-dev readers. > > > > I agree with Mark: the proposed median, median.low, etc., doesn't feel > > right. Is there any example of doing this in the stdlib? I suggest just > > median(), median_low(), etc. > > > > If we do end up keeping it, simpler than the callable singleton is: > > > >>>> def median(): return 'median' > > ... > >>>> def _median_low(): return 'median.low' > > ... > >>>> median.low = _median_low > >>>> del _median_low > >>>> median() > > 'median' > >>>> median.low() > > 'median.low' > > > There's the patch decorator in unittest.mock which provides: > > patch(...) > patch.object(...) > patch.dict(...) > > The implementation is exactly as you suggest. (e.g. patch.object = _patch_object) Truthfully there are a number of things about the mock API that make me uncomfortable, including that one. But despite that I'm glad we didn't try to re-engineer it. Take that as you will :) --David From steve at pearwood.info Fri Aug 16 04:44:54 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 16 Aug 2013 12:44:54 +1000 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <520D199F.3000704@trueblade.com> References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> <520D199F.3000704@trueblade.com> Message-ID: <520D9226.1070706@pearwood.info> On 16/08/13 04:10, Eric V. Smith wrote: > I agree with Mark: the proposed median, median.low, etc., doesn't feel > right. Is there any example of doing this in the stdlib? 
The most obvious case is datetime: we have datetime(), and datetime.now(), datetime.today(), and datetime.strftime(). The only API difference between it and median is that datetime is a type and median is not, but that's a difference that makes no difference: both are callables, and being a type is an implementation detail. dict used to be a function that returned a type. Now it is a type. Implementation detail. Even builtins do this: dict() and dict.fromkeys(), for example. If you include unbound methods, nearly every type in Python uses the callable(), callable.method() API. I am truly perplexed by the opposition to the median API. It's a trivially small difference to a pattern you find everywhere. > If we do end up keeping it, simpler than the callable singleton is: > >>>> def median(): return 'median' > ... >>>> def _median_low(): return 'median.low' > ... >>>> median.low = _median_low >>>> del _median_low >>>> median() > 'median' >>>> median.low() > 'median.low' That is the implementation I currently have. Alexander has convinced me that attaching functions to functions in this way is sub-optimal, because help(median) doesn't notice the attributes, so I'm ruling this implementation out. My preference is to make median a singleton instance with a __call__ method, and the other flavours regular methods. Although I don't like polluting the global namespace with an unnecessary class that will only be instantiated once, if it helps I can do this: class _Median: def __call__(self, data): ... def low(self, data): ... median = _Median() If that standard OOP design is unacceptable, I will swap the dots for underscores, but I won't like it. 
-- Steven From steve at pearwood.info Fri Aug 16 05:07:56 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 16 Aug 2013 13:07:56 +1000 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <20130815182451.329612500CB@webabinitio.net> References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> <520D199F.3000704@trueblade.com> <20130815182451.329612500CB@webabinitio.net> Message-ID: <520D978C.4040805@pearwood.info> On 16/08/13 04:24, R. David Murray wrote: > I too prefer the median_low naming rather than median.low. I'm not > sure I can articulate why, but certainly the fact that the latter > isn't used anywhere else in the stdlib that I can think of is > probably a lot of it :) And the reason it's not used in the stdlib is because whenever somebody proposes doing so, python-dev says "but it's never been used in the stdlib before". *wink* > Perhaps the underlying thought is that we don't use classes as pure > function namespaces: we expect classes to be something more than > that. To be perfectly frank, I agree! Using a class is not my first preference, and I'm suspicious of singletons, but classes and instances are the only flexible namespace type we have short of modules and packages. We have nothing like C++ namespaces. (I have some ideas about that, but they are experimental and utterly not appropriate for first-time use in a std lib module.) Considering how long the namespaces line has been part of the Zen, Python is surprisingly inflexible when it comes to namespaces. There are classes, and modules, and nothing in-between. A separate module for median is too much. There are only four functions. If there were a dozen such functions, I'd push them out into a module, but using a full-blown package structure just for the sake of median is overkill.
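The callable-singleton design Steven defends could be fleshed out along the following lines. The method bodies are illustrative stand-ins supplied by the editor, not the PEP 450 reference implementation:

```python
class _Median:
    """Namespace class instantiated once; the instance itself is callable."""

    def __call__(self, data):
        data = sorted(data)
        n = len(data)
        if n == 0:
            raise ValueError("no median for empty data")
        if n % 2:
            return data[n // 2]
        i = n // 2
        return (data[i - 1] + data[i]) / 2  # average the two middle values

    def low(self, data):
        """Lower of the two middle values for even-length data."""
        data = sorted(data)
        if not data:
            raise ValueError("no median for empty data")
        return data[(len(data) - 1) // 2]

    def high(self, data):
        """Higher of the two middle values for even-length data."""
        data = sorted(data)
        if not data:
            raise ValueError("no median for empty data")
        return data[len(data) // 2]

median = _Median()

assert median([1, 3, 5, 7]) == 4.0
assert median.low([1, 3, 5, 7]) == 3
assert median.high([1, 3, 5, 7]) == 5
```

Note that dir() of the enclosing module would show only `median`; the variants are reachable solely through the instance, which is the discoverability concern raised elsewhere in the thread.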
It is possible to construct a module object on the fly, but I expect that would be even less welcome than a class, and besides, modules aren't callable, which leads to such ugly and error-prone constructions as datetime.datetime and friends. I won't impose "median.median" or "median.regular" on anyone :-) Anyway, this is my last defence of median.low() and friends. If consensus is still against it, I'll use underscores. (I will add a section in the PEP about it, one way or the other.) -- Steven From tjreedy at udel.edu Fri Aug 16 05:13:57 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 15 Aug 2013 23:13:57 -0400 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <520D9226.1070706@pearwood.info> References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> <520D199F.3000704@trueblade.com> <520D9226.1070706@pearwood.info> Message-ID: On 8/15/2013 10:44 PM, Steven D'Aprano wrote: > The most obvious case is datetime: we have > datetime.now(), datetime.today(), and datetime.strftime(). The only API > difference between it and median is that datetime is a type and median > is not, but that's a difference that makes no difference: I and several others see them as conceptually different in a way that makes a big difference. Datetime is a number structure with distinct properties and operations. The median of a set of values from a totally ordered set is the middle value (if there is an odd number). The median is a function, and the result of the function and the type of the result depend on the type of the inputs. The only complication is when there are an even number of items and the middle two cannot be averaged. I presume that is what median_low is about (pick the lower of the middle two). It is a variant function with a more general definition, not a method of a type. None of the above have anything to do with Python implementations.
-- Terry Jan Reedy From eric at trueblade.com Fri Aug 16 05:40:45 2013 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 15 Aug 2013 23:40:45 -0400 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <520D9226.1070706@pearwood.info> References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> <520D199F.3000704@trueblade.com> <520D9226.1070706@pearwood.info> Message-ID: <520D9F3D.3080504@trueblade.com> On 8/15/2013 10:44 PM, Steven D'Aprano wrote: > On 16/08/13 04:10, Eric V. Smith wrote: > >> I agree with Mark: the proposed median, median.low, etc., doesn't feel >> right. Is there any example of doing this in the stdlib? > > The most obvious case is datetime: we have > datetime.now(), datetime.today(), and datetime.strftime(). The only API > difference between it and median is that datetime is a type and median > is not, but that's a difference that makes no difference: both are > callables, and being a type is an implementation detail. dict used to be > a function that returned a type. Now it is a type. Implementation detail. > > Even builtins do this: dict() and dict.fromkeys(), for example. Except those classmethods are all alternate constructors for the class of which they're members (it's datetime.strptime, not .strftime). That's a not uncommon idiom. To me, that's a logical difference from the proposed median. I understand it's all just namespaces and callables, but I think the proposed median(), median.low(), etc. just confuse users and make things less discoverable. I'd expect dir(statistics) to tell me all of the available functions in the module. I wouldn't expect to need to look inside all of the returned functions to see what other functions exist. To see what I mean, look at help(itertools), and see how much harder it is to find chain.from_iterable than it is to find combinations_with_replacement. BTW, I'm +1 on adding the statistics module. -- Eric.
From eliben at gmail.com Fri Aug 16 06:14:59 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 15 Aug 2013 21:14:59 -0700 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <520D9226.1070706@pearwood.info> References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> <520D199F.3000704@trueblade.com> <520D9226.1070706@pearwood.info> Message-ID: On Thu, Aug 15, 2013 at 7:44 PM, Steven D'Aprano wrote: > On 16/08/13 04:10, Eric V. Smith wrote: > > I agree with Mark: the proposed median, median.low, etc., doesn't feel >> right. Is there any example of doing this in the stdlib? >> > > The most obvious case is datetime: we have datetime(), and datetime.now(), > datetime.today(), and datetime.strftime(). The only API difference between > it and median is that datetime is a type and median is not, but that's a > difference that makes no difference: both are callables, and being a type > is an implementation detail. dict used to be a function that returned a > type. Now it is a type. Implementation detail. > > Even builtins do this: dict() and dict.fromkeys(), for example. If you > include unbound methods, nearly every type in Python uses the callable(), > callable.method() API. I am truly perplexed by the opposition to the median > API. It's a trivially small difference to a pattern you find everywhere. Steven, this is a completely inappropriate comparison. datetime.now(), dict.fromkeys() and others are *factory methods*, also known as alternative constructors. This is a very common idiom in OOP, especially in languages where there is no explicit operator overloading for constructors (and even in those languages, like C++, this idiom is used above some level of complexity). This is totally unlike using a class as a namespace. The latter is unpythonic. If you need a namespace, use a module. If you don't need a namespace, then just use functions. Classes are the wrong tool to express the namespace abstraction in Python. 
Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From oscar.j.benjamin at gmail.com Fri Aug 16 09:47:55 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Fri, 16 Aug 2013 08:47:55 +0100 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <520CD2C6.8060103@pearwood.info> References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> Message-ID: On 15 August 2013 14:08, Steven D'Aprano wrote: > > - The API doesn't really feel very Pythonic to me. For example, we write: > > mystring.rjust(width) > dict.items() > > rather than mystring.justify(width, "right") or dict.iterate("items"). So I > think individual methods is a better API, and one which is more familiar to > most Python users. The only innovation (if that's what it is) is to have > median a callable object. Although you're talking about median() above I think that this same reasoning applies to the mode() signature. In the reference implementation it has the signature: def mode(data, max_modes=1): ... The behaviour is that with the default max_modes=1 it will return the unique mode or raise an error if there isn't a unique mode: >>> mode([1, 2, 3, 3]) 3 >>> mode([]) StatisticsError: no mode >>> mode([1, 1, 2, 3, 3]) AssertionError You can use the max_modes parameter to specify that more than one mode is acceptable and setting max_modes to 0 or None returns all modes no matter how many. In these cases mode() returns a list: >>> mode([1, 1, 2, 3, 3], max_modes=2) [1, 3] >>> mode([1, 1, 2, 3, 3], max_modes=None) [1, 3] I can't think of a situation where 1 or 2 modes are acceptable but 3 is not. The only forms I can imagine using are mode(data) to get the unique mode if it exists and mode(data, max_modes=None) to get the set of all modes. But for that usage it would be better to have a boolean flag and then either way you're at the point where it would normally become two functions. 
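Oscar's description of the reference implementation's max_modes behaviour could be sketched as follows. The Counter-based counting and the ValueError exceptions are the editor's assumptions; as Oscar shows, the reference code raised StatisticsError / AssertionError instead:

```python
from collections import Counter

def mode(data, max_modes=1):
    # Counter.items() keeps first-seen order on Python 3.7+.
    counts = Counter(data)
    if not counts:
        raise ValueError("no mode")
    top = max(counts.values())
    result = [value for value, count in counts.items() if count == top]
    if not max_modes:                     # 0 or None: return every mode
        return result
    if len(result) > max_modes:
        raise ValueError("more modes than max_modes allows")
    return result[0] if max_modes == 1 else result

assert mode([1, 2, 3, 3]) == 3                         # scalar for max_modes=1
assert mode([1, 2, 3, 3], max_modes=2) == [3]          # list, despite one mode
assert mode([1, 1, 2, 3, 3], max_modes=None) == [1, 3]
```

The last two assertions show the return-type inconsistency Oscar objects to: the same single-mode data comes back as a scalar or a list depending on the numeric value of max_modes.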
Also I dislike changing the return type based on special numeric values: >>> mode([1, 2, 3, 3], max_modes=0) [3] >>> mode([1, 2, 3, 3], max_modes=1) 3 >>> mode([1, 2, 3, 3], max_modes=2) [3] >>> mode([1, 2, 3, 3], max_modes=3) [3] My preference would be to have two functions, one called e.g. modes() and one called mode(). modes() always returns a list of the most frequent values no matter how many. mode() returns a unique mode if there is one or raises an error. I think that that would be simpler to document and easier to learn and use. If the user is for whatever reason happy with 1 or 2 modes but not 3 then they can call modes() and check for themselves. Also I think that: >>> modes([]) [] but I expect others to disagree. Oscar From fuzzyman at voidspace.org.uk Fri Aug 16 10:01:51 2013 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Fri, 16 Aug 2013 11:01:51 +0300 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <20130815233034.98B962500C7@webabinitio.net> References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> <520D199F.3000704@trueblade.com> <7B2971E1-4BD1-477E-B554-ABEC6954E7FA@voidspace.org.uk> <20130815233034.98B962500C7@webabinitio.net> Message-ID: <9C25CE44-10A6-414C-8F8C-859ACD9609F0@voidspace.org.uk> On 16 Aug 2013, at 02:30, R. David Murray wrote: > On Thu, 15 Aug 2013 23:28:39 +0300, Michael Foord wrote: >> >> On 15 Aug 2013, at 21:10, "Eric V. Smith" wrote: >> >>> On 08/15/2013 01:58 PM, Mark Dickinson wrote: >>>> On Thu, Aug 15, 2013 at 2:08 PM, Steven D'Aprano >>> > wrote: >>>> >>>> >>>> - Each scheme ended up needing to be a separate function, for ease >>>> of both implementation and testing. So I had four private median >>>> functions, which I put inside a class to act as namespace and avoid >>>> polluting the main namespace. Then I needed a "master function" to >>>> select which of the methods should be called, with all the >>>> additional testing and documentation that entailed. 
>>>> >>>> >>>> That's just an implementation issue, though, and sounds like a minor >>>> inconvenience to the implementor rather than anything serious; I don't >>>> think that that should dictate the API that's used. >>>> >>>> - The API doesn't really feel very Pythonic to me. For example, we >>>> write: >>>> >>>> >>>> And I guess this is subjective: conversely, the API you're proposing >>>> doesn't feel Pythonic to me. :-) I'd like the hear the opinion of other >>>> python-dev readers. >>> >>> I agree with Mark: the proposed median, median.low, etc., doesn't feel >>> right. Is there any example of doing this in the stdlib? I suggest just >>> median(), median_low(), etc. >>> >>> If we do end up keeping it, simpler than the callable singleton is: >>> >>>>>> def median(): return 'median' >>> ... >>>>>> def _median_low(): return 'median.low' >>> ... >>>>>> median.low = _median_low >>>>>> del _median_low >>>>>> median() >>> 'median' >>>>>> median.low() >>> 'median.low' >> >> >> There's the patch decorator in unittest.mock which provides: >> >> patch(...) >> patch.object(...) >> patch.dict(...) >> >> The implementation is exactly as you suggest. (e.g. patch.object = _patch_object) > > Truthfully there are a number of things about the mock API that make me > uncomfortable, including that one. But despite that I'm glad we > didn't try to re-engineer it. Take that as you will :) > Hah. mock used to provide separate patch and patch_object "functions" (they're really just factory functions for classes) but "patch.object" and "patch.dict" are easy to remember and you only have to import a single object instead of a proliferation. In my experience it's been a better API. The separate function was deprecated and removed a while ago. Other parts of the mock API and architecture are somewhat legacy - it's a six year old project with a lot of users, so it's somewhat inevitable. If starting from scratch I wouldn't do it *very* differently though. 
Michael > --David > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From solipsis at pitrou.net Fri Aug 16 10:51:26 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 16 Aug 2013 10:51:26 +0200 Subject: [Python-Dev] PEP 450 adding statistics module References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> <520D199F.3000704@trueblade.com> <520D9226.1070706@pearwood.info> Message-ID: <20130816105126.2721d39b@fsol> On Fri, 16 Aug 2013 12:44:54 +1000 Steven D'Aprano wrote: > On 16/08/13 04:10, Eric V. Smith wrote: > > > I agree with Mark: the proposed median, median.low, etc., doesn't feel > > right. Is there any example of doing this in the stdlib? > > The most obvious case is datetime: we have datetime(), and datetime.now(), datetime.today(), and datetime.strftime(). The only API difference between it and median is that datetime is a type and median is not, but that's a difference that makes no difference: Of course it does. The datetime classmethods return datetime instances, which is why it makes sense to have them classmethods (as opposed to module functions). The median functions, however, don't return median instances. > My preference is to make median a singleton instance with a __call__ method, and the other flavours regular methods. Although I don't like polluting the global namespace with an unnecessary class that will only be instantiated once, if it helps I can do this: > > class _Median: > def __call__(self, data): ... > def low(self, data): ... 
> > median = _Median() > > If that standard OOP design is unacceptable, I will swap the dots for underscores, but I won't like it. Using "OOP design" for something which is conceptually not OO (you are just providing callables in the end, not types and objects: your _Median "type" doesn't carry any state) is not really standard in Python. It would be in Java :-) Regards Antoine. From mark at hotpy.org Fri Aug 16 11:51:56 2013 From: mark at hotpy.org (Mark Shannon) Date: Fri, 16 Aug 2013 10:51:56 +0100 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <520CD2C6.8060103@pearwood.info> References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> Message-ID: <520DF63C.8010507@hotpy.org> On 15/08/13 14:08, Steven D'Aprano wrote: > On 15/08/13 21:42, Mark Dickinson wrote: >> The PEP and code look generally good to me. >> >> I think the API for median and its variants deserves some wider discussion: >> the reference implementation has a callable 'median', and variant callables >> 'median.low', 'median.high', 'median.grouped'. The pattern of attaching >> the variant callables as attributes on the main callable is unusual, and >> isn't something I've seen elsewhere in the standard library. I'd like to >> see some explanation in the PEP for why it's done this way. (There was >> already some discussion of this on the issue, but that was more centered >> around the implementation than the API.) >> >> I'd propose two alternatives for this: either have separate functions >> 'median', 'median_low', 'median_high', etc., or have a single function >> 'median' with a "method" argument that takes a string specifying >> computation using a particular method. I don't see a really good reason to >> deviate from standard patterns here, and fear that users would find the >> current API surprising. 
> Alexander Belopolsky has convinced me (off-list) that my current implementation is better changed to a more conservative one of a callable singleton instance with methods implementing the alternative computations. I'll have something like:
>
> def _singleton(cls):
>     return cls()
>
> @_singleton
> class median:
>     def __call__(self, data):
>         ...
>     def low(self, data):
>         ...
> ...

Horrible.

> In my earlier stats module, I had a single median function that took an argument to choose between alternatives. I called it "scheme":
>
> median(data, scheme="low")

What is wrong with this? It's a perfect API; simple and self-explanatory. median is a function in the mathematical sense and it should be a function in Python.

> R uses a parameter called "type" to choose between alternate calculations, not for median as we are discussing, but for quantiles:
>
> quantile(x, probs ... type = 7, ...).
>
> SAS also uses a similar system, but with different numeric codes. I rejected both "type" and "method" as the parameter name since it would cause confusion with the usual meanings of those words. I eventually decided against this system for two reasons:

There are other words to choose from ;) "scheme" seems OK to me.

> - Each scheme ended up needing to be a separate function, for ease of both implementation and testing. So I had four private median functions, which I put inside a class to act as namespace and avoid polluting the main namespace. Then I needed a "master function" to select which of the methods should be called, with all the additional testing and documentation that entailed.
>
> - The API doesn't really feel very Pythonic to me. For example, we write:
>
> mystring.rjust(width)
> dict.items()

These are methods on objects; the result of these calls depends on the value of the 'self' argument, not merely its class. Not so with a median singleton. We also have len(seq) and copy.copy(obj). No classes required.
> rather than mystring.justify(width, "right") or dict.iterate("items"). So I think individual methods are a better API, and one which is more familiar to most Python users. The only innovation (if that's what it is) is to have median be a callable object.
>
> As far as having four separate functions, median, median_low, etc., it just doesn't feel right to me. It puts four slight variations of the same function into the main namespace, instead of keeping them together in a namespace. Names like median_low merely simulate a namespace with pseudo-methods separated with underscores instead of dots, only without the advantages of a real namespace.
>
> (I treat variance and std dev differently, and make the sample and population forms separate top-level functions rather than methods, simply because they are so well-known from scientific calculators that it is unthinkable to me to do differently. Whenever I use numpy, I am surprised all over again that it has only a single variance function.)

From steve at pearwood.info Fri Aug 16 12:03:53 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 16 Aug 2013 20:03:53 +1000 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> Message-ID: <520DF909.9010803@pearwood.info> On 16/08/13 17:47, Oscar Benjamin wrote: > I can't think of a situation where 1 or 2 modes are acceptable but 3 is not. The only forms I can imagine using are mode(data) to get the unique mode if it exists and mode(data, max_modes=None) to get the set of all modes. Hmmm, I think you are right. The current design is leftover from when mode also supported continuous data, and it made more sense there. > But for that usage it would be better to have a boolean flag and then either way you're at the point where it would normally become two functions. Alright, you've convinced me.
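A minimal sketch of such a two-function split, built on `collections.Counter` (the names and exact behaviour here are hypothetical, not the statistics module's final API):

```python
from collections import Counter

def counts(data):
    # Collate data into (value, frequency) pairs, most frequent first.
    # Hypothetical helper name; the final name was still being bike-shedded.
    return Counter(data).most_common()

def mode(data):
    # Return the single most common value, or raise if there isn't a
    # unique one (empty data, or a tie for most common).
    table = counts(data)
    if not table:
        raise ValueError("no mode for empty data")
    if len(table) > 1 and table[0][1] == table[1][1]:
        raise ValueError("no unique mode")
    return table[0][0]

print(mode([1, 1, 2]))   # 1
print(counts("aab"))     # [('a', 2), ('b', 1)]
```

`Counter.most_common()` already does the collation work, which is why the discussion keeps circling back to `Counter`.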
I'll provide two functions: mode, which returns the single value with the highest frequency, or raises; and a second function, which collates the data into a sorted (value, frequency) list. Bike-shedding on the name of this second function is welcomed :-) -- Steven From oscar.j.benjamin at gmail.com Fri Aug 16 12:43:16 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Fri, 16 Aug 2013 11:43:16 +0100 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <520DF909.9010803@pearwood.info> References: <520C2E23.40405@pearwood.info> <520CD2C6.8060103@pearwood.info> <520DF909.9010803@pearwood.info> Message-ID: On Aug 16, 2013 11:05 AM, "Steven D'Aprano" wrote: > > I'll provide two functions: mode, which returns the single value with the highest frequency, or raises; and a second function, which collates the data into a sorted (value, frequency) list. Bike-shedding on the name of this second function is welcomed :-) I'd call it counts() and prefer an OrderedDict for easy lookup. By that point you're very close to Counter though (which it currently uses internally). Oscar -------------- next part -------------- An HTML attachment was scrubbed... URL: From status at bugs.python.org Fri Aug 16 18:07:41 2013 From: status at bugs.python.org (Python tracker) Date: Fri, 16 Aug 2013 18:07:41 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20130816160741.64C0856A77@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2013-08-09 - 2013-08-16) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. 
Issues counts and deltas: open 4152 ( +4) closed 26377 (+56) total 30529 (+60) Open issues with patches: 1896 Issues opened (51) ================== #17477: update the bsddb module do build with db 5.x versions http://bugs.python.org/issue17477 reopened by jcea #18693: help() not helpful with enum http://bugs.python.org/issue18693 reopened by ethan.furman #18697: Unify arguments names in Unicode object C API documentation http://bugs.python.org/issue18697 opened by serhiy.storchaka #18699: What is Future.running() for in PEP 3148 / concurrent.futures. http://bugs.python.org/issue18699 opened by gvanrossum #18701: Remove outdated PY_VERSION_HEX checks http://bugs.python.org/issue18701 opened by serhiy.storchaka #18702: Report skipped tests as skipped http://bugs.python.org/issue18702 opened by serhiy.storchaka #18703: To change the doc of html/faq/gui.html http://bugs.python.org/issue18703 opened by madan.ram #18704: IDLE: PEP8 Style Check Integration http://bugs.python.org/issue18704 opened by JayKrish #18705: Fix typos/spelling mistakes in Lib/*.py files http://bugs.python.org/issue18705 opened by iwontbecreative #18706: test failure in test_codeccallbacks http://bugs.python.org/issue18706 opened by pitrou #18707: the readme should also talk about how to build doc. http://bugs.python.org/issue18707 opened by madan.ram #18709: SSL module fails to handle NULL bytes inside subjectAltNames g http://bugs.python.org/issue18709 opened by christian.heimes #18710: Add PyState_GetModuleAttr http://bugs.python.org/issue18710 opened by pitrou #18711: Add PyErr_FormatV http://bugs.python.org/issue18711 opened by pitrou #18712: Pure Python operator.index doesn't match the C version. 
http://bugs.python.org/issue18712 opened by mark.dickinson #18713: Enable surrogateescape on stdin and stdout when appropriate http://bugs.python.org/issue18713 opened by ncoghlan #18714: Add tests for pdb.find_function http://bugs.python.org/issue18714 opened by kevinjqiu #18715: Tests fail when run with coverage http://bugs.python.org/issue18715 opened by seydou #18716: Deprecate the formatter module http://bugs.python.org/issue18716 opened by brett.cannon #18717: test for request.urlretrieve http://bugs.python.org/issue18717 opened by mjehanzeb #18718: datetime documentation contradictory on leap second support http://bugs.python.org/issue18718 opened by wolever #18720: Switch suitable constants in the socket module to IntEnum http://bugs.python.org/issue18720 opened by eli.bendersky #18723: shorten function of textwrap module is susceptible to non-norm http://bugs.python.org/issue18723 opened by vajrasky #18725: Multiline shortening http://bugs.python.org/issue18725 opened by serhiy.storchaka #18726: json functions have too many positional parameters http://bugs.python.org/issue18726 opened by serhiy.storchaka #18727: test for writing dictionary rows to CSV http://bugs.python.org/issue18727 opened by mjehanzeb #18728: Increased test coverage for filecmp.py http://bugs.python.org/issue18728 opened by Alex.Volkov #18729: In unittest.TestLoader.discover doc select the name of load_te http://bugs.python.org/issue18729 opened by py.user #18730: suffix parameter in NamedTemporaryFile silently fails when not http://bugs.python.org/issue18730 opened by dloewenherz #18731: Increased test coverage for uu and telnet http://bugs.python.org/issue18731 opened by Alex.Volkov #18733: elementtree: stop the parser more quickly on error http://bugs.python.org/issue18733 opened by haypo #18734: Berkeley DB versions 4.4-4.9 are not discovered by setup.py http://bugs.python.org/issue18734 opened by Eddie.Stanley #18736: Invalid charset in HTML pages inside documentation in CHM form 
http://bugs.python.org/issue18736 opened by grv87 #18737: Get virtual subclasses of an ABC http://bugs.python.org/issue18737 opened by christian.heimes #18738: String formatting (% and str.format) issues with Enum http://bugs.python.org/issue18738 opened by ethan.furman #18739: math.log of a long returns a different value of math.log of an http://bugs.python.org/issue18739 opened by gregory.p.smith #18741: Fix typos/spelling mistakes in Lib/*/*/.py files http://bugs.python.org/issue18741 opened by iwontbecreative #18742: Abstract base class for hashlib http://bugs.python.org/issue18742 opened by christian.heimes #18743: References to non-existant "StringIO" module http://bugs.python.org/issue18743 opened by jcea #18744: pathological performance using tarfile http://bugs.python.org/issue18744 opened by teamnoir #18745: Test enum in test_json is ignorant of infinity value http://bugs.python.org/issue18745 opened by vajrasky #18746: test_threading.test_finalize_with_trace() fails on FreeBSD bui http://bugs.python.org/issue18746 opened by haypo #18747: Re-seed OpenSSL's PRNG after fork http://bugs.python.org/issue18747 opened by christian.heimes #18748: libgcc_s.so.1 must be installed for pthread_cancel to work http://bugs.python.org/issue18748 opened by ionel.mc #18750: '' % [1] doesn't fail http://bugs.python.org/issue18750 opened by asvetlov #18751: A manager's server never joins its threads http://bugs.python.org/issue18751 opened by pitrou #18752: Make chain.from_iterable an alias for a new chain_iterable. http://bugs.python.org/issue18752 opened by terry.reedy #18753: [c]ElementTree.fromstring fails to parse ]]> http://bugs.python.org/issue18753 opened by kees #18754: Run Python child processes in isolated mode in the test suite? 
http://bugs.python.org/issue18754 opened by haypo #18755: imp read functions do not try to re-open files that have been http://bugs.python.org/issue18755 opened by brett.cannon #18756: os.urandom() fails under high load http://bugs.python.org/issue18756 opened by christian.heimes Most recent 15 issues with no replies (15) ========================================== #18755: imp read functions do not try to re-open files that have been http://bugs.python.org/issue18755 #18752: Make chain.from_iterable an alias for a new chain_iterable. http://bugs.python.org/issue18752 #18751: A manager's server never joins its threads http://bugs.python.org/issue18751 #18746: test_threading.test_finalize_with_trace() fails on FreeBSD bui http://bugs.python.org/issue18746 #18745: Test enum in test_json is ignorant of infinity value http://bugs.python.org/issue18745 #18741: Fix typos/spelling mistakes in Lib/*/*/.py files http://bugs.python.org/issue18741 #18736: Invalid charset in HTML pages inside documentation in CHM form http://bugs.python.org/issue18736 #18733: elementtree: stop the parser more quickly on error http://bugs.python.org/issue18733 #18731: Increased test coverage for uu and telnet http://bugs.python.org/issue18731 #18730: suffix parameter in NamedTemporaryFile silently fails when not http://bugs.python.org/issue18730 #18729: In unittest.TestLoader.discover doc select the name of load_te http://bugs.python.org/issue18729 #18714: Add tests for pdb.find_function http://bugs.python.org/issue18714 #18711: Add PyErr_FormatV http://bugs.python.org/issue18711 #18701: Remove outdated PY_VERSION_HEX checks http://bugs.python.org/issue18701 #18697: Unify arguments names in Unicode object C API documentation http://bugs.python.org/issue18697 Most recent 15 issues waiting for review (15) ============================================= #18754: Run Python child processes in isolated mode in the test suite? 
http://bugs.python.org/issue18754 #18747: Re-seed OpenSSL's PRNG after fork http://bugs.python.org/issue18747 #18746: test_threading.test_finalize_with_trace() fails on FreeBSD bui http://bugs.python.org/issue18746 #18745: Test enum in test_json is ignorant of infinity value http://bugs.python.org/issue18745 #18743: References to non-existant "StringIO" module http://bugs.python.org/issue18743 #18742: Abstract base class for hashlib http://bugs.python.org/issue18742 #18741: Fix typos/spelling mistakes in Lib/*/*/.py files http://bugs.python.org/issue18741 #18739: math.log of a long returns a different value of math.log of an http://bugs.python.org/issue18739 #18738: String formatting (% and str.format) issues with Enum http://bugs.python.org/issue18738 #18737: Get virtual subclasses of an ABC http://bugs.python.org/issue18737 #18731: Increased test coverage for uu and telnet http://bugs.python.org/issue18731 #18729: In unittest.TestLoader.discover doc select the name of load_te http://bugs.python.org/issue18729 #18728: Increased test coverage for filecmp.py http://bugs.python.org/issue18728 #18727: test for writing dictionary rows to CSV http://bugs.python.org/issue18727 #18723: shorten function of textwrap module is susceptible to non-norm http://bugs.python.org/issue18723 Top 10 most discussed issues (10) ================================= #18738: String formatting (% and str.format) issues with Enum http://bugs.python.org/issue18738 49 msgs #18720: Switch suitable constants in the socket module to IntEnum http://bugs.python.org/issue18720 18 msgs #18606: Add statistics module to standard library http://bugs.python.org/issue18606 12 msgs #18710: Add PyState_GetModuleAttr http://bugs.python.org/issue18710 10 msgs #18693: help() not helpful with enum http://bugs.python.org/issue18693 9 msgs #18706: test failure in test_codeccallbacks http://bugs.python.org/issue18706 9 msgs #18723: shorten function of textwrap module is susceptible to non-norm 
http://bugs.python.org/issue18723 9 msgs #18747: Re-seed OpenSSL's PRNG after fork http://bugs.python.org/issue18747 9 msgs #8713: multiprocessing needs option to eschew fork() under Linux http://bugs.python.org/issue8713 8 msgs #18647: re.error: nothing to repeat http://bugs.python.org/issue18647 8 msgs Issues closed (54) ================== #3526: Customized malloc implementation on SunOS and AIX http://bugs.python.org/issue3526 closed by pitrou #6132: Implement the GIL with critical sections in Windows http://bugs.python.org/issue6132 closed by pitrou #8112: xmlrpc.server: ServerHTMLDoc.docroutine uses (since 3.0) depre http://bugs.python.org/issue8112 closed by r.david.murray #12015: possible characters in temporary file name is too few http://bugs.python.org/issue12015 closed by python-dev #12075: python3.2 memory leak when reloading class with attributes http://bugs.python.org/issue12075 closed by pitrou #12645: test.support. import_fresh_module - incorrect doc http://bugs.python.org/issue12645 closed by python-dev #13647: Python SSL stack doesn't securely validate certificate (as cli http://bugs.python.org/issue13647 closed by pitrou #15581: curses: segfault in addstr() http://bugs.python.org/issue15581 closed by haypo #16499: CLI option for isolated mode http://bugs.python.org/issue16499 closed by christian.heimes #17701: Improving strftime documentation http://bugs.python.org/issue17701 closed by wolever #18090: dict_contains first argument declared register, and shouldn't http://bugs.python.org/issue18090 closed by larry #18121: antigravity leaks subprocess.Popen object http://bugs.python.org/issue18121 closed by pitrou #18226: IDLE Unit test for FormatParagrah.py http://bugs.python.org/issue18226 closed by terry.reedy #18264: enum.IntEnum is not compatible with JSON serialisation http://bugs.python.org/issue18264 closed by python-dev #18268: ElementTree.fromstring non-deterministically gives unicode tex http://bugs.python.org/issue18268 closed by 
eli.bendersky #18296: test_os.test_trailers() is failing on AMD64 FreeBSD 9.0 dtrace http://bugs.python.org/issue18296 closed by haypo #18367: See if a venv setup can be used for devinabox for coverage http://bugs.python.org/issue18367 closed by brett.cannon #18405: crypt.mksalt() result has unnecessarily low entropy http://bugs.python.org/issue18405 closed by haypo #18425: IDLE Unit test for IdleHistory.py http://bugs.python.org/issue18425 closed by terry.reedy #18451: Omit test files in devinabox coverage run http://bugs.python.org/issue18451 closed by brett.cannon #18453: There are unused variables inside DateTimeTestCase class in te http://bugs.python.org/issue18453 closed by ezio.melotti #18465: There are unused variables and unused import in Lib/test/test_ http://bugs.python.org/issue18465 closed by ezio.melotti #18483: Add hour-24 type of time to test_http2time_formats in test_htt http://bugs.python.org/issue18483 closed by ezio.melotti #18484: No unit test for iso2time function from http.cookiejar module http://bugs.python.org/issue18484 closed by ezio.melotti #18501: _elementtree.c calls Python callbacks while a Python exception http://bugs.python.org/issue18501 closed by haypo #18505: Duplicate function names in test_email.py http://bugs.python.org/issue18505 closed by ezio.melotti #18516: Typos in Lib/email/generator.py and Lib/email/architecture.rst http://bugs.python.org/issue18516 closed by ezio.melotti #18579: Dereference after NULL check in listobject.c merge_hi() http://bugs.python.org/issue18579 closed by christian.heimes #18585: Add a text truncation function http://bugs.python.org/issue18585 closed by pitrou #18598: Importlib, more verbosity please http://bugs.python.org/issue18598 closed by brett.cannon #18600: email.policy doc example passes 'policy' to as_string, but tha http://bugs.python.org/issue18600 closed by r.david.murray #18609: test_ctypes failure on AIX in PyEval_CallObjectWithKeywords http://bugs.python.org/issue18609 closed by 
haypo #18655: GUI apps take long to launch on Windows http://bugs.python.org/issue18655 closed by terry.reedy #18660: os.read behavior on Linux http://bugs.python.org/issue18660 closed by benjamin.peterson #18663: In unittest.TestCase.assertAlmostEqual doc specify the delta d http://bugs.python.org/issue18663 closed by ezio.melotti #18667: missing HAVE_FCHOWNAT http://bugs.python.org/issue18667 closed by larry #18673: Add O_TMPFILE to os module http://bugs.python.org/issue18673 closed by christian.heimes #18676: Queue: document that zero is accepted as timeout value http://bugs.python.org/issue18676 closed by terry.reedy #18680: JSONDecoder should document that it raises a ValueError for ma http://bugs.python.org/issue18680 closed by wolever #18681: typo in imp.reload http://bugs.python.org/issue18681 closed by ezio.melotti #18687: Lib/test/leakers/test_ctypes.py still mentions the need to upd http://bugs.python.org/issue18687 closed by ezio.melotti #18689: add argument for formatter to logging.Handler and subclasses i http://bugs.python.org/issue18689 closed by vinay.sajip #18696: In unittest.TestCase.longMessage doc remove a redundant senten http://bugs.python.org/issue18696 closed by ezio.melotti #18698: importlib.reload() does not return the module in sys.modules http://bugs.python.org/issue18698 closed by eric.snow #18700: test_cgi raises ResourceWarning http://bugs.python.org/issue18700 closed by madison.may #18708: Change required in python 3.4 interpretor . 
http://bugs.python.org/issue18708 closed by mark.dickinson #18719: Remove false optimization for equality comparison of hashed st http://bugs.python.org/issue18719 closed by rhettinger #18721: test for FTP cwd function http://bugs.python.org/issue18721 closed by orsenthil #18722: Remove uses of the register keyword http://bugs.python.org/issue18722 closed by pitrou #18724: Typo in docs.python.org: smtplib python2.7 http://bugs.python.org/issue18724 closed by ned.deily #18732: IdleHistory.History: eliminate unused parameter; other cleanup http://bugs.python.org/issue18732 closed by terry.reedy #18735: SSL/TLS pinning for the ssl module http://bugs.python.org/issue18735 closed by raymontag #18740: str is number methods don't recognize '.' http://bugs.python.org/issue18740 closed by brett.cannon #18749: [issue 18606] Re: Add statistics module to standard library http://bugs.python.org/issue18749 closed by ezio.melotti From peter.a.portante at gmail.com Sat Aug 17 13:55:32 2013 From: peter.a.portante at gmail.com (Peter Portante) Date: Sat, 17 Aug 2013 07:55:32 -0400 Subject: [Python-Dev] Should the default Python 2.7 web page mention 2.7.5 instead of 2.7.2? Message-ID: See http://www.python.org/download/releases/2.7/ as of Saturday, August 17th, 2013. "Note: A bugfix release, 2.7.2, is currently available. Its use is recommended." Kinds regards, -peter -------------- next part -------------- An HTML attachment was scrubbed... URL: From benjamin at python.org Sat Aug 17 17:19:28 2013 From: benjamin at python.org (Benjamin Peterson) Date: Sat, 17 Aug 2013 08:19:28 -0700 Subject: [Python-Dev] Should the default Python 2.7 web page mention 2.7.5 instead of 2.7.2? In-Reply-To: References: Message-ID: Fixed, although hopefully there are few links to that page, since it's really the release page of 2.7.0. 2013/8/17 Peter Portante : > See http://www.python.org/download/releases/2.7/ as of Saturday, August > 17th, 2013. 
> > "Note: A bugfix release, 2.7.2, is currently available. Its use is > recommended." > > Kinds regards, -peter > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/benjamin%40python.org > -- Regards, Benjamin From steve at pearwood.info Sat Aug 17 19:02:28 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 18 Aug 2013 03:02:28 +1000 Subject: [Python-Dev] Tracker meta-issue Message-ID: <520FACA4.6040103@pearwood.info> Is this the right place to request somebody look at an issue in the meta-tracker? http://psf.upfronthosting.co.za/roundup/meta/issue517 Replying to reviews gives an AttributeError exception. This was reported three months ago, and is currently unassigned. Actually, on closer look, it seems to be the same issue as here: http://psf.upfronthosting.co.za/roundup/meta/issue484 which was ten months ago. Thanks, Steven From peter.a.portante at gmail.com Sat Aug 17 19:11:56 2013 From: peter.a.portante at gmail.com (Peter Portante) Date: Sat, 17 Aug 2013 13:11:56 -0400 Subject: [Python-Dev] Should the default Python 2.7 web page mention 2.7.5 instead of 2.7.2? In-Reply-To: References: Message-ID: FWIW: The first hit that I saw googling "Python 2.7" brought up that page. On Sat, Aug 17, 2013 at 11:19 AM, Benjamin Peterson wrote: > Fixed, although hopefully there are few links to that page, since it's > really the release page of 2.7.0. > > 2013/8/17 Peter Portante : > > See http://www.python.org/download/releases/2.7/ as of Saturday, August > > 17th, 2013. > > > > "Note: A bugfix release, 2.7.2, is currently available. Its use is > > recommended." 
> > > > Kinds regards, -peter > > > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > http://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: > > http://mail.python.org/mailman/options/python-dev/benjamin%40python.org > > > > > > -- > Regards, > Benjamin > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Sat Aug 17 20:23:44 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 17 Aug 2013 20:23:44 +0200 Subject: [Python-Dev] Tracker meta-issue References: <520FACA4.6040103@pearwood.info> Message-ID: <20130817202344.536ba3d2@fsol> On Sun, 18 Aug 2013 03:02:28 +1000 Steven D'Aprano wrote: > Is this the right place to request somebody look at an issue in the meta-tracker? > > http://psf.upfronthosting.co.za/roundup/meta/issue517 Officially, I suppose the "right place" should be the tracker-discuss mailing-list, but both Martin and Ezio are certainly following python-dev. Also, given that it significantly affects your ability to work on your patch, I think it's fair that you ping this list, anyway. Regards Antoine. > > Replying to reviews gives an AttributeError exception. This was reported three months ago, and is currently unassigned. > > Actually, on closer look, it seems to be the same issue as here: > > http://psf.upfronthosting.co.za/roundup/meta/issue484 > > which was ten months ago. > > > > > Thanks, > > Steven From solipsis at pitrou.net Sat Aug 17 20:42:28 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 17 Aug 2013 20:42:28 +0200 Subject: [Python-Dev] cpython: Use a known unique object for the dummy entry. 
References: <3cHGPr73nVz7LjQ@mail.python.org> Message-ID: <20130817204228.2b2a6bf1@fsol> On Sat, 17 Aug 2013 11:32:00 +0200 (CEST) raymond.hettinger wrote: > http://hg.python.org/cpython/rev/2c9a2b588a89 > changeset: 85218:2c9a2b588a89 > user: Raymond Hettinger > date: Sat Aug 17 02:31:53 2013 -0700 > summary: > Use a known unique object for the dummy entry. > > This lets us run PyObject_RichCompareBool() without first needing to check whether the entry is a dummy. > > files: > Objects/setobject.c | 45 ++++++++++++++------------------ > 1 files changed, 20 insertions(+), 25 deletions(-)

This broke test_gdb on several machines:

======================================================================
FAIL: test_sets (test.test_gdb.PrettyPrintTests)
Verify the pretty-printing of sets
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/antoine/cpython/default/Lib/test/test_gdb.py", line 319, in test_sets
    self.assertEqual(gdb_repr, "{'b'}")
AssertionError: "{, 'b'}" != "{'b'}"
- {, 'b'}
+ {'b'}

Obviously the pretty-printing of sets isn't able to recognize the dummy from regular set contents, anymore :-) It should be fixable, but I don't know how. Regards Antoine. From arigo at tunes.org Sun Aug 18 09:03:56 2013 From: arigo at tunes.org (Armin Rigo) Date: Sun, 18 Aug 2013 09:03:56 +0200 Subject: [Python-Dev] cpython: Use a known unique object for the dummy entry. In-Reply-To: <20130817204228.2b2a6bf1@fsol> References: <3cHGPr73nVz7LjQ@mail.python.org> <20130817204228.2b2a6bf1@fsol> Message-ID: Hi, On Sat, Aug 17, 2013 at 8:42 PM, Antoine Pitrou wrote: >> summary: >> Use a known unique object for the dummy entry. Another issue with this change: the dummy object should be of a dummy subclass of 'object', rather than of 'object' itself. When it is 'object' itself, a custom __eq__() method will be called, sending what should be the dummy object to the pure Python code explicitly, as in the example below.
This is bad because ---in addition to calling __eq__() with unexpected arguments, which might break some code--- we could then take the dummy object, and try to insert it into another set...

class A(object):
    def __init__(self, hash):
        self.hash = hash
    def __eq__(self, other):
        print("seen!", self, other)
        return False
    def __hash__(self):
        return self.hash

a1 = A(1)
a2 = A(2)
s = {a1, a2}
s.remove(a2)
A(2) in s

À bientôt, Armin. From solipsis at pitrou.net Sun Aug 18 13:53:42 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 18 Aug 2013 13:53:42 +0200 Subject: [Python-Dev] cpython: Use a known unique object for the dummy entry. In-Reply-To: References: <3cHGPr73nVz7LjQ@mail.python.org> <20130817204228.2b2a6bf1@fsol> Message-ID: <20130818135342.646ab4ed@fsol> On Sun, 18 Aug 2013 09:03:56 +0200 Armin Rigo wrote: > Hi, > > On Sat, Aug 17, 2013 at 8:42 PM, Antoine Pitrou wrote: > >> summary: > >> Use a known unique object for the dummy entry. > > Another issue with this change: the dummy object should be of a dummy > subclass of 'object', rather than of 'object' itself. When it is > 'object' itself, a custom __eq__() method will be called, sending what > should be the dummy object to the pure Python code explicitly, as in > the example below. This is bad because ---in addition to calling > __eq__() with unexpected arguments, which might break some code--- we > could then take the dummy object, and try to insert it into another > set... Indeed. Also, any non-trivial __eq__ will start receiving unexpected objects and may break in mysterious ways... Regards Antoine. From mark at hotpy.org Sun Aug 18 18:31:24 2013 From: mark at hotpy.org (Mark Shannon) Date: Sun, 18 Aug 2013 17:31:24 +0100 Subject: [Python-Dev] cpython: Use a known unique object for the dummy entry.
In-Reply-To: <20130817204228.2b2a6bf1@fsol> References: <3cHGPr73nVz7LjQ@mail.python.org> <20130817204228.2b2a6bf1@fsol> Message-ID: <5210F6DC.9020002@hotpy.org> On 17/08/13 19:42, Antoine Pitrou wrote: > On Sat, 17 Aug 2013 11:32:00 +0200 (CEST) > raymond.hettinger wrote: > >> http://hg.python.org/cpython/rev/2c9a2b588a89 >> changeset: 85218:2c9a2b588a89 >> user: Raymond Hettinger >> date: Sat Aug 17 02:31:53 2013 -0700 >> summary: >> Use a known unique object for the dummy entry. >> >> This lets us run PyObject_RichCompareBool() without >> first needing to check whether the entry is a dummy. >> >> files: >> Objects/setobject.c | 45 ++++++++++++++------------------ >> 1 files changed, 20 insertions(+), 25 deletions(-) > > This broke test_gdb on several machines: > > ====================================================================== > FAIL: test_sets (test.test_gdb.PrettyPrintTests) > Verify the pretty-printing of sets > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/home/antoine/cpython/default/Lib/test/test_gdb.py", line 319, > in test_sets self.assertEqual(gdb_repr, "{'b'}") > AssertionError: "{, 'b'}" != "{'b'}" > - {, 'b'} > + {'b'} > > > Obviously the pretty-printing of sets isn't able to recognize the dummy > from regular set contents, anymore :-) It should be fixable, but I > don't know how. By giving the dummy object a custom type, the dummy object can be recognised by testing that its type equals PySetDummy_Type (or whatever it is called) See dictobject.c for an implementation of a suitable dummy object. Cheers, Mark. > > Regards > > Antoine. 
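To illustrate Mark's point at the Python level (a sketch of the idea only; the real fix would be in C, and `PySetDummy_Type` is his placeholder name): a sentinel with its own dedicated type can be recognised by type alone, which is what a gdb pretty-printer needs, since it inspects types rather than object identities.

```python
class _SetDummy:
    # Dedicated sentinel type: nothing else ever has this type, so a
    # type check is enough to recognise the dummy. This is a hypothetical
    # Python-level analogue of the C-level suggestion, not CPython code.
    __slots__ = ()
    def __repr__(self):
        return "<dummy key>"

DUMMY = _SetDummy()

def is_dummy(obj):
    # A bare object() sentinel can only be recognised by identity
    # (obj is DUMMY); a dedicated type can also be recognised by type,
    # which external tools inspecting memory (like gdb) can do.
    return type(obj) is _SetDummy

print(is_dummy(DUMMY), is_dummy(object()))  # True False
```

This is also how dictobject.c distinguishes its own dummy, which is why sharing one sentinel between the two files comes up next.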
> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/mark%40hotpy.org > From solipsis at pitrou.net Sun Aug 18 20:46:26 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 18 Aug 2013 20:46:26 +0200 Subject: [Python-Dev] cpython: Use a known unique object for the dummy entry. References: <3cHGPr73nVz7LjQ@mail.python.org> <20130817204228.2b2a6bf1@fsol> <5210F6DC.9020002@hotpy.org> Message-ID: <20130818204626.04eac6ea@fsol> On Sun, 18 Aug 2013 17:31:24 +0100 Mark Shannon wrote: > > By giving the dummy object a custom type, the dummy object can be > recognised by testing that its type equals PySetDummy_Type (or > whatever it is called) > > See dictobject.c for an implementation of a suitable dummy object. The most reasonable thing to do would probably be to share the same dummy object between setobject.c and dictobject.c, then. Raymond, it would be nice if you could take a look! Regards Antoine. From arnaud.fontaine at nexedi.com Sat Aug 17 10:51:06 2013 From: arnaud.fontaine at nexedi.com (Arnaud Fontaine) Date: Sat, 17 Aug 2013 17:51:06 +0900 Subject: [Python-Dev] Dealing with import lock deadlock in Import Hooks In-Reply-To: <20130814101529.07697e45@pitrou.net> (Antoine Pitrou's message of "Wed, 14 Aug 2013 10:15:29 +0200") References: <87a9knxsqv.fsf@duckcorp.org> <877gfqqr7o.fsf@duckcorp.org> <20130813085000.6c686907@fsol> <87zjsmnged.fsf@duckcorp.org> <20130813173147.1603e342@pitrou.net> <87bo50onp4.fsf@duckcorp.org> <20130814101529.07697e45@pitrou.net> Message-ID: <87bo4whf9h.fsf@duckcorp.org> Antoine Pitrou writes: > Le Wed, 14 Aug 2013 14:17:59 +0900, Arnaud Fontaine a écrit : >> From my understanding of import.c source code, until something is >> added to sys.modules or the code loaded, there should be no >> side-effect to releasing the lock, right?
(eg there is no global >> variables/data being shared for importing modules, meaning that >> releasing the lock should be safe as long as the modules loaded >> through import hooks are protected by a lock) > > Er, probably, but import.c is a nasty pile of code. > It's true the import lock is there mainly to: > - avoid incomplete modules from being seen by other threads > - avoid a module from being executed twice Yes. Hopefully, the implementation in Python 3.3 should be much better! ;-) > But that doesn't mean it can't have had any other - unintended - > benefits ;-) Indeed, that's why I checked the source code, but I will check again anyway to make sure. > (also, some import hooks might not be thread-safe, something which they > haven't had to bother about until now) Good point, I didn't think about that. Thanks! Regards, -- Arnaud Fontaine From tim.peters at gmail.com Mon Aug 19 19:51:41 2013 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 19 Aug 2013 12:51:41 -0500 Subject: [Python-Dev] hg verify warnings Message-ID: > ... > $ hg verify > repository uses revlog format 1 > checking changesets > checking manifests > crosschecking files in changesets and manifests > checking files > warning: copy source of 'Modules/_threadmodule.c' not in parents of 60ad83716733 > warning: copy source of 'Objects/bytesobject.c' not in parents of 64bb1d258322 > warning: copy source of 'Objects/stringobject.c' not in parents of 357e268e7c5f > 9799 files, 79660 changesets, 176851 total revisions > 3 warnings encountered! > > > $ hg --version > Mercurial Distributed SCM (version 2.3.2) > (see http://mercurial.selenic.com for more information) FYI, I found this kind of warning in my own (non-Python) repository, created from scratch just a few weeks ago, so it's NOT necessarily the case that this has something to do with using ancient hg releases. I was teaching myself hg at the time, and suspect I did something hg didn't expect <0.5 wink>. 
Here's a tiny repository that displays the same kind of thing: Make a new repository (this is on Windows - does it matter? doubt it): C:\>hg init HHH C:\>cd HHH Hg is up to date (2.7): C:\HHH>hg version Mercurial Distributed SCM (version 2.7) (see http://mercurial.selenic.com for more information) ... Make a subdirectory and add a file: C:\HHH>mkdir sub C:\HHH>cd sub C:\HHH\sub>echo a > a.txt C:\HHH\sub>hg add adding a.txt C:\HHH\sub>hg ci -m initial Move the file up a level: C:\HHH\sub>hg move a.txt .. Now here's the funky part! Unsure about what was going on in my own repository, so doing as little as possible per step, I committed the move in _2_ steps (do a plain "hg ci" at this point and nothing goes wrong). So first I recorded that the subdirectory a.txt is gone: C:\HHH\sub>hg ci a.txt -m moving Then up to the parent directory, and commit the new location of a.txt: C:\HHH\sub>cd .. C:\HHH>hg st A a.txt C:\HHH>hg ci -m moving Now verify -v complains: C:\HHH>hg verify -v repository uses revlog format 1 checking changesets checking manifests crosschecking files in changesets and manifests checking files warning: copy source of 'a.txt' not in parents of 9c2205c187bf 2 files, 3 changesets, 2 total revisions 1 warnings encountered! What the warning says seems to me to be true: C:\HHH>hg log changeset: 2:9c2205c187bf tag: tip user: Tim Peters date: Mon Aug 19 12:24:43 2013 -0500 summary: moving changeset: 1:60fffa9b0194 user: Tim Peters date: Mon Aug 19 12:24:26 2013 -0500 summary: moving changeset: 0:0193842498ab user: Tim Peters date: Mon Aug 19 12:24:05 2013 -0500 summary: initial The parent of 2 (9c2205c187bf) is 1 (60fffa9b0194), and indeed the copy _source_ (HHH\sub\a.txt) was removed by changeset 1. Why that could be "bad" escapes me, though. Regardless, I suspect Python's warnings came from similarly overly elaborate learning-curve workflow, and are harmless. 
From raymond.hettinger at gmail.com Mon Aug 19 22:11:05 2013 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Mon, 19 Aug 2013 13:11:05 -0700 Subject: [Python-Dev] cpython: Use a known unique object for the dummy entry. In-Reply-To: <20130818204626.04eac6ea@fsol> References: <3cHGPr73nVz7LjQ@mail.python.org> <20130817204228.2b2a6bf1@fsol> <5210F6DC.9020002@hotpy.org> <20130818204626.04eac6ea@fsol> Message-ID: > The most reasonable thing to do would probably be to share the same > dummy object between setobject.c and dictobject.c, then. > Raymond, it would be nice if you could take a look! Thanks, I will look at it shortly. Raymond On Sun, Aug 18, 2013 at 11:46 AM, Antoine Pitrou wrote: > On Sun, 18 Aug 2013 17:31:24 +0100 > Mark Shannon wrote: > > > > By giving the dummy object a custom type, the dummy object can be > > recognised by testing that its type equals PySetDummy_Type (or > > whatever it is called) > > > > See dictobject.c for an implementation of a suitable dummy object. > > The most reasonable thing to do would probably be to share the same > dummy object between setobject.c and dictobject.c, then. > > Raymond, it would be nice if you could take a look! > > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/raymond.hettinger%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Tue Aug 20 00:25:22 2013 From: barry at python.org (Barry Warsaw) Date: Mon, 19 Aug 2013 18:25:22 -0400 Subject: [Python-Dev] a Constant addition to enum In-Reply-To: References: <52015FAA.3090805@stoneleaf.us> Message-ID: <20130819182522.4b5c90b6@anarchist> On Aug 06, 2013, at 02:36 PM, Eli Bendersky wrote: >Personally, I dislike all non-simple uses of Enums. One such use is adding >behavior to them. 
This can always be split to separate behavior from the >Enum itself, and I would prefer that. We went to great lengths to ensure >that things work in expected ways, but heaping additional features (even as >separate decorators) is just aggravating things. So -1 from me. > >Finally, I suggest we exercise restraint in adding more capabilities to >enums in 3.4; enums are a new creature for Python and it will be extremely >useful to see them used in the wild for a while first. We can enhance them >in 3.5, but premature enhancement is IMHO much more likely to do harm than >good. As you can probably guess, I fully agree. I also agree with Nick's suggestion to remove the advanced examples from the documentation. -Barry From tim.peters at gmail.com Tue Aug 20 01:48:44 2013 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 19 Aug 2013 18:48:44 -0500 Subject: [Python-Dev] hg verify warnings In-Reply-To: References: Message-ID: >> ... >> $ hg verify >> repository uses revlog format 1 >> checking changesets >> checking manifests >> crosschecking files in changesets and manifests >> checking files >> warning: copy source of 'Modules/_threadmodule.c' not in parents of 60ad83716733 >> warning: copy source of 'Objects/bytesobject.c' not in parents of 64bb1d258322 >> warning: copy source of 'Objects/stringobject.c' not in parents of 357e268e7c5f >> 9799 files, 79660 changesets, 176851 total revisions >> 3 warnings encountered! >> >> >> $ hg --version >> Mercurial Distributed SCM (version 2.3.2) >> (see http://mercurial.selenic.com for more information) [Tim, reproduces this kind of warning in a 1-file repository, via moving the file, then committing the removal of the old location before committing the addition of the new location] > ... > Regardless, I suspect Python's warnings came from similarly overly > elaborate learning-curve workflow, Nope!
At least not for _threadmodule.c: that got renamed in a single commit (7fe3a8648ce2), and I don't see anything fishy about, or around, it. I gave up on tracing back bytesobject.c/stringobject.c, because not only did one get renamed to the other, later it got renamed back again. > and are harmless. I still expect they're harmless, albeit without a shred of evidence ;-) From jphalip at gmail.com Tue Aug 20 03:00:33 2013 From: jphalip at gmail.com (Julien Phalip) Date: Mon, 19 Aug 2013 18:00:33 -0700 Subject: [Python-Dev] HTTPOnly and Secure cookie flags Message-ID: <2F6E6695-EB9B-410B-9082-1B8703BD5286@gmail.com> Hi, I'm currently working on a patch for ticket #16611 [1], which addresses some inconsistencies in the handling of HTTPOnly and Secure flags in the 'http.cookies' module. I originally found out about this ticket after noticing a bug in Django [2]. I could implement a work-around for Django but first I'd like to be sure how the original issue in Python should be addressed. So I'm eager to reach a resolution for this ticket. If you have any feedback on the patch or some tips on how this ticket could be moved forward, please let me know! Thanks a lot for your help! Kind regards, Julien [1] http://bugs.python.org/issue16611 [2] https://code.djangoproject.com/ticket/20755 -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Tue Aug 20 06:25:58 2013 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 19 Aug 2013 23:25:58 -0500 Subject: [Python-Dev] Status of 3.2 in Hg repository? Message-ID: > hg branches default 85277:4f7845be9e23 2.7 85276:7b867a46a8b4 3.2 83826:b9b521efeba3 3.3 85274:7ab07f15d78c (inactive) 2.6 82288:936621d33c38 (inactive) 3.1 80967:087ce7bbac9f (inactive) Is it expected that 3.2 is active? Looks like it's been that way a couple months.
The branch head: > hg log -r 3.2 changeset: 83826:b9b521efeba3 branch: 3.2 parent: 83739:6255b40c6a61 user: Antoine Pitrou date: Sat May 18 17:56:42 2013 +0200 summary: Issue #17980: Fix possible abuse of ssl.match_hostname() for denial of service using certificates with many wildcards (CVE-2013-2099). Besides that one, there are only two other changesets that would be involved in a merge of 3.2 into default, both having to do with making the 3.2.5 release. As I understand the development docs (and I may be wrong), the intent is that only the default and 2.7 branches should be active at this time. Educate me :-) From arigo at tunes.org Tue Aug 20 09:28:36 2013 From: arigo at tunes.org (Armin Rigo) Date: Tue, 20 Aug 2013 09:28:36 +0200 Subject: [Python-Dev] hg verify warnings In-Reply-To: References: Message-ID: Hi Tim, On Tue, Aug 20, 2013 at 1:48 AM, Tim Peters wrote: >>> warning: copy source of 'Modules/_threadmodule.c' not in parents of 60ad83716733 >>> warning: copy source of 'Objects/bytesobject.c' not in parents of 64bb1d258322 >>> warning: copy source of 'Objects/stringobject.c' not in parents of 357e268e7c5f I've seen this once already (with another big repository). The problem I had was only these warnings when running "hg verify", and it was fixed by simply checking out a new copy of the repository. It seems that you have the same problem: for example my own copy of CPython doesn't show any warning in "hg verify". I've deleted my slightly-broken repo by then (as it was already several years old I suspected an old version of mercurial). Maybe I shouldn't have. How about you send your repository to the Mercurial bug tracker? A bientôt, Armin. From solipsis at pitrou.net Tue Aug 20 10:10:39 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 20 Aug 2013 10:10:39 +0200 Subject: [Python-Dev] Status of 3.2 in Hg repository?
References: Message-ID: <20130820101039.4eea3d81@pitrou.net> Hi Tim, Le Mon, 19 Aug 2013 23:25:58 -0500, Tim Peters a écrit : > > hg log -r 3.2 > changeset: 83826:b9b521efeba3 > branch: 3.2 > parent: 83739:6255b40c6a61 > user: Antoine Pitrou > date: Sat May 18 17:56:42 2013 +0200 > summary: Issue #17980: Fix possible abuse of ssl.match_hostname() > for denial of service using certificates with many wildcards > (CVE-2013-2099). Oops, that's me :-S Now I don't remember if I omitted to merge deliberately, or if that was an oversight. For the record, the issue was fixed in 3.3 too, albeit not with a merge changeset. Regards Antoine. From tim.peters at gmail.com Tue Aug 20 17:12:51 2013 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 20 Aug 2013 10:12:51 -0500 Subject: [Python-Dev] hg verify warnings In-Reply-To: References: Message-ID: [Tim] >>>> warning: copy source of 'Modules/_threadmodule.c' not in parents of 60ad83716733 >>>> warning: copy source of 'Objects/bytesobject.c' not in parents of 64bb1d258322 >>>> warning: copy source of 'Objects/stringobject.c' not in parents of 357e268e7c5f [Armin] > I've seen this once already (with another big repository). The > problem I had was only these warnings when running "hg verify", and it > was fixed by simply checking out a new copy of the repository. It > seems that you have the same problem: for example my own copy of > CPython doesn't show any warning in "hg verify". Try running "hg verify -v" - these warnings only appear when verify is run in verbose mode. > I've deleted my slightly-broken repo by then (as it was already > several years old I suspected an old version of mercurial). Maybe I > shouldn't have. How about you send your repository to the Mercurial > bug tracker? Other people already have, for other projects.
The developers don't care much; e.g., http://permalink.gmane.org/gmane.comp.version-control.mercurial.general/23195 From tim.peters at gmail.com Tue Aug 20 17:43:57 2013 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 20 Aug 2013 10:43:57 -0500 Subject: [Python-Dev] Status of 3.2 in Hg repository? In-Reply-To: <20130820101039.4eea3d81@pitrou.net> References: <20130820101039.4eea3d81@pitrou.net> Message-ID: [Tim] >> > hg log -r 3.2 >> changeset: 83826:b9b521efeba3 >> branch: 3.2 >> parent: 83739:6255b40c6a61 >> user: Antoine Pitrou >> date: Sat May 18 17:56:42 2013 +0200 >> summary: Issue #17980: Fix possible abuse of ssl.match_hostname() >> for denial of service using certificates with many wildcards >> (CVE-2013-2099). [Antoine] > Oops, that's me :-S > Now I don't remember if I omitted to merge deliberately, or if that was > an oversight. Well, yours is just the tip of the 3.2 branch. 3.2 was already active when you made this commit, left over from the 3.2.5 release fiddling (when, presumably, a merge to default was also skipped): > hg log -v -r "children(ancestor(3.2, default)):: and branch(3.2)" changeset: 83738:cef745775b65 branch: 3.2 tag: v3.2.5 user: Georg Brandl date: Sun May 12 12:28:20 2013 +0200 files: Include/patchlevel.h Lib/distutils/__init__.py Lib/idlelib/idlever.py Misc/NEWS Misc/RPM/python-3.2.spec README description: Bump to version 3.2.5. changeset: 83739:6255b40c6a61 branch: 3.2 user: Georg Brandl date: Sun May 12 12:28:30 2013 +0200 files: .hgtags description: Added tag v3.2.5 for changeset cef745775b65 changeset: 83826:b9b521efeba3 branch: 3.2 parent: 83739:6255b40c6a61 user: Antoine Pitrou date: Sat May 18 17:56:42 2013 +0200 files: Lib/ssl.py Lib/test/test_ssl.py Misc/NEWS description: Issue #17980: Fix possible abuse of ssl.match_hostname() for denial of service using certificates with many wi ldcards (CVE-2013-2099). > For the record, the issue was fixed in 3.3 too, albeit not with a merge > changeset. 
In that case, I bet this one is easy to fix, for someone who knows what they're doing ;-) From rdmurray at bitdance.com Tue Aug 20 19:16:05 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Tue, 20 Aug 2013 13:16:05 -0400 Subject: [Python-Dev] Status of 3.2 in Hg repository? In-Reply-To: References: <20130820101039.4eea3d81@pitrou.net> Message-ID: <20130820171605.DB59F2507B6@webabinitio.net> On Tue, 20 Aug 2013 10:43:57 -0500, Tim Peters wrote: > [Tim] > >> > hg log -r 3.2 > >> changeset: 83826:b9b521efeba3 > >> branch: 3.2 > >> parent: 83739:6255b40c6a61 > >> user: Antoine Pitrou > >> date: Sat May 18 17:56:42 2013 +0200 > >> summary: Issue #17980: Fix possible abuse of ssl.match_hostname() > >> for denial of service using certificates with many wildcards > >> (CVE-2013-2099). > > [Antoine] > > Oops, that's me :-S > > Now I don't remember if I omitted to merge deliberately, or if that was > > an oversight. > > Well, yours is just the tip of the 3.2 branch. 3.2 was already active > when you made this commit, left over from the 3.2.5 release fiddling > (when, presumably, a merge to default was also skipped): Georg indicated at the time that not merging was intentional. --David From antoine at python.org Tue Aug 20 19:27:58 2013 From: antoine at python.org (Antoine Pitrou) Date: Tue, 20 Aug 2013 19:27:58 +0200 Subject: [Python-Dev] Status of 3.2 in Hg repository? References: <20130820101039.4eea3d81@pitrou.net> <20130820171605.DB59F2507B6@webabinitio.net> Message-ID: <20130820192758.19e4953f@fsol> On Tue, 20 Aug 2013 13:16:05 -0400 "R. 
David Murray" wrote: > On Tue, 20 Aug 2013 10:43:57 -0500, Tim Peters wrote: > > [Tim] > > >> > hg log -r 3.2 > > >> changeset: 83826:b9b521efeba3 > > >> branch: 3.2 > > >> parent: 83739:6255b40c6a61 > > >> user: Antoine Pitrou > > >> date: Sat May 18 17:56:42 2013 +0200 > > >> summary: Issue #17980: Fix possible abuse of ssl.match_hostname() > > >> for denial of service using certificates with many wildcards > > >> (CVE-2013-2099). > > > > [Antoine] > > > Oops, that's me :-S > > > Now I don't remember if I omitted to merge deliberately, or if that was > > > an oversight. > > > > Well, yours is just the tip of the 3.2 branch. 3.2 was already active > > when you made this commit, left over from the 3.2.5 release fiddling > > (when, presumably, a merge to default was also skipped): > > Georg indicated at the time that not merging was intentional. Ah, now I remember indeed: http://mail.python.org/pipermail/python-committers/2013-May/002580.html Thanks Antoine. From tim.peters at gmail.com Tue Aug 20 19:47:46 2013 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 20 Aug 2013 12:47:46 -0500 Subject: [Python-Dev] Status of 3.2 in Hg repository? In-Reply-To: <20130820192758.19e4953f@fsol> References: <20130820101039.4eea3d81@pitrou.net> <20130820171605.DB59F2507B6@webabinitio.net> <20130820192758.19e4953f@fsol> Message-ID: [Tim] >>>>> > hg log -r 3.2 >>>>> changeset: 83826:b9b521efeba3 >>>>> branch: 3.2 >>>>> parent: 83739:6255b40c6a61 >>>>> user: Antoine Pitrou >>>>> date: Sat May 18 17:56:42 2013 +0200 >>>>> summary: Issue #17980: Fix possible abuse of ssl.match_hostname() >>>>> for denial of service using certificates with many wildcards >>>>> (CVE-2013-2099). [Antoine] >>>> Oops, that's me :-S >>>> Now I don't remember if I omitted to merge deliberately, or if that was >>>> an oversight. [Tim] >>> Well, yours is just the tip of the 3.2 branch. 
3.2 was already active >>> when you made this commit, left over from the 3.2.5 release fiddling >>> (when, presumably, a merge to default was also skipped): [R. David Murray] >> Georg indicated at the time that not merging was intentional. [Antoine] > Ah, now I remember indeed: > http://mail.python.org/pipermail/python-committers/2013-May/002580.html Which says: I asked about this on IRC and was told that 3.2 is now a standalone branch like 2.7. Security fixes will be applied by the release manager only, and Georg doesn't see any point in null merging the commits. Isn't the point exactly the same as for all other "old-to-new branch" null merges? That is, 1. So that 3.2 doesn't show up as an active branch under "hg branches"; and, 2. So that security fixes applied to the 3.2 branch can be easily forward-ported to the 3.3 and default branches via no-hassle merges. What is gained by _not_ merging here? I don't see it. I suppose the answer, as to everything else, lies in "hg strip" . From antoine at python.org Tue Aug 20 19:55:27 2013 From: antoine at python.org (Antoine Pitrou) Date: Tue, 20 Aug 2013 19:55:27 +0200 Subject: [Python-Dev] Status of 3.2 in Hg repository? References: <20130820101039.4eea3d81@pitrou.net> <20130820171605.DB59F2507B6@webabinitio.net> <20130820192758.19e4953f@fsol> Message-ID: <20130820195527.7bd0253a@fsol> On Tue, 20 Aug 2013 12:47:46 -0500 Tim Peters wrote: > > [Antoine] > > Ah, now I remember indeed: > > http://mail.python.org/pipermail/python-committers/2013-May/002580.html > > Which says: > > I asked about this on IRC and was told that 3.2 is now a > standalone branch like 2.7. Security fixes will be applied > by the release manager only, and Georg doesn't see any > point in null merging the commits. > > Isn't the point exactly the same as for all other "old-to-new branch" > null merges? That is, > > 1. So that 3.2 doesn't show up as an active branch under "hg branches"; and, > > 2. 
So that security fixes applied to the 3.2 branch can be easily > forward-ported to the 3.3 and default branches via no-hassle merges. > > What is gained by _not_ merging here? I don't see it. Perhaps Georg doesn't like merges? ;-) I suppose what's gained is "one less command to type". Regards Antoine. From victor.stinner at gmail.com Wed Aug 21 01:30:23 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 21 Aug 2013 01:30:23 +0200 Subject: [Python-Dev] PEP 446: issue with sockets Message-ID: Hi, I have a new question for my PEP 446 (inheritance of file descriptors). os.get/set_inheritable(handle) has strange behaviour on Windows, and so I would like to add new os.get/set_handle_inheritable() functions to avoid it. The problem is that a socket would require a different function depending on the OS: os.get/set_handle_inheritable() on Windows, os.get/set_inheritable() on UNIX. Should I add a portable helper to the socket module (socket.get/set_inheritable)? Or add 2 methods to the socket class? Now the details. I have an issue with sockets and the PEP 446. On Windows, my implementation of the os.set_inheritable(fd: int, inheritable: bool) function tries to guess if fd is a file descriptor or a handle. The reason is that the fileno() method of a socket returns a file descriptor on UNIX, whereas it returns a handle on Windows. It is convinient to have a os.set_interiable() function which accepts both types. The issue is that os.get_inheritable() does the same guess and it has a strange behaviour. Calling os.get_inheritable() with integers in the range(20) has a border effect: open(filename) creates a file descriptor 10, whereas it creates a file descriptor 3 if get_inheritable() was not called (why 10 and not 3?). To avoid the border effect, it's better to not guess if the parameter is a file descriptor or a Windows handle, and a new two new functions: os.get_handle_inheritable() and os.set_handle_inheritable(). 
Victor From victor.stinner at gmail.com Wed Aug 21 01:57:32 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 21 Aug 2013 01:57:32 +0200 Subject: [Python-Dev] PEP 446: issue with sockets In-Reply-To: References: Message-ID: 2013/8/21 Victor Stinner : > Should I add a portable helper to the > socket module (socket.get/set_inheritable)? Add the two following functions to the socket module: def get_inheritable(sock): if os.name == 'nt': return os.get_handle_inheritable(sock.fileno()) else: return os.get_inheritable(sock.fileno()) def set_inheritable(sock, inheritable): if os.name == 'nt': os.set_handle_inheritable(sock.fileno(), inheritable) else: os.set_inheritable(sock.fileno(), inheritable) Usage: socket.get_inheritable(sock) and socket.set_inheritable(sock, True) > Or add 2 methods to the socket class? Add the two following methods to the socket class: def get_inheritable(self): if os.name == 'nt': return os.get_handle_inheritable(self.fileno()) else: return os.get_inheritable(self.fileno()) def set_inheritable(self, inheritable): if os.name == 'nt': os.set_handle_inheritable(self.fileno(), inheritable) else: os.set_inheritable(self.fileno(), inheritable) Usage: s.get_inheritable() and sock.set_inheritable(True) Victor From guido at python.org Wed Aug 21 01:57:46 2013 From: guido at python.org (Guido van Rossum) Date: Tue, 20 Aug 2013 16:57:46 -0700 Subject: [Python-Dev] PEP 446: issue with sockets In-Reply-To: References: Message-ID: Agreed that guessing whether something's a handle or not is terrible. If this is truly only for sockets then maybe it should live in the socket module? Also, are you sure the things returned by socket.fleno() are really Windows handles? I thought they were some other artificial namespace used just by sockets. On Tue, Aug 20, 2013 at 4:30 PM, Victor Stinner wrote: > Hi, > > I have a new question for my PEP 446 (inheritance of file descriptors). 
> > os.get/set_inheritable(handle) has strange behaviour on Windows, and > so I would like to add new os.get/set_handle_inheritable() functions > to avoid it. The problem is that a socket would require a different > function depending on the OS: os.get/set_handle_inheritable() on > Windows, os.get/set_inheritable() on UNIX. Should I add a portable > helper to the socket module (socket.get/set_inheritable)? Or add 2 > methods to the socket class? > > Now the details. > > I have an issue with sockets and the PEP 446. On Windows, my > implementation of the os.set_inheritable(fd: int, inheritable: bool) > function tries to guess if fd is a file descriptor or a handle. The > reason is that the fileno() method of a socket returns a file > descriptor on UNIX, whereas it returns a handle on Windows. It is > convinient to have a os.set_interiable() function which accepts both > types. > > The issue is that os.get_inheritable() does the same guess and it has > a strange behaviour. Calling os.get_inheritable() with integers in the > range(20) has a border effect: open(filename) creates a file > descriptor 10, whereas it creates a file descriptor 3 if > get_inheritable() was not called (why 10 and not 3?). > > To avoid the border effect, it's better to not guess if the parameter > is a file descriptor or a Windows handle, and a new two new functions: > os.get_handle_inheritable() and os.set_handle_inheritable(). > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Wed Aug 21 02:07:23 2013 From: guido at python.org (Guido van Rossum) Date: Tue, 20 Aug 2013 17:07:23 -0700 Subject: [Python-Dev] PEP 446: issue with sockets In-Reply-To: References: Message-ID: Since this is a new API and only applies to sockets, making them methods sounds good. (I'd put the 'nt' test outside the method defs though so they are tested only once per import.) On Tue, Aug 20, 2013 at 4:57 PM, Victor Stinner wrote: > 2013/8/21 Victor Stinner : > > Should I add a portable helper to the > > socket module (socket.get/set_inheritable)? > > Add the two following functions to the socket module: > > def get_inheritable(sock): > if os.name == 'nt': > return os.get_handle_inheritable(sock.fileno()) > else: > return os.get_inheritable(sock.fileno()) > > def set_inheritable(sock, inheritable): > if os.name == 'nt': > os.set_handle_inheritable(sock.fileno(), inheritable) > else: > os.set_inheritable(sock.fileno(), inheritable) > > Usage: socket.get_inheritable(sock) and socket.set_inheritable(sock, True) > > > Or add 2 methods to the socket class? > > Add the two following methods to the socket class: > > def get_inheritable(self): > if os.name == 'nt': > return os.get_handle_inheritable(self.fileno()) > else: > return os.get_inheritable(self.fileno()) > > def set_inheritable(self, inheritable): > if os.name == 'nt': > os.set_handle_inheritable(self.fileno(), inheritable) > else: > os.set_inheritable(self.fileno(), inheritable) > > Usage: s.get_inheritable() and sock.set_inheritable(True) > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From victor.stinner at gmail.com Wed Aug 21 02:19:50 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 21 Aug 2013 02:19:50 +0200 Subject: [Python-Dev] PEP 446: issue with sockets In-Reply-To: References: Message-ID: 2013/8/21 Guido van Rossum : > Also, are you sure the things returned by socket.fleno() are really Windows > handles? I thought they were some other artificial namespace used just by > sockets. (You know what? I know understand and love the UNIX concept "everything is file"!) I don't know if a socket handle is similar to file handles or if they are specials. At least, GetHandleInformation() and SetHandleInformation() functions, used by os.get/set_handle_inheritable(), accept socket handles. Outside the socket module, the subprocess and multiprocessing modules use also Windows handles. The subprocess has for example a private _make_inheritable() method which could be replaced with os.set_handle_inheritable(). I'm not sure because _make_inheritable() duplicates the input handle, whereas os.set_handle_inheritable() modify directly the handle. I don't know why the handle needs to be duplicated. Victor From tim.peters at gmail.com Wed Aug 21 02:33:57 2013 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 20 Aug 2013 19:33:57 -0500 Subject: [Python-Dev] Status of 3.2 in Hg repository? In-Reply-To: <20130820195527.7bd0253a@fsol> References: <20130820101039.4eea3d81@pitrou.net> <20130820171605.DB59F2507B6@webabinitio.net> <20130820192758.19e4953f@fsol> <20130820195527.7bd0253a@fsol> Message-ID: [Tim, wondering why the 3.2 branch isn't "inactive"] >> ... >> What is gained by _not_ merging here? I don't see it. [Antoine Pitrou] > Perhaps Georg doesn't like merges? ;-) > I suppose what's gained is "one less command to type". So let's try a different question ;-) Would anyone _object_ to completing the process described in the docs: merge 3.2 into 3.3, then merge 3.3 into default? I'd be happy to do that. 
I'd throw away all the merge changes except for adding the v3,2.5 tag to .hgtags. The only active branches remaining would be `default` and 2.7, which is what I expected when I started this ;-) From barry at python.org Wed Aug 21 03:00:41 2013 From: barry at python.org (Barry Warsaw) Date: Tue, 20 Aug 2013 21:00:41 -0400 Subject: [Python-Dev] Status of 3.2 in Hg repository? In-Reply-To: References: <20130820101039.4eea3d81@pitrou.net> <20130820171605.DB59F2507B6@webabinitio.net> <20130820192758.19e4953f@fsol> <20130820195527.7bd0253a@fsol> Message-ID: <20130820210041.15d5edc1@anarchist> On Aug 20, 2013, at 07:33 PM, Tim Peters wrote: >So let's try a different question ;-) Would anyone _object_ to >completing the process described in the docs: merge 3.2 into 3.3, >then merge 3.3 into default? I'd be happy to do that. I'd throw away >all the merge changes except for adding the v3,2.5 tag to .hgtags. > >The only active branches remaining would be `default` and 2.7, which >is what I expected when I started this ;-) That's what I'd expect to, so no objections from me. But I'm only the lowly 2.6 RM and I will soon be rid of that particular albatross! FWIW, I'm still merging relevant 2.6 changes into 2.7 (or null merging them if not relevant). Oh, and welcome back Uncle Timmy! (If that's you're real name.) -Barry From victor.stinner at gmail.com Wed Aug 21 03:01:17 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 21 Aug 2013 03:01:17 +0200 Subject: [Python-Dev] PEP 446: issue with sockets In-Reply-To: References: Message-ID: 2013/8/21 Guido van Rossum : > Since this is a new API and only applies to sockets, making them methods > sounds good. (I'd put the 'nt' test outside the method defs though so they > are tested only once per import.) I added get_inheritable() and set_inheritable() methods to socket.socket. 
The names are PEP 8 compliant, whereas socket.socket.setblocking() and socket.socket.settimeout() are not :-) So we have 4 new functions and 2 new methods just to get and set the inheritable attribute :-) http://www.python.org/dev/peps/pep-0446/#new-functions-and-methods We might remove os.get/set_handle_inheritable() from PEP 446, if they are not really useful. Victor From ethan at stoneleaf.us Wed Aug 21 08:15:16 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 20 Aug 2013 23:15:16 -0700 Subject: [Python-Dev] format, int, and IntEnum In-Reply-To: References: <520BFFBE.3050501@stoneleaf.us> Message-ID: <52145AF4.9080309@stoneleaf.us> On 08/14/2013 09:27 PM, Nick Coghlan wrote: > For enums, I believe they should be formatted like their > base types (so !s and !r will show the enum name, anything without > coercion will show the value). I agree. While one of the big reasons for an Enum type was the pretty str and repr, I don't see format in that area. How often will one type in `"{}".format(some_var)` to find out what type of object one has? Myself, I would just type `some_var`. -- ~Ethan~ From ethan at stoneleaf.us Wed Aug 21 09:06:13 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 21 Aug 2013 00:06:13 -0700 Subject: [Python-Dev] format, int, and IntEnum In-Reply-To: <52145AF4.9080309@stoneleaf.us> References: <520BFFBE.3050501@stoneleaf.us> <52145AF4.9080309@stoneleaf.us> Message-ID: <521466E5.9040007@stoneleaf.us> On 08/20/2013 11:15 PM, Ethan Furman wrote: > On 08/14/2013 09:27 PM, Nick Coghlan wrote: >> For enums, I believe they should be formatted like their >> base types (so !s and !r will show the enum name, anything without >> coercion will show the value). > > I agree. While one of the big reasons for an Enum type was the pretty > str and repr, I don't see format in that area. > > How often will one type in `"{}".format(some_var)` to find out what type > of object one has? Myself, I would just type `some_var`. 
So, these are some of the ways we have to display an object:

  str() calls obj.__str__()
  repr() calls obj.__repr__()
  "%s" calls obj.__str__()
  "%r" calls obj.__repr__()
  "%d" calls... not sure, but we see the int value
  "{}".format() should (IMO) also display the value of the object

Using int as the case study, its presentation types are ['b', 'd', 'n', 'o', 'x', 'X']. Notice there is no 's' nor 'r' in there, as int expects to display a number, not arbitrary text. So, for mixed-type Enumerations, I think any format calls should simply be forwarded to the mixed-in type (unless, of course, a custom __format__ was specified in the new Enumeration). -- ~Ethan~ From arigo at tunes.org Wed Aug 21 10:47:13 2013 From: arigo at tunes.org (Armin Rigo) Date: Wed, 21 Aug 2013 10:47:13 +0200 Subject: [Python-Dev] hg verify warnings In-Reply-To: References: Message-ID: Hi Tim, On Tue, Aug 20, 2013 at 5:12 PM, Tim Peters wrote: > Try running "hg verify -v" - these warnings only appear when verify is > run in verbose mode. Indeed. Ignore what I said then about a broken copy of the repository: any copy will show these three warnings, and it seems they can be safely ignored. À bientôt, Armin. From shibturn at gmail.com Wed Aug 21 12:58:05 2013 From: shibturn at gmail.com (Richard Oudkerk) Date: Wed, 21 Aug 2013 11:58:05 +0100 Subject: [Python-Dev] PEP 446: issue with sockets In-Reply-To: References: Message-ID: On 21/08/2013 1:19am, Victor Stinner wrote: > 2013/8/21 Guido van Rossum : >> Also, are you sure the things returned by socket.fileno() are really Windows >> handles? I thought they were some other artificial namespace used just by >> sockets. > > (You know what? I now understand and love the UNIX concept > "everything is a file"!) > > I don't know if a socket handle is similar to file handles or if they > are special. At least, GetHandleInformation() and > SetHandleInformation() functions, used by > os.get/set_handle_inheritable(), accept socket handles. 
Anti-virus software and firewalls can stop SetHandleInformation() from working properly on sockets: http://stackoverflow.com/questions/12058911/can-tcp-socket-handles-be-set-not-inheritable -- Richard From brett at python.org Wed Aug 21 14:22:23 2013 From: brett at python.org (Brett Cannon) Date: Wed, 21 Aug 2013 08:22:23 -0400 Subject: [Python-Dev] Status of 3.2 in Hg repository? In-Reply-To: References: <20130820101039.4eea3d81@pitrou.net> <20130820171605.DB59F2507B6@webabinitio.net> <20130820192758.19e4953f@fsol> <20130820195527.7bd0253a@fsol> Message-ID: On Tue, Aug 20, 2013 at 8:33 PM, Tim Peters wrote: > [Tim, wondering why the 3.2 branch isn't "inactive"] > >> ... > >> What is gained by _not_ merging here? I don't see it. > > [Antoine Pitrou] > > Perhaps Georg doesn't like merges? ;-) > > I suppose what's gained is "one less command to type". > > So let's try a different question ;-) Would anyone _object_ to > completing the process described in the docs: merge 3.2 into 3.3, > then merge 3.3 into default? I'd be happy to do that. I'd throw away > all the merge changes except for adding the v3.2.5 tag to .hgtags. > > The only active branches remaining would be `default` and 2.7, which > is what I expected when I started this ;-) While I would think Georg can object if he wants, I see no reason to help visibly shutter the 3.2 branch by doing null merges. It isn't like it makes using hg harder or the history harder to read. -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Wed Aug 21 14:50:51 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 21 Aug 2013 14:50:51 +0200 Subject: [Python-Dev] PEP 446: issue with sockets In-Reply-To: References: Message-ID: 2013/8/21 Richard Oudkerk : > On 21/08/2013 1:19am, Victor Stinner wrote: >> I don't know if a socket handle is similar to file handles or if they >> are special. 
At least, GetHandleInformation() and >> SetHandleInformation() functions, used by >> os.get/set_handle_inheritable(), accept socket handles. > > Anti-virus software and firewalls can stop SetHandleInformation() from > working properly on sockets: > > http://stackoverflow.com/questions/12058911/can-tcp-socket-handles-be-set-not-inheritable Yeah, I know, I already added the link to the PEP. I improved the implementation of PEP 446: it now uses the WSA_FLAG_NO_HANDLE_INHERIT flag when it is available (Windows 7 SP1 and Windows Server 2008 R2 SP1, which is probably a minor percentage of Windows installations). On older Windows versions, I don't see what Python can do to work around the issue except calling SetHandleInformation() on the result of WSASocket(). Victor From shibturn at gmail.com Wed Aug 21 15:11:41 2013 From: shibturn at gmail.com (Richard Oudkerk) Date: Wed, 21 Aug 2013 14:11:41 +0100 Subject: [Python-Dev] PEP 446: issue with sockets In-Reply-To: References: Message-ID: <5214BC8D.8010401@gmail.com> On 21/08/2013 1:50pm, Victor Stinner wrote: > 2013/8/21 Richard Oudkerk : >> On 21/08/2013 1:19am, Victor Stinner wrote: >>> I don't know if a socket handle is similar to file handles or if they >>> are special. At least, GetHandleInformation() and >>> SetHandleInformation() functions, used by >>> os.get/set_handle_inheritable(), accept socket handles. >> >> Anti-virus software and firewalls can stop SetHandleInformation() from >> working properly on sockets: >> >> http://stackoverflow.com/questions/12058911/can-tcp-socket-handles-be-set-not-inheritable > > Yeah, I know, I already added the link to the PEP. > > I improved the implementation of PEP 446: it now uses the > WSA_FLAG_NO_HANDLE_INHERIT flag when it is available (Windows 7 SP1 > and Windows Server 2008 R2 SP1, which is probably a minor percentage > of Windows installations). 
> > On older Windows versions, I don't see what Python can do to > work around the issue except calling SetHandleInformation() on the > result of WSASocket(). > > Victor > If the socket methods are not guaranteed to work then they should come with a nice big warning to that effect. It seems that the only reliable way for a parent process to give a socket to a child process is to use WSADuplicateSocket(). (But that requires communication between parent and child after the child has started.) -- Richard From tim.peters at gmail.com Wed Aug 21 20:22:22 2013 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 21 Aug 2013 13:22:22 -0500 Subject: [Python-Dev] Status of 3.2 in Hg repository? In-Reply-To: References: <20130820101039.4eea3d81@pitrou.net> <20130820171605.DB59F2507B6@webabinitio.net> <20130820192758.19e4953f@fsol> <20130820195527.7bd0253a@fsol> Message-ID: [Tim, wondering why the 3.2 branch isn't "inactive"] >> ... >> So let's try a different question ;-) Would anyone _object_ to >> completing the process described in the docs: merge 3.2 into 3.3, >> then merge 3.3 into default? I'd be happy to do that. I'd throw away >> all the merge changes except for adding the v3.2.5 tag to .hgtags. >> >> The only active branches remaining would be `default` and 2.7, which >> is what I expected when I started this ;-) [Brett Cannon] > While I would think Georg can object if he wants, I see no reason to help > visibly shutter the 3.2 branch by doing null merges. It isn't like it makes > using hg harder or the history harder to read. Well, why do we _ever_ do a null merge? Then why don't the reasons apply in this case? What happened here doesn't match the documented workflow - so one or the other should be changed. It has proved tedious to find out why this exception exists, and the only reason I've found so far amounts to "the RM didn't want to bother -- and the only record of that is someone's memory of an IRC chat". 
As mentioned before, if a security hole is found in 3.2 and gets repaired there, the poor soul who fixes 3.2 will have all the same questions when they try to forward-merge the fix to 3.3 and default. Because the merge wasn't done when 3.2.5 was released, they'll have a pile of files show up in their merge attempt that have nothing to do with their fix. Not only the release artifacts, but also a critical fix Antoine applied to ssl.py a week after the 3.2.5 release. It turns out that one was already applied to later branches, but I know that only because Antoine said so here. Do the "null merge", and none of those questions will arise. And, indeed, that's _why_ we want to do null merges (when applicable) in general - right? So that future merges become much easier. BTW, it's not quite a null-merge. The v3.2.5 release tag doesn't currently exist in the 3.3 or default .hgtags files. So long as 3.2 has a topological head, people on the 3.3 and default branches won't notice (unless they look directly at .hgtags - they can still use "v3.2.5" in hg commands successfully), but that's mighty obscure ;-) From brett at python.org Wed Aug 21 21:26:42 2013 From: brett at python.org (Brett Cannon) Date: Wed, 21 Aug 2013 15:26:42 -0400 Subject: [Python-Dev] Status of 3.2 in Hg repository? In-Reply-To: References: <20130820101039.4eea3d81@pitrou.net> <20130820171605.DB59F2507B6@webabinitio.net> <20130820192758.19e4953f@fsol> <20130820195527.7bd0253a@fsol> Message-ID: On Wed, Aug 21, 2013 at 2:22 PM, Tim Peters wrote: > [Tim, wondering why the 3.2 branch isn't "inactive"] > >> ... > >> So let's try a different question ;-) Would anyone _object_ to > >> completing the process described in the docs: merge 3.2 into 3.3, > >> then merge 3.3 into default? I'd be happy to do that. I'd throw away > >> all the merge changes except for adding the v3.2.5 tag to .hgtags. 
> >> > >> The only active branches remaining would be `default` and 2.7, which > >> is what I expected when I started this ;-) > > [Brett Cannon] > > While I would think Georg can object if he wants, I see no reason to help > > visibly shutter the 3.2 branch by doing null merges. It isn't like it > makes > > using hg harder or the history harder to read. > > Well, why do we _ever_ do a null merge? Then why don't the reasons > apply in this case? > After reading that sentence I realize there is a key "not" missing: "I see no reason NOT to help visibly shutter the 3.2. branch ...". IOW I say do the null merge. Sorry about that. > > What happened here doesn't match the documented workflow - so one or > the other should be changed. It has proved tedious to find out why > this exception exists, and the only reason I've found so far amounts > to "the RM didn't want to bother -- and the only record of that is > someone's memory of an IRC chat". > > As mentioned before, if a security hole is found in 3.2 and gets > repaired there, the poor soul who fixes 3.2 will have all the same > questions when they try to forward-merge the fix to 3.3 and default. > Because the merge wasn't done when 3.2.5 was released, they'll have a > pile of files show up in their merge attempt that have nothing to do > with their fix. Not only the release artifacts, but also a critical > fix Antoine applied to ssl.py a week after the 3.2.5 release. It > turns out that one was already applied to later branches, but I know > that only because Antoine said so here. > > Do the "null merge", and none of those questions will arise. And, > indeed, that's _why_ we want to do null merges (when applicable) in > general - right? So that future merges become much easier. > > BTW, it's not quite a null-merge. The v3.2.5 release tag doesn't > currently exist in the 3.3 or default .hgtags files. 
So long as 3.2 > has a topological head, people on the 3.3 and default branches won't > notice (unless they look directly at .hgtags - they can still use > "v3.2.5" in hg commands successfully), but that's mighty obscure ;-) > Yes it is. =) -------------- next part -------------- An HTML attachment was scrubbed... URL: From merwok at netwok.org Wed Aug 21 21:31:35 2013 From: merwok at netwok.org (Éric Araujo) Date: Wed, 21 Aug 2013 15:31:35 -0400 Subject: [Python-Dev] Status of 3.2 in Hg repository? In-Reply-To: References: <20130820101039.4eea3d81@pitrou.net> <20130820171605.DB59F2507B6@webabinitio.net> <20130820192758.19e4953f@fsol> <20130820195527.7bd0253a@fsol> Message-ID: <52151597.8090907@netwok.org> On 21/08/2013 14:22, Tim Peters wrote: > BTW, it's not quite a null-merge. The v3.2.5 release tag doesn't > currently exist in the 3.3 or default .hgtags files. So long as 3.2 > has a topological head, people on the 3.3 and default branches won't > notice (unless they look directly at .hgtags - they can still use > "v3.2.5" in hg commands successfully), but that's mighty obscure ;-) IIRC Mercurial looks at the union of .hgtags in all unmerged heads to produce the list of existing tags. :-) From tim.peters at gmail.com Wed Aug 21 21:34:33 2013 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 21 Aug 2013 14:34:33 -0500 Subject: [Python-Dev] Status of 3.2 in Hg repository? In-Reply-To: References: <20130820101039.4eea3d81@pitrou.net> <20130820171605.DB59F2507B6@webabinitio.net> <20130820192758.19e4953f@fsol> <20130820195527.7bd0253a@fsol> Message-ID: [Brett] > ... > After reading that sentence I realize there is a key "not" missing: "I see > no reason NOT to help visibly shutter the 3.2. branch ...". IOW I say do the > null merge. Sorry about that. No problem! 
Since I've been inactive for a long time, it's good for me to practice vigorously defending what's currently documented - tests my understanding, and lets me annoy people at the same time ;-) Here's what I intend to do (unless an objection appears):

  hg up 3.3
  hg merge 3.2
  # merge in the v3.2.5 tag definition from .hgtags,
  # but revert everything else
  hg revert -a -X .hgtags -r .
  hg resolve -a -m
  hg diff  # to ensure that only the v3.2.5 tag in .hgtags changed
  hg commit

and then much the same steps to merge 3.3 into default. BTW, "null merging" this way is annoying, because at least in my installation kdiff3 pops up for every uselessly conflicting file. Anyone know a reason not to do:

  hg -y merge --tool=internal:fail 3.2

instead? I saw that idea on some Hg wiki. From tim.peters at gmail.com Wed Aug 21 21:37:23 2013 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 21 Aug 2013 14:37:23 -0500 Subject: [Python-Dev] Status of 3.2 in Hg repository? In-Reply-To: <52151597.8090907@netwok.org> References: <20130820101039.4eea3d81@pitrou.net> <20130820171605.DB59F2507B6@webabinitio.net> <20130820192758.19e4953f@fsol> <20130820195527.7bd0253a@fsol> <52151597.8090907@netwok.org> Message-ID: [Tim] >> BTW, it's not quite a null-merge. The v3.2.5 release tag doesn't >> currently exist in the 3.3 or default .hgtags files. So long as 3.2 >> has a topological head, people on the 3.3 and default branches won't >> notice (unless they look directly at .hgtags - they can still use >> "v3.2.5" in hg commands successfully), but that's mighty obscure ;-) [Éric Araujo] > IIRC Mercurial looks at the union of .hgtags in all unmerged heads to > produce the list of existing tags. :-) Right! That's the obscure reason why v3.2.5 works even on branches where .hgtags doesn't contain a v3.2.5 entry - but will fail if 3.2 ceases to have "a topological head" (unless 3.2's .hgtags is merged to a still-active branch). 
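[Editor's note: the union behaviour described in this exchange can be sketched in a few lines of Python — a deliberately simplified, hypothetical model of how Mercurial combines each topological head's .hgtags file (real tag resolution also handles tag removal and revision ordering); the node hashes below are made up:]

```python
# Hypothetical simplification: Mercurial's effective tag set is the
# union of the .hgtags contents found on every topological head.
def tag_union(hgtags_per_head):
    """Each element is the text of one head's .hgtags file."""
    tags = {}
    for content in hgtags_per_head:
        for line in content.splitlines():
            if line.strip():
                node, name = line.split(None, 1)
                tags[name] = node  # later entries win, as in a real .hgtags
    return tags

# The v3.2.5 tag lives only in the 3.2 head's .hgtags, yet it is
# still visible in the union:
head_3_2 = "deadbeefcafe v3.2.5\n"
head_default = "0123456789ab v3.3.0\n"
print(sorted(tag_union([head_3_2, head_default])))  # ['v3.2.5', 'v3.3.0']
```

This is why the tag keeps working from 3.3 and default — and why it would stop working if the 3.2 head disappeared without its .hgtags being merged forward.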
From rdmurray at bitdance.com Wed Aug 21 22:52:21 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Wed, 21 Aug 2013 16:52:21 -0400 Subject: [Python-Dev] Status of 3.2 in Hg repository? In-Reply-To: References: <20130820101039.4eea3d81@pitrou.net> <20130820171605.DB59F2507B6@webabinitio.net> <20130820192758.19e4953f@fsol> <20130820195527.7bd0253a@fsol> Message-ID: <20130821205222.4B9A8250160@webabinitio.net> On Wed, 21 Aug 2013 14:34:33 -0500, Tim Peters wrote: > [Brett] > > ... > > After reading that sentence I realize there is a key "not" missing: "I see > > no reason NOT to help visibly shutter the 3.2. branch ...". IOW I say do the > > null merge. Sorry about that. > > No problem! Since I've been inactive for a long time, it's good for > me to practice vigorously defending what's currently documented - > tests my understanding, and lets me annoy people at the same time ;-) > > Here's what I intend to do (unless an objection appears): > > hg up 3.3 > hg merge 3.2 > # merge in the v3.2.5 tag definition from .hgtags, > # but revert everything else > hg revert -a -X .hgtags -r . > hg resolve -a -m > hg diff # to ensure that only the v3.2.5 tag in .hgtags changed > hg commit You'll need a push here, too. And at that point it may fail. It may be the case that only Georg can push to 3.2, I don't remember for sure. (Note that it may not have been Antoine who did the push of that patch to 3.2...if Georg used transplant, for example, it would show as Antoine's commit, IIUC.) I agree that it would cause less developer mind-overhead if the branch were merged. Georg has been scarce lately...if the branch is locked, there are people besides him who can unlock it (Antoine, for one). --David From tjreedy at udel.edu Wed Aug 21 22:59:36 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 21 Aug 2013 16:59:36 -0400 Subject: [Python-Dev] Status of 3.2 in Hg repository? 
In-Reply-To: <20130821205222.4B9A8250160@webabinitio.net> References: <20130820101039.4eea3d81@pitrou.net> <20130820171605.DB59F2507B6@webabinitio.net> <20130820192758.19e4953f@fsol> <20130820195527.7bd0253a@fsol> <20130821205222.4B9A8250160@webabinitio.net> Message-ID: On 8/21/2013 4:52 PM, R. David Murray wrote: > On Wed, 21 Aug 2013 14:34:33 -0500, Tim Peters wrote: >> [Brett] >>> ... >>> After reading that sentence I realize there is a key "not" missing: "I see >>> no reason NOT to help visibly shutter the 3.2. branch ...". IOW I say do the >>> null merge. Sorry about that. >> >> No problem! Since I've been inactive for a long time, it's good for >> me to practice vigorously defending what's currently documented - >> tests my understanding, and lets me annoy people at the same time ;-) >> >> Here's what I intend to do (unless an objection appears): >> >> hg up 3.3 >> hg merge 3.2 >> # merge in the v3.2.5 tag definition from .hgtags, >> # but revert everything else >> hg revert -a -X .hgtags -r . >> hg resolve -a -m >> hg diff # to ensure that only the v3.2.5 tag in .hgtags changed >> hg commit > > You'll need a push here, too. And at that point it may fail. It may be > the case that only Georg can push to 3.2, I don't remember for sure. Where is the push to 3.2? I only see changes to 3.3 (to be repeated with 3.4). > I agree that it would cause less developer mind-overhead if the branch > were merged. -- Terry Jan Reedy From rdmurray at bitdance.com Wed Aug 21 23:10:52 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Wed, 21 Aug 2013 17:10:52 -0400 Subject: [Python-Dev] Status of 3.2 in Hg repository? In-Reply-To: References: <20130820101039.4eea3d81@pitrou.net> <20130820171605.DB59F2507B6@webabinitio.net> <20130820192758.19e4953f@fsol> <20130820195527.7bd0253a@fsol> <20130821205222.4B9A8250160@webabinitio.net> Message-ID: <20130821211052.5B4F42500B9@webabinitio.net> On Wed, 21 Aug 2013 16:59:36 -0400, Terry Reedy wrote: > On 8/21/2013 4:52 PM, R. 
David Murray wrote: > > On Wed, 21 Aug 2013 14:34:33 -0500, Tim Peters wrote: > >> [Brett] > >>> ... > >>> After reading that sentence I realize there is a key "not" missing: "I see > >>> no reason NOT to help visibly shutter the 3.2. branch ...". IOW I say do the > >>> null merge. Sorry about that. > >> > >> No problem! Since I've been inactive for a long time, it's good for > >> me to practice vigorously defending what's currently documented - > >> tests my understanding, and lets me annoy people at the same time ;-) > >> > >> Here's what I intend to do (unless an objection appears): > >> > >> hg up 3.3 > >> hg merge 3.2 > >> # merge in the v3.2.5 tag definition from .hgtags, > >> # but revert everything else > >> hg revert -a -X .hgtags -r . > >> hg resolve -a -m > >> hg diff # to ensure that only the v3.2.5 tag in .hgtags changed > >> hg commit > > > > You'll need a push here, too. And at that point it may fail. It may be > > the case that only Georg can push to 3.2, I don't remember for sure. > > Where is the push to 3.2? I only see changes to 3.3 (to be repeated with > 3.4). Ah, good point. --David From tim.peters at gmail.com Wed Aug 21 23:12:17 2013 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 21 Aug 2013 16:12:17 -0500 Subject: [Python-Dev] Status of 3.2 in Hg repository? In-Reply-To: References: <20130820101039.4eea3d81@pitrou.net> <20130820171605.DB59F2507B6@webabinitio.net> <20130820192758.19e4953f@fsol> <20130820195527.7bd0253a@fsol> <20130821205222.4B9A8250160@webabinitio.net> Message-ID: [Tim] >>> ... >>> Here's what I intend to do (unless an objection appears): >>> >>> hg up 3.3 >>> hg merge 3.2 >>> # merge in the v3.2.5 tag definition from .hgtags, >>> # but revert everything else >>> hg revert -a -X .hgtags -r . >>> hg resolve -a -m >>> hg diff # to ensure that only the v3.2.5 tag in .hgtags changed >>> hg commit [R. David Murray] >> You'll need a push here, too. And at that point it may fail. 
It may be >> the case that only Georg can push to 3.2, I don't remember for sure. [Terry Reedy] > Where is the push to 3.2? I only see changes to 3.3 (to be repeated with > 3.4). Hi, Terry - long time no type :-) Merging 3.2 into 3.3 does make the original 3.2 head a parent of the merge changeset - I don't know whether whatever restrictions are in place would prevent that. I'd sure be _surprised_ if it prevented it, though ;-) From timothy.c.delaney at gmail.com Thu Aug 22 00:23:01 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Thu, 22 Aug 2013 08:23:01 +1000 Subject: [Python-Dev] Status of 3.2 in Hg repository? In-Reply-To: References: <20130820101039.4eea3d81@pitrou.net> <20130820171605.DB59F2507B6@webabinitio.net> <20130820192758.19e4953f@fsol> <20130820195527.7bd0253a@fsol> Message-ID: On 22 August 2013 05:34, Tim Peters wrote: > Anyone know a reason not to do: > > hg -y merge --tool=internal:fail 3.2 > > instead? I saw that idea on some Hg wiki. That would be from http://mercurial.selenic.com/wiki/TipsAndTricks#Keep_.22My.22_or_.22Their.22_files_when_doing_a_merge. I think it's a perfectly reasonable approach. I expanded on it a little to make it more general (to choose which parent to discard) in http://stackoverflow.com/questions/14984793/mercurial-close-default-branch-and-replace-with-a-named-branch-as-new-default/14991454#14991454 . Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From storchaka at gmail.com Thu Aug 22 11:31:13 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 22 Aug 2013 12:31:13 +0300 Subject: [Python-Dev] cpython: Cleanup test_builtin In-Reply-To: <3cL5SY00YTz7LlF@mail.python.org> References: <3cL5SY00YTz7LlF@mail.python.org> Message-ID: On 22.08.13 02:59, victor.stinner wrote:

> http://hg.python.org/cpython/rev/0a1e1b929665
> changeset: 85308:0a1e1b929665
> user: Victor Stinner
> date: Thu Aug 22 01:58:12 2013 +0200
> summary:
> Cleanup test_builtin
>
> files:
> Lib/test/test_builtin.py | 16 ++++------------
> 1 files changed, 4 insertions(+), 12 deletions(-)

[...]

> def test_open(self):
>     self.write_testfile()
>     fp = open(TESTFN, 'r')
> -   try:
> +   with fp:
>         self.assertEqual(fp.readline(4), '1+1\n')
>         self.assertEqual(fp.readline(), 'The quick brown fox jumps over the lazy dog.\n')
>         self.assertEqual(fp.readline(4), 'Dear')
>         self.assertEqual(fp.readline(100), ' John\n')
>         self.assertEqual(fp.read(300), 'XXX'*100)
>         self.assertEqual(fp.read(1000), 'YYY'*100)
> -   finally:
> -       fp.close()
> -       unlink(TESTFN)

You forgot self.addCleanup(unlink, TESTFN) (here and in other places). From petri at digip.org Thu Aug 22 13:00:06 2013 From: petri at digip.org (Petri Lehtinen) Date: Thu, 22 Aug 2013 14:00:06 +0300 Subject: [Python-Dev] When to remove deprecated stuff In-Reply-To: References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> Message-ID: <20130822110006.GP20273@p29> Terry Reedy wrote: > On 8/15/2013 8:29 AM, R. David Murray wrote: > > >A number of us (I don't know how many) have clearly been thinking about > >"Python 4" as the time when we remove cruft. This will not cause any > >backward compatibility issues for anyone who has paid heed to the > >deprecation warnings, but will for those who haven't. 
The question > >then becomes, is it better to "bundle" these removals into the > >Python 4 release, or do them incrementally? > > 4.0 will be at most 6 releases after the upcoming 3.4, which is 9 to > 12 years, which is 7 to 10 years after any regular 2.7 maintenance > ends. > > The deprecated unittest synonyms are documented as being removed in > 4.0 and that already defines 4.0 as a future cruft-removal release. > However, I would not want it defined as the only cruft-removal > release and used as a reason or excuse to suspend removals until > then. I would personally prefer to do little* removals > incrementally, as was done before the decision to put off 2.x > removals to 3.0. So I would have 4.0 be an 'extra' or 'bigger' cruft > removal release, but not the only one. > > * Removing one or two pure synonyms or little used features from a > module. The unittest synonym removal is not 'little' because there > are 13 synonyms and at least some were well used. > > >If we are going to do them incrementally we should make that decision > >soonish, so that we don't end up having a whole bunch happen at once > >and defeat the (theoretical) purpose of doing them incrementally. > > > >(I say theoretical because what is the purpose? To spread out the > >breakage pain over multiple releases, so that every release breaks > >something?) > > Little removals will usually break something, but not most things. > Yes, I think it better to upset a few people with each release than > lots of people all at once. I think enabling deprecation notices in > unittest is a great idea. Among other reasons, it should spread the > effect of bigger removals scheduled farther in the future over the > extended deprecation period. > > Most deprecation notices should provide an alternative. (There might > be an exception for things that should not be done ;-). For > module removals, the alternative should be a legacy package on PyPI. 
Removing some cruft on each release can be very painful for users. Django's deprecation policy works like this: They deprecate something in version A.B. It still works normally in A.B+1, generates a (silenced) DeprecationWarning in A.B+2, and is finally removed in A.B+3. So if I haven't read through the full release notes of each release nor enabled DeprecationWarnings, I end up having something broken each time I upgrade Django. I hope the same will not start happening each time I upgrade Python. When the removals happen on major version boundaries, it should be more evident that something will break. Then people can enable DeprecationWarnings and test all the affected code thoroughly with the new version before upgrading. Petri From victor.stinner at gmail.com Thu Aug 22 13:48:53 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 22 Aug 2013 13:48:53 +0200 Subject: [Python-Dev] cpython: Cleanup test_builtin In-Reply-To: References: <3cL5SY00YTz7LlF@mail.python.org> Message-ID: > You forgot self.addCleanup(unlink, TESTFN) (here and in other places). These functions call write_testfile() which creates the file but also schedules its removal when the test is done (since my changeset):

    def write_testfile(self):
        # NB the first 4 lines are also used to test input, below
        fp = open(TESTFN, 'w')
        self.addCleanup(unlink, TESTFN)
        ...

Victor 2013/8/22 Serhiy Storchaka : > On 22.08.13 02:59, victor.stinner wrote: > >> http://hg.python.org/cpython/rev/0a1e1b929665 >> changeset: 85308:0a1e1b929665 >> user: Victor Stinner >> date: Thu Aug 22 01:58:12 2013 +0200 >> summary: >> Cleanup test_builtin >> >> files: >> Lib/test/test_builtin.py | 16 ++++------------ >> 1 files changed, 4 insertions(+), 12 deletions(-) > > [...] 
>> def test_open(self):
>>     self.write_testfile()
>>     fp = open(TESTFN, 'r')
>> -   try:
>> +   with fp:
>>         self.assertEqual(fp.readline(4), '1+1\n')
>>         self.assertEqual(fp.readline(), 'The quick brown fox jumps over the lazy dog.\n')
>>         self.assertEqual(fp.readline(4), 'Dear')
>>         self.assertEqual(fp.readline(100), ' John\n')
>>         self.assertEqual(fp.read(300), 'XXX'*100)
>>         self.assertEqual(fp.read(1000), 'YYY'*100)
>> -   finally:
>> -       fp.close()
>> -       unlink(TESTFN)
>
> You forgot self.addCleanup(unlink, TESTFN) (here and in other places).
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com

From solipsis at pitrou.net Thu Aug 22 13:51:47 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 22 Aug 2013 13:51:47 +0200 Subject: [Python-Dev] When to remove deprecated stuff References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> <20130822110006.GP20273@p29> Message-ID: <20130822135147.1d39b2af@pitrou.net> On Thu, 22 Aug 2013 14:00:06 +0300, Petri Lehtinen wrote: > Terry Reedy wrote: > > On 8/15/2013 8:29 AM, R. David Murray wrote: > > > > >A number of us (I don't know how many) have clearly been thinking > > >about "Python 4" as the time when we remove cruft. This will not > > >cause any backward compatibility issues for anyone who has paid > > >heed to the deprecation warnings, but will for those who haven't. > > >The question > > >then becomes, is it better to "bundle" these removals > > >into the Python 4 release, or do them incrementally? > > > > 4.0 will be at most 6 releases after the upcoming 3.4, which is 9 to > > 12 years, which is 7 to 10 years after any regular 2.7 maintenance > > ends. 
> > > > The deprecated unittest synonyms are documented as being removed in > > 4.0 and that already defines 4.0 as a future cruft-removal release. > > However, I would not want it defined as the only cruft-removal > > release and used as a reason or excuse to suspend removals until > > then. I would personally prefer to do little* removals > > incrementally, as was done before the decision to put off 2.x > > removals to 3.0. So I would have 4.0 be an 'extra' or 'bigger' cruft > > removal release, but not the only one. > > > > * Removing one or two pure synonyms or little used features from a > > module. The unittest synonym removal is not 'little' because there > > are 13 synonyms and at least some were well used. > > > > >If we are going to do them incrementally we should make that > > >decision soonish, so that we don't end up having a whole bunch > > >happen at once and defeat the (theoretical) purpose of doing them > > >incrementally. > > > > > >(I say theoretical because what is the purpose? To spread out the > > >breakage pain over multiple releases, so that every release breaks > > >something?) > > > > Little removals will usually break something, but not most things. > > Yes, I think it better to upset a few people with each release than > > lots of people all at once. I think enabling deprecation notices in > > unittest is a great idea. Among other reasons, it should spread the > > effect of bigger removals scheduled farther in the future over the > > extended deprecation period. > > > > Most deprecation notices should provide an alternative. (There might > > be an exception is for things that should not be done ;-). For > > module removals, the alternative should be a legacy package on PyPI. > > Removing some cruft on each release can be very painful for users. > > Django's deprecation policy works like this: They deprecate something > in version A.B. 
It still works normally in A.B+1, generates a > (silenced) DeprecationWarning in A.B+2, and is finally removed in > A.B+3. So if I haven't read through the full release notes of each > release nor enabled DeprecationWarnings, I end up having something > broken each time I upgrade Django. So, again, perhaps we can stop silencing DeprecationWarning. Regards Antoine. From fuzzyman at voidspace.org.uk Thu Aug 22 15:45:29 2013 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Thu, 22 Aug 2013 16:45:29 +0300 Subject: [Python-Dev] When to remove deprecated stuff In-Reply-To: <20130822110006.GP20273@p29> References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> <20130822110006.GP20273@p29> Message-ID: <2539AE1A-B4EC-4568-BFE5-C036369FA1A1@voidspace.org.uk> On 22 Aug 2013, at 14:00, Petri Lehtinen wrote: > Terry Reedy wrote: >> On 8/15/2013 8:29 AM, R. David Murray wrote: >> >>> A number of us (I don't know how many) have clearly been thinking about >>> "Python 4" as the time when we remove cruft. This will not cause any >>> backward compatibility issues for anyone who has paid heed to the >>> deprecation warnings, but will for those who haven't. The question >>> then becomes, is it better to "bundle" these removals into the >>> Python 4 release, or do them incrementally? >> >> 4.0 will be at most 6 releases after the upcoming 3.4, which is 9 to >> 12 years, which is 7 to 10 years after any regular 2.7 maintainance >> ends. >> >> The deprecated unittest synonyms are documented as being removed in >> 4.0 and that already defines 4.0 as a future cruft-removal release. >> However, I would not want it defined as the only cruft-removal >> release and used as a reason or excuse to suspend removals until >> then. I would personally prefer to do little* removals >> incrementally, as was done before the decision to put off 2.x >> removals to 3.0. 
So I would have 4.0 be an 'extra' or 'bigger' cruft >> removal release, but not the only one. >> >> * Removing one or two pure synonyms or little used features from a >> module. The unittest synonym removal is not 'little' because there >> are 13 synonyms and at least some were well used. >> >>> If we are going to do them incrementally we should make that decision >>> soonish, so that we don't end up having a whole bunch happen at once >>> and defeat the (theoretical) purpose of doing them incrementally. >>> >>> (I say theoretical because what is the purpose? To spread out the >>> breakage pain over multiple releases, so that every release breaks >>> something?) >> >> Little removals will usually break something, but not most things. >> Yes, I think it better to upset a few people with each release than >> lots of people all at once. I think enabling deprecation notices in >> unittest is a great idea. Among other reasons, it should spread the >> effect of bigger removals scheduled farther in the future over the >> extended deprecation period. >> >> Most deprecation notices should provide an alternative. (There might >> be an exception is for things that should not be done ;-). For >> module removals, the alternative should be a legacy package on PyPI. > > Removing some cruft on each release can be very painful for users. > > Django's deprecation policy works like this: They deprecate something > in version A.B. It still works normally in A.B+1, generates a > (silenced) DeprecationWarning in A.B+2, and is finally removed in > A.B+3. So if I haven't read through the full release notes of each > release nor enabled DeprecationWarnings, I end up having something > broken each time I upgrade Django. > So you're still using features deprecated three releases ago, you haven't checked for DeprecationWarnings and it's Django making your life difficult? Why not check for the deprecation warnings? Michael > I hope the same will not start happening each time I upgrade Python. 
> When the removals happen on major version boundaries, it should be > more evident that something will break. Then people can enable > DeprecationWarnings and test all the affected code thoroughly with the > new version before upgrading. > > Petri > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From storchaka at gmail.com Thu Aug 22 16:00:03 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 22 Aug 2013 17:00:03 +0300 Subject: [Python-Dev] cpython: Cleanup test_builtin In-Reply-To: References: <3cL5SY00YTz7LlF@mail.python.org> Message-ID: 22.08.13 14:48, Victor Stinner wrote: >> You forgot self.addCleanup(unlink, TESTFN) (here and in other places). > > These functions call write_testfile() which creates the file but also > schedules its removal when the test is done (since my changeset): > > def write_testfile(self): > # NB the first 4 lines are also used to test input, below > fp = open(TESTFN, 'w') > self.addCleanup(unlink, TESTFN) > ... Oh, sorry. From rdmurray at bitdance.com Thu Aug 22 16:43:24 2013 From: rdmurray at bitdance.com (R.
David Murray) Date: Thu, 22 Aug 2013 10:43:24 -0400 Subject: [Python-Dev] When to remove deprecated stuff In-Reply-To: <2539AE1A-B4EC-4568-BFE5-C036369FA1A1@voidspace.org.uk> References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> <20130822110006.GP20273@p29> <2539AE1A-B4EC-4568-BFE5-C036369FA1A1@voidspace.org.uk> Message-ID: <20130822144324.EE0FA25014C@webabinitio.net> On Thu, 22 Aug 2013 16:45:29 +0300, Michael Foord wrote: > > On 22 Aug 2013, at 14:00, Petri Lehtinen wrote: > > > Terry Reedy wrote: > >> On 8/15/2013 8:29 AM, R. David Murray wrote: > >> > >>> A number of us (I don't know how many) have clearly been thinking about > >>> "Python 4" as the time when we remove cruft. This will not cause any > >>> backward compatibility issues for anyone who has paid heed to the > >>> deprecation warnings, but will for those who haven't. The question > >>> then becomes, is it better to "bundle" these removals into the > >>> Python 4 release, or do them incrementally? > >> > >> 4.0 will be at most 6 releases after the upcoming 3.4, which is 9 to > >> 12 years, which is 7 to 10 years after any regular 2.7 maintainance > >> ends. > >> > >> The deprecated unittest synonyms are documented as being removed in > >> 4.0 and that already defines 4.0 as a future cruft-removal release. > >> However, I would not want it defined as the only cruft-removal > >> release and used as a reason or excuse to suspend removals until > >> then. I would personally prefer to do little* removals > >> incrementally, as was done before the decision to put off 2.x > >> removals to 3.0. So I would have 4.0 be an 'extra' or 'bigger' cruft > >> removal release, but not the only one. > >> > >> * Removing one or two pure synonyms or little used features from a > >> module. 
The unittest synonym removal is not 'little' because there > >> are 13 synonyms and at least some were well used. > >> > >>> If we are going to do them incrementally we should make that decision > >>> soonish, so that we don't end up having a whole bunch happen at once > >>> and defeat the (theoretical) purpose of doing them incrementally. > >>> > >>> (I say theoretical because what is the purpose? To spread out the > >>> breakage pain over multiple releases, so that every release breaks > >>> something?) > >> > >> Little removals will usually break something, but not most things. > >> Yes, I think it better to upset a few people with each release than > >> lots of people all at once. I think enabling deprecation notices in > >> unittest is a great idea. Among other reasons, it should spread the > >> effect of bigger removals scheduled farther in the future over the > >> extended deprecation period. > >> > >> Most deprecation notices should provide an alternative. (There might > >> be an exception is for things that should not be done ;-). For > >> module removals, the alternative should be a legacy package on PyPI. > > > > Removing some cruft on each release can be very painful for users. > > > > Django's deprecation policy works like this: They deprecate something > > in version A.B. It still works normally in A.B+1, generates a > > (silenced) DeprecationWarning in A.B+2, and is finally removed in > > A.B+3. So if I haven't read through the full release notes of each > > release nor enabled DeprecationWarnings, I end up having something > > broken each time I upgrade Django. > > > > So you're still using features deprecated three releases ago, you haven't checked for DeprecationWarnings and it's Django making your life difficult? > > Why not check for the deprecation warnings? Doing so makes very little difference. 
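The behaviour being debated is easy to see with the stdlib warnings machinery; in this sketch, old_api() is an invented stand-in for any deprecated function:

```python
import warnings

def old_api():
    # Hypothetical deprecated function, for illustration only.
    warnings.warn("old_api() is deprecated; use new_api() instead",
                  DeprecationWarning, stacklevel=2)

# By default CPython ignores DeprecationWarning, so users see nothing:
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("ignore", DeprecationWarning)
    old_api()
silenced = len(caught)    # 0 -- the warning was swallowed

# What running with -Wa (or a test runner that resets the filters)
# restores:
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_api()
surfaced = len(caught)    # 1 -- now the deprecation is visible

assert silenced == 0 and surfaced == 1
```

Checking for deprecation warnings before an upgrade amounts to running the test suite in the second mode rather than the first.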
This is my opinion (others obviously differ): Putting in one big chunk of effort at a major release boundary is easier to schedule than putting in a chunk of effort on *every* feature release. More importantly, having it happen only at the major release boundary means there's only one hard deadline every ten-ish years, rather than a hard deadline every 1.5 years. Expecting things to break when you switch to the new feature release makes one view feature releases with dread rather than excitement. This applies whether or not one is testing with deprecation warnings on. Yes, there's a little less pressure if you are making the fixes on the deprecation release boundary, because you can always ship the code anyway if it winds up being too big of a bear, so you have more scheduling flexibility. But you still face the *psychological* hurdle of "feature release upgrade...will need to fix all the things they've deprecated...let's put that off". Especially since what we are talking about here is the *big* cruft, and thus more likely to be a pain to fix. So, the operant question is which do the majority of *users* prefer, some required "fix my code" work at every feature release, or the ability to schedule the "fix my code" work at their convenience, with a hard deadline (for anything not already fixed) at the major release boundary? Note that under this suggested scenario the amount of work people will need to do for Python4 will be trivial compared to that for Python3, and there won't be any controversy about single-code-base vs conversion, because we'll still be maintaining backward compatibility and just removing cruft. So the user base will probably heave a sigh of relief (those that were around for this transition, at least :) rather than a groan. On the other hand, it *does* make a Python4 transition still a big deal ("that package doesn't support Python4 yet, it still uses old cruft XXX".) Ports sure will be easier than with Python3, though!
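A project that wanted to schedule removals this explicitly could encode the timeline in one place; the following is only a sketch (the deprecated() decorator, the version tuples, and spam() are all invented for illustration, not anything in CPython):

```python
import functools
import warnings

CURRENT = (3, 4)   # hypothetical "currently running" version

def deprecated(since, removed_in):
    """Warn between `since` and `removed_in`; fail once past removal."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if CURRENT >= removed_in:
                raise RuntimeError(
                    "%s was removed in %d.%d"
                    % ((func.__name__,) + removed_in))
            if CURRENT >= since:
                warnings.warn(
                    "%s is deprecated since %d.%d and will be removed in %d.%d"
                    % ((func.__name__,) + since + removed_in),
                    DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)
        return wrapper
    return decorator

@deprecated(since=(3, 4), removed_in=(4, 0))
def spam():
    return "still here"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = spam()

assert result == "still here" and len(caught) == 1
```

Bumping CURRENT past the removal boundary flips the call from a warning to a hard error, which is exactly the "deadline at the major release" trade-off under discussion.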
Also, even without removing big cruft, there *will* be things that need to be fixed when switching to a new feature release, so I'm really talking about relative levels of pain and when the bigger pain occurs. How does one judge what the optimal amount of change is? It would be great if we could figure out how to figure out what the users want. We more or less have one user opinion so far, from Petri, based on his experience as a Django user. We developers are also users, of course, but our opinions are colored by our needs as developers as well, so we aren't reliable judges. --David PS: When thinking about this, remember that our effective policy for (the second half of?) Python2 was to hold all the big cruft removal until Python3. Even some stuff that was originally scheduled to be removed sooner got left in. So our user base is currently used to things being pretty stable from a deprecation/backward compatibility standpoint. From rosuav at gmail.com Thu Aug 22 17:07:55 2013 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 23 Aug 2013 01:07:55 +1000 Subject: [Python-Dev] When to remove deprecated stuff In-Reply-To: <2539AE1A-B4EC-4568-BFE5-C036369FA1A1@voidspace.org.uk> References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> <20130822110006.GP20273@p29> <2539AE1A-B4EC-4568-BFE5-C036369FA1A1@voidspace.org.uk> Message-ID: On Thu, Aug 22, 2013 at 11:45 PM, Michael Foord wrote: > On 22 Aug 2013, at 14:00, Petri Lehtinen wrote: >> Django's deprecation policy works like this: They deprecate something >> in version A.B. It still works normally in A.B+1, generates a >> (silenced) DeprecationWarning in A.B+2, and is finally removed in >> A.B+3. So if I haven't read through the full release notes of each >> release nor enabled DeprecationWarnings, I end up having something >> broken each time I upgrade Django. 
>> > > So you're still using features deprecated three releases ago, you haven't checked for DeprecationWarnings and it's Django making your life difficult? > > Why not check for the deprecation warnings? Sounds like the DeprecationWarnings give you just one version of advance notice. You would have to be (a) upgrading every version as it comes out, and (b) checking your log of warnings prior to every upgrade. Neither A.B nor A.B+1 will help you even if you check the warnings. So it would still require checking the full release notes every time, if you want to know about what's being deprecated. Seems a lot of annoying breakage to me. Python is frequently not upgraded release-by-release. I've had servers jump several versions at a time; my oldest server now is running 3.1.1 (and 2.6.4), so when it eventually gets upgraded, it'll probably jump to 3.3 or 3.4. Unless something's producing visible warnings all the way back to 3.1, removal in 3.4 has the potential to be surprising. ChrisA From ezio.melotti at gmail.com Thu Aug 22 19:34:44 2013 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Thu, 22 Aug 2013 19:34:44 +0200 Subject: [Python-Dev] When to remove deprecated stuff In-Reply-To: <20130822110006.GP20273@p29> References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> <20130822110006.GP20273@p29> Message-ID: Hi, On Thu, Aug 22, 2013 at 1:00 PM, Petri Lehtinen wrote: > > Removing some cruft on each release can be very painful for users. > > Django's deprecation policy works like this: They deprecate something > in version A.B. It still works normally in A.B+1, generates a > (silenced) DeprecationWarning in A.B+2, and is finally removed in > A.B+3. I see two problems with this: 1) DeprecationWarnings should be generated as soon as the feature is deprecated (i.e. A.B, not A.B+2). Holding off the warnings is not helping anyone. 
2) The deprecation period seems fixed and independent from the feature. IMHO the period should vary depending on what is being deprecated. Little known/used "features" could be deprecated in A.B and removed in A.B+1; common "features" can be deprecated in A.B and removed in A.B+n, with an n >= 2 (or even wait for A+1). > So if I haven't read through the full release notes of each > release nor enabled DeprecationWarnings, I end up having something > broken each time I upgrade Django. > Reading the release notes shouldn't be required -- DeprecationWarnings are already supposed to tell you what to change. If you have good test coverage this should happen automatically (at least with unittest), but even if you don't you should run your code with -Wa before upgrading (or test your code on the new version before upgrading Python/Django/etc. in production). Best Regards, Ezio Melotti > I hope the same will not start happening each time I upgrade Python. > When the removals happen on major version boundaries, it should be > more evident that something will break. Then people can enable > DepreationWarnings and test all the affected code thoroughly with the > new version before upgrading. > > Petri From donald at stufft.io Thu Aug 22 19:45:30 2013 From: donald at stufft.io (Donald Stufft) Date: Thu, 22 Aug 2013 13:45:30 -0400 Subject: [Python-Dev] When to remove deprecated stuff In-Reply-To: References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> <20130822110006.GP20273@p29> Message-ID: On Aug 22, 2013, at 1:34 PM, Ezio Melotti wrote: > Hi, > > On Thu, Aug 22, 2013 at 1:00 PM, Petri Lehtinen wrote: >> >> Removing some cruft on each release can be very painful for users. >> >> Django's deprecation policy works like this: They deprecate something >> in version A.B. 
It still works normally in A.B+1, generates a >> (silenced) DeprecationWarning in A.B+2, and is finally removed in >> A.B+3. > > I see two problems with this: > 1) DeprecationWarnings should be generated as soon as the feature is > deprecated (i.e. A.B, not A.B+2). Holding off the warnings is not > helping anyone. > 2) The deprecation period seems fixed and independent from the > feature. IMHO the period should vary depending on what is being > deprecated. Little known/used "features" could be deprecated in A.B > and removed in A.B+1; common "features" can be deprecated in A.B and > removed in A.B+n, with an n >= 2 (or even wait for A+1). This isn't exactly accurate, it raises a PendingDeprecation warning in A.B, Deprecation in A.B+1, and removed in A.B+2. PendingDeprecation (In Django) was designed to be silent by default and Deprecation loud by default. That got messed up when Python silenced Deprecation warnings by default and we've had to resort to some ugliness to restore that behavior. > >> So if I haven't read through the full release notes of each >> release nor enabled DeprecationWarnings, I end up having something >> broken each time I upgrade Django. >> > > Reading the release notes shouldn't be required -- DeprecationWarnings > are already supposed to tell you what to change. > If you have good test coverage this should happen automatically (at > least with unittest), but even if you don't you should run your code > with -Wa before upgrading (or test your code on the new version before > upgrading Python/Django/etc. in production). > > Best Regards, > Ezio Melotti > >> I hope the same will not start happening each time I upgrade Python. >> When the removals happen on major version boundaries, it should be >> more evident that something will break. Then people can enable >> DepreationWarnings and test all the affected code thoroughly with the >> new version before upgrading. 
>> >> Petri > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/donald%40stufft.io ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: Message signed with OpenPGP using GPGMail URL: From ezio.melotti at gmail.com Thu Aug 22 20:00:14 2013 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Thu, 22 Aug 2013 20:00:14 +0200 Subject: [Python-Dev] When to remove deprecated stuff In-Reply-To: <20130822144324.EE0FA25014C@webabinitio.net> References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> <20130822110006.GP20273@p29> <2539AE1A-B4EC-4568-BFE5-C036369FA1A1@voidspace.org.uk> <20130822144324.EE0FA25014C@webabinitio.net> Message-ID: On Thu, Aug 22, 2013 at 4:43 PM, R. David Murray wrote: > On Thu, 22 Aug 2013 16:45:29 +0300, Michael Foord wrote: >> >> On 22 Aug 2013, at 14:00, Petri Lehtinen wrote: >> > >> > Django's deprecation policy works like this: They deprecate something >> > in version A.B. It still works normally in A.B+1, generates a >> > (silenced) DeprecationWarning in A.B+2, and is finally removed in >> > A.B+3. So if I haven't read through the full release notes of each >> > release nor enabled DeprecationWarnings, I end up having something >> > broken each time I upgrade Django. >> > >> >> So you're still using features deprecated three releases ago, you haven't checked for DeprecationWarnings and it's Django making your life difficult? >> >> Why not check for the deprecation warnings? > > Doing so makes very little difference. 
> > This is my opinion (others obviously differ): > > Putting in one big chunk of effort at a major release boundary is easier > to schedule than putting in a chunk of effort on *every* feature > release. IMHO there is a third (and better option) that you are missing. Assume I'm using A.B, and see some DeprecationWarnings. Now I have at least 1.5 years to fix them before A.B+1 is released, and once that happens there shouldn't be any warnings left so I can upgrade successfully. Once I do, more warnings will pop up, but then again I will have 1.5+ years to fix them. It seems to me that the problem only arises when the developers ignore (or possibly are unaware of) the warnings until it's time to upgrade. > More importantly, having it happen only at the major release > boundary means there's only one hard deadline every ten-ish years, rather > than a hard deadline every 1.5 years. > > [...] > > How does one judge what the optimal amount of change is? > > It would be great if we could figure out how to figure out what the > users want. We more or less have one user opinion so far, from Petri, > based on his experience as a Django user. We developers are also users, > of course, but our opinions are colored by our needs as developers as > well, so we aren't reliable judges. As I see it there are 3 groups: 1) developers writing libraries/frameworks/interpreters; 2) developers using these libraries/frameworks/interpreters; 3) end users using the applications wrote by the developers. The first group is responsible of warning the second group of the things that got deprecated and give them enough time to update their code. The second group is responsible to listen to the warnings and update their code accordingly. The third group is responsible to sit back and enjoy our hard work without seeing warnings/errors. Best Regards, Ezio Melotti > > --David > > PS: When thinking about this, remember that our effective policy for > (the second half of?) 
Python2 was to hold all the big cruft removal until > Python3. Even some stuff that was originally scheduled to be removed > sooner got left in. So our user base is currently used to things being > pretty stable from a deprecation/backward compatibility standpoint. From ezio.melotti at gmail.com Thu Aug 22 20:14:54 2013 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Thu, 22 Aug 2013 20:14:54 +0200 Subject: [Python-Dev] When to remove deprecated stuff In-Reply-To: References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> <20130822110006.GP20273@p29> Message-ID: On Thu, Aug 22, 2013 at 7:45 PM, Donald Stufft wrote: > > On Aug 22, 2013, at 1:34 PM, Ezio Melotti wrote: > >> Hi, >> >> On Thu, Aug 22, 2013 at 1:00 PM, Petri Lehtinen wrote: >>> >>> Removing some cruft on each release can be very painful for users. >>> >>> Django's deprecation policy works like this: They deprecate something >>> in version A.B. It still works normally in A.B+1, generates a >>> (silenced) DeprecationWarning in A.B+2, and is finally removed in >>> A.B+3. >> >> I see two problems with this: >> 1) DeprecationWarnings should be generated as soon as the feature is >> deprecated (i.e. A.B, not A.B+2). Holding off the warnings is not >> helping anyone. >> 2) The deprecation period seems fixed and independent from the >> feature. IMHO the period should vary depending on what is being >> deprecated. Little known/used "features" could be deprecated in A.B >> and removed in A.B+1; common "features" can be deprecated in A.B and >> removed in A.B+n, with an n >= 2 (or even wait for A+1). > > This isn't exactly accurate, it raises a PendingDeprecation warning in A.B, > Deprecation in A.B+1, and removed in A.B+2. > > PendingDeprecation (In Django) was designed to be silent by default > and Deprecation loud by default. 
That got messed up when Python > silenced Deprecation warnings by default and we've had to resort to > some ugliness to restore that behavior. > So it's not much different from what we do now, except that we basically stopped using PendingDeprecationWarning -> DeprecationWarning and just use DeprecationWarnings from the beginning. I don't see many advantages in keeping the pending deprecation warnings silent for developers, as it just encourages procrastination :) One advantage is that under your scheme, one can assume that what shows up as deprecated (not pending deprecated) will be removed in the next version, so you could focus your work on them first, but this doesn't work for our scheme where a deprecated "feature" might stay there for a couple of versions. Maybe we should introduce a ``.removed_in`` attribute to DeprecationWarnings? We sometimes mention it in the deprecation message and the docs, but there's no way to get that information programmatically. Best Regards, Ezio Melotti >> >>> So if I haven't read through the full release notes of each >>> release nor enabled DeprecationWarnings, I end up having something >>> broken each time I upgrade Django. >>> >> >> Reading the release notes shouldn't be required -- DeprecationWarnings >> are already supposed to tell you what to change. >> If you have good test coverage this should happen automatically (at >> least with unittest), but even if you don't you should run your code >> with -Wa before upgrading (or test your code on the new version before >> upgrading Python/Django/etc. in production). >> >> Best Regards, >> Ezio Melotti >> >>> I hope the same will not start happening each time I upgrade Python. >>> When the removals happen on major version boundaries, it should be >>> more evident that something will break. Then people can enable >>> DeprecationWarnings and test all the affected code thoroughly with the >>> new version before upgrading.
>>> >>> Petri >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: http://mail.python.org/mailman/options/python-dev/donald%40stufft.io > > > ----------------- > Donald Stufft > PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA > From rdmurray at bitdance.com Thu Aug 22 20:40:33 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Thu, 22 Aug 2013 14:40:33 -0400 Subject: [Python-Dev] When to remove deprecated stuff In-Reply-To: References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> <20130822110006.GP20273@p29> <2539AE1A-B4EC-4568-BFE5-C036369FA1A1@voidspace.org.uk> <20130822144324.EE0FA25014C@webabinitio.net> Message-ID: <20130822184033.B15DB25003E@webabinitio.net> On Thu, 22 Aug 2013 20:00:14 +0200, Ezio Melotti wrote: > On Thu, Aug 22, 2013 at 4:43 PM, R. David Murray wrote: > > On Thu, 22 Aug 2013 16:45:29 +0300, Michael Foord wrote: > >> > >> On 22 Aug 2013, at 14:00, Petri Lehtinen wrote: > >> > > >> > Django's deprecation policy works like this: They deprecate something > >> > in version A.B. It still works normally in A.B+1, generates a > >> > (silenced) DeprecationWarning in A.B+2, and is finally removed in > >> > A.B+3. So if I haven't read through the full release notes of each > >> > release nor enabled DeprecationWarnings, I end up having something > >> > broken each time I upgrade Django. > >> > > >> > >> So you're still using features deprecated three releases ago, you haven't checked for DeprecationWarnings and it's Django making your life difficult? > >> > >> Why not check for the deprecation warnings? > > > > Doing so makes very little difference. 
> > > > This is my opinion (others obviously differ): > > > > Putting in one big chunk of effort at a major release boundary is easier > > to schedule than putting in a chunk of effort on *every* feature > > release. > > IMHO there is a third (and better option) that you are missing. > > Assume I'm using A.B, and see some DeprecationWarnings. Now I have at > least 1.5 years to fix them before A.B+1 is released, and once that > happens there shouldn't be any warnings left so I can upgrade > successfully. Once I do, more warnings will pop up, but then again I > will have 1.5+ years to fix them. > > It seems to me that the problem only arises when the developers ignore > (or possibly are unaware of) the warnings until it's time to upgrade. I think you missed my point. It is the *change itself* that causes action to be needed. If a project has a policy of dealing with deprecated features when the warnings happen, then they need to do that work before the version where the feature is removed is released. If they have a policy of ignoring deprecation warnings, then they have to do that work before their users can upgrade to the version where the feature is removed. So the pain exists in equal measure either way, with the same periodicity, only the timing of when the work is done is affected by whether or not you pay attention to deprecation warnings. And yes, you presumably have a more relaxed fix schedule and happier users if you pay attention to deprecation warnings, so you should do that (IMO). I'm asking if the bigger removals should be only on major version boundaries, thus allowing *more* time for that relaxed fix mode for the stuff that takes more work to fix. It does occur to me that this would mean we'd bikeshed about whether any given change was major or not...but we'd do that anyway, because we'd argue about h^h^h^h^h^h^h^h^h discuss how many releases to wait before actually removing it, depending on how disruptive it was likely to be. 
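Ezio's ``.removed_in`` idea from upthread can be prototyped today with a warning subclass; nothing like this exists in CPython, so the class name and attribute here are purely illustrative:

```python
import warnings

class TimedDeprecationWarning(DeprecationWarning):
    """Sketch of a DeprecationWarning that knows when the feature dies."""
    def __init__(self, message, removed_in=None):
        super().__init__(message)
        self.removed_in = removed_in

# warnings.warn() accepts a Warning instance directly; the category is
# taken from the instance's class.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warnings.warn(TimedDeprecationWarning(
        "spam() is deprecated; use eggs()", removed_in=(4, 0)))

w = caught[0].message           # the warning instance itself
assert isinstance(w, DeprecationWarning)
assert w.removed_in == (4, 0)   # tools could sort removals by urgency
```

With something like this, a linter or test runner could answer "how many releases do I have left?" programmatically instead of parsing warning messages.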
--David From victor.stinner at gmail.com Thu Aug 22 23:32:11 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 22 Aug 2013 23:32:11 +0200 Subject: [Python-Dev] PEP 446 (make FD non inheritable) ready for a final review Message-ID: Hi, I know that I wrote it more than once, but I consider that my PEP 446 is now ready for a final review: http://www.python.org/dev/peps/pep-0446/ The implementation is also working, complete and ready for a review. http://hg.python.org/features/pep-446 http://bugs.python.org/issue18571 I regularly run the test suite on my virtual machines: Windows 7 SP1, Linux 3.2, FreeBSD 9 and OpenIndiana (close to Solaris 11), but also sometimes on OpenBSD 5.2. I don't expect major bugs, but you may find minor issues, especially on old operating systems. I don't have access to older systems. *** I collected the list of all threads related to the inheritance of file descriptors on python-dev since January 2013. I counted no fewer than 239 messages! Thank you to all the developers who contributed to these discussions and so helped me to write PEP 433 and PEP 446, especially Charles-François Natali and Richard Oudkerk! I read all these messages again, and I cannot "go backward" to PEP 433. There are too many good reasons against adding a global variable (sys.getdefaultcloexec()), and adding a new inheritable parameter without changing the inheritance to non-inheritable does not solve the issues listed in the Rationale of PEP 446. At the beginning of the discussions, most developers already agreed that making file descriptors non-inheritable by default is the best choice. It took me some months to really understand all the consequences of changing the inheritance. I also had many technical issues because each operating system handles the inheritance of file descriptors differently, especially Windows vs UNIX. For atomic flags, there are differences even between minor versions of operating systems.
For example, the O_CLOEXEC flag is only supported since Linux 2.6.23. We spent a lot of time discussing each intermediate solution, but the conclusion is that no compromise can be found on an intermediate solution: only a radical change can solve the problem.

[Python-Dev] Add "e" (close and exec) mode to open() (13 messages) Tue Jan 8 2013
http://mail.python.org/pipermail/python-dev/2013-January/123494.html
[Python-Dev] Set close-on-exec flag by default in SocketServer (31) Wed Jan 9 2013
http://mail.python.org/pipermail/python-dev/2013-January/123552.html
[Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors (27) Sun Jan 13 2013
http://mail.python.org/pipermail/python-dev/2013-January/123609.html
[Python-Dev] Implementation of the PEP 433 (2) Fri Jan 25 2013
http://mail.python.org/pipermail/python-dev/2013-January/123684.html
[Python-Dev] PEP 433: Choose the default value of the new cloexec parameter (37) Fri Jan 25 2013
http://mail.python.org/pipermail/python-dev/2013-January/123685.html
[Python-Dev] PEP 433: second try (5) Tue Jan 29 2013
http://mail.python.org/pipermail/python-dev/2013-January/123763.html
[Python-Dev] Release or not release the GIL (11) Fri Feb 1 2013
http://mail.python.org/pipermail/python-dev/2013-February/123780.html
[Python-Dev] Status of the PEP 433?
(2) Thu Feb 14 2013
http://mail.python.org/pipermail/python-dev/2013-February/124070.html
[Python-Dev] PEP 446: Add new parameters to configure the inherance of files and for non-blocking sockets (24) Thu Jul 4 2013
http://mail.python.org/pipermail/python-dev/2013-July/127168.html
[Python-Dev] Inherance of file descriptor and handles on Windows (PEP 446) (41) Wed Jul 24 2013
http://mail.python.org/pipermail/python-dev/2013-July/127509.html
http://mail.python.org/pipermail/python-dev/2013-August/127791.html
[Python-Dev] PEP 446: Open issues/questions (29) Sun Jul 28 2013
http://mail.python.org/pipermail/python-dev/2013-July/127626.html
http://mail.python.org/pipermail/python-dev/2013-August/127728.html
[Python-Dev] (New) PEP 446: Make newly created file descriptors non-inheritable (8) Tue Aug 6 2013
http://mail.python.org/pipermail/python-dev/2013-August/127805.html
[Python-Dev] PEP 446: issue with sockets (9) Wed Aug 21 2013
http://mail.python.org/pipermail/python-dev/2013-August/128045.html Victor From stephen at xemacs.org Fri Aug 23 01:37:24 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 23 Aug 2013 08:37:24 +0900 Subject: [Python-Dev] When to remove deprecated stuff In-Reply-To: <20130822184033.B15DB25003E@webabinitio.net> References: <520AE9DF.6090406@pearwood.info> <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> <20130822110006.GP20273@p29> <2539AE1A-B4EC-4568-BFE5-C036369FA1A1@voidspace.org.uk> <20130822144324.EE0FA25014C@webabinitio.net> <20130822184033.B15DB25003E@webabinitio.net> Message-ID: <8738q1fgaz.fsf@uwakimon.sk.tsukuba.ac.jp> R. David Murray writes: > It is the *change itself* that causes > action to be needed. If a project has a policy of dealing with deprecated > features when the warnings happen, then they need to do that work before > the version where the feature is removed is released.
If they have > a policy of ignoring deprecation warnings, then they have to do that > work before their users can upgrade to the version where the feature > is removed. So the pain exists in equal measure either way, with the > same periodicity, only the timing of when the work is done is affected > by whether or not you pay attention to deprecation warnings. This is exactly correct analysis. Changing the DeprecationWarning policy is not going to save anybody any work, and it's not likely to silence the grumbling. It's the feature removal that causes the extra work and that is what causes the complaints. > And yes, you presumably have a more relaxed fix schedule and > happier users if you pay attention to deprecation warnings, so you > should do that (IMO). Sure, but human nature doesn't work that way. Some people will, others won't, and the latter are likely to think they have reason to complain. > I'm asking if the bigger removals should be only on major version > boundaries, thus allowing *more* time for that relaxed fix mode for the > stuff that takes more work to fix. My take is that it's not going to help that much. You just don't know what's going to take more work to fix. A trivial-to-fix problem in a proprietary library abandoned by its developer can take a complete rewrite of your program. From petri at digip.org Fri Aug 23 07:38:55 2013 From: petri at digip.org (Petri Lehtinen) Date: Fri, 23 Aug 2013 08:38:55 +0300 Subject: [Python-Dev] When to remove deprecated stuff In-Reply-To: <20130822144324.EE0FA25014C@webabinitio.net> References: <520C2ED4.5010900@pearwood.info> <20130815110845.78ccde98@fsol> <20130815112214.0662e057@fsol> <20130815122936.805FF250168@webabinitio.net> <20130822110006.GP20273@p29> <2539AE1A-B4EC-4568-BFE5-C036369FA1A1@voidspace.org.uk> <20130822144324.EE0FA25014C@webabinitio.net> Message-ID: <20130823053854.GC20273@p29> R. 
David Murray wrote: > > So you're still using features deprecated three releases ago, you haven't checked for DeprecationWarnings and it's Django making your life difficult? > > > > Why not check for the deprecation warnings? > > Doing so makes very little difference. > > This is my opinion (others obviously differ): > > Putting in one big chunk of effort at a major release boundary is easier > to schedule than putting in a chunk of effort on *every* feature > release. More importantly, having it happen only at the major release > boundary means there's only one hard deadline every ten-ish years, rather > than a hard deadline every 1.5 years. > > Expecting things to break when you switch to the new feature release > makes one view feature releases with dread rather than excitement. > > This applies whether or not one is testing with deprecation warnings on. > Yes, there's a little less pressure if you are making the fixes on > the deprecation release boundary, because you can always ship the > code anyway if it is winds up being too big of a bear, so you have more > scheduling flexibility. But you still face the *psychological* hurdle of > "feature release upgrade...will need to fix the all the things they've > deprecated...let's put that off". Especially since what we are talking > about here is the *big* cruft, and thus more likely to be a pain to fix. These are my thoughts exactly. Maybe I exaggerated a bit about Django. I was slightly unaware of the deprecation policy when Django 1.3 came out (IIRC it was the first release that actually removed deprecated stuff after 1.0). Nowadays I read release notes carefully and do what's needed, and nothing has broken badly ever since. What's really bothering me is that I have to change something in my code every time I upgrade Django. So as David said, it's more like "sigh, a new feature release again" than "yay, new cool features!". Or actually, it's a combination of both because I really want the new features. 
Petri From cf.natali at gmail.com Fri Aug 23 09:48:15 2013 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Fri, 23 Aug 2013 09:48:15 +0200 Subject: [Python-Dev] PEP 446 (make FD non inheritable) ready for a final review In-Reply-To: References: Message-ID: Hello, A couple remarks: > The following functions are modified to make newly created file descriptors non-inheritable by default: > [...] > os.dup() then > os.dup2() has a new optional inheritable parameter: os.dup2(fd, fd2, inheritable=True). fd2 is created inheritable by default, but non-inheritable if inheritable is False. Why does dup2() create an inheritable FD, and not dup()? I think a hint is given a little later: > Applications using the subprocess module with the pass_fds parameter or using os.dup2() to redirect standard streams should not be affected. But that's overly optimistic. For example, a lot of code relies on the guarantee that dup()/open()... returns the lowest numbered file descriptor available, so code like this:

    r, w = os.pipe()
    if os.fork() == 0:
        # child
        os.close(r)
        os.close(1)
        os.dup(w)

*will break* And that's a lot of code (e.g. that's what _posixsubprocess.c uses, but since it's implemented in C it wouldn't be affected). We've already had this discussion, and I stand by my claim that changing the default *will break* user code. Furthermore, many people use Python for system programming, and this change would be highly surprising. So no matter what the final decision on this PEP is, it must be kept in mind. > The programming languages Go, Perl and Ruby make newly created file descriptors non-inheritable by default: since Go 1.0 (2009), Perl 1.0 (1987) and Ruby 2.0 (2013). OK, but do they expose OS file descriptors? I'm sure such a change would be fine for Java, which doesn't expose FDs and fork(), but Python's another story. Last time, I said that to me, the FD inheritance issue is solved on POSIX by the subprocess module which passes close_fds.
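To make the contrast concrete, here is a sketch of the subprocess-based approach referred to above. It relies only on documented subprocess behaviour (close_fds defaulting to True, with pass_fds whitelisting the one FD the child needs) and is POSIX-oriented, since pass_fds is not supported on Windows:

```python
import os
import subprocess
import sys

r, w = os.pipe()

# subprocess closes all FDs > 2 in the child by default (close_fds=True),
# so the write end of the pipe must be passed explicitly via pass_fds.
child = subprocess.Popen(
    [sys.executable, "-c",
     "import os, sys; os.write(int(sys.argv[1]), b'ok')",
     str(w)],
    pass_fds=(w,),
)
os.close(w)            # close the parent's copy; the child holds its own
child.wait()
data = os.read(r, 2)   # b'ok'
os.close(r)
```

Code that instead forks manually and counts on dup() returning the lowest available FD gets no such protection, which is the breakage being described.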
In my own code, I use subprocess, which is the "official", portable and safe way to create child processes in Python. Someone using fork() + exec() should know what he's doing, and be able to deal with the consequences: I'm not only talking about FD inheritance, but also about async-signal/multi-threaded safety ;-) As for Windows, since it doesn't have fork(), it would make sense to make its FDs non-inheritable by default. And then use what you describe here to selectively inherit FDs (i.e. implement keep_fds): """ Since Windows Vista, CreateProcess() supports an extension of the STARTUPINFO structure: the STARTUPINFOEX structure. Using this new structure, it is possible to specify a list of handles to inherit: PROC_THREAD_ATTRIBUTE_HANDLE_LIST. Read Programmatically controlling which handles are inherited by new processes in Win32 (Raymond Chen, Dec 2011) for more information. """ cf From stefan_ml at behnel.de Fri Aug 23 10:50:18 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 23 Aug 2013 10:50:18 +0200 Subject: [Python-Dev] Pre-PEP: Redesigning extension modules Message-ID: Hi, this has been the subject of a couple of threads on python-dev already, for example: http://thread.gmane.org/gmane.comp.python.devel/135764/focus=140986 http://thread.gmane.org/gmane.comp.python.devel/141037/focus=141046 It originally came out of issues 13429 and 16392. http://bugs.python.org/issue13429 http://bugs.python.org/issue16392 Here's an initial attempt at a PEP for it. It is based on the (unfinished) ModuleSpec PEP, which is being discussed on the import-sig mailing list. http://mail.python.org/pipermail/import-sig/2013-August/000688.html Stefan PEP: 4XX Title: Redesigning extension modules Version: $Revision$ Last-Modified: $Date$ Author: Stefan Behnel BDFL-Delegate: ??? Discussions-To: ???
Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 11-Aug-2013 Python-Version: 3.4 Post-History: 23-Aug-2013 Resolution: Abstract ======== This PEP proposes a redesign of the way in which extension modules interact with the interpreter runtime. This was last revised for Python 3.0 in PEP 3121, but did not solve all problems at the time. The goal is to solve them by bringing extension modules closer to the way Python modules behave. An implication of this PEP is that extension modules can use arbitrary types for their module implementation and are no longer restricted to types.ModuleType. This makes it easy to support properties at the module level and to safely store arbitrary global state in the module that is covered by normal garbage collection and supports reloading and sub-interpreters. Motivation ========== Python modules and extension modules are not being set up in the same way. For Python modules, the module is created and set up first, then the module code is being executed. For extensions, i.e. shared libraries, the module init function is executed straight away and does both the creation and initialisation. This means that it knows neither the __file__ it is being loaded from nor its package (i.e. its fully qualified module name, FQMN). This hinders relative imports and resource loading. In Py3, it's also not being added to sys.modules, which means that a (potentially transitive) re-import of the module will really try to reimport it and thus run into an infinite loop when it executes the module init function again. And without the FQMN, it is not trivial to correctly add the module to sys.modules either. This is specifically a problem for Cython generated modules, for which it's not uncommon that the module init code has the same level of complexity as that of any 'regular' Python module. Also, the lack of a FQMN and correct file path hinders the compilation of __init__.py modules, i.e. 
packages, especially when relative imports are being used at module init time. Furthermore, the majority of currently existing extension modules has problems with sub-interpreter support and/or reloading and it is neither easy nor efficient with the current infrastructure to support these features. This PEP also addresses these issues. The current process =================== Currently, extension modules export an initialisation function named "PyInit_modulename", named after the file name of the shared library. This function is executed by the import machinery and must return either NULL in the case of an exception, or a fully initialised module object. The function receives no arguments, so it has no way of knowing about its import context. During its execution, the module init function creates a module object based on a PyModuleDef struct. It then continues to initialise it by adding attributes to the module dict, creating types, etc. In the back, the shared library loader keeps a note of the fully qualified module name of the last module that it loaded, and when a module gets created that has a matching name, this global variable is used to determine the FQMN of the module object. This is not entirely safe as it relies on the module init function creating its own module object first, but this assumption usually holds in practice. The main problem in this process is the missing support for passing state into the module init function, and for safely passing state through to the module creation code. The proposal ============ The current extension module initialisation will be deprecated in favour of a new initialisation scheme. Since the current scheme will continue to be available, existing code will continue to work unchanged, including binary compatibility. Extension modules that support the new initialisation scheme must export a new public symbol "PyModuleCreate_modulename", where "modulename" is the name of the shared library. 
This mimics the previous naming convention for the "PyInit_modulename" function. This symbol must resolve to a C function with the following signature::

    PyObject* (*PyModuleTypeCreateFunction)(PyObject* module_spec)

The "module_spec" argument receives a "ModuleSpec" instance, as defined in PEP 4XX (FIXME). (All names are obviously up for debate and bike-shedding at this point.) When called, this function must create and return a type object, either a Python class or an extension type that is allocated on the heap. This type will be instantiated as a module instance by the importer. There is no requirement for this type to be exactly types.ModuleType or a subtype of it. Any type can be returned. This follows the current support for allowing arbitrary objects in sys.modules and makes it easier for extension modules to define a type that exactly matches their needs for holding module state. The constructor of this type must have the following signature::

    def __init__(self, module_spec):

The "module_spec" argument receives the same object as the one passed into the module type creation function.

Implementation
==============

XXX - not started

Reloading and Sub-Interpreters
==============================

To "reload" an extension module, the module create function is executed again and returns a new module type. This type is then instantiated as before by the original module loader and replaces the previous entry in sys.modules. Once the last references to the previous module and its type are gone, both will be subject to normal garbage collection. Sub-interpreter support is an inherent property of the design. During import in the sub-interpreter, the module create function is executed and returns a new module type that is local to the sub-interpreter. Both the type and its module instance are subject to garbage collection in the sub-interpreter.
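Since the proposal explicitly allows plain Python classes as module types, the mechanics can be sketched in pure Python. The ModuleSpec stand-in and the hook name follow the draft above, but everything here is illustrative rather than a final API:

```python
from types import SimpleNamespace

# Stand-in for the ModuleSpec object of the (unfinished) ModuleSpec PEP.
spec = SimpleNamespace(name="example", origin="example.so")

# What an extension would expose as PyModuleCreate_example: a function
# that receives the spec and returns a *type*, not a module object.
def PyModuleCreate_example(module_spec):
    class ExampleModule:
        def __init__(self, module_spec):
            self.__name__ = module_spec.name
            self.__file__ = module_spec.origin
            self._state = {}          # per-module state, normally GC'ed

        @property
        def version(self):            # module-level properties become easy
            return "1.0"

    return ExampleModule

# The importer would then instantiate the returned type with the spec
# and place the instance into sys.modules under spec.name.
module_type = PyModuleCreate_example(spec)
mod = module_type(spec)
print(mod.__name__, mod.version)      # example 1.0
```

Note how the per-module `_state` dict lives on the instance, so reloading or importing in a sub-interpreter simply produces a fresh instance with fresh state.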
Open questions
==============

It is not immediately obvious how to handle extensions that want to register more than one module in their module init function, e.g. compiled packages. One possibility would be to leave the setup to the user, who would have to know all FQMNs anyway in this case (or could construct them from the module spec of the current module), although not the import file path. A C-API could be provided to register new module types in the current interpreter, given a user-provided ModuleSpec. There is no inherent requirement for the module creation function to actually return a type. It could return an arbitrary callable that creates a 'modulish' object when called. Should there be a type check in place that makes sure that what it returns is a type? I don't currently see a need for this.

Copyright
=========

This document has been placed in the public domain.

From solipsis at pitrou.net Fri Aug 23 11:18:22 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 23 Aug 2013 11:18:22 +0200 Subject: [Python-Dev] Pre-PEP: Redesigning extension modules References: Message-ID: <20130823111822.49cba700@pitrou.net> Hi, Le Fri, 23 Aug 2013 10:50:18 +0200, Stefan Behnel a écrit : > > Here's an initial attempt at a PEP for it. It is based on the > (unfinished) ModuleSpec PEP, which is being discussed on the > import-sig mailing list. Thanks for trying this. I think the PEP should contain working example code for module initialization (and creation), to help gauge the complexity for module writers. Regards Antoine. From status at bugs.python.org Fri Aug 23 18:07:31 2013 From: status at bugs.python.org (Python tracker) Date: Fri, 23 Aug 2013 18:07:31 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20130823160731.6E5BB56A35@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2013-08-16 - 2013-08-23) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message. Issues counts and deltas: open 4168 (+16) closed 26426 (+49) total 30594 (+65) Open issues with patches: 1914 Issues opened (47) ================== #17702: os.environ converts key type from string to bytes in KeyError http://bugs.python.org/issue17702 reopened by haypo #18757: Fix internal references for concurrent modules http://bugs.python.org/issue18757 opened by serhiy.storchaka #18758: Fix internal references in the documentation http://bugs.python.org/issue18758 opened by serhiy.storchaka #18760: Fix internal doc references for the xml package http://bugs.python.org/issue18760 opened by serhiy.storchaka #18763: subprocess: file descriptors should be closed after preexec_fn http://bugs.python.org/issue18763 opened by neologix #18764: The pdb print command prints repr instead of str in python3 http://bugs.python.org/issue18764 opened by r.david.murray #18765: unittest needs a way to launch pdb.post_mortem or other debug http://bugs.python.org/issue18765 opened by gregory.p.smith #18766: IDLE: Autocomplete in editor doesn't work for un-imported modu http://bugs.python.org/issue18766 opened by philwebster #18767: csv documentation does not note default quote constant http://bugs.python.org/issue18767 opened by bemclaugh #18769: argparse remove subparser http://bugs.python.org/issue18769 opened by Michael.Bikovitsky #18772: Fix gdb plugin for new sets dummy object http://bugs.python.org/issue18772 opened by pitrou #18774: There is still last bit of GNU Pth code in signalmodule.c http://bugs.python.org/issue18774 opened by vajrasky #18775: name attribute for HMAC http://bugs.python.org/issue18775 opened by christian.heimes #18776: atexit error display behavior changed in python 3 http://bugs.python.org/issue18776 opened by doughellmann #18778: email docstrings and comments say about Unicode strings http://bugs.python.org/issue18778 opened by serhiy.storchaka #18779: Misleading documentations and comments in regular expression H 
http://bugs.python.org/issue18779 opened by vajrasky #18780: SystemError when formatting int subclass http://bugs.python.org/issue18780 opened by serhiy.storchaka #18783: No more refer to Python "long" http://bugs.python.org/issue18783 opened by serhiy.storchaka #18784: minor uuid.py loading optimization http://bugs.python.org/issue18784 opened by eugals #18785: Add get_body and iter_attachments to provisional email API http://bugs.python.org/issue18785 opened by r.david.murray #18787: Misleading error from getspnam function of spwd module http://bugs.python.org/issue18787 opened by vajrasky #18789: XML Vunerability Table Unclear http://bugs.python.org/issue18789 opened by joe-tennies #18790: incorrect text in argparse add_help example http://bugs.python.org/issue18790 opened by purplezephyr #18795: pstats - allow stats sorting by cumulative time per call and t http://bugs.python.org/issue18795 opened by alexnvdias #18796: Wrong documentation of show_code function from dis module http://bugs.python.org/issue18796 opened by vajrasky #18798: Typo and unused variables in test fcntl http://bugs.python.org/issue18798 opened by vajrasky #18799: Resurrect and fix test_404 in Lib/test/test_xmlrpc.py http://bugs.python.org/issue18799 opened by vajrasky #18800: Document Fraction's numerator and denominator properties http://bugs.python.org/issue18800 opened by icedream91 #18801: inspect.classify_class_attrs() misclassifies object.__new__() http://bugs.python.org/issue18801 opened by eric.snow #18802: ipaddress documentation errors http://bugs.python.org/issue18802 opened by jongfoster #18803: Fix more typos in .py files http://bugs.python.org/issue18803 opened by iwontbecreative #18804: pythorun.c: is_valid_fd() should not duplicate the file descri http://bugs.python.org/issue18804 opened by haypo #18805: ipaddress netmask/hostmask parsing bugs http://bugs.python.org/issue18805 opened by jongfoster #18806: socketmodule: fix/improve setipaddr() numeric addresses handli 
http://bugs.python.org/issue18806 opened by neologix #18807: Allow venv to create copies, even when symlinks are supported http://bugs.python.org/issue18807 opened by andrea.corbellini #18808: Thread.join returns before PyThreadState is destroyed http://bugs.python.org/issue18808 opened by Tamas.K #18809: Expose mtime check in importlib.machinery.FileFinder http://bugs.python.org/issue18809 opened by brett.cannon #18810: Stop doing stat calls in importlib.machinery.FileFinder to see http://bugs.python.org/issue18810 opened by brett.cannon #18813: Speed up slice object processing http://bugs.python.org/issue18813 opened by scoder #18814: Add tools for "cleaning" surrogate escaped strings http://bugs.python.org/issue18814 opened by ncoghlan #18815: DOCUMENTATION: "mmap .close()" doesn't close the underlying fi http://bugs.python.org/issue18815 opened by jcea #18816: "mmap.flush()" is always synchronous, hurting performance http://bugs.python.org/issue18816 opened by jcea #18817: Got resource warning when running Lib/aifc.py http://bugs.python.org/issue18817 opened by vajrasky #18818: Empty PYTHONIOENCODING is not the same as nonexistent http://bugs.python.org/issue18818 opened by serhiy.storchaka #18819: tarfile fills devmajor and devminor fields even for non-device http://bugs.python.org/issue18819 opened by Nuutti.Kotivuori #18820: json.dump() ignores its 'default' option when serializing dict http://bugs.python.org/issue18820 opened by july #18821: Add .lastitem attribute to takewhile instances http://bugs.python.org/issue18821 opened by oscarbenjamin Most recent 15 issues with no replies (15) ========================================== #18821: Add .lastitem attribute to takewhile instances http://bugs.python.org/issue18821 #18819: tarfile fills devmajor and devminor fields even for non-device http://bugs.python.org/issue18819 #18817: Got resource warning when running Lib/aifc.py http://bugs.python.org/issue18817 #18815: DOCUMENTATION: "mmap .close()" doesn't close 
the underlying fi http://bugs.python.org/issue18815 #18810: Stop doing stat calls in importlib.machinery.FileFinder to see http://bugs.python.org/issue18810 #18809: Expose mtime check in importlib.machinery.FileFinder http://bugs.python.org/issue18809 #18807: Allow venv to create copies, even when symlinks are supported http://bugs.python.org/issue18807 #18803: Fix more typos in .py files http://bugs.python.org/issue18803 #18801: inspect.classify_class_attrs() misclassifies object.__new__() http://bugs.python.org/issue18801 #18800: Document Fraction's numerator and denominator properties http://bugs.python.org/issue18800 #18798: Typo and unused variables in test fcntl http://bugs.python.org/issue18798 #18796: Wrong documentation of show_code function from dis module http://bugs.python.org/issue18796 #18790: incorrect text in argparse add_help example http://bugs.python.org/issue18790 #18789: XML Vunerability Table Unclear http://bugs.python.org/issue18789 #18785: Add get_body and iter_attachments to provisional email API http://bugs.python.org/issue18785 Most recent 15 issues waiting for review (15) ============================================= #18820: json.dump() ignores its 'default' option when serializing dict http://bugs.python.org/issue18820 #18819: tarfile fills devmajor and devminor fields even for non-device http://bugs.python.org/issue18819 #18818: Empty PYTHONIOENCODING is not the same as nonexistent http://bugs.python.org/issue18818 #18817: Got resource warning when running Lib/aifc.py http://bugs.python.org/issue18817 #18813: Speed up slice object processing http://bugs.python.org/issue18813 #18805: ipaddress netmask/hostmask parsing bugs http://bugs.python.org/issue18805 #18803: Fix more typos in .py files http://bugs.python.org/issue18803 #18802: ipaddress documentation errors http://bugs.python.org/issue18802 #18799: Resurrect and fix test_404 in Lib/test/test_xmlrpc.py http://bugs.python.org/issue18799 #18798: Typo and unused variables in test 
fcntl http://bugs.python.org/issue18798 #18796: Wrong documentation of show_code function from dis module http://bugs.python.org/issue18796 #18790: incorrect text in argparse add_help example http://bugs.python.org/issue18790 #18787: Misleading error from getspnam function of spwd module http://bugs.python.org/issue18787 #18785: Add get_body and iter_attachments to provisional email API http://bugs.python.org/issue18785 #18784: minor uuid.py loading optimization http://bugs.python.org/issue18784 Top 10 most discussed issues (10) ================================= #18756: os.urandom() fails under high load http://bugs.python.org/issue18756 34 msgs #18747: Re-seed OpenSSL's PRNG after fork http://bugs.python.org/issue18747 32 msgs #18713: Clearly document the use of PYTHONIOENCODING to set surrogatee http://bugs.python.org/issue18713 30 msgs #18712: Pure Python operator.index doesn't match the C version. http://bugs.python.org/issue18712 16 msgs #18738: String formatting (% and str.format) issues with Enum http://bugs.python.org/issue18738 13 msgs #18772: Fix gdb plugin for new sets dummy object http://bugs.python.org/issue18772 13 msgs #18606: Add statistics module to standard library http://bugs.python.org/issue18606 12 msgs #18748: libgcc_s.so.1 must be installed for pthread_cancel to work http://bugs.python.org/issue18748 7 msgs #15809: IDLE console uses incorrect encoding. 
http://bugs.python.org/issue15809  6 msgs

#16396: Importing ctypes.wintypes on Linux gives a traceback
 http://bugs.python.org/issue16396  6 msgs


Issues closed (50)
==================

#2537: re.compile(r'((x|y+)*)*') should not fail
 http://bugs.python.org/issue2537  closed by serhiy.storchaka

#6923: Need pthread_atfork-like functionality in CPython
 http://bugs.python.org/issue6923  closed by neologix

#8865: select.poll is not thread safe
 http://bugs.python.org/issue8865  closed by serhiy.storchaka

#10654: test_datetime sometimes fails on Python3.x windows binary
 http://bugs.python.org/issue10654  closed by terry.reedy

#13461: Error on test_issue_1395_5 with Python 2.7 and VS2010
 http://bugs.python.org/issue13461  closed by serhiy.storchaka

#15175: pydoc -k zip throws segmentation fault
 http://bugs.python.org/issue15175  closed by serhiy.storchaka

#15233: atexit: guarantee order of execution of registered functions?
 http://bugs.python.org/issue15233  closed by neologix

#16105: Pass read only FD to signal.set_wakeup_fd
 http://bugs.python.org/issue16105  closed by pitrou

#16190: Misleading warning in random module docs
 http://bugs.python.org/issue16190  closed by pitrou

#16463: testConnectTimeout of test_timeout TCPTimeoutTestCase failures
 http://bugs.python.org/issue16463  closed by neologix

#16699: Mountain Lion buildbot lacks disk space
 http://bugs.python.org/issue16699  closed by ezio.melotti

#17119: Integer overflow when passing large string or tuple to Tkinter
 http://bugs.python.org/issue17119  closed by serhiy.storchaka

#17400: ipaddress.is_private needs to take into account of rfc6598
 http://bugs.python.org/issue17400  closed by pmoody

#17803: Calling Tkinter.Tk() with a baseName keyword argument throws U
 http://bugs.python.org/issue17803  closed by serhiy.storchaka

#17998: internal error in regular expression engine
 http://bugs.python.org/issue17998  closed by serhiy.storchaka

#18178: Redefinition of malloc(3) family of functions at build time
 http://bugs.python.org/issue18178  closed by christian.heimes

#18324: set_payload does not handle binary payloads correctly
 http://bugs.python.org/issue18324  closed by r.david.murray

#18445: Tools/Script/Readme is outdated
 http://bugs.python.org/issue18445  closed by akuchling

#18466: Spelling mistakes in various code comments.
 http://bugs.python.org/issue18466  closed by ezio.melotti

#18562: Regex howto: revision pass
 http://bugs.python.org/issue18562  closed by akuchling

#18686: Tkinter focus_get on menu results in KeyError
 http://bugs.python.org/issue18686  closed by serhiy.storchaka

#18701: Remove outdated PY_VERSION_HEX checks
 http://bugs.python.org/issue18701  closed by serhiy.storchaka

#18705: Fix typos/spelling mistakes in Lib/*.py files
 http://bugs.python.org/issue18705  closed by ezio.melotti

#18706: test failure in test_codeccallbacks
 http://bugs.python.org/issue18706  closed by ezio.melotti

#18707: the readme should also talk about how to build doc.
 http://bugs.python.org/issue18707  closed by ezio.melotti

#18718: datetime documentation contradictory on leap second support
 http://bugs.python.org/issue18718  closed by ezio.melotti

#18741: Fix typos/spelling mistakes in Lib/*/*/.py files
 http://bugs.python.org/issue18741  closed by ezio.melotti

#18753: [c]ElementTree.fromstring fails to parse ]]>
 http://bugs.python.org/issue18753  closed by kees

#18755: imp read functions do not try to re-open files that have been
 http://bugs.python.org/issue18755  closed by brett.cannon

#18759: Fix internal doc references for logging package
 http://bugs.python.org/issue18759  closed by ezio.melotti

#18761: Fix internal doc references for the email package
 http://bugs.python.org/issue18761  closed by serhiy.storchaka

#18762: error in test_multiprocessing_forkserver
 http://bugs.python.org/issue18762  closed by sbt

#18768: Wrong documentation of RAND_egd function in ssl module
 http://bugs.python.org/issue18768  closed by christian.heimes

#18770: Python insert operation on list
 http://bugs.python.org/issue18770  closed by brett.cannon

#18771: Reduce the cost of hash collisions for set objects
 http://bugs.python.org/issue18771  closed by rhettinger

#18773: When a signal handler fails to write to the file descriptor re
 http://bugs.python.org/issue18773  closed by vajrasky

#18777: Cannot compile _ssl.c using openssl > 1.0
 http://bugs.python.org/issue18777  closed by christian.heimes

#18781: re.escape escapes underscore (Python 2.7)
 http://bugs.python.org/issue18781  closed by ezio.melotti

#18782: sqlite3 row factory and multiprocessing map
 http://bugs.python.org/issue18782  closed by ned.deily

#18786: test_multiprocessing_spawn crashes under PowerLinux
 http://bugs.python.org/issue18786  closed by pitrou

#18788: Proof of concept: implicit call syntax
 http://bugs.python.org/issue18788  closed by ncoghlan

#18791: PIL freeze reading Jpeg file
 http://bugs.python.org/issue18791  closed by ezio.melotti

#18792: test_ftplib timeouts
 http://bugs.python.org/issue18792  closed by pitrou

#18793: occasional test_multiprocessing_forkserver failure on FreeBSD
 http://bugs.python.org/issue18793  closed by sbt

#18794: select.devpoll objects have no close() method
 http://bugs.python.org/issue18794  closed by python-dev

#18797: Don't needlessly change refcounts of dummy objects for sets
 http://bugs.python.org/issue18797  closed by rhettinger

#18811: add ssl-based generator to random module
 http://bugs.python.org/issue18811  closed by neologix

#18812: PyImport_Import redundant calls to find module
 http://bugs.python.org/issue18812  closed by brett.cannon

#1633953: re.compile("(.*$){1,4}", re.MULTILINE) fails
 http://bugs.python.org/issue1633953  closed by serhiy.storchaka

#1666318: shutil.copytree doesn't give control over directory permission
 http://bugs.python.org/issue1666318  closed by pitrou


From victor.stinner at gmail.com  Fri Aug 23 19:07:09 2013
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 23 Aug 2013 19:07:09 +0200
Subject: [Python-Dev] PEP 446 (make FD non inheritable) ready for a final
 review
In-Reply-To: 
References: 
Message-ID: 

Hi,

I will try to answer your worries. Tell me if I should complete the PEP
with these answers.

2013/8/23 Charles-François Natali :
> Why does dup2() create inheritable FDs, and not dup()?

Ah yes, there was a section explaining it. In a previous version of the
PEP (and its implementation), os.dup() and os.dup2() created
non-inheritable FDs, but inheritable FDs for the standard streams (fd 0,
1, 2).

I did some research on http://code.ohloh.net/ to see how os.dup2() is
used in Python projects. 99.4% (169 projects out of 170) use os.dup2()
to replace standard streams: file descriptors 0, 1, 2 (stdin, stdout,
stderr); sometimes only stdout or stderr. I only found a short demo
script using dup2() with arbitrary file descriptor numbers (10 and 11)
to keep a copy of stdout and stderr before replacing them (also with
dup2).

I didn't find use cases of dup() to inherit file descriptors in the
Python standard library. It's the opposite: when os.dup() (or the C
function dup()) is used, the FD must not be inherited. For example,
os.listdir(fd) duplicates fd: a child process must not inherit the
duplicated file descriptor, as it may lead to a security vulnerability
(ex: parent process running as root, whereas the child process is
running as a different user and is not allowed to open the directory).

> For example, a lot of code uses the guarantee that dup()/open()...
> returns the lowest numbered file descriptor available, so code like
> this:
>
> r, w = os.pipe()
> if os.fork() == 0:
>     # child
>     os.close(r)
>     os.close(1)
>     dup(w)
>
> *will break*
>
> And that's a lot of code (e.g. that's what _posixsubprocess.c uses,
> but since it's implemented in C it wouldn't be affected).

Yes, it will break. As I wrote in my previous email, we cannot solve all
the issues listed in the Rationale section of the PEP without breaking
applications (or at least breaking backward compatibility).
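[Editor's note: the "lowest available descriptor" guarantee that the quoted snippet relies on can be observed directly. This is a minimal illustrative sketch, not code from the thread; it assumes a POSIX system.]

```python
import os

# POSIX guarantees that dup() returns the lowest-numbered free
# descriptor.  The quoted child-process pattern relies on this:
# after closing fd 1, dup(w) makes the pipe's write end the new stdout.
r, w = os.pipe()
d1 = os.dup(w)   # lowest free fd at this point
os.close(d1)     # free that slot again
d2 = os.dup(w)   # the same slot is the lowest free fd, so it is reused
assert d1 == d2
for fd in (r, w, d2):
    os.close(fd)
```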
It is even explicitly said in the "Backward Compatibility" section:
http://www.python.org/dev/peps/pep-0446/#backward-compatibility

"This PEP breaks applications relying on inheritance of file descriptors."

But I also added a hint on how to fix applications: "Developers are
encouraged to reuse the high-level Python module subprocess which handles
the inheritance of file descriptors in a portable way."

If you don't want to use subprocess, yes, you will have to add
"os.set_inheritable(w)" in the child process.

About your example: I'm not sure that it is reliable/portable. I saw
daemon libraries closing *all* file descriptors and then expecting new
file descriptors to become 0, 1 and 2. Your example is different because
w is still open. On Windows, I have seen cases with only fd 0, 1, 2
open, and the next open() call gives the fd 10 or 13...

I'm optimistic and I expect that most Python applications and libraries
already use the subprocess module. The subprocess module has closed all
file descriptors (except 0, 1, 2) by default since Python 3.2.
Developers relying on FD inheritance and using subprocess with Python
3.2 or later already had to use the pass_fds parameter.

> Furthermore, many people use Python for system programming, and this
> change would be highly surprising.

Yes, it is a deliberate design choice (of the PEP). It is also said
explicitly in the "Backward Compatibility" section:

"Python no longer conforms to POSIX, since file descriptors are now made
non-inheritable by default. Python was not designed to conform to POSIX,
but was designed to develop portable applications."

> So no matter what the final decision on this PEP is, it must be kept in mind.

The purpose of the PEP is to explain correctly the context and the
consequences of the changes, so that Guido van Rossum can use the PEP to
make his final decision.
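[Editor's note: a minimal sketch of the explicit opt-in discussed above, using the function names as they eventually shipped in Python 3.4 (os.set_inheritable takes the descriptor and a boolean; the exact signature was still in flux in this thread).]

```python
import os

# Under PEP 446 (implemented in Python 3.4), newly created descriptors
# such as pipe ends are non-inheritable by default: a child started via
# exec() would not see them unless the parent opts in explicitly.
r, w = os.pipe()
print(os.get_inheritable(w))   # False: non-inheritable by default
os.set_inheritable(w, True)    # explicit opt-in before fork()/exec()
print(os.get_inheritable(w))   # True
os.close(r)
os.close(w)
```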
>> The programming languages Go, Perl and Ruby make newly created file
>> descriptors non-inheritable by default: since Go 1.0 (2009), Perl 1.0
>> (1987) and Ruby 2.0 (2013).
>
> OK, but do they expose OS file descriptors?

Yes:

- Perl: fileno() function
- Ruby: fileno() method of a file object
- Go: fd() method of a file object

> Last time, I said that to me, the FD inheritance issue is solved on
> POSIX by the subprocess module which passes close_fds. In my own code,
> I use subprocess, which is the "official", portable and safe way to
> create child processes in Python. Someone using fork() + exec() should
> know what he's doing, and be able to deal with the consequences: I'm
> not only talking about FD inheritance, but also about
> async-signal/multi-threaded safety ;-)

The subprocess module still has a (minor?) race condition in the child
process. Another C thread can create a new file descriptor after the
subprocess module has closed all file descriptors and before exec(). I
hope that it is very unlikely, but it can happen. It's also explained in
the PEP (see "Closing All Open File Descriptors").

I suppose that the race condition explains why Linux still has no
closefrom() or nextfd() system calls. IMO the kernel is the best place
to decide which FDs should be kept, and which must not be inherited
(must be closed), in the child process. I like the close-on-exec flag
(and HANDLE_FLAG_INHERIT on Windows).

> As for Windows, since it doesn't have fork(), it would make sense to
> make its FDs non-inheritable by default.

As said in the "Inheritance of File Descriptors on Windows" section,
Python gets inheritable handles and file descriptors because it does not
use the Windows native API. Applications developed for Windows using the
native API only create non-inheritable handles, and so don't have all
these annoying inheritance issues.
Windows does not have a fork() function, but handles and file
descriptors can be inherited using CreateProcess() and spawn(); see the
table in the "Status of Python 3.3" section.

> And then use what you describe here to selectively inherit FDs (i.e.
> implement keep_fds):
>
> """
> Since Windows Vista, CreateProcess() supports an extension of the
> STARTUPINFO structure: the STARTUPINFOEX structure. Using this new
> structure, it is possible to specify a list of handles to inherit:
> PROC_THREAD_ATTRIBUTE_HANDLE_LIST. Read Programmatically controlling
> which handles are inherited by new processes in Win32 (Raymond Chen,
> Dec 2011) for more information.
> """

This feature can only be used to inherit *handles*, and it does not need
an intermediate program. If you want to inherit only some file
descriptors, you need an intermediate program which recreates them and
then uses spawn() (or directly the reserved fields of the STARTUPINFO
structure). Using an intermediate program has unexpected consequences,
so I prefer the PROC_THREAD_ATTRIBUTE_HANDLE_LIST option.

The feature is also specific to Windows 7, a recent Windows version,
whereas there are still millions of Windows XP installations (and I read
that Python 3.4 will still support Windows XP!). We need more feedback
from users to know their use cases before choosing the best
implementation of pass_handles/pass_fds on Windows. Richard agreed that
this point can be deferred.
Victor From victor.stinner at gmail.com Fri Aug 23 19:26:35 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 23 Aug 2013 19:26:35 +0200 Subject: [Python-Dev] [Python-checkins] cpython (merge 3.3 -> default): merge for issue #18755 In-Reply-To: <3cM6Zd15k2z7LqT@mail.python.org> References: <3cM6Zd15k2z7LqT@mail.python.org> Message-ID: 2013/8/23 brett.cannon : > http://hg.python.org/cpython/rev/7d30ecf5c916 > changeset: 85339:7d30ecf5c916 > parent: 85336:391f36ef461a > parent: 85337:ddd610cb65ef > user: Brett Cannon > date: Fri Aug 23 11:52:19 2013 -0400 > summary: > merge for issue #18755 > > files: > Lib/imp.py | 9 +++++++-- > Lib/test/test_imp.py | 9 +++++++++ > 2 files changed, 16 insertions(+), 2 deletions(-) You didn't merge the log entry in Misc/NEWS. Is it voluntary? Mercurial asked me to add the log entry when I merged 26c049dc1a4a into default => 01f33959ddf6. changeset: 85338:b107f7a8730d branch: 3.3 user: Brett Cannon date: Fri Aug 23 11:47:26 2013 -0400 files: Misc/NEWS description: NEW entry for issue #18755 diff -r ddd610cb65ef -r b107f7a8730d Misc/NEWS --- a/Misc/NEWS Fri Aug 23 11:45:57 2013 -0400 +++ b/Misc/NEWS Fri Aug 23 11:47:26 2013 -0400 @@ -66,6 +66,9 @@ Core and Builtins Library ------- +- Issue #18755: Fixed the loader used in imp to allow get_data() to be called + multiple times. + - Issue #16809: Fixed some tkinter incompabilities with Tcl/Tk 8.6. - Issue #16809: Tkinter's splitlist() and split() methods now accept Tcl_Obj Victor From brett at python.org Fri Aug 23 19:35:35 2013 From: brett at python.org (Brett Cannon) Date: Fri, 23 Aug 2013 13:35:35 -0400 Subject: [Python-Dev] Benchmarks now run unmodified under Python 3 Message-ID: I just committed a change to the benchmarks suite so that there is no longer a translation step to allow running under Python 3. Should make it much easier to just keep a checkout lying around to use for both Python 2 and 3 benchmarking. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Fri Aug 23 19:50:53 2013 From: brett at python.org (Brett Cannon) Date: Fri, 23 Aug 2013 13:50:53 -0400 Subject: [Python-Dev] [Python-checkins] cpython (merge 3.3 -> default): merge for issue #18755 In-Reply-To: References: <3cM6Zd15k2z7LqT@mail.python.org> Message-ID: I explicitly did a second merge, reverted Misc/NEWS and committed, so that shouldn't have come up. It's not a big deal having the entry, but I didn't worry about it either. On Fri, Aug 23, 2013 at 1:26 PM, Victor Stinner wrote: > 2013/8/23 brett.cannon : > > http://hg.python.org/cpython/rev/7d30ecf5c916 > > changeset: 85339:7d30ecf5c916 > > parent: 85336:391f36ef461a > > parent: 85337:ddd610cb65ef > > user: Brett Cannon > > date: Fri Aug 23 11:52:19 2013 -0400 > > summary: > > merge for issue #18755 > > > > files: > > Lib/imp.py | 9 +++++++-- > > Lib/test/test_imp.py | 9 +++++++++ > > 2 files changed, 16 insertions(+), 2 deletions(-) > > You didn't merge the log entry in Misc/NEWS. Is it voluntary? > Mercurial asked me to add the log entry when I merged 26c049dc1a4a > into default => 01f33959ddf6. > > changeset: 85338:b107f7a8730d > branch: 3.3 > user: Brett Cannon > date: Fri Aug 23 11:47:26 2013 -0400 > files: Misc/NEWS > description: > NEW entry for issue #18755 > > > diff -r ddd610cb65ef -r b107f7a8730d Misc/NEWS > --- a/Misc/NEWS Fri Aug 23 11:45:57 2013 -0400 > +++ b/Misc/NEWS Fri Aug 23 11:47:26 2013 -0400 > @@ -66,6 +66,9 @@ Core and Builtins > Library > ------- > > +- Issue #18755: Fixed the loader used in imp to allow get_data() to be > called > + multiple times. > + > - Issue #16809: Fixed some tkinter incompabilities with Tcl/Tk 8.6. 
> > - Issue #16809: Tkinter's splitlist() and split() methods now accept
> > Tcl_Obj
>
> Victor
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/brett%40python.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From cf.natali at gmail.com  Fri Aug 23 22:30:23 2013
From: cf.natali at gmail.com (Charles-François Natali)
Date: Fri, 23 Aug 2013 22:30:23 +0200
Subject: [Python-Dev] PEP 446 (make FD non inheritable) ready for a final
 review
In-Reply-To: 
References: 
Message-ID: 

> About your example: I'm not sure that it is reliable/portable. I saw
> daemon libraries closing *all* file descriptors and then expecting new
> file descriptors to become 0, 1 and 2. Your example is different
> because w is still open. On Windows, I have seen cases with only fd 0,
> 1, 2 open, and the next open() call gives the fd 10 or 13...

Well, my example uses fork(), so it obviously doesn't apply to Windows.
It's perfectly safe on Unix.

> I'm optimistic and I expect that most Python applications and
> libraries already use the subprocess module. The subprocess module
> has closed all file descriptors (except 0, 1, 2) by default since
> Python 3.2. Developers relying on FD inheritance and using subprocess
> with Python 3.2 or later already had to use the pass_fds parameter.

As long as the PEP makes it clear that this breaks backward
compatibility, that's fine. IMO the risk of breakage outweighs the
modicum of benefit.

> The subprocess module still has a (minor?) race condition in the child
> process. Another C thread can create a new file descriptor after the
> subprocess module has closed all file descriptors and before exec(). I
> hope that it is very unlikely, but it can happen.

No it can't, because after fork(), there's only one thread. It's
perfectly safe.
cf


From stefan_ml at behnel.de  Sat Aug 24 00:57:48 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 24 Aug 2013 00:57:48 +0200
Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2
Message-ID: 

Hi,

ticket 17741 has introduced a new feature in the xml.etree.ElementTree
module that was added without any major review.

http://bugs.python.org/issue17741
http://hg.python.org/cpython/rev/f903cf864191

I only recently learned about this addition and after taking a couple of
closer looks, I found that it represents a rather serious degradation of
the ET module API. Since I am not a core developer, it is rather easy
for the original committer to close the ticket and to sit and wait for
the next releases to come in order to get the implementation out. So I'm
sorry for having to ask publicly on this list for the code to be
removed, before it does any harm in the wild. Let me explain why the
addition is such a disaster.

I'm personally involved in this because I will eventually have to
implement features that occur in ElementTree also in the external
lxml.etree package. And this is not an API that I can implement with a
clear conscience.

There are two parts to parsing in the ET API. The first is the parser,
and the second is the target object (similar to a SAX handler). The
parser has an incremental parsing API consisting of the functions
.feed() and .close(). When it receives data through the .feed() method,
it parses it and passes events on to the target. The target is commonly
a TreeBuilder that builds an in-memory tree, but is not limited to that.
Calling the .close() method tells the parser that the parsing is done
and that it should finish up.

The class that was now added is called "IncrementalParser". It has two
methods for passing data in: "data_received()" and "eof_received()". So
the first thing to note is that this addition is only a copy of the
existing API and functionality, but under different names.
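[Editor's note: the existing two-part design described above can be seen in a few lines. This is a minimal illustrative sketch with a hypothetical event-collecting target, not code from the thread.]

```python
import xml.etree.ElementTree as ET

class EventCollector:
    """Hypothetical target object that collects parse events itself."""
    def __init__(self):
        self.events = []
    def start(self, tag, attrib):
        self.events.append(("start", tag))
    def end(self, tag):
        self.events.append(("end", tag))
    def data(self, text):
        pass  # ignore character data in this sketch
    def close(self):
        return self.events  # returned by the parser's close()

# The parser feeds data incrementally; the target decides what to collect.
parser = ET.XMLParser(target=EventCollector())
for chunk in ("<root><chi", "ld/></root>"):   # data can arrive in pieces
    parser.feed(chunk)
events = parser.close()
print(events)
# [('start', 'root'), ('start', 'child'), ('end', 'child'), ('end', 'root')]
```

This is the division of labour Stefan argues for: the parser only parses, and the target object owns event collection.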
It is hard for me to understand how anyone could consider this a
consistent design.

Then, the purpose of this addition was to provide a way to collect parse
events. That is the obvious role of the target object. In the new
implementation, the target object is being instantiated, but not
actually meant to collect the events. Instead, it's the parser
collecting the events, based on what the target object returns (which it
doesn't currently have to do at all). This is totally backwards.
Instead, it should be up to the target object to decide which events to
collect, how to process them and how to present them to the user. This
is clearly how the API was originally designed.

Also, the IncrementalParser doesn't directly collect the events itself
but gets them through a sort of backdoor in the underlying parser. That
parser object is actually being passed into the IncrementalParser as a
parameter, which means that user-provided parser objects will also have
to implement that backdoor now, even though they may not actually be
able to provide that functionality.

My proposal for fixing these obvious design problems is to let each part
of the parsing chain do what it's there for. Use the existing XMLParser
(or an HTMLParser, as in lxml.etree) to feed in data incrementally, and
let the target object process and collect the events. So, instead of
replacing the parser interface with a new one, there should be a
dedicated target object (or maybe just a wrapper for a TreeBuilder) that
collects the parse events in this specific way.

Since time is running in favour of the already added implementation, I'm
asking to back it out for the time being. I specifically do not want to
rush in a replacement. Once there is an implementation that matches the
established API, I'm all in favour of adding it, because the feature
itself is a good idea.
But keeping a wrong design in, just because "it's there", even before anyone has actually started using it, is just asking for future deprecation hassle. It's not too late for removal now, but it will be in a couple of weeks if it is not done now. Stefan From solipsis at pitrou.net Sat Aug 24 01:26:08 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 24 Aug 2013 01:26:08 +0200 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 References: Message-ID: <20130824012608.0187fa3f@fsol> On Sat, 24 Aug 2013 00:57:48 +0200 Stefan Behnel wrote: > Hi, > > ticket 17741 has introduced a new feature in the xml.etree.ElementTree > module that was added without any major review. > > http://bugs.python.org/issue17741 As I've already indicated on the tracker, I'm completely saturated with Stefan's qualms about a minor API addition and I'm not willing to process anymore of them. Hence I won't respond to the bulk of his e-mail. But I still want to clarify that claiming that the feature was "added without any major review" is outrageous and manipulative. Perhaps Stefan thinks that an ElementTree code review not by him (but, for example, by Eli, who currently maintains ElementTree) is not "major". Well, good for him. His best chance to influence the development process, though, is to contribute more, not harass active developers. Regards Antoine. From stefan_ml at behnel.de Sat Aug 24 07:32:34 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 24 Aug 2013 07:32:34 +0200 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: <20130824012608.0187fa3f@fsol> References: <20130824012608.0187fa3f@fsol> Message-ID: Antoine Pitrou, 24.08.2013 01:26: > On Sat, 24 Aug 2013 00:57:48 +0200 > Stefan Behnel wrote: >> ticket 17741 has introduced a new feature in the xml.etree.ElementTree >> module that was added without any major review. 
>> >> http://bugs.python.org/issue17741 > > As I've already indicated on the tracker, I'm completely saturated > with Stefan's qualms about a minor API addition and I'm not willing to > process anymore of them. Hence I won't respond to the bulk of his > e-mail. > > But I still want to clarify that claiming that the feature was "added > without any major review" is outrageous and manipulative. Perhaps > Stefan thinks that an ElementTree code review not by him (but, for > example, by Eli, who currently maintains ElementTree) is not "major". The reason why I'm saying this is that the way the change came in is rather - unorthodox. As Antoine noted in the ticket, he proposed the change on the tulip mailing list. That is a completely wrong place to discuss a new XML API, as can be seen from the replies. http://thread.gmane.org/gmane.comp.python.tulip/171 Specifically, no-one noticed the major overlap with the existing API and functionality at that point, nor the contradictions between the existing API and the new one. In the ticket, Eli stated that he didn't have time for the review ATM, and then, two days later, commented that the patch looks good. To me (and I'm really only interpreting here), this indicates that the review was mostly at the patch level. Note that he didn't comment on the API overlap either, so my guess is that he just didn't notice it. In my experience, reviewing design and thinking about alternatives takes more time than that, especially when you're "swamped", as he put it. I mean, it took *me* almost a day to dig into the implications and into the patch (as can be seen by my incremental comments), and I have the background of having written a complete implementation of that library myself. So, to put it more nicely, I think this feature was added without the amount of review that it needs, and now that I've given it that review, I'm asking for removal of the feature and a proper redesign that fits into the existing library. 
Stefan


From ncoghlan at gmail.com  Sat Aug 24 07:51:13 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 24 Aug 2013 15:51:13 +1000
Subject: [Python-Dev] Pre-PEP: Redesigning extension modules
In-Reply-To: <20130823111822.49cba700@pitrou.net>
References: <20130823111822.49cba700@pitrou.net>
Message-ID: 

On 23 August 2013 19:18, Antoine Pitrou wrote:
>
> Hi,
>
> Le Fri, 23 Aug 2013 10:50:18 +0200,
> Stefan Behnel a écrit :
>>
>> Here's an initial attempt at a PEP for it. It is based on the
>> (unfinished) ModuleSpec PEP, which is being discussed on the
>> import-sig mailing list.
>
> Thanks for trying this. I think the PEP should contain working example
> code for module initialization (and creation), to help gauge the
> complexity for module writers.

I've been thinking a lot about this as part of reviewing PEP 451 (the
ModuleSpec PEP that Stefan's pre-PEP mentions). The relevant feedback on
import-sig hasn't made it into PEP 451 yet (Eric is still considering
the suggestion), but what I'm proposing is a new relatively *stateless*
API for loaders, which consists of two methods:

    def create_module(self, spec):
        """Given a ModuleSpec, return the object to be added to sys.modules"""

    def exec_module(self, mod):
        """Execute the given module, updating it for the current system state"""

create_module would be optional - if not defined, the import system
would automatically create a normal module object. If it is defined, the
import system would call it and then take care of setting all the
standard attributes (__name__, __spec__, etc) on the result if the
loader hadn't already set them.

exec_module would be required, and is the part that actually fully
initialises the module. "imp.reload" would then translate to calling
exec_module on an existing module without recreating it. For loaders
that provide the new API, the global import state manipulation would all
be handled by the import system.
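[Editor's note: the two-method loader protocol sketched above is essentially what later shipped with PEP 451. A minimal sketch of such a loader, using the importlib.util helpers of Python 3.5+; the StringLoader name and the source-string approach are illustrative, not from the thread.]

```python
import importlib.util

class StringLoader:
    """Hypothetical loader implementing the proposed two-method API."""
    def __init__(self, source):
        self.source = source

    def create_module(self, spec):
        # Optional: returning None asks the import system to create
        # an ordinary module object itself.
        return None

    def exec_module(self, module):
        # Fully initialise (or re-initialise, for reload) the module.
        exec(self.source, module.__dict__)

spec = importlib.util.spec_from_loader("demo", StringLoader("ANSWER = 42"))
module = importlib.util.module_from_spec(spec)  # calls create_module
spec.loader.exec_module(module)
print(module.ANSWER)  # 42
```

Calling exec_module again on the same object re-runs the module body without recreating it, which is exactly the reload behaviour described above.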
Such loaders would still be free to provide load_module() anyway for
backwards compatibility with earlier Python versions, since the new API
would take precedence.

In this context, the API I was considering for extension modules was
slightly different from that in Stefan's proto-PEP (although it was
based on some of Stefan's suggestions in the earlier threads).
Specifically, I'm thinking of an API like this that does a better job of
supporting reloading:

    PyObject * PyImportCreate_(PyObject *spec);  /* Optional */
    int PyImportExec_(PyObject *mod);

Implementing PyImportCreate would only be needed if you had C level
state to store - if you're happy storing everything in the module
globals, then you would only need to implement PyImportExec.

My current plan is to create an experimental prototype of this approach
this weekend. That will include stdlib test cases, so it will also show
how it looks from the extension developer's point of view.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From ncoghlan at gmail.com  Sat Aug 24 07:57:50 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 24 Aug 2013 15:57:50 +1000
Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2
In-Reply-To: 
References: <20130824012608.0187fa3f@fsol> 
Message-ID: 

On 24 August 2013 15:32, Stefan Behnel wrote:
> So, to put it more nicely, I think this feature was added without the
> amount of review that it needs, and now that I've given it that review, I'm
> asking for removal of the feature and a proper redesign that fits into the
> existing library.

FWIW, it seems to me that this is something that could live in *tulip*
as an adapter between the tulip data processing APIs and the existing
ElementTree incremental parsing APIs, without needing to be added
directly to xml.etree at all. It certainly seems premature to be adding
tulip-inspired APIs to other parts of the standard library, when tulip
itself hasn't been deemed ready for inclusion.
Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Sat Aug 24 12:58:44 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 24 Aug 2013 12:58:44 +0200 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 References: <20130824012608.0187fa3f@fsol> Message-ID: <20130824125844.57a8f9a3@fsol> On Sat, 24 Aug 2013 15:57:50 +1000 Nick Coghlan wrote: > On 24 August 2013 15:32, Stefan Behnel wrote: > > So, to put it more nicely, I think this feature was added without the > > amount of review that it needs, and now that I've given it that review, I'm > > asking for removal of the feature and a proper redesign that fits into the > > existing library. > > FWIW, it seems to me that this is something that could live in *tulip* > as an adapter between the tulip data processing APIs and the existing > ElementTree incremental parsing APIs, without needing to be added > directly to xml.etree at all. This is not an adapter, it's a new feature that ElementTree wasn't providing before. The need to process data in a non-blocking way (not merely incremental) didn't appear with Tulip, and the fact that the API is *inspired* by Tulip doesn't make it Tulip-specific. (I'm also curious why Tulip would start providing data-processing APIs: until now, I thought it's supposed to be a networking library :-)) Furthermore, such a feature has to access implementation details of ElementTree, so it's only natural that it be provided in ElementTree. By the way, just know that Stefan tried to provide a patch that would better suit his API desires, and failed because ElementTree's current implementation makes it difficult to do so. 
Someone can take the whole thing over if they want to, change the API
and make it more shiny or different, tweak the implementation to suit it
better to their own aesthetic sensibilities, but please don't revert a
useful feature unless it's based on concrete, serious issues rather than
a platonic disagreement about design.

Regards

Antoine.


From ncoghlan at gmail.com  Sat Aug 24 13:36:51 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 24 Aug 2013 21:36:51 +1000
Subject: [Python-Dev] Pre-PEP: Redesigning extension modules
In-Reply-To: 
References: <20130823111822.49cba700@pitrou.net> 
Message-ID: 

On 24 August 2013 15:51, Nick Coghlan wrote:
> My current plan is to create an experimental prototype of this
> approach this weekend. That will include stdlib test cases, so it will
> also show how it looks from the extension developer's point of view.

I prototyped as much as I could without PEP 451's ModuleSpec support here:

https://bitbucket.org/ncoghlan/cpython_sandbox/commits/branch/new_extension_imports

On systems that use dynload_shlib (at least Linux & the BSDs), this
branch allows extension modules to be imported if they provide a
PyImportExec_NAME hook. The new hook is preferred to the existing
PyInit_NAME hook, so extension modules using the stable ABI can provide
both and degrade to the legacy initialisation API on older versions of
Python.

The PyImportExec hook is called with a pre-created module object that
the hook is then expected to populate. To aid in this task, I added two
new APIs:

    PyModule_SetDocString
    PyModule_AddFunctions

These cover setting the docstring and adding module level functions,
tasks that are handled through the PyModule_Create API when using the
PyInit_NAME style hook.

The _testimportexec.c module was derived from the existing example
xxlimited.c module, with a few name changes.
The main functional difference is that _testimportexec uses the new API, so the module object is created externally and passed in to the API, rather than being created by the extension module. The effect of this can be seen in the test suite, where ImportExecTests.test_fresh_module shows that loading the module twice will create two *different* modules, unlike the legacy API. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Sat Aug 24 13:53:28 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 24 Aug 2013 13:53:28 +0200 Subject: [Python-Dev] Pre-PEP: Redesigning extension modules In-Reply-To: References: <20130823111822.49cba700@pitrou.net> Message-ID: <20130824135328.5eb11c6f@fsol> On Sat, 24 Aug 2013 21:36:51 +1000 Nick Coghlan wrote: > On 24 August 2013 15:51, Nick Coghlan wrote: > > My current plan is to create an experimental prototype of this > > approach this weekend. That will include stdlib test cases, so it will > > also show how it looks from the extension developer's point of view. > > I prototyped as much as I could without PEP 451's ModuleSpec support here: > > https://bitbucket.org/ncoghlan/cpython_sandbox/commits/branch/new_extension_imports > > On systems that use dynload_shlib (at least Linux & the BSDs), this > branch allows extension modules to be imported if they provide a > PyImportExec_NAME hook. The new hook is preferred to the existing > PyInit_NAME hook, so extension modules using the stable ABI can > provide both and degrade to the legacy initialisation API on older > versions of Python. > > The PyImportExec hook is called with a pre-created module object that > the hook is then expected to populate. To aid in this task, I added > two new APIs: > > PyModule_SetDocString > PyModule_AddFunctions I was thinking about something like PyType_FromSpec, only specialized for module subclasses to make it easier to declare them (e.g. PyModuleType_FromSpec). 
This would also imply extension modules have to be subclasses of the built-in module type. They can't be arbitrary objects like Stefan proposed. I'm not sure what the latter enables, but it would probably make things more difficult internally. Regards Antoine. From stefan_ml at behnel.de Sat Aug 24 14:46:32 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 24 Aug 2013 14:46:32 +0200 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: <20130824125844.57a8f9a3@fsol> References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> Message-ID: Antoine Pitrou, 24.08.2013 12:58: > By the way, just know that Stefan tried to provide a patch that would > better suit his API desires, and failed because ElementTree's current > implementation makes it difficult to do so. Absolutely. I agree that your current implementation is a hack that works around these issues. That doesn't mean that they go away, though. And yes, I even provided a half-finished implementation, even though I'm certainly not interested enough in this feature to make much use of it. Why don't you just take a look at my patch and finish it up? Given that you are apparently the most ambitious supporter of this feature, I would expect you to provide an appropriate implementation and have it reviewed. And with "reviewed" I also mean "accept criticism". Just because you managed to sneak in a hack doesn't mean it has to stay there once it's uncovered. > Someone can take the whole thing over if they want to, change the API > and make it more shiny or different, tweak the implementation to suit > it better to their own aesthetic sensibilities, but please don't revert > a useful feature unless it's based on concrete, serious issues rather > than a platonic disagreement about design. As I said, the only reason why the current implementation is there is "because it's there".
The problems of the current iterparse implementation should not be taken as a reason for a design decision of a new feature. Instead, they should be fixed and the feature should be based on these fixes. Yes, that's more work than adding a hack. But I'm sure that cleaning up first will pay off quite quickly. I already gave lots of reasons for that. Stefan From stefan_ml at behnel.de Sat Aug 24 14:51:42 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 24 Aug 2013 14:51:42 +0200 Subject: [Python-Dev] Pre-PEP: Redesigning extension modules In-Reply-To: <20130824135328.5eb11c6f@fsol> References: <20130823111822.49cba700@pitrou.net> <20130824135328.5eb11c6f@fsol> Message-ID: Antoine Pitrou, 24.08.2013 13:53: > This would also imply extension modules have to be subclasses of the > built-in module type. They can't be arbitrary objects like Stefan > proposed. I'm not sure what the latter enables, but it would probably > make things more difficult internally. My line of thought was more like: if Python code can stick anything into sys.modules and the runtime doesn't care, why can't extension modules stick anything into sys.modules as well? I can't really see the advantage of requiring a subtype here. Or even just a type, as I said. I guess we'll have to start using this in real code to see if it makes any difference. Stefan From solipsis at pitrou.net Sat Aug 24 14:53:13 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 24 Aug 2013 14:53:13 +0200 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> Message-ID: <20130824145313.1d38a8bc@fsol> On Sat, 24 Aug 2013 14:46:32 +0200 Stefan Behnel wrote: > > As I said, the only reason why the current implementation is there is > "because it's there". No. It works, it's functional, it fills a use case, and it doesn't seem to have any concrete issues. Get over it, Stefan, and stop trolling us.
From solipsis at pitrou.net Sat Aug 24 15:00:03 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 24 Aug 2013 15:00:03 +0200 Subject: [Python-Dev] Pre-PEP: Redesigning extension modules References: <20130823111822.49cba700@pitrou.net> <20130824135328.5eb11c6f@fsol> Message-ID: <20130824150003.34e30752@fsol> On Sat, 24 Aug 2013 14:51:42 +0200 Stefan Behnel wrote: > Antoine Pitrou, 24.08.2013 13:53: > > This would also imply extension module have to be subclasses of the > > built-in module type. They can't be arbitrary objects like Stefan > > proposed. I'm not sure what the latter enables, but it would probably > > make things more difficult internally. > > My line of thought was more like: if Python code can stick anything into > sys.modules and the runtime doesn't care, why can't extension modules stick > anything into sys.modules as well? > > I can't really see the advantage of requiring a subtype here. Or even just > a type, as I said. sys.modules doesn't care indeed. There's still the whole extension-specific code, though, i.e. the eternal PyModuleDef store and the state management routines. How much of it would remain with your proposal? Regards Antoine. From stefan_ml at behnel.de Sat Aug 24 15:07:12 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 24 Aug 2013 15:07:12 +0200 Subject: [Python-Dev] Pre-PEP: Redesigning extension modules In-Reply-To: <20130824150003.34e30752@fsol> References: <20130823111822.49cba700@pitrou.net> <20130824135328.5eb11c6f@fsol> <20130824150003.34e30752@fsol> Message-ID: Antoine Pitrou, 24.08.2013 15:00: > On Sat, 24 Aug 2013 14:51:42 +0200 > Stefan Behnel wrote: >> Antoine Pitrou, 24.08.2013 13:53: >>> This would also imply extension module have to be subclasses of the >>> built-in module type. They can't be arbitrary objects like Stefan >>> proposed. I'm not sure what the latter enables, but it would probably >>> make things more difficult internally. 
>> >> My line of thought was more like: if Python code can stick anything into >> sys.modules and the runtime doesn't care, why can't extension modules stick >> anything into sys.modules as well? >> >> I can't really see the advantage of requiring a subtype here. Or even just >> a type, as I said. > > sys.modules doesn't care indeed. There's still the whole > extension-specific code, though, i.e. the eternal PyModuleDef store > and the state management routines. How much of it would remain with > your proposal? PEP 3121 would no longer be necessary. Extension types can do all we need. No more special casing of modules, that was the idea. Stefan From stefan_ml at behnel.de Sat Aug 24 15:12:48 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 24 Aug 2013 15:12:48 +0200 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: <20130824145313.1d38a8bc@fsol> References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130824145313.1d38a8bc@fsol> Message-ID: Antoine Pitrou, 24.08.2013 14:53: > it doesn't seem to have any concrete issues. I don't consider closing your eyes and ignoring the obvious a good strategy for software design. Stefan From rdmurray at bitdance.com Sat Aug 24 15:17:56 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Sat, 24 Aug 2013 09:17:56 -0400 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: <20130824145313.1d38a8bc@fsol> References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130824145313.1d38a8bc@fsol> Message-ID: <20130824131757.4D9A0250192@webabinitio.net> On Sat, 24 Aug 2013 14:53:13 +0200, Antoine Pitrou wrote: > On Sat, 24 Aug 2013 14:46:32 +0200 > Stefan Behnel wrote: > > > > As I said, the only reason why the current implementation is there is > > "because it's there". > > No. It works, it's functional, it fills an use case, and it doesn't seem > to have any concrete issues. 
> > Get over it, Stefan, and stop trolling us. Stefan is not trolling. He's raising objections that you disagree with. It costs nothing to keep the discussion civil. --David From stefan_ml at behnel.de Sat Aug 24 15:19:44 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 24 Aug 2013 15:19:44 +0200 Subject: [Python-Dev] Pre-PEP: Redesigning extension modules In-Reply-To: References: <20130823111822.49cba700@pitrou.net> Message-ID: Nick Coghlan, 24.08.2013 13:36: > On 24 August 2013 15:51, Nick Coghlan wrote: >> My current plan is to create an experimental prototype of this >> approach this weekend. That will include stdlib test cases, so it will >> also show how it looks from the extension developer's point of view. > > I prototyped as much as I could without PEP 451's ModuleSpec support here: > > https://bitbucket.org/ncoghlan/cpython_sandbox/commits/branch/new_extension_imports Cool. I'll take a look. > On systems that use dynload_shlib (at least Linux & the BSDs), this > branch allows extension modules to be imported if they provide a > PyImportExec_NAME hook. The new hook is preferred to the existing > PyInit_NAME hook, so extension modules using the stable ABI can > provide both and degrade to the legacy initialisation API on older > versions of Python. Hmm, right, good call. Since both init schemes have to be part of the stable ABI, we can't rely on people compiling out one or the other. So using the old one as a fallback should work. However, only actual usage in code will tell us how it feels on user side. Supporting both in the same binary will most likely complicate things quite a bit. > The PyImportExec hook is called with a pre-created module object that > the hook is then expected to populate.
To aid in this task, I added > two new APIs: > > PyModule_SetDocString > PyModule_AddFunctions > > These cover setting the docstring and adding module level functions, > tasks that are handled through the PyModule_Create API when using the > PyInit_NAME style hook. What are those needed for? If you subtype the module type, or provide an arbitrary extension type as implementation, you'd get these for free, wouldn't you? It's in no way different from setting up an extension type. > The _testimportexec.c module Where can I find that module? > was derived from the existing example > xxlimited.c module, with a few name changes. The main functional > difference is that _testimportexec uses the new API, so the module > object is created externally and passed in to the API, rather than > being created by the extension module. The effect of this can be seen > in the test suite, where ImportExecTests.test_fresh_module shows that > loading the module twice will create two *different* modules, unlike > the legacy API. Stefan From ncoghlan at gmail.com Sat Aug 24 16:03:01 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 25 Aug 2013 00:03:01 +1000 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: <20130824125844.57a8f9a3@fsol> References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> Message-ID: On 24 August 2013 20:58, Antoine Pitrou wrote: > Someone can take the whole thing over if they want to, change the API > and make it more shiny or different, tweak the implementation to suit > it better to their own aesthetic sensibilities, but please don't revert > an useful feature unless it's based on concrete, serious issues rather > than a platonic disagreement about design. While "It's a useful feature" is a necessary criterion for adding something to the standard library, it has never been a *sufficient* criterion. 
There's a lot more to take into account when judging the latter, and one of the big ones is "There should be one obvious way to do it". Looking at the current documentation of ElementTree sets off alarm bells on that front, as it contains the following method descriptions for XMLParser: close() Finishes feeding data to the parser. Returns an element structure. feed(data) Feeds data to the parser. data is encoded data. And these for IncrementalParser: data_received(data) Feed the given bytes data to the incremental parser. eof_received() Signal the incremental parser that the data stream is terminated. events() Iterate over the events which have been encountered in the data fed to the parser. This method yields (event, elem) pairs, where event is a string representing the type of event (e.g. "end") and elem is the encountered Element object. Events provided in a previous call to events() will not be yielded again. It is thoroughly unclear to me as a user of the module how and why one would use the new IncrementalParser API over the existing incremental XMLParser API. If there is some defect in the XMLParser API that prevents it from interoperating correctly with asynchronous code, then *that is a bug to be fixed*, rather than avoided by adding a whole new parallel API. If Stefan's "please revert this" as lxml.etree maintainer isn't enough, then I'm happy to add a "please revert this" as a core committer that is confused about how and when the new tulip-inspired incremental parsing API should be used in preference to the existing incremental parsing API, and believes this needs to be clearly resolved before adding a second way to do it (especially if there's a possibility of using a different implementation strategy that avoids adding the second way). Regards, Nick.
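The existing incremental XMLParser API quoted above can be exercised in a few lines; a minimal illustrative sketch (not code from the patch under discussion):

```python
import xml.etree.ElementTree as ET

# Drive the existing incremental API: feed() accepts arbitrary byte
# chunks, close() finishes parsing and returns the root element.
parser = ET.XMLParser()
for chunk in (b"<root><item>a</item>", b"<item>b</item></root>"):
    parser.feed(chunk)
root = parser.close()
print([item.text for item in root.iter("item")])  # ['a', 'b']
```

This is the API the new IncrementalParser would sit alongside, which is what raises the "one obvious way" concern.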
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Sat Aug 24 16:13:34 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 24 Aug 2013 16:13:34 +0200 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> Message-ID: <20130824161334.31756a13@fsol> On Sun, 25 Aug 2013 00:03:01 +1000 Nick Coghlan wrote: > If Stefan's "please revert this" as lxml.etree maintainer isn't > enough, then I'm happy to add a "please revert this" as a core > committer that is confused about how and when the new tulip-inspired > incremental parsing API should be used in preference to the existing > incremental parsing API, and believes this needs to be clearly > resolved before adding a second way to do it > (especially if there's a > possibility of using a different implementation strategy that avoids > adding the second way). To be clear, again: anyone who wants to "see it resolved" can take over the issue and handle it by themselves. I'm done with it. Regards Antoine. From ncoghlan at gmail.com Sat Aug 24 16:22:39 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 25 Aug 2013 00:22:39 +1000 Subject: [Python-Dev] Pre-PEP: Redesigning extension modules In-Reply-To: References: <20130823111822.49cba700@pitrou.net> Message-ID: On 24 August 2013 23:19, Stefan Behnel wrote: > Nick Coghlan, 24.08.2013 13:36: >> On 24 August 2013 15:51, Nick Coghlan wrote: >>> My current plan is to create an experimental prototype of this >>> approach this weekend. That will include stdlib test cases, so it will >>> also show how it looks from the extension developer's point of view. >> >> I prototyped as much as I could without PEP 451's ModuleSpec support here: >> >> https://bitbucket.org/ncoghlan/cpython_sandbox/commits/branch/new_extension_imports > > Cool. I'll take a look. 
The new _PyImport_CreateAndExecExtensionModule function does the heavy lifting: https://bitbucket.org/ncoghlan/cpython_sandbox/src/081f8f7e3ee27dc309463b48e6c67cf4880fca12/Python/importdl.c?at=new_extension_imports#cl-65 One key point to note is that it *doesn't* call _PyImport_FixupExtensionObject, which is the API that handles all the PEP 3121 per-module state stuff. Instead, the idea will be for modules that don't need additional C level state to just implement PyImportExec_NAME, while those that *do* need C level state implement PyImportCreate_NAME and return a custom object (which may or may not be a module subtype). Such modules can still support reloading (e.g. to pick up reloaded or removed module dependencies) by providing PyImportExec_NAME as well. (in a PEP 451 world, this would likely be split up as two separate functions, one for create, one for exec) >> On systems that use dynload_shlib (at least Linux & the BSDs), this >> branch allows extension modules to be imported if they provide a >> PyImportExec_NAME hook. The new hook is preferred to the existing >> PyInit_NAME hook, so extension modules using the stable ABI can >> provide both and degrade to the legacy initialisation API on older >> versions of Python. > > Hmm, right, good call. Since both init schemes have to be part of the > stable ABI, we can's rely on people compiling out one or the other. So > using the old one as a fallback should work. However, only actual usage in > code will tell us how it feels on user side. Supporting both in the same > binary will most likely complicate things quite a bit. It shouldn't be too bad - the PyInit_NAME fallback would just need to do the equivalent of calling PyImportCreate_NAME (or PyModule_Create if not using a custom object), call PyImportExec_NAME on it, and then return the result. 
Modules that genuinely *needed* the new behaviour wouldn't be able to provide a sensible fallback, and would thus be limited to Python 3.4+ >> The PyImportExec hook is called with a pre-created module object that >> the hook is then expected to populate. To aid in this task, I added >> two new APIs: >> >> PyModule_SetDocString >> PyModule_AddFunctions >> >> These cover setting the docstring and adding module level functions, >> tasks that are handled through the PyModule_Create API when using the >> PyInit_NAME style hook. > > What are those needed for? If you subtype the module type, or provide an > arbitrary extension type as implementation, you'd get these for free, > wouldn't you? It's in no way different from setting up an extension type. The idea is to let people use an import system provided module object if they don't define a custom PyImportCreate_NAME hook. Setting the docstring and adding module level functions were the two things that PyModule_Create previously handled neatly through the Py_ModuleDef struct. The two new API functions just break out those subsets as separate operations to call on the import system provided module. >> The _testimportexec.c module > > Where can I find that module? Oops, forgot to add it to the repo. Uploaded now: https://bitbucket.org/ncoghlan/cpython_sandbox/src/081f8f7e3ee27dc309463b48e6c67cf4880fca12/Modules/_testimportexec.c?at=new_extension_imports Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Aug 24 16:26:15 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 25 Aug 2013 00:26:15 +1000 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: <20130824161334.31756a13@fsol> References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130824161334.31756a13@fsol> Message-ID: On 25 August 2013 00:13, Antoine Pitrou wrote: > On Sun, 25 Aug 2013 00:03:01 +1000 > Nick Coghlan wrote: >> If Stefan's "please revert this" as lxml.etree maintainer isn't >> enough, then I'm happy to add a "please revert this" as a core >> committer that is confused about how and when the new tulip-inspired >> incremental parsing API should be used in preference to the existing >> incremental parsing API, and believes this needs to be clearly >> resolved before adding a second way to do it >> (especially if there's a >> possibility of using a different implementation strategy that avoids >> adding the second way). > > To be clear, again: anyone who wants to "see it resolved" can take over > the issue and handle it by themselves. I'm done with it. OK, I'll revert it for now, then. If someone else steps up to resolve the API duplication problem, cool, otherwise we can continue to live without this as a standard library feature. Regards, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Aug 24 16:33:22 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 25 Aug 2013 00:33:22 +1000 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130824161334.31756a13@fsol> Message-ID: On 25 August 2013 00:26, Nick Coghlan wrote: > On 25 August 2013 00:13, Antoine Pitrou wrote: >> On Sun, 25 Aug 2013 00:03:01 +1000 >> Nick Coghlan wrote: >>> If Stefan's "please revert this" as lxml.etree maintainer isn't >>> enough, then I'm happy to add a "please revert this" as a core >>> committer that is confused about how and when the new tulip-inspired >>> incremental parsing API should be used in preference to the existing >>> incremental parsing API, and believes this needs to be clearly >>> resolved before adding a second way to do it >>> (especially if there's a >>> possibility of using a different implementation strategy that avoids >>> adding the second way). >> >> To be clear, again: anyone who wants to "see it resolved" can take over >> the issue and handle it by themselves. I'm done with it. > > OK, I'll revert it for now, then. If someone else steps up to resolve > the API duplication problem, cool, otherwise we can continue to live > without this as a standard library feature. On the other hand... because other changes have been made to the module since the original commit, a simple "hg backout" is no longer possible :( Stefan - if you'd like this reverted, you're going to have to either make the alternative solution work correctly, or else craft the commit to undo the API addition. However, I have at least reopened http://bugs.python.org/issue17741 Regards, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From eliben at gmail.com Sat Aug 24 17:03:22 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 24 Aug 2013 08:03:22 -0700 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130824161334.31756a13@fsol> Message-ID: On Sat, Aug 24, 2013 at 7:33 AM, Nick Coghlan wrote: > On 25 August 2013 00:26, Nick Coghlan wrote: > > On 25 August 2013 00:13, Antoine Pitrou wrote: > >> On Sun, 25 Aug 2013 00:03:01 +1000 > >> Nick Coghlan wrote: > >>> If Stefan's "please revert this" as lxml.etree maintainer isn't > >>> enough, then I'm happy to add a "please revert this" as a core > >>> committer that is confused about how and when the new tulip-inspired > >>> incremental parsing API should be used in preference to the existing > >>> incremental parsing API, and believes this needs to be clearly > >>> resolved before adding a second way to do it > >>> (especially if there's a > >>> possibility of using a different implementation strategy that avoids > >>> adding the second way). > >> > >> To be clear, again: anyone who wants to "see it resolved" can take over > >> the issue and handle it by themselves. I'm done with it. > > > > OK, I'll revert it for now, then. If someone else steps up to resolve > > the API duplication problem, cool, otherwise we can continue to live > > without this as a standard library feature. > > On the other hand... because other changes have been made to the > module since the original commit, a simple "hg backout" is no longer > possible :( > > Stefan - if you'd like this reverted, you're going to have to either > make the alternative solution work correctly, or else craft the commit > to undo the API addition. > > However, I have at least reopened http://bugs.python.org/issue17741 > Let's please keep the discussion calm and civil, everyone, and keep things in proportion. 
This is precisely what alpha releases are for - we have time before beta (Nov 24) to tweak the API. It's a fairly minor feature that *does* appear useful. I agree it would be nice to find an API that's acceptable for more developers. I'll try to find time to review this again, and others are free to do so too. Eli From stefan_ml at behnel.de Sat Aug 24 17:43:07 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 24 Aug 2013 17:43:07 +0200 Subject: [Python-Dev] Pre-PEP: Redesigning extension modules In-Reply-To: References: <20130823111822.49cba700@pitrou.net> Message-ID: Nick Coghlan, 24.08.2013 16:22: > On 24 August 2013 23:19, Stefan Behnel wrote: >> Nick Coghlan, 24.08.2013 13:36: >>> On 24 August 2013 15:51, Nick Coghlan wrote: >>>> My current plan is to create an experimental prototype of this >>>> approach this weekend. That will include stdlib test cases, so it will >>>> also show how it looks from the extension developer's point of view. >>> >>> I prototyped as much as I could without PEP 451's ModuleSpec support here: >>> >>> https://bitbucket.org/ncoghlan/cpython_sandbox/commits/branch/new_extension_imports >> >> Cool. I'll take a look. > > The new _PyImport_CreateAndExecExtensionModule function does the heavy lifting: > > https://bitbucket.org/ncoghlan/cpython_sandbox/src/081f8f7e3ee27dc309463b48e6c67cf4880fca12/Python/importdl.c?at=new_extension_imports#cl-65 > > One key point to note is that it *doesn't* call > _PyImport_FixupExtensionObject, which is the API that handles all the > PEP 3121 per-module state stuff. Instead, the idea will be for modules > that don't need additional C level state to just implement > PyImportExec_NAME, while those that *do* need C level state implement > PyImportCreate_NAME and return a custom object (which may or may not > be a module subtype). Is it really a common case for an extension module not to need any C level state at all?
I mean, this might work for very simple accelerator modules with only a few stand-alone functions. But anything non-trivial will almost certainly have some kind of global state, cache, external library, etc., and that state is best stored at the C level for safety reasons. > Such modules can still support reloading (e.g. > to pick up reloaded or removed module dependencies) by providing > PyImportExec_NAME as well. > > (in a PEP 451 world, this would likely be split up as two separate > functions, one for create, one for exec) Can't we just always require extension modules to implement their own type? Sure, it's a lot of boiler plate code, but that could be handled by a simple C code generator or maybe even a copy&paste example in the docs. I would like to avoid making it too easy for users in the future to get anything wrong with reloading or sub-interpreters. Most people won't test these things for their own code and the harder it is to make them not work, the more likely it is that a given set of dependencies will properly work in a sub-interpreter. If users are required to implement their own type, I think it would be more obvious where to put global module state, how to define functions (i.e. module methods), how to handle garbage collection at the global module level, etc. >>> On systems that use dynload_shlib (at least Linux & the BSDs), this >>> branch allows extension modules to be imported if they provide a >>> PyImportExec_NAME hook. The new hook is preferred to the existing >>> PyInit_NAME hook, so extension modules using the stable ABI can >>> provide both and degrade to the legacy initialisation API on older >>> versions of Python. >> >> Hmm, right, good call. Since both init schemes have to be part of the >> stable ABI, we can's rely on people compiling out one or the other. So >> using the old one as a fallback should work. However, only actual usage in >> code will tell us how it feels on user side. 
Supporting both in the same >> binary will most likely complicate things quite a bit. > > It shouldn't be too bad - the PyInit_NAME fallback would just need to > do the equivalent of calling PyImportCreate_NAME (or PyModule_Create > if not using a custom object), call PyImportExec_NAME on it, and then > return the result. > > Modules that genuinely *needed* the new behaviour wouldn't be able to > provide a sensible fallback, and would thus be limited to Python 3.4+ Right. I only saw it from the POV of Cython, which *will* have to support both, and *will* use the new feature in Py3.4+. No idea how that is going to work, but we've found so many tricks and work-arounds in the past that I'm sure it'll work somehow. Module level properties are just way too tempting not to make use of them. Stefan From tjreedy at udel.edu Sat Aug 24 20:42:24 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 24 Aug 2013 14:42:24 -0400 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> Message-ID: On 8/24/2013 10:03 AM, Nick Coghlan wrote: I have not used ET or equivalent, but I do have opinions on function names. > Looking at the current documentation of ElementTree sets of alarm > bells on that front, as it contains the following method descriptions > for XMLParser: > > close() > Finishes feeding data to the parser. Returns an element structure. > > feed(data) > Feeds data to the parser. data is encoded data. These are short active verbs, reused from other Python contexts. > > And these for IncrementalParser: > > data_received(data) > Feed the given bytes data to the incremental parser. Longer, awkward, and to me ugly in comparison to 'feed'. Since it seems to mean more or less the same thing, why not reuse 'feed' and continue to build on people prior knowledge of Python? Is this the 'tulip inspired' part? If so, I hope the names are not set in stone yet. 
> eof_received() > Signal the incremental parser that the data stream is terminated. What is the incremental parser supposed to do with the information? Close ;-? > events() > Iterate over the events which have been encountered in the > data fed to the parser. This method yields (event, elem) pairs, where > event is a string representing the type of event (e.g. "end") and elem > is the encountered Element object. Events provided in a previous call > to events() will not be yielded again. Plural nouns work well as iterator names: 'for event in events:'. -- Terry Jan Reedy From g.brandl at gmx.net Sat Aug 24 20:51:17 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 24 Aug 2013 20:51:17 +0200 Subject: [Python-Dev] Status of 3.2 in Hg repository? In-Reply-To: References: <20130820101039.4eea3d81@pitrou.net> <20130820171605.DB59F2507B6@webabinitio.net> <20130820192758.19e4953f@fsol> <20130820195527.7bd0253a@fsol> Message-ID: Am 21.08.2013 21:26, schrieb Brett Cannon: > > > > On Wed, Aug 21, 2013 at 2:22 PM, Tim Peters > wrote: > > [Tim, wondering why the 3.2 branch isn't "inactive"] > >> ... > >> So let's try a different question ;-) Would anyone _object_ to > >> completing the process described in the docs: merge 3.2 into 3.3, > >> then merge 3.3 into default? I'd be happy to do that. I'd throw away > >> all the merge changes except for adding the v3,2.5 tag to .hgtags. > >> > >> The only active branches remaining would be `default` and 2.7, which > >> is what I expected when I started this ;-) > > [Brett Cannon] > > While I would think Georg can object if he wants, I see no reason to help > > visibly shutter the 3.2 branch by doing null merges. It isn't like it makes > > using hg harder or the history harder to read. > > Well, why do we _ever_ do a null merge? Then why don't the reasons > apply in this case? > > > After reading that sentence I realize there is a key "not" missing: "I see no > reason NOT to help visibly shutter the 3.2. branch ...". 
IOW I say do the null > merge. Sorry about that. FWIW I have no real objections, I just don't see the gain. Georg From tjreedy at udel.edu Sat Aug 24 21:11:56 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 24 Aug 2013 15:11:56 -0400 Subject: [Python-Dev] Pre-PEP: Redesigning extension modules In-Reply-To: References: <20130823111822.49cba700@pitrou.net> <20130824135328.5eb11c6f@fsol> Message-ID: On 8/24/2013 8:51 AM, Stefan Behnel wrote: > Antoine Pitrou, 24.08.2013 13:53: >> This would also imply extension module have to be subclasses of the >> built-in module type. They can't be arbitrary objects like Stefan >> proposed. I'm not sure what the latter enables, but it would probably >> make things more difficult internally. > > My line of thought was more like: if Python code can stick anything into > sys.modules and the runtime doesn't care, why can't extension modules stick > anything into sys.modules as well? Being able to stick anything in sys.modules in CPython is an implementation artifact rather than language feature. "sys.modules This is a dictionary that maps module names to modules which have already been loaded." This implies to me that an implementation could use a dict subclass (or subtype if you prefer) that checks that keys are names and values ModuleType instances (or None). "This can be manipulated to force reloading of modules and other tricks." I guess this refers to the undocumented (at least here) option of None as a signal value. > I can't really see the advantage of requiring a subtype here. Or even just > a type, as I said. A 'module' has to work with the import machinery and user code. I would ask, "What is the advantage of loosening the current spec?" (Or reinterpreting 'module', if you prefer.) Loosening is hard to undo once done. 
-- Terry Jan Reedy From benjamin at python.org Sat Aug 24 21:17:15 2013 From: benjamin at python.org (Benjamin Peterson) Date: Sat, 24 Aug 2013 14:17:15 -0500 Subject: [Python-Dev] Pre-PEP: Redesigning extension modules In-Reply-To: References: <20130823111822.49cba700@pitrou.net> <20130824135328.5eb11c6f@fsol> Message-ID: 2013/8/24 Terry Reedy : > On 8/24/2013 8:51 AM, Stefan Behnel wrote: >> >> Antoine Pitrou, 24.08.2013 13:53: >>> >>> This would also imply extension module have to be subclasses of the >>> built-in module type. They can't be arbitrary objects like Stefan >>> proposed. I'm not sure what the latter enables, but it would probably >>> make things more difficult internally. >> >> >> My line of thought was more like: if Python code can stick anything into >> sys.modules and the runtime doesn't care, why can't extension modules >> stick >> anything into sys.modules as well? > > > Being able to stick anything in sys.modules in CPython is an implementation > artifact rather than language feature. This is not really true. Many people use this feature to replace modules as they are being imported with other things. -- Regards, Benjamin From tim.peters at gmail.com Sat Aug 24 22:38:08 2013 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 24 Aug 2013 15:38:08 -0500 Subject: [Python-Dev] Status of 3.2 in Hg repository? In-Reply-To: References: <20130820101039.4eea3d81@pitrou.net> <20130820171605.DB59F2507B6@webabinitio.net> <20130820192758.19e4953f@fsol> <20130820195527.7bd0253a@fsol> Message-ID: [Tim, wondering why the 3.2 branch isn't "inactive"] [Georg Brandl] > FWIW I have no real objections, I just don't see the gain. I'm glad it's OK! Especially because it's already been done ;-) Two gains: 1. "hg branches" output now matches what the developer docs imply it should be. It didn't before. 2. 
If a security fix needs to be made to 3.2, it will be much easier to forward-merge it to the 3.3 and default branches now (the merges won't suck in a pile of ancient, and unwanted, irrelevant-to-the-fix changes). From g.brandl at gmx.net Sat Aug 24 22:59:52 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 24 Aug 2013 22:59:52 +0200 Subject: [Python-Dev] Status of 3.2 in Hg repository? In-Reply-To: References: <20130820101039.4eea3d81@pitrou.net> <20130820171605.DB59F2507B6@webabinitio.net> <20130820192758.19e4953f@fsol> <20130820195527.7bd0253a@fsol> Message-ID: Am 24.08.2013 22:38, schrieb Tim Peters: > [Tim, wondering why the 3.2 branch isn't "inactive"] > > [Georg Brandl] >> FWIW I have no real objections, I just don't see the gain. > > I'm glad it's OK! Especially because it's already been done ;-) > > Two gains: > > 1. "hg branches" output now matches what the developer docs imply it > should be. It didn't before. Well, the dev docs are not dogma and could be changed :) > 2. If a security fix needs to be made to 3.2, it will be much easier to > forward-merge it to the 3.3 and default branches now (the merges won't > suck in a pile of ancient, and unwanted, irrelevant-to-the-fix > changes). It's unusual to develop a security fix on 3.2; usually the fix is done in the active branches and then backported to security-only branches. But I get the consistency argument (and especially the .hgtags entry is nice to have in the newer branches).
cheers, Georg From ncoghlan at gmail.com Sat Aug 24 23:26:41 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 25 Aug 2013 07:26:41 +1000 Subject: [Python-Dev] Pre-PEP: Redesigning extension modules In-Reply-To: References: <20130823111822.49cba700@pitrou.net> <20130824135328.5eb11c6f@fsol> Message-ID: On 25 Aug 2013 05:19, "Benjamin Peterson" wrote: > > 2013/8/24 Terry Reedy : > > On 8/24/2013 8:51 AM, Stefan Behnel wrote: > >> > >> Antoine Pitrou, 24.08.2013 13:53: > >>> > >>> This would also imply extension module have to be subclasses of the > >>> built-in module type. They can't be arbitrary objects like Stefan > >>> proposed. I'm not sure what the latter enables, but it would probably > >>> make things more difficult internally. > >> > >> > >> My line of thought was more like: if Python code can stick anything into > >> sys.modules and the runtime doesn't care, why can't extension modules > >> stick > >> anything into sys.modules as well? > > > > > > Being able to stick anything in sys.modules in CPython is an implementation > > artifact rather than language feature. > > This is not really true. Many people use this feature to replace > modules as they are being imported with other things. Right - arbitrary objects in sys.modules is definitely a supported feature (e.g. most lazy import mechanisms rely on that). However, such objects should really provide the module level attributes the import system expects for ducktyping purposes, which is why I suggest the import system should automatically take care of setting those. Cheers, Nick. > > -- > Regards, > Benjamin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Sat Aug 24 23:43:44 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 25 Aug 2013 07:43:44 +1000 Subject: [Python-Dev] Pre-PEP: Redesigning extension modules In-Reply-To: References: <20130823111822.49cba700@pitrou.net> Message-ID: On 25 Aug 2013 01:44, "Stefan Behnel" wrote: > > Nick Coghlan, 24.08.2013 16:22: > > On 24 August 2013 23:19, Stefan Behnel wrote: > >> Nick Coghlan, 24.08.2013 13:36: > >>> On 24 August 2013 15:51, Nick Coghlan wrote: > >>>> My current plan is to create an experimental prototype of this > >>>> approach this weekend. That will include stdlib test cases, so it will > >>>> also show how it looks from the extension developer's point of view. > >>> > >>> I prototyped as much as I could without PEP 451's ModuleSpec support here: > >>> > >>> https://bitbucket.org/ncoghlan/cpython_sandbox/commits/branch/new_extension_imports > >> > >> Cool. I'll take a look. > > > > The new _PyImport_CreateAndExecExtensionModule function does the heavy lifting: > > > > https://bitbucket.org/ncoghlan/cpython_sandbox/src/081f8f7e3ee27dc309463b48e6c67cf4880fca12/Python/importdl.c?at=new_extension_imports#cl-65 > > > > One key point to note is that it *doesn't* call > > _PyImport_FixupExtensionObject, which is the API that handles all the > > PEP 3121 per-module state stuff. Instead, the idea will be for modules > > that don't need additional C level state to just implement > > PyImportExec_NAME, while those that *do* need C level state implement > > PyImportCreate_NAME and return a custom object (which may or may not > > be a module subtype). > > Is it really a common case for an extension module not to need any C level > state at all? I mean, this might work for very simple accelerator modules > with only a few stand-alone functions. But anything non-trivial will almost > certainly have some kind of global state, cache, external library, etc., > and that state is best stored at the C level for safety reasons. 
I'd prefer to encourage people to put that state on an exported *type* rather than directly in the module global state. So while I agree we need to *support* C level module globals, I'd prefer to provide a simpler alternative that avoids them. We also need the create/exec split to properly support reloading. Reload *must* reinitialize the object already in sys.modules instead of inserting a different object or it completely misses the point of reloading modules over deleting and reimporting them (i.e. implicitly affecting the references from other modules that imported the original object). > > Such modules can still support reloading (e.g. > > to pick up reloaded or removed module dependencies) by providing > > PyImportExec_NAME as well. > > > > (in a PEP 451 world, this would likely be split up as two separate > > functions, one for create, one for exec) > > Can't we just always require extension modules to implement their own type? > Sure, it's a lot of boiler plate code, but that could be handled by a > simple C code generator or maybe even a copy&paste example in the docs. I > would like to avoid making it too easy for users in the future to get > anything wrong with reloading or sub-interpreters. Most people won't test > these things for their own code and the harder it is to make them not work, > the more likely it is that a given set of dependencies will properly work > in a sub-interpreter. > > If users are required to implement their own type, I think it would be more > obvious where to put global module state, how to define functions (i.e. > module methods), how to handle garbage collection at the global module > level, etc. Take a look at the current example - everything gets stored in the module dict for the simple case with no C level global state. The module level functions are still added through a Py_MethodDef array, the docstring still comes from a C char pointer. 
I did have to fix the custom type's tp_new method to use the type pointer passed in by the interpreter rather than a C static global pointer, but that change would also have been needed if defining a custom type. Since Antoine fixed it, there's also nothing particularly quirky about module destruction in 3.4+ - cyclic GC should "just work". Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Sun Aug 25 00:35:02 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 24 Aug 2013 15:35:02 -0700 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130824161334.31756a13@fsol> Message-ID: On Sat, Aug 24, 2013 at 7:33 AM, Nick Coghlan wrote: > On 25 August 2013 00:26, Nick Coghlan wrote: > > On 25 August 2013 00:13, Antoine Pitrou wrote: > >> On Sun, 25 Aug 2013 00:03:01 +1000 > >> Nick Coghlan wrote: > >>> If Stefan's "please revert this" as lxml.etree maintainer isn't > >>> enough, then I'm happy to add a "please revert this" as a core > >>> committer that is confused about how and when the new tulip-inspired > >>> incremental parsing API should be used in preference to the existing > >>> incremental parsing API, and believes this needs to be clearly > >>> resolved before adding a second way to do it > >>> (especially if there's a > >>> possibility of using a different implementation strategy that avoids > >>> adding the second way). > >> > >> To be clear, again: anyone who wants to "see it resolved" can take over > >> the issue and handle it by themselves. I'm done with it. > > > > OK, I'll revert it for now, then. If someone else steps up to resolve > > the API duplication problem, cool, otherwise we can continue to live > > without this as a standard library feature. > > On the other hand... 
because other changes have been made to the > module since the original commit, a simple "hg backout" is no longer > possible :( > > Stefan - if you'd like this reverted, you're going to have to either > make the alternative solution work correctly, or else craft the commit > to undo the API addition. > I'm strongly opposed to reverting because it cleaned up messy code duplication and actually made the code size smaller. While I agree that the API of incremental parsing should be given another look, IncrementalParser can also be seen as an implementation detail of iterparse(). Thus, it's probably OK to revert the documentation part of the commit to not mention IncrementalParser at all, making it an undocumented internal implementation detail (one of many in this module and elsewhere). However, since we're still in alpha I don't see much point in doing this change now. Let's keep discussing this in the issue. Anyone interested - please make yourself nosy and any feedback on the API is welcome. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Sun Aug 25 02:55:40 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sun, 25 Aug 2013 09:55:40 +0900 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130824161334.31756a13@fsol> Message-ID: <87txieeghf.fsf@uwakimon.sk.tsukuba.ac.jp> Eli Bendersky writes: > I'm strongly opposed to reverting [the change to ElementTree] > because it cleaned up messy code duplication and actually made the > code size smaller. While I agree that the API of incremental parsing > should be given another look, IncrementalParser can also be seen as > an implementation detail of iterparse(). Except that its API is familiar and cleaner. Does any current application depend on *not* doing whatever it is that the new API does that IncrementalParser *does* do?
If not, why not keep the API of IncrementalParser and shim the new code in under that? > Thus, it's probably OK to revert the documentation part of the > commit to not mention IncrementalParser at all, FWIW, as somebody who can recall using ET exactly once, IncrementalParser is what I used. From eliben at gmail.com Sun Aug 25 04:53:06 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 24 Aug 2013 19:53:06 -0700 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: <87txieeghf.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130824161334.31756a13@fsol> <87txieeghf.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Sat, Aug 24, 2013 at 5:55 PM, Stephen J. Turnbull wrote: > Eli Bendersky writes: > > > I'm strongly opposed to reverting [the change to ElementTree] > > because it cleaned up messy code duplication and actually made the > > code size smaller. While I agree that the API of incremental parsing > > should be given another look, IncrementalParser can also be seen as > > an implementation detail of iterparse(). > > Except that its API is familiar and cleaner. Does any current > application depend on *not* doing whatever it is that the new API does > that IncrementalParser *does* do? If not, why not keep the API of > IncrementalParser and shim the new code in under that? > I'm having difficulty parsing the above. Could you please re-phrase your suggestion? > > > Thus, it's probably OK to revert the documentation part of the > > commit to not mention IncrementalParser at all, > > FWIW, as somebody who can recall using ET exactly once, > IncrementalParser is what I used. > > Just to be on the safe side, I want to make sure that you indeed mean IncrementalParser, which was committed 4 months ago into the Mercurial default branch (3.4) and has only seen an alpha release? Eli -------------- next part -------------- An HTML attachment was scrubbed...
URL: From pje at telecommunity.com Sun Aug 25 06:12:36 2013 From: pje at telecommunity.com (PJ Eby) Date: Sun, 25 Aug 2013 00:12:36 -0400 Subject: [Python-Dev] Pre-PEP: Redesigning extension modules In-Reply-To: References: Message-ID: On Fri, Aug 23, 2013 at 4:50 AM, Stefan Behnel wrote: > Reloading and Sub-Interpreters > ============================== > > To "reload" an extension module, the module create function is executed > again and returns a new module type. This type is then instantiated as by > the original module loader and replaces the previous entry in sys.modules. > Once the last references to the previous module and its type are gone, both > will be subject to normal garbage collection. I haven't had a chance to address this on the import-sig discussion yet about ModuleSpec, but I would like to just mention that one property of the existing module system that I'm not sure either this proposal or the ModuleSpec proposal preserves is that it's possible to implement lazy importing of modules using standard reload() semantics. My "Importing" package offers lazy imports by creating module objects in sys.modules that are a subtype of ModuleType, and use a __getattribute__ hook so that trying to use them fires off a reload() of the module. Because the dummy module doesn't have __file__ or anything else initialized, the import system searches for the module and then loads it, reusing the existing module object, even though it's actually only executing the module code for the first time. That the existing object be reused is important, because once the dummy is in sys.modules, it can also be imported by other modules, so references to it can abound everywhere, and we wish only for it to be loaded lazily, without needing to trace down and replace all instances of it. This also preserves other invariants of the module system. 
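[A rough sketch of the lazy-module technique PJ Eby describes above, for readers who want to see the moving parts. This is not the actual "Importing" package: the LazyModule class, the _lazy_loaded flag, and the demo_lazy module written to a temporary directory are all invented for the illustration, and it leans on modern importlib.reload() behaviour rather than the 2013-era machinery.]

```python
import importlib
import os
import sys
import tempfile
import types

class LazyModule(types.ModuleType):
    """Placeholder put in sys.modules; runs the real module code on first use."""
    def __getattribute__(self, attr):
        get = types.ModuleType.__getattribute__
        # Let dunder lookups (and everything after the first load) pass straight
        # through, so the import machinery can inspect __name__, __spec__, etc.
        if attr.startswith('__') or get(self, '__dict__').get('_lazy_loaded'):
            return get(self, attr)
        get(self, '__dict__')['_lazy_loaded'] = True
        importlib.reload(self)  # executes the module's code into *this* object
        return get(self, attr)

# Demo setup: a throwaway module in a temporary directory.
tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, 'demo_lazy.py'), 'w') as f:
    f.write('ANSWER = 42\n')
sys.path.insert(0, tmp)

placeholder = sys.modules['demo_lazy'] = LazyModule('demo_lazy')
import demo_lazy                        # returns the placeholder; no code has run
assert 'ANSWER' not in vars(demo_lazy)  # still empty
answer = demo_lazy.ANSWER               # first real access triggers the load
assert answer == 42 and demo_lazy is placeholder
```

The key point is the one PJ makes: because the placeholder is already in sys.modules, every importer holds the same object before and after the real load, so nothing has to be tracked down and replaced.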
Anyway, the reason I was asking why reloading is being handled as a special case in the ModuleSpec proposal -- and the reason I'm curious about certain provisions of this proposal -- is that making the assumption you can only reload something with the same spec/location/etc. it was originally loaded with, and/or that if you are reloading a module then you previously had a chance to do things to it, doesn't jibe with the way things work currently. That is to say, in the pure PEP 302 world, there is no special status for "reload" that is different from "load" -- the *only* thing that's different is that there is already a module object to use, and there is *no guarantee that it's a module object that was initialized by the loader now being invoked*. AFAICT both this proposal and the ModuleSpec one are making an invalid assumption per PEP 302, and aren't explicitly proposing to change the status quo: they just assume things that aren't actually assured by the prior specs or implementations. So, for example, this extension module proposal needs to cover what happens if an extension module is reloaded and the module object is not of the type or instance it's expecting. Must it do its own checking? Error handling? Will some other portion of the import system be expected to handle it? For that matter, what happens (in either proposal) if you reload() a module which only has a __name__, and no other attributes? I haven't tested with importlib, but with earlier Pythons this results in a standard module search being done by reload(). But the ModuleSpec proposal and this one seem to assume that a reload()-ed module must already be associated with a loader, location, and/or spec. 
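[The invariant this thread keeps returning to, that reload() re-executes code into the object already present in sys.modules rather than substituting a new one, is easy to verify with a plain Python module. A minimal check; the demo_reload module and temporary directory are invented for the example, and sys.dont_write_bytecode is set so the rewritten source is actually recompiled.]

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True               # force recompilation from source

tmp = tempfile.mkdtemp()
path = os.path.join(tmp, 'demo_reload.py')   # throwaway module for the demo
with open(path, 'w') as f:
    f.write('VALUE = 1\n')
sys.path.insert(0, tmp)

import demo_reload
before = demo_reload
assert demo_reload.VALUE == 1

with open(path, 'w') as f:                   # change the source on disk
    f.write('VALUE = 2\n')
after = importlib.reload(demo_reload)

assert after is before                       # the existing module object is reused
assert demo_reload.VALUE == 2                # and its code was re-executed in place
```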
From ncoghlan at gmail.com Sun Aug 25 08:36:48 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 25 Aug 2013 16:36:48 +1000 Subject: [Python-Dev] Pre-PEP: Redesigning extension modules In-Reply-To: References: Message-ID: On 25 August 2013 14:12, PJ Eby wrote: > That is to say, in the pure PEP 302 world, there is no special status > for "reload" that is different from "load" -- the *only* thing that's > different is that there is already a module object to use, and there > is *no guarantee that it's a module object that was initialized by the > loader now being invoked*. Yeah, this is an aspect of why I'd like PEP 451 to use create & exec for the new loader API components. That way, any loader which either doesn't define the create method, or which returns NotImplemented from the call (a subtlety needed to make this work for C extensions), can be used with reload *and* with the -m switch via runpy (currently runpy demands the ability to get hold of the code object). > AFAICT both this proposal and the ModuleSpec one are making an invalid > assumption per PEP 302, and aren't explicitly proposing to change the > status quo: they just assume things that aren't actually assured by > the prior specs or implementations. Indeed. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stephen at xemacs.org Sun Aug 25 09:30:44 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sun, 25 Aug 2013 16:30:44 +0900 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130824161334.31756a13@fsol> <87txieeghf.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <87ioyudy6z.fsf@uwakimon.sk.tsukuba.ac.jp> Eli Bendersky writes: > On Sat, Aug 24, 2013 at 5:55 PM, Stephen J. Turnbull wrote: >> FWIW, as somebody who can recall using ET exactly once, >> IncrementalParser is what I used. 
> Just to be on the safe side, I want to make sure that you indeed > mean IncrementalParser, which was committed 4 months ago into the > Mercurial default branch (3.4) and has only seen an alpha release? > Eli Oops, and thank you for your courtesy. No, actually looking at the code this time, I meant xml.sax.xmlreader.IncrementalParser, which has the same API as the new etree.ElementTree.IncrementalParser. No wonder it seems familiar. As for the suggestion, AIUI, you proposed keeping the current layering of iterparse on top of IncrementalParser, and then removing Incrementalparser from the documentation. My suggestion is to rename the current "IncrementalParser" class, and then use the IncrementalParser interface for what is currently named "iterparse". Assuming that, as Stefan claims, data_received == feed, and so on. Steve From stefan_ml at behnel.de Sun Aug 25 13:54:30 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 25 Aug 2013 13:54:30 +0200 Subject: [Python-Dev] Pre-PEP: Redesigning extension modules In-Reply-To: References: <20130823111822.49cba700@pitrou.net> Message-ID: Nick Coghlan, 24.08.2013 23:43: > On 25 Aug 2013 01:44, "Stefan Behnel" wrote: >> Nick Coghlan, 24.08.2013 16:22: >>> The new _PyImport_CreateAndExecExtensionModule function does the heavy >>> lifting: >>> >>> https://bitbucket.org/ncoghlan/cpython_sandbox/src/081f8f7e3ee27dc309463b48e6c67cf4880fca12/Python/importdl.c?at=new_extension_imports#cl-65 >>> >>> One key point to note is that it *doesn't* call >>> _PyImport_FixupExtensionObject, which is the API that handles all the >>> PEP 3121 per-module state stuff. Instead, the idea will be for modules >>> that don't need additional C level state to just implement >>> PyImportExec_NAME, while those that *do* need C level state implement >>> PyImportCreate_NAME and return a custom object (which may or may not >>> be a module subtype). >> >> Is it really a common case for an extension module not to need any C level >> state at all? 
I mean, this might work for very simple accelerator modules >> with only a few stand-alone functions. But anything non-trivial will >> almost >> certainly have some kind of global state, cache, external library, etc., >> and that state is best stored at the C level for safety reasons. > > I'd prefer to encourage people to put that state on an exported *type* > rather than directly in the module global state. So while I agree we need > to *support* C level module globals, I'd prefer to provide a simpler > alternative that avoids them. But that has an impact on the API then. Why do you want the users of an extension module to go through a separate object (even if it's just a singleton, for example) instead of going through functions at the module level? We don't currently encourage or propose this design for Python modules either. Quite the contrary, it's extremely common for Python modules to provide most of their functionality at the function level. And IMHO that's a good thing. Note that even global functions usually hold state, be it in the form of globally imported modules, global caches, constants, ... > We also need the create/exec split to properly support reloading. Reload > *must* reinitialize the object already in sys.modules instead of inserting > a different object or it completely misses the point of reloading modules > over deleting and reimporting them (i.e. implicitly affecting the > references from other modules that imported the original object). Interesting. I never thought of it that way. I'm not sure this can be done in general. What if the module has threads running that access the global state? In that case, reinitialising the module object itself would almost certainly lead to a crash. And what if you do "from extmodule import some_function" in a Python module? Then reloading couldn't replace that reference, just as for normal Python modules. 
Meaning that you'd still have to keep both modules properly alive in order to prevent crashes due to lost global state of the imported function. The difference to Python modules here is that in Python code, you'll get some kind of exception if state is lost during a reload. In C code, you'll most likely get a crash. How would you even make sure global state is properly cleaned up? Would you call tp_clear() on the module object before re-running the init code? Or how else would you enable the init code to do the right thing during both the first run (where global state is uninitialised) and subsequent runs (where global state may hold valid state and owned Python references)? Even tp_clear() may not be enough, because it's only meant to clean up Python references, not C-level state. Basically, for reloading to be correct without changing the object reference, it would have to go all the way through tp_dealloc(), catch the object at the very end, right before it gets freed, and then re-initialise it. This sounds like we need some kind of indirection (as you mentioned above), but without the API impact that a separate type implies. Simply making modules an arbitrary extension type, as I proposed, cannot solve this. (Actually, my intuition tells me that if it can't really be made to work 100% for Python modules, e.g. due to the from-import case, why bother with it for extension types?) >>> Such modules can still support reloading (e.g. >>> to pick up reloaded or removed module dependencies) by providing >>> PyImportExec_NAME as well. >>> >>> (in a PEP 451 world, this would likely be split up as two separate >>> functions, one for create, one for exec) >> >> Can't we just always require extension modules to implement their own >> type? >> Sure, it's a lot of boiler plate code, but that could be handled by a >> simple C code generator or maybe even a copy&paste example in the docs. 
I >> would like to avoid making it too easy for users in the future to get >> anything wrong with reloading or sub-interpreters. Most people won't test >> these things for their own code and the harder it is to make them not >> work, >> the more likely it is that a given set of dependencies will properly work >> in a sub-interpreter. >> >> If users are required to implement their own type, I think it would be >> more >> obvious where to put global module state, how to define functions (i.e. >> module methods), how to handle garbage collection at the global module >> level, etc. > > Take a look at the current example - everything gets stored in the module > dict for the simple case with no C level global state. Well, you're storing types there. And those types are your module API. I understand that it's just an example, but I don't think it matches a common case. As far as I can see, the types are not even interacting with each other, let alone doing any C-level access of each other. We should try to focus on the normal case that needs C-level state and C-level field access of extension types. Once that's solved, we can still think about how to make the really simple cases simpler, if it turns out that they are not simple enough. Keeping everything in the module dict is a design that (IMHO) is too error prone. C state should be kept safely at the C level, outside of the reach of Python code. I don't want users of my extension module to be able to provoke a crash by saying "extmodule._xyz = None". I didn't know about PyType_FromSpec(), BTW. It looks like a nice addition for manually written code (although useless for Cython). Stefan From stefan_ml at behnel.de Sun Aug 25 14:36:42 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 25 Aug 2013 14:36:42 +0200 Subject: [Python-Dev] Pre-PEP: Redesigning extension modules In-Reply-To: References: Message-ID: Hi, thanks for bringing this up. 
It clearly shows that there is more to this problem than I initially thought. Let me just add one idea that your post gave me. PJ Eby, 25.08.2013 06:12: > My "Importing" package offers lazy imports by creating module objects > in sys.modules that are a subtype of ModuleType, and use a > __getattribute__ hook so that trying to use them fires off a reload() > of the module. I wonder if this wouldn't be an approach to fix the reloading problem in general. What if extension module loading, at least with the new scheme, didn't return the module object itself and put it into sys.modules but created a wrapper that redirects its __getattr__ and __setattr__ to the actual module object? That would have a tiny performance impact on attribute access, but I'd expect that to be negligible given that the usual reason for the extension module to exist is that it does non-trivial stuff in whatever its API provides. Reloading could then really create a completely new module object and replace the reference inside of the wrapper. That way, code that currently uses "from extmodule import xyz" would continue to see the original version of the module as of the time of its import, and code that just did "import extmodule" and then used attribute access at need would always see the current content of the module as it was last loaded. I think that, together with keeping module global state in the module object itself, would nicely fix both cases. Stefan From tjreedy at udel.edu Sun Aug 25 20:06:49 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 25 Aug 2013 14:06:49 -0400 Subject: [Python-Dev] Pre-PEP: Redesigning extension modules In-Reply-To: References: <20130823111822.49cba700@pitrou.net> Message-ID: On 8/25/2013 7:54 AM, Stefan Behnel wrote: > And what if you do "from extmodule import some_function" in a Python > module? Then reloading couldn't replace that reference, just as for normal > Python modules. 
Meaning that you'd still have to keep both modules properly > alive in order to prevent crashes due to lost global state of the imported > function. People who want to reload modules sometimes know before they start that they will want to. If so, they can just 'import' instead of 'from import' and access everything through the module. There is still the problem of persistent class instances directly accessing classes for attributes, but maybe that can be directed through the class also. -- Terry Jan Reedy From solipsis at pitrou.net Mon Aug 26 10:36:18 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 26 Aug 2013 10:36:18 +0200 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> Message-ID: <20130826103618.6f7a2d38@pitrou.net> Le Sat, 24 Aug 2013 14:42:24 -0400, Terry Reedy a ?crit : > > > > And these for IncrementalParser: > > > > data_received(data) > > Feed the given bytes data to the incremental parser. > > Longer, awkward, and to me ugly in comparison to 'feed'. Since it > seems to mean more or less the same thing, why not reuse 'feed' and > continue to build on people prior knowledge of Python? Just because *your* prior knowledge of Python doesn't include event-driven processing using network libraries, doesn't mean it's a completely new and unknown thing to other people. There are reasons why "data_received" is better (less ambiguous) than "feed". If you want to influence tulip's design, however, this is the wrong mailing-list to do so. 
From tseaver at palladion.com Mon Aug 26 14:24:58 2013 From: tseaver at palladion.com (Tres Seaver) Date: Mon, 26 Aug 2013 08:24:58 -0400 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: <20130826103618.6f7a2d38@pitrou.net> References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130826103618.6f7a2d38@pitrou.net> Message-ID: On 08/26/2013 04:36 AM, Antoine Pitrou wrote: > event-driven processing using network libraries Maybe I missed something: why should considerations from that topic influence the design of an API for XML processing? 'feed' and 'close' make much more sense for a parser API, as well as having the benefit of long usage. Tres. -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com From solipsis at pitrou.net Mon Aug 26 14:51:59 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 26 Aug 2013 14:51:59 +0200 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130826103618.6f7a2d38@pitrou.net> Message-ID: <20130826145159.79a2d99c@pitrou.net> Le Mon, 26 Aug 2013 08:24:58 -0400, Tres Seaver a écrit : > On 08/26/2013 04:36 AM, Antoine Pitrou wrote: > > event-driven processing using network libraries > > Maybe I missed something: why should considerations from that topic > influence the design of an API for XML processing?
Because this API is mostly useful when the data is received (*) at a slow enough speed - which usually means from the network, not from a hard drive. ((*) "data" ... "received"; does it ring a bell? ;-)) If you want iterative processing from a fast data source, you can already use iterparse(): it's blocking, but it's not a problem with disk I/O (not to mention that non-blocking disk I/O doesn't really exist under Linux, AFAIK: I haven't been able to get EAGAIN with os.read() on a non-blocking file, even when reading from a huge uncached file). The whole *point* of adding IncrementalParser was to parse incoming XML data in a way that is friendly with event-driven network programming, other use cases being *already* covered by existing APIs. This is why it's far from nonsensical to re-use an existing terminology from that world. If you don't do any non-blocking network I/O, then fine - you won't even need the API, and can safely ignore its existence. From hodgestar+pythondev at gmail.com Mon Aug 26 17:44:41 2013 From: hodgestar+pythondev at gmail.com (Simon Cross) Date: Mon, 26 Aug 2013 17:44:41 +0200 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: <20130826145159.79a2d99c@pitrou.net> References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130826103618.6f7a2d38@pitrou.net> <20130826145159.79a2d99c@pitrou.net> Message-ID: On Mon, Aug 26, 2013 at 2:51 PM, Antoine Pitrou wrote: > Because this API is mostly useful when the data is received (*) at a > slow enough speed - which usually means from the network, not from a > hard drive. It looks like all the events have to be ready before one can start iterating over .events() in the new API? That doesn't seem that useful from an asynchronous programming perspective and .data_received() and .eof_received() appear to be thin wrappers over .feed() and .close()? Am I misunderstanding something? 
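[For comparison, the blocking iterparse() API Antoine contrasts with above looks like this — a minimal sketch, with io.BytesIO standing in for a real file or socket; each step of the iteration blocks until the source yields more bytes:]

```python
import io
import xml.etree.ElementTree as ET

src = io.BytesIO(b"<root><item>a</item><item>b</item></root>")

# iterparse() yields (event, element) pairs as parsing proceeds;
# by default only "end" events are reported.
tags = [elem.tag for event, elem in ET.iterparse(src)]
```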
From solipsis at pitrou.net Mon Aug 26 17:57:51 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 26 Aug 2013 17:57:51 +0200 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130826103618.6f7a2d38@pitrou.net> <20130826145159.79a2d99c@pitrou.net> Message-ID: <20130826175751.7a6896aa@pitrou.net> Le Mon, 26 Aug 2013 17:44:41 +0200, Simon Cross a écrit : > On Mon, Aug 26, 2013 at 2:51 PM, Antoine Pitrou > wrote: > > Because this API is mostly useful when the data is received (*) at a > > slow enough speed - which usually means from the network, not from a > > hard drive. > > It looks like all the events have to be ready before one can start > iterating over .events() in the new API? That doesn't seem that useful > from an asynchronous programming perspective and .data_received() and > .eof_received() appear to be thin wrappers over .feed() and .close()? What do you mean, "all events have to be ready"? If you look at the unit tests, the events are generated on-the-fly, not at the end of the document. (exactly the same as iterparse(), except that iterparse() is blocking) Implementation-wise, data_received() and eof_received() are not thin wrappers over feed() and close(), they rely on an internal API to get at the generated events (which justifies putting the functionality inside the etree module, by the way). Regards Antoine. 
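[The on-the-fly behaviour Antoine describes can be sketched with a pull-style loop like the one below. The spelling used here — a class named XMLPullParser with feed() and read_events() — is only a placeholder for whatever names the API ends up with, not the committed interface:]

```python
import xml.etree.ElementTree as ET

parser = ET.XMLPullParser(events=("start", "end"))

# Feed the first chunk: events are available immediately,
# well before the end of the document.
parser.feed("<root><item>a</item>")
first = [(event, elem.tag) for event, elem in parser.read_events()]

# Feed the rest of the data and collect the remaining events.
parser.feed("<item>b</item></root>")
parser.close()
rest = [(event, elem.tag) for event, elem in parser.read_events()]
```

[Nothing here blocks: a network protocol handler could call feed() from its data-received callback and drain read_events() afterwards.]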
From eliben at gmail.com Mon Aug 26 18:14:38 2013 From: eliben at gmail.com (Eli Bendersky) Date: Mon, 26 Aug 2013 09:14:38 -0700 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: <20130826175751.7a6896aa@pitrou.net> References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130826103618.6f7a2d38@pitrou.net> <20130826145159.79a2d99c@pitrou.net> <20130826175751.7a6896aa@pitrou.net> Message-ID: On Mon, Aug 26, 2013 at 8:57 AM, Antoine Pitrou wrote: > Le Mon, 26 Aug 2013 17:44:41 +0200, > Simon Cross a écrit : > > On Mon, Aug 26, 2013 at 2:51 PM, Antoine Pitrou > > wrote: > > > Because this API is mostly useful when the data is received (*) at a > > > slow enough speed - which usually means from the network, not from a > > > hard drive. > > > > It looks like all the events have to be ready before one can start > > iterating over .events() in the new API? That doesn't seem that useful > > from an asynchronous programming perspective and .data_received() and > > .eof_received() appear to be thin wrappers over .feed() and .close()? > > What do you mean, "all events have to be ready"? > If you look at the unit tests, the events are generated on-the-fly, > not at the end of the document. > (exactly the same as iterparse(), except that iterparse() is blocking) > > Implementation-wise, data_received() and eof_received() are not thin > wrappers over feed() and close(), they rely on an internal API to get > at the generated events (which justifies putting the functionality > inside the etree module, by the way). > Antoine, you opted out of the tracker issue but I feel it's fair to let you know that after a lot of discussion with Nick and Stefan (*), we've settled on renaming the input methods to feed & close, and the output method to read_events. We are also considering a different name for the class. 
I've posted with more detail and rationale in http://bugs.python.org/issue17741, but to summarize: The input-side of IncrementalParser is the same as the plain XMLParser. The latter can also be given data incrementally by means of "feed". By default it would collect the whole tree and return it in close(), but in reality you can rig a custom target that does something more fluid (though not to the full extent of IncrementalParser). Therefore it was deemed confusing to have different names for this. Another reason is consistency with xml.sax.xmlreader.IncrementalParser, which also has feed() and close(). As for the output method name, Nick suggested that read_events conveys the destructive nature of the method better (by analogy to file/stream APIs), and others agreed. As for the class name, IncrementalParser is ambiguous because it's not immediately clear which side is incremental. Input or output? For the input, it's no more incremental than XMLParser itself, as stated above. The output is what's different here, so we're considering a few candidates for a better name that conveys the meaning more precisely. And to reiterate, I realize that it's unpleasant for you to have this dug up after it has already been committed. I assume the blame for not reviewing it in more detail originally. However, I feel it would still be better to revise this now than just leave it be. APIs added to stdlib are cooked in there for a *long time*. Alternatively, Nick suggested granting this API a "provisional" status (PEP 411), and that's an option if we don't manage to reach some sort of consensus. Eli (*) Well, to be completely precise, Stefan is still opposed to the whole idea. -------------- next part -------------- An HTML attachment was scrubbed... 
From solipsis at pitrou.net Mon Aug 26 18:21:05 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 26 Aug 2013 18:21:05 +0200 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130826103618.6f7a2d38@pitrou.net> <20130826145159.79a2d99c@pitrou.net> <20130826175751.7a6896aa@pitrou.net> Message-ID: <20130826182105.23b2b66b@pitrou.net> Le Mon, 26 Aug 2013 09:14:38 -0700, Eli Bendersky a écrit : > > Antoine, you opted out of the tracker issue but I feel it's fair to > let you know that after a lot of discussion with Nick and Stefan (*), > we've settled on renaming the input methods to feed & close, and the > output method to read_events. We are also considering a different > name for the class. Fair enough. > As for the class name, IncrementalParser is ambiguous because it's not > immediately clear which side is incremental. Input or output? Both are :-) (which makes sense, really: an incremental input without output will only yield a slight memory consumption benefit - only slight, since the object tree representation should be much more costly than its bytes serialization -; an incremental output without input doesn't seem to have any point at all) Regards Antoine. 
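[For reference, the plain XMLParser input side that the renamed feed/close methods align with is already incremental: data may arrive in arbitrary chunks, and close() returns the finished tree. A minimal sketch:]

```python
import xml.etree.ElementTree as ET

parser = ET.XMLParser()
parser.feed("<root><child>te")    # chunk boundaries may fall anywhere,
parser.feed("xt</child></root>")  # even in the middle of text content
root = parser.close()             # returns the fully built element tree
```

[What the feed/close interface alone cannot do is hand back events before close(); that is exactly the output-side difference being debated.]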
From eliben at gmail.com Mon Aug 26 18:40:36 2013 From: eliben at gmail.com (Eli Bendersky) Date: Mon, 26 Aug 2013 09:40:36 -0700 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: <20130826182105.23b2b66b@pitrou.net> References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130826103618.6f7a2d38@pitrou.net> <20130826145159.79a2d99c@pitrou.net> <20130826175751.7a6896aa@pitrou.net> <20130826182105.23b2b66b@pitrou.net> Message-ID: On Mon, Aug 26, 2013 at 9:21 AM, Antoine Pitrou wrote: > Le Mon, 26 Aug 2013 09:14:38 -0700, > Eli Bendersky a écrit : > > > > Antoine, you opted out of the tracker issue but I feel it's fair to > > let you know that after a lot of discussion with Nick and Stefan (*), > > we've settled on renaming the input methods to feed & close, and the > > output method to read_events. We are also considering a different > > name for the class. > > Fair enough. > > > As for the class name, IncrementalParser is ambiguous because it's not > > immediately clear which side is incremental. Input or output? > > Both are :-) > > Yes, exactly :-) "Incremental", though, seems to support the conjecture that it's the input. Which is true, but, since XMLParser is also "incremental" in this sense, slightly confusing. As a more anecdotal piece of evidence: when the issue was reopened, I myself at first got confused by exactly this point because I forgot all about this in the months that passed since the commit. And I recalled that when I initially reviewed your patch, I got confused too :-) That would suggest one of two things: (1) The name is indeed confusing or (2) I'm stupid. The fact that Nick also got confused when trying a cursory understanding of the documentation calms me down w.r.t. (2). Back to the discussion, my new favorite is NonblockingParser. Because its input side is exactly similar to XMLParser, the "Nonblocking" in the name points to the difference in output, which is correct. 
As the popular quote says, "There are only two hard problems in Computer Science: cache invalidation and naming things." Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Mon Aug 26 19:40:46 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 26 Aug 2013 18:40:46 +0100 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130826103618.6f7a2d38@pitrou.net> <20130826145159.79a2d99c@pitrou.net> <20130826175751.7a6896aa@pitrou.net> <20130826182105.23b2b66b@pitrou.net> Message-ID: On 26 August 2013 17:40, Eli Bendersky wrote: > Yes, exactly :-) "Incremental", though, seems to support the conjecture > that it's the input. Which is true, but, since XMLParser is also > "incremental" in this sense, slightly confusing. As a data point, until you explained the difference between the two classes earlier in this thread, I too had been completely confused as both the existing and the new classes are "incremental" (on the input side - that's what I interpret "incremental" as meaning). It never even occurred to me that the difference was in the *output* side. Maybe "NonBlocking" would imply that to me. Or maybe "Generator". But regardless, I think the changes you've made sound good, and I'm certainly less concerned with the new version(as someone who will likely never use the new API, and therefore doesn't really have a vote). Paul -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rymg19 at gmail.com Mon Aug 26 20:34:32 2013 From: rymg19 at gmail.com (Ryan) Date: Mon, 26 Aug 2013 13:34:32 -0500 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130826103618.6f7a2d38@pitrou.net> <20130826145159.79a2d99c@pitrou.net> <20130826175751.7a6896aa@pitrou.net> Message-ID: <24e7e386-4421-48c1-a0f1-f6180c499a9e@email.android.com> How about StreamParser? I mean, even if it isn't quite the same, that name would still make sense. Eli Bendersky wrote: >On Mon, Aug 26, 2013 at 8:57 AM, Antoine Pitrou >wrote: > >> Le Mon, 26 Aug 2013 17:44:41 +0200, >> Simon Cross a ?crit : >> > On Mon, Aug 26, 2013 at 2:51 PM, Antoine Pitrou > >> > wrote: >> > > Because this API is mostly useful when the data is received (*) >at a >> > > slow enough speed - which usually means from the network, not >from a >> > > hard drive. >> > >> > It looks like all the events have to be ready before one can start >> > iterating over .events() in the new API? That doesn't seem that >useful >> > from an asynchronous programming perspective and .data_received() >and >> > .eof_received() appear to be thin wrappers over .feed() and >.close()? >> >> What do you mean, "all events have to be ready"? >> If you look at the unit tests, the events are generated on-the-fly, >> not at the end of the document. >> (exactly the same as iterparse(), except that iterparse() is >blocking) >> >> Implementation-wise, data_received() and eof_received() are not thin >> wrappers over feed() and close(), they rely on an internal API to get >> at the generated events (which justifies putting the functionality >> inside the etree module, by the way). 
>> > >Antoine, you opted out of the tracker issue but I feel it's fair to let >you >know that after a lot of discussion with Nick and Stefan (*), we've >settled >on renaming the input methods to feed & close, and the output method to >read_events. We are also considering a different name for the class. > >I've posted with more detail and rationale in >http://bugs.python.org/issue17741, but to summarize: > >The input-side of IncrementalParser is the same as the plain XMLParser. >The >latter can also be given data incrementally by means of "feed". By >default >it would collect the whole tree and return it in close(), but in >reality >you can rig a custom target that does something more fluid (though not >to >the full extent of IncrementalParser). Therefore it was deemed >confusing to >have different names for this. Another reason is consistency with >xml.sax.xmlreader.IncrementalParser, which also has feed() and close(). > >As for the output method name, Nick suggested that read_events conveys >the >destructive nature of the method better (by analogy to file/stream >APIs), >and others agreed. > >As for the class name, IncrementalParser is ambiguous because it's not >immediately clear which side is incremental. Input or output? For the >input, it's no more incremental than XMLParser itself, as stated above. >The >output is what's different here, so we're considering a few candidates >for >a better name that conveys the meaning more precisely. > >And to reiterate, I realize that it's unpleasant for you to have this >dug >up after it has already been committed. I assume the blame for not >reviewing it in more detail originally. However, I feel it would still >be >better to revise this now than just leave it be. APIs added to stdlib >are >cooked in there for a *long time*. Alternatively, Nick suggested >granting >this API a "provisional" status (PEP 411), and that's an option if we >don't >manage to reach some sort of consensus. 
> >Eli > >(*) Well, to be completely precise, Stefan is still opposed to the >whole >idea. > > >------------------------------------------------------------------------ > >_______________________________________________ >Python-Dev mailing list >Python-Dev at python.org >http://mail.python.org/mailman/listinfo/python-dev >Unsubscribe: >http://mail.python.org/mailman/options/python-dev/rymg19%40gmail.com -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Mon Aug 26 20:38:33 2013 From: eliben at gmail.com (Eli Bendersky) Date: Mon, 26 Aug 2013 11:38:33 -0700 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130826103618.6f7a2d38@pitrou.net> <20130826145159.79a2d99c@pitrou.net> <20130826175751.7a6896aa@pitrou.net> <20130826182105.23b2b66b@pitrou.net> Message-ID: On Mon, Aug 26, 2013 at 10:40 AM, Paul Moore wrote: > On 26 August 2013 17:40, Eli Bendersky wrote: > >> Yes, exactly :-) "Incremental", though, seems to support the conjecture >> that it's the input. Which is true, but, since XMLParser is also >> "incremental" in this sense, slightly confusing. > > > As a data point, until you explained the difference between the two > classes earlier in this thread, I too had been completely confused as both > the existing and the new classes are "incremental" (on the input side - > that's what I interpret "incremental" as meaning). It never even occurred > to me that the difference was in the *output* side. Maybe "NonBlocking" > would imply that to me. Or maybe "Generator". But regardless, I think the > changes you've made sound good, and I'm certainly less concerned with the > new version(as someone who will likely never use the new API, and therefore > doesn't really have a vote). > Thanks for the data point; it is useful. 
> How about StreamParser? The problem with StreamParser is similar to IncrementalParser. "Stream" carries the impression that it refers to the input. But the input of ET parsers is *always* streaming, in a way (the feed/close interface). I want a name that conveys that the *output* is also nonblocking/streaming/yielding/generating/etc. Therefore Nonblocking (I'll let better English experts decide whether B should be capitalized) sounds better to me, because it helps convey that both sides of the parser are asynchronous. Eli From stefan_ml at behnel.de Mon Aug 26 20:53:07 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 26 Aug 2013 20:53:07 +0200 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130826103618.6f7a2d38@pitrou.net> <20130826145159.79a2d99c@pitrou.net> <20130826175751.7a6896aa@pitrou.net> <20130826182105.23b2b66b@pitrou.net> Message-ID: Paul Moore, 26.08.2013 19:40: > On 26 August 2013 17:40, Eli Bendersky wrote: > >> Yes, exactly :-) "Incremental", though, seems to support the conjecture >> that it's the input. Which is true, but, since XMLParser is also >> "incremental" in this sense, slightly confusing. > > As a data point, until you explained the difference between the two classes > earlier in this thread, I too had been completely confused as both the > existing and the new classes are "incremental" (on the input side - that's > what I interpret "incremental" as meaning). It never even occurred to me > that the difference was in the *output* side. The fix I'm proposing is to not make it two separate classes. But those who are interested in the details should really participate in the ticket discussion rather than here. 
Stefan From rymg19 at gmail.com Mon Aug 26 20:55:25 2013 From: rymg19 at gmail.com (Ryan) Date: Mon, 26 Aug 2013 13:55:25 -0500 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130826103618.6f7a2d38@pitrou.net> <20130826145159.79a2d99c@pitrou.net> <20130826175751.7a6896aa@pitrou.net> <20130826182105.23b2b66b@pitrou.net> Message-ID: <991feac9-c3b3-440b-a3b7-c58143e0fd41@email.android.com> Nonblocking sounds too Internet-related. How about...flow? Ah, I'll probably still end up using Expat regardless. Eli Bendersky wrote: >On Mon, Aug 26, 2013 at 10:40 AM, Paul Moore >wrote: > >> On 26 August 2013 17:40, Eli Bendersky wrote: >> >>> Yes, exactly :-) "Incremental", though, seems to support the >conjecture >>> that it's the input. Which is true, but, since XMLParser is also >>> "incremental" in this sense, slightly confusing. >> >> >> As a data point, until you explained the difference between the two >> classes earlier in this thread, I too had been completely confused as >both >> the existing and the new classes are "incremental" (on the input side >- >> that's what I interpret "incremental" as meaning). It never even >occurred >> to me that the difference was in the *output* side. Maybe >"NonBlocking" >> would imply that to me. Or maybe "Generator". But regardless, I think >the >> changes you've made sound good, and I'm certainly less concerned with >the >> new version(as someone who will likely never use the new API, and >therefore >> doesn't really have a vote). >> > >Thanks for the data point; it is useful. > >> How about StreamParser? > >The problem with StreamParser is similar to IncrementalParser. "Stream" >carries the impression that it refers to the input. But the input of ET >parsers is *always* streaming, in a way (the feed/close interface). I >want >a name that conveys that the *output* is also >nonblocking/streaming/yielding/generating/etc. 
Therefore Nonblocking >(I'll >let better English experts to decide whether B should be capitalized) >sounds better to me, because it helps convey that both sides of the >parser >are asynchronous. > >Eli > > >------------------------------------------------------------------------ > >_______________________________________________ >Python-Dev mailing list >Python-Dev at python.org >http://mail.python.org/mailman/listinfo/python-dev >Unsubscribe: >http://mail.python.org/mailman/options/python-dev/rymg19%40gmail.com -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Aug 26 22:03:40 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 26 Aug 2013 13:03:40 -0700 Subject: [Python-Dev] PEP 446 (make FD non inheritable) ready for a final review In-Reply-To: References: Message-ID: On Fri, Aug 23, 2013 at 1:30 PM, Charles-Fran?ois Natali wrote: >> About your example: I'm not sure that it is reliable/portable. I sa >> daemon libraries closing *all* file descriptors and then expecting new >> file descriptors to become 0, 1 and 2. Your example is different >> because w is still open. On Windows, I have seen cases with only fd 0, >> 1, 2 open, and the next open() call gives the fd 10 or 13... > > Well, my example uses fork(), so obviously doesn't apply to Windows. > It's perfectly safe on Unix. But relying on this in UNIX has also been discouraged ever since the dup2() system call was introduced. (I can't easily find a reference about its history but IIRC it is probably as old as UNIX v7 or otherwise BSD 4.x.) >> I'm optimistic and I expect that most Python applications and >> libraries already use the subprocess module. The subprocess module >> closes all file descriptors (except 0, 1, 2) since Python 3.2. 
>> Developers relying on the FD inheritance and using the subprocess with >> Python 3.2 or later already had to use the pass_fds parameter. > > As long as the PEP makes it clear that this breaks backward > compatibility, that's fine. IMO the risk of breakage outweights the > modicum benefit. I know this will break code. But it is for the good of mankind. (I will now review the full PEP, finally.) -- --Guido van Rossum (python.org/~guido) From guido at python.org Mon Aug 26 23:35:26 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 26 Aug 2013 14:35:26 -0700 Subject: [Python-Dev] PEP 446 (make FD non inheritable) ready for a final review In-Reply-To: References: Message-ID: Hi Victor, I have reviewed the PEP and I think it is good. Thank you so much for pushing this topic and for your very thorough review of all the feedback, related issues and so on. It is an exemplary PEP! I've made a bunch of small edits (mostly to improve grammar slightly, hope you don't mind) and committed these to the repo. I've also got a few more comments on the text that I didn't want to commit behind your back; I've written these up in a Rietveld review of the PEP (which you can also use to see exactly what I did already commit). https://codereview.appspot.com/13240043/ Here's a summary of those review changes: https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt File pep-0446.txt (right): https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode25 pep-0446.txt:25: descriptors. I'd add at this point: We are aware of the code breakage this is likely to cause, and doing it anyway for the good of mankind. (Details in the section "Backward Compatibility" below.) https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode80 pep-0446.txt:80: inheritable handles are inherited by the child process. Maybe mention here that this also affects the subprocess module? (You mention it later, but it's important to realize at this point.) 
https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode437 pep-0446.txt:437: condition. As C-F Natali pointed out, this is not actually a problem, because after fork() only the main thread survives. Maybe just delete this paragraph? https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode450 pep-0446.txt:450: parameter is a non-empty list of file descriptors. Well, it could pass closefrom() the max of the given list and manually close the rest. This would be useful if the system max is large but none of the FDs given in the list is. (This would be more complex code but it would address the issue for most programs.) https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode486 pep-0446.txt:486: * ``socket.socketpair()`` I would call out that dup2() is intentionally not in this list, and add a rationale for that omission below. https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode528 pep-0446.txt:528: by default, but non-inheritable if *inheritable* is ``False``. This might be a good place to explain the rationale for this exception. https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode538 pep-0446.txt:538: descriptors). I would say it should not be changed because the default is still better. :-) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Tue Aug 27 00:19:30 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 27 Aug 2013 00:19:30 +0200 Subject: [Python-Dev] PEP 446 (make FD non inheritable) ready for a final review In-Reply-To: References: Message-ID: 2013/8/26 Guido van Rossum : > I have reviewed the PEP and I think it is good. Thank you so much for > pushing this topic and for your very thorough review of all the feedback, > related issues and so on. It is an exemplary PEP! 
Thanks :-) I updated the PEP: http://hg.python.org/peps/rev/edd8250f6893 > I've made a bunch of small edits (mostly to improve grammar slightly, hope > you don't mind) and committed these to the repo. Thanks, I'm not a native English speaker, so no problem with such edits. > > https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode437 > pep-0446.txt:437: condition. > As C-F Natali pointed out, this is not actually a problem, because after > fork() > only the main thread survives. Maybe just delete this paragraph? Ok, I didn't know that only one thread survives fork(). (I read Charles' email, but I forgot to update the PEP.) I simply deleted the paragraph. > > https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode450 > pep-0446.txt:450: parameter is a non-empty list of file descriptors. > Well, it could pass closefrom() the max of the given list and manually close > the > rest. This would be useful if the system max is large but none of the FDs > given > in the list is. (This would be more complex code but it would address the > issue > for most programs.) This was related to the multi-thread issue, which does not exist, so I also removed this paragraph. Using closefrom() to optimize subprocess is unrelated to this PEP. (And yes, the maximum file descriptor can be huge!) > > https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode538 > pep-0446.txt:538: descriptors). > I would say it should not be changed because the default is still better. > :-) (The PEP does not propose to change the default value.) Under Linux, recent versions of glibc use non-inheritable FDs for internal files. Slowly, more and more libraries and programs will do the same. 
This PEP is a step in this direction ;-) Victor From guido at python.org Tue Aug 27 00:50:15 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 26 Aug 2013 15:50:15 -0700 Subject: [Python-Dev] PEP 446 (make FD non inheritable) ready for a final review In-Reply-To: References: Message-ID: Wow, that was quick! I propose that we wait for one more day for any feedback from others in response to this post, and then accept the PEP. On Mon, Aug 26, 2013 at 3:19 PM, Victor Stinner wrote: > 2013/8/26 Guido van Rossum : > > I have reviewed the PEP and I think it is good. Thank you so much for > > pushing this topic and for your very thorough review of all the feedback, > > related issues and so on. It is an exemplary PEP! > > Thanks :-) I updated the PEP: > http://hg.python.org/peps/rev/edd8250f6893 > > > I've made a bunch of small edits (mostly to improve grammar slightly, > hope > > you don't mind) and committed these to the repo. > > Thanks, I'm not a native english speaker, so not problem for such edit. > > > > https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode437 > > pep-0446.txt:437: condition. > > As C-F Natali pointed out, this is not actually a problem, because after > > fork() > > only the main thread survives. Maybe just delete this paragraph? > > Ok, I didn't know that only one thread survives to fork(). (I read > Charles' email, but I forgot to update the PEP.) I simply deleted the > paragraph. > > > > https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode450 > > pep-0446.txt:450: parameter is a non-empty list of file descriptors. > > Well, it could pass closefrom() the max of the given list and manually > close > > the > > rest. This would be useful if the system max is large but none of the FDs > > given > > in the list is. (This would be more complex code but it would address the > > issue > > for most programs.) 
> > This was related to the multi-thread issue, which does not exist, so I > also removed this paragraph. > > Using closefrom() to optimize subprocess is unrelated to this PEP. > > (And yes, the maximum file descriptor can be huge!) > > > > https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode538 > > pep-0446.txt:538: descriptors). > > I would say it should not be changed because the default is still better. > > :-) > > (The PEP does not propose to change the default value.) > > Under Linux, recent versions of the glibc uses non-inheritable FD for > internal files. Slowly, more and more libraries and programs will do > the same. This PEP is a step in this direction ;-) > > Victor > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Tue Aug 27 01:21:50 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 27 Aug 2013 11:21:50 +1200 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: <991feac9-c3b3-440b-a3b7-c58143e0fd41@email.android.com> References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130826103618.6f7a2d38@pitrou.net> <20130826145159.79a2d99c@pitrou.net> <20130826175751.7a6896aa@pitrou.net> <20130826182105.23b2b66b@pitrou.net> <991feac9-c3b3-440b-a3b7-c58143e0fd41@email.android.com> Message-ID: <521BE30E.9020501@canterbury.ac.nz> Ryan wrote: > Nonblocking sounds too Internet-related. How about...flow? AsyncParser? 
-- Greg From scott+python-dev at scottdial.com Tue Aug 27 05:45:55 2013 From: scott+python-dev at scottdial.com (Scott Dial) Date: Mon, 26 Aug 2013 23:45:55 -0400 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: <20130826145159.79a2d99c@pitrou.net> References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130826103618.6f7a2d38@pitrou.net> <20130826145159.79a2d99c@pitrou.net> Message-ID: <521C20F3.7060809@scottdial.com> On 8/26/2013 8:51 AM, Antoine Pitrou wrote: > Le Mon, 26 Aug 2013 08:24:58 -0400, > Tres Seaver a écrit : >> On 08/26/2013 04:36 AM, Antoine Pitrou wrote: >>> event-driven processing using network libraries >> >> Maybe I missed something: why should considerations from that topic >> influence the design of an API for XML processing? > > Because this API is mostly useful when the data is received (*) at a > slow enough speed - which usually means from the network, not from a > hard drive. ... > The whole *point* of adding IncrementalParser was to parse incoming > XML data in a way that is friendly with event-driven network > programming, other use cases being *already* covered by existing > APIs. This is why it's far from nonsensical to re-use an existing > terminology from that world. Since when is Tulip the OOWTDI? If this was Twisted, it would be "write" and "finish"[1]. Tulip's Protocol ABC isn't even a good match for the application. There is a reason that Twisted has a separate Consumer/Producer interface from the network I/O interface. I'm sure there is other existing practice in this specific area too (e.g., XMLParser).
[1] http://twistedmatrix.com/documents/13.1.0/api/twisted.protocols.ftp.IFinishableConsumer.html -- Scott Dial scott at scottdial.com From cf.natali at gmail.com Tue Aug 27 08:00:09 2013 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Tue, 27 Aug 2013 08:00:09 +0200 Subject: [Python-Dev] hg.python.org is slow Message-ID: Hi, I'm trying to checkout a pristine clone from ssh://hg at hg.python.org/cpython, and it's taking forever: """ 07:45:35.605941 IP 192.168.0.23.43098 > virt-7yvsjn.psf.osuosl.org.ssh: Flags [.], ack 22081460, win 14225, options [nop,nop,TS val 368519 ecr 2401783356], length 0 07:45:38.558348 IP virt-7yvsjn.psf.osuosl.org.ssh > 192.168.0.23.43098: Flags [.], seq 22081460:22082908, ack 53985, win 501, options [nop,nop,TS val 2401784064 ecr 368519], length 1448 07:45:38.558404 IP 192.168.0.23.43098 > virt-7yvsjn.psf.osuosl.org.ssh: Flags [.], ack 22082908, win 14225, options [nop,nop,TS val 369257 ecr 2401784064], length 0 07:45:39.649995 IP virt-7yvsjn.psf.osuosl.org.ssh > 192.168.0.23.43098: Flags [.], seq 22082908:22084356, ack 53985, win 501, options [nop,nop,TS val 2401784367 ecr 369257], length 1448 """ See the time to just get an ACK? Am I the only one experiencing this? 
Cheers, cf From nad at acm.org Tue Aug 27 09:34:44 2013 From: nad at acm.org (Ned Deily) Date: Tue, 27 Aug 2013 00:34:44 -0700 Subject: [Python-Dev] hg.python.org is slow References: Message-ID: In article , Charles-Francois Natali wrote: > I'm trying to checkout a pristine clone from > ssh://hg at hg.python.org/cpython, and it's taking forever: > """ > 07:45:35.605941 IP 192.168.0.23.43098 > > virt-7yvsjn.psf.osuosl.org.ssh: Flags [.], ack 22081460, win 14225, > options [nop,nop,TS val 368519 ecr 2401783356], length 0 > 07:45:38.558348 IP virt-7yvsjn.psf.osuosl.org.ssh > > 192.168.0.23.43098: Flags [.], seq 22081460:22082908, ack 53985, win > 501, options [nop,nop,TS val 2401784064 ecr 368519], length 1448 > 07:45:38.558404 IP 192.168.0.23.43098 > > virt-7yvsjn.psf.osuosl.org.ssh: Flags [.], ack 22082908, win 14225, > options [nop,nop,TS val 369257 ecr 2401784064], length 0 > 07:45:39.649995 IP virt-7yvsjn.psf.osuosl.org.ssh > > 192.168.0.23.43098: Flags [.], seq 22082908:22084356, ack 53985, win > 501, options [nop,nop,TS val 2401784367 ecr 369257], length 1448 > """ > > See the time to just get an ACK? > > Am I the only one experiencing this? At the moment (about 90 minutes after you posted this), I just did a reasonable-sized pull via ssh: with no apparent delays. But I'm a *lot* closer to the server than you are. BTW, do you have ssh compression enabled for that host?
-- Ned Deily, nad at acm.org From hodgestar+pythondev at gmail.com Tue Aug 27 09:58:57 2013 From: hodgestar+pythondev at gmail.com (Simon Cross) Date: Tue, 27 Aug 2013 09:58:57 +0200 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: <20130826175751.7a6896aa@pitrou.net> References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130826103618.6f7a2d38@pitrou.net> <20130826145159.79a2d99c@pitrou.net> <20130826175751.7a6896aa@pitrou.net> Message-ID: On Mon, Aug 26, 2013 at 5:57 PM, Antoine Pitrou wrote: > What do you mean, "all events have to be ready"? > If you look at the unit tests, the events are generated on-the-fly, > not at the end of the document. > (exactly the same as iterparse(), except that iterparse() is blocking) So you have to poll .events()? That also seems unhelpful from an event driven programming perspective. What I'm driving at is that I'd expect to have access to some sort of deferred that fires when the next event is ready to be processed and I don't see that here. From solipsis at pitrou.net Tue Aug 27 10:09:14 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 27 Aug 2013 10:09:14 +0200 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130826103618.6f7a2d38@pitrou.net> <20130826145159.79a2d99c@pitrou.net> <20130826175751.7a6896aa@pitrou.net> Message-ID: <20130827100914.01bc6163@pitrou.net> Le Tue, 27 Aug 2013 09:58:57 +0200, Simon Cross a écrit : > On Mon, Aug 26, 2013 at 5:57 PM, Antoine Pitrou > wrote: > > What do you mean, "all events have to be ready"? > > If you look at the unit tests, the events are generated on-the-fly, > > not at the end of the document. > > (exactly the same as iterparse(), except that iterparse() is > > blocking) > > So you have to poll .events()? That also seems unhelpful from an event > driven programming perspective.
At most, you're gonna poll events() after calling data_received() (there's no other way some events can be generated, after all). You can also poll it less often, depending on how often you're interested in new events. That is, it is amenable to both push and pull modes. > What I'm driving at is that I'd expect to have access to some sort of > deferred that fires when the next event is ready to be processed and I > don't see that here. That would be sensible to do if Deferred was a construct shared amongst major async frameworks, but it isn't ;-) (and it looks like Guido won't include a Deferred-alike in PEP 3156) Letting people "poll" events() lets them plug any async-callback-firing primitive they like, almost trivially. Regards Antoine. From stefan_ml at behnel.de Tue Aug 27 10:10:51 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 27 Aug 2013 10:10:51 +0200 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 In-Reply-To: References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130826103618.6f7a2d38@pitrou.net> <20130826145159.79a2d99c@pitrou.net> <20130826175751.7a6896aa@pitrou.net> Message-ID: Simon Cross, 27.08.2013 09:58: > On Mon, Aug 26, 2013 at 5:57 PM, Antoine Pitrou wrote: >> What do you mean, "all events have to be ready"? >> If you look at the unit tests, the events are generated on-the-fly, >> not at the end of the document. >> (exactly the same as iterparse(), except that iterparse() is blocking) > > So you have to poll .events()? That also seems unhelpful from an event > driven programming perspective. > > What I'm driving at is that I'd expect to have access to some sort of > deferred that fires when the next event is ready to be processed and I > don't see that here. The idea is that you pass data into the parser and then ask read_events() for an event iterator. If/When that's empty, you're done. No repeated polling or anything, just all in one shot whenever data is available. 
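The feed-then-drain pattern described above can be sketched with the names this API eventually stabilized on in 3.4 — XMLPullParser with feed()/read_events(); the patch under discussion spelled the feeding step data_received(), so the exact method names here are an assumption relative to that patch:

```python
from xml.etree.ElementTree import XMLPullParser

# Feed chunks as they arrive (e.g. from a socket), then drain the
# events generated so far -- no callbacks, no blocking, no busy loop.
seen = []
parser = XMLPullParser(events=("start", "end"))
for chunk in ("<root><chi", "ld>text</child>", "</root>"):
    parser.feed(chunk)                        # push whatever data arrived
    for event, elem in parser.read_events():  # drain in one shot
        seen.append((event, elem.tag))
parser.close()
print(seen)
```

Any async framework can drive this the same way: call feed() from its data callback and drain read_events() right after, which is exactly the push/pull flexibility being discussed.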
It's a really nice interface by design, just badly integrated into the existing API. Stefan From solipsis at pitrou.net Tue Aug 27 10:16:00 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 27 Aug 2013 10:16:00 +0200 Subject: [Python-Dev] hg.python.org is slow References: Message-ID: <20130827101600.46f827ca@pitrou.net> Le Tue, 27 Aug 2013 08:00:09 +0200, Charles-François Natali a écrit : > Hi, > > I'm trying to checkout a pristine clone from > ssh://hg at hg.python.org/cpython, and it's taking forever: > """ > 07:45:35.605941 IP 192.168.0.23.43098 > > virt-7yvsjn.psf.osuosl.org.ssh: Flags [.], ack 22081460, win 14225, > options [nop,nop,TS val 368519 ecr 2401783356], length 0 > 07:45:38.558348 IP virt-7yvsjn.psf.osuosl.org.ssh > > 192.168.0.23.43098: Flags [.], seq 22081460:22082908, ack 53985, win > 501, options [nop,nop,TS val 2401784064 ecr 368519], length 1448 > 07:45:38.558404 IP 192.168.0.23.43098 > > virt-7yvsjn.psf.osuosl.org.ssh: Flags [.], ack 22082908, win 14225, > options [nop,nop,TS val 369257 ecr 2401784064], length 0 > 07:45:39.649995 IP virt-7yvsjn.psf.osuosl.org.ssh > > 192.168.0.23.43098: Flags [.], seq 22082908:22084356, ack 53985, win > 501, options [nop,nop,TS val 2401784367 ecr 369257], length 1448 > """ > > See the time to just get an ACK? Sounds a lot like a network problem, then? Have you tried a traceroute? (HTTP cloning works fine here, from a free.fr connection; I'll try a ssh clone tonight if you're still having problems.) cheers Antoine. From cf.natali at gmail.com Tue Aug 27 10:37:00 2013 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Tue, 27 Aug 2013 10:37:00 +0200 Subject: [Python-Dev] hg.python.org is slow In-Reply-To: <20130827101600.46f827ca@pitrou.net> References: <20130827101600.46f827ca@pitrou.net> Message-ID: 2013/8/27 Antoine Pitrou : > Sounds a lot like a network problem, then?
If I'm the only one, it's likely, although these pathological timeouts are transient, and I don't have any problem with other servers (my line sustains 8Mb/s without problem). > Have you tried a traceroute? I'll try tonight if this persists, and keep you posted. 2013/8/27 Ned Deily : > BTW, do you have ssh compression enabled for that host? Yep. cf From solipsis at pitrou.net Tue Aug 27 12:23:15 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 27 Aug 2013 12:23:15 +0200 Subject: [Python-Dev] hg.python.org is slow References: <20130827101600.46f827ca@pitrou.net> Message-ID: <20130827122315.2a777611@pitrou.net> Le Tue, 27 Aug 2013 10:37:00 +0200, Charles-François Natali a écrit : > 2013/8/27 Antoine Pitrou : > > Sounds a lot like a network problem, then? > > If I'm the only one, it's likely, although these pathological timeouts > are transient, and I don't have any problem with other servers (my > line sustains 8Mb/s without problem). Well, "network problem" doesn't mean the problem is on your side :-) We've had network problems with the hosting in the past, although they were noticed by many people usually (mostly non-North American people). Regards Antoine.
From solipsis at pitrou.net Tue Aug 27 19:37:18 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 27 Aug 2013 19:37:18 +0200 Subject: [Python-Dev] please back out changeset f903cf864191 before alpha-2 References: <20130824012608.0187fa3f@fsol> <20130824125844.57a8f9a3@fsol> <20130826103618.6f7a2d38@pitrou.net> <20130826145159.79a2d99c@pitrou.net> <521C20F3.7060809@scottdial.com> Message-ID: <20130827193718.5be32d89@fsol> On Mon, 26 Aug 2013 23:45:55 -0400 Scott Dial wrote: > On 8/26/2013 8:51 AM, Antoine Pitrou wrote: > > Le Mon, 26 Aug 2013 08:24:58 -0400, > > Tres Seaver a écrit : > >> On 08/26/2013 04:36 AM, Antoine Pitrou wrote: > >>> event-driven processing using network libraries > >> > >> Maybe I missed something: why should considerations from that topic > >> influence the design of an API for XML processing? > > > > Because this API is mostly useful when the data is received (*) at a > > slow enough speed - which usually means from the network, not from a > > hard drive. > ... > > The whole *point* of adding IncrementalParser was to parse incoming > > XML data in a way that is friendly with event-driven network > > programming, other use cases being *already* covered by existing > > APIs. This is why it's far from nonsensical to re-use an existing > > terminology from that world. > > Since when is Tulip the OOWTDI? If this was Twisted, it would be "write" > and "finish"[1]. Tulip's Protocol ABC isn't even a good match for the > application. There is a reason that Twisted has a separate > Consumer/Producer interface from the network I/O interface. > I'm sure > there is other existing practice in this specific area too (e.g., > XMLParser). I'm really not convinced further bikeshedding on this issue has any point. If you have any concrete concerns, you can voice them on the issue tracker. Regards Antoine.
From solipsis at pitrou.net Tue Aug 27 21:20:42 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 27 Aug 2013 21:20:42 +0200 Subject: [Python-Dev] PEP 446 (make FD non inheritable) ready for a final review References: Message-ID: <20130827212042.22f31d32@fsol> Hi, I have a small comment to make: > On UNIX, the subprocess module closes almost all file descriptors in > the child process. This operation requires MAXFD system calls, where > MAXFD is the maximum number of file descriptors, even if there are > only few open file descriptors. This maximum can be read using: > os.sysconf("SC_OPEN_MAX"). If your intent is to remove the closerange() call from subprocess, be aware that it may let through some file descriptors opened by third-party code (such as C extensions). This may or may not be something we want to worry about, but there's still a small potential for security regressions. Regards Antoine. From victor.stinner at gmail.com Tue Aug 27 22:26:31 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 27 Aug 2013 22:26:31 +0200 Subject: [Python-Dev] PEP 446 (make FD non inheritable) ready for a final review In-Reply-To: <20130827212042.22f31d32@fsol> References: <20130827212042.22f31d32@fsol> Message-ID: 2013/8/27 Antoine Pitrou : >> On UNIX, the subprocess module closes almost all file descriptors in >> the child process. This operation requires MAXFD system calls, where >> MAXFD is the maximum number of file descriptors, even if there are >> only few open file descriptors. This maximum can be read using: >> os.sysconf("SC_OPEN_MAX"). > > If your intent is to remove the closerange() call from subprocess, be > aware that it may let through some file descriptors opened by > third-party code (such as C extensions). This may or may not be > something we want to worry about, but there's still a small potential > for security regressions. 
The PEP doesn't change the default value of the close_fds parameter of subprocess: file descriptors and handles are still closed in the child process. I modified the PEP to explain the link between non-inheritable FDs and performance: http://hg.python.org/peps/rev/d88fbf9941fa If you don't use third party code, or if you control third party code and you know that these modules only create non-inheritable FDs, it is now safe (thanks to the PEP 446) to use close_fds=False... which avoids the cost of closing MAXFD file descriptors explicitly in the child process. Victor From guido at python.org Wed Aug 28 00:17:11 2013 From: guido at python.org (Guido van Rossum) Date: Tue, 27 Aug 2013 15:17:11 -0700 Subject: [Python-Dev] Accepted: PEP 446 -- Make newly created file descriptors non-inheritable Message-ID: Congratulations Victor, PEP 446 is accepted! Thanks for your tiresome work and the last-minute changes. I will update the PEP's status to "Accepted" right away. You can change it to "Final" after all the changes have been committed to the default branch for inclusion into the next 3.4 alpha. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Wed Aug 28 02:08:00 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 28 Aug 2013 02:08:00 +0200 Subject: [Python-Dev] Accepted: PEP 446 -- Make newly created file descriptors non-inheritable In-Reply-To: References: Message-ID: 2013/8/28 Guido van Rossum : > Congratulations Victor, PEP 446 is accepted! Thanks. I just committed the implementation into default (future Python 3.4): http://hg.python.org/cpython/rev/ef889c3d5dc6 http://bugs.python.org/issue18571 I tested it on Linux, FreeBSD 9, OpenIndiana and Windows 7. Let's see if the buildbots appreciate non-inheritable file descriptors. Please test python default on your favorite application to see whether the PEP 446 broke anything!
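In concrete terms, the accepted semantics look like this — a minimal sketch using the os.get_inheritable()/os.set_inheritable() functions the PEP adds, so it only runs on an interpreter that includes the PEP 446 changes:

```python
import os

# Under PEP 446, newly created descriptors are non-inheritable by default.
r, w = os.pipe()
print(os.get_inheritable(r))    # False: a child process would not see this FD

# Code that deliberately hands an FD to a child opts back in explicitly,
# typically together with subprocess's pass_fds parameter.
os.set_inheritable(r, True)
print(os.get_inheritable(r))    # True

os.close(r)
os.close(w)
```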
If the PEP 446 breaks your application, don't worry, it's for the "good of mankind" :-) Adding some calls to os.set_inheritable(fd, True) and using subprocess with pass_fds should fix your issues. I changed the status of the PEP to Final. I also closed the issues related to the PEP 446: #10115: Support accept4() for atomic setting of flags at socket creation #12107: TCP listening sockets created without FD_CLOEXEC flag #16946: subprocess: _close_open_fd_range_safe() does not set close-on-exec flag on Linux < 2.6.23 if O_CLOEXEC is defined #17070: PEP 433: Use the new cloexec to improve security and avoid bugs #18571: Implementation of the PEP 446: non-inheritable file descriptors Victor From p.f.moore at gmail.com Wed Aug 28 08:29:14 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 28 Aug 2013 07:29:14 +0100 Subject: [Python-Dev] Accepted: PEP 446 -- Make newly created file descriptors non-inheritable In-Reply-To: References: Message-ID: On 27 August 2013 23:17, Guido van Rossum wrote: > Thanks for your tiresome work I'm guessing you meant "tireless" here :-) Paul -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Wed Aug 28 13:37:23 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 28 Aug 2013 13:37:23 +0200 Subject: [Python-Dev] Test the test suite? Message-ID: Hi, I just noticed that tests using @requires_freebsd_version and @requires_linux_version decorators from test.support are never run since this commit (almost 2 years ago): changeset: 72618:3b1859f80e6d user: Charles-François Natali date: Mon Oct 03 19:40:37 2011 +0200 files: Lib/test/support.py description: Introduce support.requires_freebsd_version decorator. ...
raise unittest.SkipTest( - "Linux kernel %s or higher required, not %s" - % (min_version_txt, version_txt)) - return func(*args, **kw) - wrapper.min_version = min_version + "%s version %s or higher required, not %s" + % (sysname, min_version_txt, version_txt)) I don't want to blame Charles-François, nobody saw the issue for 2 years! No, my question is: how can we detect that a test is never run? Do we need test coverage on the test suite? Or inject faults in the code to test the test suite? Any other idea? I fixed the decorators in Python 3.3 (84debb4abd50) and 3.4 (f98fd5712b0e). Victor From guido at python.org Wed Aug 28 16:03:53 2013 From: guido at python.org (Guido van Rossum) Date: Wed, 28 Aug 2013 07:03:53 -0700 Subject: [Python-Dev] Accepted: PEP 446 -- Make newly created file descriptors non-inheritable In-Reply-To: References: Message-ID: Whoop. Yes. I guess it was me who was tired. :-) On Tuesday, August 27, 2013, Paul Moore wrote: > On 27 August 2013 23:17, Guido van Rossum > > wrote: > >> Thanks for your tiresome work > > > I'm guessing you meant "tireless" here :-) > > Paul > -- --Guido van Rossum (on iPad) -------------- next part -------------- An HTML attachment was scrubbed... URL: From xdegaye at gmail.com Wed Aug 28 16:13:41 2013 From: xdegaye at gmail.com (Xavier de Gaye) Date: Wed, 28 Aug 2013 16:13:41 +0200 Subject: [Python-Dev] Test the test suite? In-Reply-To: References: Message-ID: It happens that a few tests are also never run because of name conflicts. See issue 16056. Xavier On Wed, Aug 28, 2013 at 1:37 PM, Victor Stinner wrote: > Hi, > > I just noticed that tests using @requires_freebsd_version and > @requires_linux_version decorators from test.support are never run > since this commit (almost 2 years ago): > > changeset: 72618:3b1859f80e6d > user: Charles-François Natali > date: Mon Oct 03 19:40:37 2011 +0200 > files: Lib/test/support.py > description: > Introduce support.requires_freebsd_version decorator. > > ...
> > raise unittest.SkipTest( > - "Linux kernel %s or higher required, not %s" > - % (min_version_txt, version_txt)) > - return func(*args, **kw) > - wrapper.min_version = min_version > + "%s version %s or higher required, not %s" > + % (sysname, min_version_txt, version_txt)) > > I don't want to blame Charles-François, nobody saw the issue for 2 years! > > No, my question is: how can we detect that a test is never run? Do we > need test coverage on the test suite? Or inject faults in the code to > test the test suite? Any other idea? > > I fixed the decorators in Python 3.3 (84debb4abd50) and 3.4 (f98fd5712b0e). > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/xdegaye%40gmail.com -- Xavier Les Chemins de Lokoti: http://lokoti.alwaysdata.net From python at mrabarnett.plus.com Wed Aug 28 17:15:05 2013 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 28 Aug 2013 16:15:05 +0100 Subject: [Python-Dev] Accepted: PEP 446 -- Make newly created file descriptors non-inheritable In-Reply-To: References: Message-ID: <521E13F9.20800@mrabarnett.plus.com> On 28/08/2013 07:29, Paul Moore wrote: > On 27 August 2013 23:17, Guido van Rossum > wrote: > > Thanks for your tiresome work > > > I'm guessing you meant "tireless" here :-) > That depends. It might have been tiresome for the one doing it! From storchaka at gmail.com Wed Aug 28 17:53:43 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 28 Aug 2013 18:53:43 +0300 Subject: [Python-Dev] Test the test suite? In-Reply-To: References: Message-ID: 28.08.13 14:37, Victor Stinner написав(ла): > No, my question is: how can we detect that a test is never run? Do we > need test coverage on the test suite? Or inject faults in the code to > test the test suite? Any other idea? Currently a lot of tests are skipped silently.
See issue18702 [1]. Perhaps we need a tool which collects skipped and run tests, compares these sets with sets from a previous run on the same buildbot and reports if they are different. [1] http://bugs.python.org/issue18702 From solipsis at pitrou.net Wed Aug 28 20:31:07 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 28 Aug 2013 20:31:07 +0200 Subject: [Python-Dev] cpython: Issue #18571: Implementation of the PEP 446: file descriptors and file handles References: <3cPmKh2nMPz7LlL@mail.python.org> Message-ID: <20130828203107.282fbbc7@fsol> Hi, On Wed, 28 Aug 2013 01:20:56 +0200 (CEST) victor.stinner wrote: > http://hg.python.org/cpython/rev/ef889c3d5dc6 > changeset: 85420:ef889c3d5dc6 > user: Victor Stinner > date: Wed Aug 28 00:53:59 2013 +0200 > summary: > Issue #18571: Implementation of the PEP 446: file descriptors and file handles > are now created non-inheritable; add functions os.get/set_inheritable(), > os.get/set_handle_inheritable() and socket.socket.get/set_inheritable(). I don't want to sound too demanding, but was this patch actually reviewed? I can't find a single review comment in http://bugs.python.org/issue18571 Regards Antoine. From victor.stinner at gmail.com Wed Aug 28 21:43:00 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 28 Aug 2013 21:43:00 +0200 Subject: [Python-Dev] cpython: Issue #18571: Implementation of the PEP 446: file descriptors and file handles In-Reply-To: <20130828203107.282fbbc7@fsol> References: <3cPmKh2nMPz7LlL@mail.python.org> <20130828203107.282fbbc7@fsol> Message-ID: 2013/8/28 Antoine Pitrou : > I don't want to sound too demanding, but was this patch actually > reviewed? I can't find a single review comment in > http://bugs.python.org/issue18571 No, it was not. The first patch for the PEP 446 (issue #18571) was available for review approximately one month ago. The implementation of the PEP 433 is very close and the issue #17036 had patches since January.
I tested my implementation on many different platforms and didn't see any regression. You can still review the commit and directly modify the source code if you would like to change something. You can also use the python-dev@ mailing list if you have comments or questions. Victor From solipsis at pitrou.net Wed Aug 28 21:49:17 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 28 Aug 2013 21:49:17 +0200 Subject: [Python-Dev] cpython: Issue #18571: Implementation of the PEP 446: file descriptors and file handles In-Reply-To: References: <3cPmKh2nMPz7LlL@mail.python.org> <20130828203107.282fbbc7@fsol> Message-ID: <20130828214917.296cc7f6@fsol> On Wed, 28 Aug 2013 21:43:00 +0200 Victor Stinner wrote: > 2013/8/28 Antoine Pitrou : > > I don't want to sound too demanding, but was this patch actually > > reviewed? I can't find a single review comment in > > http://bugs.python.org/issue18571 > > No, it was not. The first patch for the PEP 446 (issue #18571) was > available for review approximately one month ago. The > implementation of the PEP 433 is very close and the issue #17036 had > patches since January. > > I tested my implementation on many different platforms and didn't see > any regression. > > You can still review the commit and directly modify the source code if > you would like to change something. You can also use the python-dev@ > mailing list if you have comments or questions. Well, reviewing a 1500-line commit is not very doable. Regards Antoine.
You can use Rietveld if you prefer: http://bugs.python.org/review/18571/#ps9085 The commit is this patch + changes to Misc/NEWS and Doc/whatsnew/3.4.rst. Victor From ncoghlan at gmail.com Wed Aug 28 23:40:13 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 29 Aug 2013 07:40:13 +1000 Subject: [Python-Dev] Test the test suite? In-Reply-To: References: Message-ID: On 29 Aug 2013 02:34, "Serhiy Storchaka" wrote: > > 28.08.13 14:37, Victor Stinner написав(ла): > >> No, my question is: how can we detect that a test is never run? Do we >> need test coverage on the test suite? Or inject faults in the code to >> test the test suite? Any other idea? > > > Currently a lot of tests are skipped silently. See issue18702 [1]. Perhaps we need a tool which collects skipped and run tests, compares these sets with sets from a previous run on the same buildbot and reports if they are different. > > [1] http://bugs.python.org/issue18702 Figuring out a way to collect and merge coverage data would likely be more useful, since that could be applied to the standard library as well. Ned Batchelder's coverage.py supports aggregating data from multiple runs. Cheers, Nick. > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Thu Aug 29 02:16:20 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 29 Aug 2013 02:16:20 +0200 Subject: [Python-Dev] Add a new tracemalloc module to trace memory allocations Message-ID: Hi, Thanks to the PEP 445, it becomes possible to easily trace memory allocations. I propose to add a new tracemalloc module computing the memory usage per file and per line number.
It also has a private method to retrieve the location (filename and line number) of a memory allocation of an object. tracemalloc is different from Heapy or PySizer because it is focused on the location of a memory allocation rather than the object type or object content. I have an implementation of the module for Python 2.5-3.4, but it requires patching and recompiling Python: https://pypi.python.org/pypi/pytracemalloc My proposed implementation for Python 3.4 is different: * reuse the PEP 445 to hook memory allocators * use a simple C implementation of a hash table called "cfuhash" (coming from the libcfu project, BSD license) instead of depending on the glib library. I simplified and adapted cfuhash for my usage * no enable() / disable() function: tracemalloc can only be enabled before startup by setting the PYTHONTRACEMALLOC=1 environment variable * traces (size of the memory block, Python filename, Python line number) are stored directly in the memory block, not in a separate hash table I chose the PYTHONTRACEMALLOC env var instead of enable()/disable() functions to be able to really trace *all* memory allocated by Python, especially memory allocated at startup, during Python initialization. The (high-level) API should be reviewed and discussed. The most interesting part is to take "snapshots" and compare snapshots. The module can load snapshots from disk and compare them later for deeper analysis (ex: ignore some files). For the documentation, see the following page: https://pypi.python.org/pypi/pytracemalloc I created the following issue to track the implementation: http://bugs.python.org/issue18874 The implementation: http://hg.python.org/features/tracemalloc * * * I also created a "pyfailmalloc" project based on the PEP 445 to inject MemoryError exceptions. I used this module to check whether Python correctly handles memory allocation failures. The answer is no; I fixed many bugs (see issue #18408).
Project homepage: https://bitbucket.org/haypo/pyfailmalloc Charles-François Natali and Serhiy Storchaka asked me to add this module somewhere in Python 3.4: "how about adding pyfailmalloc to the main repo (maybe under Tools), with a script making it easy to run the tests suite with it enabled?" What is the best place for such a module? Add it to the Modules/ directory but as a private module: "_failmalloc"? * * * Example of tracemalloc output (it is more readable with colors, try it in a terminal). The first top is sorted by total size, whereas the second top is sorted (automatically) by the size difference. You can see for example that the linecache module likes caching data (1.5 MB after 10 seconds of tests). $ PYTHONTRACEMALLOC=1 ./python -m test ... == CPython 3.4.0a1+ (default:2ce9e5f6b47c+, Aug 29 2013, 02:03:02) [GCC 4.7.2 20121109 (Red Hat 4.7.2-8)] == Linux-3.9.4-200.fc18.x86_64-x86_64-with-fedora-18-Spherical_Cow little-endian == /home/haypo/prog/python/tracemalloc/build/test_python_11087 ...
[ 1/380] test_grammar [ 2/380] test_opcodes [ 3/380] test_dict [ 4/380] test_builtin [ 5/380] test_exceptions [ 6/380] test_types [ 7/380] test_unittest 2013-08-29 02:06:22: Top 25 allocations per file and line #1: :704: size=5 MiB, count=56227, average=105 B #2: .../tracemalloc/Lib/linecache.py:127: size=1004 KiB, count=8706, average=118 B #3: .../Lib/unittest/mock.py:1764: size=895 KiB, count=7841, average=116 B #4: .../Lib/unittest/mock.py:1805: size=817 KiB, count=15101, average=55 B #5: .../Lib/test/test_dict.py:35: size=768 KiB, count=8, average=96 KiB #6: :274: size=703 KiB, count=4604, average=156 B #7: ???:?: size=511 KiB, count=4445, average=117 B #8: .../Lib/unittest/mock.py:350: size=370 KiB, count=1227, average=308 B #9: .../Lib/unittest/case.py:306: size=343 KiB, count=1390, average=253 B #10: .../Lib/unittest/case.py:496: size=330 KiB, count=650, average=521 B #11: .../Lib/unittest/case.py:327: size=291 KiB, count=717, average=416 B #12: .../Lib/collections/__init__.py:368: size=239 KiB, count=2170, average=113 B #13: .../Lib/test/test_grammar.py:132: size=195 KiB, count=1250, average=159 B #14: .../Lib/unittest/mock.py:379: size=118 KiB, count=152, average=800 B #15: .../tracemalloc/Lib/contextlib.py:37: size=102 KiB, count=672, average=156 B #16: :1430: size=91 KiB, count=1193, average=78 B #17: .../tracemalloc/Lib/inspect.py:1399: size=79 KiB, count=104, average=784 B #18: .../tracemalloc/Lib/abc.py:133: size=77 KiB, count=275, average=289 B #19: .../Lib/unittest/case.py:43: size=73 KiB, count=593, average=127 B #20: .../Lib/unittest/mock.py:491: size=67 KiB, count=153, average=450 B #21: :1438: size=64 KiB, count=20, average=3321 B #22: .../Lib/unittest/case.py:535: size=56 KiB, count=76, average=766 B #23: .../tracemalloc/Lib/sre_compile.py:508: size=54 KiB, count=115, average=485 B #24: .../Lib/unittest/case.py:300: size=48 KiB, count=616, average=80 B #25: .../Lib/test/regrtest.py:1207: size=48 KiB, count=2, average=24 KiB 7333 more: size=4991 
KiB, count=28051, average=182 B Total Python memory: size=17 MiB, count=136358, average=136 B Total process memory: size=42 MiB (ignore tracemalloc: 22 KiB) [ 8/380] test_doctest [ 9/380] test_doctest2 [ 10/380] test_support [ 11/380] test___all__ 2013-08-29 02:08:44: Top 25 allocations per file and line (compared to 2013-08-29 02:08:39) #1: :704: size=8 MiB (+2853 KiB), count=80879 (+24652), average=109 B #2: .../tracemalloc/Lib/linecache.py:127: size=1562 KiB (+557 KiB), count=13669 (+4964), average=117 B #3: :274: size=955 KiB (+252 KiB), count=6415 (+1811), average=152 B #4: .../Lib/collections/__init__.py:368: size=333 KiB (+93 KiB), count=3136 (+966), average=108 B #5: .../tracemalloc/Lib/abc.py:133: size=148 KiB (+71 KiB), count=483 (+211), average=314 B #6: .../Lib/urllib/parse.py:476: size=71 KiB (+71 KiB), count=969 (+969), average=75 B #7: .../tracemalloc/Lib/base64.py:143: size=59 KiB (+59 KiB), count=1025 (+1025), average=59 B #8: .../tracemalloc/Lib/doctest.py:1283: size=56 KiB (+56 KiB), count=507 (+507), average=113 B #9: .../tracemalloc/Lib/sre_compile.py:508: size=89 KiB (+36 KiB), count=199 (+86), average=460 B #10: :53: size=67 KiB (+32 KiB), count=505 (+242), average=136 B #11: :1048: size=61 KiB (+27 KiB), count=332 (+138), average=188 B #12: .../Lib/unittest/case.py:496: size=351 KiB (+25 KiB), count=688 (+48), average=522 B #13: .../tracemalloc/Lib/_weakrefset.py:38: size=55 KiB (+24 KiB), count=521 (+236), average=109 B #14: .../Lib/email/quoprimime.py:56: size=24 KiB (+24 KiB), count=190 (+190), average=132 B #15: .../Lib/email/quoprimime.py:57: size=24 KiB (+24 KiB), count=1 (+1) #16: .../test/support/__init__.py:1055: size=24 KiB (+24 KiB), count=1 (+1) #17: .../Lib/test/test___all__.py:48: size=23 KiB (+23 KiB), count=283 (+283), average=84 B #18: .../tracemalloc/Lib/_weakrefset.py:37: size=60 KiB (+22 KiB), count=438 (+174), average=140 B #19: .../tracemalloc/Lib/sre_parse.py:73: size=48 KiB (+20 KiB), count=263 (+114), average=189 B 
#20: :5: size=61 KiB (+18 KiB), count=173 (+57), average=364 B #21: .../Lib/test/test___all__.py:34: size=16 KiB (+16 KiB), count=164 (+164), average=104 B #22: .../Lib/unittest/mock.py:491: size=79 KiB (+16 KiB), count=145 (+3), average=560 B #23: .../Lib/collections/__init__.py:362: size=50 KiB (+16 KiB), count=23 (+7), average=2255 B #24: .../Lib/test/test___all__.py:59: size=13 KiB (+13 KiB), count=165 (+165), average=84 B #25: .../tracemalloc/Lib/doctest.py:1291: size=13 KiB (+13 KiB), count=170 (+170), average=81 B 10788 more: size=7 MiB (-830 KiB), count=36291 (-16379), average=220 B Total Python memory: size=20 MiB (+3567 KiB), count=147635 (+20805), average=143 B Total process memory: size=49 MiB (+7 MiB) (ignore tracemalloc: 1669 KiB) Victor From victor.stinner at gmail.com Thu Aug 29 03:07:25 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 29 Aug 2013 03:07:25 +0200 Subject: [Python-Dev] Add a new tracemalloc module to trace memory allocations In-Reply-To: References: Message-ID: 2013/8/29 Victor Stinner : > My proposed implementation for Python 3.4 is different: > > * no enable() / disable() function: tracemalloc can only be enabled > before startup by setting the PYTHONTRACEMALLOC=1 environment variable > > * traces (size of the memory block, Python filename, Python line > number) are stored directly in the memory block, not in a separate > hash table > > I chose the PYTHONTRACEMALLOC env var instead of enable()/disable() > functions to be able to really trace *all* memory allocated by Python, > especially memory allocated at startup, during Python initialization. I'm not sure that having to set an environment variable is the most convenient option, especially on Windows. Storing traces directly into memory blocks should use less memory, but it requires starting tracemalloc before the first memory allocation. It is possible to add enable() and disable() functions back to dynamically install/uninstall the hook on the memory allocators. 
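(Editorial note: dynamic start/stop and snapshot comparison did end up in the tracemalloc module that shipped with Python 3.4; the sketch below uses that final stdlib API, not the pytracemalloc API under discussion in this thread.)

```python
import tracemalloc

# Enable tracing at runtime instead of via the PYTHONTRACEMALLOC
# environment variable.
tracemalloc.start()

before = tracemalloc.take_snapshot()
data = [bytes(1000) for _ in range(100)]  # allocate roughly 100 KB
after = tracemalloc.take_snapshot()

# Compare the two snapshots, grouping allocations by filename and
# line number, and show the largest differences first.
stats = after.compare_to(before, 'lineno')
for stat in stats[:3]:
    print(stat)

tracemalloc.stop()
```

Snapshots can also be dumped with Snapshot.dump() and reloaded later with Snapshot.load(), which matches the "load snapshots from disk and compare them later" workflow described above.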
I solved this issue in the current implementation by using a second hash table (pointer => trace). We can keep the environment variable, as with PYTHONFAULTHANDLER, which enables faulthandler at startup. faulthandler also has a command line option: -X faulthandler. We may add -X tracemalloc. Victor From solipsis at pitrou.net Thu Aug 29 08:40:29 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 29 Aug 2013 08:40:29 +0200 Subject: [Python-Dev] devguide: Issue #18871: make it more explicit that the test suite should be run before References: <3cQLJN2DbWz7Lk0@mail.python.org> <521E9045.2010704@udel.edu> Message-ID: <20130829084029.6ce1308a@fsol> On Wed, 28 Aug 2013 20:05:25 -0400 Terry Reedy wrote: > On 8/28/2013 5:51 PM, antoine.pitrou wrote: > > > +Does the test suite still pass? > > There are several assumptions packed in that question. > > 0. The test suite is relevant to the patch. > > Not true for doc patches, some idlelib patches, and probably some others. That was implicit in the formulation: "You must :ref:`run the whole test suite ` to ensure that it passes before pushing any *code changes*" (not doc changes, spelling corrections, etc.). The only way to know that the test suite isn't relevant to a code change is to... *run the test suite*. Don't assume that you are omniscient and know that your change won't break something. Developers not being omniscient is why we have a test suite in the first place. (btw, if idlelib isn't tested, it should be!) > 1. The test suite runs to completion. > > For at least a couple of years, that was not true on Windows desktops. > The popup Runtime Error boxes are now gone, but a month ago, a > crashing test froze Command Prompt. But was the test corrected? Did you open an issue at the time? Did you try to debug the error? We are collectively responsible for Python's quality. This is not some third-party software you have no control over. > 2. The test suite runs without error. 
> > I have hardly ever seen this on Windows, ever (with sporadic runs over > several years). Today, test_email and test_sax failed in successive > runs*. > > 3. If there is a new failure, it is due to the patch. > > There has been at least one intermittent failure of > Windows tests in the last few months, and I think more. Same answer as above. > 4. The gain of answering that question is worth the cost. Accepting responsibility for one's own changes is part of why we trust each other as committers. The cost of *you* running the test suite is smaller than the cost of other developers trying to investigate a sudden buildbot failure, figure out where it comes from, etc. Having you run the test suite encourages you to be conscious of its existence, of its imperfections, and to feel responsible for making it better. Shrugging it off because it sometimes doesn't work isn't helpful. We are *striving* to make it better (both in coverage and in robustness). > I worry that further emphasizing an overly broad, time-consuming, and > sometimes impractical rule will only discourage more participation. IMO, this is a calculated risk that is worthwhile to take. Times have changed, and being rigorous with testing is central to most successful projects nowadays. Regards Antoine. From brett at python.org Thu Aug 29 15:27:26 2013 From: brett at python.org (Brett Cannon) Date: Thu, 29 Aug 2013 09:27:26 -0400 Subject: [Python-Dev] Add a new tracemalloc module to trace memory allocations In-Reply-To: References: Message-ID: On Wed, Aug 28, 2013 at 8:16 PM, Victor Stinner wrote: > Hi, > > Thanks to PEP 445, it becomes possible to easily trace memory > allocations. I propose to add a new tracemalloc module computing the > memory usage per file and per line number. It also has a private > method to retrieve the location (filename and line number) of a memory > allocation of an object. 
> > tracemalloc is different from Heapy or PySizer because it is focused > on the location of a memory allocation rather than the object type or > object content. > > I have an implementation of the module for Python 2.5-3.4, but it > requires patching and recompiling Python: > https://pypi.python.org/pypi/pytracemalloc > > > My proposed implementation for Python 3.4 is different: > > * reuse PEP 445 to hook memory allocators > > * use a simple C implementation of a hash table called "cfuhash" > (coming from the libcfu project, BSD license) instead of depending on > the glib library. I simplified and adapted cfuhash for my usage > > * no enable() / disable() function: tracemalloc can only be enabled > before startup by setting the PYTHONTRACEMALLOC=1 environment variable > > * traces (size of the memory block, Python filename, Python line > number) are stored directly in the memory block, not in a separate > hash table > > I chose the PYTHONTRACEMALLOC env var instead of enable()/disable() > functions to be able to really trace *all* memory allocated by Python, > especially memory allocated at startup, during Python initialization. > > > The (high-level) API should be reviewed and discussed. The most > interesting part is to take "snapshots" and compare snapshots. The > module can load snapshots from disk and compare them later for deeper > analysis (ex: ignore some files). > > For the documentation, see the following page: > https://pypi.python.org/pypi/pytracemalloc > > I created the following issue to track the implementation: > http://bugs.python.org/issue18874 > > The implementation: > http://hg.python.org/features/tracemalloc Without looking at the code or docs I can say the concept sounds very cool! > > > * * * > > I also created a "pyfailmalloc" project based on PEP 445 to inject > MemoryError exceptions. I used this module to check if Python handles > memory allocation failures correctly. The answer is no; I fixed many > bugs (see issue #18408). 
> > Project homepage: > https://bitbucket.org/haypo/pyfailmalloc > > Charles-François Natali and Serhiy Storchaka asked me to add this > module somewhere in Python 3.4: "how about adding pyfailmalloc to the > main repo (maybe under Tools), with a script making it easy to run the > tests suite with it enabled?" > > What is the best place for such a module? Add it to the Modules/ directory > but as a private module: "_failmalloc"? > Would extension module authors find it useful? If so, maybe we need a malloc package with trace and fail submodules? And if we add it, we might want to add running the tool to the devguide as something people can work on. -Brett > > * * * > > Example of tracemalloc output (it is more readable with colors, try it in > a terminal). The first top is sorted by total size, whereas the second > top is sorted (automatically) by the size difference. You can see, > for example, that the linecache module likes caching data (1.5 MB after > 10 seconds of tests). > > > $ PYTHONTRACEMALLOC=1 ./python -m test > ... > == CPython 3.4.0a1+ (default:2ce9e5f6b47c+, Aug 29 2013, 02:03:02) > [GCC 4.7.2 20121109 (Red Hat 4.7.2-8)] > == Linux-3.9.4-200.fc18.x86_64-x86_64-with-fedora-18-Spherical_Cow > little-endian > == /home/haypo/prog/python/tracemalloc/build/test_python_11087 > ... 
> [ 1/380] test_grammar > [ 2/380] test_opcodes > [ 3/380] test_dict > [ 4/380] test_builtin > [ 5/380] test_exceptions > [ 6/380] test_types > [ 7/380] test_unittest > > 2013-08-29 02:06:22: Top 25 allocations per file and line > #1: :704: size=5 MiB, count=56227, > average=105 B > #2: .../tracemalloc/Lib/linecache.py:127: size=1004 KiB, count=8706, > average=118 B > #3: .../Lib/unittest/mock.py:1764: size=895 KiB, count=7841, average=116 B > #4: .../Lib/unittest/mock.py:1805: size=817 KiB, count=15101, average=55 B > #5: .../Lib/test/test_dict.py:35: size=768 KiB, count=8, average=96 KiB > #6: :274: size=703 KiB, count=4604, > average=156 B > #7: ???:?: size=511 KiB, count=4445, average=117 B > #8: .../Lib/unittest/mock.py:350: size=370 KiB, count=1227, average=308 B > #9: .../Lib/unittest/case.py:306: size=343 KiB, count=1390, average=253 B > #10: .../Lib/unittest/case.py:496: size=330 KiB, count=650, average=521 B > #11: .../Lib/unittest/case.py:327: size=291 KiB, count=717, average=416 B > #12: .../Lib/collections/__init__.py:368: size=239 KiB, count=2170, > average=113 B > #13: .../Lib/test/test_grammar.py:132: size=195 KiB, count=1250, > average=159 B > #14: .../Lib/unittest/mock.py:379: size=118 KiB, count=152, average=800 B > #15: .../tracemalloc/Lib/contextlib.py:37: size=102 KiB, count=672, > average=156 B > #16: :1430: size=91 KiB, count=1193, > average=78 B > #17: .../tracemalloc/Lib/inspect.py:1399: size=79 KiB, count=104, > average=784 B > #18: .../tracemalloc/Lib/abc.py:133: size=77 KiB, count=275, average=289 B > #19: .../Lib/unittest/case.py:43: size=73 KiB, count=593, average=127 B > #20: .../Lib/unittest/mock.py:491: size=67 KiB, count=153, average=450 B > #21: :1438: size=64 KiB, count=20, > average=3321 B > #22: .../Lib/unittest/case.py:535: size=56 KiB, count=76, average=766 B > #23: .../tracemalloc/Lib/sre_compile.py:508: size=54 KiB, count=115, > average=485 B > #24: .../Lib/unittest/case.py:300: size=48 KiB, count=616, average=80 B > #25: 
.../Lib/test/regrtest.py:1207: size=48 KiB, count=2, average=24 KiB > 7333 more: size=4991 KiB, count=28051, average=182 B > Total Python memory: size=17 MiB, count=136358, average=136 B > Total process memory: size=42 MiB (ignore tracemalloc: 22 KiB) > > [ 8/380] test_doctest > [ 9/380] test_doctest2 > [ 10/380] test_support > [ 11/380] test___all__ > > 2013-08-29 02:08:44: Top 25 allocations per file and line (compared to > 2013-08-29 02:08:39) > #1: :704: size=8 MiB (+2853 KiB), > count=80879 (+24652), average=109 B > #2: .../tracemalloc/Lib/linecache.py:127: size=1562 KiB (+557 KiB), > count=13669 (+4964), average=117 B > #3: :274: size=955 KiB (+252 KiB), > count=6415 (+1811), average=152 B > #4: .../Lib/collections/__init__.py:368: size=333 KiB (+93 KiB), > count=3136 (+966), average=108 B > #5: .../tracemalloc/Lib/abc.py:133: size=148 KiB (+71 KiB), count=483 > (+211), average=314 B > #6: .../Lib/urllib/parse.py:476: size=71 KiB (+71 KiB), count=969 > (+969), average=75 B > #7: .../tracemalloc/Lib/base64.py:143: size=59 KiB (+59 KiB), > count=1025 (+1025), average=59 B > #8: .../tracemalloc/Lib/doctest.py:1283: size=56 KiB (+56 KiB), > count=507 (+507), average=113 B > #9: .../tracemalloc/Lib/sre_compile.py:508: size=89 KiB (+36 KiB), > count=199 (+86), average=460 B > #10: :53: size=67 KiB (+32 KiB), > count=505 (+242), average=136 B > #11: :1048: size=61 KiB (+27 KiB), > count=332 (+138), average=188 B > #12: .../Lib/unittest/case.py:496: size=351 KiB (+25 KiB), count=688 > (+48), average=522 B > #13: .../tracemalloc/Lib/_weakrefset.py:38: size=55 KiB (+24 KiB), > count=521 (+236), average=109 B > #14: .../Lib/email/quoprimime.py:56: size=24 KiB (+24 KiB), count=190 > (+190), average=132 B > #15: .../Lib/email/quoprimime.py:57: size=24 KiB (+24 KiB), count=1 (+1) > #16: .../test/support/__init__.py:1055: size=24 KiB (+24 KiB), count=1 (+1) > #17: .../Lib/test/test___all__.py:48: size=23 KiB (+23 KiB), count=283 > (+283), average=84 B > #18: 
.../tracemalloc/Lib/_weakrefset.py:37: size=60 KiB (+22 KiB), > count=438 (+174), average=140 B > #19: .../tracemalloc/Lib/sre_parse.py:73: size=48 KiB (+20 KiB), > count=263 (+114), average=189 B > #20: :5: size=61 KiB (+18 KiB), count=173 (+57), average=364 B > #21: .../Lib/test/test___all__.py:34: size=16 KiB (+16 KiB), count=164 > (+164), average=104 B > #22: .../Lib/unittest/mock.py:491: size=79 KiB (+16 KiB), count=145 > (+3), average=560 B > #23: .../Lib/collections/__init__.py:362: size=50 KiB (+16 KiB), > count=23 (+7), average=2255 B > #24: .../Lib/test/test___all__.py:59: size=13 KiB (+13 KiB), count=165 > (+165), average=84 B > #25: .../tracemalloc/Lib/doctest.py:1291: size=13 KiB (+13 KiB), > count=170 (+170), average=81 B > 10788 more: size=7 MiB (-830 KiB), count=36291 (-16379), average=220 B > Total Python memory: size=20 MiB (+3567 KiB), count=147635 (+20805), > average=143 B > Total process memory: size=49 MiB (+7 MiB) (ignore tracemalloc: 1669 KiB) > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Thu Aug 29 15:54:38 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 29 Aug 2013 15:54:38 +0200 Subject: [Python-Dev] Add a new tracemalloc module to trace memory allocations In-Reply-To: References: Message-ID: 2013/8/29 Brett Cannon : >> I also created a "pyfailmalloc" project based on the PEP 445 to inject >> MemoryError exceptions. (...) > > Would extension module authors find it useful? I don't know; I created it two months ago and didn't make a public announcement. > If so maybe we need a malloc package with trace and fail submodules? I read somewhere "flat is better than nested". 
failmalloc and tracemalloc are not directly related. I guess that they can be used at the same time, but I didn't try. > And if we add it we might want to add to running the tool as part of the > devguide something people can work on. There are still tricky open issues related to failmalloc :-) - frame_fasttolocals.patch: fix for PyFrame_FastToLocals(), I didn't commit the patch because it is not atomic (it does not handle errors correctly). I should modify it to copy the locals before modifying the dict, so it can be restored in case of errors. - #18507: import_init() should not use Py_FatalError() but return an error - #18509: CJK decoders should return MBERR_EXCEPTION on PyUnicodeWriter error - fix tests hang when an exception occurs in a thread Victor From ericsnowcurrently at gmail.com Thu Aug 29 18:43:23 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 29 Aug 2013 10:43:23 -0600 Subject: [Python-Dev] [Python-checkins] cpython: Issue #16799: Switched from getopt to argparse style in regrtest's argument In-Reply-To: <3cQdkn3XY7z7Ljp@mail.python.org> References: <3cQdkn3XY7z7Ljp@mail.python.org> Message-ID: On Thu, Aug 29, 2013 at 3:27 AM, serhiy.storchaka < python-checkins at python.org> wrote: > http://hg.python.org/cpython/rev/997de0edc5bd > changeset: 85444:997de0edc5bd > parent: 85442:676bbd5b0254 > user: Serhiy Storchaka > date: Thu Aug 29 12:26:23 2013 +0300 > summary: > Issue #16799: Switched from getopt to argparse style in regrtest's > argument > parsing. Added more tests for regrtest's argument parsing. > > files: > Lib/test/regrtest.py | 529 +++++++++++-------------- > Lib/test/test_regrtest.py | 328 ++++++++++++--- > Misc/NEWS | 3 + > 3 files changed, 500 insertions(+), 360 deletions(-) > > > diff --git a/Lib/test/regrtest.py b/Lib/test/regrtest.py > --- a/Lib/test/regrtest.py > +++ b/Lib/test/regrtest.py > ... 
> diff --git a/Lib/test/test_regrtest.py b/Lib/test/test_regrtest.py > --- a/Lib/test/test_regrtest.py > +++ b/Lib/test/test_regrtest.py > @@ -4,97 +4,281 @@ > > import argparse > import getopt > We aren't using getopt in this module anymore, are we? -eric > +import os.path > import unittest > from test import regrtest, support > > -def old_parse_args(args): > - """Parse arguments as regrtest did strictly prior to 3.4. > - > - Raises getopt.GetoptError on bad arguments. > - """ > - return getopt.getopt(args, 'hvqxsoS:rf:lu:t:TD:NLR:FdwWM:nj:Gm:', > - ['help', 'verbose', 'verbose2', 'verbose3', 'quiet', > - 'exclude', 'single', 'slow', 'randomize', 'fromfile=', > 'findleaks', > - 'use=', 'threshold=', 'coverdir=', 'nocoverdir', > - 'runleaks', 'huntrleaks=', 'memlimit=', 'randseed=', > - 'multiprocess=', 'coverage', 'slaveargs=', 'forever', 'debug', > - 'start=', 'nowindows', 'header', 'testdir=', 'timeout=', 'wait', > - 'failfast', 'match=']) > - > class ParseArgsTestCase(unittest.TestCase): > > - """Test that regrtest's parsing code matches the prior getopt > behavior.""" > + """Test regrtest's argument parsing.""" > > - def _parse_args(self, args): > - # This is the same logic as that used in regrtest.main() > - parser = regrtest._create_parser() > - ns = parser.parse_args(args=args) > - opts = regrtest._convert_namespace_to_getopt(ns) > - return opts, ns.args > + def checkError(self, args, msg): > + with support.captured_stderr() as err, > self.assertRaises(SystemExit): > + regrtest._parse_args(args) > + self.assertIn(msg, err.getvalue()) > > - def _check_args(self, args, expected=None): > - """ > - The expected parameter is for cases when the behavior of the new > - parse_args differs from the old (but deliberately so). > - """ > - if expected is None: > - try: > - expected = old_parse_args(args) > - except getopt.GetoptError: > - # Suppress usage string output when an > argparse.ArgumentError > - # error is raised. 
> - with support.captured_stderr(): > - self.assertRaises(SystemExit, self._parse_args, args) > - return > - # The new parse_args() sorts by long option string. > - expected[0].sort() > - actual = self._parse_args(args) > - self.assertEqual(actual, expected) > + def test_help(self): > + for opt in '-h', '--help': > + with self.subTest(opt=opt): > + with support.captured_stdout() as out, \ > + self.assertRaises(SystemExit): > + regrtest._parse_args([opt]) > + self.assertIn('Run Python regression tests.', > out.getvalue()) > + > + def test_timeout(self): > + ns = regrtest._parse_args(['--timeout', '4.2']) > + self.assertEqual(ns.timeout, 4.2) > + self.checkError(['--timeout'], 'expected one argument') > + self.checkError(['--timeout', 'foo'], 'invalid float value') > + > + def test_wait(self): > + ns = regrtest._parse_args(['--wait']) > + self.assertTrue(ns.wait) > + > + def test_slaveargs(self): > + ns = regrtest._parse_args(['--slaveargs', '[[], {}]']) > + self.assertEqual(ns.slaveargs, '[[], {}]') > + self.checkError(['--slaveargs'], 'expected one argument') > + > + def test_start(self): > + for opt in '-S', '--start': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt, 'foo']) > + self.assertEqual(ns.start, 'foo') > + self.checkError([opt], 'expected one argument') > + > + def test_verbose(self): > + ns = regrtest._parse_args(['-v']) > + self.assertEqual(ns.verbose, 1) > + ns = regrtest._parse_args(['-vvv']) > + self.assertEqual(ns.verbose, 3) > + ns = regrtest._parse_args(['--verbose']) > + self.assertEqual(ns.verbose, 1) > + ns = regrtest._parse_args(['--verbose'] * 3) > + self.assertEqual(ns.verbose, 3) > + ns = regrtest._parse_args([]) > + self.assertEqual(ns.verbose, 0) > + > + def test_verbose2(self): > + for opt in '-w', '--verbose2': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.verbose2) > + > + def test_verbose3(self): > + for opt in '-W', '--verbose3': > + with self.subTest(opt=opt): > + ns 
= regrtest._parse_args([opt]) > + self.assertTrue(ns.verbose3) > + > + def test_debug(self): > + for opt in '-d', '--debug': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.debug) > + > + def test_quiet(self): > + for opt in '-q', '--quiet': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.quiet) > + self.assertEqual(ns.verbose, 0) > + > + def test_slow(self): > + for opt in '-o', '--slow': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.print_slow) > + > + def test_header(self): > + ns = regrtest._parse_args(['--header']) > + self.assertTrue(ns.header) > + > + def test_randomize(self): > + for opt in '-r', '--randomize': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.randomize) > + > + def test_randseed(self): > + ns = regrtest._parse_args(['--randseed', '12345']) > + self.assertEqual(ns.random_seed, 12345) > + self.assertTrue(ns.randomize) > + self.checkError(['--randseed'], 'expected one argument') > + self.checkError(['--randseed', 'foo'], 'invalid int value') > + > + def test_fromfile(self): > + for opt in '-f', '--fromfile': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt, 'foo']) > + self.assertEqual(ns.fromfile, 'foo') > + self.checkError([opt], 'expected one argument') > + self.checkError([opt, 'foo', '-s'], "don't go together") > + > + def test_exclude(self): > + for opt in '-x', '--exclude': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.exclude) > + > + def test_single(self): > + for opt in '-s', '--single': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.single) > + self.checkError([opt, '-f', 'foo'], "don't go together") > + > + def test_match(self): > + for opt in '-m', '--match': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt, 'pattern']) > + 
self.assertEqual(ns.match_tests, 'pattern') > + self.checkError([opt], 'expected one argument') > + > + def test_failfast(self): > + for opt in '-G', '--failfast': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt, '-v']) > + self.assertTrue(ns.failfast) > + ns = regrtest._parse_args([opt, '-W']) > + self.assertTrue(ns.failfast) > + self.checkError([opt], '-G/--failfast needs either -v or > -W') > + > + def test_use(self): > + for opt in '-u', '--use': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt, 'gui,network']) > + self.assertEqual(ns.use_resources, ['gui', 'network']) > + ns = regrtest._parse_args([opt, 'gui,none,network']) > + self.assertEqual(ns.use_resources, ['network']) > + expected = list(regrtest.RESOURCE_NAMES) > + expected.remove('gui') > + ns = regrtest._parse_args([opt, 'all,-gui']) > + self.assertEqual(ns.use_resources, expected) > + self.checkError([opt], 'expected one argument') > + self.checkError([opt, 'foo'], 'invalid resource') > + > + def test_memlimit(self): > + for opt in '-M', '--memlimit': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt, '4G']) > + self.assertEqual(ns.memlimit, '4G') > + self.checkError([opt], 'expected one argument') > + > + def test_testdir(self): > + ns = regrtest._parse_args(['--testdir', 'foo']) > + self.assertEqual(ns.testdir, os.path.join(support.SAVEDCWD, > 'foo')) > + self.checkError(['--testdir'], 'expected one argument') > + > + def test_findleaks(self): > + for opt in '-l', '--findleaks': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.findleaks) > + > + def test_findleaks(self): > + for opt in '-L', '--runleaks': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.runleaks) > + > + def test_findleaks(self): > + for opt in '-R', '--huntrleaks': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt, ':']) > + self.assertEqual(ns.huntrleaks, (5, 4, 
'reflog.txt')) > + ns = regrtest._parse_args([opt, '6:']) > + self.assertEqual(ns.huntrleaks, (6, 4, 'reflog.txt')) > + ns = regrtest._parse_args([opt, ':3']) > + self.assertEqual(ns.huntrleaks, (5, 3, 'reflog.txt')) > + ns = regrtest._parse_args([opt, '6:3:leaks.log']) > + self.assertEqual(ns.huntrleaks, (6, 3, 'leaks.log')) > + self.checkError([opt], 'expected one argument') > + self.checkError([opt, '6'], > + 'needs 2 or 3 colon-separated arguments') > + self.checkError([opt, 'foo:'], 'invalid huntrleaks value') > + self.checkError([opt, '6:foo'], 'invalid huntrleaks > value') > + > + def test_multiprocess(self): > + for opt in '-j', '--multiprocess': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt, '2']) > + self.assertEqual(ns.use_mp, 2) > + self.checkError([opt], 'expected one argument') > + self.checkError([opt, 'foo'], 'invalid int value') > + self.checkError([opt, '2', '-T'], "don't go together") > + self.checkError([opt, '2', '-l'], "don't go together") > + self.checkError([opt, '2', '-M', '4G'], "don't go > together") > + > + def test_findleaks(self): > + for opt in '-T', '--coverage': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.trace) > + > + def test_coverdir(self): > + for opt in '-D', '--coverdir': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt, 'foo']) > + self.assertEqual(ns.coverdir, > + os.path.join(support.SAVEDCWD, 'foo')) > + self.checkError([opt], 'expected one argument') > + > + def test_nocoverdir(self): > + for opt in '-N', '--nocoverdir': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertIsNone(ns.coverdir) > + > + def test_threshold(self): > + for opt in '-t', '--threshold': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt, '1000']) > + self.assertEqual(ns.threshold, 1000) > + self.checkError([opt], 'expected one argument') > + self.checkError([opt, 'foo'], 'invalid int value') > + > + def 
test_nowindows(self): > + for opt in '-n', '--nowindows': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.nowindows) > + > + def test_forever(self): > + for opt in '-F', '--forever': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.forever) > + > > def test_unrecognized_argument(self): > - self._check_args(['--xxx']) > - > - def test_value_not_provided(self): > - self._check_args(['--start']) > - > - def test_short_option(self): > - # getopt returns the short option whereas argparse returns the > long. > - expected = ([('--quiet', '')], []) > - self._check_args(['-q'], expected=expected) > - > - def test_long_option(self): > - self._check_args(['--quiet']) > + self.checkError(['--xxx'], 'usage:') > > def test_long_option__partial(self): > - self._check_args(['--qui']) > + ns = regrtest._parse_args(['--qui']) > + self.assertTrue(ns.quiet) > + self.assertEqual(ns.verbose, 0) > > def test_two_options(self): > - self._check_args(['--quiet', '--exclude']) > - > - def test_option_with_value(self): > - self._check_args(['--start', 'foo']) > + ns = regrtest._parse_args(['--quiet', '--exclude']) > + self.assertTrue(ns.quiet) > + self.assertEqual(ns.verbose, 0) > + self.assertTrue(ns.exclude) > > def test_option_with_empty_string_value(self): > - self._check_args(['--start', '']) > + ns = regrtest._parse_args(['--start', '']) > + self.assertEqual(ns.start, '') > > def test_arg(self): > - self._check_args(['foo']) > + ns = regrtest._parse_args(['foo']) > + self.assertEqual(ns.args, ['foo']) > > def test_option_and_arg(self): > - self._check_args(['--quiet', 'foo']) > + ns = regrtest._parse_args(['--quiet', 'foo']) > + self.assertTrue(ns.quiet) > + self.assertEqual(ns.verbose, 0) > + self.assertEqual(ns.args, ['foo']) > > - def test_fromfile(self): > - self._check_args(['--fromfile', 'file']) > - > - def test_match(self): > - self._check_args(['--match', 'pattern']) > - > - def 
test_randomize(self): > - self._check_args(['--randomize']) > - > - > -def test_main(): > - support.run_unittest(__name__) > > if __name__ == '__main__': > - test_main() > + unittest.main() > diff --git a/Misc/NEWS b/Misc/NEWS > --- a/Misc/NEWS > +++ b/Misc/NEWS > @@ -153,6 +153,9 @@ > Tests > ----- > > +- Issue #16799: Switched from getopt to argparse style in regrtest's > argument > + parsing. Added more tests for regrtest's argument parsing. > + > - Issue #18792: Use "127.0.0.1" or "::1" instead of "localhost" as much as > possible, since "localhost" goes through a DNS lookup under recent > Windows > versions. > > -- > Repository URL: http://hg.python.org/cpython > > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Thu Aug 29 18:50:08 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 29 Aug 2013 09:50:08 -0700 Subject: [Python-Dev] [Python-checkins] cpython: Issue #16799: Switched from getopt to argparse style in regrtest's argument In-Reply-To: <3cQdkn3XY7z7Ljp@mail.python.org> References: <3cQdkn3XY7z7Ljp@mail.python.org> Message-ID: Great job, Serhiy. In general, eating our own dogfood is a great idea. The more we use new Python features in our own stuff, the better. Eli On Thu, Aug 29, 2013 at 2:27 AM, serhiy.storchaka < python-checkins at python.org> wrote: > http://hg.python.org/cpython/rev/997de0edc5bd > changeset: 85444:997de0edc5bd > parent: 85442:676bbd5b0254 > user: Serhiy Storchaka > date: Thu Aug 29 12:26:23 2013 +0300 > summary: > Issue #16799: Switched from getopt to argparse style in regrtest's > argument > parsing. Added more tests for regrtest's argument parsing. 
> > files: > Lib/test/regrtest.py | 529 +++++++++++-------------- > Lib/test/test_regrtest.py | 328 ++++++++++++--- > Misc/NEWS | 3 + > 3 files changed, 500 insertions(+), 360 deletions(-) > > > diff --git a/Lib/test/regrtest.py b/Lib/test/regrtest.py > --- a/Lib/test/regrtest.py > +++ b/Lib/test/regrtest.py > @@ -233,18 +233,20 @@ > # We add help explicitly to control what argument group it renders > under. > group.add_argument('-h', '--help', action='help', > help='show this help message and exit') > - group.add_argument('--timeout', metavar='TIMEOUT', > + group.add_argument('--timeout', metavar='TIMEOUT', type=float, > help='dump the traceback and exit if a test takes > ' > 'more than TIMEOUT seconds; disabled if > TIMEOUT ' > 'is negative or equals to zero') > - group.add_argument('--wait', action='store_true', help='wait for user > ' > - 'input, e.g., allow a debugger to be attached') > + group.add_argument('--wait', action='store_true', > + help='wait for user input, e.g., allow a debugger ' > + 'to be attached') > group.add_argument('--slaveargs', metavar='ARGS') > - group.add_argument('-S', '--start', metavar='START', help='the name > of ' > - 'the test at which to start.' + more_details) > + group.add_argument('-S', '--start', metavar='START', > + help='the name of the test at which to start.' 
+ > + more_details) > > group = parser.add_argument_group('Verbosity') > - group.add_argument('-v', '--verbose', action='store_true', > + group.add_argument('-v', '--verbose', action='count', > help='run tests in verbose mode with output to > stdout') > group.add_argument('-w', '--verbose2', action='store_true', > help='re-run failed tests in verbose mode') > @@ -254,7 +256,7 @@ > help='print traceback for failed tests') > group.add_argument('-q', '--quiet', action='store_true', > help='no output unless one or more tests fail') > - group.add_argument('-o', '--slow', action='store_true', > + group.add_argument('-o', '--slow', action='store_true', > dest='print_slow', > help='print the slowest 10 tests') > group.add_argument('--header', action='store_true', > help='print header with interpreter info') > @@ -262,45 +264,60 @@ > group = parser.add_argument_group('Selecting tests') > group.add_argument('-r', '--randomize', action='store_true', > help='randomize test execution order.' + > more_details) > - group.add_argument('--randseed', metavar='SEED', help='pass a random > seed ' > - 'to reproduce a previous random run') > - group.add_argument('-f', '--fromfile', metavar='FILE', help='read > names ' > - 'of tests to run from a file.' + more_details) > + group.add_argument('--randseed', metavar='SEED', > + dest='random_seed', type=int, > + help='pass a random seed to reproduce a previous ' > + 'random run') > + group.add_argument('-f', '--fromfile', metavar='FILE', > + help='read names of tests to run from a file.' + > + more_details) > group.add_argument('-x', '--exclude', action='store_true', > help='arguments are tests to *exclude*') > - group.add_argument('-s', '--single', action='store_true', > help='single ' > - 'step through a set of tests.' 
+ more_details) > - group.add_argument('-m', '--match', metavar='PAT', help='match test > cases ' > - 'and methods with glob pattern PAT') > - group.add_argument('-G', '--failfast', action='store_true', > help='fail as ' > - 'soon as a test fails (only with -v or -W)') > - group.add_argument('-u', '--use', metavar='RES1,RES2,...', > help='specify ' > - 'which special resource intensive tests to run.' + > - more_details) > - group.add_argument('-M', '--memlimit', metavar='LIMIT', help='run > very ' > - 'large memory-consuming tests.' + more_details) > + group.add_argument('-s', '--single', action='store_true', > + help='single step through a set of tests.' + > + more_details) > + group.add_argument('-m', '--match', metavar='PAT', > + dest='match_tests', > + help='match test cases and methods with glob > pattern PAT') > + group.add_argument('-G', '--failfast', action='store_true', > + help='fail as soon as a test fails (only with -v > or -W)') > + group.add_argument('-u', '--use', metavar='RES1,RES2,...', > + action='append', type=resources_list, > + help='specify which special resource intensive > tests ' > + 'to run.' + more_details) > + group.add_argument('-M', '--memlimit', metavar='LIMIT', > + help='run very large memory-consuming tests.' + > + more_details) > group.add_argument('--testdir', metavar='DIR', > + type=relative_filename, > help='execute test files in the specified > directory ' > '(instead of the Python stdlib test suite)') > > group = parser.add_argument_group('Special runs') > - group.add_argument('-l', '--findleaks', action='store_true', help='if > GC ' > - 'is available detect tests that leak memory') > + group.add_argument('-l', '--findleaks', action='store_true', > + help='if GC is available detect tests that leak > memory') > group.add_argument('-L', '--runleaks', action='store_true', > help='run the leaks(1) command just before exit.' 
+ > - more_details) > + more_details) > group.add_argument('-R', '--huntrleaks', metavar='RUNCOUNTS', > + type=huntrleaks, > help='search for reference leaks (needs debug > build, ' > 'very slow).' + more_details) > group.add_argument('-j', '--multiprocess', metavar='PROCESSES', > + dest='use_mp', type=int, > help='run PROCESSES processes at once') > - group.add_argument('-T', '--coverage', action='store_true', > help='turn on ' > - 'code coverage tracing using the trace module') > + group.add_argument('-T', '--coverage', action='store_true', > + dest='trace', > + help='turn on code coverage tracing using the > trace ' > + 'module') > group.add_argument('-D', '--coverdir', metavar='DIR', > + type=relative_filename, > help='directory where coverage files are put') > - group.add_argument('-N', '--nocoverdir', action='store_true', > + group.add_argument('-N', '--nocoverdir', > + action='store_const', const=None, dest='coverdir', > help='put coverage files alongside modules') > group.add_argument('-t', '--threshold', metavar='THRESHOLD', > + type=int, > help='call gc.set_threshold(THRESHOLD)') > group.add_argument('-n', '--nowindows', action='store_true', > help='suppress error message boxes on Windows') > @@ -313,43 +330,103 @@ > > return parser > > -# TODO: remove this function as described in issue #16799, for example. > -# We use this function since regrtest.main() was originally written to use > -# getopt for parsing. > -def _convert_namespace_to_getopt(ns): > - """Convert an argparse.Namespace object to a getopt-style opts list. > +def relative_filename(string): > + # CWD is replaced with a temporary dir before calling main(), so we > + # join it with the saved CWD so it ends up where the user expects. > + return os.path.join(support.SAVEDCWD, string) > > - The return value of this function mimics the first element of > - getopt.getopt()'s (opts, args) return value. 
In addition, the > (option, > - value) pairs in the opts list are sorted by option and use the long > - option string. The args part of (opts, args) can be mimicked by the > - args attribute of the Namespace object we are using in regrtest. > - """ > - opts = [] > - args_dict = vars(ns) > - for key in sorted(args_dict.keys()): > - if key == 'args': > +def huntrleaks(string): > + args = string.split(':') > + if len(args) not in (2, 3): > + raise argparse.ArgumentTypeError( > + 'needs 2 or 3 colon-separated arguments') > + nwarmup = int(args[0]) if args[0] else 5 > + ntracked = int(args[1]) if args[1] else 4 > + fname = args[2] if len(args) > 2 and args[2] else 'reflog.txt' > + return nwarmup, ntracked, fname > + > +def resources_list(string): > + u = [x.lower() for x in string.split(',')] > + for r in u: > + if r == 'all' or r == 'none': > continue > - val = args_dict[key] > - # Don't continue if val equals '' because this means an option > - # accepting a value was provided the empty string. Such values > should > - # show up in the returned opts list. > - if val is None or val is False: > - continue > - if val is True: > - # Then an option with action store_true was passed. getopt > - # includes these with value '' in the opts list. 
> - val = '' > - opts.append(('--' + key, val)) > - return opts > + if r[0] == '-': > + r = r[1:] > + if r not in RESOURCE_NAMES: > + raise argparse.ArgumentTypeError('invalid resource: ' + r) > + return u > > - > -def main(tests=None, testdir=None, verbose=0, quiet=False, > +def _parse_args(args, **kwargs): > + # Defaults > + ns = argparse.Namespace(testdir=None, verbose=0, quiet=False, > exclude=False, single=False, randomize=False, fromfile=None, > findleaks=False, use_resources=None, trace=False, > coverdir='coverage', > runleaks=False, huntrleaks=False, verbose2=False, > print_slow=False, > random_seed=None, use_mp=None, verbose3=False, forever=False, > - header=False, failfast=False, match_tests=None): > + header=False, failfast=False, match_tests=None) > + for k, v in kwargs.items(): > + if not hasattr(ns, k): > + raise TypeError('%r is an invalid keyword argument ' > + 'for this function' % k) > + setattr(ns, k, v) > + if ns.use_resources is None: > + ns.use_resources = [] > + > + parser = _create_parser() > + parser.parse_args(args=args, namespace=ns) > + > + if ns.single and ns.fromfile: > + parser.error("-s and -f don't go together!") > + if ns.use_mp and ns.trace: > + parser.error("-T and -j don't go together!") > + if ns.use_mp and ns.findleaks: > + parser.error("-l and -j don't go together!") > + if ns.use_mp and ns.memlimit: > + parser.error("-M and -j don't go together!") > + if ns.failfast and not (ns.verbose or ns.verbose3): > + parser.error("-G/--failfast needs either -v or -W") > + > + if ns.quiet: > + ns.verbose = 0 > + if ns.timeout is not None: > + if hasattr(faulthandler, 'dump_traceback_later'): > + if ns.timeout <= 0: > + ns.timeout = None > + else: > + print("Warning: The timeout option requires " > + "faulthandler.dump_traceback_later") > + ns.timeout = None > + if ns.use_mp is not None: > + if ns.use_mp <= 0: > + # Use all cores + extras for tests that like to sleep > + ns.use_mp = 2 + (os.cpu_count() or 1) > + if ns.use_mp == 1: > + 
ns.use_mp = None > + if ns.use: > + for a in ns.use: > + for r in a: > + if r == 'all': > + ns.use_resources[:] = RESOURCE_NAMES > + continue > + if r == 'none': > + del ns.use_resources[:] > + continue > + remove = False > + if r[0] == '-': > + remove = True > + r = r[1:] > + if remove: > + if r in ns.use_resources: > + ns.use_resources.remove(r) > + elif r not in ns.use_resources: > + ns.use_resources.append(r) > + if ns.random_seed is not None: > + ns.randomize = True > + > + return ns > + > + > +def main(tests=None, **kwargs): > """Execute a test suite. > > This also parses command-line options and modifies its behavior > @@ -372,7 +449,6 @@ > directly to set the values that would normally be set by flags > on the command line. > """ > - > # Display the Python traceback on fatal errors (e.g. segfault) > faulthandler.enable(all_threads=True) > > @@ -389,174 +465,48 @@ > > support.record_original_stdout(sys.stdout) > > - parser = _create_parser() > - ns = parser.parse_args() > - opts = _convert_namespace_to_getopt(ns) > - args = ns.args > - usage = parser.error > + ns = _parse_args(sys.argv[1:], **kwargs) > > - # Defaults > - if random_seed is None: > - random_seed = random.randrange(10000000) > - if use_resources is None: > - use_resources = [] > - debug = False > - start = None > - timeout = None > - for o, a in opts: > - if o in ('-v', '--verbose'): > - verbose += 1 > - elif o in ('-w', '--verbose2'): > - verbose2 = True > - elif o in ('-d', '--debug'): > - debug = True > - elif o in ('-W', '--verbose3'): > - verbose3 = True > - elif o in ('-G', '--failfast'): > - failfast = True > - elif o in ('-q', '--quiet'): > - quiet = True; > - verbose = 0 > - elif o in ('-x', '--exclude'): > - exclude = True > - elif o in ('-S', '--start'): > - start = a > - elif o in ('-s', '--single'): > - single = True > - elif o in ('-o', '--slow'): > - print_slow = True > - elif o in ('-r', '--randomize'): > - randomize = True > - elif o == '--randseed': > - randomize = True > - 
random_seed = int(a) > - elif o in ('-f', '--fromfile'): > - fromfile = a > - elif o in ('-m', '--match'): > - match_tests = a > - elif o in ('-l', '--findleaks'): > - findleaks = True > - elif o in ('-L', '--runleaks'): > - runleaks = True > - elif o in ('-t', '--threshold'): > - import gc > - gc.set_threshold(int(a)) > - elif o in ('-T', '--coverage'): > - trace = True > - elif o in ('-D', '--coverdir'): > - # CWD is replaced with a temporary dir before calling main(), > so we > - # need join it with the saved CWD so it goes where the user > expects. > - coverdir = os.path.join(support.SAVEDCWD, a) > - elif o in ('-N', '--nocoverdir'): > - coverdir = None > - elif o in ('-R', '--huntrleaks'): > - huntrleaks = a.split(':') > - if len(huntrleaks) not in (2, 3): > - print(a, huntrleaks) > - usage('-R takes 2 or 3 colon-separated arguments') > - if not huntrleaks[0]: > - huntrleaks[0] = 5 > - else: > - huntrleaks[0] = int(huntrleaks[0]) > - if not huntrleaks[1]: > - huntrleaks[1] = 4 > - else: > - huntrleaks[1] = int(huntrleaks[1]) > - if len(huntrleaks) == 2 or not huntrleaks[2]: > - huntrleaks[2:] = ["reflog.txt"] > - # Avoid false positives due to various caches > - # filling slowly with random data: > - warm_caches() > - elif o in ('-M', '--memlimit'): > - support.set_memlimit(a) > - elif o in ('-u', '--use'): > - u = [x.lower() for x in a.split(',')] > - for r in u: > - if r == 'all': > - use_resources[:] = RESOURCE_NAMES > - continue > - if r == 'none': > - del use_resources[:] > - continue > - remove = False > - if r[0] == '-': > - remove = True > - r = r[1:] > - if r not in RESOURCE_NAMES: > - usage('Invalid -u/--use option: ' + a) > - if remove: > - if r in use_resources: > - use_resources.remove(r) > - elif r not in use_resources: > - use_resources.append(r) > - elif o in ('-n', '--nowindows'): > - import msvcrt > - msvcrt.SetErrorMode(msvcrt.SEM_FAILCRITICALERRORS| > - msvcrt.SEM_NOALIGNMENTFAULTEXCEPT| > - msvcrt.SEM_NOGPFAULTERRORBOX| > - 
msvcrt.SEM_NOOPENFILEERRORBOX) > - try: > - msvcrt.CrtSetReportMode > - except AttributeError: > - # release build > - pass > - else: > - for m in [msvcrt.CRT_WARN, msvcrt.CRT_ERROR, > msvcrt.CRT_ASSERT]: > - msvcrt.CrtSetReportMode(m, msvcrt.CRTDBG_MODE_FILE) > - msvcrt.CrtSetReportFile(m, msvcrt.CRTDBG_FILE_STDERR) > - elif o in ('-F', '--forever'): > - forever = True > - elif o in ('-j', '--multiprocess'): > - use_mp = int(a) > - if use_mp <= 0: > - # Use all cores + extras for tests that like to sleep > - use_mp = 2 + (os.cpu_count() or 1) > - if use_mp == 1: > - use_mp = None > - elif o == '--header': > - header = True > - elif o == '--slaveargs': > - args, kwargs = json.loads(a) > - try: > - result = runtest(*args, **kwargs) > - except KeyboardInterrupt: > - result = INTERRUPTED, '' > - except BaseException as e: > - traceback.print_exc() > - result = CHILD_ERROR, str(e) > - sys.stdout.flush() > - print() # Force a newline (just in case) > - print(json.dumps(result)) > - sys.exit(0) > - elif o == '--testdir': > - # CWD is replaced with a temporary dir before calling main(), > so we > - # join it with the saved CWD so it ends up where the user > expects. 
> - testdir = os.path.join(support.SAVEDCWD, a) > - elif o == '--timeout': > - if hasattr(faulthandler, 'dump_traceback_later'): > - timeout = float(a) > - if timeout <= 0: > - timeout = None > - else: > - print("Warning: The timeout option requires " > - "faulthandler.dump_traceback_later") > - timeout = None > - elif o == '--wait': > - input("Press any key to continue...") > + if ns.huntrleaks: > + # Avoid false positives due to various caches > + # filling slowly with random data: > + warm_caches() > + if ns.memlimit is not None: > + support.set_memlimit(ns.memlimit) > + if ns.threshold is not None: > + import gc > + gc.set_threshold(ns.threshold) > + if ns.nowindows: > + import msvcrt > + msvcrt.SetErrorMode(msvcrt.SEM_FAILCRITICALERRORS| > + msvcrt.SEM_NOALIGNMENTFAULTEXCEPT| > + msvcrt.SEM_NOGPFAULTERRORBOX| > + msvcrt.SEM_NOOPENFILEERRORBOX) > + try: > + msvcrt.CrtSetReportMode > + except AttributeError: > + # release build > + pass > else: > - print(("No handler for option {}. Please report this as a > bug " > - "at http://bugs.python.org.").format(o), > file=sys.stderr) > - sys.exit(1) > - if single and fromfile: > - usage("-s and -f don't go together!") > - if use_mp and trace: > - usage("-T and -j don't go together!") > - if use_mp and findleaks: > - usage("-l and -j don't go together!") > - if use_mp and support.max_memuse: > - usage("-M and -j don't go together!") > - if failfast and not (verbose or verbose3): > - usage("-G/--failfast needs either -v or -W") > + for m in [msvcrt.CRT_WARN, msvcrt.CRT_ERROR, > msvcrt.CRT_ASSERT]: > + msvcrt.CrtSetReportMode(m, msvcrt.CRTDBG_MODE_FILE) > + msvcrt.CrtSetReportFile(m, msvcrt.CRTDBG_FILE_STDERR) > + if ns.wait: > + input("Press any key to continue...") > + > + if ns.slaveargs is not None: > + args, kwargs = json.loads(ns.slaveargs) > + try: > + result = runtest(*args, **kwargs) > + except KeyboardInterrupt: > + result = INTERRUPTED, '' > + except BaseException as e: > + traceback.print_exc() > + result = 
CHILD_ERROR, str(e) > + sys.stdout.flush() > + print() # Force a newline (just in case) > + print(json.dumps(result)) > + sys.exit(0) > > good = [] > bad = [] > @@ -565,12 +515,12 @@ > environment_changed = [] > interrupted = False > > - if findleaks: > + if ns.findleaks: > try: > import gc > except ImportError: > print('No GC available, disabling findleaks.') > - findleaks = False > + ns.findleaks = False > else: > # Uncomment the line below to report garbage that is not > # freeable by reference counting alone. By default only > @@ -578,42 +528,40 @@ > #gc.set_debug(gc.DEBUG_SAVEALL) > found_garbage = [] > > - if single: > + if ns.single: > filename = os.path.join(TEMPDIR, 'pynexttest') > try: > - fp = open(filename, 'r') > - next_test = fp.read().strip() > - tests = [next_test] > - fp.close() > + with open(filename, 'r') as fp: > + next_test = fp.read().strip() > + tests = [next_test] > except OSError: > pass > > - if fromfile: > + if ns.fromfile: > tests = [] > - fp = open(os.path.join(support.SAVEDCWD, fromfile)) > - count_pat = re.compile(r'\[\s*\d+/\s*\d+\]') > - for line in fp: > - line = count_pat.sub('', line) > - guts = line.split() # assuming no test has whitespace in its > name > - if guts and not guts[0].startswith('#'): > - tests.extend(guts) > - fp.close() > + with open(os.path.join(support.SAVEDCWD, ns.fromfile)) as fp: > + count_pat = re.compile(r'\[\s*\d+/\s*\d+\]') > + for line in fp: > + line = count_pat.sub('', line) > + guts = line.split() # assuming no test has whitespace in > its name > + if guts and not guts[0].startswith('#'): > + tests.extend(guts) > > # Strip .py extensions. > - removepy(args) > + removepy(ns.args) > removepy(tests) > > stdtests = STDTESTS[:] > nottests = NOTTESTS.copy() > - if exclude: > - for arg in args: > + if ns.exclude: > + for arg in ns.args: > if arg in stdtests: > stdtests.remove(arg) > nottests.add(arg) > - args = [] > + ns.args = [] > > # For a partial run, we do not need to clutter the output. 
> - if verbose or header or not (quiet or single or tests or args): > + if ns.verbose or ns.header or not (ns.quiet or ns.single or tests or > ns.args): > # Print basic platform information > print("==", platform.python_implementation(), > *sys.version.split()) > print("== ", platform.platform(aliased=True), > @@ -623,37 +571,39 @@ > > # if testdir is set, then we are not running the python tests suite, > so > # don't add default tests to be executed or skipped (pass empty > values) > - if testdir: > - alltests = findtests(testdir, list(), set()) > + if ns.testdir: > + alltests = findtests(ns.testdir, list(), set()) > else: > - alltests = findtests(testdir, stdtests, nottests) > + alltests = findtests(ns.testdir, stdtests, nottests) > > - selected = tests or args or alltests > - if single: > + selected = tests or ns.args or alltests > + if ns.single: > selected = selected[:1] > try: > next_single_test = alltests[alltests.index(selected[0])+1] > except IndexError: > next_single_test = None > # Remove all the selected tests that precede start if it's set. 
> - if start: > + if ns.start: > try: > - del selected[:selected.index(start)] > + del selected[:selected.index(ns.start)] > except ValueError: > - print("Couldn't find starting test (%s), using all tests" % > start) > - if randomize: > - random.seed(random_seed) > - print("Using random seed", random_seed) > + print("Couldn't find starting test (%s), using all tests" % > ns.start) > + if ns.randomize: > + if ns.random_seed is None: > + ns.random_seed = random.randrange(10000000) > + random.seed(ns.random_seed) > + print("Using random seed", ns.random_seed) > random.shuffle(selected) > - if trace: > + if ns.trace: > import trace, tempfile > tracer = trace.Trace(ignoredirs=[sys.base_prefix, > sys.base_exec_prefix, > tempfile.gettempdir()], > trace=False, count=True) > > test_times = [] > - support.verbose = verbose # Tell tests to be moderately quiet > - support.use_resources = use_resources > + support.verbose = ns.verbose # Tell tests to be moderately quiet > + support.use_resources = ns.use_resources > save_modules = sys.modules.keys() > > def accumulate_result(test, result): > @@ -671,7 +621,7 @@ > skipped.append(test) > resource_denieds.append(test) > > - if forever: > + if ns.forever: > def test_forever(tests=list(selected)): > while True: > for test in tests: > @@ -686,7 +636,7 @@ > test_count = '/{}'.format(len(selected)) > test_count_width = len(test_count) - 1 > > - if use_mp: > + if ns.use_mp: > try: > from threading import Thread > except ImportError: > @@ -710,11 +660,12 @@ > output.put((None, None, None, None)) > return > args_tuple = ( > - (test, verbose, quiet), > - dict(huntrleaks=huntrleaks, > use_resources=use_resources, > - debug=debug, output_on_failure=verbose3, > - timeout=timeout, failfast=failfast, > - match_tests=match_tests) > + (test, ns.verbose, ns.quiet), > + dict(huntrleaks=ns.huntrleaks, > + use_resources=ns.use_resources, > + debug=ns.debug, > output_on_failure=ns.verbose3, > + timeout=ns.timeout, failfast=ns.failfast, > + 
match_tests=ns.match_tests) > ) > # -E is needed by some tests, e.g. test_import > # Running the child from the same working directory > ensures > @@ -743,19 +694,19 @@ > except BaseException: > output.put((None, None, None, None)) > raise > - workers = [Thread(target=work) for i in range(use_mp)] > + workers = [Thread(target=work) for i in range(ns.use_mp)] > for worker in workers: > worker.start() > finished = 0 > test_index = 1 > try: > - while finished < use_mp: > + while finished < ns.use_mp: > test, stdout, stderr, result = output.get() > if test is None: > finished += 1 > continue > accumulate_result(test, result) > - if not quiet: > + if not ns.quiet: > fmt = "[{1:{0}}{2}/{3}] {4}" if bad else > "[{1:{0}}{2}] {4}" > print(fmt.format( > test_count_width, test_index, test_count, > @@ -778,29 +729,30 @@ > worker.join() > else: > for test_index, test in enumerate(tests, 1): > - if not quiet: > + if not ns.quiet: > fmt = "[{1:{0}}{2}/{3}] {4}" if bad else "[{1:{0}}{2}] > {4}" > print(fmt.format( > test_count_width, test_index, test_count, len(bad), > test)) > sys.stdout.flush() > - if trace: > + if ns.trace: > # If we're tracing code coverage, then we don't exit with > status > # if on a false return value from main. 
> - tracer.runctx('runtest(test, verbose, quiet, > timeout=timeout)', > + tracer.runctx('runtest(test, ns.verbose, ns.quiet, > timeout=ns.timeout)', > globals=globals(), locals=vars()) > else: > try: > - result = runtest(test, verbose, quiet, huntrleaks, > debug, > - output_on_failure=verbose3, > - timeout=timeout, failfast=failfast, > - match_tests=match_tests) > + result = runtest(test, ns.verbose, ns.quiet, > + ns.huntrleaks, ns.debug, > + output_on_failure=ns.verbose3, > + timeout=ns.timeout, > failfast=ns.failfast, > + match_tests=ns.match_tests) > accumulate_result(test, result) > except KeyboardInterrupt: > interrupted = True > break > except: > raise > - if findleaks: > + if ns.findleaks: > gc.collect() > if gc.garbage: > print("Warning: test created", len(gc.garbage), end=' > ') > @@ -821,11 +773,11 @@ > omitted = set(selected) - set(good) - set(bad) - set(skipped) > print(count(len(omitted), "test"), "omitted:") > printlist(omitted) > - if good and not quiet: > + if good and not ns.quiet: > if not bad and not skipped and not interrupted and len(good) > 1: > print("All", end=' ') > print(count(len(good), "test"), "OK.") > - if print_slow: > + if ns.print_slow: > test_times.sort(reverse=True) > print("10 slowest tests:") > for time, test in test_times[:10]: > @@ -839,18 +791,19 @@ > print("{} altered the execution environment:".format( > count(len(environment_changed), "test"))) > printlist(environment_changed) > - if skipped and not quiet: > + if skipped and not ns.quiet: > print(count(len(skipped), "test"), "skipped:") > printlist(skipped) > > - if verbose2 and bad: > + if ns.verbose2 and bad: > print("Re-running failed tests in verbose mode") > for test in bad: > print("Re-running test %r in verbose mode" % test) > sys.stdout.flush() > try: > - verbose = True > - ok = runtest(test, True, quiet, huntrleaks, debug, > timeout=timeout) > + ns.verbose = True > + ok = runtest(test, True, ns.quiet, ns.huntrleaks, > ns.debug, > + timeout=ns.timeout) > except 
KeyboardInterrupt: > # print a newline separate from the ^C > print() > @@ -858,18 +811,18 @@ > except: > raise > > - if single: > + if ns.single: > if next_single_test: > with open(filename, 'w') as fp: > fp.write(next_single_test + '\n') > else: > os.unlink(filename) > > - if trace: > + if ns.trace: > r = tracer.results() > - r.write_results(show_missing=True, summary=True, > coverdir=coverdir) > + r.write_results(show_missing=True, summary=True, > coverdir=ns.coverdir) > > - if runleaks: > + if ns.runleaks: > os.system("leaks %d" % os.getpid()) > > sys.exit(len(bad) > 0 or interrupted) > diff --git a/Lib/test/test_regrtest.py b/Lib/test/test_regrtest.py > --- a/Lib/test/test_regrtest.py > +++ b/Lib/test/test_regrtest.py > @@ -4,97 +4,281 @@ > > import argparse > import getopt > +import os.path > import unittest > from test import regrtest, support > > -def old_parse_args(args): > - """Parse arguments as regrtest did strictly prior to 3.4. > - > - Raises getopt.GetoptError on bad arguments. 
> - """ > - return getopt.getopt(args, 'hvqxsoS:rf:lu:t:TD:NLR:FdwWM:nj:Gm:', > - ['help', 'verbose', 'verbose2', 'verbose3', 'quiet', > - 'exclude', 'single', 'slow', 'randomize', 'fromfile=', > 'findleaks', > - 'use=', 'threshold=', 'coverdir=', 'nocoverdir', > - 'runleaks', 'huntrleaks=', 'memlimit=', 'randseed=', > - 'multiprocess=', 'coverage', 'slaveargs=', 'forever', 'debug', > - 'start=', 'nowindows', 'header', 'testdir=', 'timeout=', 'wait', > - 'failfast', 'match=']) > - > class ParseArgsTestCase(unittest.TestCase): > > - """Test that regrtest's parsing code matches the prior getopt > behavior.""" > + """Test regrtest's argument parsing.""" > > - def _parse_args(self, args): > - # This is the same logic as that used in regrtest.main() > - parser = regrtest._create_parser() > - ns = parser.parse_args(args=args) > - opts = regrtest._convert_namespace_to_getopt(ns) > - return opts, ns.args > + def checkError(self, args, msg): > + with support.captured_stderr() as err, > self.assertRaises(SystemExit): > + regrtest._parse_args(args) > + self.assertIn(msg, err.getvalue()) > > - def _check_args(self, args, expected=None): > - """ > - The expected parameter is for cases when the behavior of the new > - parse_args differs from the old (but deliberately so). > - """ > - if expected is None: > - try: > - expected = old_parse_args(args) > - except getopt.GetoptError: > - # Suppress usage string output when an > argparse.ArgumentError > - # error is raised. > - with support.captured_stderr(): > - self.assertRaises(SystemExit, self._parse_args, args) > - return > - # The new parse_args() sorts by long option string. 
> - expected[0].sort() > - actual = self._parse_args(args) > - self.assertEqual(actual, expected) > + def test_help(self): > + for opt in '-h', '--help': > + with self.subTest(opt=opt): > + with support.captured_stdout() as out, \ > + self.assertRaises(SystemExit): > + regrtest._parse_args([opt]) > + self.assertIn('Run Python regression tests.', > out.getvalue()) > + > + def test_timeout(self): > + ns = regrtest._parse_args(['--timeout', '4.2']) > + self.assertEqual(ns.timeout, 4.2) > + self.checkError(['--timeout'], 'expected one argument') > + self.checkError(['--timeout', 'foo'], 'invalid float value') > + > + def test_wait(self): > + ns = regrtest._parse_args(['--wait']) > + self.assertTrue(ns.wait) > + > + def test_slaveargs(self): > + ns = regrtest._parse_args(['--slaveargs', '[[], {}]']) > + self.assertEqual(ns.slaveargs, '[[], {}]') > + self.checkError(['--slaveargs'], 'expected one argument') > + > + def test_start(self): > + for opt in '-S', '--start': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt, 'foo']) > + self.assertEqual(ns.start, 'foo') > + self.checkError([opt], 'expected one argument') > + > + def test_verbose(self): > + ns = regrtest._parse_args(['-v']) > + self.assertEqual(ns.verbose, 1) > + ns = regrtest._parse_args(['-vvv']) > + self.assertEqual(ns.verbose, 3) > + ns = regrtest._parse_args(['--verbose']) > + self.assertEqual(ns.verbose, 1) > + ns = regrtest._parse_args(['--verbose'] * 3) > + self.assertEqual(ns.verbose, 3) > + ns = regrtest._parse_args([]) > + self.assertEqual(ns.verbose, 0) > + > + def test_verbose2(self): > + for opt in '-w', '--verbose2': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.verbose2) > + > + def test_verbose3(self): > + for opt in '-W', '--verbose3': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.verbose3) > + > + def test_debug(self): > + for opt in '-d', '--debug': > + with self.subTest(opt=opt): > + 
ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.debug) > + > + def test_quiet(self): > + for opt in '-q', '--quiet': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.quiet) > + self.assertEqual(ns.verbose, 0) > + > + def test_slow(self): > + for opt in '-o', '--slow': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.print_slow) > + > + def test_header(self): > + ns = regrtest._parse_args(['--header']) > + self.assertTrue(ns.header) > + > + def test_randomize(self): > + for opt in '-r', '--randomize': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.randomize) > + > + def test_randseed(self): > + ns = regrtest._parse_args(['--randseed', '12345']) > + self.assertEqual(ns.random_seed, 12345) > + self.assertTrue(ns.randomize) > + self.checkError(['--randseed'], 'expected one argument') > + self.checkError(['--randseed', 'foo'], 'invalid int value') > + > + def test_fromfile(self): > + for opt in '-f', '--fromfile': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt, 'foo']) > + self.assertEqual(ns.fromfile, 'foo') > + self.checkError([opt], 'expected one argument') > + self.checkError([opt, 'foo', '-s'], "don't go together") > + > + def test_exclude(self): > + for opt in '-x', '--exclude': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.exclude) > + > + def test_single(self): > + for opt in '-s', '--single': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.single) > + self.checkError([opt, '-f', 'foo'], "don't go together") > + > + def test_match(self): > + for opt in '-m', '--match': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt, 'pattern']) > + self.assertEqual(ns.match_tests, 'pattern') > + self.checkError([opt], 'expected one argument') > + > + def test_failfast(self): > + for opt in '-G', '--failfast': > + 
with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt, '-v']) > + self.assertTrue(ns.failfast) > + ns = regrtest._parse_args([opt, '-W']) > + self.assertTrue(ns.failfast) > + self.checkError([opt], '-G/--failfast needs either -v or > -W') > + > + def test_use(self): > + for opt in '-u', '--use': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt, 'gui,network']) > + self.assertEqual(ns.use_resources, ['gui', 'network']) > + ns = regrtest._parse_args([opt, 'gui,none,network']) > + self.assertEqual(ns.use_resources, ['network']) > + expected = list(regrtest.RESOURCE_NAMES) > + expected.remove('gui') > + ns = regrtest._parse_args([opt, 'all,-gui']) > + self.assertEqual(ns.use_resources, expected) > + self.checkError([opt], 'expected one argument') > + self.checkError([opt, 'foo'], 'invalid resource') > + > + def test_memlimit(self): > + for opt in '-M', '--memlimit': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt, '4G']) > + self.assertEqual(ns.memlimit, '4G') > + self.checkError([opt], 'expected one argument') > + > + def test_testdir(self): > + ns = regrtest._parse_args(['--testdir', 'foo']) > + self.assertEqual(ns.testdir, os.path.join(support.SAVEDCWD, > 'foo')) > + self.checkError(['--testdir'], 'expected one argument') > + > + def test_findleaks(self): > + for opt in '-l', '--findleaks': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.findleaks) > + > + def test_findleaks(self): > + for opt in '-L', '--runleaks': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.runleaks) > + > + def test_findleaks(self): > + for opt in '-R', '--huntrleaks': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt, ':']) > + self.assertEqual(ns.huntrleaks, (5, 4, 'reflog.txt')) > + ns = regrtest._parse_args([opt, '6:']) > + self.assertEqual(ns.huntrleaks, (6, 4, 'reflog.txt')) > + ns = regrtest._parse_args([opt, ':3']) > + 
self.assertEqual(ns.huntrleaks, (5, 3, 'reflog.txt')) > + ns = regrtest._parse_args([opt, '6:3:leaks.log']) > + self.assertEqual(ns.huntrleaks, (6, 3, 'leaks.log')) > + self.checkError([opt], 'expected one argument') > + self.checkError([opt, '6'], > + 'needs 2 or 3 colon-separated arguments') > + self.checkError([opt, 'foo:'], 'invalid huntrleaks value') > + self.checkError([opt, '6:foo'], 'invalid huntrleaks > value') > + > + def test_multiprocess(self): > + for opt in '-j', '--multiprocess': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt, '2']) > + self.assertEqual(ns.use_mp, 2) > + self.checkError([opt], 'expected one argument') > + self.checkError([opt, 'foo'], 'invalid int value') > + self.checkError([opt, '2', '-T'], "don't go together") > + self.checkError([opt, '2', '-l'], "don't go together") > + self.checkError([opt, '2', '-M', '4G'], "don't go > together") > + > + def test_findleaks(self): > + for opt in '-T', '--coverage': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.trace) > + > + def test_coverdir(self): > + for opt in '-D', '--coverdir': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt, 'foo']) > + self.assertEqual(ns.coverdir, > + os.path.join(support.SAVEDCWD, 'foo')) > + self.checkError([opt], 'expected one argument') > + > + def test_nocoverdir(self): > + for opt in '-N', '--nocoverdir': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertIsNone(ns.coverdir) > + > + def test_threshold(self): > + for opt in '-t', '--threshold': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt, '1000']) > + self.assertEqual(ns.threshold, 1000) > + self.checkError([opt], 'expected one argument') > + self.checkError([opt, 'foo'], 'invalid int value') > + > + def test_nowindows(self): > + for opt in '-n', '--nowindows': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.nowindows) > + > + def 
test_forever(self): > + for opt in '-F', '--forever': > + with self.subTest(opt=opt): > + ns = regrtest._parse_args([opt]) > + self.assertTrue(ns.forever) > + > > def test_unrecognized_argument(self): > - self._check_args(['--xxx']) > - > - def test_value_not_provided(self): > - self._check_args(['--start']) > - > - def test_short_option(self): > - # getopt returns the short option whereas argparse returns the > long. > - expected = ([('--quiet', '')], []) > - self._check_args(['-q'], expected=expected) > - > - def test_long_option(self): > - self._check_args(['--quiet']) > + self.checkError(['--xxx'], 'usage:') > > def test_long_option__partial(self): > - self._check_args(['--qui']) > + ns = regrtest._parse_args(['--qui']) > + self.assertTrue(ns.quiet) > + self.assertEqual(ns.verbose, 0) > > def test_two_options(self): > - self._check_args(['--quiet', '--exclude']) > - > - def test_option_with_value(self): > - self._check_args(['--start', 'foo']) > + ns = regrtest._parse_args(['--quiet', '--exclude']) > + self.assertTrue(ns.quiet) > + self.assertEqual(ns.verbose, 0) > + self.assertTrue(ns.exclude) > > def test_option_with_empty_string_value(self): > - self._check_args(['--start', '']) > + ns = regrtest._parse_args(['--start', '']) > + self.assertEqual(ns.start, '') > > def test_arg(self): > - self._check_args(['foo']) > + ns = regrtest._parse_args(['foo']) > + self.assertEqual(ns.args, ['foo']) > > def test_option_and_arg(self): > - self._check_args(['--quiet', 'foo']) > + ns = regrtest._parse_args(['--quiet', 'foo']) > + self.assertTrue(ns.quiet) > + self.assertEqual(ns.verbose, 0) > + self.assertEqual(ns.args, ['foo']) > > - def test_fromfile(self): > - self._check_args(['--fromfile', 'file']) > - > - def test_match(self): > - self._check_args(['--match', 'pattern']) > - > - def test_randomize(self): > - self._check_args(['--randomize']) > - > - > -def test_main(): > - support.run_unittest(__name__) > > if __name__ == '__main__': > - test_main() > + 
unittest.main() > diff --git a/Misc/NEWS b/Misc/NEWS > --- a/Misc/NEWS > +++ b/Misc/NEWS > @@ -153,6 +153,9 @@ > Tests > ----- > > +- Issue #16799: Switched from getopt to argparse style in regrtest's > argument > + parsing. Added more tests for regrtest's argument parsing. > + > - Issue #18792: Use "127.0.0.1" or "::1" instead of "localhost" as much as > possible, since "localhost" goes through a DNS lookup under recent > Windows > versions. > > -- > Repository URL: http://hg.python.org/cpython > > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian at python.org Fri Aug 30 00:10:27 2013 From: christian at python.org (Christian Heimes) Date: Fri, 30 Aug 2013 00:10:27 +0200 Subject: [Python-Dev] Coverity Scan Spotlight Python Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 Hello, Coverity has published its "Coverity Scan Spotlight Python" a couple of hours ago. It features a summary of Python's ecosystem, an interview with me about Python core development and a defect report. The report is awesome. We have reached a defect density of .005 defects per 1,000 lines of code. In 2012 the average defect density of Open Source Software was 0.69. http://www.coverity.com/company/press-releases/read/coverity-finds-python-sets-new-level-of-quality-for-open-source-software http://wpcme.coverity.com/wp-content/uploads/2013-Coverity-Scan-Spotlight-Python.pdf The internet likes it, too. http://www.prnewswire.com/news-releases/coverity-finds-python-sets-new-level-of-quality-for-open-source-software-221629931.html http://www.securityweek.com/python-gets-high-marks-open-source-software-security-report Thank you very much to Kristin Brennan and Dakshesh Vyas from Coverity as well as everybody who has helped to fix the remaining issues! 
Christian -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iQIcBAEBCgAGBQJSH8bEAAoJEMeIxMHUVQ1FFQcQAL1/Tb5PFMdLXwWsMt9D06aP A2qQPunEnfDBMdQz4GTEeDmHPdjs/EgAtUz4sLI48HlAmpdWEtoVPCdg1GvKSvMi IRVHR5LAtxe5p8M42+8DnSFyIOtEsbtv06W5cHvRxr6RuIkY3bTy0SVhtP9JW+N7 wQKsp2cOIOz/FHDWWQWjxwlZmUWEGkvSSggzbYxcdsaJeGHoJgkuzoChQ3mCtUCo w231OTKBZhGQp/VpMK+Q7OXWm78BZdB6d4GcSR3meCU9GpRMfPBxPF7v4IWvDPv9 4l/y922hmLLoOchJG+PDqcDhX1dnFm1t3Q199iqS5c0c+ttgaMRdSJEXZpZrubxe k+frJiOivG4G7BuzgQ39yF01rRHpjs57FW9FBbt4pp2c+4iOEkgARH+L/e2ZwOnk puXE45AfKwJwHLc4RDOhxdaPy/ovOh53HY68UxXoKjeZKWK5ShRopk0muvYG0y5O +8PbAKOYgJbe//NC3ac89V/1eu4rrFhN7xsK2Wc8i+kcbTB2XIVFElLHuV5wjmLd MMXFlm9LDJFOw12E4sF3MPaHyXQYpNJHvbnuxCkcHRQoLKzrcRJ2Y0Jj4HPSUCsj JhfmHX7Zu+/akmT4haqXUdtRrn4wji0OYqGydEqi4aLy7ELrC1EVNZY4OkbUhJO8 gGbpseJXtVThXQ7fymMS =++g9 -----END PGP SIGNATURE----- From eliben at gmail.com Fri Aug 30 00:25:37 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 29 Aug 2013 15:25:37 -0700 Subject: [Python-Dev] Coverity Scan Spotlight Python In-Reply-To: References: Message-ID: Great work, Christian! On Thu, Aug 29, 2013 at 3:10 PM, Christian Heimes wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA512 > > Hello, > > Coverity has published its "Coverity Scan Spotlight Python" a couple > of hours ago. It features a summary of Python's ecosystem, an > interview with me about Python core development and a defect report. > The report is awesome. We have reached a defect density of .005 > defects per 1,000 lines of code. In 2012 the average defect density of > Open Source Software was 0.69. > > > http://www.coverity.com/company/press-releases/read/coverity-finds-python-sets-new-level-of-quality-for-open-source-software > > > http://wpcme.coverity.com/wp-content/uploads/2013-Coverity-Scan-Spotlight-Python.pdf > > The internet likes it, too. 
> > > http://www.prnewswire.com/news-releases/coverity-finds-python-sets-new-level-of-quality-for-open-source-software-221629931.html > > > http://www.securityweek.com/python-gets-high-marks-open-source-software-security-report > > > Thank you very much to Kristin Brennan and Dakshesh Vyas from Coverity > as well as everybody who has helped to fix the remaining issues! > > Christian > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.12 (GNU/Linux) > Comment: Using GnuPG with undefined - http://www.enigmail.net/ > > iQIcBAEBCgAGBQJSH8bEAAoJEMeIxMHUVQ1FFQcQAL1/Tb5PFMdLXwWsMt9D06aP > A2qQPunEnfDBMdQz4GTEeDmHPdjs/EgAtUz4sLI48HlAmpdWEtoVPCdg1GvKSvMi > IRVHR5LAtxe5p8M42+8DnSFyIOtEsbtv06W5cHvRxr6RuIkY3bTy0SVhtP9JW+N7 > wQKsp2cOIOz/FHDWWQWjxwlZmUWEGkvSSggzbYxcdsaJeGHoJgkuzoChQ3mCtUCo > w231OTKBZhGQp/VpMK+Q7OXWm78BZdB6d4GcSR3meCU9GpRMfPBxPF7v4IWvDPv9 > 4l/y922hmLLoOchJG+PDqcDhX1dnFm1t3Q199iqS5c0c+ttgaMRdSJEXZpZrubxe > k+frJiOivG4G7BuzgQ39yF01rRHpjs57FW9FBbt4pp2c+4iOEkgARH+L/e2ZwOnk > puXE45AfKwJwHLc4RDOhxdaPy/ovOh53HY68UxXoKjeZKWK5ShRopk0muvYG0y5O > +8PbAKOYgJbe//NC3ac89V/1eu4rrFhN7xsK2Wc8i+kcbTB2XIVFElLHuV5wjmLd > MMXFlm9LDJFOw12E4sF3MPaHyXQYpNJHvbnuxCkcHRQoLKzrcRJ2Y0Jj4HPSUCsj > JhfmHX7Zu+/akmT4haqXUdtRrn4wji0OYqGydEqi4aLy7ELrC1EVNZY4OkbUhJO8 > gGbpseJXtVThXQ7fymMS > =++g9 > -----END PGP SIGNATURE----- > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/eliben%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From solipsis at pitrou.net Fri Aug 30 00:46:58 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 30 Aug 2013 00:46:58 +0200 Subject: [Python-Dev] Coverity Scan Spotlight Python References: Message-ID: <20130830004658.07215e29@fsol> On Fri, 30 Aug 2013 00:10:27 +0200 Christian Heimes wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA512 > > Hello, > > Coverity has published its "Coverity Scan Spotlight Python" a couple > of hours ago. It features a summary of Python's ecosystem, an > interview with me about Python core development and a defect report. > The report is awesome. We have reached a defect density of .005 > defects per 1,000 lines of code. What is a defect? Isn't it a bit weird to keep having a non-zero defect density, if those defects are identified? (or, if those defects are not bugs, what is the metric supposed to measure?) Regards Antoine. From christian at python.org Fri Aug 30 01:18:03 2013 From: christian at python.org (Christian Heimes) Date: Fri, 30 Aug 2013 01:18:03 +0200 Subject: [Python-Dev] Coverity Scan Spotlight Python In-Reply-To: <20130830004658.07215e29@fsol> References: <20130830004658.07215e29@fsol> Message-ID: <521FD6AB.3050107@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 On 30.08.2013 00:46, Antoine Pitrou wrote: > On Fri, 30 Aug 2013 00:10:27 +0200 Christian Heimes > wrote: >> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 >> >> Hello, >> >> Coverity has published its "Coverity Scan Spotlight Python" a >> couple of hours ago. It features a summary of Python's ecosystem, >> an interview with me about Python core development and a defect >> report. The report is awesome. We have reached a defect density >> of .005 defects per 1,000 lines of code. > > What is a defect? Isn't it a bit weird to keep having a non-zero > defect density, if those defects are identified? > > (or, if those defects are not bugs, what is the metric supposed to > measure?)
The last defect is http://bugs.python.org/issue18550 "internal_setblocking() doesn't check return value of fcntl()". It's unlikely that the missing check is going to cause trouble. It's tedious to fix it, too. At least one affected function can't signal an error because it is defined as void. Christian -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iQIcBAEBCgAGBQJSH9adAAoJEMeIxMHUVQ1FU+wQAKEQcZbCrOgD1vIzOdfZXgGV qRHRqhhSoxfhApQ+zhCem/qPGNYBhQyZ4ReXVdCtlvd15p28oa5thFDO7wFfbaBm iQ9mV6nUn3vWgKr2PueEtUrQFd80t4t97AHyU04KblBJjesq8tv5l26i2SGl5YtS QWAJMi3zCbv2iZ2DlyjSs3zpGMzk2mj85dKYtU6ql+mKXH7utR3HUpFiHiL7sjCw D6Q5leORscqoqRxSwNtaT+vAWold5cmWHaH2nGOKj6vaBGKQbFEXRuMAj0sKyPj/ h3N/o+8DAdWH4J3eP8RcIKsai65vmXnzc77s8V2t9kFbuqZn/6CyMwkhsGxsl86h DyN24LhwcB+pK45KFBX92JEhYWQ8OumcfE3Hb/2wIHNFClEvMNSbh7N+5GzjXE0u xpsPjQpT9cldhWOcbPpVFx77zDVvsQczGSiqeH90zKCT7T9AIwUOYrjA0GiO/Nm/ wDMbmyL2/EMkDrnZ+X1YIwWaZOBEQlQofSSVnd1/g0fMm+5kJrW44W1D4grt0hpK TB2uApUCls4qdh3Juu630rMZNKm5/Tvfmtjr/mKHtRCcQvMmhRs2x901/I8ZdwQ+ AoL+yM2qPmsriSTkANGwZHJw2yzTJOv2PXG41ohitE2GdS10i5aRhySVepcjZx/k Gn/FRAsP/AVKReqOVooF =AyxK -----END PGP SIGNATURE----- From sturla at molden.no Fri Aug 30 01:24:12 2013 From: sturla at molden.no (Sturla Molden) Date: Fri, 30 Aug 2013 01:24:12 +0200 Subject: [Python-Dev] Coverity Scan Spotlight Python In-Reply-To: References: Message-ID: <2DF2F3A2-1015-4B2D-9CDF-9C5A98B11275@molden.no> Do the numbers add up? .005 defects in 1,000 lines of code is one defect in every 200,000 lines of code. However they also claim that "to date, the Coverity Scan service has analyzed nearly 400,000 lines of Python code and identified 996 new defects - 860 of which have been fixed by the Python community." Sturla Sent from my iPad On 30 Aug 2013, at
00:10, Christian Heimes wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA512 > > Hello, > > Coverity has published its "Coverity Scan Spotlight Python" a couple > of hours ago. It features a summary of Python's ecosystem, an > interview with me about Python core development and a defect report. > The report is awesome. We have reached a defect density of .005 > defects per 1,000 lines of code. In 2012 the average defect density of > Open Source Software was 0.69. > > http://www.coverity.com/company/press-releases/read/coverity-finds-python-sets-new-level-of-quality-for-open-source-software > > http://wpcme.coverity.com/wp-content/uploads/2013-Coverity-Scan-Spotlight-Python.pdf > > The internet likes it, too. > > http://www.prnewswire.com/news-releases/coverity-finds-python-sets-new-level-of-quality-for-open-source-software-221629931.html > > http://www.securityweek.com/python-gets-high-marks-open-source-software-security-report > > > Thank you very much to Kristin Brennan and Dakshesh Vyas from Coverity > as well as everybody who has helped to fix the remaining issues!
> > Christian > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.12 (GNU/Linux) > Comment: Using GnuPG with undefined - http://www.enigmail.net/ > > iQIcBAEBCgAGBQJSH8bEAAoJEMeIxMHUVQ1FFQcQAL1/Tb5PFMdLXwWsMt9D06aP > A2qQPunEnfDBMdQz4GTEeDmHPdjs/EgAtUz4sLI48HlAmpdWEtoVPCdg1GvKSvMi > IRVHR5LAtxe5p8M42+8DnSFyIOtEsbtv06W5cHvRxr6RuIkY3bTy0SVhtP9JW+N7 > wQKsp2cOIOz/FHDWWQWjxwlZmUWEGkvSSggzbYxcdsaJeGHoJgkuzoChQ3mCtUCo > w231OTKBZhGQp/VpMK+Q7OXWm78BZdB6d4GcSR3meCU9GpRMfPBxPF7v4IWvDPv9 > 4l/y922hmLLoOchJG+PDqcDhX1dnFm1t3Q199iqS5c0c+ttgaMRdSJEXZpZrubxe > k+frJiOivG4G7BuzgQ39yF01rRHpjs57FW9FBbt4pp2c+4iOEkgARH+L/e2ZwOnk > puXE45AfKwJwHLc4RDOhxdaPy/ovOh53HY68UxXoKjeZKWK5ShRopk0muvYG0y5O > +8PbAKOYgJbe//NC3ac89V/1eu4rrFhN7xsK2Wc8i+kcbTB2XIVFElLHuV5wjmLd > MMXFlm9LDJFOw12E4sF3MPaHyXQYpNJHvbnuxCkcHRQoLKzrcRJ2Y0Jj4HPSUCsj > JhfmHX7Zu+/akmT4haqXUdtRrn4wji0OYqGydEqi4aLy7ELrC1EVNZY4OkbUhJO8 > gGbpseJXtVThXQ7fymMS > =++g9 > -----END PGP SIGNATURE----- > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/sturla%40molden.no From tim.peters at gmail.com Fri Aug 30 01:43:13 2013 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 29 Aug 2013 18:43:13 -0500 Subject: [Python-Dev] Can someone try to duplicate corruption on Gentoo? Message-ID: In http://bugs.python.org/issue18843 a user reported a debug PyMalloc "bad leading pad byte" memory corruption death while running their code. After some thrashing, they decided to rebuild Python, and got the same kind of error while rebuilding Python. See http://bugs.python.org/msg196481 in that bug report: """ # emerge dev-lang/python:2.7 * IMPORTANT: 11 news items need reading for repository 'gentoo'. * Use eselect news to read news items. Calculating dependencies... done! 
Debug memory block at address p=0xa7f5900: API 'o' 80 bytes originally requested The 7 pad bytes at p-7 are not all FORBIDDENBYTE (0xfb): at p-7: 0xfb at p-6: 0xfb at p-5: 0xfa *** OUCH at p-4: 0xfb at p-3: 0xfb at p-2: 0xfb at p-1: 0xfb Because memory is corrupted at the start, the count of bytes requested may be bogus, and checking the trailing pad bytes may segfault. The 8 pad bytes at tail=0xa7f5950 are FORBIDDENBYTE, as expected. The block was made by call #21242094 to debug malloc/realloc. Data at p: 73 00 00 00 79 00 00 00 ... 67 00 00 00 00 00 00 00 Fatal Python error: bad leading pad byte Aborted (core dumped) # """ I don't have access to Gentoo, and don't know squat about its `emerge`, but if someone else can do this it might help ;-) The Python used to run `emerge` here was a --with-pydebug Python the bug reporter built earlier. From tseaver at palladion.com Fri Aug 30 02:40:13 2013 From: tseaver at palladion.com (Tres Seaver) Date: Thu, 29 Aug 2013 20:40:13 -0400 Subject: [Python-Dev] Coverity Scan Spotlight Python In-Reply-To: <2DF2F3A2-1015-4B2D-9CDF-9C5A98B11275@molden.no> References: <2DF2F3A2-1015-4B2D-9CDF-9C5A98B11275@molden.no> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 08/29/2013 07:24 PM, Sturla Molden wrote: > > Do the numbers add up? > > .005 defects in 1,000 lines of code is one defect in every 200,000 > lines of code. > > However they also claim that "to date, the Coverity Scan service has > analyzed nearly 400,000 lines of Python code and identified 996 new > defects ? 860 of which have been fixed by the Python community." FWIW: David Wheeler's 'sloccount' reports 800,489 lines of code in the Python 3.3.1 tarball, of which 403,266 lines are Python code, and 368,474 are ANSI C. That defect rate would imply 4 open defects in Python itself. Tres. 
- -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iEYEARECAAYFAlIf6e0ACgkQ+gerLs4ltQ6X6wCgosAIUJyGjcBqbeAMLwMH24TJ j3cAoNKPEuKEbVmke2IZuSdtl2nMAFL4 =MoZm -----END PGP SIGNATURE----- From rdmurray at bitdance.com Fri Aug 30 04:45:58 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Thu, 29 Aug 2013 22:45:58 -0400 Subject: [Python-Dev] Can someone try to duplicate corruption on Gentoo? In-Reply-To: References: Message-ID: <20130830024558.A91442507A2@webabinitio.net> On Thu, 29 Aug 2013 18:43:13 -0500, Tim Peters wrote: > In > > http://bugs.python.org/issue18843 > > a user reported a debug PyMalloc "bad leading pad byte" memory > corruption death while running their code. After some thrashing, they > decided to rebuild Python, and got the same kind of error while > rebuilding Python. See > > http://bugs.python.org/msg196481 > > in that bug report: > > """ > # emerge dev-lang/python:2.7 > > * IMPORTANT: 11 news items need reading for repository 'gentoo'. > * Use eselect news to read news items. > > Calculating dependencies... done! > Debug memory block at address p=0xa7f5900: API 'o' > 80 bytes originally requested > The 7 pad bytes at p-7 are not all FORBIDDENBYTE (0xfb): > at p-7: 0xfb > at p-6: 0xfb > at p-5: 0xfa *** OUCH > at p-4: 0xfb > at p-3: 0xfb > at p-2: 0xfb > at p-1: 0xfb > Because memory is corrupted at the start, the count of bytes requested > may be bogus, and checking the trailing pad bytes may segfault. > The 8 pad bytes at tail=0xa7f5950 are FORBIDDENBYTE, as expected. > The block was made by call #21242094 to debug malloc/realloc. > Data at p: 73 00 00 00 79 00 00 00 ... 
67 00 00 00 00 00 00 00 > Fatal Python error: bad leading pad byte > Aborted (core dumped) Emerge uses Python, and 2.7 is the default system python on Gentoo, so unless he changed his default, that error almost certainly came from the existing Python he was having trouble with. Just for fun I re-emerged 2.7.3-r3 on my system, and that worked fine. --David From tim.peters at gmail.com Fri Aug 30 05:02:16 2013 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 29 Aug 2013 22:02:16 -0500 Subject: [Python-Dev] Can someone try to duplicate corruption on Gentoo? In-Reply-To: <20130830024558.A91442507A2@webabinitio.net> References: <20130830024558.A91442507A2@webabinitio.net> Message-ID: [R. David Murray ] > Emerge uses Python, and 2.7 is the default system python on Gentoo, > so unless he changed his default, that error almost certainly came from > the existing Python he was having trouble with. Yes, "The Python used to run `emerge` here was a --with-pydebug Python the bug reporter built earlier". > Just for fun I re-emerged 2.7.3-r3 on my system, and that worked fine. Thanks for trying! Note that only a debug-build (or at least with PYMALLOC_DEBUG defined) Python can generate a "bad leading pad byte" error, so if you didn't use a debug-build Python to run `emerge`, you could not have seen the same error. From tjreedy at udel.edu Fri Aug 30 05:53:18 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 29 Aug 2013 23:53:18 -0400 Subject: [Python-Dev] Coverity Scan Spotlight Python In-Reply-To: <2DF2F3A2-1015-4B2D-9CDF-9C5A98B11275@molden.no> References: <2DF2F3A2-1015-4B2D-9CDF-9C5A98B11275@molden.no> Message-ID: On 8/29/2013 7:24 PM, Sturla Molden wrote: > > Do the numbers add up? > > .005 defects in 1,000 lines of code is one defect in every 200,000 lines of code. > > However they also claim that "to date, the Coverity Scan service has analyzed nearly 400,000 lines of Python code and identified 996 new defects -
860 of which have been fixed by the Python community." Some marked as 'false positive', some as 'intentional'. -- Terry Jan Reedy From andrew.svetlov at gmail.com Fri Aug 30 11:24:07 2013 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Fri, 30 Aug 2013 12:24:07 +0300 Subject: [Python-Dev] Add function to signal module for getting main thread id Message-ID: Main thread is slightly different from others. Signals can be subscribed from main thread only. Tulip has special logic for main thread. In application code we can explicitly know which thread is executed, main or not. But from library it's not easy. Tulip uses check like threading.current_thread().name == 'MainThread' This approach has a problem: thread name is writable attribute and can be changed by user code. My proposition is to add function like get_mainthread_id() -> int which return ident for main thread (I know function name is not perfect, please guess better one). Signal module already has required data as internal variable static long main_thread; I just guess to expose this value to python. Thoughts? -- Thanks, Andrew Svetlov From solipsis at pitrou.net Fri Aug 30 11:34:43 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 30 Aug 2013 11:34:43 +0200 Subject: [Python-Dev] Add function to signal module for getting main thread id References: Message-ID: <20130830113443.220c27a0@pitrou.net> On Fri, 30 Aug 2013 12:24:07 +0300, Andrew Svetlov wrote: > Main thread is slightly different from others. > Signals can be subscribed from main thread only. > Tulip has special logic for main thread. > In application code we can explicitly know which thread is executed, > main or not. > But from library it's not easy. > Tulip uses check like > threading.current_thread().name == 'MainThread' > This approach has a problem: thread name is writable attribute and can > be changed by user code. Really?
Please at least use: > > My proposition is to add function like get_mainthread_id() -> int > which return ident for main thread (I know function name is not > perfect, please guess better one). > Signal module already has required data as internal variable > static long main_thread; > I just guess to expose this value to python. > Thoughts? > From victor.stinner at gmail.com Fri Aug 30 11:36:57 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 30 Aug 2013 11:36:57 +0200 Subject: [Python-Dev] Add function to signal module for getting main thread id In-Reply-To: References: Message-ID: 2013/8/30 Andrew Svetlov : > Tulip uses check like > threading.current_thread().name == 'MainThread' You should use the identifier, not the name: threading.current_thread().ident. > This approach has a problem: thread name is writable attribute and can > be changed by user code. The ident attribute cannot be modified. > My proposition is to add function like get_mainthread_id() -> int > which return ident for main thread (I know function name is not > perfect, please guess better one). Just call threading.get_ident() at startup, when you have only one thread (the main thread). There is an ugly hack to get the identifier of the main thread: threading._shutdown.__self__.ident. > Signal module already has required data as internal variable > static long main_thread; > I just guess to expose this value to python. > Thoughts? If we expose the identifier of the main thread, something should be added to the threading module, not the signal module. Is it possible that the main thread exit while there are still other live threads? 
Victor From solipsis at pitrou.net Fri Aug 30 11:39:37 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 30 Aug 2013 11:39:37 +0200 Subject: [Python-Dev] Add function to signal module for getting main thread id References: Message-ID: <20130830113937.652d76af@pitrou.net> On Fri, 30 Aug 2013 12:24:07 +0300, Andrew Svetlov wrote: > Main thread is slightly different from others. > Signals can be subscribed from main thread only. > Tulip has special logic for main thread. > In application code we can explicitly know which thread is executed, > main or not. > But from library it's not easy. > Tulip uses check like > threading.current_thread().name == 'MainThread' > This approach has a problem: thread name is writable attribute and can > be changed by user code. Please at least use: >>> isinstance(threading.current_thread(), threading._MainThread) True But really, what we need is a threading.main_thread() function. (Apologies for the previous incomplete reply (keyboard mishap)) Regards Antoine. From solipsis at pitrou.net Fri Aug 30 11:43:20 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 30 Aug 2013 11:43:20 +0200 Subject: [Python-Dev] Add function to signal module for getting main thread id References: Message-ID: <20130830114320.360ec3a6@pitrou.net> On Fri, 30 Aug 2013 11:36:57 +0200, Victor Stinner wrote: > > If we expose the identifier of the main thread, something should be > added to the threading module, not the signal module. Agreed. > Is it possible that the main thread exit while there are still other > live threads? "exit" in what sense? In the C sense, no: when the main C thread exits, the whole process is terminated (this is how our "daemon threads" work). In the Python sense, yes: we have a test for it: http://hg.python.org/cpython/file/c347b9063a9e/Lib/test/test_threading.py#l325 Regards Antoine.
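An illustration of the ident-based alternative Victor describes: capture the main thread's ident once at import time and compare against it later. This is a sketch, not an existing API; `_MAIN_THREAD_ID` and `in_main_thread` are names invented here, and the approach assumes the module is first imported from the main thread.

```python
import threading

# Recorded at import time, when the importing thread is (normally) the
# main thread.  Invented name for this sketch; only reliable if the
# first import happens in the main thread, as Victor notes.
_MAIN_THREAD_ID = threading.get_ident()

def in_main_thread():
    """Return True if the calling thread is the recorded main thread."""
    return threading.get_ident() == _MAIN_THREAD_ID
```

Unlike `Thread.name`, a thread's `ident` cannot be reassigned by user code, which avoids the spoofing problem Andrew describes.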
From phd at phdru.name Fri Aug 30 11:44:35 2013 From: phd at phdru.name (Oleg Broytman) Date: Fri, 30 Aug 2013 13:44:35 +0400 Subject: [Python-Dev] Add function to signal module for getting main thread id In-Reply-To: References: Message-ID: <20130830094435.GA10722@iskra.aviel.ru> On Fri, Aug 30, 2013 at 12:24:07PM +0300, Andrew Svetlov wrote: > Main thread is slightly different from others. > Signals can be subscribed from main thread only. > Tulip has special logic for main thread. > In application code we can explicitly know which thread is executed, > main or not. > But from library it's not easy. > Tulip uses check like > threading.current_thread().name == 'MainThread' > This approach has a problem: thread name is writable attribute and can > be changed by user code. You can test threading.current_thread().__class__ is threading._MainThread or threading.current_thread().ident == threading._MainThread.ident > My proposition is to add function like get_mainthread_id() -> int > which return ident for main thread threading._MainThread.ident ? Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From andrew.svetlov at gmail.com Fri Aug 30 11:47:04 2013 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Fri, 30 Aug 2013 12:47:04 +0300 Subject: [Python-Dev] Add function to signal module for getting main thread id In-Reply-To: <20130830113937.652d76af@pitrou.net> References: <20130830113937.652d76af@pitrou.net> Message-ID: I missed _MainThread in threading, that's why I've guessed to add function to signal module. threading.main_thread() is much better sure. On Fri, Aug 30, 2013 at 12:39 PM, Antoine Pitrou wrote: > > Le Fri, 30 Aug 2013 12:24:07 +0300, > Andrew Svetlov a ?crit : >> Main thread is slightly different from others. >> Signals can be subscribed from main thread only. >> Tulip has special logic for main thread. 
>> In application code we can explicitly know which thread is executed, >> main or not. >> But from library it's not easy. >> Tulip uses check like >> threading.current_thread().name == 'MainThread' >> This approach has a problem: thread name is writable attribute and can >> be changed by user code. > > Please at least use: > > >>> isinstance(threading.current_thread(), threading._MainThread) > True > > But really, what we need is a threading.main_thread() function. > > (Apologies for the previous incomplete reply (keyboard mishap)) > > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/andrew.svetlov%40gmail.com -- Thanks, Andrew Svetlov From andrew.svetlov at gmail.com Fri Aug 30 11:52:12 2013 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Fri, 30 Aug 2013 12:52:12 +0300 Subject: [Python-Dev] Add function to signal module for getting main thread id In-Reply-To: <20130830094435.GA10722@iskra.aviel.ru> References: <20130830094435.GA10722@iskra.aviel.ru> Message-ID: _MainThread can be used as workaround, but adding public function makes value. Oleg, as I understand _MainThread is a class, not class instance, test for threading._MainThread.ident doesn't make sense. On Fri, Aug 30, 2013 at 12:44 PM, Oleg Broytman wrote: > On Fri, Aug 30, 2013 at 12:24:07PM +0300, Andrew Svetlov wrote: >> Main thread is slightly different from others. >> Signals can be subscribed from main thread only. >> Tulip has special logic for main thread. >> In application code we can explicitly know which thread is executed, >> main or not. >> But from library it's not easy. >> Tulip uses check like >> threading.current_thread().name == 'MainThread' >> This approach has a problem: thread name is writable attribute and can >> be changed by user code. 
> > You can test > threading.current_thread().__class__ is threading._MainThread > or > threading.current_thread().ident == threading._MainThread.ident > >> My proposition is to add function like get_mainthread_id() -> int >> which return ident for main thread > > threading._MainThread.ident ? > > Oleg. > -- > Oleg Broytman http://phdru.name/ phd at phdru.name > Programmers don't die, they just GOSUB without RETURN. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/andrew.svetlov%40gmail.com -- Thanks, Andrew Svetlov From andrew.svetlov at gmail.com Fri Aug 30 12:27:15 2013 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Fri, 30 Aug 2013 13:27:15 +0300 Subject: [Python-Dev] Add function to signal module for getting main thread id In-Reply-To: References: <20130830094435.GA10722@iskra.aviel.ru> Message-ID: I've filed http://bugs.python.org/issue18882 for this. On Fri, Aug 30, 2013 at 12:52 PM, Andrew Svetlov wrote: > _MainThread can be used as workaround, but adding public function makes value. > > Oleg, as I understand _MainThread is a class, not class instance, test > for threading._MainThread.ident doesn't make sense. > > On Fri, Aug 30, 2013 at 12:44 PM, Oleg Broytman wrote: >> On Fri, Aug 30, 2013 at 12:24:07PM +0300, Andrew Svetlov wrote: >>> Main thread is slightly different from others. >>> Signals can be subscribed from main thread only. >>> Tulip has special logic for main thread. >>> In application code we can explicitly know which thread is executed, >>> main or not. >>> But from library it's not easy. >>> Tulip uses check like >>> threading.current_thread().name == 'MainThread' >>> This approach has a problem: thread name is writable attribute and can >>> be changed by user code. 
>> >> You can test >> threading.current_thread().__class__ is threading._MainThread >> or >> threading.current_thread().ident == threading._MainThread.ident >> >>> My proposition is to add function like get_mainthread_id() -> int >>> which return ident for main thread >> >> threading._MainThread.ident ? >> >> Oleg. >> -- >> Oleg Broytman http://phdru.name/ phd at phdru.name >> Programmers don't die, they just GOSUB without RETURN. >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: http://mail.python.org/mailman/options/python-dev/andrew.svetlov%40gmail.com > > > > -- > Thanks, > Andrew Svetlov -- Thanks, Andrew Svetlov From cf.natali at gmail.com Fri Aug 30 12:29:12 2013 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Fri, 30 Aug 2013 12:29:12 +0200 Subject: [Python-Dev] EINTR handling... Message-ID: Hello, This has been bothering me for years: why don't we properly handle EINTR, by running registered signal handlers and restarting the interrupted syscall (or eventually returning early e.g. for sleep)? EINTR is really a nuisance, and exposing it to Python code is just pointless. Now some people might argue that some code relies on EINTR to interrupt a syscall on purpose, but I don't really buy it: it's highly non-portable (depends on the syscall, SA_RESTART flag...) and subject to race conditions (if it comes before the syscall or if you get a partial read/write you'll deadlock). Furthermore, the stdlib code base is not consistent: some code paths handle EINTR, e.g. subprocess, multiprocessing, sock_sendall() does but not sock_send()... Just grep for EINTR and InterruptedError and you'll be amazed. GHC, the JVM and probably other platforms handle EINTR, maybe it's time for us too?
Just for reference, here are some issues due to EINTR popping up: http://bugs.python.org/issue17097 http://bugs.python.org/issue12268 http://bugs.python.org/issue9867 http://bugs.python.org/issue7978 http://bugs.python.org/issue12493 http://bugs.python.org/issue3771 cf From christian at python.org Fri Aug 30 14:06:11 2013 From: christian at python.org (Christian Heimes) Date: Fri, 30 Aug 2013 14:06:11 +0200 Subject: [Python-Dev] Add function to signal module for getting main thread id In-Reply-To: <20130830113937.652d76af@pitrou.net> References: <20130830113937.652d76af@pitrou.net> Message-ID: Am 30.08.2013 11:39, schrieb Antoine Pitrou: > > Le Fri, 30 Aug 2013 12:24:07 +0300, > Andrew Svetlov a écrit : >> Main thread is slightly different from others. >> Signals can be subscribed from main thread only. >> Tulip has special logic for main thread. >> In application code we can explicitly know which thread is executed, >> main or not. >> But from library it's not easy. >> Tulip uses check like >> threading.current_thread().name == 'MainThread' >> This approach has a problem: thread name is writable attribute and can >> be changed by user code. > > Please at least use: > > >>> isinstance(threading.current_thread(), threading._MainThread) > True > > But really, what we need is a threading.main_thread() function. What happens when a program fork()s from another thread than the main thread? AFAIR the other threads are suspended and the forking thread is the new main thread. Or something similar... (Yes, I'm aware that threading + fork is an abomination.) Christian From amauryfa at gmail.com Fri Aug 30 14:06:53 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Fri, 30 Aug 2013 14:06:53 +0200 Subject: [Python-Dev] EINTR handling...
In-Reply-To: References: Message-ID: 2013/8/30 Charles-François Natali > Hello, > > This has been bothering me for years: why don't we properly handle > EINTR, by running registered signal handlers and restarting the > interrupted syscall (or eventually returning early e.g. for sleep)? > > EINTR is really a nuisance, and exposing it to Python code is just > pointless. > I agree. Is there a way to see in C code where EINTR is not handled? Or a method to handle this systematically? > Now some people might argue that some code relies on EINTR to > interrupt a syscall on purpose, but I don't really buy it: it's highly > non-portable (depends on the syscall, SA_RESTART flag...) and subject > to race conditions (if it comes before the syscall or if you get a > partial read/write you'll deadlock). > > Furthermore, the stdlib code base is not consistent: some code paths > handle EINTR, e.g. subprocess, multiprocessing, sock_sendall() does > but not sock_send()... > Just grep for EINTR and InterruptedError and you'll be amazed. > > GHC, the JVM and probably other platforms handle EINTR, maybe it's > time for us too? > > Just for reference, here are some issues due to EINTR popping up: > http://bugs.python.org/issue17097 > http://bugs.python.org/issue12268 > http://bugs.python.org/issue9867 > http://bugs.python.org/issue7978 > http://bugs.python.org/issue12493 > http://bugs.python.org/issue3771 > > > cf > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/amauryfa%40gmail.com > -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ncoghlan at gmail.com Fri Aug 30 14:09:37 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 30 Aug 2013 22:09:37 +1000 Subject: [Python-Dev] Add function to signal module for getting main thread id In-Reply-To: References: <20130830094435.GA10722@iskra.aviel.ru> Message-ID: On 30 August 2013 20:27, Andrew Svetlov wrote: > I've filed http://bugs.python.org/issue18882 for this. I don't actually object to the addition, but is there any way that "threading.enumerate()[0]" *won't* be the main thread? (subinterpreters, perhaps, but they're going to have trouble anyway, since they won't have access to the real main thread) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Fri Aug 30 14:13:51 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 30 Aug 2013 22:13:51 +1000 Subject: [Python-Dev] EINTR handling... In-Reply-To: References: Message-ID: On 30 August 2013 20:29, Charles-François Natali wrote: > Hello, > > This has been bothering me for years: why don't we properly handle > EINTR, by running registered signal handlers and restarting the > interrupted syscall (or eventually returning early e.g. for sleep)? > > EINTR is really a nuisance, and exposing it to Python code is just pointless. > > Now some people might argue that some code relies on EINTR to > interrupt a syscall on purpose, but I don't really buy it: it's highly > non-portable (depends on the syscall, SA_RESTART flag...) and subject > to race conditions (if it comes before the syscall or if you get a > partial read/write you'll deadlock). > > Furthermore, the stdlib code base is not consistent: some code paths > handle EINTR, e.g. subprocess, multiprocessing, sock_sendall() does > but not sock_send()... > Just grep for EINTR and InterruptedError and you'll be amazed. > > GHC, the JVM and probably other platforms handle EINTR, maybe it's > time for us too? Sounds good to me.
I don't believe there's been a conscious decision that we *shouldn't* handle it, it just hasn't annoyed anyone enough for them to propose a systematic fix in CPython. If that latter part is no longer true, great ;) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From christian at python.org Fri Aug 30 14:18:17 2013 From: christian at python.org (Christian Heimes) Date: Fri, 30 Aug 2013 14:18:17 +0200 Subject: [Python-Dev] Coverity Scan Spotlight Python In-Reply-To: <2DF2F3A2-1015-4B2D-9CDF-9C5A98B11275@molden.no> References: <2DF2F3A2-1015-4B2D-9CDF-9C5A98B11275@molden.no> Message-ID: <52208D89.9050909@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 Am 30.08.2013 01:24, schrieb Sturla Molden: > > Do the numbers add up? > > .005 defects in 1,000 lines of code is one defect in every 200,000 > lines of code. > > However they also claim that "to date, the Coverity Scan service > has analyzed nearly 400,000 lines of Python code and identified 996 > new defects — 860 of which have been fixed by the Python > community." Yes, the numbers add up. The difference between 860 and 996 consists of false positives and code that is intentionally written in a way which looks suspicious to Coverity Scan. I have documented the most common limitations in the devguide [1]. By the way, Coverity Scan doesn't understand Python code. It can only analyze C, C++ and Java code.
[1] Christian -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iQIcBAEBCgAGBQJSII1+AAoJEMeIxMHUVQ1FNsIP/jmiMD8p39zHj8Ggb6NM1q9W WotnQzM2vLE90s9VfewQB914u4rFjEtYVWD6P88QQEwdcrBfnMs/xvFBcyW/yuVd EB57hTjqWKSgGdcFsKoAmlFtSzFTUtM3Yc4aiyYHwsn7vJPTbxAO/6GAToGhHeP6 96f0oXz4uqeM4RJNCbHPt57kHT9OUhsITiZ11rtlsYziGwpRKL5K7bd+bbh/HlPy BDRVfU112vDjOiCRFGPlmMy2ShJabZwT5uZ4+0VGgGo5/Af3H3UU7pYw1cuwnjgh CIv/jYFH8OgNvC+hwvai2OxQfH7aXtUhcSPUSOOmPUQ/pbkTMY65Ya2iIRtEoIrY 8FwayYTMzGkCkEZoS4HXO1wGNCcj3tM8ivGP89aJDpySYLmuJoLa5x/aNKKxyo+X n9HT4BAkuYuFi1qQsPh9kW+FR4VCWTob7BSjOXrY7T8X6plon+fwFseQMkE8JUqI ckwTJCHDIc23d/HiTNhI8Ank3v28JQLdVTIPYnSKU6YpxjDAO0J+BgExAHpAyVwZ snEz9zVj/x4YRkUgxWwTMj/ctKDEpX9mehg5rytlWIaKUtPbTmR+aWxG06+TCd1c dg0cEYso+tvVUAYfZX24dn/7NPrmkBHjGM0ph2PH0S+GcpHF861GvflaSwzQ/ceD kYF3msFihRocFXfy8iNj =Usp8 -----END PGP SIGNATURE----- From solipsis at pitrou.net Fri Aug 30 14:18:44 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 30 Aug 2013 14:18:44 +0200 Subject: [Python-Dev] Add function to signal module for getting main thread id References: <20130830094435.GA10722@iskra.aviel.ru> Message-ID: <20130830141844.1af99bce@pitrou.net> Le Fri, 30 Aug 2013 22:09:37 +1000, Nick Coghlan a ?crit : > On 30 August 2013 20:27, Andrew Svetlov > wrote: > > I've filed http://bugs.python.org/issue18882 for this. > > I don't actually object to the addition, but is there any way that > "threading.enumerate()[0]" *won't* be the main thread? enumerate() doesn't guarantee any ordering, and the underlying container is a dict (actually, there are two of them). > (subinterpreters, perhaps, but they're going to have trouble anyway, > since they won't have access to the real main thread) Ah, subinterpreters :-) cheers Antoine. 
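The checks traded back and forth in this thread can be put side by side in a short sketch. This is illustrative only: the is_main_thread() helper below is hypothetical, the fallback relies on CPython's private _MainThread class, and threading.main_thread() is assumed to exist only on interpreters carrying the issue 18882 patch (it landed in Python 3.4):

```python
import threading

def is_main_thread():
    """Report whether the caller is running in the main thread."""
    current = threading.current_thread()
    main_thread = getattr(threading, "main_thread", None)
    if main_thread is not None:
        # Python 3.4+ (issue 18882): public accessor for the main thread
        return current is main_thread()
    # Fallback: check against the private class. Unlike comparing
    # .name, this cannot be fooled by user code renaming the thread.
    return isinstance(current, threading._MainThread)

# The name-based check used by Tulip is fragile: Thread.name is writable.
t = threading.current_thread()
t.name = "NotMainThread"
print(t.name == "MainThread")  # the name check now lies
print(is_main_thread())        # still detected correctly
```

From a worker thread both the isinstance() variant and main_thread() return False, which is what a library needs to know before touching signal handlers.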
From solipsis at pitrou.net Fri Aug 30 14:21:54 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 30 Aug 2013 14:21:54 +0200 Subject: [Python-Dev] Add function to signal module for getting main thread id References: <20130830113937.652d76af@pitrou.net> Message-ID: <20130830142154.09ab7880@pitrou.net> Le Fri, 30 Aug 2013 14:06:11 +0200, Christian Heimes a écrit : > Am 30.08.2013 11:39, schrieb Antoine Pitrou: > > > > Le Fri, 30 Aug 2013 12:24:07 +0300, > > Andrew Svetlov a écrit : > >> Main thread is slightly different from others. > >> Signals can be subscribed from main thread only. > >> Tulip has special logic for main thread. > >> In application code we can explicitly know which thread is > >> executed, main or not. > >> But from library it's not easy. > >> Tulip uses check like > >> threading.current_thread().name == 'MainThread' > >> This approach has a problem: thread name is writable attribute and > >> can be changed by user code. > > > > Please at least use: > > > > >>> isinstance(threading.current_thread(), threading._MainThread) > > True > > > > But really, what we need is a threading.main_thread() function. > > What happens when a program fork()s from another thread than the main > thread? AFAIR the other threads are suspended and the forking thread > is the new main thread. Or something similar... Yes. We even support it :-) http://hg.python.org/cpython/file/c347b9063a9e/Lib/test/test_threading.py#l503 (well, whoever wrote that test wanted to support it. I don't think that's me) Regards Antoine. From andrew.svetlov at gmail.com Fri Aug 30 14:51:08 2013 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Fri, 30 Aug 2013 15:51:08 +0300 Subject: [Python-Dev] Add function to signal module for getting main thread id In-Reply-To: <20130830142154.09ab7880@pitrou.net> References: <20130830113937.652d76af@pitrou.net> <20130830142154.09ab7880@pitrou.net> Message-ID: I've made a patch. It works except the scenario described by Christian Heimes.
See details in http://bugs.python.org/issue18882 On Fri, Aug 30, 2013 at 3:21 PM, Antoine Pitrou wrote: > Le Fri, 30 Aug 2013 14:06:11 +0200, > Christian Heimes a écrit : >> Am 30.08.2013 11:39, schrieb Antoine Pitrou: >> > >> > Le Fri, 30 Aug 2013 12:24:07 +0300, >> > Andrew Svetlov a écrit : >> >> Main thread is slightly different from others. >> >> Signals can be subscribed from main thread only. >> >> Tulip has special logic for main thread. >> >> In application code we can explicitly know which thread is >> >> executed, main or not. >> >> But from library it's not easy. >> >> Tulip uses check like >> >> threading.current_thread().name == 'MainThread' >> >> This approach has a problem: thread name is writable attribute and >> >> can be changed by user code. >> > >> > Please at least use: >> > >> > >>> isinstance(threading.current_thread(), threading._MainThread) >> > True >> > >> > But really, what we need is a threading.main_thread() function. >> >> What happens, when a program fork()s from another thread than the main >> thread? AFAIR the other threads are suspended and the forking thread >> is the new main thread. Or something similar... > > Yes. We even support it :-) > http://hg.python.org/cpython/file/c347b9063a9e/Lib/test/test_threading.py#l503 > > (well, whoever wrote that test wanted to support it. I don't think > that's me) > > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/andrew.svetlov%40gmail.com -- Thanks, Andrew Svetlov From cf.natali at gmail.com Fri Aug 30 15:10:49 2013 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Fri, 30 Aug 2013 15:10:49 +0200 Subject: [Python-Dev] EINTR handling... In-Reply-To: References: Message-ID: 2013/8/30 Amaury Forgeot d'Arc : > I agree.
> Is there a way to see in C code where EINTR is not handled? EINTR can be returned on slow syscalls, so a good heuristic would be to start with code that releases the GIL. But I don't see a generic way apart from grepping for syscalls that are documented to return EINTR. > Or a method to handle this systematically? The glibc defines this macro:

# define TEMP_FAILURE_RETRY(expression) \
  (__extension__ \
    ({ long int __result; \
       do __result = (long int) (expression); \
       while (__result == -1L && errno == EINTR); \
       __result; }))
#endif

which you can then use as:

pid = TEMP_FAILURE_RETRY(waitpid(pid, &status, options));

Unfortunately, it's not as easy for us, since we must release the GIL around the syscall, try again if it failed with EINTR, only after having called PyErr_CheckSignals() to run signal handlers. e.g. waitpid():

"""
Py_BEGIN_ALLOW_THREADS
pid = waitpid(pid, &status, options);
Py_END_ALLOW_THREADS
"""

should become (conceptually):

"""
begin_handle_eintr:
Py_BEGIN_ALLOW_THREADS
pid = waitpid(pid, &status, options);
Py_END_ALLOW_THREADS
if (pid < 0 && errno == EINTR) {
    if (PyErr_CheckSignals())
        return NULL;
    goto begin_handle_eintr;
}
"""

We might want to go for a clever macro (like BEGIN_SELECT_LOOP in socketmodule.c). 2013/8/30 Nick Coghlan : > Sounds good to me. I don't believe there's been a conscious decision > that we *shouldn't* handle it, it just hasn't annoyed anyone enough > for them to propose a systematic fix in CPython. If that latter part > is no longer true, great ;) Great, I'll open a bug report then :) cf From status at bugs.python.org Fri Aug 30 18:07:18 2013 From: status at bugs.python.org (Python tracker) Date: Fri, 30 Aug 2013 18:07:18 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20130830160718.E46935691D@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2013-08-23 - 2013-08-30) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message. Issues counts and deltas: open 4182 (+14) closed 26476 (+50) total 30658 (+64) Open issues with patches: 1910 Issues opened (48) ================== #17400: ipaddress should make it easy to identify rfc6598 addresses http://bugs.python.org/issue17400 reopened by ncoghlan #18822: poor proxyval() coverage in test_gdb http://bugs.python.org/issue18822 opened by pitrou #18823: Idle: use pipes instead of sockets to talk with user subproces http://bugs.python.org/issue18823 opened by terry.reedy #18824: Adding LogRecord attribute "traceback" http://bugs.python.org/issue18824 opened by Sworddragon #18825: Making msg optional on logging.exception() and similar variant http://bugs.python.org/issue18825 opened by Sworddragon #18826: reversed() requires a sequence - Could work on any iterator? http://bugs.python.org/issue18826 opened by dstufft #18828: urljoin behaves differently with custom and standard schemas http://bugs.python.org/issue18828 opened by mher.movsisyan #18829: csv produces confusing error message when passed a non-string http://bugs.python.org/issue18829 opened by Thibault.Kruse #18830: Remove duplicates from a result of getclasstree() http://bugs.python.org/issue18830 opened by serhiy.storchaka #18831: importlib.import_module() bypasses builtins.__import__ http://bugs.python.org/issue18831 opened by brett.cannon #18834: Add Clang to distutils to build C/C++ extensions http://bugs.python.org/issue18834 opened by Ryan.Gonzalez #18835: Add aligned memroy variants to the suite of PyMem functions/ma http://bugs.python.org/issue18835 opened by rhettinger #18837: multiprocessing.reduction is undocumented http://bugs.python.org/issue18837 opened by tpievila #18838: The order of interactive prompt and traceback on Windows http://bugs.python.org/issue18838 opened by Drekin #18840: Tutorial recommends pickle module without any warning of insec http://bugs.python.org/issue18840 opened by dstufft #18841: math.isfinite fails with Decimal 
sNAN http://bugs.python.org/issue18841 opened by stevenjd #18842: Add float.is_finite is_nan is_infinite to match Decimal method http://bugs.python.org/issue18842 opened by stevenjd #18843: Py_FatalError (msg=0x7f0e3b373232 "bad leading pad byte") at P http://bugs.python.org/issue18843 opened by mmokrejs #18844: allow weights in random.choice http://bugs.python.org/issue18844 opened by aisaac #18845: 2.7.5-r2: Fatal Python error: Segmentation fault http://bugs.python.org/issue18845 opened by mmokrejs #18848: In unittest.TestResult .startTestRun() and .stopTestRun() meth http://bugs.python.org/issue18848 opened by py.user #18849: Failure to try another name for tempfile when directory with c http://bugs.python.org/issue18849 opened by vlad #18850: xml.etree.ElementTree accepts control chars. http://bugs.python.org/issue18850 opened by maker #18851: subprocess's Popen closes stdout/stderr filedescriptors used i http://bugs.python.org/issue18851 opened by janwijbrand #18852: site.py does not handle readline.__doc__ being None http://bugs.python.org/issue18852 opened by theller #18853: Got ResourceWarning unclosed file when running Lib/shlex.py de http://bugs.python.org/issue18853 opened by vajrasky #18854: is_multipart and walk should document their treatment of 'mess http://bugs.python.org/issue18854 opened by r.david.murray #18855: Inconsistent README filenames http://bugs.python.org/issue18855 opened by madison.may #18856: Added test coverage for calendar print functions http://bugs.python.org/issue18856 opened by madison.may #18857: urlencode of a None value uses the string 'None' http://bugs.python.org/issue18857 opened by Joshua.Johnston #18858: dummy_threading lacks threading.get_ident() equivalent http://bugs.python.org/issue18858 opened by zuo #18859: README.valgrind should mention --with-valgrind http://bugs.python.org/issue18859 opened by tim.peters #18860: Add content manager API to email package http://bugs.python.org/issue18860 opened by r.david.murray 
#18861: Problems with recursive automatic exception chaining http://bugs.python.org/issue18861 opened by Nikratio #18862: Implement __subclasshook__() for Finders and Loaders in import http://bugs.python.org/issue18862 opened by eric.snow #18864: Implementation for PEP 451 (importlib.machinery.ModuleSpec) http://bugs.python.org/issue18864 opened by eric.snow #18870: eval() uses latin-1 to decode str http://bugs.python.org/issue18870 opened by valhallasw #18872: platform.linux_distribution() doesn't recognize Amazon Linux http://bugs.python.org/issue18872 opened by Lorin.Hochstein #18873: "Encoding" detected in non-comment lines http://bugs.python.org/issue18873 opened by Paul.Bonser #18874: Add a new tracemalloc module to trace memory allocations http://bugs.python.org/issue18874 opened by haypo #18875: Automatic insertion of the closing parentheses, brackets, and http://bugs.python.org/issue18875 opened by irdb #18876: Problems with files opened in append mode with io module http://bugs.python.org/issue18876 opened by erik.bray #18877: tkinter askopenfilenames does not work in Windows library fold http://bugs.python.org/issue18877 opened by tegavu #18878: Add support of the 'with' statement to sunau.open. 
http://bugs.python.org/issue18878 opened by serhiy.storchaka #18879: tempfile.NamedTemporaryFile can close the file too early, if n http://bugs.python.org/issue18879 opened by jort.bloem #18880: ssl.SSLSocket shutdown doesn't behave like socket.shutdown http://bugs.python.org/issue18880 opened by zielmicha #18882: Add threading.main_thread() function http://bugs.python.org/issue18882 opened by asvetlov #18885: handle EINTR in the stdlib http://bugs.python.org/issue18885 opened by neologix Most recent 15 issues with no replies (15) ========================================== #18885: handle EINTR in the stdlib http://bugs.python.org/issue18885 #18880: ssl.SSLSocket shutdown doesn't behave like socket.shutdown http://bugs.python.org/issue18880 #18878: Add support of the 'with' statement to sunau.open. http://bugs.python.org/issue18878 #18877: tkinter askopenfilenames does not work in Windows library fold http://bugs.python.org/issue18877 #18875: Automatic insertion of the closing parentheses, brackets, and http://bugs.python.org/issue18875 #18873: "Encoding" detected in non-comment lines http://bugs.python.org/issue18873 #18864: Implementation for PEP 451 (importlib.machinery.ModuleSpec) http://bugs.python.org/issue18864 #18862: Implement __subclasshook__() for Finders and Loaders in import http://bugs.python.org/issue18862 #18860: Add content manager API to email package http://bugs.python.org/issue18860 #18854: is_multipart and walk should document their treatment of 'mess http://bugs.python.org/issue18854 #18853: Got ResourceWarning unclosed file when running Lib/shlex.py de http://bugs.python.org/issue18853 #18849: Failure to try another name for tempfile when directory with c http://bugs.python.org/issue18849 #18842: Add float.is_finite is_nan is_infinite to match Decimal method http://bugs.python.org/issue18842 #18841: math.isfinite fails with Decimal sNAN http://bugs.python.org/issue18841 #18837: multiprocessing.reduction is undocumented 
http://bugs.python.org/issue18837 Most recent 15 issues waiting for review (15) ============================================= #18882: Add threading.main_thread() function http://bugs.python.org/issue18882 #18880: ssl.SSLSocket shutdown doesn't behave like socket.shutdown http://bugs.python.org/issue18880 #18878: Add support of the 'with' statement to sunau.open. http://bugs.python.org/issue18878 #18876: Problems with files opened in append mode with io module http://bugs.python.org/issue18876 #18874: Add a new tracemalloc module to trace memory allocations http://bugs.python.org/issue18874 #18872: platform.linux_distribution() doesn't recognize Amazon Linux http://bugs.python.org/issue18872 #18870: eval() uses latin-1 to decode str http://bugs.python.org/issue18870 #18860: Add content manager API to email package http://bugs.python.org/issue18860 #18856: Added test coverage for calendar print functions http://bugs.python.org/issue18856 #18853: Got ResourceWarning unclosed file when running Lib/shlex.py de http://bugs.python.org/issue18853 #18851: subprocess's Popen closes stdout/stderr filedescriptors used i http://bugs.python.org/issue18851 #18849: Failure to try another name for tempfile when directory with c http://bugs.python.org/issue18849 #18844: allow weights in random.choice http://bugs.python.org/issue18844 #18834: Add Clang to distutils to build C/C++ extensions http://bugs.python.org/issue18834 #18830: Remove duplicates from a result of getclasstree() http://bugs.python.org/issue18830 Top 10 most discussed issues (10) ================================= #17741: event-driven XML parser http://bugs.python.org/issue17741 59 msgs #18843: Py_FatalError (msg=0x7f0e3b373232 "bad leading pad byte") at P http://bugs.python.org/issue18843 33 msgs #18850: xml.etree.ElementTree accepts control chars. 
http://bugs.python.org/issue18850 26 msgs #18643: implement socketpair() on Windows http://bugs.python.org/issue18643 14 msgs #18851: subprocess's Popen closes stdout/stderr filedescriptors used i http://bugs.python.org/issue18851 11 msgs #16853: add a Selector to the select module http://bugs.python.org/issue16853 9 msgs #18870: eval() uses latin-1 to decode str http://bugs.python.org/issue18870 8 msgs #5720: ctime: I don't think that word means what you think it means. http://bugs.python.org/issue5720 7 msgs #18756: os.urandom() fails under high load http://bugs.python.org/issue18756 7 msgs #18808: Thread.join returns before PyThreadState is destroyed http://bugs.python.org/issue18808 7 msgs Issues closed (48) ================== #5876: __repr__ returning unicode doesn't work when called implicitly http://bugs.python.org/issue5876 closed by haypo #8713: multiprocessing needs option to eschew fork() under Linux http://bugs.python.org/issue8713 closed by sbt #10115: Support accept4() for atomic setting of flags at socket creati http://bugs.python.org/issue10115 closed by haypo #11798: Test cases not garbage collected after run http://bugs.python.org/issue11798 closed by asvetlov #12107: TCP listening sockets created without FD_CLOEXEC flag http://bugs.python.org/issue12107 closed by haypo #14914: pysetup installed distribute despite dry run option being spec http://bugs.python.org/issue14914 closed by eric.araujo #14974: rename packaging.pypi to packaging.index http://bugs.python.org/issue14974 closed by eric.araujo #15507: test_subprocess assumes SIGINT is not being ignored. 
http://bugs.python.org/issue15507 closed by gregory.p.smith #16611: Cookie.py does not parse httponly or secure cookie flags http://bugs.python.org/issue16611 closed by r.david.murray #16799: start using argparse.Namespace in regrtest http://bugs.python.org/issue16799 closed by serhiy.storchaka #16946: subprocess: _close_open_fd_range_safe() does not set close-on- http://bugs.python.org/issue16946 closed by haypo #17036: Implementation of the PEP 433: Easier suppression of file desc http://bugs.python.org/issue17036 closed by haypo #17070: PEP 433: Use the new cloexec to improve security and avoid bug http://bugs.python.org/issue17070 closed by haypo #17588: runpy cannot run Unicode path on Windows http://bugs.python.org/issue17588 closed by haypo #17974: Migrate unittest to argparse http://bugs.python.org/issue17974 closed by serhiy.storchaka #18394: cgi.FieldStorage triggers ResourceWarning sometimes http://bugs.python.org/issue18394 closed by brett.cannon #18538: `python -m dis ` should use argparse http://bugs.python.org/issue18538 closed by python-dev #18571: Implementation of the PEP 446: non-inheritable file descriptor http://bugs.python.org/issue18571 closed by haypo #18586: Allow running benchmarks for Python 3 from same directory http://bugs.python.org/issue18586 closed by brett.cannon #18647: re.error: nothing to repeat http://bugs.python.org/issue18647 closed by serhiy.storchaka #18743: References to non-existant "StringIO" module http://bugs.python.org/issue18743 closed by serhiy.storchaka #18757: Fix internal references for concurrent modules http://bugs.python.org/issue18757 closed by serhiy.storchaka #18760: Fix internal doc references for the xml package http://bugs.python.org/issue18760 closed by serhiy.storchaka #18763: subprocess: file descriptors should be closed after preexec_fn http://bugs.python.org/issue18763 closed by neologix #18783: No more refer to Python "long" http://bugs.python.org/issue18783 closed by serhiy.storchaka #18796: Wrong 
documentation of show_code function from dis module http://bugs.python.org/issue18796 closed by ezio.melotti #18798: Typo and unused variables in test fcntl http://bugs.python.org/issue18798 closed by ezio.melotti #18803: Fix more typos in .py files http://bugs.python.org/issue18803 closed by ezio.melotti #18806: socketmodule: fix/improve setipaddr() numeric addresses handli http://bugs.python.org/issue18806 closed by neologix #18807: Allow venv to create copies, even when symlinks are supported http://bugs.python.org/issue18807 closed by python-dev #18817: Got resource warning when running Lib/aifc.py http://bugs.python.org/issue18817 closed by serhiy.storchaka #18827: mistake in the timedelta.total_seconds docstring http://bugs.python.org/issue18827 closed by allo #18832: New regex module degrades re performance http://bugs.python.org/issue18832 closed by ned.deily #18833: Increased test coverage for telnetlib http://bugs.python.org/issue18833 closed by ezio.melotti #18836: Potential race condition in exceptions http://bugs.python.org/issue18836 closed by r.david.murray #18839: Wrong sentence in sys.exit.__doc__ http://bugs.python.org/issue18839 closed by ezio.melotti #18846: python.exe stdout stderr issues again http://bugs.python.org/issue18846 closed by benjamin.peterson #18847: spam http://bugs.python.org/issue18847 closed by ezio.melotti #18863: Encoding a unicode with unicode() and ignoring errors http://bugs.python.org/issue18863 closed by ned.deily #18865: multiprocessing: remove util.pipe()? 
     http://bugs.python.org/issue18865 closed by sbt
#18866: spam
     http://bugs.python.org/issue18866 closed by ezio.melotti
#18867: spam
     http://bugs.python.org/issue18867 closed by ezio.melotti
#18868: Python3 unbuffered stdin
     http://bugs.python.org/issue18868 closed by pitrou
#18869: test suite: faulthandler signal handler is lost
     http://bugs.python.org/issue18869 closed by neologix
#18871: Be more stringent about the test suite
     http://bugs.python.org/issue18871 closed by pitrou
#18881: Can someone try to duplicate corruption on Gentoo?
     http://bugs.python.org/issue18881 closed by tim.peters
#18883: python-3.3.2-r2: Modules/xxlimited.c:17:error: #error Py_LIMIT
     http://bugs.python.org/issue18883 closed by skrah
#18884: python-2.7.5-r3: 40 bytes in 1 blocks are definitely lost
     http://bugs.python.org/issue18884 closed by haypo

From solipsis at pitrou.net Fri Aug 30 19:02:07 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 30 Aug 2013 19:02:07 +0200 Subject: [Python-Dev] EINTR handling... References: Message-ID: <20130830190207.07b43c95@fsol> On Fri, 30 Aug 2013 12:29:12 +0200 Charles-François Natali wrote: > > Furthermore, the stdlib code base is not consistent: some code paths > handle EINTR, e.g. subprocess, multiprocessing, sock_sendall() does > but not sock_send()... > Just grep for EINTR and InterruptedError and you'll be amazed. > > GHC, the JVM and probably other platforms handle EINTR, maybe it's > time for us too? I don't have any precise opinion on this. It's true that we should have a systematic approach, I just don't know if all interfaces should handle EINTR automatically, or only the high-level ones. (for the sake of clarity, I'm fine with either :-)) Regards Antoine.

From solipsis at pitrou.net Fri Aug 30 19:07:44 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 30 Aug 2013 19:07:44 +0200 Subject: [Python-Dev] cpython: Issue #17741: Rename IncrementalParser and its methods.
References: <3cRLDG5Rqdz7LjS@mail.python.org> Message-ID: <20130830190744.7fb2afc1@fsol> Hello, On Fri, 30 Aug 2013 14:51:42 +0200 (CEST) eli.bendersky wrote: > diff --git a/Doc/library/xml.etree.elementtree.rst b/Doc/library/xml.etree.elementtree.rst > --- a/Doc/library/xml.etree.elementtree.rst > +++ b/Doc/library/xml.etree.elementtree.rst > @@ -105,37 +105,42 @@ > >>> root[0][1].text > '2008' > > -Incremental parsing > -^^^^^^^^^^^^^^^^^^^ > +Pull API for asynchronous parsing > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ I think the documentation should use another term than "asynchronous": "non-blocking" or "event-driven" would be fine. "Asynchronous" is loaded with various meanings and connotations which IMO make it the wrong term here. For example, in many contexts "asynchronous" means "does the job behind your back, e.g. in a worker thread". POSIX defines some asynchronous I/O APIs which are ostensibly not the same as non-blocking I/O: http://pubs.opengroup.org/onlinepubs/9699919799/functions/aio_read.html Regards Antoine. From eliben at gmail.com Fri Aug 30 19:23:44 2013 From: eliben at gmail.com (Eli Bendersky) Date: Fri, 30 Aug 2013 10:23:44 -0700 Subject: [Python-Dev] cpython: Issue #17741: Rename IncrementalParser and its methods. 
In-Reply-To: <20130830190744.7fb2afc1@fsol> References: <3cRLDG5Rqdz7LjS@mail.python.org> <20130830190744.7fb2afc1@fsol> Message-ID: On Fri, Aug 30, 2013 at 10:07 AM, Antoine Pitrou wrote: > > Hello, > > On Fri, 30 Aug 2013 14:51:42 +0200 (CEST) > eli.bendersky wrote: > > diff --git a/Doc/library/xml.etree.elementtree.rst > b/Doc/library/xml.etree.elementtree.rst > > --- a/Doc/library/xml.etree.elementtree.rst > > +++ b/Doc/library/xml.etree.elementtree.rst > > @@ -105,37 +105,42 @@ > > >>> root[0][1].text > > '2008' > > > > -Incremental parsing > > -^^^^^^^^^^^^^^^^^^^ > > +Pull API for asynchronous parsing > > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > I think the documentation should use another term than "asynchronous": > "non-blocking" or "event-driven" would be fine. "Asynchronous" is loaded > with various meanings and connotations which IMO make it the wrong term > here. > > For example, in many contexts "asynchronous" means "does the job behind > your back, e.g. in a worker thread". POSIX defines some asynchronous > I/O APIs which are ostensibly not the same as non-blocking I/O: > http://pubs.opengroup.org/onlinepubs/9699919799/functions/aio_read.html > Makes sense. I'll change it to non-blocking, since this doc already uses "blocking" here and there to refer to the opposite effect. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Aug 30 19:57:29 2013 From: guido at python.org (Guido van Rossum) Date: Fri, 30 Aug 2013 10:57:29 -0700 Subject: [Python-Dev] EINTR handling... In-Reply-To: <20130830190207.07b43c95@fsol> References: <20130830190207.07b43c95@fsol> Message-ID: I don't have a strong opinion on this either. The distinction between send() and send_all() makes sense to me though (send_all() works hard to get all your data out, send() only does what it can quickly). Personally for calls like select() I think returning early on EINTR makes sense, it's usually part of a select loop anyway. 
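Guido's send()/send_all() distinction (keep retrying until all the work is done) can be reduced to a small retry loop. The helper below is an illustrative sketch with an invented name, not the socket module's actual implementation; in Python 3, a signal handler runs before InterruptedError reaches the caller, so a handler that raises, like the default SIGINT handler, escapes the loop instead of being swallowed by it:

```python
def retry_on_eintr(func, *args):
    """Call *func* repeatedly until it completes without being
    interrupted by a signal (hypothetical helper, for illustration)."""
    while True:
        try:
            return func(*args)
        except InterruptedError:
            # The Python-level signal handler has already run by the
            # time we get here; if it raised, the exception propagates
            # and the call is not restarted.
            continue
```

A call such as retry_on_eintr(sock.sendall, data) would then survive interruptions by harmless handlers while still letting KeyboardInterrupt abort it.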
The only thing I care deeply about is that you shouldn't restart anything before letting a Python-level signal handler run. A program might well have a Python signal handler that must run before the I/O is restarted, and the signal handler might raise an exception (like the default SIGINT handler, which raises KeyboardInterrupt). On Fri, Aug 30, 2013 at 10:02 AM, Antoine Pitrou wrote: > On Fri, 30 Aug 2013 12:29:12 +0200 > Charles-François Natali wrote: > > > > Furthermore, the stdlib code base is not consistent: some code paths > > handle EINTR, e.g. subprocess, multiprocessing, sock_sendall() does > > but not sock_send()... > > Just grep for EINTR and InterruptedError and you'll be amazed. > > > > GHC, the JVM and probably other platforms handle EINTR, maybe it's > > time for us too? > > I don't have any precise opinion on this. It's true that we should have > a systematic approach, I just don't know if all interfaces should > handle EINTR automatically, or only the high-level ones. > (for the sake of clarity, I'm fine with either :-)) > > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL:

From tjreedy at udel.edu Fri Aug 30 22:16:53 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 30 Aug 2013 16:16:53 -0400 Subject: [Python-Dev] Coverity Scan Spotlight Python In-Reply-To: <52208D89.9050909@python.org> References: <2DF2F3A2-1015-4B2D-9CDF-9C5A98B11275@molden.no> <52208D89.9050909@python.org> Message-ID: On 8/30/2013 8:18 AM, Christian Heimes wrote: > By the way Coverity Scan doesn't understand Python code. It can only > analyze C, C++ and Java code.
Have you (or Coverity) thought about which, if any, of the C defect categories apply to Python? (Assuming no use of ctypes ;-). Would it make any sense to apply their technology to Python code scanning? -- Terry Jan Reedy From rymg19 at gmail.com Sat Aug 31 03:37:54 2013 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Fri, 30 Aug 2013 20:37:54 -0500 Subject: [Python-Dev] cpython: Issue #17741: Rename IncrementalParser and its methods. In-Reply-To: References: <3cRLDG5Rqdz7LjS@mail.python.org> <20130830190744.7fb2afc1@fsol> Message-ID: I still think non-blocking sounds network-related... On Fri, Aug 30, 2013 at 12:23 PM, Eli Bendersky wrote: > > > > On Fri, Aug 30, 2013 at 10:07 AM, Antoine Pitrou wrote: > >> >> Hello, >> >> On Fri, 30 Aug 2013 14:51:42 +0200 (CEST) >> eli.bendersky wrote: >> > diff --git a/Doc/library/xml.etree.elementtree.rst >> b/Doc/library/xml.etree.elementtree.rst >> > --- a/Doc/library/xml.etree.elementtree.rst >> > +++ b/Doc/library/xml.etree.elementtree.rst >> > @@ -105,37 +105,42 @@ >> > >>> root[0][1].text >> > '2008' >> > >> > -Incremental parsing >> > -^^^^^^^^^^^^^^^^^^^ >> > +Pull API for asynchronous parsing >> > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >> >> I think the documentation should use another term than "asynchronous": >> "non-blocking" or "event-driven" would be fine. "Asynchronous" is loaded >> with various meanings and connotations which IMO make it the wrong term >> here. >> >> For example, in many contexts "asynchronous" means "does the job behind >> your back, e.g. in a worker thread". POSIX defines some asynchronous >> I/O APIs which are ostensibly not the same as non-blocking I/O: >> http://pubs.opengroup.org/onlinepubs/9699919799/functions/aio_read.html >> > > Makes sense. I'll change it to non-blocking, since this doc already uses > "blocking" here and there to refer to the opposite effect. 
> > Eli > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/rymg19%40gmail.com > > -- Ryan -------------- next part -------------- An HTML attachment was scrubbed... URL:

From ethan at stoneleaf.us Sat Aug 31 04:32:16 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 30 Aug 2013 19:32:16 -0700 Subject: [Python-Dev] cpython: Issue #17741: Rename IncrementalParser and its methods. In-Reply-To: References: <3cRLDG5Rqdz7LjS@mail.python.org> <20130830190744.7fb2afc1@fsol> Message-ID: <522155B0.7010700@stoneleaf.us> On 08/30/2013 06:37 PM, Ryan Gonzalez wrote: > I still think non-blocking sounds network-related... Sometimes it is. And sometimes it's user-input related, or local-pipeline related. But in all cases it means: return whatever is ready, don't block if nothing is ready. -- ~Ethan~

From steve at pearwood.info Sat Aug 31 04:58:39 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 31 Aug 2013 12:58:39 +1000 Subject: [Python-Dev] PEP 450 adding statistics module In-Reply-To: <520C2E23.40405@pearwood.info> References: <520C2E23.40405@pearwood.info> Message-ID: <52215BDF.2090109@pearwood.info> Hi all, I think that PEP 450 is now ready for a PEP dictator. There have been a number of code reviews, and feedback has been taken into account. The test suite passes. I'm not aware of any unanswered issues with the code. At least two people other than myself think that the implementation is ready for a dictator, and nobody has objected. There is still on-going work on speeding up the implementation of the statistics.sum function, but that will not affect the interface or substantially change the test suite.
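For readers unfamiliar with PEP 450: the module it proposes shipped in Python 3.4 as statistics. A short sketch of the core API, with results checked by hand against the textbook formulas (the data set is arbitrary, chosen only for illustration):

```python
import statistics

data = [1.5, 2.5, 2.5, 2.75, 3.25, 4.75]

# Arithmetic mean: sum(data) / len(data) = 17.25 / 6
print(statistics.mean(data))      # 2.875

# Even-length list: midpoint of the two middle values, (2.5 + 2.75) / 2
print(statistics.median(data))    # 2.625

# Sample variance, using the n - 1 denominator
print(statistics.variance(data))  # 1.16875
```

The internal sum mentioned above matters because these functions compute with exact fractions before converting back to float, avoiding the accumulation error a naive sum() would introduce.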
http://bugs.python.org/issue18606 http://www.python.org/dev/peps/pep-0450/ -- Steven From tjreedy at udel.edu Sat Aug 31 05:06:19 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 30 Aug 2013 23:06:19 -0400 Subject: [Python-Dev] cpython: Issue #17741: Rename IncrementalParser and its methods. In-Reply-To: References: <3cRLDG5Rqdz7LjS@mail.python.org> <20130830190744.7fb2afc1@fsol> Message-ID: On 8/30/2013 9:37 PM, Ryan Gonzalez wrote: > I still think non-blocking sounds network-related... But it isn't ;-). Gui apps routinely use event loops and/or threads or subprocesses to avoid blocking on either user input (which can come from keyboard or mouse) and maybe disk operations and calculations. For instance, if it became possible to run the test suite from Idle, it would need to be non-blocking so one could continue to edit while the tests run (15-20 min on my machine, divided by number of cores). -- Terry Jan Reedy From rdmurray at bitdance.com Sat Aug 31 07:21:35 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Sat, 31 Aug 2013 01:21:35 -0400 Subject: [Python-Dev] Completing the email6 API changes. Message-ID: <20130831052136.42D4B2507F5@webabinitio.net> If you've read my blog (eg: on planet python), you will be aware that I dedicated August to full time email package development. At the beginning of the month I worked out a design proposal for the remaining API additions to the email package, dealing with handling message bodies in a more natural way. I posted this to the email-sig, and got...well, no objections. Barry Warsaw did review it, and told me he had no issues with the overall design, but also had no time for a detailed review. Since one way to see if a design holds together is to document and code it, I decided to go ahead and do so. This resulted in a number of small tweaks, but no major changes. I have at this point completed the coding. 
You can view the whole patch at: http://bugs.python.org/issue18891 which also links to three layered patches that I posted as I went along, if you prefer somewhat smaller patches. I think it would be great if I could check this in for alpha2. Since it is going in as an addition to the existing provisional code, the level of review required is not as high as for non-provisional code, I think. But I would certainly appreciate review from anyone so moved, since I haven't gotten any yet. Of course, if there is serious bikeshedding about the API, I won't make alpha2, but that's fine. The longer term goal, by the way, is to move all of this out of provisional status for 3.5. This code finishes the planned API additions for the email package to bring it fully into the world of Python3 and unicode. It does not "fix" the deep internals, which could be a future development direction (but probably only after the "old" API has been retired, which will take a while). But it does make it so that you can use the email package without having to be a MIME expert. (You can't get away with *no* MIME knowledge, but you no longer have to fuss with the details of the syntax.) To give you the flavor of how the entire new provisional API plays together, here's how you can build a complete message in your application:

from email.message import MIMEMessage
from email.headerregistry import Address
fullmsg = MIMEMessage()
fullmsg['To'] = Address('Foö Bar', 'fbar at example.com')
fullmsg['From'] = "mè "
fullmsg['Subject'] = "j'ai un problème de python."
fullmsg.set_content("et la il est monté sur moi et il commence"
                    " a m'étouffer.")
htmlmsg = MIMEMessage()
htmlmsg.set_content("
<p>et la il est monté sur moi et il commence" " a m'étouffer.</p>
", subtype='html')
with open('python.jpg', 'rb') as python:
    htmlmsg.add_related(python.read(), 'image', 'jpg', cid='image1',
                        disposition='inline')
fullmsg.make_alternative()
fullmsg.attach(htmlmsg)
with open('police-report.txt') as report:
    fullmsg.add_attachment(report.read(), filename='pölice-report.txt',
                           params=dict(wrap='flow'),
                           headers=('X-Secret-Level: top',
                                    'X-Authorization: Monty'))

Which results in:

>>> for line in bytes(fullmsg).splitlines():
...     print(line)
b'To: =?utf-8?q?Fo=C3=B6?= Bar '
b'From: =?utf-8?q?m=C3=A8?= '
b"Subject: j'ai un =?utf-8?q?probl=C3=A8me?= de python."
b'MIME-Version: 1.0'
b'Content-Type: multipart/mixed; boundary="===============1710006838=="'
b''
b'--===============1710006838=='
b'Content-Type: multipart/alternative; boundary="===============1811969196=="'
b''
b'--===============1811969196=='
b'Content-Type: text/plain; charset="utf-8"'
b'Content-Transfer-Encoding: 8bit'
b''
b"et la il est mont\xc3\xa9 sur moi et il commence a m'\xc3\xa9touffer."
b''
b'--===============1811969196=='
b'MIME-Version: 1.0'
b'Content-Type: multipart/related; boundary="===============1469657937=="'
b''
b'--===============1469657937=='
b'Content-Type: text/html; charset="utf-8"'
b'Content-Transfer-Encoding: quoted-printable'
b''
b"
<p>et la il est mont=C3=A9 sur moi et il commence a m'=C3=A9touffer.</p>"
b''
b'--===============1469657937=='
b'MIME-Version: 1.0'
b'Content-Type: image/jpg'
b'Content-Transfer-Encoding: base64'
b'Content-Disposition: inline'
b'Content-ID: image1'
b''
b'ZmFrZSBpbWFnZSBkYXRhCg=='
b''
b'--===============1469657937==--'
b'--===============1811969196==--'
b'--===============1710006838=='
b'MIME-Version: 1.0'
b'X-Secret-Level: top'
b'X-Authorization: Monty'
b'Content-Transfer-Encoding: 7bit'
b'Content-Disposition: attachment; filename*=utf-8''p%C3%B6lice-report.txt"
b'Content-Type: text/plain; charset="utf-8"; wrap="flow"'
b''
b'il est sorti de son vivarium.'
b''
b'--===============1710006838==--'

If you've used the email package enough to be annoyed by it, you may notice that there are some nice things going on there, such as using CTE 8bit for the text part by default, and quoted-printable instead of base64 for utf8 when the lines are long enough to need wrapping. (Hmm. Looking at that I see I didn't fully fix a bug I had meant to fix: some of the parts have a MIME-Version header that they don't need.) All input strings are unicode, and the library takes care of doing whatever encoding is required. When you pull data out of a parsed message, you get unicode, without having to worry about how to decode it yourself. On the parsing side, after the above message has been parsed into a message object, we can do:

>>> print(fullmsg['to'], fullmsg['from'])
Foö Bar <"fbar at example.com"> mè
>>> print(fullmsg['subject'])
j'ai un problème de python.
>>> print(fullmsg['to'].addresses[0].display_name)
Foö Bar
>>> print(fullmsg.get_body(('plain',)).get_content())
et la il est monté sur moi et il commence a m'étouffer.
>>> for part in fullmsg.get_body().iter_parts():
...     print(part.get_content())
<p>et la il est monté sur moi et il commence a m'étouffer.</p>
b'fake image data\n'
>>> for attachment in fullmsg.iter_attachments():
...     print(attachment.get_content())
...     print(attachment['Content-Type'].params())
il est sorti de son vivarium.
{'charset': 'utf-8', 'wrap': 'flow'}

Of course, in a real program you'd actually be checking the mime types via get_content_type() and friends before getting the content and doing anything with it. Please read the new contentmanager module docs in the patch for full details of the content management part of the above API (and the headerregistry docs if you want to review the (new in 3.3) header parsing part of the above API). Feedback welcome, here or on the issue. --David PS: python jokes courtesy of someone doing a drive-by on #python-dev the other day.

From stephen at xemacs.org Sat Aug 31 11:57:56 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 31 Aug 2013 18:57:56 +0900 Subject: [Python-Dev] Completing the email6 API changes. In-Reply-To: <20130831052136.42D4B2507F5@webabinitio.net> References: <20130831052136.42D4B2507F5@webabinitio.net> Message-ID: <87bo4echcr.fsf@uwakimon.sk.tsukuba.ac.jp> R. David Murray writes: > But I would certainly appreciate review from anyone so moved, since I > haven't gotten any yet. I'll try to make time for a serious (but obviously partial) review by Monday. I don't know if this is "serious" bikeshedding, but I have a comment or two on the example:

> from email.message import MIMEMessage
> from email.headerregistry import Address
> fullmsg = MIMEMessage()
> fullmsg['To'] = Address('Foö Bar', 'fbar at example.com')
> fullmsg['From'] = "mè "
> fullmsg['Subject'] = "j'ai un problème de python."

This is very nice! *I* *love* it. But (sorry!) I worry that it's not obvious to "naive" users. Maybe it would be useful to have a Message() factory which has one semantic difference from MIMEMessage: it "requires" RFC 822-required headers (or optionally RFC 1036 for news).
Eg:

# This message will be posted and mailed
# These would conform to the latest Draft Standards
# and be DKIM-signed
fullmsg = Message('rfc822', 'rfc1036', 'dmarc')

I'm not sure how "required" would be implemented (perhaps through a .validate() method). So the signature of the API suggested above is Message(*validators, **kw). For MIMEMessage, I think I prefer the name "MIMEPart". To naive users, the idea of MIMEMessages containing MIMEMessages is a bit disconcerting, except in the case of multipart/digest, I think.

> fullmsg.set_content("et la il est monté sur moi et il commence"
>                     " a m'étouffer.")
> htmlmsg = MIMEMessage()
> htmlmsg.set_content("
<p>et la il est monté sur moi et il commence" > " a m'étouffer.</p>
", > subtype='html') I think I'd like to express the suite above as fullmsg.payload.add_alternative(...) fullmsg.payload.add_alternative(..., subtype='html') This would automatically convert the MIME type of fullmsg to 'multipart/alternative', and .payload to a list where necessary. .set_content() would be available but it's "dangerous" (it could replace an arbitrary multipart -- this would be useful operation to replace it with a textual URL or external-body part). Aside: it occurs to me that the .payload attribute (and other such attributes) could be avoided by the device of using names prefixed by ":" such as ":payload" as keys: "fullmsg[':payload']" since colon is illegal in field names (cf RFC 5322). Probably I've just been writing too much Common Lisp, though. I'm not sure whether "payload" is a better name than "content" for that attribute. Now the suite > with open('python.jpg', 'rb') as python: > htmlmsg.add_related(python.read(), 'image', 'jpg', cid='image1' > disposition='inline') > fullmsg.make_alternative() > fullmsg.attach(htmlmsg) becomes just with open('python.jpg', 'rb') as python: fullmsg.payload['text/html'].add_related(...) At this point, "fullmsg.add_related()" without the .payload attribute would be an error, unless a "insertPart=True" keyword argument were present. With "insertPart=True", a new top-level multipart/related would be interposed with the existing multipart/alternative as its first child, and the argument of add_related as the second. Maybe that's too complicated, but I suspect it's harder for people who think of MIME messages as trees, than for people who think of messages as documents and don't want to hear about mimes other than Marcel Marceau. The indexing of the .payload attribute by part type is perhaps too smart for my own good, haven't thought carefully about it. It's plausible, though, since a message with multiple parts of the same type can only have one displayed -- normally that shouldn't happen. 
OTOH, this wouldn't work without modification for multipart/mixed or multipart/related. Could use Yet Another Keyword Argument, maybe. (BTW, it's really annoying when the text/plain part refers to images etc that are attached only to the text/html part. AFAICT from RFC 2387 it ought to be possible to put the multipart/related part at the top so both text/html and text/plain can refer to it.) > with open('police-report.txt') as report: > fullmsg.add_attachment(report.read(), filename='p??lice-report.txt', > params=dict(wrap='flow'), headers=( > 'X-Secret-Level: top', > 'X-Authorization: Monty')) I can't find an RFC that specifies a "wrap" parameter to text/plain. Do you mean RFC 2646 'format="flowed"' here? (A "validate" method could raise a warning on unregistered parameters.) > (Hmm. Looking at that I see I didn't fully fix a bug I had meant to fix: > some of the parts have a MIME-Version header that don't need it.) Another reason why the top-level part should be treated differently in the API. Steve From steve at pearwood.info Sat Aug 31 12:37:30 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 31 Aug 2013 20:37:30 +1000 Subject: [Python-Dev] Completing the email6 API changes. In-Reply-To: <20130831052136.42D4B2507F5@webabinitio.net> References: <20130831052136.42D4B2507F5@webabinitio.net> Message-ID: <5221C76A.1080804@pearwood.info> On 31/08/13 15:21, R. David Murray wrote: > If you've read my blog (eg: on planet python), you will be aware that > I dedicated August to full time email package development. [...] The API looks really nice! Thank you for putting this together. A question comes to mind though: > All input strings are unicode, and the library takes care of doing > whatever encoding is required. When you pull data out of a parsed > message, you get unicode, without having to worry about how to decode > it yourself. How well does your library cope with emails where the encoding is declared wrongly? Or no encoding declared at all? 
Conveniently, your email is an example of this. Although it contains non-ASCII characters, it is declared as us-ascii: --===============1633676851== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline which may explain why Stephen Turnbull's reply contains mojibake. -- Steven From stephen at xemacs.org Sat Aug 31 14:21:18 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 31 Aug 2013 21:21:18 +0900 Subject: [Python-Dev] Completing the email6 API changes. In-Reply-To: <5221C76A.1080804@pearwood.info> References: <20130831052136.42D4B2507F5@webabinitio.net> <5221C76A.1080804@pearwood.info> Message-ID: <87a9jycapt.fsf@uwakimon.sk.tsukuba.ac.jp> Steven D'Aprano writes: > which may explain why Stephen Turnbull's reply contains mojibake. Nah. It was already there, I just copied it. Could be my MUA's fault, though; I've tweaked it for Japanese, and it doesn't handle odd combinations well. From stefan_ml at behnel.de Sat Aug 31 16:08:59 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 31 Aug 2013 16:08:59 +0200 Subject: [Python-Dev] Pre-PEP: Redesigning extension modules In-Reply-To: References: Message-ID: *bump* Does this sound like a viable solution? Stefan Stefan Behnel, 25.08.2013 14:36: > Hi, > > thanks for bringing this up. It clearly shows that there is more to this > problem than I initially thought. > > Let me just add one idea that your post gave me. > > PJ Eby, 25.08.2013 06:12: >> My "Importing" package offers lazy imports by creating module objects >> in sys.modules that are a subtype of ModuleType, and use a >> __getattribute__ hook so that trying to use them fires off a reload() >> of the module. > > I wonder if this wouldn't be an approach to fix the reloading problem in > general. 
What if extension module loading, at least with the new scheme, > didn't return the module object itself and put it into sys.modules but > created a wrapper that redirects its __getattr__ and __setattr__ to the > actual module object? That would have a tiny performance impact on > attribute access, but I'd expect that to be negligible given that the usual > reason for the extension module to exist is that it does non-trivial stuff > in whatever its API provides. Reloading could then really create a > completely new module object and replace the reference inside of the wrapper. > > That way, code that currently uses "from extmodule import xyz" would > continue to see the original version of the module as of the time of its > import, and code that just did "import extmodule" and then used attribute > access at need would always see the current content of the module as it was > last loaded. I think that, together with keeping module global state in the > module object itself, would nicely fix both cases. > > Stefan From rdmurray at bitdance.com Sat Aug 31 16:13:49 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Sat, 31 Aug 2013 10:13:49 -0400 Subject: [Python-Dev] Completing the email6 API changes. In-Reply-To: <5221C76A.1080804@pearwood.info> References: <20130831052136.42D4B2507F5@webabinitio.net> <5221C76A.1080804@pearwood.info> Message-ID: <20130831141349.960CF2507F5@webabinitio.net> On Sat, 31 Aug 2013 20:37:30 +1000, Steven D'Aprano wrote: > On 31/08/13 15:21, R. David Murray wrote: > > If you've read my blog (eg: on planet python), you will be aware that > > I dedicated August to full time email package development. > [...] > > > The API looks really nice! Thank you for putting this together. Thanks. > A question comes to mind though: > > > All input strings are unicode, and the library takes care of doing > > whatever encoding is required. 
When you pull data out of a parsed > > message, you get unicode, without having to worry about how to decode > > it yourself. > > How well does your library cope with emails where the encoding is declared wrongly? Or no encoding declared at all? It copes as best it can :) The bad bytes are preserved (unless you modify a part) but are returned as the "unknown character" in a string context. You can get the original bytes out by using the bytes access interface. (There are probably some places where how to do that isn't clear in the current API, but basically either you use BytesGenerator or you drop down to a lower level API.) An attempt is made to interpret "bad bytes" as utf-8, before giving up and replacing them with the 'unknown character' character. I'm not 100% sure that is a good idea. > Conveniently, your email is an example of this. Although it contains non-ASCII characters, it is declared as us-ascii: Oh, yeah, my MUA is a little quirky and I forgot the step that would have made that correct. Wanting to rewrite it is one of the reasons I embarked on this whole email thing a few years ago :) --David

From rdmurray at bitdance.com Sat Aug 31 16:23:51 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Sat, 31 Aug 2013 10:23:51 -0400 Subject: [Python-Dev] Completing the email6 API changes. In-Reply-To: <87bo4echcr.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20130831052136.42D4B2507F5@webabinitio.net> <87bo4echcr.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20130831142352.600462507F5@webabinitio.net> On Sat, 31 Aug 2013 18:57:56 +0900, "Stephen J. Turnbull" wrote: > R. David Murray writes: > > > But I would certainly appreciate review from anyone so moved, since I > > haven't gotten any yet. > > I'll try to make time for a serious (but obviously partial) review by > Monday. Thanks.
> I don't know if this is "serious" bikeshedding, but I have a comment > or two on the example: Yeah, you engaged in some serious bikeshedding there ;) I like the idea of a top level part that requires the required headers, and I agree that MIMEPart is better than MIMEMessage for that class. Full validation is something that is currently a "future objective". There's infrastructure to do it, but not all of the necessary knowledge has been coded in yet. I take your point about the relationship between related and alternative not being set in stone. I'll have to think through the consequences of that, but I think it is just a matter of removing a couple error checks and updating the documentation. I'll also have to sit and think through your other ideas (the more extensive bikeshedding :) before I can comment, and I'm heading out to take my step-daughter to her freshman year of college, so I won't be able to do thorough responses until tomorrow. --David From stephen at xemacs.org Sat Aug 31 17:18:59 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sun, 01 Sep 2013 00:18:59 +0900 Subject: [Python-Dev] Completing the email6 API changes. In-Reply-To: <20130831142352.600462507F5@webabinitio.net> References: <20130831052136.42D4B2507F5@webabinitio.net> <87bo4echcr.fsf@uwakimon.sk.tsukuba.ac.jp> <20130831142352.600462507F5@webabinitio.net> Message-ID: <878uzhdh24.fsf@uwakimon.sk.tsukuba.ac.jp> R. David Murray writes: > Full validation is something that is currently a "future > objective". I didn't mean it to be anything else. :-) > There's infrastructure to do it, but not all of the necessary knowledge > has been coded in yet. Well, I assume you already know that there's no way that can ever happen (at least until we abandon messaging entirely): new RFCs will continue to be published. So it needs to be an extensible mechanism, a "pipeline" of checks (Barry would say a "chain of rules", I think). Enjoy your trip! 
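Stephen's extensible "pipeline of checks" for header validation could be sketched as composable rule functions. All names below (make_validator, require) are invented for illustration and are not part of the email package; the point is only that a new RFC becomes one more check appended to the chain, not a rewrite of the validator:

```python
def require(*names):
    """One rule in the chain: the given header fields must be present."""
    def check(headers):
        return ["missing header: " + n for n in names if n not in headers]
    return check

def make_validator(*checks):
    """Compose rules into a single validate() callable that collects
    every problem reported by every rule, in order."""
    def validate(headers):
        problems = []
        for check in checks:
            problems.extend(check(headers))
        return problems
    return validate

# A message profile built from two hypothetical rule sets
validate = make_validator(require('From', 'Date'), require('Subject'))
print(validate({'From': 'a at example.com'}))
# ['missing header: Date', 'missing header: Subject']
```

A .validate() method on the message class could then simply delegate to whichever chain of rules the message was constructed with.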
From ncoghlan at gmail.com Sat Aug 31 18:19:48 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 1 Sep 2013 02:19:48 +1000 Subject: [Python-Dev] Pre-PEP: Redesigning extension modules In-Reply-To: References: Message-ID: On 1 Sep 2013 00:10, "Stefan Behnel" wrote: > > *bump* > > Does this sound like a viable solution? This isn't likely to progress until we have Eric's PEP 451 to a point where it's ready for python-dev discussion and pronouncement. However, the revised loader API is being designed to allow for the loader returning arbitrary objects, so something along these lines should work. There will likely be some adjustments to the API signature to allow extension modules to optionally support reloading if they so desire. Cheers, Nick. > > Stefan > > > Stefan Behnel, 25.08.2013 14:36: > > Hi, > > > > thanks for bringing this up. It clearly shows that there is more to this > > problem than I initially thought. > > > > Let me just add one idea that your post gave me. > > > > PJ Eby, 25.08.2013 06:12: > >> My "Importing" package offers lazy imports by creating module objects > >> in sys.modules that are a subtype of ModuleType, and use a > >> __getattribute__ hook so that trying to use them fires off a reload() > >> of the module. > > > > I wonder if this wouldn't be an approach to fix the reloading problem in > > general. What if extension module loading, at least with the new scheme, > > didn't return the module object itself and put it into sys.modules but > > created a wrapper that redirects its __getattr__ and __setattr__ to the > > actual module object? That would have a tiny performance impact on > > attribute access, but I'd expect that to be negligible given that the usual > > reason for the extension module to exist is that it does non-trivial stuff > > in whatever its API provides. Reloading could then really create a > > completely new module object and replace the reference inside of the wrapper. 
> > > > That way, code that currently uses "from extmodule import xyz" would > > continue to see the original version of the module as of the time of its > > import, and code that just did "import extmodule" and then used attribute > > access at need would always see the current content of the module as it was > > last loaded. I think that, together with keeping module global state in the > > module object itself, would nicely fix both cases. > > > > Stefan > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Sat Aug 31 18:46:41 2013 From: greg at krypto.org (Gregory P. Smith) Date: Sat, 31 Aug 2013 09:46:41 -0700 Subject: [Python-Dev] EINTR handling... In-Reply-To: References: <20130830190207.07b43c95@fsol> Message-ID: On Fri, Aug 30, 2013 at 10:57 AM, Guido van Rossum wrote: > I don't have a strong opinion on this either. The distinction between > send() and send_all() makes sense to me though (send_all() works hard to > get all your data out, send() only does what it can quickly). > > Personally for calls like select() I think returning early on EINTR makes > sense, it's usually part of a select loop anyway. > > The only thing I care deeply about is that you shouldn't restart anything > before letting a Python-level signal handler run. A program might well have > a Python signal handler that must run before the I/O is restarted, and the > signal handler might raise an exception (like the default SIGINT handler, > which raises KeyboardInterrupt). > I see http://bugs.python.org/issue18885 has been filed to track this discussion so we should probably move it there (I've added comments). 
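[Editor's sketch] Guido's requirement, that a Python-level signal handler must get a chance to run (and possibly raise) before any restart, is exactly what a Python-level retry loop provides, since the handler fires before the exception is even raised. A minimal sketch of the pattern, demonstrated with a stand-in function rather than a real syscall:

```python
def retry_on_eintr(func, *args):
    """Retry an interruptible call at the Python level.

    By the time InterruptedError reaches Python code, any Python signal
    handler has already run; if it raised (like the default SIGINT
    handler raising KeyboardInterrupt), that exception propagates
    instead of the call being silently restarted.
    """
    while True:
        try:
            return func(*args)
        except InterruptedError:
            continue  # EINTR and the handler returned normally: retry

# Stand-in that fails twice with EINTR, then succeeds.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise InterruptedError  # what a C-level EINTR surfaces as in Python 3
    return "done"

print(retry_on_eintr(flaky))  # done
print(attempts["n"])          # 3
```

Note that such a blind retry is only safe for calls that can be reissued with identical arguments; a select() timeout, for instance, would need to be recomputed before retrying.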
TL;DR you can't simply retry a system call with the exact same arguments when you receive an EINTR. There are some system calls for which that will not do what the programmer intended. > > > On Fri, Aug 30, 2013 at 10:02 AM, Antoine Pitrou wrote: > >> On Fri, 30 Aug 2013 12:29:12 +0200 >> Charles-François Natali wrote: >> > >> > Furthermore, the stdlib code base is not consistent: some code paths >> > handle EINTR, e.g. subprocess, multiprocessing, sock_sendall() does >> > but not sock_send()... >> > Just grep for EINTR and InterruptedError and you'll be amazed. >> > >> > GHC, the JVM and probably other platforms handle EINTR, maybe it's >> > time for us too? >> >> I don't have any precise opinion on this. It's true that we should have >> a systematic approach, I just don't know if all interfaces should >> handle EINTR automatically, or only the high-level ones. >> (for the sake of clarity, I'm fine with either :-)) >> >> Regards >> >> Antoine. >> >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> http://mail.python.org/mailman/options/python-dev/guido%40python.org >> > > > > -- > --Guido van Rossum (python.org/~guido) > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/greg%40krypto.org > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Aug 31 18:49:56 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 1 Sep 2013 02:49:56 +1000 Subject: [Python-Dev] Pre-PEP: Redesigning extension modules In-Reply-To: References: <20130823111822.49cba700@pitrou.net> Message-ID: Oops, I had a draft from a few days ago that I was interrupted before sending. Finished editing the parts I believe are still relevant.
On 25 Aug 2013 21:56, "Stefan Behnel" wrote: > > Nick Coghlan, 24.08.2013 23:43: > > On 25 Aug 2013 01:44, "Stefan Behnel" wrote: > >> Nick Coghlan, 24.08.2013 16:22: > >>> The new _PyImport_CreateAndExecExtensionModule function does the heavy > >>> lifting: > >>> > >>> https://bitbucket.org/ncoghlan/cpython_sandbox/src/081f8f7e3ee27dc309463b48e6c67cf4880fca12/Python/importdl.c?at=new_extension_imports#cl-65 > >>> > >>> One key point to note is that it *doesn't* call > >>> _PyImport_FixupExtensionObject, which is the API that handles all the > >>> PEP 3121 per-module state stuff. Instead, the idea will be for modules > >>> that don't need additional C level state to just implement > >>> PyImportExec_NAME, while those that *do* need C level state implement > >>> PyImportCreate_NAME and return a custom object (which may or may not > >>> be a module subtype). > >> > >> Is it really a common case for an extension module not to need any C level > >> state at all? I mean, this might work for very simple accelerator modules > >> with only a few stand-alone functions. But anything non-trivial will > >> almost > >> certainly have some kind of global state, cache, external library, etc., > >> and that state is best stored at the C level for safety reasons. In my experience, most extension authors aren't writing high performance C accelerators, they're exposing an existing C API to Python. It's the cffi use case rather than the Cython use case. My primary experience of C extensions is with such wrapper modules, and for those, the exec portion of the new API is exactly what you want. The components of the wrapper module don't share global state, they just translate between Python and a pre-existing externally stateless C API. For that use case, a precreated module to populate with types and functions is exactly what you want to keep things simple and stateless at the C level. 
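[Editor's sketch] In pure-Python terms, the create/exec split being discussed corresponds roughly to the following two-phase protocol; this is a sketch of the concept, not the actual C API, though it is essentially the shape that PEP 451 later standardized for loaders as create_module()/exec_module():

```python
import types

# Phase 1: "create" is only needed when the module wants a custom object
# carrying its own per-instance state (the analogue of C struct fields).
class StatefulModule(types.ModuleType):
    def __init__(self, name):
        super().__init__(name)
        self.cache = {}  # per-module-instance state

def create_module(name):
    return StatefulModule(name)

# Phase 2: "exec" populates an already-created (or pre-created) module.
# A stateless wrapper module would implement only this phase and accept
# whatever module object the import system hands it.
def exec_module(mod):
    def lookup(key):
        return mod.cache.get(key)
    mod.lookup = lookup

mod = create_module("example")
exec_module(mod)
mod.cache["x"] = 1
print(mod.lookup("x"))  # 1
```

The wrapper-module case Nick describes would skip phase 1 entirely and just populate the precreated module in phase 2.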
> > I'd prefer to encourage people to put that state on an exported *type* > > rather than directly in the module global state. So while I agree we need > > to *support* C level module globals, I'd prefer to provide a simpler > > alternative that avoids them. > > But that has an impact on the API then. Why do you want the users of an > extension module to go through a separate object (even if it's just a > singleton, for example) instead of going through functions at the module > level? We don't currently encourage or propose this design for Python > modules either. Quite the contrary, it's extremely common for Python > modules to provide most of their functionality at the function level. And > IMHO that's a good thing. Mutable module global state is always a recipe for obscure bugs, and not something I will ever let through code review without a really good rationale. Hidden process global state is never good, just sometimes a necessary evil. However, keep in mind my patch is currently just the part I can implement without PEP 451 module spec objects. Once those are available, then I can implement the initial hook that supports returning a completely custom object. > Note that even global functions usually hold state, be it in the form of > globally imported modules, global caches, constants, ... If they can be shared safely across multiple instances of the module (e.g. immutable constants), then these can be shared at the C level. Otherwise, a custom Python type will be needed to make them instance specific. > > We also need the create/exec split to properly support reloading. Reload > > *must* reinitialize the object already in sys.modules instead of inserting > > a different object or it completely misses the point of reloading modules > > over deleting and reimporting them (i.e. implicitly affecting the > > references from other modules that imported the original object). > > Interesting. I never thought of it that way. 
> > I'm not sure this can be done in general. What if the module has threads > running that access the global state? In that case, reinitialising the > module object itself would almost certainly lead to a crash. My current proposal on import-sig is to make the first hook "prepare_module", and pass in the existing object in the reload case. For the extension loader, this would be reflected in the signature of the C level hook as well, so the module could decide for itself if it supported reloading. > And what if you do "from extmodule import some_function" in a Python > module? Then reloading couldn't replace that reference, just as for normal > Python modules. Meaning that you'd still have to keep both modules properly > alive in order to prevent crashes due to lost global state of the imported > function. > > The difference to Python modules here is that in Python code, you'll get > some kind of exception if state is lost during a reload. In C code, you'll > most likely get a crash. Agreed. This is actually my primary motivation for trying to improve the "can this be reloaded or not?" aspects of the loader API in PEP 451. > > How would you even make sure global state is properly cleaned up? Would you > call tp_clear() on the module object before re-running the init code? Or > how else would you enable the init code to do the right thing during both > the first run (where global state is uninitialised) and subsequent runs > (where global state may hold valid state and owned Python references)? Up to the module. For Python modules, we just blindly overwrite things and let the GC sort it out. (keep in mind existing extension modules using the existing API will still never be reloaded) > > Even tp_clear() may not be enough, because it's only meant to clean up > Python references, not C-level state. 
Basically, for reloading to be > correct without changing the object reference, it would have to go all the > way through tp_dealloc(), catch the object at the very end, right before it > gets freed, and then re-initialise it. > > This sounds like we need some kind of indirection (as you mentioned above), > but without the API impact that a separate type implies. Simply making > modules an arbitrary extension type, as I proposed, cannot solve this. > > (Actually, my intuition tells me that if it can't really be made to work > 100% for Python modules, e.g. due to the from-import case, why bother with > it for extension types?) To fix testing the C implementation of etree using the same model we use for other extension modules (that's loading a second copy rather than reloading in place, but the problems are related). > > > >>> Such modules can still support reloading (e.g. > >>> to pick up reloaded or removed module dependencies) by providing > >>> PyImportExec_NAME as well. > >>> > >>> (in a PEP 451 world, this would likely be split up as two separate > >>> functions, one for create, one for exec) > >> > >> Can't we just always require extension modules to implement their own > >> type? > >> Sure, it's a lot of boiler plate code, but that could be handled by a > >> simple C code generator or maybe even a copy&paste example in the docs. I > >> would like to avoid making it too easy for users in the future to get > >> anything wrong with reloading or sub-interpreters. Most people won't test > >> these things for their own code and the harder it is to make them not > >> work, > >> the more likely it is that a given set of dependencies will properly work > >> in a sub-interpreter. > >> > >> If users are required to implement their own type, I think it would be > >> more > >> obvious where to put global module state, how to define functions (i.e. > >> module methods), how to handle garbage collection at the global module > >> level, etc. 
> > > > Take a look at the current example - everything gets stored in the module > > dict for the simple case with no C level global state. > > Well, you're storing types there. And those types are your module API. I > understand that it's just an example, but I don't think it matches a common > case. As far as I can see, the types are not even interacting with each > other, let alone doing any C-level access of each other. We should try to > focus on the normal case that needs C-level state and C-level field access > of extension types. Once that's solved, we can still think about how to > make the really simple cases simpler, if it turns out that they are not > simple enough. Our experience is very different - my perspective is that the normal case either eschews C level global state in the extension module, because it causes so many problems, or else just completely ignores subinterpreter support and proper module cleanup. > Keeping everything in the module dict is a design that (IMHO) is too error > prone. C state should be kept safely at the C level, outside of the reach > of Python code. I don't want users of my extension module to be able to > provoke a crash by saying "extmodule._xyz = None". So don't have global state in the *extension module*, then, keep it in the regular C/C++ modules. (And don't use the exec-only approach if you do have significant global state in the extension). > I didn't know about PyType_FromSpec(), BTW. It looks like a nice addition > for manually written code (although useless for Cython). This is the only way to create custom types when using the stable ABI. Can I take your observation to mean that Cython doesn't currently offer the option of limiting itself to the stable ABI? Cheers, Nick. 
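[Editor's sketch] Stefan's concern about module-dict state can be illustrated in pure Python: state reachable only through the module dict can be clobbered from outside, while state captured at a lower level (a closure here, standing in for C-level struct fields) cannot. Illustrative sketch only:

```python
import types

mod = types.ModuleType("extmodule")

# State exposed via the module dict: any user can overwrite it.
mod._xyz = {"ready": True}
def uses_dict_state():
    return mod._xyz["ready"]
mod.uses_dict_state = uses_dict_state

# State captured privately; the closure plays the role of C-level state.
def make_api():
    _state = {"ready": True}
    def uses_private_state():
        return _state["ready"]
    return uses_private_state
mod.uses_private_state = make_api()

mod._xyz = None  # the user interference Stefan worries about
try:
    mod.uses_dict_state()
except TypeError:
    print("dict-held state clobbered")  # the crash analogue
print(mod.uses_private_state())         # True: private state unaffected
```

In C the failure mode is a segfault rather than a TypeError, which is why the placement of state matters so much more there.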
> > Stefan > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Sat Aug 31 19:09:09 2013 From: greg at krypto.org (Gregory P. Smith) Date: Sat, 31 Aug 2013 10:09:09 -0700 Subject: [Python-Dev] Add a new tracemalloc module to trace memory allocations In-Reply-To: References: Message-ID: First, I really like this. +1 On Wed, Aug 28, 2013 at 6:07 PM, Victor Stinner wrote: > 2013/8/29 Victor Stinner : > > My proposed implementation for Python 3.4 is different: > > > > * no enable() / disable() function: tracemalloc can only be enabled > > before startup by setting PYTHONTRACEMALLOC=1 environment variable > > > > * traces (size of the memory block, Python filename, Python line > > number) are stored directly in the memory block, not in a separate > > hash table > > > > I chose PYTHONTRACEMALLOC env var instead of enable()/disable() > > functions to be able to really trace *all* memory allocated by Python, > > especially memory allocated at startup, during Python initialization. > > I'm not sure that having to set an environment variable is the most > convenient option, especially on Windows. > > Storing traces directly into memory blocks should use less memory, but > it requires to start tracemalloc before the first memory allocation. > It is possible to add again enable() and disable() methods to > dynamically install/uninstall the hook on memory allocators. I solved > this issue in the current implementation by using a second hash table > (pointer => trace). > > We can keep the environment variable, like PYTHONFAULTHANDLER which > enables faulthandler at startup. faulthandler also has a command line > option: -X faulthandler. We may add -X tracemalloc.
> We should be consistent with faulthandler's options. Why do you not want to support both the env var and enable()/disable() functions? Users are likely to want snapshots captured by enable()/disable() around particular pieces of code just as much as whole-program information. Think of the possibilities: you could even set up a test runner to enable/disable before and after each test, test suite or test module to gather narrow statistics as to what code actually _caused_ the allocations rather than the ultimate individual file/line doing it. Taking that further: file and line information is great, but what if you extend the concept: could you allow for C API or even Python hooks to gather additional information at the time of each allocation or free? for example: Gathering the actual C and Python stack traces for correlation to figure out what call patterns lead to allocations is powerful. (Yes, this gets messy fast as hooks should not trigger calls back into themselves when they allocate or free, similar to the "fun" involved in writing coverage tools) let me know if you think i'm crazy. :) -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From cf.natali at gmail.com Sat Aug 31 19:30:20 2013 From: cf.natali at gmail.com (Charles-François Natali) Date: Sat, 31 Aug 2013 19:30:20 +0200 Subject: [Python-Dev] Add a new tracemalloc module to trace memory allocations In-Reply-To: References: Message-ID: 2013/8/29 Victor Stinner : > Charles-François Natali and Serhiy Storchaka asked me to add this > module somewhere in Python 3.4: "how about adding pyfailmalloc to the > main repo (maybe under Tools), with a script making it easy to run the > test suite with it enabled?"
There are two reasons I think it would be a great addition: - since OOM conditions are - almost - never tested, the OOM handling code is - almost - always incorrect: indeed, Victor has found and fixed several dozen crashes thanks to this module - this module is actually really simple (~150 LOC) I have two comments on the API: 1) failmalloc.enable(range: int=1000): schedule a memory allocation failure in random.randint(1, range) allocations. That's one shot, i.e. only one failure will be triggered. So if this failure occurs in a place where the code is prepared to handle MemoryError (e.g. bigmem tests), no failure will occur in the remaining test. It would be better IMO to repeat this (i.e. reset the next failure counter), to increase the coverage. 2) It's a consequence of 1): since only one malloc() failure is triggered, it doesn't really reflect how an OOM condition would appear in real life: usually, it's either because you've exhausted your address space or the machine is under memory pressure, which means that once you've hit OOM, you're likely to encounter it again on subsequent allocations, for example if your OOM handling code allocates new memory (that's why it's so complicated to properly handle OOM, and one might want to use "memory parachutes"). It might be interesting to be able to pass an absolute maximum memory usage, or an option where once you've triggered a malloc() failure, you record the current memory usage, and use it as a ceiling for subsequent allocations. From ethan at stoneleaf.us Sat Aug 31 19:23:27 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 31 Aug 2013 10:23:27 -0700 Subject: [Python-Dev] error in test suite Message-ID: <5222268F.3060902@stoneleaf.us> Am I the only one experiencing this? 262 tests OK.
93 tests failed: test___all__ test_abc test_array test_ast test_asynchat test_asyncore test_bisect test_buffer test_bufio test_bytes test_codeccallbacks test_codecs test_colorsys test_compileall test_configparser test_contextlib test_crypt test_ctypes test_dbm test_dbm_dumb test_dbm_ndbm test_dictcomps test_enum test_exceptions test_faulthandler test_file test_fileinput test_frozen test_future test_future3 test_future5 test_genericpath test_getargs2 test_getpass test_hash test_hashlib test_heapq test_idle test_imaplib test_imp test_import test_index test_io test_ioctl test_ipaddress test_iterlen test_json test_keyword test_largefile test_locale test_macpath test_multiprocessing_fork test_multiprocessing_forkserver test_multiprocessing_spawn test_namespace_pkgs test_ntpath test_operator test_osx_env test_pdb test_pep352 test_posixpath test_print test_py_compile test_random test_regrtest test_robotparser test_runpy test_sched test_set test_shutil test_site test_smtpd test_sndhdr test_source_encoding test_sqlite test_stat test_strftime test_sundry test_tarfile test_textwrap test_threading test_time test_unicode test_univnewlines test_urllib test_urllib2net test_userstring test_uuid test_warnings test_wave test_webbrowser test_xml_dom_minicompat test_zipfile 24 tests skipped: test_codecmaps_cn test_codecmaps_hk test_codecmaps_jp test_codecmaps_kr test_codecmaps_tw test_curses test_dbm_gnu test_devpoll test_gdb test_kqueue test_lzma test_msilib test_ossaudiodev test_smtpnet test_socketserver test_startfile test_timeout test_tk test_ttk_guionly test_urllibnet test_winreg test_winsound test_xmlrpc_net test_zipfile64 and the failure appears to always be: test [...] 
crashed -- Traceback (most recent call last): File "/home/ethan/source/python/issue18780/Lib/test/regrtest.py", line 1265, in runtest_inner huntrleaks) File "/home/ethan/source/python/issue18780/Lib/test/regrtest.py", line 1381, in dash_R indirect_test() File "/home/ethan/source/python/issue18780/Lib/test/regrtest.py", line 1261, in test_runner = lambda: support.run_unittest(tests) File "/home/ethan/source/python/issue18780/Lib/test/support/__init__.py", line 1683, in run_unittest _run_suite(suite) File "/home/ethan/source/python/issue18780/Lib/test/support/__init__.py", line 1649, in _run_suite result = runner.run(suite) File "/home/ethan/source/python/issue18780/Lib/test/support/__init__.py", line 1548, in run test(result) File "/home/ethan/source/python/issue18780/Lib/unittest/suite.py", line 76, in __call__ return self.run(*args, **kwds) File "/home/ethan/source/python/issue18780/Lib/unittest/suite.py", line 114, in run test(result) File "/home/ethan/source/python/issue18780/Lib/unittest/suite.py", line 76, in __call__ return self.run(*args, **kwds) File "/home/ethan/source/python/issue18780/Lib/unittest/suite.py", line 114, in run test(result) TypeError: 'NoneType' object is not callable From solipsis at pitrou.net Sat Aug 31 19:57:22 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 31 Aug 2013 19:57:22 +0200 Subject: [Python-Dev] error in test suite References: <5222268F.3060902@stoneleaf.us> Message-ID: <20130831195722.591558dd@fsol> On Sat, 31 Aug 2013 10:23:27 -0700 Ethan Furman wrote: > Am I the only one experiencing this? http://bugs.python.org/issue11798 perhaps? Regards Antoine. > 262 tests OK. 
> [remainder of the quoted test report and traceback snipped; quoted verbatim from Ethan's message above] From andrew.svetlov at gmail.com Sat Aug 31 19:58:48 2013 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Sat, 31 Aug 2013 20:58:48 +0300 Subject: [Python-Dev] error in test suite In-Reply-To: <5222268F.3060902@stoneleaf.us> References: <5222268F.3060902@stoneleaf.us> Message-ID: Sorry, this is mine. This is related to http://bugs.python.org/issue11798 The error happens when the tests are executed through regrtest with the -R option. I've temporarily disabled this feature until it is properly fixed. On Sat, Aug 31, 2013 at 8:23 PM, Ethan Furman wrote: > Am I the only one experiencing this? > > 262 tests OK.
> [remainder of the quoted test report and traceback snipped; quoted verbatim from Ethan's message above] > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/andrew.svetlov%40gmail.com -- Thanks, Andrew Svetlov From stefan_ml at behnel.de Sat Aug 31 21:16:10 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 31 Aug 2013 21:16:10 +0200 Subject: [Python-Dev] Pre-PEP: Redesigning extension modules In-Reply-To: References: <20130823111822.49cba700@pitrou.net> Message-ID: Nick Coghlan, 31.08.2013 18:49: > On 25 Aug 2013 21:56, "Stefan Behnel" wrote: >>>>> One key point to note is that it *doesn't* call >>>>> _PyImport_FixupExtensionObject, which
is the API that handles all the >>>>> PEP 3121 per-module state stuff. Instead, the idea will be for modules >>>>> that don't need additional C level state to just implement >>>>> PyImportExec_NAME, while those that *do* need C level state implement >>>>> PyImportCreate_NAME and return a custom object (which may or may not >>>>> be a module subtype). >>>> >>>> Is it really a common case for an extension module not to need any C >>>> level >>>> state at all? I mean, this might work for very simple accelerator >>>> modules >>>> with only a few stand-alone functions. But anything non-trivial will >>>> almost >>>> certainly have some kind of global state, cache, external library, >>>> etc., >>>> and that state is best stored at the C level for safety reasons. > > In my experience, most extension authors aren't writing high performance C > accelerators, they're exposing an existing C API to Python. It's the cffi > use case rather than the Cython use case. Interesting. I can't really remember a case where I could afford the runtime overhead of implementing a wrapper in Python and going through something like ctypes or cffi. I mean, testing C libraries with Python tools would be one, but then, you wouldn't want to write an extension module for that and instead want to call it directly from the test code as directly as possible. I'm certainly aware that that use case exists, though, and also the case of just wanting to get things done as quickly and easily as possible. > Mutable module global state is always a recipe for obscure bugs, and not > something I will ever let through code review without a really good > rationale. Hidden process global state is never good, just sometimes a > necessary evil. I'm not necessarily talking about mutable state. Rather about things like pre-initialised data or imported functionality. For example, I often have a bound method of a compiled regex lying around somewhere in my Python modules as a utility function. 
And the same kind of stuff exists in C code, some may be local to a class, but other things can well be module global. And given that we are talking about module internals here I'd always keep them at the C level rather than exposing them through the module dict. The module dict involves a much higher access overhead, in addition to the reduced safety due to user accessibility. Exported C-APIs are also a use case. You'd import the C-API of another module at init time and from that point on only go through function pointers etc. Those are (sub-)interpreter specific, i.e. they are module global state that is specific to the currently loaded module instances. > However, keep in mind my patch is currently just the part I can implement > without PEP 451 module spec objects. Understood. >> Note that even global functions usually hold state, be it in the form of >> globally imported modules, global caches, constants, ... > > If they can be shared safely across multiple instances of the module (e.g. > immutable constants), then these can be shared at the C level. Otherwise, a > custom Python type will be needed to make them instance specific. I assume you meant a custom module (extension) type here. Just to be clear, the "module state at the C-level" is meant to be stored in the object struct fields of the extension type that implements the module, at least for modules that want to support reloading and sub-interpreters. Obviously, nothing should be stored in static (global) variables etc. >>> We also need the create/exec split to properly support reloading. Reload >>> *must* reinitialize the object already in sys.modules instead of >>> inserting >>> a different object or it completely misses the point of reloading >>> modules >>> over deleting and reimporting them (i.e. implicitly affecting the >>> references from other modules that imported the original object). >> >> Interesting. I never thought of it that way. >> >> I'm not sure this can be done in general. 
>> What if the module has threads running that access the global state?
>> In that case, reinitialising the module object itself would almost
>> certainly lead to a crash.
>
> My current proposal on import-sig is to make the first hook
> "prepare_module", and pass in the existing object in the reload case.
> For the extension loader, this would be reflected in the signature of
> the C level hook as well, so the module could decide for itself if it
> supported reloading.

I really don't like the idea of reloading by replacing module state. It
would be much simpler if the module itself were replaced; then the
original module could stay alive and could still be used by those who
hold a reference to it or to parts of its contents. Especially the
from-import case would benefit from this. Obviously, you could still run
into obscure bugs where a function you call rejects the input because it
expects an older version of a type, for example. But I can't see that
being worse than (or even just different from) the
reload-by-refilling-dict case.

You seemed to be ok with my idea of making the loader return a wrapped
extension module instead of the module itself. We should actually try
that.

> This is actually my primary motivation for trying to improve the
> "can this be reloaded or not?" aspects of the loader API in PEP 451.

I assume you mean that the extension module would be able to clearly
signal that it can't be reloaded, right? I agree that that's helpful. If
you're wrapping a C library, then the way that library is implemented
might simply force you to prevent any attempts at reloading the wrapper
module. But if reloading is possible at all, it would be even more
helpful if we could make it really easy to properly support it.

> (keep in mind existing extension modules using the existing API will
> still never be reloaded)

Sure, that's the cool thing. We can really design this totally from
scratch without looking back.
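The reinitialise-in-place semantics being discussed can already be observed for pure Python modules; a small sketch (the `_reload_demo` module name and the temp-dir setup are made up for illustration):

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True  # avoid a stale .pyc masking the rewrite below

# Create a throwaway module on disk.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "_reload_demo.py"), "w") as f:
    f.write("VALUE = 1\n")
sys.path.insert(0, tmpdir)

import _reload_demo
before = _reload_demo

# Rewrite the source, then reload: the module *object* in sys.modules is
# re-executed in place rather than being replaced by a new object, so
# references held by other modules see the updated contents.
with open(os.path.join(tmpdir, "_reload_demo.py"), "w") as f:
    f.write("VALUE = 2\n")
after = importlib.reload(_reload_demo)

assert after is before   # same object, reinitialised in place
assert after.VALUE == 2
```

That identity guarantee is exactly what the from-import case does not get: a name bound via `from _reload_demo import VALUE` still refers to the old value after the reload.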
>>> Take a look at the current example - everything gets stored in the
>>> module dict for the simple case with no C level global state.
>>
>> Well, you're storing types there. And those types are your module API.
>> I understand that it's just an example, but I don't think it matches a
>> common case. As far as I can see, the types are not even interacting
>> with each other, let alone doing any C-level access of each other. We
>> should try to focus on the normal case that needs C-level state and
>> C-level field access of extension types. Once that's solved, we can
>> still think about how to make the really simple cases simpler, if it
>> turns out that they are not simple enough.
>
> Our experience is very different - my perspective is that the normal
> case either eschews C level global state in the extension module,
> because it causes so many problems, or else just completely ignores
> subinterpreter support and proper module cleanup.

As soon as you have more than one extension type in your module, and
they interact with each other, they will almost certainly have to do
type checks against each other to make sure users haven't passed them
rubbish before they access any C struct fields of the object. Doing a
type check means that at least one type has a pointer to the other,
meaning that it holds global module state.

I really think that having some kind of global module state is the
exceedingly common case for an extension module.

>> I didn't know about PyType_FromSpec(), BTW. It looks like a nice
>> addition for manually written code (although useless for Cython).
>
> This is the only way to create custom types when using the stable ABI.
> Can I take your observation to mean that Cython doesn't currently offer
> the option of limiting itself to the stable ABI?

Correct. I took a bird's-eye view of it back then, and kept stumbling
over "wow - I couldn't even use that?" kinds of declarations in the
header files. I don't think it makes sense for Cython.
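The cross-type checking pattern described above looks like this in pure Python (in C it would be a `PyObject_TypeCheck` call against a module-global type pointer; the `Point`/`Segment` names are invented for illustration):

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Segment:
    """Interacts with Point, so it must type-check its arguments before
    touching their fields. The C equivalent would check
    PyObject_TypeCheck(obj, Point_Type) against a type pointer held in
    module state -- which is exactly the global state in question."""
    def __init__(self, a, b):
        if not isinstance(a, Point) or not isinstance(b, Point):
            raise TypeError("Segment expects two Point instances")
        self.a = a
        self.b = b

    def length_squared(self):
        dx = self.a.x - self.b.x
        dy = self.a.y - self.b.y
        return dx * dx + dy * dy
```

In Python the `isinstance` check reaches `Point` through the module namespace for free; a C implementation has to stash the `Point` type pointer somewhere, and that somewhere is module state.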
Existing CPython versions are easy to support because they don't change
anymore, and new major releases most likely need adaptations anyway, if
only to adapt to new features and performance changes. Cython actually
knows quite a lot about the inner workings of CPython and its various
releases. Going only through the stable ABI parts of the C-API would
make the code horribly slow in comparison, so there are huge drawbacks
for the benefit it might give.

The Cython way of doing it is more like: you want your code to run on a
new CPython version, then use a recent Cython release to compile it. It
may still work with older ones, but what you actually want is the newest
anyway, and you also want to compile the C code for the specific CPython
version at hand to get the most out of it. It's the C code that adapts,
not the runtime code (or Cython itself). We run continuous integration
tests with all of CPython's development branches since 2.4, so we
usually support new CPython releases long before they are out. And new
releases of CPython rarely affect Cython user code.

Stefan


From solipsis at pitrou.net Sat Aug 31 21:27:36 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 31 Aug 2013 21:27:36 +0200
Subject: [Python-Dev] Pre-PEP: Redesigning extension modules
References: <20130823111822.49cba700@pitrou.net>
Message-ID: <20130831212736.79ee96a6@fsol>

On Sat, 31 Aug 2013 21:16:10 +0200
Stefan Behnel wrote:
> > Our experience is very different - my perspective is that the normal
> > case either eschews C level global state in the extension module,
> > because it causes so many problems, or else just completely ignores
> > subinterpreter support and proper module cleanup.
>
> As soon as you have more than one extension type in your module, and
> they interact with each other, they will almost certainly have to do
> type checks against each other to make sure users haven't passed them
> rubbish before they access any C struct fields of the object.
> Doing a type check means that at least one type has a pointer to the
> other, meaning that it holds global module state.
>
> I really think that having some kind of global module state is the
> exceedingly common case for an extension module.

Since we are eating our own dogfood here (and the work which prompted
this discussion was indeed about trying to make our extension modules
more cleanup-friendly), it would be nice to take a look at the Modules
directory and count which proportion of CPython extension modules have
state.

Caution: "state" is a bit vague here. Depending on which API you use,
custom extension types can be a part of "module state".

Regards

Antoine.


From stefan_ml at behnel.de Sat Aug 31 22:35:42 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 31 Aug 2013 22:35:42 +0200
Subject: [Python-Dev] Pre-PEP: Redesigning extension modules
In-Reply-To: <20130831212736.79ee96a6@fsol>
References: <20130823111822.49cba700@pitrou.net>
 <20130831212736.79ee96a6@fsol>
Message-ID:

Antoine Pitrou, 31.08.2013 21:27:
> On Sat, 31 Aug 2013 21:16:10 +0200
> Stefan Behnel wrote:
>>> Our experience is very different - my perspective is that the normal
>>> case either eschews C level global state in the extension module,
>>> because it causes so many problems, or else just completely ignores
>>> subinterpreter support and proper module cleanup.
>>
>> As soon as you have more than one extension type in your module, and
>> they interact with each other, they will almost certainly have to do
>> type checks against each other to make sure users haven't passed them
>> rubbish before they access any C struct fields of the object. Doing a
>> type check means that at least one type has a pointer to the other,
>> meaning that it holds global module state.
>>
>> I really think that having some kind of global module state is the
>> exceedingly common case for an extension module.
>
> Since we are eating our own dogfood here (and the work which prompted
> this discussion was indeed about trying to make our extension modules
> more cleanup-friendly), it would be nice to take a look at the Modules
> directory and count which proportion of CPython extension modules have
> state.

There seem to be 81 modules in there currently (grepped for
PyMODINIT_FUNC). 16 of them come up when you grep for
'(TypeCheck|IsInstance)', all using global extension type pointers. 32
use some kind of global "static PyObject* something;". Together these
two groups cover 41 distinct modules. That's half of the modules
already, and I'm sure there's more if you dig deeper.

Some modules only define functions or only one type (e.g. md5). They
would get away with no global state, I guess - if they all used heap
types.

> Caution: "state" is a bit vague here. Depending on which API you use,
> custom extension types can be a part of "module state".

Yep, as I said.

Stefan
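The census above was presumably done with grep over Modules/; the same counting logic can be sketched as a small Python script. This version runs against invented in-memory sample sources rather than a real CPython checkout, so the file names, contents, and resulting counts are purely illustrative:

```python
import re

# Synthetic stand-ins for extension module sources (not real CPython code).
samples = {
    "spam.c": "PyMODINIT_FUNC PyInit_spam(void) { /* ... */ }\n"
              "static PyObject *cached_result;\n",
    "eggs.c": "PyMODINIT_FUNC PyInit_eggs(void) { /* ... */ }\n"
              "if (!Egg_TypeCheck(obj)) return NULL;\n",
    "md5.c":  "PyMODINIT_FUNC PyInit_md5(void) { /* ... */ }\n",
}

# Count modules, then the two kinds of global state Stefan grepped for.
modules = [n for n, src in samples.items() if "PyMODINIT_FUNC" in src]
with_type_checks = [n for n, src in samples.items()
                    if re.search(r"(TypeCheck|IsInstance)", src)]
with_static_objects = [n for n, src in samples.items()
                       if re.search(r"static PyObject\s*\*", src)]

print(len(modules), len(with_type_checks), len(with_static_objects))
```

Pointing the same loops at the files of an actual Modules/ directory would reproduce the kind of numbers quoted above; the union of the two state-holding lists is what "half of the modules already" refers to.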