From infroma at gmail.com  Fri May  2 21:34:12 2014
From: infroma at gmail.com (Roman Inflianskas)
Date: Fri, 02 May 2014 23:34:12 +0400
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
Message-ID: <19168724.RU9dWeh3or@romas-x230-suse.lan>

It's really useful that Python 3 allows me to use some Unicode symbols (as specified in https://docs.python.org/3.4/reference/lexical_analysis.html#identifiers [1]), especially Greek symbols, for mathematical programs. But when I write a mathematical program with lots of indices, I would like to use symbols from the block "Superscripts and Subscripts" (as id_continue), for example:

????

I don't see any problems with allowing yet another subset of Unicode symbols. In Julia, for example, I can use them without problems.

--
Regards, Roman Inflianskas

--------
[1] https://docs.python.org/3.4/reference/lexical_analysis.html#identifiers

From bruce at leapyear.org  Fri May  2 22:48:49 2014
From: bruce at leapyear.org (Bruce Leban)
Date: Fri, 2 May 2014 13:48:49 -0700
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <19168724.RU9dWeh3or@romas-x230-suse.lan>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan>

This block includes non-alphanumeric characters. You wouldn't want to allow variables named x⁺¹ (~ x+1).

Some of the characters in this block are already allowed (the letters in category Lm). The characters you want are in the No (other numbers) category. Unfortunately, adding that category would be problematic as it includes characters like ½, and you surely don't want a variable named x½ or x⑴. That's x1/2 and x(1) for those without Unicode fonts.

--- Bruce
Learn how hackers think: http://j.mp/gruyere-security
https://www.linkedin.com/in/bruceleban

On Fri, May 2, 2014 at 12:34 PM, Roman Inflianskas wrote:
> It's really useful that Python 3 allows me to use some Unicode symbols (as specified in https://docs.python.org/3.4/reference/lexical_analysis.html#identifiers), especially Greek symbols, for mathematical programs. But when I write a mathematical program with lots of indices, I would like to use symbols from the block "Superscripts and Subscripts" (as id_continue), for example:
>
> ????
>
> I don't see any problems with allowing yet another subset of Unicode symbols. In Julia, for example, I can use them without problems.
>
> --
> Regards, Roman Inflianskas
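Bruce's category breakdown is easy to check directly against the stdlib. A minimal sketch, using only unicodedata and str.isidentifier (the sample characters are illustrative):

    import unicodedata

    # Two Lm (modifier letter) superscripts, which Python 3 already accepts
    # in identifiers, and three No (other number) characters, which it rejects.
    for ch in "\u2071\u207f\u00b2\u2082\u00bd":
        print("U+%04X %-35s %s  identifier: %s" % (
            ord(ch),
            unicodedata.name(ch),
            unicodedata.category(ch),
            ("x" + ch).isidentifier()))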
From alexander.belopolsky at gmail.com  Fri May  2 23:17:33 2014
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Fri, 2 May 2014 17:17:33 -0400
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <19168724.RU9dWeh3or@romas-x230-suse.lan>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan>

On Fri, May 2, 2014 at 3:34 PM, Roman Inflianskas wrote:
> I would like to use symbols from block "Superscripts and Subscripts"

-1

Python uses the ** operator for what is superscript in math and the [] operator for what is subscript. Allowing sub/superscripts in identifiers will create confusion. (It is not uncommon to mix typeset math with Python code in generated documentation.)

If you have many identifiers with subscripts, I would recommend using a list or a dictionary and calling them a[1], a[2], etc. instead of a1, a2.
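A small illustration of that recommendation (the values and names here are made up):

    # Instead of a1, a2, ... keep indexed values in a list or a dict:
    a = [0.5, 1.25, 3.0]       # accessed as a[0], a[1], a[2]
    b = {1: 0.5, 2: 1.25}      # accessed as b[1], b[2], keyed to match the math
    print(a[1], b[2])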
From tjreedy at udel.edu  Sat May  3 00:29:52 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 02 May 2014 18:29:52 -0400
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <19168724.RU9dWeh3or@romas-x230-suse.lan>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan>

On 5/2/2014 3:34 PM, Roman Inflianskas wrote:
> It's really useful that Python 3 allows me to use some Unicode
> symbols (as specified in
> https://docs.python.org/3.4/reference/lexical_analysis.html#identifiers),
> especially Greek symbols, for mathematical programs. But when I write
> a mathematical program with lots of indices, I would like to use symbols
> from the block "Superscripts and Subscripts" (as id_continue), for
> example:
>
> ????
>
> I don't see any problems with allowing yet another subset of Unicode
> symbols. In Julia, for example, I can use them without problems.

From 2.3. Identifiers and keywords:
"The syntax of identifiers in Python is based on the Unicode standard annex UAX-31, with elaboration and changes as defined below; see also PEP 3131 for further details."

-- Terry Jan Reedy

From tjreedy at udel.edu  Sat May  3 04:27:56 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 02 May 2014 22:27:56 -0400
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
References: <19168724.RU9dWeh3or@romas-x230-suse.lan>

On 5/2/2014 6:29 PM, Terry Reedy wrote:
> On 5/2/2014 3:34 PM, Roman Inflianskas wrote:
>> But when I write a mathematical program with lots of indices, I would
>> like to use symbols from the block "Superscripts and Subscripts" (as
>> id_continue), for example:
>>
>> ????

I believe 'other numbers' are intentionally omitted.

>> I don't see any problems with allowing yet another subset of Unicode
>> symbols. In Julia, for example, I can use them without problems.

If the rules for identifiers are expanded, any code that uses newly allowed names cannot be backported or run on previous versions. If contracted, the opposite problem occurs. I do not think they should be changed either way without a strong cause.

> From 2.3. Identifiers and keywords
> "The syntax of identifiers in Python is based on the Unicode standard
> annex UAX-31, with elaboration and changes as defined below; see also
> PEP 3131 for further details."

In other words, we use the standard with a few intentional modifications. The 2.x ascii rules were the same or very similar as in other languages (such as C). The 3.x rules are similar to other languages that follow the same standard. There is a benefit to this.

-- Terry Jan Reedy

From steve at pearwood.info  Sat May  3 06:50:23 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 3 May 2014 14:50:23 +1000
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
References: <19168724.RU9dWeh3or@romas-x230-suse.lan>
Message-ID: <20140503045022.GA4273@ando>

On Fri, May 02, 2014 at 10:27:56PM -0400, Terry Reedy wrote:

> If the rules for identifiers are expanded, any code that uses newly
> allowed names cannot be backported or run on previous versions. If
> contracted, the opposite problem occurs. I do not think they should be
> changed either way without a strong cause.

That applies to any new feature -- code using that feature cannot be easily backported. In this case, it's actually quite simple to backport code using the new rules for identifiers: just change the identifiers. The algorithm used by the code remains the same.

>> From 2.3. Identifiers and keywords
>> "The syntax of identifiers in Python is based on the Unicode standard
>> annex UAX-31, with elaboration and changes as defined below; see also
>> PEP 3131 for further details."
>
> In other words, we use the standard with a few intentional
> modifications.

Playing Devil's Advocate, perhaps we could add a few more intentional modifications. While there are advantages to following a standard just for the sake of following a standard, once you allow any changes, you're no longer following the standard. So the argument becomes, why should we allow that change but not this change?

Particularly for mathematically-focused code, I think it would be useful to be able to use identifiers like (say) σ² for variance, g₁ for sample skewness, or γ₁ for Pearson's skewness, to give a few real-world examples. Regular digits may be ambiguous: compare s₁² for the sample variance with Bessel's correction, versus s12. (s twelve?)

I'm going to give a tentative +1 vote to allowing superscript and subscript letters and digits in identifiers, if it can be done without excessive cost in complexity or performance. Anything else, like (say) ⑤ (CIRCLED DIGIT FIVE), I will give a firm -1.

-- Steven

From greg.ewing at canterbury.ac.nz  Sat May  3 08:38:21 2014
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 03 May 2014 18:38:21 +1200
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <20140503045022.GA4273@ando>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando>
Message-ID: <53648EDD.8060201@canterbury.ac.nz>

Steven D'Aprano wrote:
> Particularly for mathematically-focused code, I think it would be useful
> to be able to use identifiers like (say) σ² for variance,
Having σ² be a variable name could be confusing. To a mathematician, it's not a distinct variable, it's just σ ** 2.

-- Greg

From rosuav at gmail.com  Sat May  3 08:49:16 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 3 May 2014 16:49:16 +1000
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <53648EDD.8060201@canterbury.ac.nz>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz>

On Sat, May 3, 2014 at 4:38 PM, Greg Ewing wrote:
> Steven D'Aprano wrote:
>> Particularly for mathematically-focused code, I think it would be useful
>> to be able to use identifiers like (say) σ² for variance,
>
> Having σ² be a variable name could be confusing. To a
> mathematician, it's not a distinct variable, it's
> just σ ** 2.

Maybe, but subscripts can be useful. Recently we were discussing linear acceleration on python-list, and the way I learned the principle (other people learned it with different letters) was:

Vₜ = V₀t + at²/2

which should translate into Python as:

Vₜ = V₀*t + a*t*t/2

(Not sure if people's fonts have all those characters; that's read "V-t equals V-0 t plus a t squared over two".)

Being able to use subscripts in identifiers wouldn't be *often* useful, but it would make direct translation from math to code a bit easier.

ChrisA

From bruce at leapyear.org  Sat May  3 09:19:52 2014
From: bruce at leapyear.org (Bruce Leban)
Date: Sat, 3 May 2014 00:19:52 -0700
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz>

I've actually written programs like that and honestly names like 'sigma' and 'beta' and 'v_t' worked just fine. Many of us have used (x1, y1) and (x2, y2) without confusing anyone because the digits weren't subscripted. The ability to use Unicode in identifiers I'm sure is appreciated by non-English writers, but that's a decidedly different issue. This is a solution without an actual problem.

--- Bruce (from my phone)

From stefan_ml at behnel.de  Sat May  3 09:29:07 2014
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 03 May 2014 09:29:07 +0200
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz>

Bruce Leban, 03.05.2014 09:19:
> I've actually written programs like that and honestly names like 'sigma'
> and 'beta' and 'v_t' worked just fine. Many of us have used (x1, y1) and
> (x2, y2) without confusing anyone because the digits weren't subscripted.

Plus, the numbers are much easier to read that way than in tiny subscripts.

Stefan

From rosuav at gmail.com  Sat May  3 09:30:17 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 3 May 2014 17:30:17 +1000
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz>

On Sat, May 3, 2014 at 5:19 PM, Bruce Leban wrote:
> I've actually written programs like that and honestly names like 'sigma' and
> 'beta' and 'v_t' worked just fine. Many of us have used (x1, y1) and (x2,
> y2) without confusing anyone because the digits weren't subscripted.

Yeah; like I said, it's not a big thing. I certainly wouldn't choose a language on the basis of subscript-digit-support-in-identifiers. But when I'm working with maths I'm not overly familiar with (stuff a lot more complicated than simple linear acceleration), and I'm trying to translate a not-quite-perfect set of handwritten scribbles into code, every little bit helps. That's why WYSIWYG music editing software is so much more popular with novices than GNU Lilypond is - if you're not *really* familiar with what you're working with, the difference between "dot on the page that looks like this" and "c'8." slows you down.
Not insurmountable, but the mind glitches across the gap.

ChrisA

From steve at pearwood.info  Sat May  3 11:05:24 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 3 May 2014 19:05:24 +1000
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <53648EDD.8060201@canterbury.ac.nz>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz>
Message-ID: <20140503090523.GB4273@ando>

On Sat, May 03, 2014 at 06:38:21PM +1200, Greg Ewing wrote:
> Steven D'Aprano wrote:
>> Particularly for mathematically-focused code, I think it would be useful
>> to be able to use identifiers like (say) σ² for variance,
>
> Having σ² be a variable name could be confusing. To a
> mathematician, it's not a distinct variable, it's
> just σ ** 2.

Actually, not really. A better way of putting it is that the standard deviation is "just" the square root of σ². Variance comes first (it's defined from first principles), and then the standard deviation is defined by taking the square root. But really, it doesn't matter which is derived from which. To a mathematician, x² is just as much a legitimate variable as x. One can say that f is a function of x² just as well as saying that it is a function of y, where y happens to equal x².

But regardless of philosophical differences regarding the nature of what is or isn't a variable, versus something derived from a variable, it simply is useful to have a one-to-one correspondence between variables in Python code and notation used in mathematics. Is it useful enough to make up for the (minor) issues that others have already mentioned? I think so, but I will understand if others disagree. I think that the ability to distinguish between x₂ and x² can be important, and both x2 and x_2 are poor substitutes. (Of the two, I prefer x2.) But I'm also aware that this is very dependent on the problem domain. I wouldn't use x₂ and x² outside of a mathematical context.

-- Steven

From stephen at xemacs.org  Sat May  3 14:27:31 2014
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 03 May 2014 21:27:31 +0900
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <20140503090523.GB4273@ando>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz> <20140503090523.GB4273@ando>
Message-ID: <87ha57kpxo.fsf@uwakimon.sk.tsukuba.ac.jp>

Steven D'Aprano writes:
> On Sat, May 03, 2014 at 06:38:21PM +1200, Greg Ewing wrote:
>> Steven D'Aprano wrote:
>>> Particularly for mathematically-focused code, I think it would be useful
>>> to be able to use identifiers like (say) σ² for variance,
>>
>> Having σ² be a variable name could be confusing. To a
>> mathematician, it's not a distinct variable, it's
>> just σ ** 2.
>
> Actually, not really. A better way of putting it is that the standard
> deviation is "just" the square root of σ². Variance comes first (it's
> defined from first principles), and then the standard deviation is
> defined by taking the square root.

Thank you for writing that better than I could have. :-)

> But really, it doesn't matter which is derived from which. To a
> mathematician, x² is just as much a legitimate variable as x. One can
> say that f is a function of x² just as well as saying that it is a
> function of y, where y happens to equal x².

We part company here.
(in the usage "function of x?") is not a variable, it's an expression. I don't think I've even seen the usage "f(x?) = ..." in a *definition* of "f", with the single exception of the use of "f(?,??) = ..." in defining the distribution of a random variable, and even then that's unusual (? is almost always more convenient, even for test statistics). I'd consider that the exception that proves the rule.... Especially in a case like z(x,?,??) = (x - ?)/?! To put it another way, I suspect you would get rather upset if I used both x and x? in such a context and treated them as I would x and y. Or, if in real analysis I ignored the fact that x? is necessarily non-negative. I could go on, but I think the point is clear: *linguistically* these are expressions, not variables -- they are constructed syntactically, and their semantics can be deduced from the syntax. Of course in mathematics you can treat them as variables (as statisticians do ??), but that works because in mathematics no symbols or syntax have fixed semantics, not ?, not even 0. If you can get a version of Python that has "where ..." clauses in it that can define semantics for sub- and superscript syntax past Guido, I'd be all for this. But I really don't think that's going to happen. > Is it useful enough to make up for the (minor) issues that others > have already mentioned? I think so, but I will understand if others > disagree. I think that the ability to distinguish between x? and > x? can be important, Which, I suspect, means these notations don't pass the "generalized grit on Tim's monitor" test. > and both x2 and x_2 are poor substitutes. In programming (as opposed to the chemistry of nuclear fusion), if you need to distinguish x? from x?, and x**2 and x[2] don't do the trick, I suspect your notation has real readability problems no matter how you arrange things spatially. I guess that use cases where such usage is in good taste are way too rare to justify this. From ron3200 at gmail.com Sat May 3 17:39:23 2014 From: ron3200 at gmail.com (Ron Adam) Date: Sat, 03 May 2014 11:39:23 -0400 Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers In-Reply-To: <20140503090523.GB4273@ando> References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz> <20140503090523.GB4273@ando> Message-ID: On 05/03/2014 05:05 AM, Steven D'Aprano wrote: > On Sat, May 03, 2014 at 06:38:21PM +1200, Greg Ewing wrote: >> >Steven D'Aprano wrote: >>> > >Particularly for mathematically-focused code, I think it would be useful >>> > >to be able to use identifiers like (say) ?? for variance, >> >Having ?? be a variable name could be confusing. To a >> >mathematician, it's not a distinct variable, it's >> >just ? ** 2. > Actually, not really. A better way of putting it is that the standard > deviation is "just" the square root of ??. Variance comes first (it's > defined from first principles), and then the standard deviation is > defined by taking the square root. The main problem I see is that many possible questions come to mind rather than one simple or obvious interpretation. 
Cheers,
Ron

From steve at pearwood.info  Sat May  3 19:57:03 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 4 May 2014 03:57:03 +1000
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz> <20140503090523.GB4273@ando>
Message-ID: <20140503175702.GF4273@ando>

On Sat, May 03, 2014 at 11:39:23AM -0400, Ron Adam wrote:
> On 05/03/2014 05:05 AM, Steven D'Aprano wrote:
>> [...]
>> Actually, not really. A better way of putting it is that the standard
>> deviation is "just" the square root of σ². Variance comes first (it's
>> defined from first principles), and then the standard deviation is
>> defined by taking the square root.
>
> The main problem I see is that many possible questions come to mind rather
> than one simple or obvious interpretation.

If I name a variable "x2", what is the "one simple or obvious interpretation" that such an identifier presumably has? If standard, ASCII-only identifiers don't have a single interpretation, why should identifiers like σ² be held to that requirement?

Like any other identifier, one needs to interpret the name in context. Identifiers can be idiomatic ("i" for a loop variable, "c" for a character), more or less descriptive ("number_of_pages", "npages"), or obfuscated ("e382702"). They can be written in English, or in some other language. They can be ordinary words, or jargon that only means something to those who understand the problem domain. None of this will be different if sub/superscript digits and letters are allowed.

One of the frustrations on this list is how often people hold new proposals to a higher standard than existing features. Particularly *impossible* standards. It simply isn't possible for characters like superscript-two to be given a *single* interpretation (although there is an obvious one, namely "squared") any more than it is possible for the letter "a" to be given a *single* interpretation.

There are valid objections to this proposal. It may be that the effort needed to allow code points like ² in identifiers without also allowing ½ or ⑴ may be too great. Or the performance cost is too high. Or the benefit for mathematical-style code doesn't justify adding additional language complexity.

Or even a purely aesthetic judgement: "I just don't like it". (I don't like identifiers written in cyrillic, because I can't read them, but I'm not the target audience for such identifiers and I will never need to read them. Consequently I don't object if other people use cyrillic identifiers in their personal code.)

Holding this proposal up to an impossible standard which plain ASCII identifiers don't even meet is simply not cricket.

Thank you all for letting me get that off my chest, and apologies to Ron for singling him out.
-- Steven

From rosuav at gmail.com  Sat May  3 20:11:39 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 4 May 2014 04:11:39 +1000
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <20140503175702.GF4273@ando>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz> <20140503090523.GB4273@ando> <20140503175702.GF4273@ando>

On Sun, May 4, 2014 at 3:57 AM, Steven D'Aprano wrote:
> One of the frustrations on this list is how often people hold new
> proposals to a higher standard than existing features. ...
>
> Holding this proposal up to an impossible standard which plain ASCII
> identifiers don't even meet is simply not cricket.
>
> Thank you all for letting me get that off my chest, and apologies to Ron
> for singling him out.

A fair point in this case, and yet there is such a thing as the grandfather clause. Adding something to the language has a much higher bar than merely retaining something (because *removing* something from the language has an even higher bar), so a proposal can't simply say "It's no worse than what we have already" to get acceptance. Impossible standard? A bit unfair. Higher than existing features? Quite possibly has its place.

ChrisA

From stephen at xemacs.org  Sat May  3 20:34:32 2014
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 04 May 2014 03:34:32 +0900
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <20140503175702.GF4273@ando>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz> <20140503090523.GB4273@ando> <20140503175702.GF4273@ando>
Message-ID: <8761lmlnif.fsf@uwakimon.sk.tsukuba.ac.jp>

Steven D'Aprano writes:

> If I name a variable "x2", what is the "one simple or obvious
> interpretation" that such an identifier presumably has? If standard,
> ASCII-only identifiers don't have a single interpretation, why should
> identifiers like σ² be held to that requirement?

Because subscripts and superscripts are syntactic constructs, and naturally decompose into two identifiers in a specific relationship (even if that relationship cannot be further specified without going deep into some domain of discourse) -- and that is much of the motivation for wanting to use them. "x2" does not carry that load.

Note that Unicode itself considers them *compatibility* characters and says:

    Superscripts and subscripts have been included in the Unicode
    Standard only to provide compatibility with existing character
    sets. In general, the Unicode character encoding does not attempt
    to describe the positioning of a character above or below the
    baseline in typographical layout.

In other words, Unicode is reluctant to guarantee that x2, x², and x₂ are actually different identifiers! It's considered bad practice to treat them as the same, but not actually forbidden. At least 2 technical reports (#20 and #25) discourage their use except in the case where they are letter-like (phonetic transcriptions use several such letters, where they have different meaning from their compatibility equivalents).

The more I look into this, the more I think it is really problematic.
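That compatibility status is visible directly from the stdlib; a quick check (the character choices are illustrative):

    import unicodedata

    # Superscript/subscript digits carry <super>/<sub> compatibility
    # decompositions to the plain digit, which is what NFKC folds on.
    print(unicodedata.decomposition("\u00b2"))   # '<super> 0032'
    print(unicodedata.decomposition("\u2082"))   # '<sub> 0032'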
From tjreedy at udel.edu  Sat May  3 23:48:51 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 03 May 2014 17:48:51 -0400
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <20140503045022.GA4273@ando>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando>

On 5/3/2014 12:50 AM, Steven D'Aprano wrote:
> On Fri, May 02, 2014 at 10:27:56PM -0400, Terry Reedy wrote:
>
>> If the rules for identifiers are expanded, any code that uses newly
>> allowed names cannot be backported or run on previous versions. If
>> contracted, the opposite problem occurs. I do not think they should be
>> changed either way without a strong cause.
>
> That applies to any new feature -- code using that feature cannot be
> easily backported. In this case, it's actually quite simple to backport
> code using the new rules for identifiers: just change the identifiers.
> The algorithm used by the code remains the same.

It appears that I consider lexicography more 'fundamental' in some sense than you do. But let's skip over this.

>>> From 2.3. Identifiers and keywords
>>> "The syntax of identifiers in Python is based on the Unicode standard
>>> annex UAX-31, with elaboration and changes as defined below; see also
>>> PEP 3131 for further details."

Without reading the annex, I cannot tell which part of the 'below' actually defines a 'change', as opposed to an 'elaboration' (explanation). I have no idea whether the unknown changes are additions, deletions, or merely selections of options.

>> In other words, we use the standard with a few intentional
>> modifications.
>
> Playing Devil's Advocate, perhaps we could add a few more intentional
> modifications.

Or perhaps not, depending on what the modifications actually are and what the reasons were.

> While there are advantages to following a standard just for the sake of
> following a standard, once you allow any changes, you're no longer
> following the standard. So the argument becomes, why should we allow
> that change but not this change?

Nick recently argued, very similarly, that having restored string 'u' prefixes was a reason to restore dict.iterxyz methods. You agreed with me that there were good reasons why B did not follow from A. To properly compare current and proposed changes, we must know the current 'modifications and changes', their reasons and effects, and the proposed changes and their reasons (any real parallels) and likely effects. If you were to do the research, I would be willing to discuss.

> Particularly for mathematically-focused code, I think it would be useful
> to be able to use identifiers like (say) σ² for variance, g₁ for sample
> skewness, or γ₁ for Pearson's skewness, to give a few real-world
> examples. Regular digits may be ambiguous: compare s₁² for the sample
> variance with Bessel's correction, versus s12. (s twelve?)

I agree that there are good uses for this restricted set of additions. Would you allow super/subscripts as prefixes rather than suffixes? I presume not, since we already disallow initial numbers.

> I'm going to give a tentative +1 vote to allowing superscript and
> subscript letters and digits in identifiers, if it can be done without
> excessive cost in complexity or performance.

Would you consider doubling the cost of checking each character (a reasonable estimate, I think) excessive or not?

> Anything else, like (say) ⑤ (CIRCLED DIGIT FIVE),
> I will give a firm -1.
-- Terry Jan Reedy

From tjreedy at udel.edu  Sun May  4 00:06:06 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 03 May 2014 18:06:06 -0400
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando>

On 5/3/2014 5:48 PM, Terry Reedy wrote:

> Would you consider doubling the cost of checking each character (a
> reasonable estimate, I think) excessive or not?

Thinking about it more, I think double is an over-estimate. Since I do not know how the unicode lexer works, I won't guess or worry about the cost until someone times code with and without the change.

-- Terry Jan Reedy

From alexander.belopolsky at gmail.com  Sun May  4 00:08:51 2014
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sat, 3 May 2014 18:08:51 -0400
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando>

On Sat, May 3, 2014 at 5:48 PM, Terry Reedy wrote:
> Would you allow super/subscripts as prefixes rather than suffixes? I
> presume not, since we already disallow initial numbers.

Python 3 does not recognize subscripts as numbers:

>>> int('₂')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '₂'

From steve at pearwood.info  Sun May  4 04:40:44 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 4 May 2014 12:40:44 +1000
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <8761lmlnif.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz> <20140503090523.GB4273@ando> <20140503175702.GF4273@ando> <8761lmlnif.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20140504024043.GG4273@ando>

On Sun, May 04, 2014 at 03:34:32AM +0900, Stephen J. Turnbull wrote:

> Note that Unicode itself considers them *compatibility* characters and
> says:
>
>     Superscripts and subscripts have been included in the Unicode
>     Standard only to provide compatibility with existing character
>     sets. In general, the Unicode character encoding does not attempt
>     to describe the positioning of a character above or below the
>     baseline in typographical layout.
>
> In other words, Unicode is reluctant to guarantee that x2, x², and x₂
> are actually different identifiers!
[...]

I don't think this is a valid interpretation of what the Unicode standard is trying to say, but the point is moot. I think you've just identified (pun intended) a major objection to the proposal, one serious enough to change my mind from limited support to opposition.

Python identifiers are treated by their NFKC normalised form:

    All identifiers are converted into the normal form NFKC while
    parsing; comparison of identifiers is based on NFKC.

https://docs.python.org/3/reference/lexical_analysis.html

And superscripts and subscripts normalise to standard characters:

py> [unicodedata.normalize('NFKC', s) for s in 'x² x₂ x2'.split()]
['x2', 'x2', 'x2']

So that categorically rules out allowing superscripts and subscripts as *distinct* characters in identifiers. So even if they were allowed, it would mean that x² and x₂ would be treated as the same identifier as x2.

For my use-case, I would want x² and x₂ to be treated as distinct identifiers, not just as a funny way of writing x2. So from my perspective, *at best* there is now insufficient benefit to bother allowing them.

It's actually stronger than that: allowing superscripts and subscripts would be an attractive nuisance for my use-case. If they were allowed, I would be tempted to write x² and x₂, which could end up being a subtle source of bugs if I accidentally used them both in the same namespace, thinking that they were distinct when they actually aren't. So I am now -1 on allowing superscripts and subscripts.

-- Steven
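That NFKC folding can already be observed with characters that are currently legal in identifiers; a small demonstration (any NFKC-equivalent pair of identifier characters would do):

    # MICRO SIGN (U+00B5) and GREEK SMALL LETTER MU (U+03BC) are distinct
    # code points, but the parser NFKC-folds them to the same identifier.
    µ = 1          # spelled with U+00B5
    print(μ)       # spelled with U+03BC -- prints 1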
From infroma at gmail.com  Sun May  4 09:10:56 2014
From: infroma at gmail.com (Roman Inflianskas)
Date: Sun, 04 May 2014 11:10:56 +0400
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <20140504024043.GG4273@ando>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <8761lmlnif.fsf@uwakimon.sk.tsukuba.ac.jp> <20140504024043.GG4273@ando>
Message-ID: <2552140.U2lLahrYnB@romas-x230-suse.lan>

On Sunday 04 May 2014 12:40:44 Steven D'Aprano wrote:
> [...]
> So that categorically rules out allowing superscripts and subscripts as
> *distinct* characters in identifiers. So even if they were allowed, it
> would mean that x² and x₂ would be treated as the same identifier as x2.
>
> For my use-case, I would want x² and x₂ to be treated as distinct
> identifiers, not just as a funny way of writing x2. So from my
> perspective, *at best* there is now insufficient benefit to bother
> allowing them.
>
> It's actually stronger than that: allowing superscripts and subscripts
> would be an attractive nuisance for my use-case. If they were allowed, I
> would be tempted to write x² and x₂, which could end up being a subtle
> source of bugs if I accidentally used them both in the same namespace,
> thinking that they were distinct when they actually aren't. So I am now
> -1 on allowing superscripts and subscripts.

That's the strongest point against allowing superscripts and subscripts in the whole discussion, IMHO. I would want x² and x₂ to be treated as distinct identifiers too. I've tried this use case in Julia and it works:

julia> x₁ = 1
1

julia> x₂ = 2
2

julia> x₁
1

julia> x₂
2
But then I found a thread in Julia's bugtracker covering Unicode identifier normalization [1]. As I understood, they don't use NFKC. As a consequence, the symbols "µ" (0x00b5) and "μ" (0x03bc) are treated as different. They understood that it's weird and that they need to do something about this. Some of them don't want to use NFKC for the same reason (+ for example, "H" and "ℍ" would become equal identifiers). Others decided to give a warning when a new identifier is equal to an already defined one (in terms of NFKC normalization).

Now I understand that things are more complicated than I considered when I made the proposal. I think that there is no "good way" to add support for subscripts and superscripts. So it's better to leave the situation as is.

--
Regards, Roman Inflianskas

--------
[1] covering unicode identifiers normalization
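The warning approach Roman describes can be sketched in a few lines of Python (a hypothetical helper for illustration, not anything Julia or CPython actually ships):

    import unicodedata

    def check_nfkc_collisions(names):
        """Warn when two distinct spellings fold to one identifier under NFKC."""
        seen = {}
        for name in names:
            folded = unicodedata.normalize("NFKC", name)
            if folded in seen and seen[folded] != name:
                print("warning: %r and %r are the same identifier after NFKC"
                      % (name, seen[folded]))
            seen.setdefault(folded, name)

    check_nfkc_collisions(["\u00b5", "\u03bc"])   # micro sign vs. Greek mu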
From tjreedy at udel.edu  Sun May  4 11:51:25 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 04 May 2014 05:51:25 -0400
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <2552140.U2lLahrYnB@romas-x230-suse.lan>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <8761lmlnif.fsf@uwakimon.sk.tsukuba.ac.jp> <20140504024043.GG4273@ando> <2552140.U2lLahrYnB@romas-x230-suse.lan>

On 5/4/2014 3:10 AM, Roman Inflianskas wrote:
> Now I understand that things are more complicated than I considered
> when I made the proposal. I think that there is no "good way" to add
> support for subscripts and superscripts. So it's better to leave the
> situation as is.

If you are the one who opened the tracker issue, please close it. And thanks for bringing the discussion here.

-- Terry Jan Reedy

From infroma at gmail.com  Sun May  4 12:00:20 2014
From: infroma at gmail.com (Roman Inflianskas)
Date: Sun, 04 May 2014 14:00:20 +0400
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <2552140.U2lLahrYnB@romas-x230-suse.lan>
Message-ID: <2117120.VD1tLruAqg@romas-x230-suse.lan>

On Sunday 04 May 2014 05:51:25 Terry Reedy wrote:
> If you are the one who opened the tracker issue, please close it. And
> thanks for bringing the discussion here.

Done. Thank you for participating in this discussion. The next time I will not open a bug before the discussion, I promise :)

--
Regards, Roman Inflianskas

From ron3200 at gmail.com  Sun May  4 18:17:42 2014
From: ron3200 at gmail.com (Ron Adam)
Date: Sun, 04 May 2014 12:17:42 -0400
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <20140503175702.GF4273@ando>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz> <20140503090523.GB4273@ando> <20140503175702.GF4273@ando>

On 05/03/2014 01:57 PM, Steven D'Aprano wrote:
> On Sat, May 03, 2014 at 11:39:23AM -0400, Ron Adam wrote:
>> The main problem I see is that many possible questions come to mind rather
>> than one simple or obvious interpretation.
>
> If I name a variable "x2", what is the "one simple or obvious
> interpretation" that such an identifier presumably has? If standard,
> ASCII-only identifiers don't have a single interpretation, why should
> identifiers like σ² be held to that requirement?
> [...]
> Or even a purely aesthetic judgement: "I just don't like it". (I don't
> like identifiers written in cyrillic, because I can't read them, but I'm
> not the target audience for such identifiers and I will never need to
> read them.
> Consequently I don't object if other people use cyrillic
> identifiers in their personal code.)
>
> Holding this proposal up to an impossible standard which plain ASCII
> identifiers don't even meet is simply not cricket.
>
> Thank you all for letting me get that off my chest, and apologies to Ron
> for singling him out.

No problem, you didn't comment on me, but expressed your own thoughts. That's fine. But thanks for clarifying the context of your message; it does help us avoid unintended misunderstandings in message-based conversations like these, where we don't get to hear the tone of a message.

I feel the same as you describe here in many of these discussions. Enough so that I'm attempting to write a minimal language that uses some of the features I've thought about. The exercise was/is helping me understand many of the lower-level language-design patterns in Python and some other languages. Some of the ideas I've wanted just don't fit with Python's design, and some would work, but not without many changes to other parts. And some ideas we can't do because they directly conflict with something we already have. Sigh. The ones that most interest me are the ones that simplify or unify existing features, but those are also the ones that are the hardest to do right. ;-)

Cheers,
Ron

From ram.rachum at gmail.com  Mon May  5 15:17:16 2014
From: ram.rachum at gmail.com (Ram Rachum)
Date: Mon, 5 May 2014 06:17:16 -0700 (PDT)
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`

I suggest implementing:

- `itertools.permutations.__getitem__`, for getting a permutation by its index number, and possibly also slicing, and
- `itertools.permutations.index` for getting the index number of a given permutation.

What do you think?

Thanks,
Ram.

From ram.rachum at gmail.com  Mon May  5 18:07:27 2014
From: ram.rachum at gmail.com (Ram Rachum)
Date: Mon, 5 May 2014 09:07:27 -0700 (PDT)
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
Message-ID: <081c1efb-4f7e-4d70-9a50-957d9f347e1e@googlegroups.com>

And now that I think about it, I'd also like to give it a `__len__`, and to give `itertools.product` the same treatment. What do you think?

On Monday, May 5, 2014 4:17:16 PM UTC+3, Ram Rachum wrote:
> I suggest implementing:
>
> - `itertools.permutations.__getitem__`, for getting a permutation by its
> index number, and possibly also slicing, and
> - `itertools.permutations.index` for getting the index number of a given
> permutation.
>
> What do you think?

From steve at pearwood.info  Mon May  5 19:15:38 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 6 May 2014 03:15:38 +1000
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
Message-ID: <20140505171538.GR4273@ando>

On Mon, May 05, 2014 at 06:17:16AM -0700, Ram Rachum wrote:
> I suggest implementing:
>
> - `itertools.permutations.__getitem__`, for getting a permutation by its
> index number, and possibly also slicing, and
> - `itertools.permutations.index` for getting the index number of a given
> permutation.
>
> What do you think?

An intriguing idea. range() objects also implement indexing, and len.
But range() objects have an obvious, unambiguous order: range(2, 6) *must* give 2, 3, 4, 5, in that order, by the definition of range. Permutations aren't like that. The order of the permutations is an implementation detail, not part of the definition. If permutations provides indexing operations, then the order becomes part of the interface. I'm not sure that's such a good idea.

I think, rather than adding __getitem__ to permutations, I would rather see a separate function (not iterator) which returns the nth permutation.

-- Steven

From steve at pearwood.info  Mon May  5 19:23:14 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 6 May 2014 03:23:14 +1000
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
In-Reply-To: <081c1efb-4f7e-4d70-9a50-957d9f347e1e@googlegroups.com>
References: <081c1efb-4f7e-4d70-9a50-957d9f347e1e@googlegroups.com>
Message-ID: <20140505172314.GS4273@ando>

On Mon, May 05, 2014 at 09:07:27AM -0700, Ram Rachum wrote:
> And now that I think about it, I'd also like to give it a `__len__`, and to
> give `itertools.product` the same treatment. What do you think?

Consider:

p = itertools.permutations('CAT')
assert len(p) == 6

So far, that's obvious. But:

next(p)  => returns a permutation

Now what will len(p) return? If it still returns 6, that will lead to bugs when people check the len, but fail to realise that some of those permutations have already been consumed. In the most extreme case, you could have:

assert len(p) == 6
list(p) == []

which is terribly surprising. On the other hand, if len(p) returns the number of permutations remaining, apart from increasing the complexity of the iterator, it will also be surprising to those who expect the length to be the total number of permutations. I would rather have a separate API, perhaps something like this:

p.number()  # returns the total number of permutations

-- Steven
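For what it's worth, that total is cheap to compute without touching the iterator at all; a sketch of such a helper (n_permutations is a made-up name, not a stdlib function):

    from math import factorial

    def n_permutations(n, r=None):
        """Number of r-length permutations of n items: n! / (n - r)!."""
        if r is None:
            r = n
        return factorial(n) // factorial(n - r)

    print(n_permutations(3))      # 6, the length of list(permutations('CAT'))
    print(n_permutations(5, 2))   # 20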
From songofacandy at gmail.com  Tue May  6 00:22:56 2014
From: songofacandy at gmail.com (INADA Naoki)
Date: Tue, 6 May 2014 07:22:56 +0900
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
In-Reply-To: <20140505171538.GR4273@ando>
References: <20140505171538.GR4273@ando>

> range() objects also implement indexing, and len. But range() objects
> have an obvious, unambiguous order: range(2, 6) *must* give 2, 3, 4, 5,
> in that order, by the definition of range. Permutations aren't like
> that. The order of the permutations is an implementation detail, not
> part of the definition. If permutations provides indexing operations,
> then the order becomes part of the interface. I'm not sure that's such a
> good idea.

I don't think the order of permutations is an implementation detail. Python implementations should follow CPython's documented order.

https://docs.python.org/3.4/library/itertools.html#itertools.permutations

> Permutations are emitted in lexicographic sort order. So, if the input iterable is sorted, the permutation tuples will be produced in sorted order.

On Tue, May 6, 2014 at 2:15 AM, Steven D'Aprano wrote:
> [...]

-- INADA Naoki

From ethan at stoneleaf.us  Tue May  6 01:06:39 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 05 May 2014 16:06:39 -0700
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
References: <20140505171538.GR4273@ando>
Message-ID: <5368197F.20507@stoneleaf.us>

On 05/05/2014 03:22 PM, INADA Naoki wrote:
> I don't think the order of permutations is an implementation detail.
> Python implementations should follow CPython's documented order.
>
> https://docs.python.org/3.4/library/itertools.html#itertools.permutations
>
>> Permutations are emitted in lexicographic sort order. So, if the input iterable is sorted, the permutation tuples will be produced in sorted order.

What does that mean? If permutations are emitted in an order, why does the input iterable have to be ordered? What happens if it's not?

--> list(''.join(p) for p in permutations('abc'))
['abc', 'acb', 'bac', 'bca', 'cab', 'cba']

--> list(''.join(p) for p in permutations('cab'))
['cab', 'cba', 'acb', 'abc', 'bca', 'bac']

Okay, read http://en.wikipedia.org/wiki/Lexicographical_order -- I think 'lexicographic' is not the best choice of word... maybe positional?

-- ~Ethan~

From ethan at stoneleaf.us  Tue May  6 01:57:38 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 05 May 2014 16:57:38 -0700
Subject: [Python-ideas] A library for the deprecation of a function/class
Message-ID: <53682572.5070702@stoneleaf.us>

On 04/17/2014 12:28 PM, Stéphane Wirtel wrote:
> With the CPython sprint, I was thinking about a lib to mark a function/class as deprecated. [...]
> The deprecated decorator should check the version of the software and the version of Python if asked with the arguments.
> It will raise warnings.warn with PendingDeprecationWarning or DeprecationWarning. Can be used in the documentation, via
> introspection.

Seems like a useful idea.
Others have also thought so, and there are some code snippets at https://wiki.python.org/moin/PythonDecoratorLibrary#Generating_Deprecation_Warnings

-- ~Ethan~

From steve at pearwood.info  Tue May  6 04:39:02 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 6 May 2014 12:39:02 +1000
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
References: <20140505171538.GR4273@ando>
Message-ID: <20140506023902.GV4273@ando>

On Tue, May 06, 2014 at 07:22:56AM +0900, INADA Naoki wrote:

> I don't think the order of permutations is an implementation detail.
> Python implementations should follow CPython's documented order.
>
> https://docs.python.org/3.4/library/itertools.html#itertools.permutations

Hmmm. Well, since the order of permutations is documented, I suppose my objection is answered. In that case, it becomes a question of whether or not there is an easy way to generate the Nth permutation without having to iterate through the previous N-1 permutations.

>> Permutations are emitted in lexicographic sort order. So, if the
>> input iterable is sorted, the permutation tuples will be produced in
>> sorted order.

I think I know what the docs are trying to say, but I'm not sure if they are quite saying it correctly. If the permutations are emitted in "lexicographic sort order", that implies that they are sortable, but that's not necessarily the case:

py> 4j > 2j
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: no ordering relation is defined for complex numbers
py> list(itertools.permutations([4j, 2j]))
[(4j, 2j), (2j, 4j)]

I think that just removing the word "sort" is sufficient: "Permutations are emitted in lexicographic order" is meaningful, and correct, even when the elements are not sortable.

-- Steven

From taleinat at gmail.com  Tue May  6 11:35:09 2014
From: taleinat at gmail.com (Tal Einat)
Date: Tue, 6 May 2014 12:35:09 +0300
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
In-Reply-To: <20140506023902.GV4273@ando>
References: <20140505171538.GR4273@ando> <20140506023902.GV4273@ando>

On Tue, May 6, 2014 at 5:39 AM, Steven D'Aprano wrote:
> On Tue, May 06, 2014 at 07:22:56AM +0900, INADA Naoki wrote:
>
>> I don't think the order of permutations is an implementation detail.
>> Python implementations should follow CPython's documented order.
>>
>> https://docs.python.org/3.4/library/itertools.html#itertools.permutations
>
> Hmmm. Well, since the order of permutations is documented, I suppose my
> objection is answered. In that case, it becomes a question of whether or
> not there is an easy way to generate the Nth permutation without having
> to iterate through the previous N-1 permutations.

Yes, it is possible using factorial decomposition of N.

See, for an example: http://stackoverflow.com/a/7919887/40076

- Tal Einat
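A sketch of that factorial-decomposition approach, for concreteness (ithperm, the illustrative name also used in the next message, is 0-based, matches itertools' lexicographic order, and is not a stdlib function):

    from math import factorial

    def ithperm(n, i):
        """Return the i-th (0-based) lexicographic permutation of range(n)."""
        items = list(range(n))
        perm = []
        for k in range(n, 0, -1):
            # (k-1)! permutations share each choice of the next element.
            index, i = divmod(i, factorial(k - 1))
            perm.append(items.pop(index))
        return tuple(perm)

    # ithperm(3, 0) -> (0, 1, 2); ithperm(3, 5) -> (2, 1, 0)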
>> In that case, it becomes a question of whether or
>> not there is an easy way to generate the Nth permutation without having
>> to iterate through the previous N-1 permutations.
>
> Yes, it is possible using factorial decomposition of N.
>
> See, for an example: http://stackoverflow.com/a/7919887/40076

For large N, this is much slower than itertools.permutations when you
only want the first few entries.

    p = itertools.permutations(range(10000))
    for i in range(5):
        print(next(p))

vs

    for i in range(5):
        print(ithperm(10000, i))

The first is substantially faster. That's not to say that ithperm isn't
useful, just that its computational complexity may be surprising if it's
spelled as an indexing operation.

Paul

From alan.cristh at gmail.com  Tue May  6 15:04:44 2014
From: alan.cristh at gmail.com (Alan Cristhian Ruiz)
Date: Tue, 06 May 2014 10:04:44 -0300
Subject: [Python-ideas] Plug-ins for IDLE
Message-ID: <5368DDEC.40709@gmail.com>

I think it would be great to have a minimum API for writing plug-ins
for Python IDLE.

I don't know if there are people interested in this, but I do.

What do you think? Could Python benefit in some way from this?

From taleinat at gmail.com  Tue May  6 17:40:53 2014
From: taleinat at gmail.com (Tal Einat)
Date: Tue, 6 May 2014 18:40:53 +0300
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
References: <20140505171538.GR4273@ando> <20140506023902.GV4273@ando>

On Tue, May 6, 2014 at 3:45 PM, Paul Moore wrote:
> On 6 May 2014 10:35, Tal Einat wrote:
>>> Hmmm. Well, since the order of permutations is documented, I suppose my
>>> objection is answered. In that case, it becomes a question of whether or
>>> not there is an easy way to generate the Nth permutation without having
>>> to iterate through the previous N-1 permutations.
>>
>> Yes, it is possible using factorial decomposition of N.
>>
>> See, for an example: http://stackoverflow.com/a/7919887/40076
>
> For large N, this is much slower than itertools.permutations when you
> only want the first few entries.

If someone just wants the first few entries, they probably aren't
worried about it being super fast. And if they were, they could just
iterate to get the first permutations.

As for getting anything past the first few permutations (e.g. an
arbitrary one), factorial decomposition would be faster by several
orders of magnitude than iterating from the beginning. For relatively
large permutations, iterating from the beginning could be unfeasible,
while factorial decomposition would still take far less than a second.

The real question IMO is if this is useful enough to bother including
in the stdlib. For example, I don't think it would pass the "potential
uses in the stdlib" test. Perhaps Ram (the OP) has some actual
use-cases for this?

- Tal

From p.f.moore at gmail.com  Tue May  6 17:49:36 2014
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 6 May 2014 16:49:36 +0100
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
References: <20140505171538.GR4273@ando> <20140506023902.GV4273@ando>

On 6 May 2014 16:40, Tal Einat wrote:
> The real question IMO is if this is useful enough to bother including
> in the stdlib. For example, I don't think it would pass the "potential
> uses in the stdlib" test. Perhaps Ram (the OP) has some actual
> use-cases for this?

Agreed, I suspect this is more appropriate as a utility on PyPI.
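(For reference, the factorial decomposition Tal pointed at only takes a
few lines -- this is just a rough sketch, not a tuned implementation:

    from math import factorial

    def ithperm(n, index):
        # Return the index-th permutation of range(n), in the same
        # positional order that itertools.permutations uses.
        items = list(range(n))
        result = []
        while items:
            i, index = divmod(index, factorial(len(items) - 1))
            result.append(items.pop(i))
        return tuple(result)

The cost grows with n, not with how far into the sequence you index.)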
But I stand by my statement that wherever it's implemented, it should
*not* be spelled permutations(x)[N], as having indexing with a small
index be significantly slower than a few calls to next() is a nasty
performance trap for the unwary (no matter how rare it will be in
practice).

Paul

From taleinat at gmail.com  Tue May  6 17:51:19 2014
From: taleinat at gmail.com (Tal Einat)
Date: Tue, 6 May 2014 18:51:19 +0300
Subject: [Python-ideas] Plug-ins for IDLE
In-Reply-To: <5368DDEC.40709@gmail.com>
References: <5368DDEC.40709@gmail.com>

On Tue, May 6, 2014 at 4:04 PM, Alan Cristhian Ruiz wrote:
> I think it would be great to have a minimum API for writing plug-ins
> for Python IDLE.
>
> I don't know if there are people interested in this, but I do.
>
> What do you think? Could Python benefit in some way from this?

This already exists! In IDLE they're called Extensions. See extend.txt
(link below) in the IDLE source code for details on how to write them.

IDLE also ships with several built-in plugins which provide some key
functionality, such as auto-completion and parenthesis matching. You
can check those out for examples.

http://hg.python.org/cpython/file/v3.4.0/Lib/idlelib/extend.txt

- Tal

From ram.rachum at gmail.com  Wed May  7 19:21:25 2014
From: ram.rachum at gmail.com (Ram Rachum)
Date: Wed, 7 May 2014 20:21:25 +0300
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
References: <20140505171538.GR4273@ando> <20140506023902.GV4273@ando>

Hi Tal,

I'm using it for a project of my own (optimizing keyboard layout) but I
can't make the case that it's useful for the stdlib. I'd understand if
it would be omitted for not being enough of a common need.

On Tue, May 6, 2014 at 6:40 PM, Tal Einat wrote:

> If someone just wants the first few entries, they probably aren't
> worried about it being super fast. And if they were, they could just
> iterate to get the first permutations.
>
> As for getting anything past the first few permutations (e.g. an
> arbitrary one), factorial decomposition would be faster by several
> orders of magnitude than iterating from the beginning. For relatively
> large permutations, iterating from the beginning could be unfeasible,
> while factorial decomposition would still take far less than a second.
>
> The real question IMO is if this is useful enough to bother including
> in the stdlib. For example, I don't think it would pass the "potential
> uses in the stdlib" test. Perhaps Ram (the OP) has some actual
> use-cases for this?
>
> - Tal
From taleinat at gmail.com  Wed May  7 19:40:22 2014
From: taleinat at gmail.com (Tal Einat)
Date: Wed, 7 May 2014 20:40:22 +0300
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
References: <20140505171538.GR4273@ando> <20140506023902.GV4273@ando>

On Wed, May 7, 2014 at 8:21 PM, Ram Rachum wrote:
> Hi Tal,
>
> I'm using it for a project of my own (optimizing keyboard layout) but I
> can't make the case that it's useful for the stdlib. I'd understand if
> it would be omitted for not being enough of a common need.

At the least, this (a function for getting a specific permutation by
lexicographical-order index) could make a nice cookbook recipe.

- Tal

From ram.rachum at gmail.com  Wed May  7 19:43:20 2014
From: ram.rachum at gmail.com (Ram Rachum)
Date: Wed, 7 May 2014 20:43:20 +0300
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
References: <20140505171538.GR4273@ando> <20140506023902.GV4273@ando>

I'm probably going to implement it in my python_toolbox package. I
already implemented 30% and it's really cool. It's at the point where I
doubt that I want it in the stdlib because I've gotten so much awesome
functionality into it and I'd hate to (a) have 80% of it stripped and
(b) have the class names changed to be non-Pythonic :)

On Wed, May 7, 2014 at 8:40 PM, Tal Einat wrote:

> At the least, this (a function for getting a specific permutation by
> lexicographical-order index) could make a nice cookbook recipe.
>
> - Tal

From alonisser at gmail.com  Wed May  7 22:40:56 2014
From: alonisser at gmail.com (alonn)
Date: Wed, 7 May 2014 23:40:56 +0300
Subject: [Python-ideas] Things I wish Pip learned from Npm

A "Rant" I wrote about pip and npm. Maybe someone would find this
interesting or even useful in thinking on Pip's future.

I would like to stress that this is really not meant to hurt anyone and
in particular the great members of this open source community who bear
the burden of developing and maintaining Pip.

https://medium.com/devops-programming/f712fa26f5bc

Twitter: @alonisser
LinkedIn Profile
Facebook

From mal at egenix.com  Wed May  7 22:48:29 2014
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 07 May 2014 22:48:29 +0200
Subject: [Python-ideas] Things I wish Pip learned from Npm
Message-ID: <536A9C1D.6000703@egenix.com>

On 07.05.2014 22:40, alonn wrote:
> A "Rant" I wrote about pip and npm. Maybe someone would find this
> interesting or even useful in thinking on Pip's future.
>
> I would like to stress that this is really not meant to hurt anyone
> and in particular the great members of this open source community who
> bear the burden of developing and maintaining Pip.
>
> https://medium.com/devops-programming/f712fa26f5bc

Please note that you should probably post this to the pip mailing list
and/or the distutils list.
python-ideas is about ideas for Python itself and even though Python 3.4
includes bootstrap code to install pip, pip itself is not developed by
the Python Core Devs.

Thanks,
--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 07 2014)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2014-04-24: Released mxODBC.Connect 2.0.5 ...     http://egenix.com/go55

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From wegge at wegge.dk  Wed May  7 22:50:00 2014
From: wegge at wegge.dk (Anders Wegge Keller)
Date: 07 May 2014 22:50:00 +0200
Subject: [Python-ideas] Things I wish Pip learned from Npm
Message-ID: <871tw52u13.fsf@huddi.jernurt.dk>

alonn writes:

> A "Rant" I wrote about pip and npm. Maybe someone would find this
> interesting or even useful in thinking on Pip's future.
>
> https://medium.com/devops-programming/f712fa26f5bc

One nit:

  $ Selenium==2.4.1
  ...
  $ Selenium==2.35.0
  Pip Just downgraded a package version without even asking for

I would say that pip upgraded Selenium 31 minor releases. However, I
come from a background where each of major.minor.rev can be a
multi-digit number, so I might be wrong in this context.

--
/Wegge
Leder efter redundant peering af dk.*,linux.debian.*

From ncoghlan at gmail.com  Thu May  8 01:09:44 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 8 May 2014 09:09:44 +1000
Subject: [Python-ideas] Things I wish Pip learned from Npm

On 8 May 2014 06:41, "alonn" wrote:
>
> A "Rant" I wrote about pip and npm. Maybe someone would find this
> interesting or even useful in thinking on Pip's future.
>
> https://medium.com/devops-programming/f712fa26f5bc

While MAL is correct that distutils-sig is a better list, I also suggest
reading packaging.python.org and the referenced PEPs (especially PEP
426), along with articles like http://lwn.net/Articles/580399/ to come
up to speed with the current state of play in the Python packaging
ecosystem.

Cheers,
Nick.

From flying-sheep at web.de  Thu May  8 17:04:12 2014
From: flying-sheep at web.de (Philipp A.)
Date: Thu, 8 May 2014 17:04:12 +0200
Subject: [Python-ideas] Things I wish Pip learned from Npm
In-Reply-To: <536A9C1D.6000703@egenix.com>
References: <536A9C1D.6000703@egenix.com>

2014-05-07 22:48 GMT+02:00 M.-A.
Lemburg:

> Please note that you should probably post this to the pip mailing list
> and/or the distutils list.
>
> python-ideas is about ideas for Python itself and even though Python 3.4
> includes bootstrap code to install pip, pip itself is not developed by
> the Python Core Devs.

There's one point that's relevant and worth discussing here, I quote:

*pip's choice of defaulting to a global installation is wrong*

Yes. Python's pip bootstrapping is there in order to guarantee that all
the "just type pip install foobar" tutorials work.

Linux distributions like Arch, Debian and Ubuntu deliberately broke this
guarantee, and for good reasons: Pip per default installs globally,
which should be the system package manager's territory.

It would be best if pip would work like this if you run it without some
-g, --global switch outside a venv:

On Linux: "You're on Linux, please use your package manager for global
installation or use a virtual environment. Use the -g switch to force
global installation"

On Windows and OSX: "Please use the -g switch for installations outside
of virtual environments"

Is it too late to change this? I really *want* the guarantee and pip to
work. But I also totally understand why all those distributions break
it!

From ncoghlan at gmail.com  Thu May  8 17:12:27 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 9 May 2014 01:12:27 +1000
Subject: [Python-ideas] Things I wish Pip learned from Npm
References: <536A9C1D.6000703@egenix.com>

On 9 May 2014 01:05, "Philipp A." wrote:
>
> Is it too late to change this? I really want the guarantee and pip to
> work. But I also totally understand why all those distributions break
> it!

There's already an open pip issue to change the default install location
to be user installs (at least on POSIX systems), so no, that's not a
novel idea, and this still isn't the right list to discuss it.

(I'll also note that Fedora was able to implement PEP 453 successfully,
so it is certainly possible for distros to comply with it)

Cheers,
Nick.

From donald at stufft.io  Thu May  8 17:16:08 2014
From: donald at stufft.io (Donald Stufft)
Date: Thu, 8 May 2014 11:16:08 -0400
Subject: [Python-ideas] Things I wish Pip learned from Npm
References: <536A9C1D.6000703@egenix.com>

On May 8, 2014, at 11:04 AM, Philipp A. wrote:

> There's one point that's relevant and worth discussing here, I quote:
>
> *pip's choice of defaulting to a global installation is wrong*
>
> Yes. Python's pip bootstrapping is there in order to guarantee that all
> the "just type pip install foobar" tutorials work.
> Linux distributions like Arch, Debian and Ubuntu deliberately broke
> this guarantee, and for good reasons:
>
> Pip per default installs globally, which should be the system package
> manager's territory.

This isn't exactly accurate. Linux distributions aren't installing pip
globally using ensurepip, which is an entirely different thing than pip.
We never expected Linux distros *to* use ensurepip for that purpose. If
you install python-pip on any of these distros you still get a version
of pip that installs things globally. Some distros (Fedora I believe)
are making Python depend on pip so the outcome is exactly the same.

> It would be best if pip would work like this if you run it without
> some -g, --global switch outside a venv:

I don't think pip will ever require a virtual environment by default.
However there is an open ticket to make --user installs the default
when running as non-root.

> On Linux: "You're on Linux, please use your package manager for global
> installation or use a virtual environment. Use the -g switch to force
> global installation"
>
> On Windows and OSX: "Please use the -g switch for installations
> outside of virtual environments"
>
> Is it too late to change this? I really want the guarantee and pip to
> work. But I also totally understand why all those distributions break
> it!

Like I said above, no distribution currently breaks pip, a few have a
broken ensurepip but that is being fixed.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926  F04F 6E3C BCE9 3372 DCFA

From barry at python.org  Thu May  8 21:22:31 2014
From: barry at python.org (Barry Warsaw)
Date: Thu, 8 May 2014 15:22:31 -0400
Subject: [Python-ideas] Things I wish Pip learned from Npm
References: <536A9C1D.6000703@egenix.com>
Message-ID: <20140508152231.3387e6d4@anarchist.wooz.org>

On May 08, 2014, at 11:16 AM, Donald Stufft wrote:

> Some distros (Fedora I believe) are making Python depend on pip so the
> outcome is exactly the same.

Debian and Ubuntu won't be providing pip automatically outside of a
virtualenv, but we'll provide some hints as to how to install it using
the OS package manager.

I'm close to a solution for
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=732703 which will fix
pyvenv and pip inside a pyvenv. It's not at all as straightforward as
you might think, but I have A Plan.

I still think pip-outside-venv should install to --user by default so
I'm glad pip upstream is moving in that direction. There's a use case,
although IMO fairly limited, for global installation via `sudo pip` so
we'll support that, but you'll have to use the python*-pip distro
package for that to work.

Cheers,
-Barry

From bunslow at gmail.com  Fri May  9 07:38:34 2014
From: bunslow at gmail.com (Bill Winslow)
Date: Fri, 9 May 2014 00:38:34 -0500
Subject: [Python-ideas] Adding an "export" decorator in (e.g.)
 functools

Hey guys.

This is a simple pattern I've added to my own "myutils.py" file, which I
think could see wide use if added to the standard library.

Simply, it's the following function meant to be used primarily as a
decorator.

    def export(func):
        global __all__
        if callable(func) and hasattr(func, '__name__'):
            try:
                __all__.append(func.__name__)
            except NameError:
                __all__ = [func.__name__]
        return func

Then, instead of having a magic __all__ declaration at the top of a
module with a list of strings (that may or may not be accurate [of
course stdlib modules are maintained more rigorously]), people writing
libraries can instead use the following idiom:

    import stuff_not_meant_to_be_visible

    def _private_func_1(): pass

    def _private_func_2(): pass

    @export
    def public_func_1(): pass

    @export
    def public_func_2(): pass

Of course, this doesn't actually solve any problem, because programmers
using best practice will prepend underscores to private functions and
define their __all__ properly. However I still think this might be worth
adding to the stdlib (presumably in functools) because

1) Readability counts (and explicit is better than implicit): it's easy
to determine that, other than "well there's no underscore so this is
probably a public function", that "yes, the library author meant for
this function to be used".

2) Proper maintenance becomes easier. Attaching a small decorator next
to each public function is easier to remember than remembering to add an
arbitrary string to an arbitrary global constant. There is also the
added benefit that renaming/refactoring also doesn't require modifying
the magic global when you're done.

3) It helps encourage best practices, especially among either lazy
programmers or those new to Python.

One possible counter-argument is that it's not very important/isn't a
core feature for library inclusion: well, things like lru_cache or
total_ordering aren't core features, but they are nice to have, which is
why they were added; export would fall into the same category.

What are everyone's thoughts?

From rosuav at gmail.com  Fri May  9 07:59:35 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 9 May 2014 15:59:35 +1000
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools

On Fri, May 9, 2014 at 3:38 PM, Bill Winslow wrote:
> 2) Proper maintenance becomes easier. Attaching a small decorator next
> to each public function is easier to remember than remembering to add
> an arbitrary string to an arbitrary global constant. There is also the
> added benefit that renaming/refactoring also doesn't require modifying
> the magic global when you're done.

+1 for this reason. Attaching info to code is the purpose of docstrings,
and it makes very good sense to implement __all__ the same way.

But your given implementation seems to have a problem: how can you
import that into another module? It looks at "global __all__", which
will look at the module it's implemented in. Would it work like this,
perhaps?

    # This is starting to read a little oddly, but oh well :)
    from eustace_scrubb import export, government, drain

    # ... define your functions with @export, as above ...

    __all__ = export.get_all()

The get_all() function would return the list, and empty it in readiness
for the next module. It's non-reentrant, so you'd have to make sure you
don't import any other modules in the middle of defining your own
exports.
Or is there something I'm not seeing about your original export() that
makes it work?

ChrisA

From bunslow at gmail.com  Fri May  9 09:12:01 2014
From: bunslow at gmail.com (Bill Winslow)
Date: Fri, 9 May 2014 02:12:01 -0500
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools

> But your given implementation seems to have a problem: how can
> you import that into another module?

And thus I am caught as not having actually tested this -- you are of
course correct. Having a statement at the bottom defining "__all__" as
you suggest would (partially) defeat the point, which is to make it easy
for the programmer to state what should be public; that is, it shouldn't
be necessary to muck with __all__. I'm trying to avoid the
magicness/arbitrariness of assignments to __all__.

I'll try and think up an alternative implementation that would work as
advertised when imported from another module. (Note that if the code
itself were copy and pasted, the function would work fine, yet importing
it fails -- something I have not yet encountered in Python. This also
suggests one trivial solution -- import a function that instead exec()'s
the definition above.) I suspect I might have to learn something about
import internals to come up with a (better-than-the-trivial) solution.
(Copy and pasting code is of course unacceptable as well.)

> Here's a simpler implementation:

That is of course the same as mine, except less error checking and also
assumes the global already exists (remember, one goal is to not have to
muck with __all__ in any way, not even declaring it above the function
definition). At least somebody else also had the same idea; hopefully I
can come up with an importable solution...

From berker.peksag at gmail.com  Fri May  9 09:06:47 2014
From: berker.peksag at gmail.com (Berker Peksağ)
Date: Fri, 9 May 2014 10:06:47 +0300
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools

On Fri, May 9, 2014 at 8:38 AM, Bill Winslow wrote:
> Hey guys.
>
> This is a simple pattern I've added to my own "myutils.py" file, which
> I think could see wide use if added to the standard library.
>
> Simply, it's the following function meant to be used primarily as a
> decorator.
>
>     def export(func):
>         global __all__
>         if callable(func) and hasattr(func, '__name__'):
>             try:
>                 __all__.append(func.__name__)
>             except NameError:
>                 __all__ = [func.__name__]
>         return func

Here's a simpler implementation:

http://hg.python.org/release/file/b270b4d5cf2c/3.4/dryparse/dryparse.py#l57

--Berker

From rosuav at gmail.com  Fri May  9 09:38:18 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 9 May 2014 17:38:18 +1000
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools

On Fri, May 9, 2014 at 5:12 PM, Bill Winslow wrote:
>> But your given implementation seems to have a problem: how can
>> you import that into another module?
>
> And thus I am caught as not having actually tested this -- you are of
> course correct.
> Having a statement at the bottom defining "__all__" as you suggest
> would (partially) defeat the point, which is to make it easy for the
> programmer to state what should be public; that is, it shouldn't be
> necessary to muck with __all__. I'm trying to avoid the
> magicness/arbitrariness of assignments to __all__.

I don't mind the concept of one-line directives to specify things. In
the same way that you would put "import socket" at the top if you use
sockets, you put "__all__ = export.get_all()" at the bottom to capture
all the __all__ entries. It still deduplicates and brings the
information right to where the function's defined, so there is some
value in it.

> I'll try and think up an alternative implementation that would work as
> advertised when imported from another module. (Note that if the code
> itself were copy and pasted, the function would work fine, yet
> importing it fails -- something I have not yet encountered in Python.
> This also suggests one trivial solution -- import a function that
> instead exec()'s the definition above.) I suspect I might have to
> learn something about import internals to come up with a
> (better-than-the-trivial) solution. (Copy and pasting code is of
> course unacceptable as well.)

For the exec method to work, it would have to be passed a reference to
globals() for the calling module, so you can simplify it. I don't know
how useful it would be, but this ought to work:

    # my_tools.py
    def make_all(globls, listname="__all__"):
        globls[listname] = []
        def grabber(obj):
            globls[listname].append(obj.__name__)
            return obj
        return grabber

    # your module
    import my_tools
    export = my_tools.make_all(globals())

    def _private_func_1(): pass

    def _private_func_2(): pass

    @export
    def public_func_1(): pass

    @export
    def public_func_2(): pass

This is still a bit magical, in that you assign to the name "export"
and it actually is for setting __all__, but it's better than exec :)

ChrisA

From bunslow at gmail.com  Fri May  9 10:03:02 2014
From: bunslow at gmail.com (Bill Winslow)
Date: Fri, 9 May 2014 03:03:02 -0500
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools

On Fri, May 9, 2014 at 2:38 AM, Chris Angelico wrote:
> (By the way: You responded to two different posts in yours, and didn't
> cite either of them. Please keep the original-poster line, such as
> you'll see on the next non-blank line.)

Sorry -- I think this is correct? I'm new to mailing lists :P

> I don't mind the concept of one-line directives to specify things. In
> the same way that you would put "import socket" at the top if you use
> sockets, you put "__all__ = export.get_all()" at the bottom to capture
> all the __all__ entries. It still deduplicates and brings the
> information right to where the function's defined, so there is some
> value in it.

Fair enough, but I'd still like to avoid such statements if possible. I
think we can do better.
> For the exec method to work, it would have to be passed a reference to
> globals() for the calling module, so you can simplify it. I don't know
> how useful it would be, but this ought to work:
>
>     # my_tools.py
>     def make_all(globls, listname="__all__"):
>         globls[listname] = []
>         def grabber(obj):
>             globls[listname].append(obj.__name__)
>             return obj
>         return grabber
>
>     # your module
>     import my_tools
>     export = my_tools.make_all(globals())
>
>     @export
>     def public_func_1(): pass
>
>     @export
>     def public_func_2(): pass
>
> This is still a bit magical, in that you assign to the name "export"
> and it actually is for setting __all__, but it's better than exec :)

That's what I had basically just got working, except with exec instead
of just straight up modifying the globals... and if you're going to do
that, you may as well directly assign globls['export'] = grabber :P
(still better than an exec of course :D).

On the other hand, while testing my version of the above (with exec), I
ran into another issue (or at least I perceive it as such): the import
machinery only respects __all__ *if* we are importing * from the module.
If instead we do "import module as m", the *entire* namespace of the
module is made available in m, even names that start with an underscore.

Can someone please explain the rationale behind that? I would consider
this surprising. Why should "import module" give different results than
"from module import *"? If the latter can be author-limited, why not the
former as well? Even a pointer to relevant documentation would be
helpful.

(Basically, I had this idea because I thought that __all__ was way more
important than it apparently is.)

-Bill

From rosuav at gmail.com  Fri May  9 10:17:16 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 9 May 2014 18:17:16 +1000
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools

On Fri, May 9, 2014 at 6:03 PM, Bill Winslow wrote:
> On Fri, May 9, 2014 at 2:38 AM, Chris Angelico wrote:
>> (By the way: You responded to two different posts in yours, and didn't
>> cite either of them. Please keep the original-poster line, such as
>> you'll see on the next non-blank line.)
>
> Sorry -- I think this is correct? I'm new to mailing lists :P

Yep, looks good! Thanks! You'll find this sort of thing helpful even
with non-list mail, too. As soon as you get a long and complex
discussion (particularly between more than two people), it's useful to
bottom-post, trim quotes, maintain proper citations, etc, etc. Good
habits to be in.

> That's what I had basically just got working, except with exec instead
> of just straight up modifying the globals... and if you're going to do
> that, you may as well directly assign globls['export'] = grabber :P
> (still better than an exec of course :D).

You could do it that way too, yes. That would shorten the usage a
little.

> On the other hand, while testing my version of the above (with exec),
> I ran into another issue (or at least I perceive it as such): the
> import machinery only respects __all__ *if* we are importing * from
> the module. If instead we do "import module as m", the *entire*
> namespace of the module is made available in m, even names that start
> with an underscore.
>
> Can someone please explain the rationale behind that? I would consider
> this surprising.
Why should "import module" give different results than "from > module import *"? If the latter can be author-limited, why not the former as > well? Even a pointer to relevant documentation would be helpful. > > (Basically, I had this idea because I thought that __all__ was way more > important than it apparently is.) Yep. When you "import module" (optionally "as m"), you can reference whatever you want; __all__ is to help tame an "import * from". But neither of them can truly hide anything; Python doesn't work that way. You can always go digging around and finding the internals of something. The nearest you can get to truly hiding something from the namespace is to del it when you're done, which obviously means you can't reference it yourself either. (Has its uses, though; for instance, you might undefine export when you're done with it, in your above examples.) ChrisA From __peter__ at web.de Fri May 9 10:45:33 2014 From: __peter__ at web.de (Peter Otten) Date: Fri, 09 May 2014 10:45:33 +0200 Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools References: Message-ID: Bill Winslow wrote: > Hey guys. > > This is simple pattern I've added to my own "myutils.py" file, which I > think could see wide use if added to the standard library. > > Simply, it's the following function meant to be used primarily as a > decorator. > > def export(func): > global __all__ > if callable(func) and hasattr(func, '__name__'): > try: > __all__.append(func.__name__) > except NameError: > __all__ = [func.__name__] > return func > > > Then, instead of having a magic __all__ declaration at the top of a module > with a list of strings (that may or may not be accurate [of course stdlib > modules are maintained more rigorously]), people writing libraries can > instead use the following idiom: > > import stuff_not_meant_to_be_visible > > def _private_func_1(): pass > > def _private_func_2(): pass > > @export > def public_func_1(): pass > > @export > def public_func_2(): pass > > > Of course, this doesn't actually solve any problem, because programmers > using best-practice will prepend underscores to private functions and > define their __all__ properly. > > However I still think this might be worth adding to the stdlib (presumably > in functools) because > > 1) Readability counts (and explicit is better than implicit): it's easy to > determine that, other than "well there's no underscore so this is probably > a public function", that "yes, the library author meant for this function > to be used". > > 2) Proper maintenance becomes easier. Attaching a small decorator next to > each public function is easier to remember than remembering to add an > arbitrary string to an arbitrary global constant. There is also the added > benefit that renaming/refactoring also doesn't require modifying the magic > global when you're done. > > 3) It helps encourage best practices, especially among either lazy > programmers or those new to Python. > > > One possible counter argument is that it's not very important/isn't a core > feature for library inclusion: > Well, things like an lru_cache or total_ordering aren't core features, but > they are nice to have, which is why they were added; export would fall > into the same category. > > What are everyone's thoughts? I rarely use star imports, and the decorator is mostly noise in my eyes, so that's a clear -1 from me. 
I'm mostly posting to suggest an alternative implementation for your
personal use ;)

    import sys

    def export(f):
        sys._getframe(1).f_globals.setdefault("__all__", []).append(f.__name__)
        return f

From rosuav at gmail.com  Fri May  9 10:53:23 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 9 May 2014 18:53:23 +1000
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools

On Fri, May 9, 2014 at 6:45 PM, Peter Otten <__peter__ at web.de> wrote:
> I'm mostly posting to suggest an alternative implementation for your
> personal use ;)
>
>     def export(f):
>         sys._getframe(1).f_globals.setdefault("__all__", []).append(f.__name__)
>         return f

sys._getframe, in my opinion, isn't so much code "smell" as
"Mythbusters' 1987 Chevrolet"... :)

ChrisA

From niki.spahiev at gmail.com  Fri May  9 12:12:02 2014
From: niki.spahiev at gmail.com (Niki Spahiev)
Date: Fri, 09 May 2014 13:12:02 +0300
Subject: [Python-ideas] OrderedDict literals

Hello,

Currently the expression (a=1, b=2) is a syntax error. If it's defined
to mean (('a', 1), ('b', 2)) it can be used when making an OrderedDict
or anything that requires named, ordered args, e.g.

    OrderedDict((a=1, b=2))

Another variant, with more changes in the VM, is

    OrderedDict(**(a=1, b=2))

Niki

From jsbueno at python.org.br  Fri May  9 13:07:49 2014
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Fri, 9 May 2014 08:07:49 -0300
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools

On 9 May 2014 05:53, Chris Angelico wrote:
> On Fri, May 9, 2014 at 6:45 PM, Peter Otten <__peter__ at web.de> wrote:
>> I'm mostly posting to suggest an alternative implementation for your
>> personal use ;)
>>
>>     def export(f):
>>         sys._getframe(1).f_globals.setdefault("__all__", []).append(f.__name__)
>>         return f

In this case it could be .... f.__globals__.setdefault ... (instead of
the sys._getframe)

Anyway, I also dislike the idea on the basis that __all__ is not that
useful in itself, and people coming from static languages (and worse,
people building "pylinters") might come to find this a "good practice"
to the point of being mandatory (else, fail the linter): and voilà: a
lot of noise to the language.

js
-><-

> sys._getframe, in my opinion, isn't so much code "smell" as
> "Mythbusters' 1987 Chevrolet"... :)
>
> ChrisA

From ncoghlan at gmail.com  Fri May  9 13:58:51 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 9 May 2014 21:58:51 +1000
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools

On 9 May 2014 18:53, Chris Angelico wrote:
> sys._getframe, in my opinion, isn't so much code "smell" as
> "Mythbusters' 1987 Chevrolet"... :)

As this subthread suggests, the main problem with the concept of
decorator-based public/private markings is "If the implementation is
hard to explain, it's a bad idea."
You essentially have to use some form of dynamic scoping in order to
modify __all__ in the right module, and then that limits your ability to
wrap the export decorator inside other helper functions.

In many cases, the fact that underscore-prefixed names are excluded from
the implicit __all__ is sufficient to avoid the need to worry too much
about defining an explicit __all__ attribute. Problems typically only
arise due to imported modules being implicitly re-exported.

If folks really want to avoid defining an explicit __all__, then it
isn't that hard to define a helper function that allows a module to be
finished with a line like:

    __all__ = all_without_modules(globals())

It generally isn't worth the hassle, though, especially when star
imports are strongly discouraged in the first place.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From barry at python.org  Fri May  9 19:25:03 2014
From: barry at python.org (Barry Warsaw)
Date: Fri, 9 May 2014 13:25:03 -0400
Subject: [Python-ideas] Things I wish Pip learned from Npm
References: <536A9C1D.6000703@egenix.com> <20140508152231.3387e6d4@anarchist.wooz.org>
Message-ID: <20140509132503.0fd64423@anarchist.wooz.org>

On May 08, 2014, at 03:22 PM, Barry Warsaw wrote:

> Debian and Ubuntu won't be providing pip automatically outside of a
> virtualenv, but we'll provide some hints as to how to install it using
> the OS package manager.

I should clarify this. I think python3-pip should be a Recommends (for
python3 I guess) and apt-get installs Recommends by default, so to most
people it'll seem like you get it automatically. But you can disable
this with apt-get's --no-install-recommends flag.

Cheers,
-Barry

From greg.ewing at canterbury.ac.nz  Sat May 10 01:15:43 2014
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 10 May 2014 11:15:43 +1200
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools
Message-ID: <536D619F.2090007@canterbury.ac.nz>

Nick Coghlan wrote:
> You essentially have to use some form of dynamic scoping in order to
> modify __all__ in the right module, and then that limits your ability
> to wrap the export decorator inside other helper functions.

The version that uses the function's f_globals directly doesn't have
that problem.

--
Greg

From ncoghlan at gmail.com  Sat May 10 14:02:52 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 10 May 2014 22:02:52 +1000
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools
In-Reply-To: <536D619F.2090007@canterbury.ac.nz>
References: <536D619F.2090007@canterbury.ac.nz>

On 10 May 2014 09:16, "Greg Ewing" wrote:
> The version that uses the function's f_globals directly doesn't have
> that problem.

It has a different problem: f_globals may come from a wrapper function
applied by a decorator that lives in a different module (or replace it
with a callable that has no "f_globals" at all).
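For instance (a made-up helper module, purely to illustrate the failure
mode):

    # helpers.py
    import functools

    def trace(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            print('calling', func.__name__)
            return func(*args, **kwargs)
        return wrapper

    # mymodule.py -- assume an f_globals-based export() is in scope
    from helpers import trace

    @export   # export sees wrapper.__globals__, which is helpers'
    @trace    # namespace, so the name lands in helpers.__all__
    def f():
        pass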
An export decorator like this is a neat idea that might work within the
confines of a single project or organisation, but it's inherently too
fragile to make it a generally available part of the standard library.

Cheers,
Nick.

From dw+python-ideas at hmmz.org  Sat May 10 20:04:36 2014
From: dw+python-ideas at hmmz.org (dw+python-ideas at hmmz.org)
Date: Sat, 10 May 2014 18:04:36 +0000
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools
Message-ID: <20140510180436.GA16032@k2>

On Fri, May 09, 2014 at 10:45:33AM +0200, Peter Otten wrote:

> I rarely use star imports, and the decorator is mostly noise in my
> eyes, so that's a clear -1 from me.

__all__ also prettifies e.g. pydoc output, which I'd consider a +1. I'm
overall not fond of hiding simple mechanisms using unnecessary magic,
especially when the result doesn't generalize to all possible uses of
that mechanism (e.g. this approach doesn't work for exporting simple
variables).

> def export(f):
>     sys._getframe(1).f_globals.setdefault("__all__", []).append(f.__name__)
>     return f

How about:

    import sys

    def export(fn):
        mod = sys.modules[fn.__module__]
        lst = vars(mod).setdefault('__all__', [])
        lst.append(fn.__name__)
        return fn

From markus at unterwaditzer.net  Sat May 10 21:30:55 2014
From: markus at unterwaditzer.net (Markus Unterwaditzer)
Date: Sat, 10 May 2014 21:30:55 +0200
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools
Message-ID: <20140510193055.GA956@chromebot.unti>

On Fri, May 09, 2014 at 12:38:34AM -0500, Bill Winslow wrote:
> Hey guys.
>
> This is a simple pattern I've added to my own "myutils.py" file, which
> I think could see wide use if added to the standard library.
>
> Simply, it's the following function meant to be used primarily as a
> decorator.
>     def export(func):
>         global __all__
>         if callable(func) and hasattr(func, '__name__'):
>             try:
>                 __all__.append(func.__name__)
>             except NameError:
>                 __all__ = [func.__name__]
>         return func

Another version which doesn't have the bug mentioned by Chris, and works
without getframe:

    import sys

    def export(f):
        module = sys.modules[f.__module__]
        if not hasattr(module, '__all__'):
            module.__all__ = []
        module.__all__.append(f.__name__)
        return f

-- Markus

From ram.rachum at gmail.com  Thu May 15 22:02:56 2014
From: ram.rachum at gmail.com (Ram Rachum)
Date: Thu, 15 May 2014 13:02:56 -0700 (PDT)
Subject: [Python-ideas] Expose `itertools.count.start` and implement `itertools.count.__eq__` based on it, like `range`.
Message-ID: <082cd87a-aeb5-49bf-9f79-d99a6d18e402@googlegroups.com>

I suggest exposing `itertools.count.start` and implementing
`itertools.count.__eq__` based on it. This'll provide the same benefits
that `range` got by exposing `range.start` and allowing `range.__eq__`.

From tjreedy at udel.edu  Fri May 16 00:51:57 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 15 May 2014 18:51:57 -0400
Subject: [Python-ideas] Expose `itertools.count.start` and implement `itertools.count.__eq__` based on it, like `range`.
In-Reply-To: <082cd87a-aeb5-49bf-9f79-d99a6d18e402@googlegroups.com>
References: <082cd87a-aeb5-49bf-9f79-d99a6d18e402@googlegroups.com>

On 5/15/2014 4:02 PM, Ram Rachum wrote:
> I suggest exposing `itertools.count.start` and implementing
> `itertools.count.__eq__` based on it. This'll provide the same benefits
> that `range` got by exposing `range.start` and allowing `range.__eq__`.

The benefits cannot be the same because range and count are in different
categories.

A range object is an immutable, constant-attribute, reiterable sequence
object. It makes sense to expose the read-only constants and compare on
the basis of them. This is as sensible as comparing other sequences.

A count is an iterator. We do not try to compare iterators (except by
identity). The start value is only the initial value yielded. As soon as
values are pulled from the iterator, the starting value is history.

The generator equivalent in the doc can be condensed a bit to how I
would actually write it.

    def count(start=0, step=1):
        while True:
            yield start
            start += step

For an iterator class, I would save the start parameter as self.n,
.count, or .current. In other words, something equivalent to

    def __init__(self, start=0, step=1):
        self.count = start
        self.step = step

If you want an augmented iterator class, you should write one yourself
for your specific needs.
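A minimal sketch of such a wrapper (note that, per the above, comparing
by the remembered start is of dubious meaning once values have been
pulled):

    import itertools

    class Count:
        def __init__(self, start=0, step=1):
            self.start = start
            self.step = step
            self._it = itertools.count(start, step)

        def __iter__(self):
            return self

        def __next__(self):
            return next(self._it)

        def __eq__(self, other):
            return (isinstance(other, Count) and
                    (self.start, self.step) == (other.start, other.step))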
--
Terry Jan Reedy

From steve at pearwood.info  Fri May 16 01:35:36 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 16 May 2014 09:35:36 +1000
Subject: [Python-ideas] Expose `itertools.count.start` and implement `itertools.count.__eq__` based on it, like `range`.
In-Reply-To: <082cd87a-aeb5-49bf-9f79-d99a6d18e402@googlegroups.com>
References: <082cd87a-aeb5-49bf-9f79-d99a6d18e402@googlegroups.com>
Message-ID: <20140515233536.GO4273@ando>

On Thu, May 15, 2014 at 01:02:56PM -0700, Ram Rachum wrote:
> I suggest exposing `itertools.count.start` and implementing
> `itertools.count.__eq__` based on it. This'll provide the same benefits
> that `range` got by exposing `range.start` and allowing `range.__eq__`.

What benefits are those? Under what circumstances have you compared two
range objects or checked their start?

That's a serious question -- I don't recall ever wanting to compare
range objects for equality.

The iterator protocol is intentionally very simple, and I think that is
a good thing. Adding complexity to one specific standard iterator
without a good, solid use-case does not strike me as a good idea.

But even if you have a good use-case, I don't think the concept of
equality for count objects is very well defined. Consider:

    from itertools import count
    a = count(1)
    b = count(1)
    _ = next(b); _ = next(b)
    c = count(3)

a.start and b.start are the same, so one might argue that a and b should
compare equal. But next(a) and next(b) are different, so one might
equally argue that a and b should compare unequal. Likewise b.start and
c.start are different, but next(b) and next(c) return the same value, so
one might expect b and c to be both equal and unequal.

I think, whichever definition of equality you pick, people will be
surprised by it fifty percent of the time.

--
Steven

From ram.rachum at gmail.com  Thu May 15 22:04:02 2014
From: ram.rachum at gmail.com (Ram Rachum)
Date: Thu, 15 May 2014 13:04:02 -0700 (PDT)
Subject: [Python-ideas] Expose `itertools.count.start` and implement `itertools.count.__eq__` based on it, like `range`.
In-Reply-To: <082cd87a-aeb5-49bf-9f79-d99a6d18e402@googlegroups.com>
References: <082cd87a-aeb5-49bf-9f79-d99a6d18e402@googlegroups.com>

Now that I think about it, I would ideally want `itertools.count` to be
deprecated in favor of `range(float('inf'))`, but I know that would
never happen.

On Thursday, May 15, 2014 11:02:56 PM UTC+3, Ram Rachum wrote:
> I suggest exposing `itertools.count.start` and implementing
> `itertools.count.__eq__` based on it. This'll provide the same benefits
> that `range` got by exposing `range.start` and allowing `range.__eq__`.

From guettli at thomas-guettler.de  Fri May 16 09:05:11 2014
From: guettli at thomas-guettler.de (Thomas Güttler)
Date: Fri, 16 May 2014 09:05:11 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
Message-ID: <5375B8A7.1000204@thomas-guettler.de>

Using logging (as a library) is easy in Python. But setting up logging
is done differently in nearly every environment.

I would like to have a common way to **load** the configuration. The
configuration itself should be done in the environment the script runs
in.

If you write a console script, and you want it to be reusable in
different environments, there is no default way at the moment (or at
least I don't see any) to let the environment set up the logging.

I want a standard hook: the console script should be able to call into
the surrounding environment. This improves reusability.

I know how to use dictConfig() or fileConfig(), but these methods need
parameters. And what the parameters look like should not be defined in
the reusable console script.

I think the following solution is very flexible and solves most needs to
set up logging, since I can implement your needs in for example
your_environment_module.set_up():

{{{
import importlib
import os

def defaultConfig():
    '''
    Load a module to set_up() the logging configuration of your
    environment.

    Reads the module name from:
    os.environ.get('LOGGINGCONFIG', 'loggingconfig')

    Calls set_up() on the imported module.
    Would be nice to have this as logging.config.defaultConfig.

    Related:
    https://docs.python.org/2/library/logging.config.html
    '''
    module_name = os.environ.get('LOGGINGCONFIG', 'loggingconfig')
    module = importlib.import_module(module_name)
    module.set_up()
}}}

Do you understand what I propose? What do you think?

Thomas Güttler

--
Thomas Güttler
http://thomas-guettler.de/

From solipsis at pitrou.net  Fri May 16 11:27:16 2014
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 16 May 2014 11:27:16 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
References: <5375B8A7.1000204@thomas-guettler.de>
Message-ID: <20140516112716.6002e7e8@fsol>

On Fri, 16 May 2014 09:05:11 +0200, Thomas Güttler wrote:
> I think the following solution is very flexible and solves most needs
> to set up logging, since I can implement your needs in for example
> your_environment_module.set_up()

This looks dubious to me. There is no reason to have a shared Python
logging configuration, IMO. Also, I don't understand why this is
importing a module.

If all your scripts are part of an application, then it's reasonable for
them to share a mechanism for logging configuration. But it should be
done in your application, not in Python itself.

Regards

Antoine.

From j.wielicki at sotecware.net  Fri May 16 11:32:55 2014
From: j.wielicki at sotecware.net (Jonas Wielicki)
Date: Fri, 16 May 2014 11:32:55 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
In-Reply-To: <20140516112716.6002e7e8@fsol>
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol>
Message-ID: <5375DB47.1020708@sotecware.net>

On 16.05.2014 11:27, Antoine Pitrou wrote:
> On Fri, 16 May 2014 09:05:11 +0200, Thomas Güttler wrote:
>> I think the following solution is very flexible and solves most needs
>> to set up logging, since I can implement your needs in for example
>> your_environment_module.set_up()
> > > _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>

From solipsis at pitrou.net Fri May 16 11:54:17 2014
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 16 May 2014 11:54:17 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol> <5375DB47.1020708@sotecware.net>
Message-ID: <20140516115417.0d7e3a54@fsol>

On Fri, 16 May 2014 11:32:55 +0200
Jonas Wielicki wrote:
> On 16.05.2014 11:27, Antoine Pitrou wrote:
> > On Fri, 16 May 2014 09:05:11 +0200
> > Thomas Güttler wrote:
> >>
> >> I think the following solution is very flexible and solves most needs to set up logging,
> >> since I can implement your needs in for example your_environment_module.set_up()
> >
> > This looks dubious to me. There is no reason to have a shared Python
> > logging configuration, IMO. Also, I don't understand why this is
> > importing a module.
>
> While I agree that importing a module might not be the right way, having
> a standard way to configure logging via environment variables might be
> helpful.

I entirely disagree. An environment variable is a very lousy way to specify a configuration file's location; and there is no reason to have a common logging configuration for all Python applications.

> Configuring logging is a difficult thing if done fully, like allowing
> different loglevels for different loggers. Having this implemented in
> the standard library might actually be useful (and it's also done that
> way in other languages).

What does this have to do with environment variables? logging.dictConfig() already does this.

Regards

Antoine.

From guettli at thomas-guettler.de Fri May 16 13:08:17 2014
From: guettli at thomas-guettler.de (Thomas Guettler)
Date: Fri, 16 May 2014 13:08:17 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
In-Reply-To: <20140516115417.0d7e3a54@fsol>
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol> <5375DB47.1020708@sotecware.net> <20140516115417.0d7e3a54@fsol>
Message-ID: <5375F1A1.7070602@thomas-guettler.de>

On 16.05.2014 11:54, Antoine Pitrou wrote:
> On Fri, 16 May 2014 11:32:55 +0200
> Jonas Wielicki wrote:
>> On 16.05.2014 11:27, Antoine Pitrou wrote:
>>> On Fri, 16 May 2014 09:05:11 +0200
>>> Thomas Güttler wrote:
>>>>
>>>> I think the following solution is very flexible and solves most needs to set up logging,
>>>> since I can implement your needs in for example your_environment_module.set_up()
>>>
>>> This looks dubious to me. There is no reason to have a shared Python
>>> logging configuration, IMO. Also, I don't understand why this is
>>> importing a module.
>>
>> While I agree that importing a module might not be the right way, having
>> a standard way to configure logging via environment variables might be
>> helpful.
>
> I entirely disagree. An environment variable is a very lousy way to
> specify a configuration file's location; and there is no reason to have
> a common logging configuration for all Python applications.

** I don't want a common logging configuration **

I want a standard hook to find the logging configuration. And I want it to be a Python method. If you prefer a file config, create a method which loads your config file. This would make the spec "simple and stupid". The configuration should be empty by default.
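For illustration, here is a minimal example of the module my hook would import (the module name "loggingconfig" and the set_up() entry point are just the defaults from my sketch; the configuration values are invented):

{{{
# loggingconfig.py -- written by the *user* of the console script
import logging.config

def set_up():
    logging.config.dictConfig({
        'version': 1,
        'formatters': {
            'brief': {'format': '%(levelname)s:%(name)s: %(message)s'},
        },
        'handlers': {
            'console': {
                'class': 'logging.StreamHandler',
                'formatter': 'brief',
            },
        },
        'root': {'handlers': ['console'], 'level': 'INFO'},
    })
}}}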
Only if the environment wants to have a common config should it provide one.

Thomas

-- 
Thomas Guettler
http://www.thomas-guettler.de/

From solipsis at pitrou.net Fri May 16 13:14:47 2014
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 16 May 2014 13:14:47 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol> <5375DB47.1020708@sotecware.net> <20140516115417.0d7e3a54@fsol> <5375F1A1.7070602@thomas-guettler.de>
Message-ID: <20140516131447.6d4ba9ef@fsol>

On Fri, 16 May 2014 13:08:17 +0200
Thomas Guettler wrote:
> On 16.05.2014 11:54, Antoine Pitrou wrote:
> > On Fri, 16 May 2014 11:32:55 +0200
> > Jonas Wielicki
> > wrote:
> >> On 16.05.2014 11:27, Antoine Pitrou wrote:
> >>> On Fri, 16 May 2014 09:05:11 +0200
> >>> Thomas Güttler
> >>> wrote:
> >>>>
> >>>> I think the following solution is very flexible and solves most needs to set up logging,
> >>>> since I can implement your needs in for example your_environment_module.set_up()
> >>>
> >>> This looks dubious to me. There is no reason to have a shared Python
> >>> logging configuration, IMO. Also, I don't understand why this is
> >>> importing a module.
> >>
> >> While I agree that importing a module might not be the right way, having
> >> a standard way to configure logging via environment variables might be
> >> helpful.
> >
> > I entirely disagree. An environment variable is a very lousy way to
> > specify a configuration file's location; and there is no reason to have
> > a common logging configuration for all Python applications.
>
> ** I don't want a common logging configuration **
>
> I want a standard hook to find the logging configuration.

Why would that be Python's business? If the hook is meant to be truly "standard", then it should be something like an LSB standard. End users don't really care whether some application is written in Python or another language. Why a Python-specific hook? What do users gain?

> And I want it to be a Python method.

Basically you are telling us what /you/ want, but not why it would be useful for the broader community.

Regards

Antoine.

From mal at egenix.com Fri May 16 13:34:54 2014
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 16 May 2014 13:34:54 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
In-Reply-To: <20140516115417.0d7e3a54@fsol>
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol> <5375DB47.1020708@sotecware.net> <20140516115417.0d7e3a54@fsol>
Message-ID: <5375F7DE.9080902@egenix.com>

On 16.05.2014 11:54, Antoine Pitrou wrote:
>> While I agree that importing a module might not be the right way, having
>> a standard way to configure logging via environment variables might be
>> helpful.
>
> I entirely disagree. An environment variable is a very lousy way to
> specify a configuration file's location; and there is no reason to have
> a common logging configuration for all Python applications.

Hmm, it's a fairly standard way to define config file locations esp. on Unix platforms, so I don't follow you here. Perhaps I'm just missing some context.

Such env vars are often used in application environments to override system defaults, e.g. for finding OpenSSL or ODBC config files.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 16 2014)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...
http://python.egenix.com/
________________________________________________________________________

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From accountearnstar at gmail.com Sat May 17 03:40:25 2014
From: accountearnstar at gmail.com (Chris B)
Date: Sat, 17 May 2014 03:40:25 +0200
Subject: [Python-ideas] Extending the usage of the @ decoration operator.
Message-ID: 

Right now, one can use the @ symbol only for decorations and only before function or class definition. ("A decorator is just a callable that takes a function as an argument and returns a replacement function.")

    @dec1(arg)
    @dec2
    def func(): pass

If a function has already been defined, it cannot be decorated using the decoration operator. But it can still be decorated by explicitly calling the decorator:

    @dec1(arg)
    @dec2
    def func(): pass

is equivalent to:

    def func(): pass
    func = dec1(arg)(dec2(func))

Now I propose that the @ symbol should also be usable as an assignment operator, in which case a succeeding function definition would not be decorated:

    def foo(): pass
    foo @ decorator
    def bar(): pass

is equivalent to:

    def foo(): pass
    foo = decorator(foo)
    def bar(): pass

This doesn't allow us to have stacked decorators, so the use of a tuple is needed:

    def func(): pass
    func @ (dec2, dec1(arg))

is equivalent to:

    def func(): pass
    func = dec1(arg)(dec2(func))

Why not decorate more than one function at once?:

    func1, func2, func3 @ dec1(arg), dec2

is equivalent to:

    func1 = dec1(arg)(dec2(func1))
    func2 = dec1(arg)(dec2(func2))
    func3 = dec1(arg)(dec2(func3))

or better:

    _temp1 = dec1(arg)(dec2(func1))
    _temp2 = dec1(arg)(dec2(func2))
    _temp3 = dec1(arg)(dec2(func3))
    func1, func2, func3 = _temp1, _temp2, _temp3

The @ operator would still be only used for function decoration. But it should pass any object preceding it as the (only) argument to the first callable - let's call them modifiers - in the tuple succeeding it and then pass the return value to the next modifier in the tuple. The last return value should then be assigned to the variable again. Consider the following example:

    from os.path import expandvars, abspath, normcase

    p1 = input('Insert path here: ')
    p2 = input('And another path here: ')

    # Fix the path strings
    p1, p2 @ expandvars, abspath, normcase

Functions that take more than one argument can't be used as modifiers. But simply currying them solves the problem:

    from os.path import expandvars, abspath, normcase, relpath

    def curry(f, *args, **kwargs):
        def curried_f(arg1):
            return f(arg1, *args, **kwargs)
        return curried_f

    # Fix the path strings
    p1, p2 @ expandvars, abspath, normcase, curry(relpath, start)

Storing the modifiers in a mutable like a list, one could do rather complex stuff:

    def add(a, b):
        return a + b
    def sub(...
    def mult(...
    def div(...  # ...the obvious way.

    def permutations(L):
        for _ in range(possible_permutations):
            next_permutation = ...
            yield next_permutation

    L = [curry(add, 1), curry(sub, 2), curry(mult, 3), curry(div, 4)]

    # Prints the result for all possible combinations of the four
    # operations +1, -2, *3, /4 applied to 1.
    for permutation in permutations(L):
        x = 1
        x @ permutation
        print(x)

I'm not sure where to go from here. Does this idea qualify for a PEP? Is it even possible to be implemented? Has it already been discussed? What do you think about it?
Please share your opinions, suggestions and improvements!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From python at mrabarnett.plus.com Sat May 17 04:08:31 2014
From: python at mrabarnett.plus.com (MRAB)
Date: Sat, 17 May 2014 03:08:31 +0100
Subject: [Python-ideas] Extending the usage of the @ decoration operator.
In-Reply-To: 
References: 
Message-ID: <5376C49F.3040604@mrabarnett.plus.com>

On 2014-05-17 02:40, Chris B wrote:
> Right now, one can use the @ symbol only for decorations and only before
> function or class definition. ("A decorator is just a callable that
> takes a function as an argument and returns a replacement function.")
>
> @dec1(arg)
> @dec2
> def func(): pass
>
> If a function has already been defined, it cannot be decorated using the
> decoration operator. But it can still be decorated by explicitly calling
> the decorator:
>
> @dec1(arg)
> @dec2
> def func(): pass
>
> is equivalent to:
>
> def func(): pass
> func = dec1(arg)(dec2(func))
>
> Now I propose that the @ symbol should also be usable as an assignment
> operator, in which case a succeeding function definition would not be
> decorated:
>
[snip]
There is a proposal to use @ as an operator for matrix multiplication:

http://legacy.python.org/dev/peps/pep-0465/

From steve at pearwood.info Sat May 17 04:53:22 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 17 May 2014 12:53:22 +1000
Subject: [Python-ideas] Extending the usage of the @ decoration operator.
In-Reply-To: <5376C49F.3040604@mrabarnett.plus.com>
References: <5376C49F.3040604@mrabarnett.plus.com>
Message-ID: <20140517025322.GQ4273@ando>

On Sat, May 17, 2014 at 03:08:31AM +0100, MRAB wrote:
> On 2014-05-17 02:40, Chris B wrote:
> >Right now, one can use the @ symbol only for decorations and only before
> >function or class definition. ("A decorator is just a callable that
> >takes a function as an argument and returns a replacement function.")
[...]
> There is a proposal to use @ as an operator for matrix multiplication:
>
> http://legacy.python.org/dev/peps/pep-0465/

It's not just a proposal, it's accepted and implemented in Python 3.5:

http://bugs.python.org/issue21176

So regardless of the merits of this proposal (if any), it isn't going to happen.
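(Under PEP 465, `a @ b` dispatches to the new __matmul__ special method, so the syntax is spoken for. A toy sketch of mine, not from the PEP, of the accepted semantics:

    class Mat2:
        """Toy 2x2 matrix, only to show the PEP 465 hook."""
        def __init__(self, rows):
            self.rows = rows
        def __matmul__(self, other):
            a, b = self.rows, other.rows
            return Mat2([[sum(a[i][k] * b[k][j] for k in range(2))
                          for j in range(2)] for i in range(2)])

)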
-- 
Steven

From ethan at stoneleaf.us Sat May 17 04:42:54 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Fri, 16 May 2014 19:42:54 -0700
Subject: [Python-ideas] Extending the usage of the @ decoration operator.
In-Reply-To: <5376C49F.3040604@mrabarnett.plus.com>
References: <5376C49F.3040604@mrabarnett.plus.com>
Message-ID: <5376CCAE.2090002@stoneleaf.us>

On 05/16/2014 07:08 PM, MRAB wrote:
> On 2014-05-17 02:40, Chris B wrote:
>> Right now, one can use the @ symbol only for decorations and only before
>> function or class definition. ("A decorator is just a callable that
>> takes a function as an argument and returns a replacement function.")
>>
>> @dec1(arg)
>> @dec2
>> def func(): pass
>>
>> If a function has already been defined, it cannot be decorated using the
>> decoration operator. But it can still be decorated by explicitly calling
>> the decorator:
>>
>> @dec1(arg)
>> @dec2
>> def func(): pass
>>
>> is equivalent to:
>>
>> def func(): pass
>> func = dec1(arg)(dec2(func))
>>
>> Now I propose that the @ symbol should also be usable as an assignment
>> operator, in which case a succeeding function definition would not be
>> decorated:
>>
> [snip]
> There is a proposal to use @ as an operator for matrix multiplication:
>
> http://legacy.python.org/dev/peps/pep-0465/

Which has been accepted.

--
~Ethan~

From ncoghlan at gmail.com Sat May 17 10:56:07 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 17 May 2014 18:56:07 +1000
Subject: [Python-ideas] logging.config.defaultConfig()
In-Reply-To: <5375F7DE.9080902@egenix.com>
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol> <5375DB47.1020708@sotecware.net> <20140516115417.0d7e3a54@fsol> <5375F7DE.9080902@egenix.com>
Message-ID: 

On 16 May 2014 21:34, M.-A. Lemburg wrote:
> On 16.05.2014 11:54, Antoine Pitrou wrote:
>>> While I agree that importing a module might not be the right way, having
>>> a standard way to configure logging via environment variables might be
>>> helpful.
>>
>> I entirely disagree. An environment variable is a very lousy way to
>> specify a configuration file's location; and there is no reason to have
>> a common logging configuration for all Python applications.
>
> Hmm, it's a fairly standard way to define config file locations esp.
> on Unix platforms, so I don't follow you here. Perhaps I'm just
> missing some context.
>
> Such env vars are often used in application environments to override
> system defaults, e.g. for finding OpenSSL or ODBC config files.

Python is a language runtime, not an application. Having globally configurable behaviours for a runtime is, in general, questionable, which is why we have the options to ignore the environment variables, site-packages, user site-packages and now the "isolated mode" flag that basically says "ignore *every* explicitly configurable Python setting in the environment".

For 3.2+, we defined a sensible default logging configuration (warning and above written to stderr, everything else ignored), so users should be free to just use the logging module when writing libraries without worrying about whether or not it has been configured for reporting properly. That doesn't help Python 2 users, but that's the case for a lot of things.
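To illustrate that default: with 3.2+, a library can just do the following, and warnings still reach the user on stderr even if the application never configures logging (snippet mine, not from the docs):

    import logging

    log = logging.getLogger(__name__)
    log.warning("shown on stderr via the handler of last resort")
    log.info("ignored until an application configures logging")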
Trying to provide a way to actually *configure* logging in a general way would be fraught with backwards compatibility issues when it came to interfering with frameworks (whether for writing CLI applications, web applications, or GUI applications) that already provide their own way of handling logging configuration.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Sat May 17 11:30:53 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 17 May 2014 19:30:53 +1000
Subject: [Python-ideas] Extending the usage of the @ decoration operator.
In-Reply-To: 
References: 
Message-ID: 

On 17 May 2014 11:40, Chris B wrote:
> I'm not sure where to go from here. Does this idea qualify for a PEP? Is it
> even possible to be implemented? Has it already been discussed? What do you
> think about it? Please share your opinions, suggestions and improvements!

Others have noted that this specific proposal conflicts with the already accepted matrix multiplication operator, but there are some more general questions to ask yourself when making syntax proposals:

* what problem am I trying to solve?
* how common is that problem in general?
* what are the existing solutions to that problem?
* how easy is it to make a mistake when relying on the existing solutions?
* how does the readability of the new syntax compare to existing code?
* how much harder will it be to learn Python after this proposal is added?

For example, the original decorator syntax solved a significant readability problem:

    def method(a, b, c):
        # Where is self????
        # many
        # lines
        # of
        # implementation
    method = staticmethod(method)  # Oh, it's a static method

vs

    @staticmethod
    def method(a, b, c):
        # Obviously no self needed
        # many
        # lines
        # of
        # implementation

By contrast, a new way of spelling the "method = staticmethod(method)" line isn't particularly interesting - it doesn't add much expressiveness to the language, just a new way of spelling something that can already be written out explicitly.

Adding a complicated way of avoiding writing multiple assignment statements or a helper function also isn't compelling:

    p1, p2 @ expandvars, abspath, normcase, curry(relpath, start)

vs

    def fixpath(p):
        return expandvars(abspath(normcase(relpath(p, start))))

    p1 = fixpath(input('Insert path here: '))
    p2 = fixpath(input('And another path here: '))

Python aspires to be "executable pseudocode". While we often fall short of that mark, it does at least mean we're willing to sacrifice a little brevity for the sake of clarity.

For a more recent example of a successful syntax change proposal, the numeric Python community were able to make their case for a new matrix multiplication operator because they have been trying to solve it *without* a new operator for more than a decade, but haven't been able to come up with a non-syntactic solution that they were all happy with. The PEP was accepted in short order because they were able to demonstrate two things:

1. Yes, they really needed new syntax to solve the problem properly
2. No, they weren't likely to be back in a couple of years time asking for *another* operator in 3.6 - matrix multiplication really was the only thing they had found they didn't have a good clean spelling for

http://www.curiousefficiency.org/posts/2011/02/justifying-python-language-changes.html has a few more examples of past changes that were accepted, and some of the key reasons why.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From guettli at thomas-guettler.de Sat May 17 14:04:12 2014
From: guettli at thomas-guettler.de (=?ISO-8859-1?Q?Thomas_G=FCttler?=)
Date: Sat, 17 May 2014 14:04:12 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
In-Reply-To: 
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol> <5375DB47.1020708@sotecware.net> <20140516115417.0d7e3a54@fsol> <5375F7DE.9080902@egenix.com>
Message-ID: <5377503C.2070000@thomas-guettler.de>

On 17.05.2014 10:56, Nick Coghlan wrote:
> On 16 May 2014 21:34, M.-A. Lemburg wrote:
>> On 16.05.2014 11:54, Antoine Pitrou wrote:
>>>> While I agree that importing a module might not be the right way, having
>>>> a standard way to configure logging via environment variables might be
>>>> helpful.
>>>
>>> I entirely disagree.
>>> An environment variable is a very lousy way to
>>> specify a configuration file's location; and there is no reason to have
>>> a common logging configuration for all Python applications.
>>
>> Hmm, it's a fairly standard way to define config file locations esp.
>> on Unix platforms, so I don't follow you here. Perhaps I'm just
>> missing some context.
>>
>> Such env vars are often used in application environments to override
>> system defaults, e.g. for finding OpenSSL or ODBC config files.
>
> Python is a language runtime, not an application. Having globally
> configurable behaviours for a runtime is, in general, questionable,
> which is why we have the options to ignore the environment variables,
> site-packages, user site-packages and now the "isolated mode" flag
> that basically says "ignore *every* explicitly configurable Python
> setting in the environment".

Using logging as a library works well in Python. But writing console scripts which use logging forces the developer to solve the same problems again and again: how to set up the logging?

And the developer of the console script does not know how the user of the console script wants to handle logging.

That's why all Python applications have a different way to set up the logging.

> For 3.2+, we defined a sensible default logging configuration (warning
> and above written to stderr, everything else ignored), so users should
> be free to just use the logging module when writing libraries without
> worrying about whether or not it has been configured for reporting
> properly. That doesn't help Python 2 users, but that's the case for a
> lot of things.

> Trying to provide a way to actually *configure* logging in a general
> way would be fraught with backwards compatibility issues when it came
> to interfering with frameworks (whether for writing CLI applications,
> web applications, or GUI applications) that already provide their
> own way of handling logging configuration.

Of course a standard way to get the logging configuration defined by the end user should be optional. I don't see any backwards compatibility issues.

The author of the console script should just need one line to get the defaults which the console script user wants:

{{{
import argparse

def main():
    logging.config.defaultConfig()
    argparse...
}}}

The end user can set up the logging in the way he wants:

- Log to a file
- Log to a daemon
- Format the messages the way he likes it
- ...

Since I know that some logging environments are complicated, I think it is best to hook into a method call. There are environments where fileConfig() does not solve all needs.

Please ask if you don't understand what I want.

Thomas Güttler

--
Thomas Güttler
http://thomas-guettler.de/

From solipsis at pitrou.net Sat May 17 14:08:36 2014
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 17 May 2014 14:08:36 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol> <5375DB47.1020708@sotecware.net> <20140516115417.0d7e3a54@fsol> <5375F7DE.9080902@egenix.com> <5377503C.2070000@thomas-guettler.de>
Message-ID: <20140517140836.2af88817@fsol>

On Sat, 17 May 2014 14:04:12 +0200
Thomas Güttler wrote:
>
> There are environments where fileConfig() does not solve all needs.

Please explain how.
From mal at egenix.com Sat May 17 14:27:32 2014
From: mal at egenix.com (M.-A. Lemburg)
Date: Sat, 17 May 2014 14:27:32 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
In-Reply-To: 
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol> <5375DB47.1020708@sotecware.net> <20140516115417.0d7e3a54@fsol> <5375F7DE.9080902@egenix.com>
Message-ID: <537755B4.9070003@egenix.com>

On 17.05.2014 10:56, Nick Coghlan wrote:
> On 16 May 2014 21:34, M.-A. Lemburg wrote:
>> On 16.05.2014 11:54, Antoine Pitrou wrote:
>>>> While I agree that importing a module might not be the right way, having
>>>> a standard way to configure logging via environment variables might be
>>>> helpful.
>>>
>>> I entirely disagree. An environment variable is a very lousy way to
>>> specify a configuration file's location; and there is no reason to have
>>> a common logging configuration for all Python applications.
>>
>> Hmm, it's a fairly standard way to define config file locations esp.
>> on Unix platforms, so I don't follow you here. Perhaps I'm just
>> missing some context.
>>
>> Such env vars are often used in application environments to override
>> system defaults, e.g. for finding OpenSSL or ODBC config files.
>
> Python is a language runtime, not an application. Having globally
> configurable behaviours for a runtime is, in general, questionable,
> which is why we have the options to ignore the environment variables,
> site-packages, user site-packages and now the "isolated mode" flag
> that basically says "ignore *every* explicitly configurable Python
> setting in the environment".

Right, but those options address specific use cases (e.g. for setting up testing environments). Their existence does not imply that having config variables for all of the above is a bad thing, as you seem to imply - otherwise, we wouldn't have them in the first place ;-)

Logging is just another runtime feature, just like writing PYC files or setting a search path.

Now, configuring logging is too complex to do on the command line, so pointing the runtime to a logging config file instead seems like a good idea. Of course, an application could just as well do this, so the question really is whether we should have it in general or not.

PS: Note that with "application environment" I'm referring to exactly that: a shell environment with environment options specifically set up for a specific application. You typically use those for application-specific user accounts, not globally.
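E.g. something along these lines in the application's startup code (the variable name is invented; fileConfig() is the existing stdlib API):

    import logging.config
    import os

    config_file = os.environ.get('MYAPP_LOGGING_CONF')
    if config_file:
        logging.config.fileConfig(config_file,
                                  disable_existing_loggers=False)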
--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 17 2014)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From guettli at thomas-guettler.de Sat May 17 14:41:05 2014
From: guettli at thomas-guettler.de (=?UTF-8?B?VGhvbWFzIEfDvHR0bGVy?=)
Date: Sat, 17 May 2014 14:41:05 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
In-Reply-To: <20140517140836.2af88817@fsol>
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol> <5375DB47.1020708@sotecware.net> <20140516115417.0d7e3a54@fsol> <5375F7DE.9080902@egenix.com> <5377503C.2070000@thomas-guettler.de> <20140517140836.2af88817@fsol>
Message-ID: <537758E1.7010403@thomas-guettler.de>

On 17.05.2014 14:08, Antoine Pitrou wrote:
> On Sat, 17 May 2014 14:04:12 +0200
> Thomas Güttler wrote:
>>
>> There are environments where fileConfig() does not solve all needs.
>
> Please explain how.

Suppose you want to get the config from a database or LDAP. Is this supported by fileConfig()?

--
Thomas Güttler
http://thomas-guettler.de/

From solipsis at pitrou.net Sat May 17 14:55:52 2014
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 17 May 2014 14:55:52 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol> <5375DB47.1020708@sotecware.net> <20140516115417.0d7e3a54@fsol> <5375F7DE.9080902@egenix.com> <5377503C.2070000@thomas-guettler.de> <20140517140836.2af88817@fsol> <537758E1.7010403@thomas-guettler.de>
Message-ID: <20140517145552.1436833b@fsol>

On Sat, 17 May 2014 14:41:05 +0200
Thomas Güttler wrote:
> On 17.05.2014 14:08, Antoine Pitrou wrote:
> > On Sat, 17 May 2014 14:04:12 +0200
> > Thomas Güttler wrote:
> >>
> >> There are environments where fileConfig() does not solve all needs.
> >
> > Please explain how.
>
> Suppose you want to get the config from a database or LDAP. Is
> this supported by fileConfig()?

Obviously not, but you should be able to use dictConfig() for that. Mapping the database contents to the dict representation expected by dictConfig() is a domain-specific task that cannot be provided by the standard library, so it's the application's job to provide it.

Regards

Antoine.

From ncoghlan at gmail.com Sat May 17 16:34:55 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 18 May 2014 00:34:55 +1000
Subject: [Python-ideas] logging.config.defaultConfig()
In-Reply-To: <5377503C.2070000@thomas-guettler.de>
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol> <5375DB47.1020708@sotecware.net> <20140516115417.0d7e3a54@fsol> <5375F7DE.9080902@egenix.com> <5377503C.2070000@thomas-guettler.de>
Message-ID: 

On 17 May 2014 22:05, "Thomas Güttler" wrote:
> Using logging as a library works well in Python. But writing console
> scripts which use logging forces the developer to solve the same
> problems again and again: how to set up the logging?
>
> And the developer of the console script does
> not know how the user of the console script wants to handle logging.
>
> That's why all Python applications have a different way to set up the logging.

But this is also why command line frameworks like Cement exist (disclosure: Cement was just the first example I found of the kind of full featured CLI framework I mean. There may be other examples).

I guess my question is, if an application is to the point of worrying about configuring logging, why should we handle it in the interpreter for command line applications, when we leave it to frameworks to handle for web and GUI applications?
Application configuration is a complicated problem - you have to decide what to do about global defaults, user defaults, environment variables, command line options, potentially runtime adjustable options for daemons, SIGHUP handling, etc, etc. This complexity, along with other questions like ini-format vs JSON vs YAML, is a key part of *why* PEP 391 punted on the question and just defined logging.dictConfig() instead.

Cheers,
Nick.

>
> > For 3.2+, we defined a sensible default logging configuration (warning
> > and above written to stderr, everything else ignored), so users should
> > be free to just use the logging module when writing libraries without
> > worrying about whether or not it has been configured for reporting
> > properly. That doesn't help Python 2 users, but that's the case for a
> > lot of things.
> >
> > Trying to provide a way to actually *configure* logging in a general
> > way would be fraught with backwards compatibility issues when it came
> > to interfering with frameworks (whether for writing CLI applications,
> > web applications, or GUI applications) that already provide their
> > own way of handling logging configuration.
>
> Of course a standard way to get the logging configuration defined by the end user
> should be optional. I don't see any backwards compatibility issues.
>
> The author of the console script should just need one line to
> get the defaults which the console script user wants:
>
> {{{
> import argparse
>
> def main():
>     logging.config.defaultConfig()
>     argparse...
>
> }}}
>
> The end user can set up the logging in the way he wants:
>
> - Log to a file
> - Log to a daemon
> - Format the messages the way he likes it
> - ...
>
> Since I know that some logging environments are complicated, I think
> it is best to hook into a method call. There are environments
> where fileConfig() does not solve all needs.
>
> Please ask if you don't understand what I want.
>
> Thomas Güttler
>
> --
> Thomas Güttler
> http://thomas-guettler.de/
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From guettli at thomas-guettler.de Sun May 18 09:34:52 2014
From: guettli at thomas-guettler.de (=?ISO-8859-15?Q?Thomas_G=FCttler?=)
Date: Sun, 18 May 2014 09:34:52 +0200
Subject: [Python-ideas] Application start up hooks
Message-ID: <5378629C.4040208@thomas-guettler.de>

This is the successor to the thread "logging.config.defaultConfig()". Thank you for your replies to the first post.

I thought again about my needs: setting up logging. The more abstract description is "setting up an application". And again, that's something that should not be done by Python itself. But there could be standard hooks to build a bridge between developer and operator:

- developer: develops libraries, command line apps, web apps, ...
- app user: responsible for configuring the app.

Up to now every application has its own way to set up the application, and this diversity is good. All environments are different, but this pattern is common for most applications:

1. use default config provided by the app. These defaults are from the developer of the app.
2. use default config provided by the app user. This can overwrite previous config.
3. use explicit config (for example command line arguments) provided by the app user. This can overwrite previous config.
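A rough sketch of these three steps (all names here are placeholders, not a proposed API):

{{{
import argparse
import json
import os

APP_DEFAULTS = {'loglevel': 'WARNING'}       # 1. defaults from the developer

def load_config():
    config = dict(APP_DEFAULTS)
    user_file = os.path.expanduser('~/.myapp.json')
    if os.path.exists(user_file):            # 2. defaults from the app user
        with open(user_file) as f:
            config.update(json.load(f))
    parser = argparse.ArgumentParser()       # 3. explicit config
    parser.add_argument('--loglevel')
    args = parser.parse_args()
    if args.loglevel is not None:
        config['loglevel'] = args.loglevel
    return config
}}}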
I looked at Cement. It provides the above steps, but it provides a lot of other things, too. This makes it too heavyweight a dependency for many small applications.

I like argparse for command line tools. But it misses loading defaults before parsing the command line args.

Many posts in my first thread said something like "this is not the job of Python, handle this in your application yourself". Now I think you are right. If there is a good reusable module for loading configs (setting up logging is one part of this) it can live outside the standard library. And if it is really good, it can get into the standard library in the future.

I know that it is off topic on this list, but it might be useful for other people, too: does anyone know a lightweight module for loading configuration settings?

Thomas

--
Thomas Güttler
http://thomas-guettler.de/

From tjreedy at udel.edu Sun May 18 10:30:27 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 18 May 2014 04:30:27 -0400
Subject: [Python-ideas] Application start up hooks
In-Reply-To: <5378629C.4040208@thomas-guettler.de>
References: <5378629C.4040208@thomas-guettler.de>
Message-ID: 

On 5/18/2014 3:34 AM, Thomas Güttler wrote:
[snip]
> Does anyone know a lightweight module for loading configuration settings?

Questions like this are a good subject for a python-list post.

--
Terry Jan Reedy

From machyniak at gmail.com Mon May 19 18:16:31 2014
From: machyniak at gmail.com (Pavel Machyniak)
Date: Mon, 19 May 2014 18:16:31 +0200
Subject: [Python-ideas] python configure --with-ssl
Message-ID: <537A2E5F.4060004@gmail.com>

Hello Python developers,

please add an option to Python's configure for setting a custom `openssl` installation on Python build, e.g. `--with-ssl=path` as commonly used. Otherwise it is difficult to build Python with a specific `openssl` installation/compilation, see e.g. http://stackoverflow.com/questions/22409092/coredump-when-compiling-python-with-a-custom-openssl-version.

Thank you.

Sincerely,
Pavel Machyniak
machyniak at gmail.com

From random832 at fastmail.us Mon May 19 22:40:31 2014
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Mon, 19 May 2014 16:40:31 -0400
Subject: [Python-ideas] Delivery Status Notification (Failure)
In-Reply-To: 
References: 
Message-ID: <1400532031.8059.119201577.158D70D7@webmail.messagingengine.com>

It tried to send my message to a @googlegroups.com address, likely because that address for some reason shows up in the other user's headers instead of the correct one.

On Mon, May 19, 2014, at 14:48, Mail Delivery Subsystem wrote:
> Hello random832 at fastmail.us,
>
> We're writing to let you know that the group you tried to contact
> (python-ideas) may not exist, or you may not have permission to post
> messages to the group. A few more details on why you weren't able to
> post:
>
> * You might have spelled or formatted the group name incorrectly.
> * The owner of the group may have removed this group.
> * You may need to join the group before receiving permission to post.
> * This group may not be open to posting.
>
> If you have questions related to this or any other Google Group, visit
> the Help Center at http://groups.google.com/support/.
>
> Thanks,
>
> Google Groups
>
>
> ----- Original message -----
> Message-Id: <1400525308.6670.119156649.17F929FF at webmail.messagingengine.com>
> From: random832 at fastmail.us
> To: Ram Rachum , python-ideas at googlegroups.com
> Subject: Re: [Python-ideas] Expose `itertools.count.start` and implement
>  `itertools.count.__eq__` based on it, like `range`.
> Date: Mon, 19 May 2014 14:48:28 -0400
> In-Reply-To: <082cd87a-aeb5-49bf-9f79-d99a6d18e402 at googlegroups.com>
> References: <082cd87a-aeb5-49bf-9f79-d99a6d18e402 at googlegroups.com>
>
> On Thu, May 15, 2014, at 16:02, Ram Rachum wrote:
> > I suggest exposing `itertools.count.start` and implementing
> > `itertools.count.__eq__` based on it. This'll provide the same benefits
> > that `range` got by exposing `range.start` and allowing `range.__eq__`.
>
> I think this _and_ your other request reveal a misunderstanding of what
> itertools are. They're not "magic sequences", they're generators - the
> fact that you can use either a sequence or a generator in a for loop may
> have confused you. In other words, they're more like Python 2
> dict.iteritems than Python 2 xrange. It might be more reasonable to
> propose that a new module be created for "magic sequence" objects.
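To expand on that last paragraph, by a "magic sequence" module I mean something like the following sketch (the name and details are made up):

    class Count:
        """Re-iterable 'magic sequence' counterpart of itertools.count."""
        def __init__(self, start=0, step=1):
            self.start = start
            self.step = step
        def __iter__(self):
            value = self.start
            while True:
                yield value
                value += self.step
        def __eq__(self, other):
            if isinstance(other, Count):
                return (self.start, self.step) == (other.start, other.step)
            return NotImplemented

    # Each __iter__() call returns a fresh generator, so equality by
    # (start, step) is well defined - unlike itertools.count, which is
    # itself an iterator and is consumed as you advance it.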
-- 
Random832

From nad at acm.org Tue May 20 00:30:41 2014
From: nad at acm.org (Ned Deily)
Date: Mon, 19 May 2014 15:30:41 -0700
Subject: [Python-ideas] python configure --with-ssl
References: <537A2E5F.4060004@gmail.com>
Message-ID: 

In article <537A2E5F.4060004 at gmail.com>, Pavel Machyniak wrote:
> please add an option to Python's configure for setting a custom `openssl`
> installation on Python build, e.g. `--with-ssl=path` as commonly used.
> Otherwise it is difficult to build Python with a specific `openssl`
> installation/compilation, see e.g.
> http://stackoverflow.com/questions/22409092/coredump-when-compiling-python-with-a-custom-openssl-version.

This sort of request has come up before on the Python bug tracker. With a quick search, I didn't find an exact match for what you request, although there might be a more general issue open to allow more control over all third-party libraries. However, there is http://bugs.python.org/issue5575 which provides a patch to allow control using environment variables rather than configure options. Feel free to open a new issue or comment on this one.

--
Ned Deily,
nad at acm.org

From darren.rmc at gmail.com Tue May 20 06:27:38 2014
From: darren.rmc at gmail.com (Darren McCleary)
Date: Tue, 20 May 2014 00:27:38 -0400
Subject: [Python-ideas] Break if *condition*
Message-ID: 

Hello,

This is my first Python idea suggestion. I often find myself writing code like:

    for i in iterable:
        # do something
        if i == some_condition:
            break

I feel that condensing this down to one line would be a novel idea. Those same two lines could be written as:

    for i in iterable:
        # do something
        break if i == some_condition

It seems to me this would be logical and in the same vein as conditional variable assignments (i.e. x = 0 if y == True else 1).

Thoughts?

Cheers,
Darren
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rosuav at gmail.com Tue May 20 07:23:47 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 20 May 2014 15:23:47 +1000
Subject: [Python-ideas] Break if *condition*
In-Reply-To: 
References: 
Message-ID: 

On Tue, May 20, 2014 at 2:27 PM, Darren McCleary wrote:
> if i == some_condition:
>     break
>
> I feel that condensing this down to one line would be a novel idea. Those
> same two lines could be written as:
>
> for i in iterable:
>     # do something
>     break if i == some_condition

    if i == some_condition: break

That's one line too, and works on existing Python interpreters :)

ChrisA

From theller at ctypes.org Tue May 20 10:51:22 2014
From: theller at ctypes.org (Thomas Heller)
Date: Tue, 20 May 2014 10:51:22 +0200
Subject: [Python-ideas] pathlib suggestion
Message-ID: 

Python 3.4's pathlib uses str(path) to get the full pathname as a string.

I'd like to suggest adding a property which allows access to the full pathname. IMO this should make it easier to understand the code and make it possible to search for it in sources.

I'm unsure about the name this property should get; maybe .fullpath or something like that. I'm also unsure whether there should be separate properties to get the full pathname as a string or bytes object.
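For concreteness, the behaviour I have in mind, written as plain functions (the names are placeholders, not a concrete proposal):

    import os
    import pathlib

    def fullpath(path):
        # What a .fullpath (or similar) property would return as str.
        return str(path)

    def fullpath_bytes(path):
        # A possible bytes counterpart.
        return os.fsencode(str(path))

    p = pathlib.PurePosixPath('/usr', 'lib', 'python3.4')
    print(fullpath(p))        # /usr/lib/python3.4
    print(fullpath_bytes(p))  # b'/usr/lib/python3.4'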
Opinions?

Thomas

From machyniak at gmail.com Tue May 20 11:33:58 2014
From: machyniak at gmail.com (Pavel Machyniak)
Date: Tue, 20 May 2014 11:33:58 +0200
Subject: [Python-ideas] python configure --with-ssl
In-Reply-To: 
References: <537A2E5F.4060004@gmail.com>
Message-ID: <537B2186.3080907@gmail.com>

On 20.5.2014 0:30, Ned Deily wrote:
> In article <537A2E5F.4060004 at gmail.com>,
> Pavel Machyniak wrote:
>> please add an option to Python's configure for setting a custom `openssl`
>> installation on Python build, e.g. `--with-ssl=path` as commonly used.
>> Otherwise it is difficult to build Python with a specific `openssl`
>> installation/compilation, see e.g.
>> http://stackoverflow.com/questions/22409092/coredump-when-compiling-python-with-a-custom-openssl-version.
>
> This sort of request has come up before on the Python bug tracker. With
> a quick search, I didn't find an exact match for what you request,
> although there might be a more general issue open to allow more control
> over all third-party libraries. However, there is
> http://bugs.python.org/issue5575 which provides a patch to allow control
> using environment variables rather than configure options. Feel free to
> open a new issue or comment on this one.

Thanks, I am well aware of the patch, but it does not work if there is a default openssl installation within the system (because it only adds another path to the END of the search list), and although the patch is from the year 2009 it is not released (accepted?) yet.

I will probably find some time and propose a solution/patch using configure options --with-ssl (and also --with-ssl-includes, --with-ssl-libs, --with-krb5, and maybe --with-sqlite, --with-sqlite-includes, --with-sqlite-libs as well).

Pavel Machyniak

From g.rodola at gmail.com Tue May 20 12:22:12 2014
From: g.rodola at gmail.com (Giampaolo Rodola')
Date: Tue, 20 May 2014 12:22:12 +0200
Subject: [Python-ideas] pathlib suggestion
In-Reply-To: 
References: 
Message-ID: 

On Tue, May 20, 2014 at 10:51 AM, Thomas Heller wrote:
>
> I'm unsure about the name this property should get; maybe .fullpath
> or something like that.

Probably "abspath" in order to be consistent with os.path.abspath.

--
Giampaolo - http://grodola.blogspot.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From theller at ctypes.org Tue May 20 17:08:45 2014
From: theller at ctypes.org (Thomas Heller)
Date: Tue, 20 May 2014 17:08:45 +0200
Subject: [Python-ideas] pathlib suggestion
In-Reply-To: 
References: 
Message-ID: 

On 20.05.2014 12:22, Giampaolo Rodola' wrote:
> On Tue, May 20, 2014 at 10:51 AM, Thomas Heller wrote:
>
>     I'm unsure about the name this property should get; maybe .fullpath
>     or something like that.
>
> Probably "abspath" in order to be consistent with os.path.abspath.

Well, it would not always be an absolute pathname, so .abspath looks wrong to me.

From solipsis at pitrou.net Tue May 20 17:25:18 2014
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 20 May 2014 17:25:18 +0200
Subject: [Python-ideas] pathlib suggestion
References: 
Message-ID: <20140520172518.4576b754@fsol>

On Tue, 20 May 2014 10:51:22 +0200
Thomas Heller wrote:
> Python 3.4's pathlib uses str(path) to get the full pathname
> as a string.
>
> I'd like to suggest adding a property which allows access to
> the full pathname. IMO this should make it easier to understand
> the code and make it possible to search for it in sources.
>
> I'm unsure about the name this property should get; maybe .fullpath
> or something like that. I'm also unsure whether there should be
> separate properties to get the full pathname as a string or bytes object.

.strpath perhaps? (also .bytespath if desired)

It was once proposed as a "filesystem path" protocol where classes purporting to represent filesystem paths could define e.g. a __strpath__ method returning the string representation of the path. I can only find the following allusions on python-ideas:

https://mail.python.org/pipermail/python-ideas/2012-October/016912.html
https://mail.python.org/pipermail/python-ideas/2012-October/016974.html

Regards

Antoine.

From victor.stinner at gmail.com Tue May 20 18:57:53 2014
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 20 May 2014 18:57:53 +0200
Subject: [Python-ideas] Make Python code read-only
Message-ID: 

Hi,

I'm trying to find the best option to make CPython faster. I would like to discuss here a first idea of making the Python code read-only to allow new optimizations.

Make Python code read-only
==========================

I propose to add an option to Python to make the code read-only. In this mode, module namespaces, class namespaces and function attributes become read-only. It would still be possible to add a "__readonly__ = False" marker to keep a module, a class and/or a function modifiable.

I chose to make the code read-only by default instead of the opposite. In my test, almost all code can be made read-only without major issues; only a little code requires the "__readonly__ = False" marker.

A module is only made read-only by importlib after the module is loaded. The module is still modifiable while code is executed, until importlib has set all its attributes (ex: __loader__).

I have a proof of concept: a fork of Python 3.5 making code read-only if the PYTHONREADONLY environment variable is set to 1. Commands to try it:

    hg clone http://hg.python.org/sandbox/readonly
    cd readonly && ./configure && make
    PYTHONREADONLY=1 ./python -c 'import os; os.x = 1'
    # ValueError: read-only dictionary

Status of the standard library (Lib/*.py): 139 modules are read-only, 25 are modifiable. Except for the sys module, all modules written in C are read-only. I'm surprised that so little code relies on the ability to modify everything. Most of the code can be read-only.

Optimizations possible when the code is read-only
=================================================

* Inline calls to functions.

* Replace calls to pure functions (without side effects) with the result. For example, len("abc") can be replaced with 3.

* Constants can be replaced with their values (at least for simple types like bytes, int and str).

It is for example possible to implement these optimizations by manipulating the Abstract Syntax Tree (AST) during the compilation from source code to bytecode. See my astoptimizer project which already implements similar optimizations:
https://bitbucket.org/haypo/astoptimizer

More optimizations
==================

My main motivation to make code read-only is to specialize a function: optimize a function for a specific environment (type of parameters, external symbols like other functions, etc). Checking the type of parameters can be fast (especially when implemented in C), but it would be expensive to check that all global variables used in the function were not modified since the function was "specialized".

For example, if os.path.isabs(path) is called: you have to check that the "os.path" and "os.path.isabs" attributes were not modified and that isabs() was not modified.
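A sketch in pure Python of the guards such a specialized call site would need *without* the read-only property (illustration only; the real checks would be implemented in C):

    import os.path

    _expected_path = os.path
    _expected_isabs = os.path.isabs

    def isabs_specialized(path):
        # Guards: were the globals modified since specialization?
        if os.path is _expected_path and os.path.isabs is _expected_isabs:
            return path.startswith('/')  # specialized body (POSIX str paths)
        return os.path.isabs(path)       # fall back to the generic call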
If we know that globals are read-only, these checks are no longer needed, and so it becomes cheap to decide whether the specialized function can be used or not.

It becomes possible to "learn" types (trace the execution of the application, and then compile for the recorded types). Knowing the types of function parameters, results and local variables opens an interesting class of new optimizations, but I prefer to discuss this later, after discussing the idea of making the code read-only.

One point remains unclear to me. There is a short time window between the moment a module is loaded and the moment the module is made read-only. During this window, we cannot rely on the read-only property of the code. Specialized code cannot be used safely before the module is known to be read-only. I don't know yet how the switch from "slow" code to optimized code should be implemented.

Issues with read-only code
==========================

* Currently, it's not possible to re-enable modification of a module, class or function; this keeps my implementation simple. With a registry of callbacks, it may be possible to re-enable modification and call code to disable optimizations.

* PyPy implements this, but thanks to its JIT it can re-optimize the modified code during execution. Writing a JIT is very complex; I'm trying to find a compromise between the fast PyPy and the slow CPython. Adding a JIT to CPython is out of my scope; it requires too many modifications of the code.

* With read-only code, monkey-patching cannot be used anymore. It's annoying to run tests. An obvious solution is to disable read-only mode to run tests, which can be seen as unsafe since tests are usually used to trust the code.

* The sys module cannot be made read-only because modifying sys.stdout and sys.ps1 is a common use case.

* The warnings module tries to add a __warningregistry__ global variable in the module where the warning was emitted, to not repeat warnings that should only be emitted once. The problem is that the module namespace is made read-only before this variable is added. A workaround would be to maintain these dictionaries in the warnings module directly, but it becomes harder to clear the dictionary when a module is unloaded or reloaded. Another workaround is to add __warningregistry__ before making a module read-only.

* Lazy initialization of module variables does not work anymore. A workaround is to use a mutable type. It can be a dict used as a namespace for the module's modifiable variables.

* The interactive interpreter sets a "_" variable in the builtins namespace. I have no workaround for this. The "_" variable is no longer created in read-only mode. Don't run the interactive interpreter in read-only mode.

* It is not possible yet to make the namespace of packages read-only. For example, "import encodings.utf_8" adds the symbol "utf_8" to the encodings namespace. A workaround is to load all submodules before making the namespace read-only. This cannot be done for some large packages. For example, the encodings package has a lot of submodules; only a few are needed.

Read the documentation for more information:
http://hg.python.org/sandbox/readonly/file/tip/READONLY.txt

More optimizations
==================

See my notes for all ideas to optimize CPython:
http://haypo-notes.readthedocs.org/faster_cpython.html

I explain there why I prefer to optimize CPython instead of working on PyPy or another Python implementation like Pyston, Numba or similar projects.
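PS: to make two of the workarounds from the "Issues" section concrete (only the __readonly__ marker is part of the proposal; the rest is illustration):

    # mymodule.py

    # Workaround 1: keep this module modifiable even in read-only mode.
    __readonly__ = False

    # Workaround 2 (for modules that stay read-only): do lazy
    # initialization through a mutable container created at import
    # time, instead of rebinding a module global later.
    _state = {'cache': None}

    def get_cache():
        if _state['cache'] is None:
            _state['cache'] = {}
        return _state['cache']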
Victor

From dw+python-ideas at hmmz.org Tue May 20 19:22:42 2014
From: dw+python-ideas at hmmz.org (dw+python-ideas at hmmz.org)
Date: Tue, 20 May 2014 17:22:42 +0000
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: <20140520172242.GA26262@k2>

On Tue, May 20, 2014 at 06:57:53PM +0200, Victor Stinner wrote:
> * With read-only code, monkey-patching cannot be used anymore. It's
> annoying to run tests. An obvious solution is to disable read-only
> mode to run tests, which can be seen as unsafe since tests are usually
> used to trust the code.

At least for me, this represents a material change to the philosophy of the language. While frowned upon, monkey patching is extremely useful while debugging, and occasionally in emergencies. :)

Definitely not worth it for a few extra % IMHO

David

From pmawhorter at gmail.com Tue May 20 19:36:51 2014
From: pmawhorter at gmail.com (Peter Mawhorter)
Date: Tue, 20 May 2014 10:36:51 -0700
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: <20140520172242.GA26262@k2>
References: <20140520172242.GA26262@k2>
Message-ID: 

On Tue, May 20, 2014 at 10:22 AM, wrote:
> On Tue, May 20, 2014 at 06:57:53PM +0200, Victor Stinner wrote:
>
>> * With read-only code, monkey-patching cannot be used anymore. It's
>> annoying to run tests. An obvious solution is to disable read-only
>> mode to run tests, which can be seen as unsafe since tests are usually
>> used to trust the code.
>
> At least for me, this represents a material change to the philosophy of
> the language. While frowned upon, monkey patching is extremely useful
> while debugging, and occasionally in emergencies. :)
>
> Definitely not worth it for a few extra % IMHO
>
> David
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

I think part of the point was that this read-only mode would be entirely optional. One of the main reasons that I don't use Python for all of my projects is the speed issue, so anything that's a "free" speedup seems like a great thing.

The main cost I can see here is in maintaining the readonly mode and the perhaps subtle bugs that would arise in many people's code when run in readonly mode. As an official feature, there would be a documentation and maintenance cost to the community, but I do think that there's substantial benefit; especially as an opt-in feature, if the optimizations really speed things up, this seems quite useful.

I guess the question is: how does this compare to other "drop-in" speedup solutions, like PyPy? Is it applicable to more existing code? Is it easier to apply? Does it provide a better speed increase? If there's a niche for it in one of those three areas and it's an opt-in system, I see the issue being a cost-benefit analysis of what is gained (whatever that niche is) vs. the maintenance cost in terms of bug reports etc.

-Peter Mawhorter

From rosuav at gmail.com Tue May 20 19:37:42 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 21 May 2014 03:37:42 +1000
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: 

On Wed, May 21, 2014 at 2:57 AM, Victor Stinner wrote:
> * The sys module cannot be made read-only because modifying sys.stdout
> and sys.ps1 is a common use case.

I think this highlights the biggest concern with defaulting to read-only.
Currently, most Python code won't reach into another module and change anything, but any program could monkey-patch any module at any time. You've noted that modifying sys's attributes is common enough to prevent its being made read-only; how do you know what else will be broken if this change goes through?

For that reason, even though the read-only state would be the more common one, I would strongly recommend flagging those modules which _are_ read-only, rather than those which aren't. Then it becomes a documentable part of the module's interface: "This module will be frozen when Python is run in read-only mode". Setting that flag and then modifying your own state would be a mistake on par with using assert for crucial checks; monkey-patching someone else's read-only module makes your app incompatible with read-only mode. Any problems would come from use of *both* read-only mode *and* the __readonly__ flag, rather than unexpectedly cropping up when someone loads up a module from PyPI and it turns out to depend on mutability.

Also, flagging the ones that have the changed behaviour means it's quite easy to get partial benefit from this, with no risk. In fact, you could probably turn this on for arbitrary Python programs, as long as only the standard library uses __readonly__; going the other way, having a single module that doesn't have the flag and requires mutability would prevent the whole app from being run in read-only mode.

With that (rather big, and yet quite trivial) caveat, though: Looks interesting. Optimizing for the >99% of code that doesn't do weird things makes very good sense, just as long as the <1% can be catered for.

ChrisA

From rosuav at gmail.com  Tue May 20 19:44:43 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 21 May 2014 03:44:43 +1000
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: <20140520172242.GA26262@k2>
Message-ID: 

On Wed, May 21, 2014 at 3:36 AM, Peter Mawhorter wrote:

> The main cost I can see here is in
> maintaining the read-only mode and the perhaps-subtle bugs that would
> arise in many people's code when run in read-only mode.

Here's a stupid-crazy idea to chew on. (Fortunately this is not python-good-ideas at python.org - I wouldn't have much to contribute there!)

Make the per-module flag opt-in-only, but the overall per-application flag active by default. Then, read-only mode applies to a small number of standard library modules (plus any user modules that specifically request it), and will thus be less surprising; and a small rewording of the error message (eg "... - run Python with the -O0 parameter to disable this check") would mean the monkey-patchers could still do their stuff, at the cost of this optimization. It's less likely to be surprising, because development would be done with read-only mode active, rather than "Okay, let's try this in optimized mode now - no asserts and read-only dicts... oh dear, it's not working".

Big downside: Time machine policy prevents us from going back to 2.0 and implementing it there. There's going to be an even worse boundary when people upgrade to a Python with this active by default. So it's probably better to NOT make either half active by default, but to recommend that new projects be developed with read-only mode active.
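To make that concrete, the opt-in marker could be spelled like this (entirely hypothetical, mirroring the __readonly__ attribute Victor already uses):

# mymodule.py: declares itself safe to freeze
__readonly__ = True

CONSTANT = 42

def helper():
    return CONSTANT

# In read-only mode the interpreter would then be free to reject
#     import mymodule
#     mymodule.CONSTANT = 43
# while modules without the marker keep today's behaviour.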
ChrisA

From ethan at stoneleaf.us  Tue May 20 19:34:58 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 20 May 2014 10:34:58 -0700
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: <537B9242.3080601@stoneleaf.us>

On 05/20/2014 09:57 AM, Victor Stinner wrote:
>
> I'm trying to find the best option to make CPython faster. I would
> like to discuss here a first idea of making the Python code read-only
> to allow new optimizations.

-1 to a forced read-only by default
+0 to a command-line switch to enable read-only by default

--
~Ethan~

From victor.stinner at gmail.com  Tue May 20 22:32:25 2014
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 20 May 2014 22:32:25 +0200
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: 

2014-05-20 19:37 GMT+02:00 Chris Angelico :
>> * The sys module cannot be made read-only because modifying sys.stdout
>> and sys.ps1 is a common use case.
>
> I think this highlights the biggest concern with defaulting to
> read-only.

Hum, maybe my email was unclear: the read-only mode is disabled by default.

When you enable the read-only mode, all modules are read-only except the modules explicitly configured to be modifiable.

I don't have a strong opinion on this choice. We may only make modules read-only when the read-only mode is activated and the module is explicitly configured to be read-only. Another option is to have a list of modules which should be made read-only, configurable by the application.

> With that (rather big, and yet quite trivial) caveat, though: Looks
> interesting. Optimizing for the >99% of code that doesn't do weird
> things makes very good sense, just as long as the <1% can be catered
> for.

Yeah, the whole stdlib doesn't need to be read-only to make an application faster.

Victor

From ethan at stoneleaf.us  Tue May 20 23:21:35 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 20 May 2014 14:21:35 -0700
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: <537BC75F.2060605@stoneleaf.us>

On 05/20/2014 01:32 PM, Victor Stinner wrote:
> 2014-05-20 19:37 GMT+02:00 Chris Angelico :
>>> * The sys module cannot be made read-only because modifying sys.stdout
>>> and sys.ps1 is a common use case.
>>
>> I think this highlights the biggest concern with defaulting to
>> read-only.
>
> Hum, maybe my email was unclear: the read-only mode is disabled by default.

Ah, that's good.

> Another option is to have a list of modules which should be made
> read-only, configurable by the application.

Or a list of modules that should remain read/write. As an application dev I should know which modules I am going to be modifying after initial load, so as long as I can easily add them to a read/write list I would be happy (especially when it came time to debug something).

--
~Ethan~

From ericsnowcurrently at gmail.com  Wed May 21 00:04:22 2014
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 20 May 2014 16:04:22 -0600
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: 

An interesting idea. Comments below.

On May 20, 2014 10:58 AM, "Victor Stinner" wrote:
> Make Python code read-only
> ==========================
>
> I propose to add an option to Python to make the code read-only. In
> this mode, module namespace, class namespace and function attributes
> become read-only. It will still be possible to add a "__readonly__ =
> False" marker to keep a module, a class and/or a function modifiable.
Make __readonly__ a data descriptor (getset in the C-API) on ModuleType, type, and FunctionType and people could toggle it as needed. The descriptor could look something like this (in pure Python):

import os

class ReadonlyDescriptor:

    DEFAULT = os.environ.get('PYTHONREADONLY', False)
    # captured once, i.e. later changes to PYTHONREADONLY are ignored

    def __init__(self, *, default=None):
        if default is None:
            default = self.DEFAULT
        self.default = default

    def __get__(self, obj, cls):
        if obj is None:
            return self
        try:
            return obj.__dict__['__readonly__']
        except KeyError:
            readonly = bool(self.default)
            obj.__dict__['__readonly__'] = readonly
            return readonly

    def __set__(self, obj, value):
        obj.__dict__['__readonly__'] = value

Alternatively, the object structs for the 3 types (e.g. PyModuleObject) could each grow a "readonly" field (or an extra flag option if there is an appropriate flag). The descriptor (in C) would use that instead of obj.__dict__['__readonly__']. However, I'd prefer going through __dict__.

Either way, the 3 types would share a tp_setattro implementation that checks the read-only flag. That way there's no need to make sweeping changes to the 3 types, nor to the dict type.

    def __setattr__(self, name, value):
        if self.__readonly__:
            raise AttributeError('readonly')
        super().__setattr__(name, value)

FWIW, the idea of a flag for read-only could be applied to objects in general, particularly in a future language addition. "__readonly__" is a good name for the flag, so the precedent set by the three types in this proposal would be a good one.

> I chose to make the code read-only by default instead of the opposite.
> In my test, almost all code can be made read-only without major issues;
> only a little code requires the "__readonly__ = False" marker.

Read-only by default would be backwards-incompatible, but having a command-line flag (and/or env var) to enable it would be useful.

For classes a decorator could be nice, though it should wait until it is more obviously worth doing. I'm not sure it would matter for functions, though the same decorator would probably work.

> A module is only made read-only by importlib after the module is
> loaded. The module is still modifiable while code is executed, until
> importlib has set all its attributes (ex: __loader__).

With a data descriptor and __setattr__ like I described above, there is no need to make any changes to importlib.

> Optimizations possible when the code is read-only
> =================================================
...
> More optimizations
> ==================

+1

> One point remains unclear to me. There is a short time window between
> the moment a module is loaded and the moment it is made read-only.
> During this window, we cannot rely on the read-only property of the
> code. Specialized code cannot be used safely before the module is
> known to be read-only.

How big a problem would this be in practice?

> Issues with read-only code
> ==========================
>
> * Currently, it's not possible to make a module, class or function
> modifiable again; I kept it that way to keep my implementation simple.
> With a registry of callbacks, it may be possible to re-enable
> modification and call code to disable optimizations.

With the data descriptor approach toggling read-only would work. Enabling/disabling optimizations at that point would depend on how they were implemented.

> * Lazy initialization of module variables does not work anymore. A
> workaround is to use a mutable type: a dict can serve as a namespace
> for the module's modifiable variables.

What do you mean by "lazy initialization of module variables"?
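To tie the two sketches above together, a hypothetical toy class using the descriptor (with __setattr__ tweaked to let the flag itself be toggled, which is what would make re-enabling modification possible):

class Namespace:
    __readonly__ = ReadonlyDescriptor()   # the class sketched above

    def __setattr__(self, name, value):
        # allow the flag itself to change; block everything else
        # while the namespace is frozen
        if name != '__readonly__' and self.__readonly__:
            raise AttributeError('readonly')
        super().__setattr__(name, value)

ns = Namespace()
ns.x = 1                 # allowed: not read-only by default
ns.__readonly__ = True
# ns.y = 2 would now raise AttributeError('readonly')
ns.__readonly__ = False  # toggled back; mutation works again
ns.y = 2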
> * It is not possible yet to make the namespace of packages read-only.
> For example, "import encodings.utf_8" adds the symbol "utf_8" to the
> encodings namespace. A workaround is to load all submodules before
> making the namespace read-only. This cannot be done for some large
> packages: encodings, for example, has a lot of submodules and only a
> few are needed.

If read-only is only enforced via __setattr__ then the workaround is to bind the submodule directly via pkg.__dict__.

-eric

From greg.ewing at canterbury.ac.nz  Wed May 21 00:44:25 2014
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 21 May 2014 10:44:25 +1200
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: <537BDAC9.2010100@canterbury.ac.nz>

Victor Stinner wrote:
> Optimizations possible when the code is read-only
> =================================================
>
> * Inline calls to functions.
>
> * Replace calls to pure functions (without side effect) with the
> result. For example, len("abc") can be replaced with 3.

I'm skeptical about how much difference this would make. In most of the Python code I've seen, calls to module-level functions are relatively rare -- most calls are method calls.

--
Greg

From rosuav at gmail.com  Wed May 21 00:46:04 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 21 May 2014 08:46:04 +1000
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: 

On Wed, May 21, 2014 at 6:32 AM, Victor Stinner wrote:
> 2014-05-20 19:37 GMT+02:00 Chris Angelico :
>>> * The sys module cannot be made read-only because modifying sys.stdout
>>> and sys.ps1 is a common use case.
>>
>> I think this highlights the biggest concern with defaulting to
>> read-only.
>
> Hum, maybe my email was unclear: the read-only mode is disabled by default.
>
> When you enable the read-only mode, all modules are read-only except
> the modules explicitly configured to be modifiable.

There are two read-only states:

1) Is this application running in read-only mode? (You give an example of setting this by an env var.)

2) Is this module read-only? (You give an example of setting this to False.)

It's the second one that I'm talking about. If, once you turn on read-only mode (the first state), every module is read-only except those marked __readonly__=False, you're going to have major backward incompatibility problems. All it takes is one single module that ought to be marked __readonly__=False and isn't, and read-only mode is broken.

Yes, it may be that most of the standard library can be made read-only; but I believe it would still be better to explicitly say __readonly__=True on each of those modules than __readonly__=False on the others - because of all the *non* stdlib modules.

ChrisA

From victor.stinner at gmail.com  Wed May 21 01:46:34 2014
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 21 May 2014 01:46:34 +0200
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: 

2014-05-21 0:04 GMT+02:00 Eric Snow :
> Make __readonly__ a data descriptor (getset in the C-API) on
> ModuleType, type, and FunctionType and people could toggle it as
> needed.

In my PoC, I chose to modify the builtin type "dict" directly. I don't think that I will keep this solution, because I would prefer not to touch such a critical Python type. I may use a subclass instead.

I added a dict.setreadonly() method which can be used to make a dict read-only, but a read-only dict cannot be made modifiable again.
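So with the PoC, the intended usage is roughly this (setreadonly() only exists in the patched interpreter, not in stock CPython):

d = {'x': 1}
d.setreadonly()   # freeze the dict, permanently
d['y'] = 2        # expected to fail in the patched build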
I added a type.setreadonly() method which calls type.__dict__.setreadonly(). I did this to access the underlying dict; type.setreadonly() also works on builtin types like str. For example, str.__dict__ is a mappingproxy, not the real dictionary.

> Alternatively, the object structs for the 3 types (e.g. PyModuleObject)
> could each grow a "readonly" field (or an extra flag option if there
> is an appropriate flag). The descriptor (in C) would use that instead
> of obj.__dict__['__readonly__']. However, I'd prefer going through
> __dict__.

There is already a function.__readonly__ property (I just renamed it; it was called __modifiable__ before, with the opposite meaning). It is used by importlib to make a function read-only.

> Either way, the 3 types would share a tp_setattro implementation that
> checks the read-only flag. That way there's no need to make sweeping
> changes to the 3 types, nor to the dict type.
>
>     def __setattr__(self, name, value):
>         if self.__readonly__:
>             raise AttributeError('readonly')
>         super().__setattr__(name, value)

Are you sure that it's not possible to retrieve the underlying dictionary somehow? For example, functions have a func.__dict__ attribute.

> Read-only by default would be backwards-incompatible, but having a
> command-line flag (and/or env var) to enable it would be useful.

My PoC had a PYTHONREADONLY env var to enable the read-only mode. I just added a -r command line option for the same purpose. It's disabled by default for backward compatibility. Only enable it if you want to try my optimizations :-)

> For classes a decorator could be nice, though it should wait until it
> is more obviously worth doing. I'm not sure it would matter for
> functions, though the same decorator would probably work.

I just pushed a change that makes classes read-only by default, so that nested classes are read-only too. I modified the builtin __build_class__ function for that. A decorator is called after the class is defined, which is too late. That's why I chose a class attribute.

>> One point remains unclear to me. There is a short time window between
>> the moment a module is loaded and the moment it is made read-only.
>> During this window, we cannot rely on the read-only property of the
>> code. Specialized code cannot be used safely before the module is
>> known to be read-only.
>
> How big a problem would this be in practice?

I have no idea right now :)

>> Issues with read-only code
>> ==========================
>>
>> * Currently, it's not possible to make a module, class or function
>> modifiable again; I kept it that way to keep my implementation simple.
>> With a registry of callbacks, it may be possible to re-enable
>> modification and call code to disable optimizations.
>
> With the data descriptor approach toggling read-only would work.
> Enabling/disabling optimizations at that point would depend on how
> they were implemented.

Hum, I should try to use your descriptor. I'm not sure that it works for modules and classes. (Functions already have a __readonly__ property.)

>> * Lazy initialization of module variables does not work anymore. A
>> workaround is to use a mutable type: a dict can serve as a namespace
>> for the module's modifiable variables.
>
> What do you mean by "lazy initialization of module variables"?

To reduce the memory footprint, "large" precomputed tables of the base64 module are only filled at the first call of a function needing the tables. I have also seen other modules where an import is only performed the first time it is needed.
Example: "def _lazy_import_sys(): global sys; import sys" and then "if sys is None: _lazy_import_sys(); # use sys". >> * It is not possible yet to make the namespace of packages read-only. >> For example, "import encodings.utf_8" adds the symbol "utf_8" to the >> encodings namespace. A workaround is to load all submodules before >> making the namespace read-only. This cannot be done for some large >> modules. For example, the encodings has a lot of submodules, only a >> few are needed. > > If read-only is only enforced via __setattr__ then the workaround is > to bind the submodule directly via pkg.__dict__. I don't like the idea of an "almost" read-only module object. In one of my project, I would like to emit machine code. If a module is modified whereas the machine code relies on the module read-only property, Python may crash. Victor From ncoghlan at gmail.com Wed May 21 02:00:31 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 21 May 2014 10:00:31 +1000 Subject: [Python-ideas] Make Python code read-only In-Reply-To: References: Message-ID: > More optimizations > ================== > > See my notes for all ideas to optimize CPython: > > http://haypo-notes.readthedocs.org/faster_cpython.html > > I explain there why I prefer to optimize CPython instead of working on > PyPy or another Python implementation like Pyston, Numba or similar > projects. You don't explain why you don't want to go with the selective optimisation approach of Numba. That isn't its own implementation - it's a way of marking particular functions to be accelerated. Cheers, Nick. > > Victor > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Wed May 21 02:09:03 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 21 May 2014 02:09:03 +0200 Subject: [Python-ideas] Make Python code read-only In-Reply-To: <537BDAC9.2010100@canterbury.ac.nz> References: <537BDAC9.2010100@canterbury.ac.nz> Message-ID: 2014-05-21 0:44 GMT+02:00 Greg Ewing : >> Optimizations possible when the code is read-only >> ================================================= >> >> * Inline calls to functions. >> >> * Replace calls to pure functions (without side effect) with the >> result. For example, len("abc") can be replaced with 3. > > I'm skeptical about how much difference this would make. > In most of the Python code I've seen, calls to module-level > functions are relatively rare -- most calls are method > calls. If the class is read-only and has a __slots__ class attribute, methods cannot be modified anymore. If you are able to get (compute) the type of an object, you can optimize the call to the method. Dummy example: --- chars=[] for ch in range(32, 126): chars.append(chr(ch)) print(''.join(chars)) --- Here you can guess that the type of chars in "chars.append" is list. The list.append() method is well known (and it is read-only, even if my global read-only mode is disabled, because list.append is a builtin type). You may inline the call in the body of the loop. Or you can at least move the lookup of the append method out of the loop. 
Victor

From victor.stinner at gmail.com  Wed May 21 02:16:41 2014
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 21 May 2014 02:16:41 +0200
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: 

2014-05-21 2:00 GMT+02:00 Nick Coghlan :
>> More optimizations
>> ==================
>>
>> See my notes for all ideas to optimize CPython:
>>
>> http://haypo-notes.readthedocs.org/faster_cpython.html
>>
>> I explain there why I prefer to optimize CPython instead of working on
>> PyPy or another Python implementation like Pyston, Numba or similar
>> projects.
>
> You don't explain why you don't want to go with the selective optimisation
> approach of Numba.
>
> That isn't its own implementation - it's a way of marking particular
> functions to be accelerated.

I don't want to optimize a single function, I want to optimize a whole application. If possible, I would prefer not to have to modify the application to run it faster.

Numba plays very well with numbers and arrays, but I'm not sure that it is able to inline arbitrary Python functions, for example.

Victor

From steve at pearwood.info  Wed May 21 03:42:04 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 21 May 2014 11:42:04 +1000
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: <20140521014203.GE10355@ando>

On Wed, May 21, 2014 at 03:37:42AM +1000, Chris Angelico wrote:

> With that (rather big, and yet quite trivial) caveat, though: Looks
> interesting. Optimizing for the >99% of code that doesn't do weird
> things makes very good sense, just as long as the <1% can be catered
> for.

"99% of Python code doesn't do weird things..."

It seems to me that this is a myth, or at least unjustified by the facts as we have seen them. Victor's experiment shows 25 modules from the standard library are modifiable, with 139 read-only. That's more like 15% than 1% "weird".

I don't consider setting sys.ps1 and sys.stdout to be "weird", which is why Victor has to leave sys unlocked.

Read-only by default would play havoc with such simple idioms as global variables. (Globals may be considered harmful, but they're not considered "weird", and they're also more intuitive to many beginners than function arguments and return values. Strange but true.) As much as I wish to discourage people from using the global statement to rebind globals, I consider it completely unacceptable to have to teach beginners how to disable read-only mode before they've even mastered writing simple functions.

I applaud Victor for his experiment, and would like to suggest a couple of approaches he might like to think about. I assume that read-only mode can be set on a per-module basis.

* For simplicity, read-only mode is all-or-nothing on a per module basis. If the module is locked, so are the functions and classes defined by that module. If the module is not locked, neither are the functions and classes. (By locked, I mean Victor's read-only mode where globals and class attributes cannot be re-bound, etc.)

* For backwards compatibility, any (eventual) production use of this would have to default to off. Perhaps in Python 4 or 5 we can consider defaulting to on.

* Define an (optional) module global, say, __readonly__, which defaults to False. The module author must explicitly set it to True if they wish to lock the module in read-only mode. There's no way to enable the read-only optimizations by accident, you have to explicitly turn them on.
* However there are ways to auto-detect when *not* to enable them. E.g. if a module uses the global statement in any function or method, read-only mode is disabled for that module.

* Similarly, a Python switch to enable/disable read-only mode. I don't mind if the switch --enable-readonly is true by default, so long as individual modules default to unlocked.

How about testing? It's very common, useful, and very much non-weird to reach into a module and monkey-patch it for the purposes of testing. I don't have a good solution to that, but a couple of stream-of-consciousness suggestions:

- Would it help if there was a statement "import unlocked mymodule" that forces mymodule to remain unlocked rather than read-only?

- Would it help if you could make a copy of a read-only module in an unlocked state?

- Obviously the "best" (most obvious) solution would be if there was a way to unlock modules on the fly, but Victor suggests that's hard.

--
Steven

From rosuav at gmail.com  Wed May 21 04:20:14 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 21 May 2014 12:20:14 +1000
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: <20140521014203.GE10355@ando>
References: <20140521014203.GE10355@ando>
Message-ID: 

On Wed, May 21, 2014 at 11:42 AM, Steven D'Aprano wrote:
> On Wed, May 21, 2014 at 03:37:42AM +1000, Chris Angelico wrote:
>
>> With that (rather big, and yet quite trivial) caveat, though: Looks
>> interesting. Optimizing for the >99% of code that doesn't do weird
>> things makes very good sense, just as long as the <1% can be catered
>> for.
>
> "99% of Python code doesn't do weird things..."
>
> It seems to me that this is a myth, or at least unjustified by the
> facts as we have seen them. Victor's experiment shows 25 modules from the
> standard library are modifiable, with 139 read-only. That's more like
> 15% than 1% "weird".
>
> I don't consider setting sys.ps1 and sys.stdout to be "weird", which is
> why Victor has to leave sys unlocked.

Allow me to clarify. A module mutating its own globals is not at all weird; the only thing I'm calling weird is reaching into another module's globals and changing things. In a few rare cases (like sys.ps1 and sys.stdout), this is part of the documented interface of the module; but if (in a parallel universe) Python were designed such that this sort of thing were impossible, it wouldn't be illogical to have a "sys.set_ps1()" function, because the author(s) of the sys module *expect* ps1 to be changed.

In contrast, the random module makes use of a bunch of stuff from math (importing them all with underscores, presumably to keep them out of "from random import *", although __all__ is also set), and it is NOT normal to reach in and change them. And before you say "Well, that has an underscore, of course you don't fiddle with it", other modules like imaplib will happily import without underscores - is it expected that you should be able to change imaplib.random to have it use a different random number generator? Or, for that matter, to replace some of its helper functions like imaplib.Int2AP? That, I think, would be considered weird.

So there are 15% that change their own globals, which is fine. In this particular instance, we can't optimize for the whole of the 99%, but I maintain that the 15% is not all "weird" just because it's not optimizable. How many modules actually expect that their globals will be externally changed?

ChrisA

From greg at krypto.org  Wed May 21 06:45:26 2014
From: greg at krypto.org (Gregory P. Smith)
Date: Tue, 20 May 2014 21:45:26 -0700
Subject: [Python-ideas] python configure --with-ssl
In-Reply-To: <537B2186.3080907@gmail.com>
References: <537A2E5F.4060004@gmail.com> <537B2186.3080907@gmail.com>
Message-ID: 

On Tue, May 20, 2014 at 2:33 AM, Pavel Machyniak wrote:

> On 20.5.2014 0:30, Ned Deily wrote:
> > In article <537A2E5F.4060004 at gmail.com>,
> > Pavel Machyniak wrote:
> >> please add option to python configure for setting custom `openssl`
> >> installation on python build, eg `--with-ssl=path` as used commonly.
> >> Otherwise it is difficult to build python with specific `openssl`
> >> installation/compilation, see eg.
> >> http://stackoverflow.com/questions/22409092/coredump-when-compiling-python-with-a-custom-openssl-version.
> >
> > This sort of request has come up before on the Python bug tracker. With
> > a quick search, I didn't find an exact match for what you request,
> > although there might be a more general issue open to allow more control
> > over all third-party libraries. However, there is
> > http://bugs.python.org/issue5575 which provides a patch to allow control
> > using environment variables rather than configure options. Feel free to
> > open a new issue or comment on this one.
>
> Thanks,
>
> I am well aware of the patch but it does not work if there is a default
> openssl installation within the system (because it only adds another
> path to the END of the search list), and although the patch is from the
> year 2009 it is not released (accepted?) yet.
>
> I will probably find some time and propose the solution/patch using
> configure options --with-ssl (and also --with-ssl-includes,
> --with-ssl-libs, --with_krb5, and maybe --with-sqlite,
> --with-sqlite-includes, --with-sqlite-libs as well).

If you ever go so far as to include options for everything, please include the ability to point to a specific path for readline, ncurses and zlib as well. :)

From ericsnowcurrently at gmail.com  Wed May 21 08:26:03 2014
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 21 May 2014 00:26:03 -0600
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: 

On Tue, May 20, 2014 at 10:57 AM, Victor Stinner wrote:
> Issues with read-only code
> ==========================

Other things to consider:

* reload() will no longer work (it loads into the existing module ns)
* the module-replaces-self-in-sys-modules hack will be weird
* class decorators that modify the class will no longer work
* caching class attrs that are lazily set by instances will no longer
  work (similar to modules)
* singletons stored on the class will break

-eric

From greg.ewing at canterbury.ac.nz  Wed May 21 08:29:11 2014
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 21 May 2014 18:29:11 +1200
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: <20140521014203.GE10355@ando>
References: <20140521014203.GE10355@ando>
Message-ID: <537C47B7.2080404@canterbury.ac.nz>

Steven D'Aprano wrote:
> Read-only by default would play havoc with such simple idioms as global
> variables.

I don't see why there couldn't be a way to exempt selected names in a module from read-only status. An exemption could be inferred whenever a name is referenced by a 'global' statement.
There should also be a way to explicitly mark a name as exempt, to take care of sys.stdout etc., and cases where the only mutations are done from a different module, so there is no global statement.

For modules implemented in Python, the explicit marker could consist of a global statement at the top level, which is currently allowed but redundant.

--
Greg

From theller at ctypes.org  Wed May 21 08:58:48 2014
From: theller at ctypes.org (Thomas Heller)
Date: Wed, 21 May 2014 08:58:48 +0200
Subject: [Python-ideas] pathlib suggestion
In-Reply-To: <20140520172518.4576b754@fsol>
References: <20140520172518.4576b754@fsol>
Message-ID: 

On 20.05.2014 17:25, Antoine Pitrou wrote:
> On Tue, 20 May 2014 10:51:22 +0200
> Thomas Heller wrote:
>> Python 3.4's pathlib uses str(path) to get the full pathname
>> as a string.
>>
>> I'd like to suggest adding a property which allows access to
>> the full pathname. IMO this should make it easier to understand
>> the code and make it possible to search for it in sources.
>>
>> I'm unsure about the name this property should get; maybe .fullpath
>> or something like that. I'm also unsure whether there should be
>> separate properties to get the full pathname as a string or bytes object.
>
> .strpath perhaps?
> (also .bytespath if desired)

The names .strpath and .bytespath look good to me.

> It was once proposed as a "filesystem path" protocol where classes
> purporting to represent filesystem paths could define e.g. a
> __strpath__ method returning the string representation of the path. I
> can only find the following allusions on python-ideas:
> https://mail.python.org/pipermail/python-ideas/2012-October/016912.html
> https://mail.python.org/pipermail/python-ideas/2012-October/016974.html

This is not directly related to my proposal, but it may be a good idea. So __strpath__() would return the .strpath property?

Thomas

From machyniak at gmail.com  Wed May 21 10:46:32 2014
From: machyniak at gmail.com (Pavel Machyniak)
Date: Wed, 21 May 2014 10:46:32 +0200
Subject: [Python-ideas] python configure --with-ssl
In-Reply-To: 
References: <537A2E5F.4060004@gmail.com> <537B2186.3080907@gmail.com>
Message-ID: <537C67E8.4010505@gmail.com>

On 21.5.2014 6:45, Gregory P. Smith wrote:
> On Tue, May 20, 2014 at 2:33 AM, Pavel Machyniak wrote:
>> On 20.5.2014 0:30, Ned Deily wrote:
>>> In article <537A2E5F.4060004 at gmail.com>,
>>> Pavel Machyniak wrote:
>>>> please add option to python configure for setting custom `openssl`
>>>> installation on python build, eg `--with-ssl=path` as used commonly.
>>>> Otherwise it is difficult to build python with specific `openssl`
>>>> installation/compilation, see eg.
>>>> http://stackoverflow.com/questions/22409092/coredump-when-compiling-python-with-a-custom-openssl-version.
>>>
>>> This sort of request has come up before on the Python bug tracker. With
>>> a quick search, I didn't find an exact match for what you request,
>>> although there might be a more general issue open to allow more control
>>> over all third-party libraries. However, there is
>>> http://bugs.python.org/issue5575 which provides a patch to allow control
>>> using environment variables rather than configure options. Feel free to
>>> open a new issue or comment on this one.
>>
>> Thanks,
>>
>> I am well aware of the patch but it does not work if there is a default
>> openssl installation within the system (because it only adds another
>> path to the END of the search list), and although the patch is from the
>> year 2009 it is not released (accepted?) yet.
>>
>> I will probably find some time and propose the solution/patch using
>> configure options --with-ssl (and also --with-ssl-includes,
>> --with-ssl-libs, --with_krb5, and maybe --with-sqlite,
>> --with-sqlite-includes, --with-sqlite-libs as well).
>
> If you ever go so far as to include options for everything, please
> include the ability to point to a specific path for readline, ncurses
> and zlib as well. :)

Created an issue, please see and comment there:
http://bugs.python.org/issue21541

From ncoghlan at gmail.com  Wed May 21 11:43:05 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 21 May 2014 19:43:05 +1000
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: <20140521014203.GE10355@ando>
References: <20140521014203.GE10355@ando>
Message-ID: 

On 21 May 2014 11:48, "Steven D'Aprano" wrote:
>
> On Wed, May 21, 2014 at 03:37:42AM +1000, Chris Angelico wrote:
>
> > With that (rather big, and yet quite trivial) caveat, though: Looks
> > interesting. Optimizing for the >99% of code that doesn't do weird
> > things makes very good sense, just as long as the <1% can be catered
> > for.
>
> "99% of Python code doesn't do weird things..."
>
> It seems to me that this is a myth, or at least unjustified by the
> facts as we have seen them. Victor's experiment shows 25 modules from the
> standard library are modifiable, with 139 read-only. That's more like
> 15% than 1% "weird".

It also misses the big reason I am a Python programmer rather than a Java programmer. For me, Python is primarily an orchestration language. It is the language for the code that is telling everything else what to do. If my Python code is an overall performance bottleneck, then "Huzzah!", as it means I have finally engineered all the other structural bottlenecks out of the system.

For this use case, monkey patching is not an incidental feature to be tolerated merely for backwards compatibility reasons: it is a key capability that makes Python an ideal language for me, as it takes ultimate control of what dependencies do away from the original author and places it in my hands as the system integrator. This is a dangerous power, not to be used lightly, but it also grants me the ability to work around critical bugs in dependencies at run time, rather than having to fork and patch the source the way Java developers tend to do.

Victor's proposal is to make Python more complicated and a worse orchestration language, for the sake of making it a better applications programming language. In isolation, it might be possible to make that case, but in the presence of PyPy for a fully dynamically optimised runtime and tools like Cython and Numba for selective optimisation within CPython, no.

Regards,
Nick.

From ned at nedbatchelder.com  Wed May 21 13:05:49 2014
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Wed, 21 May 2014 07:05:49 -0400
Subject: [Python-ideas] Disable all peephole optimizations
Message-ID: <537C888D.7060903@nedbatchelder.com>

** The problem

A long-standing problem with CPython is that the peephole optimizer cannot be completely disabled. Normally, peephole optimization is a good thing, it improves execution speed.
But in some situations, like coverage testing, it's more important to be able to reason about the code's execution. I propose that we add a way to completely disable the optimizer.

To demonstrate the problem, here is continue.py:

a = b = c = 0
for n in range(100):
    if n % 2:
        if n % 4:
            a += 1
        continue
    else:
        b += 1
    c += 1
assert a == 50 and b == 50 and c == 50

If you execute "python3.4 -m trace -c -m continue.py", it produces this continue.cover file:

    1: a = b = c = 0
  101: for n in range(100):
  100:     if n % 2:
   50:         if n % 4:
   50:             a += 1
>>>>>>         continue
           else:
   50:         b += 1
   50:     c += 1
    1: assert a == 50 and b == 50 and c == 50

This indicates that the continue line is not executed. It's true: the byte code for that statement is not executed, because the peephole optimizer has removed the jump to the jump. But in reasoning about the code, the continue statement is clearly part of the semantics of this program. If you remove the statement, the program will run differently. If you had to explain this code to a learner, you would of course describe the continue statement as part of the execution. So the trace output does not match our (correct) understanding of the program. The reason we are running trace (or coverage.py) in the first place is to learn something about our code, but it is misleading us. The peephole optimizer is interfering with our ability to reason about the code. We need a way to disable the optimizer so that this won't happen.

This type of control is well-known in C compilers, for the same reasons: when running code, optimization is good for speed; when reasoning about code, optimization gets in the way.

More details are in http://bugs.python.org/issue2506, which also includes previous discussion of the idea. This has come up on Python-Dev, and Guido seemed supportive: https://mail.python.org/pipermail/python-dev/2012-December/123099.html .

** Implementation

Although it may seem like a big change to be able to disable the optimizer, the heart of it is quite simple. In compile.c is the only call to PyCode_Optimize. That function takes a string of bytecode and returns another. If we skip that call, the peephole optimizer is disabled.

** User Interface

Unfortunately, the -O command-line switch does not lend itself to a new value that means "less optimization than the default." I propose a new switch -P, to control the peephole optimizer, with a value of -P0 meaning no optimization at all. The PYTHONPEEPHOLE environment variable would also control the option.

There are about a dozen places internal to CPython where optimization level is indicated with an integer, for example, in Py_CompileStringObject. Those uses also don't allow for new values indicating less optimization than the default: 0 and -1 already have meanings. Unless we want to start using -2 for less than the default. I'm not sure we need to provide for those values, or if the PYTHONPEEPHOLE environment variable provides enough control.

** Ramifications

This switch makes no changes to the semantics of Python programs, although clearly, if you are tracing a program, the exact sequence of lines and bytecodes will be different (this is the whole point).

In the ticket, one objection raised is that providing this option will complicate testing, and that optimization is a difficult enough thing to get right as it is. I disagree: I think providing this option will help test the optimizer, because it will give us a way to test that code runs the same with and without the optimizer.
This gives us a tool to use to demonstrate that the optimizer isn't changing the behavior of programs.

From ncoghlan at gmail.com  Wed May 21 13:41:45 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 21 May 2014 21:41:45 +1000
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: <537C888D.7060903@nedbatchelder.com>
References: <537C888D.7060903@nedbatchelder.com>
Message-ID: 

On 21 May 2014 21:06, "Ned Batchelder" wrote:
> ** Implementation
>
> Although it may seem like a big change to be able to disable the
> optimizer, the heart of it is quite simple. In compile.c is the only
> call to PyCode_Optimize. That function takes a string of bytecode and
> returns another. If we skip that call, the peephole optimizer is
> disabled.
>
> ** User Interface
>
> Unfortunately, the -O command-line switch does not lend itself to a new
> value that means "less optimization than the default." I propose a new
> switch -P, to control the peephole optimizer, with a value of -P0 meaning
> no optimization at all. The PYTHONPEEPHOLE environment variable would
> also control the option.

Since this is a CPython-specific thing, a -X named command line option would be more appropriate.

> There are about a dozen places internal to CPython where optimization
> level is indicated with an integer, for example, in
> Py_CompileStringObject. Those uses also don't allow for new values
> indicating less optimization than the default: 0 and -1 already have
> meanings. Unless we want to start using -2 for less than the default.
> I'm not sure we need to provide for those values, or if the
> PYTHONPEEPHOLE environment variable provides enough control.

I assume you want the environment variable so the setting can be inherited by subprocesses?

Cheers,
Nick.

From steve at pearwood.info  Wed May 21 14:13:19 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 21 May 2014 22:13:19 +1000
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: <537C888D.7060903@nedbatchelder.com>
References: <537C888D.7060903@nedbatchelder.com>
Message-ID: <20140521121319.GG10355@ando>

On Wed, May 21, 2014 at 07:05:49AM -0400, Ned Batchelder wrote:
> ** The problem
>
> A long-standing problem with CPython is that the peephole optimizer
> cannot be completely disabled. Normally, peephole optimization is a
> good thing, it improves execution speed. But in some situations, like
> coverage testing, it's more important to be able to reason about the
> code's execution. I propose that we add a way to completely disable the
> optimizer.

I'm not sure whether this is an argument for or against your proposal, but the continue statement shown below is *not* dead code and should not be optimized out. The assert fails if you remove the continue statement.

I don't have 3.4 on this machine to test with, but using 3.3, I can see no evidence that `continue` is optimized away. Later in your post, you say:

> It's true: the
> byte code for that statement [the continue] is not executed, because
> the peephole optimizer has removed the jump to the jump.

But that cannot be true, because if it were, the assertion would fail.
Here's your code again:

> To demonstrate the problem, here is continue.py:
>
> a = b = c = 0
> for n in range(100):
>     if n % 2:
>         if n % 4:
>             a += 1
>         continue
>     else:
>         b += 1
>     c += 1
> assert a == 50 and b == 50 and c == 50

If the continue were not executed, c would equal 100 and the assertion would fail. Have I misunderstood something?

(By the way, as given, your indents are inconsistent: some are 4 spaces and some are 5.)

--
Steven

From j.wielicki at sotecware.net  Wed May 21 14:21:50 2014
From: j.wielicki at sotecware.net (Jonas Wielicki)
Date: Wed, 21 May 2014 14:21:50 +0200
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: <20140521121319.GG10355@ando>
References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando>
Message-ID: <537C9A5E.1060502@sotecware.net>

On 21.05.2014 14:13, Steven D'Aprano wrote:
> On Wed, May 21, 2014 at 07:05:49AM -0400, Ned Batchelder wrote:
>> ** The problem
>>
>> A long-standing problem with CPython is that the peephole optimizer
>> cannot be completely disabled. Normally, peephole optimization is a
>> good thing, it improves execution speed. But in some situations, like
>> coverage testing, it's more important to be able to reason about the
>> code's execution. I propose that we add a way to completely disable the
>> optimizer.
>
> I'm not sure whether this is an argument for or against your proposal,
> but the continue statement shown below is *not* dead code and should not
> be optimized out. The assert fails if you remove the continue statement.
>
> I don't have 3.4 on this machine to test with, but using 3.3, I can see
> no evidence that `continue` is optimized away.

The logical continue is still there -- what happens is that the optimizer rewrites the `else` jump at the preceding `if` condition, which would normally point at the `continue` statement, to point at the beginning of the loop, because it would otherwise be a jump (to the continue) to a jump (to the for loop header).

Thus, the actual continue statement is not reached, but logically the code does the same thing, because the only jump that would have reached the continue was itself transformed into a continue.

> Later in your post, you say:
>
>> It's true: the
>> byte code for that statement [the continue] is not executed, because
>> the peephole optimizer has removed the jump to the jump.
>
> But that cannot be true, because if it were, the assertion would
> fail. Here's your code again:
>
>> To demonstrate the problem, here is continue.py:
>>
>> a = b = c = 0
>> for n in range(100):
>>     if n % 2:
>>         if n % 4:
>>             a += 1
>>         continue
>>     else:
>>         b += 1
>>     c += 1
>> assert a == 50 and b == 50 and c == 50
>
> If the continue were not executed, c would equal 100 and the assertion
> would fail. Have I misunderstood something?
>
> (By the way, as given, your indents are inconsistent: some are 4 spaces
> and some are 5.)
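To watch the optimizer do this rewriting, it is enough to disassemble the snippet (a sketch; opcode names and exact offsets vary between CPython versions):

import dis

src = (
    "for n in range(100):\n"
    "    if n % 2:\n"
    "        if n % 4:\n"
    "            a += 1\n"
    "        continue\n"
    "    else:\n"
    "        b += 1\n"
    "    c += 1\n"
)
# In the output, the conditional jump for "if n % 4" points back at the
# FOR_ITER at the top of the loop instead of at the continue's own jump,
# which is why trace/coverage tools report the continue line as never
# executed.
dis.dis(compile(src, "<continue>", "exec"))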
The idea sounds reasonable (pretty specialised, but that's OK). But one pitfall is that unless you encode the PYTHONPEEPHOLE setting in the bytecode filename then people will have to remember to delete all bytecode files before using the flag, or the interpreter will pick up an optimised pyc file. Or maybe pyc/pyo files should be ignored if PYTHONPEEPHOLE is set? That's probably simpler. Paul From bcannon at gmail.com Wed May 21 15:51:48 2014 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 21 May 2014 13:51:48 +0000 Subject: [Python-ideas] Disable all peephole optimizations References: <537C888D.7060903@nedbatchelder.com> Message-ID: On Wed May 21 2014 at 9:05:48 AM, Paul Moore wrote: > On 21 May 2014 12:05, Ned Batchelder wrote: > > Unfortunately, the -O command-line switch does not lend itself to a new > > value that means, "less optimization than the default." I propose a new > > switch -P, to control the peephole optimizer, with a value of -P0 > meaning no > > optimization at all. The PYTHONPEEPHOLE environment variable would also > > control the option. > > The idea sounds reasonable (pretty specialised, but that's OK). But > one pitfall is that unless you encode the PYTHONPEEPHOLE setting in > the bytecode filename then people will have to remember to delete all > bytecode files before using the flag, or the interpreter will pick up > an optimised pyc file. Or maybe pyc/pyo files should be ignored if > PYTHONPEEPHOLE is set? That's probably simpler. > There are constant rumblings about trying to make .pyc/.pyo aware of what optimizations were applied so that this kind of thing wouldn't occur. It would require tweaking how optimizations are expressed/added so that they are more easily controlled and can somehow contribute to the labeling of what optimizations were applied. All totally doable but will require thinking about the proper API and such (reading .pyc/.pyo files would also break but that's happened before when we added file size to the header and .pyc/.pyo files are viewed as internal optimizations anyway). -------------- next part -------------- An HTML attachment was scrubbed... URL: From ned at nedbatchelder.com Wed May 21 16:12:58 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Wed, 21 May 2014 10:12:58 -0400 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> Message-ID: <537CB46A.3040002@nedbatchelder.com> On 5/21/14 7:41 AM, Nick Coghlan wrote: > > > On 21 May 2014 21:06, "Ned Batchelder" > wrote: > > ** Implementation > > > > Although it may seem like a big change to be able to disable the > optimizer, the heart of it is quite simple. In compile.c is the only > call to PyCode_Optimize. That function takes a string of bytecode and > returns another. If we skip that call, the peephole optimizer is > disabled. > > > > ** User Interface > > > > Unfortunately, the -O command-line switch does not lend itself to a > new value that means, "less optimization than the default." I propose > a new switch -P, to control the peephole optimizer, with a value of > -P0 meaning no optimization at all. The PYTHONPEEPHOLE environment > variable would also control the option. > > Since this is a CPython specific thing, a -X named command line option > would be more appropriate. > I had overlooked the introduction of -X. 
Yes, that seems like the right way: -Xpeephole=0

>> There are about a dozen places internal to CPython where optimization
>> level is indicated with an integer, for example, in
>> Py_CompileStringObject. Those uses also don't allow for new values
>> indicating less optimization than the default: 0 and -1 already have
>> meanings. Unless we want to start using -2 for less than the default.
>> I'm not sure we need to provide for those values, or if the
>> PYTHONPEEPHOLE environment variable provides enough control.
>
> I assume you want the environment variable so the setting can be
> inherited by subprocesses?

It allows it to be inherited by subprocesses, yes. I was hoping it would mean the setting would be available deeper in the interpreter, but now that I think about it, environment variables are interpreted at the top of the interpreter, and then the settings are passed along internally. I'll do a survey to figure out where the setting has to be plumbed through the layers to get to compile.c properly.

--Ned.

> Cheers,
> Nick.

From ned at nedbatchelder.com  Wed May 21 16:13:57 2014
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Wed, 21 May 2014 10:13:57 -0400
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: 
References: <537C888D.7060903@nedbatchelder.com>
Message-ID: <537CB4A5.1030003@nedbatchelder.com>

On 5/21/14 9:05 AM, Paul Moore wrote:
> On 21 May 2014 12:05, Ned Batchelder wrote:
>> Unfortunately, the -O command-line switch does not lend itself to a new
>> value that means "less optimization than the default." I propose a new
>> switch -P, to control the peephole optimizer, with a value of -P0 meaning no
>> optimization at all. The PYTHONPEEPHOLE environment variable would also
>> control the option.
>
> The idea sounds reasonable (pretty specialised, but that's OK). But
> one pitfall is that unless you encode the PYTHONPEEPHOLE setting in
> the bytecode filename then people will have to remember to delete all
> bytecode files before using the flag, or the interpreter will pick up
> an optimised pyc file. Or maybe pyc/pyo files should be ignored if
> PYTHONPEEPHOLE is set? That's probably simpler.

For my use case, it would be enough to use whatever .pyc files the interpreter finds. For a testing scenario, it is fine to delete all the .pyc files, set PYTHONPEEPHOLE, and then run the test suite to be sure to avoid optimized pyc files.

> Paul

From ned at nedbatchelder.com  Wed May 21 16:23:41 2014
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Wed, 21 May 2014 10:23:41 -0400
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: <537C9A5E.1060502@sotecware.net>
References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net>
Message-ID: <537CB6ED.2050906@nedbatchelder.com>
>> > >> >I'm not sure whether this is an argument for or against your proposal, >> >but the continue statement shown below is*not* dead code and should not >> >be optimized out. The assert fails if you remove the continue statement. >> > >> >I don't have 3.4 on this machine to test with, but using 3.3, I can see >> >no evidence that `continue` is optimized away. > The logical continue is still there -- what happens is that the > optimizer rewrites the `else` jump at the preceding `if` condition, > which would normally point at the `continue` statement, to the beginning > of the loop, because it would be a jump (to the continue) to a jump (to > the for loop header). > > Thus, the actual continue statement is not reached, but logically the > code does the same, because the only way continue would have been > reached was transformed to a continue itself. > To make the details more explicit, here is the source again, and the disassembled code, with the original source interspersed: a = b = c = 0 for n in range(100): if n % 2: if n % 4: a += 1 continue else: b += 1 c += 1 assert a == 50 and b == 50 and c == 50 Disassembled (Python 3.4, but the same effect is visible in 2.7, 3.3, etc): a = b = c = 0 1 0 LOAD_CONST 0 (0) 3 DUP_TOP 4 STORE_NAME 0 (a) 7 DUP_TOP 8 STORE_NAME 1 (b) 11 STORE_NAME 2 (c) for n in range(100): 2 14 SETUP_LOOP 79 (to 96) 17 LOAD_NAME 3 (range) 20 LOAD_CONST 1 (100) 23 CALL_FUNCTION 1 (1 positional, 0 keyword pair) 26 GET_ITER >> 27 FOR_ITER 65 (to 95) 30 STORE_NAME 4 (n) if n % 2: 3 33 LOAD_NAME 4 (n) 36 LOAD_CONST 2 (2) 39 BINARY_MODULO 40 POP_JUMP_IF_FALSE 72 if n % 4: 4 43 LOAD_NAME 4 (n) 46 LOAD_CONST 3 (4) 49 BINARY_MODULO 50 POP_JUMP_IF_FALSE 27 a += 1 5 53 LOAD_NAME 0 (a) 56 LOAD_CONST 4 (1) 59 INPLACE_ADD 60 STORE_NAME 0 (a) 63 JUMP_ABSOLUTE 27 continue 6 66 JUMP_ABSOLUTE 27 69 JUMP_FORWARD 10 (to 82) b += 1 8 >> 72 LOAD_NAME 1 (b) 75 LOAD_CONST 4 (1) 78 INPLACE_ADD 79 STORE_NAME 1 (b) c += 1 9 >> 82 LOAD_NAME 2 (c) 85 LOAD_CONST 4 (1) 88 INPLACE_ADD 89 STORE_NAME 2 (c) 92 JUMP_ABSOLUTE 27 >> 95 POP_BLOCK assert a == 50 and b == 50 and c == 50 10 >> 96 LOAD_NAME 0 (a) 99 LOAD_CONST 5 (50) 102 COMPARE_OP 2 (==) 105 POP_JUMP_IF_FALSE 132 108 LOAD_NAME 1 (b) 111 LOAD_CONST 5 (50) 114 COMPARE_OP 2 (==) 117 POP_JUMP_IF_FALSE 132 120 LOAD_NAME 2 (c) 123 LOAD_CONST 5 (50) 126 COMPARE_OP 2 (==) 129 POP_JUMP_IF_TRUE 138 >> 132 LOAD_GLOBAL 5 (AssertionError) 135 RAISE_VARARGS 1 >> 138 LOAD_CONST 6 (None) 141 RETURN_VALUE Notice that line 6 (the continue) is unreachable, because the else-jump from line 4 has been turned into a jump to bytecode offset 27 (the for loop), and the end of line 5 has also been turned into a jump to 27, rather than letting it flow to line 6. So line 6 still exists in the bytecode, but is never executed, leading tracing tools to indicate that line 6 is never executed. --Ned. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From ethan at stoneleaf.us Wed May 21 16:11:21 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 21 May 2014 07:11:21 -0700
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: <20140521014203.GE10355@ando>
Message-ID: <537CB409.5000104@stoneleaf.us>

On 05/21/2014 02:43 AM, Nick Coghlan wrote:
>
> For this use case, monkey patching is not an incidental feature to be tolerated merely for backwards compatibility
> reasons: it is a key capability that makes Python an ideal language for me, as it takes ultimate control of what
> dependencies do away from the original author and places it in my hands as the system integrator. This is a dangerous
> power, not to be used lightly, but it also grants me the ability to work around critical bugs in dependencies at run
> time, rather than having to fork and patch the source the way Java developers tend to do.

+inf

--
~Ethan~

From mulhern at gmail.com Wed May 21 16:49:36 2014
From: mulhern at gmail.com (mulhern)
Date: Wed, 21 May 2014 10:49:36 -0400
Subject: [Python-ideas] Maybe/Option builtin
Message-ID: 

I feel that a Maybe/Option type, analogous to the types found in Haskell or OCaml would actually be useful in Python. The value analogous to the None constructor should be Python's None.

Obviously, it wouldn't give the type-checking benefits that it gives in statically checked languages, but every use of a Maybe object as if it were the contained object would give an error, alerting the user to the fact that None is a possibility and allowing them to address the problem sooner rather than later.

I feel that it would be kind of tricky to implement it as a class. Something like:

class Maybe(object):

    def __init__(self, value=None):
        self.value = value

    def value(self):
        return self.value

is a start but I'm not able to see how to make

if Maybe():
    print("nothing") # never prints

but

if Maybe({}):
    print("yes a value") #always prints

which is definitely the desired behaviour.

I also think that it would be the first Python type introduced solely because of its typey properties, not because it provided any actual functionality, which might be considered unpythonic.

Any comments?

Thanks!

- mulhern

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From Steve.Dower at microsoft.com Wed May 21 16:56:23 2014
From: Steve.Dower at microsoft.com (Steve Dower)
Date: Wed, 21 May 2014 14:56:23 +0000
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: <537CB409.5000104@stoneleaf.us>
References: <20140521014203.GE10355@ando> , <537CB409.5000104@stoneleaf.us>
Message-ID: <1d4b9976723c4013addf2c8ec3c77f0f@BLUPR03MB389.namprd03.prod.outlook.com>

Another +inf from me.

Mind if I quote you on this next time I'm trying to convince C# developers to take Python seriously? :)

Top-posted from my Windows Phone

________________________________
From: Ethan Furman
Sent: 5/21/2014 7:37
To: python-ideas at python.org
Subject: Re: [Python-ideas] Make Python code read-only

On 05/21/2014 02:43 AM, Nick Coghlan wrote:
>
> For this use case, monkey patching is not an incidental feature to be tolerated merely for backwards compatibility
> reasons: it is a key capability that makes Python an ideal language for me, as it takes ultimate control of what
> dependencies do away from the original author and places it in my hands as the system integrator. This is a dangerous
> power, not to be used lightly, but it also grants me the ability to work around critical bugs in dependencies at run
> time, rather than having to fork and patch the source the way Java developers tend to do.

+inf

--
~Ethan~

_______________________________________________
Python-ideas mailing list
Python-ideas at python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From j.wielicki at sotecware.net Wed May 21 17:16:17 2014
From: j.wielicki at sotecware.net (Jonas Wielicki)
Date: Wed, 21 May 2014 17:16:17 +0200
Subject: [Python-ideas] Maybe/Option builtin
In-Reply-To: 
References: 
Message-ID: <537CC341.7030901@sotecware.net>

On 21.05.2014 16:49, mulhern wrote:
> Something like:
>
> class Maybe(object):
>
>     def __init__(self, value=None):
>         self.value = value
>
>     def value(self):
>         return self.value
>
> is a start but I'm not able to see how to make
>
> if Maybe():
>     print("nothing") # never prints
>
> but
>
> if Maybe({}):
>     print("yes a value") #always prints

Implement the __bool__ method: .

I don't have any opinion on the proposal itself.

regards,
Jonas

From ahammel87 at gmail.com Wed May 21 17:38:16 2014
From: ahammel87 at gmail.com (Alex Hammel)
Date: Wed, 21 May 2014 08:38:16 -0700
Subject: [Python-ideas] Maybe/Option builtin
In-Reply-To: 
References: 
Message-ID: 

What's a specific use case? Usually a Maybe is used to model a chain of computations which might fail. You can use exceptions for that in Python:

try:
    foo1 = bar()
    foo2 = baz(foo1)
    foo3 = quux(foo2)
except FooException:
    # recover

On occasion I've wanted to do the opposite: call a number of functions and keep the value of the first one that doesn't throw an exception. I implemented it like this. That certainly doesn't need to be a built-in, though, and I'm not convinced it belongs in the standard library. It's a relatively rare use-case, and it's easy to roll-your-own if you need it.

On Wed, May 21, 2014 at 7:49 AM, mulhern wrote:

> I feel that a Maybe/Option type, analogous to the types found in Haskell
> or OCaml would actually be useful in Python. The value analogous to the
> None constructor should be Python's None.
>
> Obviously, it wouldn't give the type-checking benefits that it gives in
> statically checked languages, but every use of a Maybe object as if it were
> the contained object would give an error, alerting the user to the fact
> that None is a possibility and allowing them to address the problem sooner
> rather than later.
>
> I feel that it would be kind of tricky to implement it as a class.
> Something like:
>
> class Maybe(object):
>
>     def __init__(self, value=None):
>         self.value = value
>
>     def value(self):
>         return self.value
>
> is a start but I'm not able to see how to make
>
> if Maybe():
>     print("nothing") # never prints
>
> but
>
> if Maybe({}):
>     print("yes a value") #always prints
>
> which is definitely the desired behaviour.
>
> I also think that it would be the first Python type introduced solely
> because of its typey properties, not because it provided any actual
> functionality, which might be considered unpythonic.
>
> Any comments?
>
> Thanks!
>
> - mulhern
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From raymond.hettinger at gmail.com Wed May 21 21:44:24 2014
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Wed, 21 May 2014 20:44:24 +0100
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: 
References: 
Message-ID: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>

On May 21, 2014, at 1:21 PM, python-ideas-request at python.org wrote:

> I propose that we add a way to completely disable the
> optimizer.

I think this opens a can of worms that is better left closed.

* We will have to start running tests both with and without the switch turned on for example (because you're exposing yet another way to run Python with different code).

* Over time, I expect that some of the functionality of the peepholer is going to be moved upstream into AST transformations, and you will have even less ability to switch something on-and-off.

* The code has been in place for over a decade and the tracker item has languished for years. That provides some evidence that the "need" here is very small.

* I sympathize with "there is an irritating dimple in coverage.py" but that hasn't actually impaired its usability beyond creating a curiosity. Using that as a reason to add a new CPython-only command-line switch seems like having the tail wag the dog.

* As the other implementations of Python continue to develop, I don't think we should tie their hands with respect to code generation.

* Ideally, the peepholer should be thought of as part of the code generation. As compilation improves over time, it should start to generate the same code as we're getting now. It probably isn't wise to expose the implementation detail that the constant folding and jump tweaks are done in a separate second pass.

* Mostly, I don't want to open a new crack in the Python veneer where people are switching on and off two different streams of code generation (currently, there is one way to do it). I can't fully articulate my instincts here, but I think we'll regret opening this door when we didn't have to.

That being said, I know how the politics of python-ideas works and I expect that my thoughts on the subject will quickly get buried by a discussion of which lettercode should be used for the command-line switch.

Hopefully, some readers will focus on the question of whether it is worth it. Others might look at ways to improve the existing code (without an off-switch) so that the continue-statement jump-to-jump shows up in your coverage tool.

IMO, adding a new command-line switch is a big deal (we should do it very infrequently, limit it to things with a big payoff, and think about whether there are any downsides). Personally, I don't see any big wins here and have a sense that there are downsides that would make us regret exposing alternate code generation.

Raymond

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From antony.lee at berkeley.edu Wed May 21 22:38:01 2014
From: antony.lee at berkeley.edu (Antony Lee)
Date: Wed, 21 May 2014 13:38:01 -0700
Subject: [Python-ideas] Another pathlib suggestion
Message-ID: 

Handling of Paths with multiple extensions is currently not so easy with pathlib.
Specifically, I don't think there is an easy way to go from "foo.tar.gz" to "foo.ext", because Path.with_suffix only replaces the last suffix. I would therefore like to suggest either 1/ add Path.replace_suffix, such that Path("foo.tar.gz").replace_suffix(".tar.gz", ".ext") == Path("foo.ext") (this would also provide extension-checking capabilities, raising ValueError if the first argument is not a valid suffix of the initial path); or 2/ add a second argument to Path.with_suffix, "n_to_strip" (although perhaps with a better name), defaulting to 1, such that Path("foo.tar.gz").with_suffix(".ext", 0) == Path("foo.tar.gz.ext") Path("foo.tar.gz").with_suffix(".ext", 1) == Path("foo.tar.ext") Path("foo.tar.gz").with_suffix(".ext", 2) == Path("foo.ext") # set n_to_strip to len(path.suffixes) for stripping all of them. Path("foo.tar.gz").with_suffix(".ext", 3) raises a ValueError. Best, Antony -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed May 21 23:14:41 2014 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 21 May 2014 22:14:41 +0100 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> Message-ID: On Wed, May 21, 2014 at 8:44 PM, Raymond Hettinger wrote: > * I sympathize with "there is an irritating dimple in coverage.py" > but that hasn't actually impaired its usability beyond creating a > curiosity. Using that a reason to add a new CPython-only > command-line switch seems like having the tail wag the dog. I've certainly been frustrated by this wart in coverage.py's output -- if one uses a dev cycle where you constantly review every uncovered line to make sure that tests are doing what you want, then even a small number of spurious uncovered lines that appear and disappear based on the optimizer's whim can result in a lot of wasted time. (Not to mention the hours wasted the first time I ran into this, trying to figure out why my tests weren't working and writing new ones specifically to target the optimized-out line...) That said, I'm also sympathetic to your point. Isn't the real problem here that the peephole optimizer violates the first rule of optimization ("don't change semantics") by breaking sys.settrace? Couldn't we fix this directly? One approach might be to enhance co_lnotab (if anyone dares touch it) so that it can record that a peepholed jump instruction logically belongs to multiple *different* lines, and when we encounter such an instruction we call the trace function multiple times. Then the peephole optimizer just has to propagate line number information whenever it short-circuits a jump. Or perhaps it would be enough to add a dead-code optimization pass after the peephole optimizer, so that coverage.py can at least see that things like Ned's "continue" didn't actually generate any code. (This is suboptimal as well, since it will still cause coverage.py to produce somewhat confusing output, as if the "continue" line had a comment instead of real code -- but it'd still be better than the status quo.) -n -- Nathaniel J. 
Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From solipsis at pitrou.net Wed May 21 23:24:14 2014 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 21 May 2014 23:24:14 +0200 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> Message-ID: <20140521232414.2720b5ed@fsol> Hi, On Wed, 21 May 2014 20:44:24 +0100 Raymond Hettinger wrote: > > I think this opens a can of worms that is better left closed. FWIW, I agree with Raymond's arguments here. Regards Antoine. From p.f.moore at gmail.com Thu May 22 00:17:59 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 21 May 2014 23:17:59 +0100 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <20140521232414.2720b5ed@fsol> References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> Message-ID: On 21 May 2014 22:24, Antoine Pitrou wrote: > On Wed, 21 May 2014 20:44:24 +0100 > Raymond Hettinger > wrote: >> >> I think this opens a can of worms that is better left closed. > > FWIW, I agree with Raymond's arguments here. I tend to agree as well. It's a pretty specialised case, and presumably tools similar to coverage for languages like C manage to deal with the issue. Like Raymond, I can't quite explain my reservations, but it feels like this proposal leans towards overspecifying implementation details, in a way that will limit future development of the optimiser. Paul From trip at flowroute.com Thu May 22 00:30:38 2014 From: trip at flowroute.com (Trip Volpe) Date: Wed, 21 May 2014 15:30:38 -0700 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> Message-ID: (First, shouldn't this be in the "disable all peephole optimizations" thread? Raymond seems to have replied to the digest..!) On Wed, May 21, 2014 at 2:14 PM, Nathaniel Smith wrote: > Isn't the real problem here that the peephole optimizer violates the > first rule of optimization ("don't change semantics") by breaking > sys.settrace? Couldn't we fix this directly? I agree with this. Adding a command line flag to tinker with code generation may well be opening a can of worms, but "the peephole optimizer shouldn't change semantics" is a more compelling argument, although fixing it from that angle is obviously more involved. One problem is that functions like settrace() expose low-level details to the higher-level semantics. It's a fair question as to whether it should be considered kosher to expose implementation details like the peephole optimizer through such interfaces. I could get behind an implementation that hides the erasure of lines that are still (semantically) being executed, without disabling the peephole optimizer. - Trip On Wed, May 21, 2014 at 2:14 PM, Nathaniel Smith wrote: > On Wed, May 21, 2014 at 8:44 PM, Raymond Hettinger > wrote: > > * I sympathize with "there is an irritating dimple in coverage.py" > > but that hasn't actually impaired its usability beyond creating a > > curiosity. Using that a reason to add a new CPython-only > > command-line switch seems like having the tail wag the dog. 
> > I've certainly been frustrated by this wart in coverage.py's output -- > if one uses a dev cycle where you constantly review every uncovered > line to make sure that tests are doing what you want, then even a > small number of spurious uncovered lines that appear and disappear > based on the optimizer's whim can result in a lot of wasted time. (Not > to mention the hours wasted the first time I ran into this, trying to > figure out why my tests weren't working and writing new ones > specifically to target the optimized-out line...) > > That said, I'm also sympathetic to your point. > > Isn't the real problem here that the peephole optimizer violates the > first rule of optimization ("don't change semantics") by breaking > sys.settrace? Couldn't we fix this directly? > > One approach might be to enhance co_lnotab (if anyone dares touch it) > so that it can record that a peepholed jump instruction logically > belongs to multiple *different* lines, and when we encounter such an > instruction we call the trace function multiple times. Then the > peephole optimizer just has to propagate line number information > whenever it short-circuits a jump. > > Or perhaps it would be enough to add a dead-code optimization pass > after the peephole optimizer, so that coverage.py can at least see > that things like Ned's "continue" didn't actually generate any code. > (This is suboptimal as well, since it will still cause coverage.py to > produce somewhat confusing output, as if the "continue" line had a > comment instead of real code -- but it'd still be better than the > status quo.) > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Wed May 21 23:37:36 2014 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 21 May 2014 14:37:36 -0700 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <20140521232414.2720b5ed@fsol> References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> Message-ID: <537D1CA0.7070106@stoneleaf.us> On 05/21/2014 02:24 PM, Antoine Pitrou wrote: > On Wed, 21 May 2014 20:44:24 +0100 Raymond Hettinger wrote: >> >> I think this opens a can of worms that is better left closed. > > FWIW, I agree with Raymond's arguments here. Wow, did a new star fire up somewhere? I also agree. :) -- ~Ethan~ From p.f.moore at gmail.com Thu May 22 00:47:24 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 21 May 2014 23:47:24 +0100 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> Message-ID: On 21 May 2014 23:30, Trip Volpe wrote: > On Wed, May 21, 2014 at 2:14 PM, Nathaniel Smith wrote: >> Isn't the real problem here that the peephole optimizer violates the >> first rule of optimization ("don't change semantics") by breaking >> sys.settrace? Couldn't we fix this directly? > > I agree with this. Adding a command line flag to tinker with code generation > may well be opening a can of worms, but "the peephole optimizer shouldn't > change semantics" is a more compelling argument, although fixing it from > that angle is obviously more involved. 
One problem is that functions like > settrace() expose low-level details to the higher-level semantics. It's a > fair question as to whether it should be considered kosher to expose > implementation details like the peephole optimizer through such interfaces.

While I'm happy to be proved wrong with code, my instinct is that "making sys.settrace work" would likely be too complex to be practical.

In any case, as you say, it exposes low-level details, and I would personally consider "glitches" like this as implementation details. To put it another way, I don't consider the exact lines traced by sys.settrace to be part of the semantics of a program, any more than I consider the output of dis.dis to be. So in my view it is acceptable for the optimiser to change the lines that get traced in the way that coverage experienced.

Paul.

From ned at nedbatchelder.com Thu May 22 00:51:41 2014
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Wed, 21 May 2014 18:51:41 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
Message-ID: <537D2DFD.6000504@nedbatchelder.com>

On 5/21/14 3:44 PM, Raymond Hettinger wrote:
>
> On May 21, 2014, at 1:21 PM, python-ideas-request at python.org wrote:
>
>> I propose that we add a way to completely disable the
>> optimizer.
>
> I think this opens a can of worms that is better left closed.
>
> * We will have to start running tests both with and without the switch
> turned on for example (because you're exposing yet another way to
> run Python with different code).

Yes, this could mean an increased testing burden. But that scales horizontally, and will not require a large amount of engineering work. Besides, what better way to test the optimizer?
>
> * Over time, I expect that some of the functionality of the peepholer
> is going to be moved upstream into AST transformations, and you will
> have even less ability to switch something on-and-off.

I'm perfectly happy to remove the word "peephole" from the feature. If we expect the set of optimizations to grow in the future, then we can expect that more cases of code analysis will be misled by optimizations. All the more reason to establish a way now that will disable all optimizations.
>
> * The code has been in place for over a decade and
> the tracker item has languished for years. That provides some
> evidence that the "need" here is very small.
>
> * I sympathize with "there is an irritating dimple in coverage.py"
> but that hasn't actually impaired its usability beyond creating a
> curiosity. Using that as a reason to add a new CPython-only
> command-line switch seems like having the tail wag the dog.

I don't think you should dismiss real users' concerns as a curiosity. We already have -X as a way to provide implementation-specific switches, I'm not sure why the CPython-only nature of this is an issue?
>
> * As the other implementations of Python continue to develop,
> I don't think we should tie their hands with respect to code
> generation.

This proposal only applies to CPython.
>
> * Ideally, the peepholer should be thought of as part of the code
> generation. As compilation improves over time, it should start
> to generate the same code as we're getting now. It probably
> isn't wise to expose the implementation detail that the constant
> folding and jump tweaks are done in a separate second pass.

I'm happy to remove the word "peephole".
I think a way to disable optimization is useful. I've heard the concern from a number of coverage.py users. If as we all think, optimizations will expand in CPython, then the number of mis-diagnosed code problems will grow. --Ned. > > * Mostly, I don't want to open a new crack in the Python veneer > where people are switching on and off two different streams of > code generation (currently, there is one way to do it). I can't > fully articulate my instincts here, but I think we'll regret opening > this door when we didn't have to. > > That being said, I know how the politics of python-ideas works > and I expect that my thoughts on the subject will quickly get > buried by a discussion of which lettercode should be used for the > command-line switch. > > Hopefully, some readers will focus on the question of whether > it is worth it. Others might look at ways to improve the existing > code (without an off-switch) so that the continue-statement > jump-to-jump shows-up in your coverage tool. > > IMO, adding a new command-line switch is a big deal (we should > do it very infrequently, limit it to things with a big payoff, and > think about whether there are any downsides). Personally, I don't > see any big wins here and have a sense that there are downsides > that would make us regret exposing alternate code generation. > > > Raymond > > > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Thu May 22 00:50:47 2014 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 22 May 2014 10:50:47 +1200 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> Message-ID: <537D2DC7.5030504@canterbury.ac.nz> Raymond Hettinger wrote: > * I sympathize with "there is an irritating dimple in coverage.py" > but that hasn't actually impaired its usability beyond creating a > curiosity. Another way to address this would be to make coverage.py smart enough to understand when a source line has been optimised away and always report it as executed. -- Greg From ned at nedbatchelder.com Thu May 22 00:54:13 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Wed, 21 May 2014 18:54:13 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> Message-ID: <537D2E95.8060806@nedbatchelder.com> On 5/21/14 6:47 PM, Paul Moore wrote: > On 21 May 2014 23:30, Trip Volpe wrote: >> On Wed, May 21, 2014 at 2:14 PM, Nathaniel Smith wrote: >>> Isn't the real problem here that the peephole optimizer violates the >>> first rule of optimization ("don't change semantics") by breaking >>> sys.settrace? Couldn't we fix this directly? >> I agree with this. Adding a command line flag to tinker with code generation >> may well be opening a can of worms, but "the peephole optimizer shouldn't >> change semantics" is a more compelling argument, although fixing it from >> that angle is obviously more involved. One problem is that functions like >> settrace() expose low-level details to the higher-level semantics. 
It's a >> fair question as to whether it should be considered kosher to expose >> implementation details like the peephole optimizer through such interfaces. > While I'm happy to be proved wrong with code, my instinct is that > "making sys.settrace work" would likely be too complex to be > practical. I absolutely agree that "fixing settrace" is likely to be 100x more complex than disabling the optimizer. > > In any case, as you say, it exposes low-level details, and I would > personally consider "glitches" like this as implementation details. I also agree that the exact lines reported by settrace are to some extent an implementation detail. All I'm asking for is a way to make the implementation match the expectations. > To > put it another way, I don't consider the exact lines traced by > sys.settrace to be part of the semantics of a program, any more than I > consider the output of dis.dis to be. So in my view it is acceptable > for the optimiser to change the lines that get traced in the way that > coverage experienced. > > Paul. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From ned at nedbatchelder.com Thu May 22 00:59:38 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Wed, 21 May 2014 18:59:38 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> Message-ID: <537D2FDA.6030907@nedbatchelder.com> On 5/21/14 6:17 PM, Paul Moore wrote: > On 21 May 2014 22:24, Antoine Pitrou wrote: >> On Wed, 21 May 2014 20:44:24 +0100 >> Raymond Hettinger >> wrote: >>> I think this opens a can of worms that is better left closed. >> FWIW, I agree with Raymond's arguments here. > I tend to agree as well. It's a pretty specialised case, and > presumably tools similar to coverage for languages like C manage to > deal with the issue. Yes, C and its tools have a way to deal with this. Are you familiar with the -O0 switch? It disables optimization. BTW: As C programmers know, if you want to debug your program, you use the -O0 switch. Debugging is about reasoning about the code rather than executing it. Trying to debug optimized C code is very difficult, because nothing matches your expectations. If, as others in this thread have said, we expect the set of optimizations to grow, the need for an off switch will become greater, even to debug the code. > > Like Raymond, I can't quite explain my reservations, but it feels like > this proposal leans towards overspecifying implementation details, in > a way that will limit future development of the optimiser. If by implementation details, you mean the word "peephole", then let's remove it, and simply have a switch that disables all optimization. Rather than limiting the future of the optimizer, it will provide an escape hatch for people who would rather not have the optimizer's effects. --Ned. 
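
As a concrete illustration of the behavior under discussion, here is a minimal sketch that uses only the standard dis module. It is not part of the proposal itself, and exact opcodes and offsets vary across CPython versions:

import dis

def f():
    return 1 + 2   # the peephole optimizer folds this to the constant 3

# On current CPython this prints a single LOAD_CONST with the value 3:
# the addition never happens at run time. Under a hypothetical
# "no optimization" switch like the one proposed in this thread, one
# would instead expect LOAD_CONST 1, LOAD_CONST 2, BINARY_ADD.
dis.dis(f)

This is the same kind of divergence between source and generated code, on a smaller scale, as the unreachable continue statement shown earlier in the thread.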
>
> Paul
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From ned at nedbatchelder.com Thu May 22 01:04:58 2014
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Wed, 21 May 2014 19:04:58 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537D1CA0.7070106@stoneleaf.us>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us>
Message-ID: <537D311A.2010101@nedbatchelder.com>

On 5/21/14 5:37 PM, Ethan Furman wrote:
> On 05/21/2014 02:24 PM, Antoine Pitrou wrote:
>> On Wed, 21 May 2014 20:44:24 +0100 Raymond Hettinger wrote:
>>>
>>> I think this opens a can of worms that is better left closed.
>>
>> FWIW, I agree with Raymond's arguments here.
>
> Wow, did a new star fire up somewhere? I also agree. :)

I'm not sure what can of worms you are imagining. Let's look to our experience with C compilers. They have a switch to disable optimization. What trouble has that brought?

When I think of problems with optimizers in C compilers, I think of incorrect or buggy optimizations. I can't think of something that has gone wrong because there was a switch to turn it off.

People in this thread have contrasted this proposal with an apparent desire to expand the set of optimizations performed. It seems to me that the complexity and danger lie in expanded optimizations, not disabled ones.

--Ned.
>
> --
> ~Ethan~
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From ned at nedbatchelder.com Thu May 22 01:09:33 2014
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Wed, 21 May 2014 19:09:33 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537D2DC7.5030504@canterbury.ac.nz>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <537D2DC7.5030504@canterbury.ac.nz>
Message-ID: <537D322D.7030909@nedbatchelder.com>

On 5/21/14 6:50 PM, Greg Ewing wrote:
> Raymond Hettinger wrote:
>> * I sympathize with "there is an irritating dimple in coverage.py"
>> but that hasn't actually impaired its usability beyond creating a
>> curiosity.
>
> Another way to address this would be to make coverage.py
> smart enough to understand when a source line has been
> optimised away and always report it as executed.
>
Do you have any ideas about how that could possibly work? Reverse-engineering optimized code is difficult if not impossible. I'm open to concrete ideas though.

--Ned.

From donald at stufft.io Thu May 22 01:18:31 2014
From: donald at stufft.io (Donald Stufft)
Date: Wed, 21 May 2014 19:18:31 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537D2DFD.6000504@nedbatchelder.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <537D2DFD.6000504@nedbatchelder.com>
Message-ID: 

On May 21, 2014, at 6:51 PM, Ned Batchelder wrote:
>> * I sympathize with "there is an irritating dimple in coverage.py"
>> but that hasn't actually impaired its usability beyond creating a
>> curiosity. Using that as a reason to add a new CPython-only
>> command-line switch seems like having the tail wag the dog.
> I don't think you should dismiss real users' concerns as a curiosity.
We already have -X as a way to provide implementation-specific switches, I'm not sure why the CPython-only nature of this is an issue?

I think it has impacted its usability. I've certainly burned some amount of time trying to figure out why an optimized line was showing up as uncovered.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: Message signed with OpenPGP using GPGMail
URL: 

From victor.stinner at gmail.com Thu May 22 01:22:49 2014
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 22 May 2014 01:22:49 +0200
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: <20140521014203.GE10355@ando>
References: <20140521014203.GE10355@ando>
Message-ID: 

2014-05-21 3:42 GMT+02:00 Steven D'Aprano :
> - Obviously the "best" (most obvious) solution would be if there was a
> way to unlock modules on the fly, but Victor suggests that's hard.

The problem is to react to such an event. If a function has a specialized version for a set of read-only objects, the specialized version should not be used anymore.

Ok, I create a new branch "readonly_cb" branch where it is possible to again make modules, types and functions modifiable. *But* when the readonly state is modified, a callback is called. It can be used to disable optimizations relying on it.

So all issues listed in this thread are gone. It's possible again to use monkey-patching, lazy initialization of module variables and class variables, etc.

I hope that such a callback is enough to make optimizations efficient.

Victor

From njs at pobox.com Thu May 22 01:09:51 2014
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 22 May 2014 00:09:51 +0100
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537D2DC7.5030504@canterbury.ac.nz>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <537D2DC7.5030504@canterbury.ac.nz>
Message-ID: 

On Wed, May 21, 2014 at 11:50 PM, Greg Ewing wrote:
> Raymond Hettinger wrote:
>>
>> * I sympathize with "there is an irritating dimple in coverage.py"
>> but that hasn't actually impaired its usability beyond creating a
>> curiosity.
>
> Another way to address this would be to make coverage.py
> smart enough to understand when a source line has been
> optimised away and always report it as executed.

AFAICT the only ways to make coverage.py "smart enough" would be:

1) Teach coverage.py to perform a full (sound) reachability analysis on bytecode.

2) Teach coverage.py to notice when a jump instruction doesn't go where you might expect it to based on a naive reading of the source code, and then reverse-engineer from this what sequence of jump instructions must have been merged to produce the one we observe. I guess in practice this probably would require carrying around a patched copy of the full compiler code from every Python release.

The problem here is that the Python compiler is throwing away information that only it has. Asking coverage.py to reconstruct that without help from the compiler isn't reasonable IMO.

-n

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
From steve at pearwood.info Thu May 22 01:50:31 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 22 May 2014 09:50:31 +1000
Subject: [Python-ideas] Maybe/Option builtin
In-Reply-To: 
References: 
Message-ID: <20140521235031.GH10355@ando>

On Wed, May 21, 2014 at 10:49:36AM -0400, mulhern wrote:
> I feel that a Maybe/Option type, analogous to the types found in Haskell or
> OCaml would actually be useful in Python. The value analogous to the None
> constructor should be Python's None.
>
> Obviously, it wouldn't give the type-checking benefits that it gives in
> statically checked languages, but every use of a Maybe object as if it were
> the contained object would give an error, alerting the user to the fact
> that None is a possibility and allowing them to address the problem sooner
> rather than later.

Since this is a Python list, you shouldn't take it for granted that we will be familiar with Haskell or Ocaml, nor expect us to go off and study those languages well enough to understand what a Maybe object is used for or how it will fit into Python's execution model.

I'm no expert on either of those two languages, but it seems to me that the only point of Maybe is to allow the static type checker to distinguish between (for example) "this function returns a string" and "this function returns either a string or None". Since Python doesn't do static, compile-time type checks, I'm having difficulty in seeing what would be the point of Maybe in Python.

As I said, I'm not an expert in Haskell, but I don't think this proposal is helpful, and certainly not helpful enough to make it a built-in or standard part of the language. If I have missed some benefit of Maybe, please explain how it would apply in Python terms.

> I feel that it would be kind of tricky to implement it as a class.
> Something like:
>
> class Maybe(object):
>     def __init__(self, value=None):
>         self.value = value
>     def value(self):
>         return self.value
>
> is a start but I'm not able to see how to make
>
> if Maybe():
>     print("nothing") # never prints

In Python 2, define __nonzero__. In Python 3, define __bool__.

https://docs.python.org/2/reference/datamodel.html#object.__nonzero__
https://docs.python.org/3/reference/datamodel.html#object.__bool__

> but
>
> if Maybe({}):
>     print("yes a value") #always prints
>
> which is definitely the desired behaviour.

Not to me it isn't. This goes against the standard Python convention that empty containers are falsey. Since Maybe({}) wraps an empty dict, it should be considered a false value.

--
Steven

From haoyi.sg at gmail.com Thu May 22 01:53:01 2014
From: haoyi.sg at gmail.com (Haoyi Li)
Date: Wed, 21 May 2014 16:53:01 -0700
Subject: [Python-ideas] Maybe/Option builtin
In-Reply-To: <20140521235031.GH10355@ando>
References: <20140521235031.GH10355@ando>
Message-ID: 

I've been using [x] and [] for my Option type. It works great, can even be chained monadically using for comprehensions =)

On Wed, May 21, 2014 at 4:50 PM, Steven D'Aprano wrote:

> On Wed, May 21, 2014 at 10:49:36AM -0400, mulhern wrote:
> > I feel that a Maybe/Option type, analogous to the types found in Haskell
> or
> > OCaml would actually be useful in Python. The value analogous to the None
> > constructor should be Python's None.
> > > > Obviously, it wouldn't give the type-checking benefits that it gives in > > statically checked languages, but every use of a Maybe object as if it > were > > the contained object would give an error, alerting the user to the fact > > that None is a possibility and allowing them to address the problem > sooner > > rather than later. > > Since this is a Python list, you shouldn't take it for granted that we > will be familiar with Haskell or Ocaml, nor expect us to go off and > study those languages well enough to understand what a Maybe object is > used for or how it will fit into Python's execution model. > > I'm no expect on either of those two languages, but it seems to me that > that the only point of Maybe is to allow the static type checker to > distinguish between (for example) "this function returns a string" and > "this function returns either a string or None". Since Python doesn't do > static, compile-time type checks, I'm having difficulty in seeing what > would be the point of Maybe in Python. > > As I said, I'm not an expert in Haskell, but I don't think this proposal > is helpful, and certainly not helpful enough to make it a built-in or > standard part of the language. If I have missed some benefit of Maybe, > please explain how it would apply in Python terms. > > > > > I feel that it would be kind of tricky to implement it as a class. > > Something like: > > > > class Maybe(object): > > def __init__(self, value=None): > > self.value = value > > def value(self): > > return self.value > > > > is a start but I'm not able to see how to make > > > > if Maybe(): > > print("nothing") # never prints > > In Python 2, define __nonzero__. In Python 3, define __bool__. > > https://docs.python.org/2/reference/datamodel.html#object.__nonzero__ > https://docs.python.org/3/reference/datamodel.html#object.__bool__ > > > > but > > > > if Maybe({}): > > print("yes a value") #always prints > > > > which is definitely the desired behaviour. > > Not to me it isn't. This goes against the standard Python convention > that empty containers are falsey. Since Maybe({}) wraps an empty dict, > it should be considered a false value. > > > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu May 22 02:01:40 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 22 May 2014 10:01:40 +1000 Subject: [Python-ideas] Make Python code read-only In-Reply-To: <1d4b9976723c4013addf2c8ec3c77f0f@BLUPR03MB389.namprd03.prod.outlook.com> References: <20140521014203.GE10355@ando> <537CB409.5000104@stoneleaf.us> <1d4b9976723c4013addf2c8ec3c77f0f@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: On 22 May 2014 00:56, "Steve Dower" wrote: > > Another +inf from me. > > Mind if I quote you on this next time I'm trying to convince C# developers to take Python seriously? :) Sure - I expect your conversations with C# devs resemble some of mine with Java devs :) Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Thu May 22 02:07:30 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 22 May 2014 10:07:30 +1000 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: <537CB6ED.2050906@nedbatchelder.com> References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> Message-ID: On 22 May 2014 00:24, "Ned Batchelder" wrote: > > On 5/21/14 8:21 AM, Jonas Wielicki wrote: >> >> On 21.05.2014 14:13, Steven D'Aprano wrote: >>> >>> > On Wed, May 21, 2014 at 07:05:49AM -0400, Ned Batchelder wrote: >>>> >>>> >> ** The problem >>>> >> >>>> >> A long-standing problem with CPython is that the peephole optimizer >>>> >> cannot be completely disabled. Normally, peephole optimization is a >>>> >> good thing, it improves execution speed. But in some situations, like >>>> >> coverage testing, it's more important to be able to reason about the >>>> >> code's execution. I propose that we add a way to completely disable the >>>> >> optimizer. >>> >>> > >>> > I'm not sure whether this is an argument for or against your proposal, >>> > but the continue statement shown below is *not* dead code and should not >>> > be optimized out. The assert fails if you remove the continue statement. >>> > >>> > I don't have 3.4 on this machine to test with, but using 3.3, I can see >>> > no evidence that `continue` is optimized away. >> >> The logical continue is still there -- what happens is that the >> optimizer rewrites the `else` jump at the preceding `if` condition, >> which would normally point at the `continue` statement, to the beginning >> of the loop, because it would be a jump (to the continue) to a jump (to >> the for loop header). >> >> Thus, the actual continue statement is not reached, but logically the >> code does the same, because the only way continue would have been >> reached was transformed to a continue itself. 
>> > To make the details more explicit, here is the source again, and the disassembled code, with the original source interspersed: > >> a = b = c = 0 >> for n in range(100): >> if n % 2: >> if n % 4: >> a += 1 >> continue >> else: >> b += 1 >> c += 1 >> assert a == 50 and b == 50 and c == 50 > > Disassembled (Python 3.4, but the same effect is visible in 2.7, 3.3, etc): > > > a = b = c = 0 > 1 0 LOAD_CONST 0 (0) > 3 DUP_TOP > 4 STORE_NAME 0 (a) > 7 DUP_TOP > 8 STORE_NAME 1 (b) > 11 STORE_NAME 2 (c) > > for n in range(100): > 2 14 SETUP_LOOP 79 (to 96) > 17 LOAD_NAME 3 (range) > 20 LOAD_CONST 1 (100) > 23 CALL_FUNCTION 1 (1 positional, 0 keyword pair) > 26 GET_ITER > >> 27 FOR_ITER 65 (to 95) > 30 STORE_NAME 4 (n) > > if n % 2: > 3 33 LOAD_NAME 4 (n) > 36 LOAD_CONST 2 (2) > 39 BINARY_MODULO > 40 POP_JUMP_IF_FALSE 72 > > if n % 4: > 4 43 LOAD_NAME 4 (n) > 46 LOAD_CONST 3 (4) > 49 BINARY_MODULO > 50 POP_JUMP_IF_FALSE 27 > > a += 1 > 5 53 LOAD_NAME 0 (a) > 56 LOAD_CONST 4 (1) > 59 INPLACE_ADD > 60 STORE_NAME 0 (a) > 63 JUMP_ABSOLUTE 27 > > continue > 6 66 JUMP_ABSOLUTE 27 > 69 JUMP_FORWARD 10 (to 82) > > b += 1 > 8 >> 72 LOAD_NAME 1 (b) > 75 LOAD_CONST 4 (1) > 78 INPLACE_ADD > 79 STORE_NAME 1 (b) > > c += 1 > 9 >> 82 LOAD_NAME 2 (c) > 85 LOAD_CONST 4 (1) > 88 INPLACE_ADD > 89 STORE_NAME 2 (c) > 92 JUMP_ABSOLUTE 27 > >> 95 POP_BLOCK > > > assert a == 50 and b == 50 and c == 50 > 10 >> 96 LOAD_NAME 0 (a) > 99 LOAD_CONST 5 (50) > 102 COMPARE_OP 2 (==) > 105 POP_JUMP_IF_FALSE 132 > 108 LOAD_NAME 1 (b) > 111 LOAD_CONST 5 (50) > 114 COMPARE_OP 2 (==) > 117 POP_JUMP_IF_FALSE 132 > 120 LOAD_NAME 2 (c) > 123 LOAD_CONST 5 (50) > 126 COMPARE_OP 2 (==) > 129 POP_JUMP_IF_TRUE 138 > >> 132 LOAD_GLOBAL 5 (AssertionError) > 135 RAISE_VARARGS 1 > >> 138 LOAD_CONST 6 (None) > 141 RETURN_VALUE > > Notice that line 6 (the continue) is unreachable, because the else-jump from line 4 has been turned into a jump to bytecode offset 27 (the for loop), and the end of line 5 has also been turned into a jump to 27, rather than letting it flow to line 6. So line 6 still exists in the bytecode, but is never executed, leading tracing tools to indicate that line 6 is never executed. So isn't this just a bug in the dead code elimination? Fixing that (so there's no bytecode behind that line and coverage tools can know it has been optimised out) sounds better than adding an obscure config option. Potentially less risky would be to provide a utility in the dis module to flag such lines after the fact. Cheers, Nick. > > --Ned. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dw+python-ideas at hmmz.org Thu May 22 02:10:39 2014 From: dw+python-ideas at hmmz.org (dw+python-ideas at hmmz.org) Date: Thu, 22 May 2014 00:10:39 +0000 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <537D311A.2010101@nedbatchelder.com> References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us> <537D311A.2010101@nedbatchelder.com> Message-ID: <20140522001038.GB15946@k2> On Wed, May 21, 2014 at 07:04:58PM -0400, Ned Batchelder wrote: > I can't think of something that has gone wrong because there was a > switch to turn it off. > I'm not sure what can of worms you are imagining. 
Let's look to our > experience with C compilers. They have a switch to disable optimization. > What trouble has that brought? Are you serious? Somehow I'm reminded of the funroll-loops.info Gentoo parody site. As others mention, there is a difficult to quantify, but very real non-zero cost in introducing new major execution modes. > When I think of problems with optimizers in C compilers, I think of > incorrect or buggy optimizations. Sure, it if were still the early 90s. Most optimization bugs come from inexperienced developers relying on undefined behaviour of one form or another, and Python doesn't suffer from UB quite the way C does. > People in this thread have contrasted this proposal with an apparent desire > to expand the set of optimizations performed. It seems to me that the > complexity and danger lie in expanded optimizations, not disabled ones. Agreed, and so I'd suggest a better fix would be removing the peephole optimizer, for the little benefit that it offers, if it could be shown that it really truly does hinder peoples' comprehension of Python. It seems the proposed feature is all about avoiding saying "oh, don't worry about that for the moment" while teaching, assuming the question comes up at all. Adding another special case to disable a minor performance improvement seems pointless when the implementation is slow regardless, kind of along the same lines as adding another -O or -OO flag, and we all know how useful they ended up being. If there really was a problem here, it seems preferable to just remove the optimizer entirely and find more general ways to fix performance without creating a mess. David From jeanpierreda at gmail.com Thu May 22 02:11:19 2014 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Wed, 21 May 2014 17:11:19 -0700 Subject: [Python-ideas] Maybe/Option builtin In-Reply-To: <20140521235031.GH10355@ando> References: <20140521235031.GH10355@ando> Message-ID: On Wed, May 21, 2014 at 4:50 PM, Steven D'Aprano wrote: > On Wed, May 21, 2014 at 10:49:36AM -0400, mulhern wrote: >> I feel that a Maybe/Option type, analogous to the types found in Haskell or >> OCaml would actually be useful in Python. The value analogous to the None >> constructor should be Python's None. >> >> Obviously, it wouldn't give the type-checking benefits that it gives in >> statically checked languages, but every use of a Maybe object as if it were >> the contained object would give an error, alerting the user to the fact >> that None is a possibility and allowing them to address the problem sooner >> rather than later. > > Since this is a Python list, you shouldn't take it for granted that we > will be familiar with Haskell or Ocaml, nor expect us to go off and > study those languages well enough to understand what a Maybe object is > used for or how it will fit into Python's execution model. > > I'm no expect on either of those two languages, but it seems to me that > that the only point of Maybe is to allow the static type checker to > distinguish between (for example) "this function returns a string" and > "this function returns either a string or None". Since Python doesn't do > static, compile-time type checks, I'm having difficulty in seeing what > would be the point of Maybe in Python. Python has no way to represent "I have a value, and that value is None". e.g. what is the difference between x.get('a') and x.get('b') for x == {'a': None} ? Otherwise, agree 100% . And anyway, retrofitting this into Python can't really work. 
--
Devin

> As I said, I'm not an expert in Haskell, but I don't think this proposal
> is helpful, and certainly not helpful enough to make it a built-in or
> standard part of the language. If I have missed some benefit of Maybe,
> please explain how it would apply in Python terms.
>
>
> >> I feel that it would be kind of tricky to implement it as a class.
> >> Something like:
> >>
> >> class Maybe(object):
> >>     def __init__(self, value=None):
> >>         self.value = value
> >>     def value(self):
> >>         return self.value
> >>
> >> is a start but I'm not able to see how to make
> >>
> >> if Maybe():
> >>     print("nothing") # never prints
> >
> > In Python 2, define __nonzero__. In Python 3, define __bool__.
> >
> > https://docs.python.org/2/reference/datamodel.html#object.__nonzero__
> > https://docs.python.org/3/reference/datamodel.html#object.__bool__
> >
> >> but
> >>
> >> if Maybe({}):
> >>     print("yes a value") #always prints
> >>
> >> which is definitely the desired behaviour.
> >
> > Not to me it isn't. This goes against the standard Python convention
> > that empty containers are falsey. Since Maybe({}) wraps an empty dict,
> > it should be considered a false value.
> >
> > --
> > Steven
> > _______________________________________________
> > Python-ideas mailing list
> > Python-ideas at python.org
> > https://mail.python.org/mailman/listinfo/python-ideas
> > Code of Conduct: http://python.org/psf/codeofconduct/
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From amber.yust at gmail.com Thu May 22 02:15:15 2014
From: amber.yust at gmail.com (Amber Yust)
Date: Wed, 21 May 2014 17:15:15 -0700
Subject: [Python-ideas] Maybe/Option builtin
In-Reply-To: 
References: <20140521235031.GH10355@ando>
Message-ID: 

If you care about None as a value, you specify a different default.

NO_VALUE = object()
foo = bar.get("baz", NO_VALUE)
if foo is NO_VALUE:
    # ....

On May 21, 2014 5:12 PM, "Devin Jeanpierre" wrote:

> On Wed, May 21, 2014 at 4:50 PM, Steven D'Aprano wrote:
> > On Wed, May 21, 2014 at 10:49:36AM -0400, mulhern wrote:
> >> I feel that a Maybe/Option type, analogous to the types found in
> Haskell or
> >> OCaml would actually be useful in Python. The value analogous to the
> None
> >> constructor should be Python's None.
> >>
> >> Obviously, it wouldn't give the type-checking benefits that it gives in
> >> statically checked languages, but every use of a Maybe object as if it
> were
> >> the contained object would give an error, alerting the user to the fact
> >> that None is a possibility and allowing them to address the problem
> sooner
> >> rather than later.
> >
> > Since this is a Python list, you shouldn't take it for granted that we
> > will be familiar with Haskell or Ocaml, nor expect us to go off and
> > study those languages well enough to understand what a Maybe object is
> > used for or how it will fit into Python's execution model.
> >
> > I'm no expert on either of those two languages, but it seems to me that
> > the only point of Maybe is to allow the static type checker to
> > distinguish between (for example) "this function returns a string" and
> > "this function returns either a string or None". Since Python doesn't do
> > static, compile-time type checks, I'm having difficulty in seeing what
> > would be the point of Maybe in Python.
>
> Python has no way to represent "I have a value, and that value is None".
>
> e.g. what is the difference between x.get('a') and x.get('b') for x ==
> {'a': None} ?
>
> Otherwise, agree 100% . And anyway, retrofitting this into Python
> can't really work.
>
> --
> Devin
>
> > As I said, I'm not an expert in Haskell, but I don't think this proposal
> > is helpful, and certainly not helpful enough to make it a built-in or
> > standard part of the language. If I have missed some benefit of Maybe,
> > please explain how it would apply in Python terms.
> >
> >
> > >> I feel that it would be kind of tricky to implement it as a class.
> > >> Something like:
> > >>
> > >> class Maybe(object):
> > >>     def __init__(self, value=None):
> > >>         self.value = value
> > >>     def value(self):
> > >>         return self.value
> > >>
> > >> is a start but I'm not able to see how to make
> > >>
> > >> if Maybe():
> > >>     print("nothing") # never prints
> > >
> > > In Python 2, define __nonzero__. In Python 3, define __bool__.
> > >
> > > https://docs.python.org/2/reference/datamodel.html#object.__nonzero__
> > > https://docs.python.org/3/reference/datamodel.html#object.__bool__
> > >
> > >> but
> > >>
> > >> if Maybe({}):
> > >>     print("yes a value") #always prints
> > >>
> > >> which is definitely the desired behaviour.
> > >
> > > Not to me it isn't. This goes against the standard Python convention
> > > that empty containers are falsey. Since Maybe({}) wraps an empty dict,
> > > it should be considered a false value.
> > >
> > > --
> > > Steven
> > > _______________________________________________
> > > Python-ideas mailing list
> > > Python-ideas at python.org
> > > https://mail.python.org/mailman/listinfo/python-ideas
> > > Code of Conduct: http://python.org/psf/codeofconduct/
> > _______________________________________________
> > Python-ideas mailing list
> > Python-ideas at python.org
> > https://mail.python.org/mailman/listinfo/python-ideas
> > Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ericsnowcurrently at gmail.com Thu May 22 02:17:18 2014
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 21 May 2014 18:17:18 -0600
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537D2DFD.6000504@nedbatchelder.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <537D2DFD.6000504@nedbatchelder.com>
Message-ID: 

On Wed, May 21, 2014 at 4:51 PM, Ned Batchelder wrote:
> On 5/21/14 3:44 PM, Raymond Hettinger wrote:
>> I think this opens a can of worms that is better left closed.
>>
>> * We will have to start running tests both with and without the switch
>> turned on for example (because you're exposing yet another way to
>> run Python with different code).
>
> Yes, this could mean an increased testing burden. But that scales
> horizontally, and will not require a large amount of engineering work.
> Besides, what better way to test the optimizer?

I buy that to an extent. It would definitely be helpful when adding or changing optimizations, particularly to identify the impact of changes both in semantics and performance. However, work on optimizations isn't too common.

Aside from direct work on optimizations, optimization-free testing could be useful for identifying optimizer-related bugs (which I expect are quite rare). However, that doesn't add a lot of benefit over a normal buildbot run considering that each run has few changes it is testing.

Having said all that, I think it would still be worth testing with and without optimizations. Unless the optimizations are platform-specific, would we need more than one buildbot running with optimizations turned off?
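
A minimal sketch of what such a with-and-without comparison could look like, assuming the PYTHONPEEPHOLE=0 switch proposed earlier in this thread existed (it does not exist in any released CPython, so today the two runs produce identical output):

import os
import subprocess
import sys

SNIPPET = "import dis\ndef f(): return 1 + 2\ndis.dis(f)"

def bytecode_listing(disable_optimizer):
    env = dict(os.environ)
    if disable_optimizer:
        env["PYTHONPEEPHOLE"] = "0"   # hypothetical switch from this thread
    return subprocess.check_output([sys.executable, "-c", SNIPPET], env=env)

# With the proposed switch, the second listing would show the unfolded
# LOAD_CONST 1, LOAD_CONST 2, BINARY_ADD sequence instead of the folded
# LOAD_CONST 3, and a test could assert on that difference.
print(bytecode_listing(False) == bytecode_listing(True))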
>> * Over time, I expect that some of the functionality of the peepholer
>> is going to be moved upstream into AST transformations, where you will
>> have even less ability to switch something on-and-off.
>
> I'm perfectly happy to remove the word "peephole" from the feature. If we
> expect the set of optimizations to grow in the future, then we can expect
> that more cases of code analysis will be misled by optimizations. All the
> more reason to establish a way now that will disable all optimizations.

While the use-case is very specific, I think it's a valid motivator
for a means of disabling all optimizations, particularly if disabling
optimizations is isolated to a very focused location as you've
indicated.

The big question then is the impact on implementing optimizations (in
general) in the future. There has been talk of AST-based
optimizations. Raymond indicates that this makes it harder to
conditionally optimize. So how much harder would it make this future
optimization work? Is that a premature optimization?

>> * I sympathize with "there is an irritating dimple in coverage.py"
>> but that hasn't actually impaired its usability beyond creating a
>> curiosity. Using that as a reason to add a new CPython-only
>> command-line switch seems like having the tail wag the dog.
>
> I don't think you should dismiss real users' concerns as a curiosity. We
> already have -X as a way to provide implementation-specific switches, I'm
> not sure why the CPython-only nature of this is an issue?

If optimizations can break coverage tools when run on other Python
implementations, does that make a case for a more general command-line
option? Or is it just a matter of CPython's optimizations behaving
badly by breaking some perceived invariants that coverage tools rely
on, and other implementations behaving correctly?

If it's the latter, then perhaps Python needs a few tests added to the
test suite that verify that the optimizer doesn't break the invariants.
Such tests would benefit all implementations. However, even if it's the
right approach, if the burden of fixing things is so much more than the
burden of adding a no-optimizations option, it may make more sense to
just add the option and move on. It's all about who has the time to do
something about it. (And of course "Now is better than never. Although
never is often better than *right* now.")

Of course, if the coverage tools rely on CPython implementation details
then an implementation-specific -X option makes even more sense.

FWIW, regardless of the scenario, a -X option makes practical sense in
that it would relatively immediately relieve the (infrequent? but
onerous) pain point encountered in coverage tools. However, keep in
mind that such an option would not be backported and would not be
released until 3.5 (in late 2015). So I suppose it would be more about
relieving future pain and not helping current coverage tool users.

>
>> * Ideally, the peepholer should be thought of as part of the code
>> generation. As compilation improves over time, it should start
>> to generate the same code as we're getting now. It probably
>> isn't wise to expose the implementation detail that the constant
>> folding and jump tweaks are done in a separate second pass.
>
> I'm happy to remove the word "peephole". I think a way to disable
> optimization is useful. I've heard the concern from a number of coverage.py
> users. If, as we all think, optimizations will expand in CPython, then the
> number of mis-diagnosed code problems will grow.
The comparison made elsewhere with -O0 option in other compilers is also appropriate here. -eric From steve at pearwood.info Thu May 22 02:24:34 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 22 May 2014 10:24:34 +1000 Subject: [Python-ideas] Maybe/Option builtin In-Reply-To: References: <20140521235031.GH10355@ando> Message-ID: <20140522002434.GJ10355@ando> On Wed, May 21, 2014 at 05:11:19PM -0700, Devin Jeanpierre wrote: > Python has no way to represent "I have a value, and that value is None". Sure it has. 'a' in x and x['a'] is None > e.g. what is the difference between x.get('a') and x.get('b') for x == > {'a': None} ? dict.get is explicitly designed to blur the distinction between "I have a key and here is its value" and "I may or may not have a key, it doesn't matter which, return its value or this default regardless". (The default value defaults to None.) If you care about the difference, as most people do most of the time, you should avoid the get method and call x['a'] directly. -- Steven From ned at nedbatchelder.com Thu May 22 03:29:40 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Wed, 21 May 2014 21:29:40 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <537D2DFD.6000504@nedbatchelder.com> Message-ID: <537D5304.6030109@nedbatchelder.com> On 5/21/14 8:17 PM, Eric Snow wrote: >>> * Over time, I expect that some of the functionality of the peepholer >>> >>is going to be moved upstream into AST transformations you will >>> >>have even less ability switch something on-and-off. >> > >> >I'm perfectly happy to remove the word "peephole" from the feature. If we >> >expect the set of optimizations to grow in the future, then we can expect >> >that more cases of code analysis will be misled by optimizations. All the >> >more reason to establish a way now that will disable all optimizations. > While the use-case is very specific, I think it's a valid motivator > for a means of disabling all optimizations, particularly if disabling > optimizations is isolated to a very focused location as you've > indicated. > > The big question then is the impact on implementing optimizations (in > general) in the future. There has been talk of AST-based > optimizations. Raymond indicates that this makes it harder to > conditionally optimize. So how much harder would it make this future > optimization work? Is that a premature optimization? > I don't understand the claim that AST transformations will have less ability to switch something on-and-off. The very term "AST transformations" outlines the implementation: step 1, construct an AST; step 2, transform the AST; step 3, generate code from the AST. My proposal is that a switch would let you skip step 2. This is analogous to the current optimizer, which generates bytecode, then as a separate (and skippable!) step, performs peephole optimizations on that bytecode. --Ned. 
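P.S. To make the three-step model concrete, here is a minimal sketch of
what I mean. The FoldAdd transformer is purely illustrative (it is not
CPython's actual optimizer, and today's peephole pass still runs inside
compile() regardless):

    import ast

    class FoldAdd(ast.NodeTransformer):
        """Illustrative step-2 transform: fold integer additions like 1 + 2."""
        def visit_BinOp(self, node):
            self.generic_visit(node)
            if (isinstance(node.op, ast.Add)
                    and isinstance(node.left, ast.Num)
                    and isinstance(node.right, ast.Num)):
                return ast.copy_location(ast.Num(n=node.left.n + node.right.n), node)
            return node

    def compile_source(source, filename="<string>", optimize=True):
        tree = ast.parse(source, filename)      # step 1: construct the AST
        if optimize:                            # step 2: transform the AST (the skippable step)
            tree = FoldAdd().visit(tree)
            ast.fix_missing_locations(tree)
        return compile(tree, filename, "exec")  # step 3: generate code from the AST

The switch I am proposing is exactly that "if optimize:" guard, applied
to whatever transformations CPython grows.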
From ned at nedbatchelder.com Thu May 22 04:00:38 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Wed, 21 May 2014 22:00:38 -0400 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> Message-ID: <537D5A46.4040200@nedbatchelder.com> On 5/21/14 8:07 PM, Nick Coghlan wrote: > > > Notice that line 6 (the continue) is unreachable, because the > else-jump from line 4 has been turned into a jump to bytecode offset > 27 (the for loop), and the end of line 5 has also been turned into a > jump to 27, rather than letting it flow to line 6. So line 6 still > exists in the bytecode, but is never executed, leading tracing tools > to indicate that line 6 is never executed. > > So isn't this just a bug in the dead code elimination? Fixing that (so > there's no bytecode behind that line and coverage tools can know it > has been optimised out) sounds better than adding an obscure config > option. > Perhaps I don't know how much dead code elimination was intended. Assuming we can get to the point that the statement has been completely removed, you'll still have the confusing state that a perfectly good statement is marked as not executable (because it has no corresponding bytecode). And getting to that point means adding more complexity to the bytecode optimizer. > > Potentially less risky would be to provide a utility in the dis module > to flag such lines after the fact. > I don't see how the dis module would know which lines these are? I'm surprised at the amount of invention and mystery code people will propose to avoid having an off-switch for the code we already have. > > Cheers, > Nick. > From ned at nedbatchelder.com Thu May 22 04:03:22 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Wed, 21 May 2014 22:03:22 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <20140522001038.GB15946@k2> References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us> <537D311A.2010101@nedbatchelder.com> <20140522001038.GB15946@k2> Message-ID: <537D5AEA.901@nedbatchelder.com> On 5/21/14 8:10 PM, dw+python-ideas at hmmz.org wrote: > Agreed, and so I'd suggest a better fix would be removing the peephole > optimizer, for the little benefit that it offers, if it could be shown > that it really truly does hinder peoples' comprehension of Python. > > It seems the proposed feature is all about avoiding saying "oh, don't > worry about that for the moment" while teaching, assuming the question > comes up at all. The point is not about teaching Python. It's about getting useful information from code analysis tools. When you run coverage tools or debuggers, you are hoping to learn something about your code. It is bad when those tools give you incorrect or misleading information. Being able to disable the optimizer will prevent certain kinds of incorrect information. --Ned. 
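P.S. For anyone who hasn't looked at how these tools work: here is a
stripped-down sketch of a coverage-style tracer, run against the
continue.py example from earlier in the thread. Lines whose bytecode the
optimizer removed are never reported by the interpreter, so they wrongly
show up as unexecuted:

    import sys

    executed = set()

    def tracer(frame, event, arg):
        # CPython reports a "line" event for each source line it starts
        # executing; a line whose bytecode was optimized away is never
        # reported, so it looks like it never ran.
        if event == "line":
            executed.add((frame.f_code.co_filename, frame.f_lineno))
        return tracer

    sys.settrace(tracer)
    try:
        with open("continue.py") as f:
            exec(compile(f.read(), "continue.py", "exec"), {"__name__": "__main__"})
    finally:
        sys.settrace(None)

    print(sorted(line for name, line in executed if name == "continue.py"))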
From ethan at stoneleaf.us Thu May 22 04:00:49 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 21 May 2014 19:00:49 -0700
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537D2DFD.6000504@nedbatchelder.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<537D2DFD.6000504@nedbatchelder.com>
Message-ID: <537D5A51.3010809@stoneleaf.us>

On 05/21/2014 03:51 PM, Ned Batchelder wrote:
>
> I'm perfectly happy to remove the word "peephole" from the feature. If we expect the set of optimizations to grow in the
> future, then we can expect that more cases of code analysis will be misled by optimizations. All the more reason to
> establish a way now that will disable all optimizations.

I think a big part of the problem is that there are more than just
peephole optimizations. For example, what about all the fast-path
optimizations? Do we want to be able to turn those off? How about the
heapq optimizations that Raymond put in a few months ago?

As Nick suggested, I think it would be better to fix whichever part is
broken and is allowing dead code to stay in the bytecode.

--
~Ethan~

From jeanpierreda at gmail.com Thu May 22 08:25:14 2014
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Wed, 21 May 2014 23:25:14 -0700
Subject: [Python-ideas] Maybe/Option builtin
In-Reply-To: <20140522002434.GJ10355@ando>
References: <20140521235031.GH10355@ando> <20140522002434.GJ10355@ando>
Message-ID: 

On Wed, May 21, 2014 at 5:24 PM, Steven D'Aprano wrote:
> On Wed, May 21, 2014 at 05:11:19PM -0700, Devin Jeanpierre wrote:
>
>> Python has no way to represent "I have a value, and that value is None".
>
> Sure it has. 'a' in x and x['a'] is None

Hm, why do you bring this up? Do you think I didn't know about this?
While we're at it, you could just implement the Option type, and then
the problem is also solved! I am sorry I did not communicate what I
meant to say.

>> e.g. what is the difference between x.get('a') and x.get('b') for x ==
>> {'a': None} ?
>
> dict.get is explicitly designed to blur the distinction between "I have
> a key and here is its value" and "I may or may not have a key, it
> doesn't matter which, return its value or this default regardless". (The
> default value defaults to None.) If you care about the difference, as
> most people do most of the time, you should avoid the get method and
> call x['a'] directly.

I was not trying to criticize dict.get, I was noting the same property
you have agreed with: it blurs the distinction. That distinction does
not have to be blurred, and for some other APIs, must not be blurred. I
appreciate that dict.get is not the former case, so, for the sake of
making the discussion about something important, let's choose such an
API: imagine that we steal Guido's time machine to go into the past and
redesign next().

With next(), it is absolutely vital that we are able to know when the
iterator has stopped vs when the iterator has produced a value.

API ideas:

- Use next's current primary API: raise an exception if the iterator
  has ended. This is a very good API, except that if the caller forgets
  to catch the exception, this can result in generators or iterators up
  the call stack spontaneously and silently stopping, which can be hard
  to debug - and, unfortunately, is almost never actually your fault.

- Use next's current alternate API: give next a default parameter which
  is returned when the iterator stops, and follow Amber's solution.
However, you have to be careful that this sentinel value won't ever be wanted as a return value, and that you don't accidentally match things against the sentinel by mistake: i.e. it has to be defined inline with a unique object and checked by identity. - make next() return None when the generator is exhausted. How do you differentiate None as a value vs None as a no-result sentinel? ideas: - give iterators a .is_exhausted attribute which is True if the None it returned indicates end-of-iterator, False otherwise. This is acceptable, but now, if the caller forgets to check for exhaustion, they start silently giving None as values. This might also be hard to debug. - make all values be wrapped in Some(v), so that next(it) returns either None or Some(v). This can cause problems if you forget to actually unwrap the Some to get at the value. Unlike the above problems, however, such things are very easy to debug - you will get a TypeError when you add Some(3) + 5. And, similarly, if you forget to check for None, this will be an error when you treat it like a Some value (e.g. 'NoneType' object has no attribute 'unwrap'). So this option is pretty resilient against mistakes, even in a dynamically typed language. Stylistically, I feel like exceptions and option types give you a clean separation of cases, where the other two options just munge data together and let you sort it out afterwards - this feels ugly to me, and would seem to push programmers towards not caring about the value. (Probably not a good idea when the cases are important.) Similarly, it's easy to forget to catch an exception, especially since quite often people use next in a situation where they think it will never raise an exception. And in the particular case of StopIteration, unlike most other exceptions, it's frequently silently caught by various functions that might sit between you and where the exception was raised, meaning that you might not easily identify the cause of the problem -- especially in a badly tested codebase. (And what other kind of codebase would miss this error? ;). So for me personally, if Python had all these options and they were all equally easy and natural, I would probably choose option types. This is of course subjective, but I hope it sort of ties together everything and explains the utility. If you value certain things and have a certain sense of elegance, you might want to use option types instead of some of the alternatives, in some circumstances, even in a dynamically typed language. Though, I am not trying to suggest that you personally would ever want to use it, or that it's even particularly Pythonic. I personally feel like probably Some/None would be awkward/unpythonic to retrofit (although I think sum types in general belong in the language, which is why I am talking about this at all - awareness is important!). Does that help explain how Some/None can sometimes be useful or desirable (to some people) even without static typing? 
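In case code says it better than prose, here is a rough sketch of the
wrapped-value idea (the names Some, unwrap and safe_next are just
illustrative, not a concrete proposal):

    class Some(object):
        """Wrap a present value so it cannot be confused with "no value"."""
        def __init__(self, value):
            self._value = value
        def unwrap(self):
            return self._value

    def safe_next(it):
        # Returns Some(value) while the iterator produces values, and plain
        # None once it is exhausted -- even when the value itself is None.
        try:
            return Some(next(it))
        except StopIteration:
            return None

    it = iter([None])
    result = safe_next(it)
    if result is not None:
        print(result.unwrap())    # prints None: a real value, not exhaustion
    assert safe_next(it) is None  # now the iterator really is exhausted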
-- Devin

From tjreedy at udel.edu Thu May 22 08:44:52 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 22 May 2014 02:44:52 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537D2FDA.6030907@nedbatchelder.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com>
Message-ID: 

On 5/21/2014 6:59 PM, Ned Batchelder wrote:

> If by implementation details, you mean the word "peephole", then let's
> remove it, and simply have a switch that disables all optimization.
> Rather than limiting the future of the optimizer, it will provide an
> escape hatch for people who would rather not have the optimizer's effects.

The presumption of this idea is that there is a proper, canonical
unoptimized version of 'compiled Python'. For Python there obviously is
not. For CPython, there is not either. What Raymond has been saying is
that the output of the CPython compiler is the output of the CPython
compiler.

Sys.settrace is not intended to mandate. It reports on the operations of
a particular version of CPython as well as it can with the line number
table it gets. The existence of the table is not mandated by the
language definition, but is provided on a best effort basis.

Another issue on the tracker points out that if an ast is constructed
directly, and then compiled, the notion of 'source line numbers' has no
meaning.

When I used coverage (last summer) with tested Idle modules, I could not
get a reported 100% coverage because coverage counts the body of a final
"if __name__ == '__main__':" statement. So I had to visually check that
those were the only 'uncovered' lines. I do not see doing the same for an
'uncovered' continue as much different. In either case, coverage could
leave such lines out of the denominator.

-- 
Terry Jan Reedy

From ncoghlan at gmail.com Thu May 22 10:25:26 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 22 May 2014 18:25:26 +1000
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: <537D5A46.4040200@nedbatchelder.com>
References: <537C888D.7060903@nedbatchelder.com>
	<20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net>
	<537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com>
Message-ID: 

On 22 May 2014 12:00, "Ned Batchelder" wrote:
>
> I'm surprised at the amount of invention and mystery code people will
propose to avoid having an off-switch for the code we already have.

It's not the off switch per se, it's the documentation and testing
consequences. Better to figure out a way to let the code generator and
analysis tools collaborate more effectively than to complicate the
execution model further.

Cheers,
Nick.

>>
>>
>> Cheers,
>> Nick.
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From solipsis at pitrou.net Thu May 22 10:43:34 2014
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 22 May 2014 10:43:34 +0200
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com>
Message-ID: <20140522104334.36b2b07f@fsol>

On Thu, 22 May 2014 02:44:52 -0400
Terry Reedy wrote:
>
> When I used coverage (last summer) with tested Idle modules, I could not
> get a reported 100% coverage because coverage counts the body of a final
> "if __name__ == '__main__':" statement.

There are flags to modify this behaviour.

Regards

Antoine.
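P.S. For Terry's "if __name__" case specifically, no command-line flag
is even needed: coverage.py's default exclusion pragma already keeps
such lines out of the denominator. A sketch (main() stands in for
whatever the module actually runs):

    def main():
        print("only runs as a script")

    if __name__ == '__main__':   # pragma: no cover
        main()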
From solipsis at pitrou.net Thu May 22 10:52:20 2014
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 22 May 2014 10:52:20 +0200
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us>
	<537D311A.2010101@nedbatchelder.com>
Message-ID: <20140522105220.70b360fe@fsol>

On Wed, 21 May 2014 19:04:58 -0400
Ned Batchelder wrote:
>
> I'm not sure what can of worms you are imagining. Let's look to our
> experience with C compilers. They have a switch to disable
> optimization. What trouble has that brought? When I think of problems
> with optimizers in C compilers, I think of incorrect or buggy
> optimizations. I can't think of something that has gone wrong because
> there was a switch to turn it off.

Python's usage model does not contain the notion of compiler
optimizations. Hardly anybody uses the misnamed -O flags. There is a
single compilation mode, which everyone is content with. It is part of
the simplicity of the language (or, at least, of CPython); by adding
some flags that can affect the level of "optimization" you make the
model more complicated for users to understand, and for us to support.

(having used coverage several times, I haven't found those missed lines
really annoying, by the way; not to the point that I would have wanted
a specific command-line flag to disable optimizations)

The use case for disabling optimizations in C is to make programs
actually debuggable. Python doesn't have that problem.

Regards

Antoine.

From p.f.moore at gmail.com Thu May 22 11:02:06 2014
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 22 May 2014 10:02:06 +0100
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <20140522105220.70b360fe@fsol>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us>
	<537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol>
Message-ID: 

On 22 May 2014 09:52, Antoine Pitrou wrote:
> adding some flags that can affect the level of "optimization" you make
> the model more complicated for users to understand, and for us to
> support.

As a concrete example, note my earlier comment about pyc files.
Switching off optimisation results in unoptimised bytecode being
written to pyc files, which could then be read in a subsequent
(supposedly) optimised run. And vice versa.

This may not be a huge problem for the coverage use case, but it does
add an extra level of complexity into the model of caching bytecode.
Handwaving it away as "not a big deal - just delete the bytecode files
before and after the coverage run" doesn't alter the fact that the
bytecode caching model isn't handling the new mode properly.

Paul

From theller at ctypes.org Thu May 22 14:50:08 2014
From: theller at ctypes.org (Thomas Heller)
Date: Thu, 22 May 2014 14:50:08 +0200
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <20140522105220.70b360fe@fsol>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us>
	<537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol>
Message-ID: 

On 22.05.2014 10:52, Antoine Pitrou wrote:

> The use case for disabling optimizations in C is to make programs
> actually debuggable. Python doesn't have that problem.

Well, setting a breakpoint on the 'continue' line in Ned's test code
and running it with pdb does NOT trigger the breakpoint.
So 'Python doesn't have this problem' is not really true. Thomas From ned at nedbatchelder.com Thu May 22 15:24:37 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Thu, 22 May 2014 09:24:37 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com> Message-ID: <537DFA95.4040000@nedbatchelder.com> On 5/22/14 2:44 AM, Terry Reedy wrote: > On 5/21/2014 6:59 PM, Ned Batchelder wrote: > >> If by implementation details, you mean the word "peephole", then let's >> remove it, and simply have a switch that disables all optimization. >> Rather than limiting the future of the optimizer, it will provide an >> escape hatch for people who would rather not have the optimizer's >> effects. > > The presumption of this idea is that there is a proper, canonical > unoptimized version of 'compiled Python'. For Python there obviously > is not. For CPython, there is not either. What Raymond has been saying > is that the output of the CPython compiler is the output of the > CPython compiler. > I'd like to understand why we think the Python compiler is different in this regard than a C compiler. We all use C compilers that have a -O0 switch. It's there to disable optimizations so that programs can be debugged. The C compiler also has no "canonical unoptimized compiled output". But the switch is there to make it possible to debug (reason about) the compiled code. I don't care if we have a command line switch or some other mechanism to disable optimizations. I just think it's useful to be able to do it somehow. When this came up 18 months ago on Python-Dev, it was part of a thread about adding more optimizations to CPython. Guido said "+1" to the idea of being able to disable the optimizers (https://mail.python.org/pipermail/python-dev/2012-December/123099.html). Our need is not as great as C's, the unrecognizability of the compiled code is much less, but current optimizations are already interfering with the ability to debug and analyze code, and new optimizations will only broaden the possibility of interference. --Ned. From ned at nedbatchelder.com Thu May 22 15:29:46 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Thu, 22 May 2014 09:29:46 -0400 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> Message-ID: <537DFBCA.2070006@nedbatchelder.com> On 5/22/14 4:25 AM, Nick Coghlan wrote: > > > On 22 May 2014 12:00, "Ned Batchelder" > wrote: > > > > I'm surprised at the amount of invention and mystery code people > will propose to avoid having an off-switch for the code we already have. > > It's not the off switch per se, it's the documentation and testing > consequences. Better to figure out a way to let the code generator and > analysis tools collaborate more effectively than to complicate the > execution model further. > The problem with "letting them collaborate more effectively" is that we don't know how to do that. If we can come up with a way to do it, it will involve much more complex code than I am proposing. As far as documentation, we have three possibilities for optimization level now. This will add a fourth. I don't see that as a burden. 
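(For context, those existing levels are already exposed on the builtin
compile() via its optimize parameter. A quick sketch; note that even
optimize=0 still gets the peephole pass today, which is exactly the gap
this proposal would fill:)

    # optimize=-1: follow the interpreter's -O setting
    # optimize=0:  no -O optimizations; __debug__ is True, asserts are kept
    # optimize=1:  like -O (asserts removed, __debug__ is False)
    # optimize=2:  like -OO (additionally strips docstrings)
    code = compile("assert False, 'kept at level 0'", "<example>", "exec", optimize=0)
    try:
        exec(code)
    except AssertionError as err:
        print("assert survived:", err)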
On the testing front, if I were the developer of an optimizer, I would welcome a switch to disable it, as a way to test that optimizations don't change semantics. I understand that this is a different mode of execution. I guess we have different opinions about the tradeoff of risk and benefit of that new mode. > Cheers, > Nick. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Thu May 22 15:05:51 2014 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 22 May 2014 23:05:51 +1000 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us> <537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol> Message-ID: On Thu, May 22, 2014 at 10:50 PM, Thomas Heller wrote: > Am 22.05.2014 10:52, schrieb Antoine Pitrou: > >> The use case for disabling optimizations in C is to make programs >> actually debuggable. Python doesn't have that problem. > > > Well, setting a breakpoint to the 'continue' line in Ned's test code > and running it with pdb does NOT trigger the breakpoint. > So 'Python doesn't have this problem' is not really true. Correct me if I'm wrong, but as I understand it, the problem is that the peephole optimizer eliminated an entire line of code. Would it be possible to have it notice when it merges two pieces from different lines, and somehow mark that the resulting bytecode comes from both lines? That would solve the breakpoint and coverage problems simultaneously. ChrisA From skip at pobox.com Thu May 22 15:49:49 2014 From: skip at pobox.com (Skip Montanaro) Date: Thu, 22 May 2014 08:49:49 -0500 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us> <537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol> Message-ID: On Thu, May 22, 2014 at 8:05 AM, Chris Angelico wrote: > Correct me if I'm wrong, but as I understand it, the problem is that > the peephole optimizer eliminated an entire line of code. Would it be > possible to have it notice when it merges two pieces from different > lines, and somehow mark that the resulting bytecode comes from both > lines? That would solve the breakpoint and coverage problems > simultaneously. It seems to me that Ned has revealed a bug in the peephole optimizer. It zapped an entire source line's worth of bytecode, but failed to delete the relevant entry in the line number table of the resulting code object. If I had my druthers, that would be the change I'd prefer. That said, I think Ned's proposal is fairly simple. As for the increased testing load, I think the extra cost would be the duplication of the buildbots (or the adjustment of their setup to test with -O and -O0 flags). Is it still the case that -O effectively does nothing (maybe only eliding __debug__ checks)? Skip From ethan at stoneleaf.us Thu May 22 16:02:36 2014 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 22 May 2014 07:02:36 -0700 Subject: [Python-ideas] Maybe/Option builtin In-Reply-To: References: <20140521235031.GH10355@ando> <20140522002434.GJ10355@ando> Message-ID: <537E037C.1020202@stoneleaf.us> On 05/21/2014 11:25 PM, Devin Jeanpierre wrote: > > Does that help explain how Some/None can sometimes be useful or > desirable (to some people) even without static typing? Nice essay, thanks! 
-- ~Ethan~ From p.f.moore at gmail.com Thu May 22 16:29:13 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 22 May 2014 15:29:13 +0100 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <537DFA95.4040000@nedbatchelder.com> References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com> <537DFA95.4040000@nedbatchelder.com> Message-ID: On 22 May 2014 14:24, Ned Batchelder wrote: > When this came up 18 months ago on Python-Dev, it was part of a thread about > adding more optimizations to CPython. Guido said "+1" to the idea of being > able to disable the optimizers > (https://mail.python.org/pipermail/python-dev/2012-December/123099.html). > Our need is not as great as C's, the unrecognizability of the compiled code > is much less, but current optimizations are already interfering with the > ability to debug and analyze code, and new optimizations will only broaden > the possibility of interference. So I'm a bit confused. This was debated on python-dev and (presumably) agreement reached, so why does it need a whole new thread here? Paul From wolfgang.maier at biologie.uni-freiburg.de Thu May 22 16:44:52 2014 From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier) Date: Thu, 22 May 2014 16:44:52 +0200 Subject: [Python-ideas] Make Python code read-only In-Reply-To: References: <20140521014203.GE10355@ando> Message-ID: On 21.05.2014 11:43, Nick Coghlan wrote: > > It also misses the big reason I am a Python programmer rather than a > Java programmer. > > For me, Python is primarily an orchestration language. It is the > language for the code that is telling everything else what to do. If my > Python code is an overall performance bottleneck, then "Huzzah!", as it > means I have finally engineered all the other structural bottlenecks out > of the system. > > For this use case, monkey patching is not an incidental feature to be > tolerated merely for backwards compatibility reasons: it is a key > capability that makes Python an ideal language for me, as it takes > ultimate control of what dependencies do away from the original author > and places it in my hands as the system integrator. This is a dangerous > power, not to be used lightly, but it also grants me the ability to work > around critical bugs in dependencies at run time, rather than having to > fork and patch the source the way Java developers tend to do. > Very intriguing perspective! I am not experienced enough with Java to judge your comparison, but if you ever find the time to elaborate on this in a longer post or article somewhere I'd be eager to read it. Thanks, Wolfgang From ned at nedbatchelder.com Thu May 22 17:29:07 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Thu, 22 May 2014 11:29:07 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com> <537DFA95.4040000@nedbatchelder.com> Message-ID: <537E17C3.8000307@nedbatchelder.com> On 5/22/14 10:29 AM, Paul Moore wrote: > On 22 May 2014 14:24, Ned Batchelder wrote: >> When this came up 18 months ago on Python-Dev, it was part of a thread about >> adding more optimizations to CPython. Guido said "+1" to the idea of being >> able to disable the optimizers >> (https://mail.python.org/pipermail/python-dev/2012-December/123099.html). 
>> Our need is not as great as C's, the unrecognizability of the compiled code >> is much less, but current optimizations are already interfering with the >> ability to debug and analyze code, and new optimizations will only broaden >> the possibility of interference. > So I'm a bit confused. This was debated on python-dev and (presumably) > agreement reached, so why does it need a whole new thread here? > > Paul I would not say the idea was debated. You can read the (short) thread here: https://mail.python.org/pipermail/python-dev/2012-December/123022.html . Mark Shannon proposed emitting different bytecode for while loops and some other constructs. Guido said no PEP was needed. Nick Coghlan said "main challenge is to keep stepping through the code with pdb sane" (I agree with that!). I said it would be good to have a way to disable optimizations, Guido said "+1". I put this idea here because the discussion on issue2506 got involved enough that someone suggested this was the right place for it. I linked to Guido's sentiment in my initial post here, and had hoped that he would chime in. --Ned. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu May 22 17:32:10 2014 From: guido at python.org (Guido van Rossum) Date: Thu, 22 May 2014 08:32:10 -0700 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: <537DFBCA.2070006@nedbatchelder.com> References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> Message-ID: FWIW, I am strictly with Ned here. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ned at nedbatchelder.com Thu May 22 17:32:33 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Thu, 22 May 2014 11:32:33 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us> <537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol> Message-ID: <537E1891.5050808@nedbatchelder.com> On 5/22/14 9:49 AM, Skip Montanaro wrote: > On Thu, May 22, 2014 at 8:05 AM, Chris Angelico wrote: >> Correct me if I'm wrong, but as I understand it, the problem is that >> the peephole optimizer eliminated an entire line of code. Would it be >> possible to have it notice when it merges two pieces from different >> lines, and somehow mark that the resulting bytecode comes from both >> lines? That would solve the breakpoint and coverage problems >> simultaneously. > It seems to me that Ned has revealed a bug in the peephole optimizer. > It zapped an entire source line's worth of bytecode, but failed to > delete the relevant entry in the line number table of the resulting > code object. If I had my druthers, that would be the change I'd > prefer. I think it is the nature of optimization that it will destroy useful information. I don't think it will always be possible to retain enough back-mapping that the optimized code can be understood as if it had not been optimized. For example, the debug issue would still be present: if you run pdb and set a breakpoint on the "continue" line, it will never be hit. Even if the optimizer cleaned up after itself perfectly (in fact, especially so), that breakpoint will still not be hit. 
You simply cannot reason about optimized code without having to mentally
understand the transformations that have been applied.

The whole point of this proposal is to recognize that there are times
(debugging, coverage measurement) when optimizations are harmful, and to
avoid them.

>
> That said, I think Ned's proposal is fairly simple. As for the
> increased testing load, I think the extra cost would be the
> duplication of the buildbots (or the adjustment of their setup to test
> with -O and -O0 flags). Is it still the case that -O effectively does
> nothing (maybe only eliding __debug__ checks)?
>
> Skip
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From p.f.moore at gmail.com Thu May 22 17:39:57 2014
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 22 May 2014 16:39:57 +0100
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537E17C3.8000307@nedbatchelder.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com>
	<537DFA95.4040000@nedbatchelder.com> <537E17C3.8000307@nedbatchelder.com>
Message-ID: 

On 22 May 2014 16:29, Ned Batchelder wrote:
> I would not say the idea was debated. You can read the (short) thread here:
> https://mail.python.org/pipermail/python-dev/2012-December/123022.html .
> Mark Shannon proposed emitting different bytecode for while loops and some
> other constructs. Guido said no PEP was needed. Nick Coghlan said "main
> challenge is to keep stepping through the code with pdb sane" (I agree with
> that!). I said it would be good to have a way to disable optimizations,
> Guido said "+1".
>
> I put this idea here because the discussion on issue2506 got involved enough
> that someone suggested this was the right place for it. I linked to Guido's
> sentiment in my initial post here, and had hoped that he would chime in.

OK, thanks for the summary.

Personally, I still think the biggest issue is around pyc files. I
think any proposal needs an answer to that (even if it's just that
no-optimisation mode never reads or writes bytecode files). Expecting
users to manually manage pyc files is a bad idea. Well, that and any
implementation complexity, which I'll leave to others to consider.

Paul

From mal at egenix.com Thu May 22 17:40:56 2014
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 22 May 2014 17:40:56 +0200
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537E1891.5050808@nedbatchelder.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us>
	<537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol>
	<537E1891.5050808@nedbatchelder.com>
Message-ID: <537E1A88.8080501@egenix.com>

On 22.05.2014 17:32, Ned Batchelder wrote:
>
> The whole point of this proposal is to recognize that there are times (debugging, coverage
> measurement) when optimizations are harmful, and to avoid them.

+1

It's regular practice in other languages to disable optimizations when
debugging code. I don't see why Python should be different in this
respect.

Debuggers, testing, coverage and other such tools should be able to
invoke a Python runtime mode that lets the compiler work strictly by
the book, without applying any kind of optimization.
This used to be the default in Python, but over the years, we gradually
moved away from this as the default, with no options to get the old
non-optimizing behavior back.

I think it's fine to make safe optimizations default in Python, but
there's definitely a need for being able to run Python in a debugger
without having it skip perfectly valid code lines (even if they are
no-ops).

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 22 2014)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From mal at egenix.com Thu May 22 17:41:52 2014
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 22 May 2014 17:41:52 +0200
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: 
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com>
	<537DFA95.4040000@nedbatchelder.com> <537E17C3.8000307@nedbatchelder.com>
Message-ID: <537E1AC0.1000708@egenix.com>

On 22.05.2014 17:39, Paul Moore wrote:
> On 22 May 2014 16:29, Ned Batchelder wrote:
>> I would not say the idea was debated. You can read the (short) thread here:
>> https://mail.python.org/pipermail/python-dev/2012-December/123022.html .
>> Mark Shannon proposed emitting different bytecode for while loops and some
>> other constructs. Guido said no PEP was needed. Nick Coghlan said "main
>> challenge is to keep stepping through the code with pdb sane" (I agree with
>> that!). I said it would be good to have a way to disable optimizations,
>> Guido said "+1".
>>
>> I put this idea here because the discussion on issue2506 got involved enough
>> that someone suggested this was the right place for it. I linked to Guido's
>> sentiment in my initial post here, and had hoped that he would chime in.
>
> OK, thanks for the summary.
>
> Personally, I still think the biggest issue is around pyc files. I
> think any proposal needs an answer to that (even if it's just that
> no-optimisation mode never reads or writes bytecode files). Expecting
> users to manually manage pyc files is a bad idea. Well, that and any
> implementation complexity, which I'll leave to others to consider.

Why not simply have the new option disable writing PYC files?

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 22 2014)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From p.f.moore at gmail.com Thu May 22 17:46:13 2014
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 22 May 2014 16:46:13 +0100
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537E1AC0.1000708@egenix.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com>
	<537DFA95.4040000@nedbatchelder.com> <537E17C3.8000307@nedbatchelder.com>
	<537E1AC0.1000708@egenix.com>
Message-ID: 

On 22 May 2014 16:41, M.-A. Lemburg wrote:
> Why not simply have the new option disable writing PYC files?

That's what I said. But you also need to not read them as well,
because otherwise you could read an optimised file if the source
hasn't changed.

Paul

From mal at egenix.com Thu May 22 17:49:27 2014
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 22 May 2014 17:49:27 +0200
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: 
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com>
	<537DFA95.4040000@nedbatchelder.com> <537E17C3.8000307@nedbatchelder.com>
	<537E1AC0.1000708@egenix.com>
Message-ID: <537E1C87.7020407@egenix.com>

On 22.05.2014 17:46, Paul Moore wrote:
> On 22 May 2014 16:41, M.-A. Lemburg wrote:
>> Why not simply have the new option disable writing PYC files?
>
> That's what I said. But you also need to not read them as well,
> because otherwise you could read an optimised file if the source
> hasn't changed.

Good point :-)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 22 2014)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From victor.stinner at gmail.com Thu May 22 17:55:33 2014
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 22 May 2014 17:55:33 +0200
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: <537C888D.7060903@nedbatchelder.com>
References: <537C888D.7060903@nedbatchelder.com>
Message-ID: 

Hi,

2014-05-21 13:05 GMT+02:00 Ned Batchelder :
> A long-standing problem with CPython is that the peephole optimizer cannot
> be completely disabled.

I had a similar concern when I worked on my astoptimizer project. I
wanted to reimplement the peephole optimizer using the AST instead of
the bytecode. Since the peephole optimizer is always on, I was not able
to compare the bytecode generated by my AST optimizer without the
peephole optimizer to the bytecode generated with the peephole
optimizer.

I would also be curious to see the code before the peephole optimizer
modifies it.
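One can at least observe the optimizer's effect after the fact by
disassembling a trivial snippet (a small sketch using the standard dis
module):

    import dis

    # With the peephole optimizer, 2 * 3 is folded at compile time: the
    # disassembly shows a single LOAD_CONST of 6 instead of LOAD_CONST 2,
    # LOAD_CONST 3, BINARY_MULTIPLY.
    dis.dis(compile("x = 2 * 3", "<example>", "exec"))

But what I really want is to see what the compiler would emit *without*
that pass.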
> If you execute "python3.4 -m trace -c -m continue.py", it produces this
> continue.cover file:
>
>     1: a = b = c = 0
>   101: for n in range(100):
>   100:     if n % 2:
>    50:         if n % 4:
>    50:             a += 1
> >>>>>>             continue
>            else:
>    50:         b += 1
>    50:     c += 1
>     1: assert a == 50 and b == 50 and c == 50
>
> This indicates that the continue line is not executed.

I have spent long hours in gdb, and this is a common issue with
compiler optimizations. In gdb, the program sometimes looks like it
goes backward, or re-executes the same instruction twice. I hate losing
my time with that; I prefer to recompile the whole (C) application with
gcc -O0 -ggdb.

> ** User Interface
>
> Unfortunately, the -O command-line switch does not lend itself to a new
> value that means, "less optimization than the default." I propose a new
> switch -P, to control the peephole optimizer, with a value of -P0 meaning no
> optimization at all. The PYTHONPEEPHOLE environment variable would also
> control the option.

I propose "python -X nopeephole", "python -X peephole=0" or
"python -X optim=0".

I don't like "python -X peephole=0" because "python -X peephole" should
activate the optimizer, which is already the default.

For the "optim" proposal, should we keep it in sync with -O and -OO?

    (no -O equivalent)   : -X optim=0
    (default)          <=> -X optim=1
    -O                 <=> -X optim=2
    -OO                <=> -X optim=3

I have never understood -O and -OO. What is optimized, exactly? To me,
stripping docstrings is not really an "optimization"; it should be a
different option. Because of this confusion, the peephole optimizer
option should maybe be disconnected from -O and -OO. So take
"python -X nopeephole".

IMO you should not write .pyc or .pyo files if the peephole optimizer is
deactivated. It avoids the question of "was this .pyc generated with or
without the peephole optimizer?". Usually, when you disable
optimizations, you don't care about performance (.pyc files are created
to speed up Python startup time).

I also suggest to add a new flag to the builtin compile() function:
PyCF_NO_PEEPHOLE.

> There are about a dozen places internal to CPython where optimization level
> is indicated with an integer, for example, in Py_CompileStringObject. Those
> uses also don't allow for new values indicating less optimization than the
> default: 0 and -1 already have meanings. Unless we want to start using -2
> for less than the default. I'm not sure we need to provide for those
> values, or if the PYTHONPEEPHOLE environment variable provides enough
> control.

Add a new flag to sys.flags: "peephole" or "peephole_optimizer"
(boolean, True by default).

Victor

From ethan at stoneleaf.us Thu May 22 17:43:35 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 22 May 2014 08:43:35 -0700
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537E1891.5050808@nedbatchelder.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us>
	<537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol>
	<537E1891.5050808@nedbatchelder.com>
Message-ID: <537E1B27.7050201@stoneleaf.us>

On 05/22/2014 08:32 AM, Ned Batchelder wrote:
> On 5/22/14 9:49 AM, Skip Montanaro wrote:
>> On Thu, May 22, 2014 at 8:05 AM, Chris Angelico wrote:
>>>
>>> Correct me if I'm wrong, but as I understand it, the problem is that
>>> the peephole optimizer eliminated an entire line of code. Would it be
>>> possible to have it notice when it merges two pieces from different
>>> lines, and somehow mark that the resulting bytecode comes from both
>>> lines?
>>> That would solve the breakpoint and coverage problems
>>> simultaneously.
>>
>> It seems to me that Ned has revealed a bug in the peephole optimizer.
>> It zapped an entire source line's worth of bytecode, but failed to
>> delete the relevant entry in the line number table of the resulting
>> code object. If I had my druthers, that would be the change I'd
>> prefer.
>
> I think it is the nature of optimization that it will destroy useful information. I don't think it will always be
> possible to retain enough back-mapping that the optimized code can be understood as if it had not been optimized. For
> example, the debug issue would still be present: if you run pdb and set a breakpoint on the "continue" line, it will
> never be hit. Even if the optimizer cleaned up after itself perfectly (in fact, especially so), that breakpoint will
> still not be hit. You simply cannot reason about optimized code without having to mentally understand the
> transformations that have been applied.
>
> The whole point of this proposal is to recognize that there are times (debugging, coverage measurement) when
> optimizations are harmful, and to avoid them.

Having read through the issue on the tracker, I find myself swayed
towards Ned's point of view. However, I do still agree with Raymond
that a full-fledged command-line switch is overkill, especially since
the unoptimized runs serve very special cases (debugging, coverage,
curiosity, learning about optimizing, etc.).

If we had a sys flag that could be set before a module was loaded, then
coverage, pdb, etc., could use that to recompile the source, not save a
.pyc file, and move forward. For debugging purposes perhaps a
`__no_optimize__ = True` or `from __future__ import no_optimize` would
help in those cases where you're dropping into the debugger.

The dead-code elimination still has a bug to be fixed, though, because
if a line has been optimized away, trying to set a break-point at it
should fail.

--
~Ethan~

From barry at python.org Thu May 22 18:41:32 2014
From: barry at python.org (Barry Warsaw)
Date: Thu, 22 May 2014 12:41:32 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us>
	<537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol>
Message-ID: <20140522124132.14bffcb3@anarchist.wooz.org>

On May 22, 2014, at 10:02 AM, Paul Moore wrote:

>As a concrete example, note my earlier comment about pyc files.
>Switching off optimisation results in unoptimised bytecode being
>written to pyc files, which could then be read in a subsequent
>(supposedly) optimised run.

Seems to me that PEP 3147 tagging could be extended to describe various
optimization levels. It might even be nice to get rid of the overloaded
.pyo files. The use of .pyo for both -O and -OO optimization levels
causes some issues.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: not available
URL: 

From ericsnowcurrently at gmail.com Thu May 22 18:47:40 2014
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 22 May 2014 10:47:40 -0600
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: 
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com>
	<537DFA95.4040000@nedbatchelder.com> <537E17C3.8000307@nedbatchelder.com>
Message-ID: 

On May 22, 2014 9:40 AM, "Paul Moore" wrote:
> Personally, I still think the biggest issue is around pyc files. I
> think any proposal needs an answer to that (even if it's just that
> no-optimisation mode never reads or writes bytecode files).

So the flag for that would be set implicitly? That sounds reasonable
(and easy).

> Expecting
> users to manually manage pyc files is a bad idea. Well, that and any
> implementation complexity, which I'll leave to others to consider.

As a fallback, Victor already pointed out that changing
sys.implementation.cache_tag would be easy too.

-eric
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From Steve.Dower at microsoft.com Thu May 22 17:41:31 2014
From: Steve.Dower at microsoft.com (Steve Dower)
Date: Thu, 22 May 2014 15:41:31 +0000
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: 
References: <537C888D.7060903@nedbatchelder.com>
	<20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net>
	<537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com>
	<537DFBCA.2070006@nedbatchelder.com>
Message-ID: 

Guido van Rossum wrote:
> FWIW, I am strictly with Ned here.

As someone who maintains/develops a debugger for Python, I'm with Ned
as well (and also Raymond, since I really don't want to have to worry
about one-more-mode that Python might be running in).

Why not move the existing optimisation into -O mode and put future
optimisations in there too? It may just start having enough value that
people switch to using it.

From stefan at bytereef.org Thu May 22 19:23:02 2014
From: stefan at bytereef.org (Stefan Krah)
Date: Thu, 22 May 2014 19:23:02 +0200
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: 
References: <537C888D.7060903@nedbatchelder.com>
Message-ID: <20140522172302.GA25613@sleipnir.bytereef.org>

Victor Stinner wrote:
> I have never understood -O and -OO. What is optimized, exactly? To me,
> stripping docstrings is not really an "optimization"; it should be a
> different option.

Indeed, it should be -Os (optimize for space).

Stefan Krah

From ethan at stoneleaf.us Thu May 22 19:29:48 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 22 May 2014 10:29:48 -0700
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: 
References: <537C888D.7060903@nedbatchelder.com>
	<20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net>
	<537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com>
	<537DFBCA.2070006@nedbatchelder.com>
Message-ID: <537E340C.9090001@stoneleaf.us>

On 05/22/2014 08:41 AM, Steve Dower wrote:
> Guido van Rossum wrote:
>>
>> FWIW, I am strictly with Ned here.
>
> As someone who maintains/develops a debugger for Python, I'm with
> Ned as well (and also Raymond, since I really don't want to have
> to worry about one-more-mode that Python might be running in).
>
> Why not move the existing optimisation into -O mode and put future
> optimisations in there too?
> It may just start having enough value
> that people switch to using it.

I will admit to being very surprised the day I realized that the normal run mode for python is debugging mode! For anyone who hasn't yet realized this, without -O, __debug__ is True, but with any -O __debug__ is False.

Given that, it does seem kind of odd to have source-altering optimizations active when __debug__ is True. Of course, we can't change that mid-3.x stream. However, we could turn off optimizations by default, and then have -O remove assertions /and/ turn on optimizations.

Which would still work nicely with .pyc and .pyo files as ... wait, let me make a table:

flag    | optimizations      | saved files
--------+--------------------+--------------
none    | none               | none
--------+--------------------+--------------
-O      | asserts removed    | .pyc
        | peephole, etc.     |
--------+--------------------+--------------
-OO     | -O plus            |
        | docstrings removed | .pyo

That would certainly make the -O flags make more sense than they do now. It would also emphasize the fact that assert is not for user data verification. ;)

--
~Ethan~

From steve at pearwood.info  Thu May 22 19:59:10 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 23 May 2014 03:59:10 +1000
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To:
References: <537C888D.7060903@nedbatchelder.com>
 <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net>
 <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com>
 <537DFBCA.2070006@nedbatchelder.com>
Message-ID: <20140522175910.GM10355@ando>

On Thu, May 22, 2014 at 03:41:31PM +0000, Steve Dower wrote:

> Why not move the existing optimisation into -O mode and put future
> optimisations in there too? It may just start having enough value that
> people switch to using it.

I just had the same idea, you beat me to it.

There's a steady but small stream of people asking "why do we have -O, it does so little we might as well get rid of it". If I remember correctly (and apologies if I do not), Guido has even suggested getting rid of simple constant folding. So let's make -O more attractive, while simplifying the default behaviour:

* By default, no optimizations operate at all.

* With -O, you get assert disabling, the tricky string concatenation
  optimization, constant folding, and whatever else the peepholer does.

* The double -OO switch should be deprecated, for eventual removal
  in the very distant future. (4.0? 5.0?)

* Instead, a separate switch for removing docstrings can be added,
  to support implementations in low-memory devices or other
  constrained situations.

This will make Python's compilation model a little more familiar to people coming from other languages. It will make -O more attractive, instead of being viewed by some as a waste of effort, and ensure that by default there are no tricks played with byte-code.

A big advantage: we already have separate .pyo and .pyc files, so no risk of confusion.

Downside of this suggestion:

- To the extent that constant folding and other optimizations actually lead
  to a speed-up, turning them off by default will be a performance regression.

- Experienced programmers ought to know not to rely on the string
  concatenation optimization, as it is non-portable and prone to surprising
  failures even in CPython. The optimization really only exists for naive
  programmers, but they are unlikely to know about, or bother using, -O to
  get that optimization.

- Surely I don't expect PyPy to perform no optimizations at all unless the
  -O switch is given?
  I'd have to be mad to suggest that.

--
Steven

From njs at pobox.com  Thu May 22 19:16:33 2014
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 22 May 2014 18:16:33 +0100
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537E1891.5050808@nedbatchelder.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
 <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us>
 <537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol>
 <537E1891.5050808@nedbatchelder.com>
Message-ID:

On Thu, May 22, 2014 at 4:32 PM, Ned Batchelder wrote:
> On 5/22/14 9:49 AM, Skip Montanaro wrote:
>> It seems to me that Ned has revealed a bug in the peephole optimizer.
>> It zapped an entire source line's worth of bytecode, but failed to
>> delete the relevant entry in the line number table of the resulting
>> code object. If I had my druthers, that would be the change I'd
>> prefer.
>
> I think it is the nature of optimization that it will destroy useful
> information. I don't think it will always be possible to retain enough
> back-mapping that the optimized code can be understood as if it had not been
> optimized. For example, the debug issue would still be present: if you run
> pdb and set a breakpoint on the "continue" line, it will never be hit. Even
> if the optimizer cleaned up after itself perfectly (in fact, especially so),
> that breakpoint will still not be hit. You simply cannot reason about
> optimized code without having to mentally understand the transformations
> that have been applied.

In this particular case, the back-mapping problem is pretty minor. IIUC the optimization is that if we have (abusing BASIC notation)

10 GOTO 20
20 GOTO 30
30 ...

then in fact the operations at lines 10 and 20 are, from the point of view of the rest of the program, indivisible -- every time you execute 10 you also execute 20, there is no way to tell from outside whether we paused in between executing 10 and 20, etc. Effectively we just have a single uber-instruction that does both:

(10, 20) GOTO 30
30 ...

So from the coverage point of view, just marking line 20 as covered every time line 10 is executed is the Right Thing To Do. From the debugging point of view, a breakpoint set at line 20 should just trip whenever line 10 is executed -- it's not like there's any way to tell whether we're "half way through" the jump sequence or not. It's a pretty solid abstraction.

-n

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

From steve at pearwood.info  Thu May 22 20:14:11 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 23 May 2014 04:14:11 +1000
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: <537E340C.9090001@stoneleaf.us>
References: <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net>
 <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com>
 <537DFBCA.2070006@nedbatchelder.com> <537E340C.9090001@stoneleaf.us>
Message-ID: <20140522181411.GN10355@ando>

On Thu, May 22, 2014 at 10:29:48AM -0700, Ethan Furman wrote:

> However, we could turn off optimizations by default, and then have -O
> remove assertions /and/ turn on optimizations.
>
> Which would still work nicely with .pyc and .pyo files as ... wait, let me
> make a table:
>
> flag    | optimizations      | saved files
> --------+--------------------+--------------
> none    | none               | none
> --------+--------------------+--------------
> -O      | asserts removed    | .pyc
>         | peephole, etc.     |
> --------+--------------------+--------------
> -OO     | -O plus            |
>         | docstrings removed | .pyo

I think we still want to cache byte code in .pyc files by default. Technically, yes, it's an optimization, but it's not the sort of optimization that makes a difference to debugging[1]. As I understand it, generating the parse tree is *extremely* expensive. Run python -v to see just how many modules would have to be parsed and compiled every single time without the cached .pyc files.

> That would certainly make the -O flags make more sense than they do now.
> It would also emphasize the fact that assert is not for user data
> verification. ;)

:-)

[1] Except perhaps under very rare and unusual circumstances, but there are already mechanisms in place to disable the generation of .pyc files.

--
Steven

From ericsnowcurrently at gmail.com  Thu May 22 20:49:32 2014
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 22 May 2014 12:49:32 -0600
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: <20140522175910.GM10355@ando>
References: <537C888D.7060903@nedbatchelder.com>
 <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net>
 <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com>
 <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando>
Message-ID:

On Thu, May 22, 2014 at 11:59 AM, Steven D'Aprano wrote:
> On Thu, May 22, 2014 at 03:41:31PM +0000, Steve Dower wrote:
>
>> Why not move the existing optimisation into -O mode and put future
>> optimisations in there too? It may just start having enough value that
>> people switch to using it.
>
> I just had the same idea, you beat me to it.

Same here. More concretely:

-O0 -- no optimizations at all
-O1 -- level 1 optimizations (current peephole optimizations), asserts disabled
-O2 -- level 2 optimizations (currently nothing extra)
-O3 -- ...
-ONs or -X nodocstrings or -X compact or --compact or --nodocstrings
    -- remove docstrings (for space savings)
--debug or -X debug -- sets __debug__ to True (also implies -O0)

Compatibility (keeping the current behavior):

Default: -O + __debug__ = True (deprecate setting __debug__ to True?)
-O -- same as -O
-OO -- same as -Os (deprecate)

Having the current optimizations correspond to -O1 makes sense in that we don't have anything more granular. However, if more optimizations were added I'd expect them to fall under a higher optimization level. Adding a new option just for docstrings/compact seems like a waste, so I like Stefan's idea of optionally appending "s" (for space) onto the -O option.

As Barry noted, we would also build on PEPs 3147/3149 to add a tag for the optimization level, etc. The default mode would keep the current cache tag and -O/-OO would likewise stay the same (with the .pyo suffix).

> * The double -OO switch should be deprecated, for eventual removal
> in the very distant future. (4.0? 5.0?)

Good idea.

> * Instead, a separate switch for removing docstrings can be added,
> to support implementations in low-memory devices or other
> constrained situations.

Also a good idea.

> This will make Python's compilation model a little more familiar to
> people coming from other languages. It will make -O more attractive,
> instead of being viewed by some as a waste of effort, and ensure that by
> default there are no tricks played with byte-code.
+1 -eric From ericsnowcurrently at gmail.com Thu May 22 20:57:41 2014 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 22 May 2014 12:57:41 -0600 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> Message-ID: On Thu, May 22, 2014 at 12:49 PM, Eric Snow wrote: > Same here. More concretely: ... Having said that, revamping those options and our current optimization mechanism is a far cry from just adding -X nopeephole as Ned has implied. While the former may make sense on its own, those broader changes may languish as nice-to-haves. It may be better to go with the latter in the short-term while the broader changes swirl in the maelstrom of discussion indefinitely. -eric From stefan_ml at behnel.de Thu May 22 21:11:46 2014 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 22 May 2014 21:11:46 +0200 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> Message-ID: Brett Cannon, 21.05.2014 15:51: > There are constant rumblings about trying to make .pyc/.pyo aware of what > optimizations were applied so that this kind of thing wouldn't occur. It > would require tweaking how optimizations are expressed/added so that they > are more easily controlled and can somehow contribute to the labeling of > what optimizations were applied. All totally doable but will require > thinking about the proper API and such (reading .pyc/.pyo files would also > break but that's happened before when we added file size to the header and > .pyc/.pyo files are viewed as internal optimizations anyway). It might be possible to move the peephole optimiser run into the code loader, i.e. the .pyc files could be written out *before* it runs, as plain unoptimised byte code. There might be a tiny performance impact on load, but I doubt that it would be serious. Stefan From ned at nedbatchelder.com Thu May 22 22:10:11 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Thu, 22 May 2014 16:10:11 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <537E1C87.7020407@egenix.com> References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com> <537DFA95.4040000@nedbatchelder.com> <537E17C3.8000307@nedbatchelder.com> <537E1AC0.1000708@egenix.com> <537E1C87.7020407@egenix.com> Message-ID: <537E59A3.8090100@nedbatchelder.com> On 5/22/14 11:49 AM, M.-A. Lemburg wrote: > On 22.05.2014 17:46, Paul Moore wrote: >> On 22 May 2014 16:41, M.-A. Lemburg wrote: >>> Why not simply have the new option disable writing PYC files ? >> That's what I said. But you also need to not read them as well, >> because otherwise you could read an optimised file if the source >> hasn't changed. > Good point :-) > For the use-case I am considering, it would be best to write .pyc files as usual. These are large test suites that already have detailed choreography, usually involving new working trees for each run, or explicitly deleted pyc files. Avoiding pyc's altogether will slow things down, and test suites are universally considered to take too long as it is. --Ned. 
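[The "mechanism to avoid writing bytecode" referred to in this exchange is the existing -B switch and its equivalents, all standard CPython; a minimal illustration:]

    # Three existing ways to stop CPython from *writing* .pyc files (note
    # that they do not stop it *reading* ones that already exist, which is
    # Paul's point above):
    #
    #   python -B myscript.py
    #   PYTHONDONTWRITEBYTECODE=1 python myscript.py
    #
    import sys
    sys.dont_write_bytecode = True   # affects modules imported after this point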
From ned at nedbatchelder.com Thu May 22 22:13:48 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Thu, 22 May 2014 16:13:48 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <537E1B27.7050201@stoneleaf.us> References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us> <537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol> <537E1891.5050808@nedbatchelder.com> <537E1B27.7050201@stoneleaf.us> Message-ID: <537E5A7C.3080708@nedbatchelder.com> On 5/22/14 11:43 AM, Ethan Furman wrote: > On 05/22/2014 08:32 AM, Ned Batchelder wrote: >> On 5/22/14 9:49 AM, Skip Montanaro wrote: >>> On Thu, May 22, 2014 at 8:05 AM, Chris Angelico wrote: >>>> >>>> Correct me if I'm wrong, but as I understand it, the problem is that >>>> the peephole optimizer eliminated an entire line of code. Would it be >>>> possible to have it notice when it merges two pieces from different >>>> lines, and somehow mark that the resulting bytecode comes from both >>>> lines? That would solve the breakpoint and coverage problems >>>> simultaneously. >>> >>> It seems to me that Ned has revealed a bug in the peephole optimizer. >>> It zapped an entire source line's worth of bytecode, but failed to >>> delete the relevant entry in the line number table of the resulting >>> code object. If I had my druthers, that would be the change I'd >>> prefer. >> >> I think it is the nature of optimization that it will destroy useful >> information. I don't think it will always be >> possible to retain enough back-mapping that the optimized code can be >> understood as if it had not been optimized. For >> example, the debug issue would still be present: if you run pdb and >> set a breakpoint on the "continue" line, it will >> never be hit. Even if the optimizer cleaned up after itself >> perfectly (in fact, especially so), that breakpoint will >> still not be hit. You simply cannot reason about optimized code >> without having to mentally understand the >> transformations that have been applied. >> >> The whole point of this proposal is to recognize that there are times >> (debugging, coverage measurement) when >> optimizations are harmful, and to avoid them. > > Having read through the issue on the tracker, I find myself swayed > towards Neds point of view. However, I do still agree with Raymond > that a full-fledged command-line switch is overkill, especially since > the unoptimized runs are very special-cased (meaning useful for > debugging, coverage, curiosity, learning about optimizing, etc). > I'm perfectly happy to drop the idea of the command-line switch. An environment variable would be a fine way to control this behavior. > If we had a sys flag that could be set before a module was loaded, > then coverage, pdb, etc., could use that to recompile the source, not > save a .pyc file, and move forward. For debugging purposes perhaps a > `__no_optimize__ = True` or `from __future__ import no_optimize` would > help in those cases where you're dropping into the debugger. I don't understand these ideas, but having to add an import to the top of the file seems like a non-starter to me. > > The dead-code elimination still has a bug to be fixed, though, because > if a line has been optimized away trying to set a break-point at it > should fail. If we get a way to disable optimization, we don't need to fix that bug. Everyone knows that optimized code acts oddly in debuggers. 
:) > > -- > ~Ethan~ From ned at nedbatchelder.com Thu May 22 22:17:18 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Thu, 22 May 2014 16:17:18 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us> <537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol> <537E1891.5050808@nedbatchelder.com> Message-ID: <537E5B4E.8050905@nedbatchelder.com> On 5/22/14 1:16 PM, Nathaniel Smith wrote: > On Thu, May 22, 2014 at 4:32 PM, Ned Batchelder wrote: >> On 5/22/14 9:49 AM, Skip Montanaro wrote: >>> It seems to me that Ned has revealed a bug in the peephole optimizer. >>> It zapped an entire source line's worth of bytecode, but failed to >>> delete the relevant entry in the line number table of the resulting >>> code object. If I had my druthers, that would be the change I'd >>> prefer. >> I think it is the nature of optimization that it will destroy useful >> information. I don't think it will always be possible to retain enough >> back-mapping that the optimized code can be understood as if it had not been >> optimized. For example, the debug issue would still be present: if you run >> pdb and set a breakpoint on the "continue" line, it will never be hit. Even >> if the optimizer cleaned up after itself perfectly (in fact, especially so), >> that breakpoint will still not be hit. You simply cannot reason about >> optimized code without having to mentally understand the transformations >> that have been applied. > In this particular case, the back-mapping problem is pretty minor. > IIUC the optimization is that if we have (abusing BASIC notation) > > 10 GOTO 20 > 20 GOTO 30 > 30 ... > > then in fact the operations at lines 10 and 20 are, from the point of > view of the rest of the program, indivisible -- every time you execute > 10 you also execute 20, there is no way to tell from outside whether > we paused in betwen executing 10 and 20, etc. Effectively we just have > a single uber-instruction that does both: > > (10, 20) GOTO 30 > 30 ... > > So from the coverage point of view, just marking line 20 as covered > every time line 10 is executed is the Right Thing To Do. From the > debugging point of view, a breakpoint set at line 20 should just trip > whenever line 10 is executed -- it's not like there's any way to tell > whether we're "half way through" the jump sequence or not. It's a > pretty solid abstraction. > > -n > You've used the word "just" three times, glossing over the fact that we have no facility for marking statements as an uber instruction, and you've made no proposal for how it might work. Even if we build (and test!) a way to do that, it only covers this particular kind of oddity with optimized code. --Ned. From ned at nedbatchelder.com Thu May 22 22:26:15 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Thu, 22 May 2014 16:26:15 -0400 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> Message-ID: <537E5D67.90101@nedbatchelder.com> On 5/22/14 2:57 PM, Eric Snow wrote: > On Thu, May 22, 2014 at 12:49 PM, Eric Snow wrote: >> Same here. More concretely: > ... 
>
> Having said that, revamping those options and our current optimization
> mechanism is a far cry from just adding -X nopeephole as Ned has
> implied. While the former may make sense on its own, those broader
> changes may languish as nice-to-haves. It may be better to go with
> the latter in the short-term while the broader changes swirl in the
> maelstrom of discussion indefinitely.

I get distracted (by work...) for the afternoon, and things take an unexpected turn!

I definitely did not mean to throw open the floodgates to reconsider the entire -O switch. I agree that the -O switch seems like too much UI for too little change in results, and I think a different set of settings and defaults makes more sense. But I do not suppose that we have much appetite to take on that large a change.

For my purposes, an environment variable and no change or addition to the switches would be fine.

--Ned

> -eric
>

From tjreedy at udel.edu  Fri May 23 02:45:33 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 22 May 2014 20:45:33 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <20140522104334.36b2b07f@fsol>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
 <20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com>
 <20140522104334.36b2b07f@fsol>
Message-ID:

On 5/22/2014 4:43 AM, Antoine Pitrou wrote:
> On Thu, 22 May 2014 02:44:52 -0400
> Terry Reedy wrote:
>>
>> When I used coverage (last summer) with tested Idle modules, I could not
>> get a reported 100% coverage because coverage counts the body of a final
>> "if __name__ == '__main__':" statement.
>
> There are flags to modify this behaviour.

Not directly, but yes, indirectly via --rcfile=FILE where FILE defaults to .coveragerc and the configuration file has

[report]
exclude_lines =
    if __name__ == .__main__.:

I believe Ned pointed that out to me when I reported the 'problem' to him. If 'continue' were added under 'exclude_lines', the 'can't get 100% coverage' continue issue should go away also. (Yes, I know it is not quite that simple, as there will be times when continue is skipped that should be reported. But I suspect that there will nearly always be some other line skipped and reported, so that a false 100% will be rare.)

--
Terry Jan Reedy

From ethan at stoneleaf.us  Fri May 23 02:40:51 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 22 May 2014 17:40:51 -0700
Subject: [Python-ideas] Disabling optimizations
Message-ID: <537E9913.7070501@stoneleaf.us>

So, to hopefully summarize where we seem to have come to something of a consensus:

- disabling optimizations can be a good thing

- creating a new command-line switch is an overpowered solution

- having a sys flag could work

- redefining the existing -O switch could work

- care must be taken to properly handle what is written to .pyc/.pyo files

Personally, I could live with either a sys flag type solution or the -O solution, but I strongly favor the -O solution.

Why?

Partly because -O is for optimizations, so it naturally lends itself to turning them off; partly because I think the current state of the -O switches is sub-optimal (almost-pun intended ;); partly because I see assert being used incorrectly and want to encourage the use of at least -O; partly because running in __debug__ mode by default seems a bit strange; and partly because running in __debug__ mode but having optimizations turned on also seems a bit strange.

I think the big question if we go this route is what gets written to pyc files, and what to pyo files?
-- ~Ethan~ From ned at nedbatchelder.com Fri May 23 03:44:27 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Thu, 22 May 2014 21:44:27 -0400 Subject: [Python-ideas] Disabling optimizations In-Reply-To: <537E9913.7070501@stoneleaf.us> References: <537E9913.7070501@stoneleaf.us> Message-ID: <537EA7FB.8070806@nedbatchelder.com> On 5/22/14 8:40 PM, Ethan Furman wrote: > So, to hopefully summarize where we seem to have come to something of > a consensus: > > - disabling optimizations can be a good thing > > - creating a new command-line switch is an overpowered solution > > - having a sys flag could work > > - redefining the existing -O switch could work > > - care must be taken to properly handle what is written to .pyc/.pyo > files > > Personally, I could live with either a sys flag type solution or the > -O solution, but I strongly favor the -O solution. > > Why? > > Partly because -O is for optimizations, so it naturally lends itself > to turning them off; partly because I think the current state of the > -O switches is sub-optimal (almost-pun intended ;); partly because I > see assert being used incorrectly and want to encourage the use of at > least -O; partly because running in __debug__ mode by default seems a > bit strange; and partly because running in __debug__ mode but having > optimizations turned on also seems a bit strange. > > I think the big question if we go this route is what gets written to > pyc files, and what to pyo files? I'm of the opinion that we don't need to segregate bytecode into different files depending on the options used to create the bytecode. How often is the same program run in the same place with different options at different times? I'm happy to have optimized and non-optimized code both written to .pyc files, and if you are fiddling with the options like that, you should delete your pyc files when you change the options. If we come up with a way to have the bytecode file-segregated, I'm OK with that too. I definitely don't like the alternative that says unoptimized code isn't written to disk at all. If people want to solve the problem that way, there is already a mechanism to avoid writing bytecode, you can use it with the optimizer controls to achieve the effect you want. --Ned. From cs at zip.com.au Fri May 23 03:56:57 2014 From: cs at zip.com.au (Cameron Simpson) Date: Fri, 23 May 2014 11:56:57 +1000 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <537E1B27.7050201@stoneleaf.us> References: <537E1B27.7050201@stoneleaf.us> Message-ID: <20140523015657.GA35202@cskk.homeip.net> On 22May2014 08:43, Ethan Furman wrote: >On 05/22/2014 08:32 AM, Ned Batchelder wrote: >>The whole point of this proposal is to recognize that there are times (debugging, coverage measurement) when >>optimizations are harmful, and to avoid them. > >Having read through the issue on the tracker, I find myself swayed >towards Neds point of view. I've been with Ned from the first post, but have been playing (slow) catchup on the discussion. I'd personally be fine with a -O0 command line switch in keeping with a somewhat common C-compiler convention, or with an environment variable. If all the optimizations in the compiler/interpreter are a distinct step, then having a switch that just says "skip this step, we do not want the naive code transformed at all" seems both desirable and easy. And finally, the sig quote below really did come up at random for this message. 
Cheers,
Cameron Simpson

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. - Donald Knuth

From tjreedy at udel.edu  Fri May 23 04:07:58 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 22 May 2014 22:07:58 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537E1A88.8080501@egenix.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
 <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us>
 <537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol>
 <537E1891.5050808@nedbatchelder.com> <537E1A88.8080501@egenix.com>
Message-ID:

On 5/22/2014 11:40 AM, M.-A. Lemburg wrote:
> On 22.05.2014 17:32, Ned Batchelder wrote:
>>
>> The whole point of this proposal is to recognize that there are times (debugging, coverage
>> measurement) when optimizations are harmful, and to avoid them.
>
> +1
>
> It's regular practice in other languages to disable optimizations
> when debugging code. I don't see why Python should be different in this
> respect.
>
> Debuggers, testing, coverage and other such tools should be able to
> invoke a Python runtime mode that let's the compiler work strictly
> by the book, without applying any kind of optimization.
>
> This used to be the default in Python,

I believe that Python has always had an 'as if' rule that allows more or less 'hidden' optimizations, as long as the net effect of a statement is as defined.

1. By the book, "a,b = b,a" means create a tuple from b,a, unpack the contents to a and b, and delete the reference to the tuple. An obvious optimization is to not create the tuple. As I remember, this was once tried out before tuple unpacking was generalized to iterable unpacking. I don't know if CPython was ever released with that optimization, or if other implementations have or do use it. By the 'as if' rule, it does not matter, even though an allocation tracer (such as the one added to 3.4?) might detect the non-allocation.

2. The manual says
'''
@f1(arg)
@f2
def func(): pass

is equivalent to

def func(): pass
func = f1(arg)(f2(func))
'''
The equivalent is 'as if', in net effect, not in the detailed process. CPython actually executes (or at least did at one time)

def (): pass
func = f1(arg)(f2())

Ignore f1. The difference can be detected when f2 is called by examining the appropriate namespace within f2. When someone filed an issue about the 'bug' of 'func' never being bound to the unwrapped function object, Guido said that he wanted to change neither the doc nor the implementation. (Sorry, I cannot find the issue.)

3. "a + b" is *usually* equivalent to "a.__class__.__add__(b)" or possibly "b.__class__.__radd__(a)". However, my understanding is that if a and b are ints, a 'fast path' optimization is applied that bypasses the int.__add__ slot wrapper. If so, a call tracer could notice the difference and, if unaware of such optimizations, falsely report a problem.

4. Some Python implementations delay object destruction. I suspect that some (many?) do not really destroy objects (zero out the memory block).

> but there's definitely a need for being able to run Python in
> a debugger without having it perfectly valid skip code lines
> (even if they are no ops).

This is a different issue from 'disable the peephole optimizer'.
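[Terry's point 1 can be checked directly against what the compiler emits, using the dis module; an illustrative session, with output abbreviated from a CPython 3.4 build. The second snippet shows the peephole constant folding at issue in this thread:]

    import dis

    # The swap compiles to a ROT_TWO, not to a tuple build/unpack -- the
    # 'hidden' optimization covered by the 'as if' rule:
    dis.dis(compile("a, b = b, a", "<example>", "exec"))
    #   LOAD_NAME    b
    #   LOAD_NAME    a
    #   ROT_TWO
    #   STORE_NAME   a
    #   STORE_NAME   b

    # The same pass also folds constants:
    dis.dis(compile("x = 1 + 2", "<example>", "exec"))
    #   LOAD_CONST   3    -- the addition never happens at run time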
--
Terry Jan Reedy

From ethan at stoneleaf.us  Fri May 23 04:03:53 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 22 May 2014 19:03:53 -0700
Subject: [Python-ideas] Disabling optimizations
In-Reply-To: <537E9913.7070501@stoneleaf.us>
References: <537E9913.7070501@stoneleaf.us>
Message-ID: <537EAC89.6090802@stoneleaf.us>

Oh, and just to be clear, we are only talking about optimizations that modify the byte-code, correct?

--
~Ethan~

From tjreedy at udel.edu  Fri May 23 04:53:28 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 22 May 2014 22:53:28 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537DFA95.4040000@nedbatchelder.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
 <20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com>
 <537DFA95.4040000@nedbatchelder.com>
Message-ID:

On 5/22/2014 9:24 AM, Ned Batchelder wrote:
> On 5/22/14 2:44 AM, Terry Reedy wrote:
>> On 5/21/2014 6:59 PM, Ned Batchelder wrote:
>>
>>> If by implementation details, you mean the word "peephole", then let's
>>> remove it, and simply have a switch that disables all optimization.
>>> Rather than limiting the future of the optimizer, it will provide an
>>> escape hatch for people who would rather not have the optimizer's
>>> effects.
>>
>> The presumption of this idea is that there is a proper, canonical
>> unoptimized version of 'compiled Python'. For Python there obviously
>> is not. For CPython, there is not either. What Raymond has been saying
>> is that the output of the CPython compiler is the output of the
>> CPython compiler.
> I'd like to understand why we think the Python compiler is different in
> this regard than a C compiler.

Python is a different language. But let us not get sidetracked on that.

> When this came up 18 months ago on Python-Dev, it was part of a thread
> about adding more optimizations to CPython. Guido said "+1" to the idea
> of being able to disable the optimizers
> (https://mail.python.org/pipermail/python-dev/2012-December/123099.html).

I read that and it is not clear to me exactly what his quick, top-posted '+1' really means.

I claimed in response to Marc-Andre that CPython has always had an as-if rule and numerous optimizations, some of which cannot, realistically, be disabled. Nor would we really want to disable 'all optimization' (as you requested in your post). My objection to 'disable the peephole optimizer' is that it likely disables too much, and perhaps too little (as more is done with asts). Also, it seems it may add a continuing burden to a relatively small core developer team, which also has a stdlib to maintain.

I think we should initially focus on the ghosting of 'continue'. While the coverage problem can be partly solved by adding 'continue' to 'exclude lines', that will not solve the problem of a debugger checkpoint not working. I think you could argue (very Pythonically ;-) that the total machine-time saving of ghosting 'continue' is not worth the extra time waste of humans. I would be happier removing that particular optimization than adding machinery to make it optional.

If, as has been proposed, some or all of the peephole (code) optimizations were moved to the ast stage, where continue jumps are still distinguished by Continue nodes, it might be easier to selectively avoid undesirable ghosting of continue statements.
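[A sketch of the workaround Terry describes, in .coveragerc syntax -- the 'continue' entry is the addition under discussion, not a recommendation; note that exclude_lines entries are regexes matched against source lines:]

    [report]
    exclude_lines =
        if __name__ == .__main__.:
        continue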
-- Terry Jan Reedy From ethan at stoneleaf.us Fri May 23 05:10:01 2014 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 22 May 2014 20:10:01 -0700 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com> <537DFA95.4040000@nedbatchelder.com> Message-ID: <537EBC09.6000004@stoneleaf.us> On 05/22/2014 07:53 PM, Terry Reedy wrote: > On 5/22/2014 9:24 AM, Ned Batchelder wrote: >> >> When this came up 18 months ago on Python-Dev, it was part of a thread >> about adding more optimizations to CPython. Guido said "+1" to the idea >> of being able to disable the optimizers > > I read that and it is not to me exactly what his quick, top-posted '+1' really means. In the interest of not debating what Guido meant way back when, he has posted (today?) that "I am strictly with Ned here." I think we can count that as a +1 for Ned's request. -- ~Ethan~ From stefan_ml at behnel.de Fri May 23 07:02:36 2014 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 23 May 2014 07:02:36 +0200 Subject: [Python-ideas] Disabling optimizations In-Reply-To: <537EA7FB.8070806@nedbatchelder.com> References: <537E9913.7070501@stoneleaf.us> <537EA7FB.8070806@nedbatchelder.com> Message-ID: Ned Batchelder, 23.05.2014 03:44: > I'm of the opinion that we don't need to segregate bytecode into different > files depending on the options used to create the bytecode. How often is > the same program run in the same place with different options at different > times? I'm happy to have optimized and non-optimized code both written to > .pyc files, and if you are fiddling with the options like that, you should > delete your pyc files when you change the options. If we come up with a > way to have the bytecode file-segregated, I'm OK with that too. > > I definitely don't like the alternative that says unoptimized code isn't > written to disk at all. If people want to solve the problem that way, > there is already a mechanism to avoid writing bytecode, you can use it with > the optimizer controls to achieve the effect you want. As I already proposed, we could get rid of .pyo files all together and only write unoptimised .pyc files, and then apply the optimisations at load time based on the current interpreter config. I think that would give us a good tradeoff between fast (precompiled) code loading and differing requirements on byte code optimisations. Stefan From stefan_ml at behnel.de Fri May 23 07:28:32 2014 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 23 May 2014 07:28:32 +0200 Subject: [Python-ideas] Disabling optimizations In-Reply-To: References: <537E9913.7070501@stoneleaf.us> <537EA7FB.8070806@nedbatchelder.com> Message-ID: Stefan Behnel, 23.05.2014 07:02: > Ned Batchelder, 23.05.2014 03:44: >> I'm of the opinion that we don't need to segregate bytecode into different >> files depending on the options used to create the bytecode. How often is >> the same program run in the same place with different options at different >> times? I'm happy to have optimized and non-optimized code both written to >> .pyc files, and if you are fiddling with the options like that, you should >> delete your pyc files when you change the options. If we come up with a >> way to have the bytecode file-segregated, I'm OK with that too. >> >> I definitely don't like the alternative that says unoptimized code isn't >> written to disk at all. 
If people want to solve the problem that way, >> there is already a mechanism to avoid writing bytecode, you can use it with >> the optimizer controls to achieve the effect you want. > > As I already proposed, we could get rid of .pyo files all together and only > write unoptimised .pyc files, and then apply the optimisations at load time > based on the current interpreter config. I think that would give us a good > tradeoff between fast (precompiled) code loading and differing requirements > on byte code optimisations. Stefan Krah already proposed -Os (optimise for space) for the cases where you want to reduce the size of the byte code file, e.g. by removing doc strings. That could become the next .pyo file. Although it's unclear to me why you would do that, instead of just compressing them. Stefan From ethan at stoneleaf.us Fri May 23 07:42:07 2014 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 22 May 2014 22:42:07 -0700 Subject: [Python-ideas] Disabling optimizations In-Reply-To: References: <537E9913.7070501@stoneleaf.us> <537EA7FB.8070806@nedbatchelder.com> Message-ID: <537EDFAF.4060802@stoneleaf.us> On 05/22/2014 10:02 PM, Stefan Behnel wrote: > Ned Batchelder, 23.05.2014 03:44: >> I'm of the opinion that we don't need to segregate bytecode into different >> files depending on the options used to create the bytecode. How often is >> the same program run in the same place with different options at different >> times? I'm happy to have optimized and non-optimized code both written to >> .pyc files, and if you are fiddling with the options like that, you should >> delete your pyc files when you change the options. If we come up with a >> way to have the bytecode file-segregated, I'm OK with that too. >> >> I definitely don't like the alternative that says unoptimized code isn't >> written to disk at all. If people want to solve the problem that way, >> there is already a mechanism to avoid writing bytecode, you can use it with >> the optimizer controls to achieve the effect you want. > > As I already proposed, we could get rid of .pyo files all together and only > write unoptimised .pyc files, and then apply the optimisations at load time > based on the current interpreter config. I think that would give us a good > tradeoff between fast (precompiled) code loading and differing requirements > on byte code optimisations. -1 The whole point of saving the compiled version to disk is to load-and-go. I have no problem with having the pyc contain the info on which optimizations it was compiled with, and if the current options are different then it gets recompiled. As Ned said, "How often is the same program run in the same place with different options at different times?" -- ~Ethan~ From p.f.moore at gmail.com Fri May 23 09:20:00 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 23 May 2014 08:20:00 +0100 Subject: [Python-ideas] Disabling optimizations In-Reply-To: <537EA7FB.8070806@nedbatchelder.com> References: <537E9913.7070501@stoneleaf.us> <537EA7FB.8070806@nedbatchelder.com> Message-ID: On 23 May 2014 02:44, Ned Batchelder wrote: > I'm happy to have optimized and non-optimized code both written to .pyc > files, and if you are fiddling with the options like that, you should delete > your pyc files when you change the options. 
Surely the net effect of this on your original issue would be that instead of people wondering why continue is not shown as covered, doing a lot of debugging, realising it was an eliminated line and moving on, you would have people wondering why continue is shown as not covered, doing a lot of debugging, realising they forgot to delete the pyc file, removing it, rerunning the coverage report and moving on? I doubt that diagnosing "I forgot to remove the pyc file, and it matters" would be much easier than the current situation. Both could pretty easily be documented in the coverage docs. -1 on having Python fail to distinguish pyc files that have different user-visible behaviour that we care about. Paul From mal at egenix.com Fri May 23 10:25:29 2014 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 23 May 2014 10:25:29 +0200 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us> <537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol> <537E1891.5050808@nedbatchelder.com> <537E1A88.8080501@egenix.com> Message-ID: <537F05F9.7070406@egenix.com> On 23.05.2014 04:07, Terry Reedy wrote: > On 5/22/2014 11:40 AM, M.-A. Lemburg wrote: >> On 22.05.2014 17:32, Ned Batchelder wrote: >>> >>> The whole point of this proposal is to recognize that there are times (debugging, coverage >>> measurement) when optimizations are harmful, and to avoid them. >> >> +1 >> >> It's regular practice in other languages to disable optimizations >> when debugging code. I don't see why Python should be different in this >> respect. >> >> Debuggers, testing, coverage and other such tools should be able to >> invoke a Python runtime mode that let's the compiler work strictly >> by the book, without applying any kind of optimization. >> >> This used to be the default in Python, > > I believe that Python has always had an 'as if' rule that allows more or less 'hidden' > optimizations, as long as the net effect of a statement is as defined. I was referring to the times before the peephole optimizer was introduced (Python 2.3 and earlier). What's important here is to look at the difference between what the compiler generates by simply following its rule book and the version of the byte code which is the result of running an optimizer on the byte code or even on the AST before running the transform to byte code. Note that I'm not talking about optimizations applied at the VM level implementations of bytecodes and I think neither was Ned. > 1. By the book, "a,b = b,a" means create a tuple from b,a, unpack the contents to a and b, and > delete the reference to the tuple. An obvious optimization is to not create the tuple. As I > remember, this was once tried out before tuple unpacking was generalized to iterable unpacking. I > don't know if CPython was ever released with that optimization, or if other implementations have or > do use it. By the 'as if' rule, it does not matter, even though an allocation tracer (such as the > one added to 3.4?) might detect the non-allocation. This is an implementation detail of the VM. The code generated by the compiler is byte code saying rotate the top two arguments on the stack (ROT_TWO). > 2. The manual says > ''' > @f1(arg) > @f2 > def func(): pass > > is equivalent to > > def func(): pass > func = f1(arg)(f2(func)) > ''' > The equivalent is 'as if', in net effect, not in the detailed process. 
CPython actually executes (or > at least did at one time) > > def (): pass > func = f1(arg)(f2()) > > Ignore f1. The difference can be detected when f2 is called by examining the approriate namespace > within f2. When someone filed an issue about the 'bug' of 'func' never being bound to the unwrapped > function object, Guido said that he neither wanted to change the doc or the implementation. (Sorry, > I cannot find the issue.) I'd put that under documentation bug, if at all :-) Note that the function func does get the name "func". It's just not bound to the name in the intermediate step, since the function object serves as parameter to the function f2. > 3. "a + b" is *usually* equivalent to "a.__class__.__add__(b)" or possibly > "b.__class__.__radd__(a)". However, my understanding is that if a and b are ints, a 'fast path' > optimization is applied that bypasses the int.__add slot wrapper. Is so, a call tracer could notice > the difference and if unaware of such optimizations, falsely report a problem. Again, this is an optimization in the implementation of the byte code, not one applied by the compiler. There are quite a few more such optimizations going in the VM. > 4. Some Python implementations delay object destruction. I suspect that some (many?) do not really > destroy objects (zero out the memory block). I don't see what this has to do with the compiler. Isn't that just a implementation detail of how GC works on a particular Python platform ? >> but there's definitely a need for being able to run Python in >> a debugger without having it perfectly valid skip code lines >> (even if they are no ops). > > This is a different issue from 'disable the peephole optimizer'. For me, a key argument for having a runtime mode without compiler optimizations is that the compiler gains more freedom in applying more aggressive optimizations. Tools will no longer have to adapt to whatever optimizations are added with each new Python release, since there will be a defined non-optimized runtime mode they can use as basis for their work. The net result would be faster Pythons and better working debugging tools (well, at least that's the hope ;-). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 23 2014) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From me+python at ixokai.io Fri May 23 10:02:29 2014 From: me+python at ixokai.io (Stephen Hansen) Date: Fri, 23 May 2014 01:02:29 -0700 Subject: [Python-ideas] Disabling optimizations In-Reply-To: <537EA7FB.8070806@nedbatchelder.com> References: <537E9913.7070501@stoneleaf.us> <537EA7FB.8070806@nedbatchelder.com> Message-ID: On Thu, May 22, 2014 at 6:44 PM, Ned Batchelder wrote: > I'm of the opinion that we don't need to segregate bytecode into different > files depending on the options used to create the bytecode. How often is > the same program run in the same place with different options at different > times? 
I'm happy to have optimized and non-optimized code both written to > .pyc files, and if you are fiddling with the options like that, you should > delete your pyc files when you change the options. If we come up with a > way to have the bytecode file-segregated, I'm OK with that too. > What madness is this? Any suggestion that "you should delete your pyc files" strikes me as remarkably wrongheaded. You shouldn't even have to think about pyc (or pyo) files -- they're a convenience, not something there is any expectation on anyone to *manage*. When I edit my .py file, I don't have to go delete the pyc; I don't need to be sure to do a 'make clean' like on some of my C projects. Python sees my source is modified, and discards the compiled bit -- expecting anything more from people using python is a serious thing. Things have gotten a bit more complex in modern Python with the __pycache__ directory, yet still there is no expectation that users *manage* these files. That's a bit shocking to me. I definitely don't like the alternative that says unoptimized code isn't > written to disk at all. If people want to solve the problem that way, > there is already a mechanism to avoid writing bytecode, you can use it with > the optimizer controls to achieve the effect you want. > I don't understand this point. It seems natural to me that if you have an option to run code with optimizations disabled, its not written to disk...: after all the entire assumption of the point is the code isn't doing everything it can to be as efficient as it can. At that point, what does speed matter? You've decided you want precise traceable semantics even when its known that certain branches aren't needed -- you want to trace the precise logic. Do you really then care about the cost it takes to compile the source to bytecode? I get that there are reasons to not want optimizations, but I don't get the desire to complicate the compilation and running step. Optimizations on/off makes some sense: in testing environments and the like. Its something else entirely to demand people manually delete files, or where the burden is upon those who run the app/test suites/etc to deal with files created as a side-effect of what they're doing. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri May 23 11:11:02 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 23 May 2014 19:11:02 +1000 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: <537E5D67.90101@nedbatchelder.com> References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com> Message-ID: On 23 May 2014 06:27, "Ned Batchelder" wrote: > > On 5/22/14 2:57 PM, Eric Snow wrote: >> >> On Thu, May 22, 2014 at 12:49 PM, Eric Snow wrote: >>> >>> Same here. More concretely: >> >> ... >> >> Having said that, revamping those options and our current optimization >> mechanism is a far cry from just adding -X nopeephole as Ned has >> implied. While the former may make sense on its own, those broader >> changes may languish as nice-to-haves. It may be better to go with >> the latter in the short-term while the broader changes swirl in the >> maelstrom of discussion indefinitely. > > > I get distracted (by work...) for the afternoon, and things take an unexpected turn! 
> > I definitely did not mean to throw open the floodgates to reconsider the entire -O switch. I agree that the -O switch seems like too much UI for too little change in results, and I think a different set of settings and defaults makes more sense. But I do not suppose that we have much appetite to take on that large a change. > > For my purposes, an environment variable and no change or addition to the switches would be fine. Given how far away 3.5 is, I'd actually be interested in seeing a full write-up of Eric's proposal, comparing it to the "let's just add some more technical debt to the pile" -X option based approach. I don't think *anyone* really likes the current state of the optimisation flags, so if this proposal tips us over the edge into finally fixing them properly, huzzah! Cheers, Nick. > > --Ned > >> -eric >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Fri May 23 11:30:26 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 23 May 2014 11:30:26 +0200 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com> Message-ID: 2014-05-23 11:11 GMT+02:00 Nick Coghlan : > Given how far away 3.5 is, I'd actually be interested in seeing a full > write-up of Eric's proposal, comparing it to the "let's just add some more > technical debt to the pile" -X option based approach. The discussion in now splitted in 4 places: 3 threads on this mailing list, 1 issue in the bug tracker. And there are some old discussions on python-dev. It's maybe time to use the power of the PEP process to summarize this in a clear document? (Write a PEP.) Victor From solipsis at pitrou.net Fri May 23 11:53:57 2014 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 23 May 2014 11:53:57 +0200 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com> <537DFA95.4040000@nedbatchelder.com> Message-ID: <20140523115357.435531a4@fsol> On Thu, 22 May 2014 22:53:28 -0400 Terry Reedy wrote: > > > I'd like to understand why we think the Python compiler is different in > > this regard than a C compiler. > > Python is a different language. But let us not get sidetracked on that. The number one difference is that people don't compile code explicitly when writing Python code (well, except packagers who call compileall(), and a few advanced uses). So "choosing compilation options" is really not part of the standard workflow for developing in Python. Regards Antoine. 
From solipsis at pitrou.net Fri May 23 11:57:09 2014 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 23 May 2014 11:57:09 +0200 Subject: [Python-ideas] Disabling optimizations References: <537E9913.7070501@stoneleaf.us> <537EA7FB.8070806@nedbatchelder.com> Message-ID: <20140523115709.58b6d016@fsol> On Fri, 23 May 2014 07:28:32 +0200 Stefan Behnel wrote: > > > > As I already proposed, we could get rid of .pyo files all together and only > > write unoptimised .pyc files, and then apply the optimisations at load time > > based on the current interpreter config. I think that would give us a good > > tradeoff between fast (precompiled) code loading and differing requirements > > on byte code optimisations. > > Stefan Krah already proposed -Os (optimise for space) for the cases where > you want to reduce the size of the byte code file, e.g. by removing doc > strings. That could become the next .pyo file. Although it's unclear to me > why you would do that, instead of just compressing them. People who are really short on disk space (embedded devs?) probably do both: first strip docstrings and friends, then compress. For the same reason, optimizing in-memory would be detrimental: optimizations can usually reduce the size of pyc files. (besides, optimizing at compile-time allows us to do more costly optimizations without caring *too much* about their overhead) Regards Antoine. From njs at pobox.com Fri May 23 12:30:27 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 23 May 2014 11:30:27 +0100 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <537E5B4E.8050905@nedbatchelder.com> References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us> <537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol> <537E1891.5050808@nedbatchelder.com> <537E5B4E.8050905@nedbatchelder.com> Message-ID: On Thu, May 22, 2014 at 9:17 PM, Ned Batchelder wrote: > On 5/22/14 1:16 PM, Nathaniel Smith wrote: >> >> On Thu, May 22, 2014 at 4:32 PM, Ned Batchelder >> wrote: >>> >>> On 5/22/14 9:49 AM, Skip Montanaro wrote: >>>> >>>> It seems to me that Ned has revealed a bug in the peephole optimizer. >>>> It zapped an entire source line's worth of bytecode, but failed to >>>> delete the relevant entry in the line number table of the resulting >>>> code object. If I had my druthers, that would be the change I'd >>>> prefer. >>> >>> I think it is the nature of optimization that it will destroy useful >>> information. I don't think it will always be possible to retain enough >>> back-mapping that the optimized code can be understood as if it had not >>> been >>> optimized. For example, the debug issue would still be present: if you >>> run >>> pdb and set a breakpoint on the "continue" line, it will never be hit. >>> Even >>> if the optimizer cleaned up after itself perfectly (in fact, especially >>> so), >>> that breakpoint will still not be hit. You simply cannot reason about >>> optimized code without having to mentally understand the transformations >>> that have been applied. >> >> In this particular case, the back-mapping problem is pretty minor. >> IIUC the optimization is that if we have (abusing BASIC notation) >> >> 10 GOTO 20 >> 20 GOTO 30 >> 30 ... 
>>
>> then in fact the operations at lines 10 and 20 are, from the point of
>> view of the rest of the program, indivisible -- every time you execute
>> 10 you also execute 20, there is no way to tell from outside whether
>> we paused in between executing 10 and 20, etc. Effectively we just have
>> a single uber-instruction that does both:
>>
>> (10, 20) GOTO 30
>> 30 ...
>>
>> So from the coverage point of view, just marking line 20 as covered
>> every time line 10 is executed is the Right Thing To Do. From the
>> debugging point of view, a breakpoint set at line 20 should just trip
>> whenever line 10 is executed -- it's not like there's any way to tell
>> whether we're "half way through" the jump sequence or not. It's a
>> pretty solid abstraction.
>
> You've used the word "just" three times, glossing over the fact that we have
> no facility for marking statements as an uber instruction, and you've made
> no proposal for how it might work.

What we have right now is co_lnotab. It encodes a many-to-one mapping
from bytecode locations to line number:

    # bytecode offset -> line no
    lnotab = {
        0: 10, 1: 10, 2: 10, 3: 11, 4: 12, ...
    }

AFAIK, the main operations it supports are (a) given a bytecode
location, return the relevant line (for backtraces etc.), (b) when
executing bytecode, detect transitions from an instruction associated
with one line to an instruction associated with another line (for
sys.settrace, used by coverage and pdb).

    def backtrace_lineno(offset):
        return lnotab[offset]

    def do_trace(offset1, offset2):
        if lnotab[offset1] != lnotab[offset2]:
            call_trace_fn(lnotab[offset2])

My proposal is to make this a many-to-many mapping:

    lnotab = {
        0: {10},
        1: {10},
        2: {10, 11},  # optimized jump
        3: {12},
        ...
    }

    def backtrace_lineno(offset):
        # if there are multiple linenos, then it's indistinguishable which one the
        # exception occurred on, so just pick one to display
        return min(lnotab[offset])

    def do_trace(offset1, offset2):
        for lineno in sorted(lnotab[offset2].difference(lnotab[offset1])):
            call_trace_fn(lineno)

Yes, there is some complexity in practice because currently co_lnotab
is a ridiculously optimized data structure for encoding the many-to-one
mapping, and so some work needs to be done to come up with a similarly
optimized way of encoding a many-to-many mapping. But this is all
fundamentally trivial. "Compactly encoding a dict of sets of ints" is
not the sort of challenge that we should find daunting and impossible.

> Even if we build (and test!) a way to do
> that, it only covers this particular kind of oddity with optimized code.

Well, this is the only oddity that is causing problems. And future
optimizations might well be covered by my proposed mechanism. Any
optimization that works by taking in a set of line-number-tagged
objects (ast nodes, bytecode instructions, whatever) and spits out a
set of new objects could potentially make use of this -- just set the
lineno annotation on the output objects to be the union of the lineno
annotations on the input objects. Will that actually be enough in
practice? Who knows, we'll have to wait until we get there. Trying to
handle hypothetical future optimizations now is just borrowing trouble.

And even if we do add a minimal-optimization mode, that shouldn't be
taken as a blank check to stop worrying about the debuggability of the
default-optimization mode, so we'll still need something like this
sooner or later.
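For anyone who wants to poke at the existing table, the stdlib already decodes it; a small runnable illustration (the function body is arbitrary):

    import dis

    def f(a):
        x = a + 1
        y = x * 2
        return y

    # findlinestarts() walks co_lnotab and yields (bytecode offset, line
    # number) pairs -- the many-to-one mapping described above.
    for offset, lineno in dis.findlinestarts(f.__code__):
        print(offset, lineno)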
gdb actually works extremely well on optimized C/C++ code -- sure, sometimes it's a bit confusing and you have to recompile with -O0 to wrap your head around what's happening, but gdb keeps working regardless and I almost never bother. And this is because the C/C++ crowd has spent a lot of time on coming up with solid systems for describing really really complicated relationships between compiler output and the original source code -- much worse than the ones we have to deal with. Just throwing up our hands and giving up seems like a rather cowardly solution. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From ned at nedbatchelder.com Fri May 23 12:39:54 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Fri, 23 May 2014 06:39:54 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <20140523115357.435531a4@fsol> References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com> <537DFA95.4040000@nedbatchelder.com> <20140523115357.435531a4@fsol> Message-ID: <537F257A.5000509@nedbatchelder.com> On 5/23/14 5:53 AM, Antoine Pitrou wrote: > On Thu, 22 May 2014 22:53:28 -0400 > Terry Reedy wrote: >>> I'd like to understand why we think the Python compiler is different in >>> this regard than a C compiler. >> Python is a different language. But let us not get sidetracked on that. > The number one difference is that people don't compile code explicitly > when writing Python code (well, except packagers who call compileall(), > and a few advanced uses). So "choosing compilation options" is really > not part of the standard workflow for developing in Python. That seems an odd distinction to make, given that we already do have ways to control how the compilation step happens, and we are having no trouble imagining other ways to control it. Whether you like those options or not, you have to admit that we do have ways to tell Python how we want compilation to happen. > > Regards > > Antoine. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From solipsis at pitrou.net Fri May 23 12:44:31 2014 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 23 May 2014 12:44:31 +0200 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com> <537DFA95.4040000@nedbatchelder.com> <20140523115357.435531a4@fsol> <537F257A.5000509@nedbatchelder.com> Message-ID: <20140523124431.7e79aba3@fsol> On Fri, 23 May 2014 06:39:54 -0400 Ned Batchelder wrote: > On 5/23/14 5:53 AM, Antoine Pitrou wrote: > > On Thu, 22 May 2014 22:53:28 -0400 > > Terry Reedy wrote: > >>> I'd like to understand why we think the Python compiler is different in > >>> this regard than a C compiler. > >> Python is a different language. But let us not get sidetracked on that. > > The number one difference is that people don't compile code explicitly > > when writing Python code (well, except packagers who call compileall(), > > and a few advanced uses). So "choosing compilation options" is really > > not part of the standard workflow for developing in Python. 
> > That seems an odd distinction to make, given that we already do have
> ways to control how the compilation step happens, and we are having no
> trouble imagining other ways to control it. Whether you like those
> options or not, you have to admit that we do have ways to tell Python
> how we want compilation to happen.

My point is that almost nobody ever cares about them. The standard
model for executing Python code is "python mycode.py" or "python -m
mymodule". Compilation is invisible for the average user.

Regards

Antoine.

From ned at nedbatchelder.com  Fri May 23 14:04:23 2014
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Fri, 23 May 2014 08:04:23 -0400
Subject: [Python-ideas] Disabling optimizations
In-Reply-To: 
References: <537E9913.7070501@stoneleaf.us> <537EA7FB.8070806@nedbatchelder.com>
Message-ID: <537F3947.6000401@nedbatchelder.com>

On 5/23/14 4:02 AM, Stephen Hansen wrote:
> On Thu, May 22, 2014 at 6:44 PM, Ned Batchelder
> wrote:
>
>     I'm of the opinion that we don't need to segregate bytecode into
>     different files depending on the options used to create the
>     bytecode.  How often is the same program run in the same place
>     with different options at different times?  I'm happy to have
>     optimized and non-optimized code both written to .pyc files, and
>     if you are fiddling with the options like that, you should delete
>     your pyc files when you change the options.  If we come up with a
>     way to have the bytecode file-segregated, I'm OK with that too.
>
>
> What madness is this?
>
> Any suggestion that "you should delete your pyc files" strikes me as
> remarkably wrongheaded. You shouldn't even have to think about pyc (or
> pyo) files -- they're a convenience, not something there is any
> expectation on anyone to *manage*. When I edit my .py file, I don't
> have to go delete the pyc; I don't need to be sure to do a 'make
> clean' like on some of my C projects. Python sees my source is
> modified, and discards the compiled bit -- expecting anything more
> from people using Python is a serious thing.
>
> Things have gotten a bit more complex in modern Python with the
> __pycache__ directory, yet still there is no expectation that users
> *manage* these files. That's a bit shocking to me.
>
>     I definitely don't like the alternative that says unoptimized code
>     isn't written to disk at all.  If people want to solve the problem
>     that way, there is already a mechanism to avoid writing bytecode,
>     you can use it with the optimizer controls to achieve the effect
>     you want.
>
>
> I don't understand this point. It seems natural to me that if you have
> an option to run code with optimizations disabled, it's not written to
> disk: after all, the entire point is that the code isn't doing
> everything it can to be as efficient as it can. At that point, what
> does speed matter? You've decided you want precise traceable semantics
> even when it's known that certain branches aren't needed -- you want to
> trace the precise logic. Do you really then care about the cost it
> takes to compile the source to bytecode?
>
> I get that there are reasons to not want optimizations, but I don't
> get the desire to complicate the compilation and running step.
> Optimizations on/off makes some sense: in testing environments and the
> like. It's something else entirely to demand people manually delete
> files, or where the burden is upon those who run the app/test
> suites/etc to deal with files created as a side-effect of what they're
> doing.
> I may not have been clear, sorry: I would love to find a way to make this
transparent to the user, and not have to have the user delete .pyc files. I
was merely trying to make my requirements precise. In my particular use
case, having to delete .pyc files is not a problem. If we can engineer it
so that is not necessary, all the better.

The .pyc file already has metadata that indicates the source timestamp and
the version of the Python interpreter. If those numbers don't mesh well
with the Python source and interpreter that finds the pyc file, then the
file is discarded transparently. We could put the compilation options into
the pyc file as well, and automatically discard the file if it had been made
with different options than the running interpreter.

--Ned.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From bcannon at gmail.com  Fri May 23 14:27:26 2014
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 23 May 2014 12:27:26 +0000
Subject: [Python-ideas] Disabling optimizations
References: <537E9913.7070501@stoneleaf.us> <537EA7FB.8070806@nedbatchelder.com> <20140523115709.58b6d016@fsol>
Message-ID: 

On Fri May 23 2014 at 6:03:59 AM, Antoine Pitrou wrote:

> On Fri, 23 May 2014 07:28:32 +0200
> Stefan Behnel wrote:
> > >
> > > As I already proposed, we could get rid of .pyo files all together and
> only
> > > write unoptimised .pyc files, and then apply the optimisations at load
> time
> > > based on the current interpreter config. I think that would give us a
> good
> > > tradeoff between fast (precompiled) code loading and differing
> requirements
> > > on byte code optimisations.
> >
> > Stefan Krah already proposed -Os (optimise for space) for the cases where
> > you want to reduce the size of the byte code file, e.g. by removing doc
> > strings. That could become the next .pyo file. Although it's unclear to
> me
> > why you would do that, instead of just compressing them.
>
> People who are really short on disk space (embedded devs?) probably do
> both: first strip docstrings and friends, then compress.
>

.pyo files also use less memory once loaded. -OO is definitely not going
away as at least an available option under some name.

-Brett
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ned at nedbatchelder.com  Fri May 23 13:59:48 2014
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Fri, 23 May 2014 07:59:48 -0400
Subject: [Python-ideas] Disabling optimizations
In-Reply-To: <20140523115709.58b6d016@fsol>
References: <537E9913.7070501@stoneleaf.us> <537EA7FB.8070806@nedbatchelder.com>
Message-ID: <537F3834.7050904@nedbatchelder.com>

On 5/23/14 5:57 AM, Antoine Pitrou wrote:
> On Fri, 23 May 2014 07:28:32 +0200
> Stefan Behnel wrote:
>>> As I already proposed, we could get rid of .pyo files all together and only
>>> write unoptimised .pyc files, and then apply the optimisations at load time
>>> based on the current interpreter config. I think that would give us a good
>>> tradeoff between fast (precompiled) code loading and differing requirements
>>> on byte code optimisations.
>> Stefan Krah already proposed -Os (optimise for space) for the cases where
>> you want to reduce the size of the byte code file, e.g. by removing doc
>> strings. That could become the next .pyo file. Although it's unclear to me
>> why you would do that, instead of just compressing them.
> People who are really short on disk space (embedded devs?)
probably do > both: first strip docstrings and friends, then compress. > > For the same reason, optimizing in-memory would be detrimental: > optimizations can usually reduce the size of pyc files. > > (besides, optimizing at compile-time allows us to do more costly > optimizations without caring *too much* about their overhead) Optimizing at compile time also lets you do optimizations that are not bytecode->bytecode transformations. Most of the recent discussion about new optimizations is focused on AST manipulations. Although I started this discussion with the word "peephole", those types of optimizations also affect the source->bytecode mapping, and should be controlled by the levers we're discussing. --Ned. > > Regards > > Antoine. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From ncoghlan at gmail.com Fri May 23 18:33:28 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 24 May 2014 02:33:28 +1000 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com> Message-ID: On 23 May 2014 19:30, Victor Stinner wrote: > 2014-05-23 11:11 GMT+02:00 Nick Coghlan : >> Given how far away 3.5 is, I'd actually be interested in seeing a full >> write-up of Eric's proposal, comparing it to the "let's just add some more >> technical debt to the pile" -X option based approach. > > The discussion in now splitted in 4 places: 3 threads on this mailing > list, 1 issue in the bug tracker. And there are some old discussions > on python-dev. > > It's maybe time to use the power of the PEP process to summarize this > in a clear document? (Write a PEP.) Yes, I think so. One key thing this discussion made me realise is that we haven't taken a serious look at the compilation behaviour since PEP 3147 was implemented. The introduction of the cache tag and the source<->cache conversion functions provides an opportunity to actually clean up the handling of the different optimisation levels, and potentially make docstring stripping an independent setting. It may be that the end result of that process is to declare "-X nopeephole" a good enough solution and proceed with implementing that. I just think it's worth exploring what would be involved in fixing things properly before making a decision. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Fri May 23 18:49:30 2014 From: guido at python.org (Guido van Rossum) Date: Fri, 23 May 2014 09:49:30 -0700 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com> Message-ID: I'm not happy with the direction this is taking. 
I would prefer an approach that *first* implements the minimal thing (an internal flag, set by an environment variable, to disable the peephole optimizer) and *then* perhaps revisits the greater UI for specifying optimization levels and the consequences this has for pyc/pyo files. I would also like to remind people the reason why there are separate pyc and pyo files: they are separate to support precompilation of the standard library and installed 3rd party packages for different optimization levels. While it may be okay for a developer that their pyc files all get invalidated when they change the optimization level, the stdlib and site-packages may require root access to write, so if your optimization level means you have to ignore the precompiled stdlib or site packages, that would be a major drag on your startup time (and memory usage will also spike at import time, since the AST is rather large). Looking at my own (frequent) use of coverage.py, I would be totally fine if disabling peephole optimization only affected my app's code, and kept using the precompiled stdlib. (How exactly this would work is left as an exercise for the reader.) On Fri, May 23, 2014 at 9:33 AM, Nick Coghlan wrote: > On 23 May 2014 19:30, Victor Stinner wrote: > > 2014-05-23 11:11 GMT+02:00 Nick Coghlan : > >> Given how far away 3.5 is, I'd actually be interested in seeing a full > >> write-up of Eric's proposal, comparing it to the "let's just add some > more > >> technical debt to the pile" -X option based approach. > > > > The discussion in now splitted in 4 places: 3 threads on this mailing > > list, 1 issue in the bug tracker. And there are some old discussions > > on python-dev. > > > > It's maybe time to use the power of the PEP process to summarize this > > in a clear document? (Write a PEP.) > > Yes, I think so. One key thing this discussion made me realise is that > we haven't taken a serious look at the compilation behaviour since PEP > 3147 was implemented. The introduction of the cache tag and the > source<->cache conversion functions provides an opportunity to > actually clean up the handling of the different optimisation levels, > and potentially make docstring stripping an independent setting. > > It may be that the end result of that process is to declare "-X > nopeephole" a good enough solution and proceed with implementing that. > I just think it's worth exploring what would be involved in fixing > things properly before making a decision. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From donald at stufft.io  Fri May 23 19:08:30 2014
From: donald at stufft.io (Donald Stufft)
Date: Fri, 23 May 2014 13:08:30 -0400
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: 
References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com>
Message-ID: <7477C87E-FB8F-42A8-B825-9F5F04B4A884@stufft.io>

On May 23, 2014, at 12:49 PM, Guido van Rossum wrote:

> I'm not happy with the direction this is taking. I would prefer an approach that *first* implements the minimal thing (an internal flag, set by an environment variable, to disable the peephole optimizer) and *then* perhaps revisits the greater UI for specifying optimization levels and the consequences this has for pyc/pyo files.

I agree with this I think.

>
> I would also like to remind people the reason why there are separate pyc and pyo files: they are separate to support precompilation of the standard library and installed 3rd party packages for different optimization levels.

Sadly enough it doesn't go far enough since you can't have (as far as I know) a .pyo for both -O and -OO. Perhaps the PEP isn't the worst idea in order to make all of that work with the __pycache__ directories and the pyc tagging.

> While it may be okay for a developer that their pyc files all get invalidated when they change the optimization level, the stdlib and site-packages may require root access to write, so if your optimization level means you have to ignore the precompiled stdlib or site packages, that would be a major drag on your startup time (and memory usage will also spike at import time, since the AST is rather large).
>
> Looking at my own (frequent) use of coverage.py, I would be totally fine if disabling peephole optimization only affected my app's code, and kept using the precompiled stdlib. (How exactly this would work is left as an exercise for the reader.)
>
>
> On Fri, May 23, 2014 at 9:33 AM, Nick Coghlan wrote:
> On 23 May 2014 19:30, Victor Stinner wrote:
> > 2014-05-23 11:11 GMT+02:00 Nick Coghlan :
> >> Given how far away 3.5 is, I'd actually be interested in seeing a full
> >> write-up of Eric's proposal, comparing it to the "let's just add some more
> >> technical debt to the pile" -X option based approach.
> >
> > The discussion in now splitted in 4 places: 3 threads on this mailing
> > list, 1 issue in the bug tracker. And there are some old discussions
> > on python-dev.
> >
> > It's maybe time to use the power of the PEP process to summarize this
> > in a clear document? (Write a PEP.)
>
> Yes, I think so. One key thing this discussion made me realise is that
> we haven't taken a serious look at the compilation behaviour since PEP
> 3147 was implemented. The introduction of the cache tag and the
> source<->cache conversion functions provides an opportunity to
> actually clean up the handling of the different optimisation levels,
> and potentially make docstring stripping an independent setting.
>
> It may be that the end result of that process is to declare "-X
> nopeephole" a good enough solution and proceed with implementing that.
> I just think it's worth exploring what would be involved in fixing
> things properly before making a decision.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
>
>
> --
> --Guido van Rossum (python.org/~guido)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: Message signed with OpenPGP using GPGMail
URL: 

From ericsnowcurrently at gmail.com  Fri May 23 19:11:59 2014
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Fri, 23 May 2014 11:11:59 -0600
Subject: [Python-ideas] Disabling optimizations
In-Reply-To: <537F3947.6000401@nedbatchelder.com>
References: <537E9913.7070501@stoneleaf.us> <537EA7FB.8070806@nedbatchelder.com> <537F3947.6000401@nedbatchelder.com>
Message-ID: 

On Fri, May 23, 2014 at 6:04 AM, Ned Batchelder wrote:
> The .pyc file already has metadata that indicates the source timestamp and
> the version of the Python interpreter.  If those numbers don't mesh well
> with the Python source and interpreter that finds the pyc file, then the
> file is discarded transparently.  We could put the compilation options into
> the pyc file as well, and automatically discard the file if it had been made
> with different options than the running interpreter.

Adjusting the cache tag (sys.implementation.cache_tag) to reflect the
optimization level would be pretty straightforward and relatively
easy.  I'd like that better than putting that information into the
.pyc header.

-eric

From guido at python.org  Fri May 23 19:12:37 2014
From: guido at python.org (Guido van Rossum)
Date: Fri, 23 May 2014 10:12:37 -0700
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: <7477C87E-FB8F-42A8-B825-9F5F04B4A884@stufft.io>
References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com> <7477C87E-FB8F-42A8-B825-9F5F04B4A884@stufft.io>
Message-ID: 

On Fri, May 23, 2014 at 10:08 AM, Donald Stufft wrote:

>
> On May 23, 2014, at 12:49 PM, Guido van Rossum wrote:
>
> I'm not happy with the direction this is taking. I would prefer an
> approach that *first* implements the minimal thing (an internal flag, set
> by an environment variable, to disable the peephole optimizer) and *then*
> perhaps revisits the greater UI for specifying optimization levels and the
> consequences this has for pyc/pyo files.
>
>
> I agree with this I think.
>
>
> I would also like to remind people the reason why there are separate pyc
> and pyo files: they are separate to support precompilation of the standard
> library and installed 3rd party packages for different optimization levels.
>
>
> Sadly enough it doesn't go far enough since you can't have (as far as I
> know) a .pyo for both -O and -OO.
Perhaps the PEP isn't the worst idea in
> order to make all of that work with the __pycache__ directories and the pyc
> tagging.
>

Agreed (though I think that -OO is a very niche feature) and I think
deciding on what to do about this (if anything) should not hold the
peephole disabling feature hostage. (The latter of course has to decide
what to do about pyc files, but there should be a suitable answer that
doesn't require solving the general problem nor prevent the general
problem from being solved.)

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ericsnowcurrently at gmail.com  Fri May 23 19:17:03 2014
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Fri, 23 May 2014 11:17:03 -0600
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: 
References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com>
Message-ID: 

On Fri, May 23, 2014 at 10:49 AM, Guido van Rossum wrote:
> I'm not happy with the direction this is taking. I would prefer an approach
> that *first* implements the minimal thing (an internal flag, set by an
> environment variable, to disable the peephole optimizer) and *then* perhaps
> revisits the greater UI for specifying optimization levels and the
> consequences this has for pyc/pyo files.

Yeah, that's exactly what I was trying to convey in the followup to my
longer message about revamping the optimization levels.

> Looking at my own (frequent) use of coverage.py, I would be totally fine if
> disabling peephole optimization only affected my app's code, and kept using
> the precompiled stdlib. (How exactly this would work is left as an exercise
> for the reader.)

Would it be a problem if .pyc files weren't generated or used (a la -B
or PYTHONDONTWRITEBYTECODE) when you ran coverage?

-eric

From ncoghlan at gmail.com  Fri May 23 19:22:25 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 24 May 2014 03:22:25 +1000
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: 
References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com>
Message-ID: 

On 24 May 2014 02:49, "Guido van Rossum" wrote:
>
> I'm not happy with the direction this is taking. I would prefer an approach that *first* implements the minimal thing (an internal flag, set by an environment variable, to disable the peephole optimizer) and *then* perhaps revisits the greater UI for specifying optimization levels and the consequences this has for pyc/pyo files.

Sure, that sounds like a reasonable approach, too. My perspective is mainly
coloured by the fact that we're still in the "eh, feature freeze is still
more than a year away" low urgency period for 3.5 :)

Cheers,
Nick.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From guido at python.org Fri May 23 19:22:43 2014 From: guido at python.org (Guido van Rossum) Date: Fri, 23 May 2014 10:22:43 -0700 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com> Message-ID: On Fri, May 23, 2014 at 10:17 AM, Eric Snow wrote: > On Fri, May 23, 2014 at 10:49 AM, Guido van Rossum > wrote: > > Looking at my own (frequent) use of coverage.py, I would be totally fine > if > > disabling peephole optimization only affected my app's code, and kept > using > > the precompiled stdlib. (How exactly this would work is left as an > exercise > > for the reader.) > > Would it be a problem if .pyc files weren't generated or used (a la -B > or PYTHONDONTWRITEBYTECODE) when you ran coverage? > In first approximation that would probably be okay, although it would make coverage even slower. I was envisioning something where it would still use, but not write, pyc files for the stdlib or site-packages, because the code in whose coverage I am interested is puny compared to the stdlib code it imports. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri May 23 19:23:41 2014 From: guido at python.org (Guido van Rossum) Date: Fri, 23 May 2014 10:23:41 -0700 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com> Message-ID: On Fri, May 23, 2014 at 10:22 AM, Nick Coghlan wrote: > > On 24 May 2014 02:49, "Guido van Rossum" wrote: > > > > I'm not happy with the direction this is taking. I would prefer an > approach that *first* implements the minimal thing (an internal flag, set > by an environment variable, to disable the peephole optimizer) and *then* > perhaps revisits the greater UI for specifying optimization levels and the > consequences this has for pyc/pyo files. > > Sure, that sounds like a reasonable approach, too. My perspective is > mainly coloured by the fact that we're still in the "eh, feature freeze is > still more than a year away" low urgency period for 3.5 :) > Yeah, and I'm countering that not every project needs to land a week before the feature freeze. :-) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From ncoghlan at gmail.com  Fri May 23 19:36:32 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 24 May 2014 03:36:32 +1000
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: 
References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com>
Message-ID: 

On 24 May 2014 03:24, "Guido van Rossum" wrote:
>
> On Fri, May 23, 2014 at 10:22 AM, Nick Coghlan wrote:
>>
>>
>> On 24 May 2014 02:49, "Guido van Rossum" wrote:
>> >
>> > I'm not happy with the direction this is taking. I would prefer an approach that *first* implements the minimal thing (an internal flag, set by an environment variable, to disable the peephole optimizer) and *then* perhaps revisits the greater UI for specifying optimization levels and the consequences this has for pyc/pyo files.
>>
>> Sure, that sounds like a reasonable approach, too. My perspective is mainly coloured by the fact that we're still in the "eh, feature freeze is still more than a year away" low urgency period for 3.5 :)
>
> Yeah, and I'm countering that not every project needs to land a week before the feature freeze. :-)

But that approach makes Larry's life far more exciting! :)

Happily-on-the-other-side-of-the-Pacific-from-Larry-while-saying-that'ly
yours,
Nick.

>
> --
> --Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From dw+python-ideas at hmmz.org  Fri May 23 21:22:20 2014
From: dw+python-ideas at hmmz.org (dw+python-ideas at hmmz.org)
Date: Fri, 23 May 2014 19:22:20 +0000
Subject: [Python-ideas] Faster PyArg_ParseTupleAndKeywords kwargs
Message-ID: <20140523192220.GA20596@k2>

Early on while working on py-lmdb I noticed that a huge proportion of
runtime was being lost to PyArg_ParseTupleAndKeywords, and so I
subsequently wrote a specialization for this extension module.

In the current code[0], parse_args() is much faster than
ParseTupleAndKeywords, responsible for a doubling of performance in
several of the library's faster code paths (e.g.
Cursor.put(append=True)). Ever since adding the rewrite I've wanted to
go back and either remove it or at least reduce the amount of custom
code, but it seems there really isn't a better approach to fast argument
parsing using the bare Python C API at the moment.

[0] https://github.com/dw/py-lmdb/blob/master/lmdb/cpython.c#L833

In the append=True path, parse_args() yields a method that can complete
1.1m insertions/sec on my crappy Core 2 laptop, compared to 592k/sec
using the same method rewritten with PyArg_ParseTupleAndKeywords.

Looking to other 'fast' projects for precedent, and studying Cython's
output in particular, it seems that Cython completely ignores the
standard APIs and expends a huge amount of .text on using almost every
imaginable C performance trick to speed up parsing (actually Cython's
output is a sheer marvel of trickery, it's worth study). So it's clear
the standard APIs are somewhat non-ideal, and those concerned with
performance are taking other approaches.

ParseTupleAndKeywords is competitive for positional arguments (1.2m/sec
vs 1.5m/sec for "Cursor.put(k, v)"), but things go south when a kwarg
dict is provided.
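The kwarg-path overhead is visible even from pure Python, using any C-implemented method that accepts keywords; absolute numbers and ratios will vary by machine and build:

    import timeit

    # Same call, positional vs. keyword form; the keyword form has to go
    # through the kwargs-parsing path in the C implementation.
    print(timeit.timeit("'x'.encode('ascii')"))
    print(timeit.timeit("'x'.encode(encoding='ascii')"))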
The primary goal of parse_args() was to avoid the continuous temporary
allocations and hashing done by PyArg_ParseTupleAndKeywords, by way of
PyDict_GetItemString(), which invokes PyString_FromString() internally,
which subsequently causes alloc / strlen() and memcpy(), one for each
possible kwarg, on every function call.

The rewrite has been hacked over time, and honestly I'm not sure which
bits are responsible for the speed improvement, and which are totally
redundant. The tricks are:

* Intern keyword arg strings once at startup, avoiding the temporary
  PyString creation and also causing their hash() to be cached across
  calls. This uses an incredibly ugly pair of enum/const char *[]
  static globals.[3]

  [3] https://github.com/dw/py-lmdb/blob/master/lmdb/cpython.c#L79

* Use a per-function 'static const' array of structures to describe the
  expected set of arguments. Since these arrays are built at compile
  time, they cannot directly reference the runtime-generated interned
  PyStrings, thus the use of an enum. A nice side effect of the array's
  contents being purely small integers is that each array element is
  small and thus quite cache-efficient. In the current code array
  elements are 4 bytes each.

* Avoid use of variable-length argument lists. I'm not sure if this
  helps at all, but certainly it simplifies the parsing code and makes
  the call sites much more compact. Instead of a va_arg list of
  destination pointers, parsed output is represented as a per-function
  structure[1][2] definition, whose offsets are encoded into the above
  argspec array at build time.

  [1] https://github.com/dw/py-lmdb/blob/master/lmdb/cpython.c#L1265
  [2] https://github.com/dw/py-lmdb/blob/master/lmdb/cpython.c#L704

  This might hurt the compiler's ability to optimize the placement of
  what were previously small stack variables (e.g. I'm not sure if it
  prevents the compiler making more use of registers). In any case the
  overall result is much faster than before.

And most recently, giving a further 20% boost to append=True:

* Cache a dict that maps interned kwarg -> argspec array offset,
  allowing the per-call kwarg dict to be iterated, and causing only one
  hash lookup per supplied kwarg. Prior to the cache, presence of kwargs
  would cause one hash lookup per argspec entry (e.g. potentially 15
  lookups instead of 1 or 2).

It's obvious this approach isn't generally useful, and looking at the
CPython source we can see the interning trick is already known, and
presumably not exposed in the CPython API because the method is quite
ugly. Still it seems there is room to improve the public API to include
something like this interning trick, and that's what this mail is about.

My initial thought is for a horribly macro-heavy API like:

    PyObject *
    my_func(PyObject *self, PyObject *args, PyObject *kwargs)
    {
        Py_ssize_t foo;
        const char *some_buf;
        PyObject *list;

        Py_BEGIN_ARGS
            PY_ARG("foo", PY_ARG_SSIZE_T, NULL, PY_ARG_REQUIRED),
            PY_ARG("some_buf", PY_ARG_BUFFER, NULL, PY_ARG_REQUIRED),
            PY_ARG("list", PY_ARG_OBJECT, &PyList_Type, 0)
        Py_END_ARGS

        if(Py_PARSE_ARGS(args, kwargs, &foo, &some_buf, &list)) {
            return NULL;
        }
        /* do stuff */
    }

Where:

    struct py_arg_info;   /* Opaque */
    struct py_arg_spec {
        const char *name;
        enum { ... } type;
        PyTypeObject *type_obj;
        int options;
    };

    #define PY_BEGIN_ARGS \
        static struct py_arg_info *_py_arg_info; \
        if(! _py_arg_info) { \
            static const struct py_arg_spec _py_args[] = {

    #define PY_END_ARGS \
            }; \
            _Py_InitArgInfo(&_py_arg_info, _py_args, \
                            sizeof _py_args / sizeof _py_args[0]); \
        }

    #define PY_ARG(name, type, type2, opts) {name, type, type2, opts}

    #define Py_PARSE_ARGS(a, k, ...) \
        _Py_ParseArgsFromInfo(&_py_arg_info, a, k, __VA_ARGS__);

Here some implementation-internal py_arg_info structure is built up on
first function invocation, producing the cached mapping of argument
keywords to array index, and storing a reference to the py_arg_spec
array, or some version of it that has been internally transformed to a
more useful format.

You may notice this depends on va_arg macros, which breaks at least
Visual Studio, so at the very least that part is broken. The above also
doesn't deal with all the cases supported by the existing PyArg_
routines, such as setting the function name and custom error message, or
unpacking tuples (is this still even supported in Python 3?)

Another approach might be to use a PyArg_ParseTupleAndKeywords-alike
API, so that something like this was possible:

    static PyObject *
    my_method(PyObject *self, PyObject *args, PyObject *kwds)
    {
        Py_ssize_t foo;
        const char *some_buf;
        Py_ssize_t some_buf_size;
        PyObject *list;

        static PyArgInfo arg_info;
        static char *keywords[] = { "foo", "some_buf", "list", NULL };

        if(! PyArg_FastParse(&arg_info, args, kwds, "ns#|O!", keywords,
                             &foo, &some_buf, &some_buf_size,
                             &PyList_Type, &list)) {
            return NULL;
        }
        /* do stuff */
    }

In that case that API is very familiar, and PyArg_FastParse() builds the
cache on first invocation itself, but the supplied va_list is full of
noise that needs to be carefully skipped somehow. The work involved in
doing the skipping might introduce complexity that slows things down all
over again.

Any thoughts on a better API? Is there a need here? I'm obviously not
the first to notice PyArg_ParseTupleAndKeywords is slow, and so I wonder
how many people have sighed and brushed off the fact their module is
slower than it could be.

David

From tjreedy at udel.edu  Fri May 23 21:55:49 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 23 May 2014 15:55:49 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537F05F9.7070406@egenix.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us> <537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol> <537E1891.5050808@nedbatchelder.com> <537E1A88.8080501@egenix.com> <537F05F9.7070406@egenix.com>
Message-ID: 

On 5/23/2014 4:25 AM, M.-A. Lemburg wrote:
>> I believe that Python has always had an 'as if' rule that allows more or less 'hidden'
>> optimizations, as long as the net effect of a statement is as defined.
>
> I was referring to the times before the peephole optimizer was
> introduced (Python 2.3 and earlier).
>
> What's important here is to look at the difference between what
> the compiler generates by simply following its rule book and the
> version of the byte code which is the result of running an
> optimizer on the byte code or even on the AST before running the
> transform to byte code.

I have tried to say that the 'rule book' at a particular stage is not a
fixed thing. There are several transformations from source to CPython
bytecode. The order and grouping is somewhat a matter of convenience.
However, leave that aside. What Ned wants and what Guido has supported
is that there be an option to get bytecode that is friendly to
execution analysis.
They can decide what constraints that places on the end product and
therefore on the multiple transformation processes.

> For me, a key argument for having a runtime mode without
> compiler optimizations is that the compiler gains
> more freedom in applying more aggressive optimizations.
>
> Tools will no longer have to adapt to whatever optimizations
> are added with each new Python release, since there will be
> a defined non-optimized runtime mode they can use as a basis for
> their work.

Stability is certainly a useful constraint.

> The net result would be faster Pythons and better working debugging
> tools (well, at least that's the hope ;-).

Good point. It appears that rethinking the current -O, -OO will help.

--
Terry Jan Reedy

From tjreedy at udel.edu  Fri May 23 22:05:11 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 23 May 2014 16:05:11 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537F257A.5000509@nedbatchelder.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com> <537DFA95.4040000@nedbatchelder.com> <20140523115357.435531a4@fsol> <537F257A.5000509@nedbatchelder.com>
Message-ID: 

On 5/23/2014 6:39 AM, Ned Batchelder wrote:
> On 5/23/14 5:53 AM, Antoine Pitrou wrote:
>> On Thu, 22 May 2014 22:53:28 -0400
>> Terry Reedy wrote:
>>>> I'd like to understand why we think the Python compiler is different in
>>>> this regard than a C compiler.
>>> Python is a different language. But let us not get sidetracked on that.
>> The number one difference is that people don't compile code explicitly
>> when writing Python code (well, except packagers who call compileall(),
>> and a few advanced uses). So "choosing compilation options" is really
>> not part of the standard workflow for developing in Python.
>
> That seems an odd distinction to make, given that we already do have
> ways to control how the compilation step happens,

They are not used much, and I doubt that anyone is joyous at the status
quo. Which is why your proposal looks more inviting (to me, and I think
to some others) as part of a reworking of the clumsy status quo than as
a clumsy add-on.

--
Terry Jan Reedy

From dw+python-ideas at hmmz.org  Fri May 23 22:07:17 2014
From: dw+python-ideas at hmmz.org (dw+python-ideas at hmmz.org)
Date: Fri, 23 May 2014 20:07:17 +0000
Subject: [Python-ideas] Faster PyArg_ParseTupleAndKeywords kwargs
In-Reply-To: <20140523192220.GA20596@k2>
References: <20140523192220.GA20596@k2>
Message-ID: <20140523200717.GA22671@k2>

On Fri, May 23, 2014 at 07:22:20PM +0000, dw+python-ideas at hmmz.org wrote:

>         if(! PyArg_FastParse(&arg_info, args, kwds, "ns#|O!", keywords,
>                              &foo, &some_buf, &some_buf_size,
>                              &PyList_Type, &list)) {
>             return NULL;
>         }

Perhaps the most off-the-wall approach would be to completely preserve
the existing interface, by using a dollop of assembly to fetch the
return address, and use that to maintain some internal hash table.

That's incredibly nasty, but as a systemic speedup it might be worth it?
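In Python terms, both caching ideas amount to keying a spec cache on the identity of the call site's static arguments instead of re-parsing them every call; a rough sketch only, where build_spec() stands in for the real parsing work:

    _spec_cache = {}

    def build_spec(fmt, keywords):
        # Stand-in for the expensive parse of a format string + keyword list.
        return dict(zip(keywords, fmt))

    def cached_spec(fmt, keywords):
        # id() mirrors using the C pointers as a hash key; this is only safe
        # because a call site's format string and keyword table are static,
        # so their identities never change.
        key = (id(fmt), id(keywords))
        spec = _spec_cache.get(key)
        if spec is None:
            spec = _spec_cache[key] = build_spec(fmt, keywords)
        return spec

    FMT = "ns#|O!"
    KEYWORDS = ("foo", "some_buf", "list")
    assert cached_spec(FMT, KEYWORDS) is cached_spec(FMT, KEYWORDS)  # parsed once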
David From njs at pobox.com Fri May 23 22:08:28 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 23 May 2014 21:08:28 +0100 Subject: [Python-ideas] Faster PyArg_ParseTupleAndKeywords kwargs In-Reply-To: <20140523192220.GA20596@k2> References: <20140523192220.GA20596@k2> Message-ID: On Fri, May 23, 2014 at 8:22 PM, wrote: > Early while working on py-lmdb I noticed that a huge proportion of > runtime was being lost to PyArg_ParseTupleAndKeywords, and so I > subsequently wrote a specialization for this extension module. > > In the current code[0], parse_args() is much faster than > ParseTupleAndKeywords, responsible for a doubling of performance in > several of the library's faster code paths (e.g. > Cursor.put(append=True)). Ever since adding the rewrite I've wanted to > go back and either remove it or at least reduce the amount of custom > code, but it seems there really isn't a better approach to fast argument > parsing using the bare Python C API at the moment. > > [0] https://github.com/dw/py-lmdb/blob/master/lmdb/cpython.c#L833 > > In the append=True path, parse_args() yields a method that can complete > 1.1m insertions/sec on my crappy Core 2 laptop, compared to 592k/sec > using the same method rewritten with PyArg_ParseTupleAndKeywords. > > Looking to other 'fast' projects for precedent, and studying Cython's > output in particular, it seems that Cython completely ignores the > standard APIs and expends a huge amount of .text on using almost every > imagineable C performance trick to speed up parsing (actually Cython's > output is a sheer marvel of trickery, it's worth study). So it's clear > the standard APIs are somewhat non-ideal, and those concerned with > performance are taking other approaches. As another data point about PyArg_ParseTupleAndKeywords slowness, Numpy has tons of barely-maintainable hand-written argument parsing code. I haven't read the proposal below in detail, but anything that helps us clean that up is ok with me... You should check out Argument Clinic (PEP 436) if you haven't seen it. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From dw+python-ideas at hmmz.org Fri May 23 22:22:48 2014 From: dw+python-ideas at hmmz.org (dw+python-ideas at hmmz.org) Date: Fri, 23 May 2014 20:22:48 +0000 Subject: [Python-ideas] Faster PyArg_ParseTupleAndKeywords kwargs In-Reply-To: References: <20140523192220.GA20596@k2> Message-ID: <20140523202248.GB22671@k2> On Fri, May 23, 2014 at 09:08:28PM +0100, Nathaniel Smith wrote: > You should check out Argument Clinic (PEP 436) if you haven't seen it. Thanks! I'd seen this but forgotten about it. The use of a preprocessor seems excessive, and a potential PITA when combined with other preprocessors - e.g. Qt's moc, but the language is a very cool idea. If the DSL definition was expressed as a string constant, that pointer could key an internal hash table. Still not as fast as specialized code, but perhaps an interesting middleground. David From njs at pobox.com Fri May 23 22:38:40 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 23 May 2014 21:38:40 +0100 Subject: [Python-ideas] Faster PyArg_ParseTupleAndKeywords kwargs In-Reply-To: <20140523202248.GB22671@k2> References: <20140523192220.GA20596@k2> <20140523202248.GB22671@k2> Message-ID: On Fri, May 23, 2014 at 9:22 PM, wrote: > On Fri, May 23, 2014 at 09:08:28PM +0100, Nathaniel Smith wrote: > >> You should check out Argument Clinic (PEP 436) if you haven't seen it. > > Thanks! 
I'd seen this but forgotten about it. The use of a preprocessor
seems excessive, and a potential PITA when combined with other
preprocessors - e.g. Qt's moc, but the language is a very cool idea.

If the DSL definition was expressed as a string constant, that pointer
could key an internal hash table. Still not as fast as specialized code,
but perhaps an interesting middle ground.

David

From njs at pobox.com  Fri May 23 22:38:40 2014
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 23 May 2014 21:38:40 +0100
Subject: [Python-ideas] Faster PyArg_ParseTupleAndKeywords kwargs
In-Reply-To: <20140523202248.GB22671@k2>
References: <20140523192220.GA20596@k2> <20140523202248.GB22671@k2>
Message-ID: 

On Fri, May 23, 2014 at 9:22 PM, wrote:
> On Fri, May 23, 2014 at 09:08:28PM +0100, Nathaniel Smith wrote:
>
>> You should check out Argument Clinic (PEP 436) if you haven't seen it.
>
> Thanks! I'd seen this but forgotten about it. The use of a preprocessor
> seems excessive, and a potential PITA when combined with other
> preprocessors - e.g. Qt's moc, but the language is a very cool idea.

Yes, but OTOH it's working and shipping code with a substantial user
base (lots of the CPython implementation), so making it fast and
usable in third-party libraries might still be the most efficient
approach. And IIRC it's not (necessarily) a build-time thing, the
usual mode is for it to update your checked-in source directly, so
integration with other preprocessors might be a non-issue.

A preprocessor approach might make it easier to support older Pythons
in the generated code, compared to a library approach. (It's easier to
say "developers/the person making the source release must have Python
3 installed, but the generated code works everywhere" than to say
"this library only works on Python 3.5+ because that's the first
version that ships the new argument parsing API".)

-n

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

From dw+python-ideas at hmmz.org  Fri May 23 23:41:35 2014
From: dw+python-ideas at hmmz.org (dw+python-ideas at hmmz.org)
Date: Fri, 23 May 2014 21:41:35 +0000
Subject: [Python-ideas] Faster PyArg_ParseTupleAndKeywords kwargs
In-Reply-To: <20140523200717.GA22671@k2>
References: <20140523192220.GA20596@k2> <20140523200717.GA22671@k2>
Message-ID: <20140523214135.GA24056@k2>

On Fri, May 23, 2014 at 08:07:17PM +0000, dw+python-ideas at hmmz.org wrote:

> Perhaps the most off-the-wall approach would be to completely preserve
> the existing interface, by using a dollop of assembly to fetch the
> return address, and use that to maintain some internal hash table.
>
> That's incredibly nasty, but as a systemic speedup it might be worth it?

Final (obvious in hindsight) suggestion: mix 'fmt' and 'keywords'
argument pointers together for use as a hash key into a table within
getargs.c, no nasty asm or interface changes necessary.

David

From greg.ewing at canterbury.ac.nz  Sat May 24 01:15:10 2014
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 24 May 2014 11:15:10 +1200
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: 
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us> <537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol> <537E1891.5050808@nedbatchelder.com> <537E5B4E.8050905@nedbatchelder.com>
Message-ID: <537FD67E.2000000@canterbury.ac.nz>

Nathaniel Smith wrote:
> "Compactly encoding a dict of sets of
> ints" is not the sort of challenge that we should find daunting and
> impossible.

I'd question whether it's even worth going to heroic lengths to
compress the lnotab these days, especially if it could be lazily
loaded from the pyc when needed.

--
Greg

From ncoghlan at gmail.com  Sat May 24 03:36:29 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 24 May 2014 11:36:29 +1000
Subject: [Python-ideas] Faster PyArg_ParseTupleAndKeywords kwargs
In-Reply-To: 
References: <20140523192220.GA20596@k2> <20140523202248.GB22671@k2>
Message-ID: 

On 24 May 2014 06:39, "Nathaniel Smith" wrote:
>
> On Fri, May 23, 2014 at 9:22 PM, wrote:
> > On Fri, May 23, 2014 at 09:08:28PM +0100, Nathaniel Smith wrote:
> >
> >> You should check out Argument Clinic (PEP 436) if you haven't seen it.
> >
> > Thanks!
The use of a preprocessor > > seems excessive, and a potential PITA when combined with other > > preprocessors - e.g. Qt's moc, but the language is a very cool idea. > > Yes, but OTOH it's working and shipping code with a substantial user > base (lots of the CPython implementation), so making it fast and > usable in third-party libraries might still be the most efficient > approach. And IIRC it's not (necessarily) a build-time thing, the > usual mode is for it to update your checked-in source directly, so > integration with other preprocessors might be a non-issue. Note there are two key goals behind Argument Clinic: 1. Add introspection metadata to functions implemented in C without further reducing maintainability (adding an arg to a C function already touched 6 places, signature metadata would have been a 7th) 2. Eventually switch the generated code to something faster than PyArg_ParseTupleAndKeywords. What phase 2 actually looks like hasn't been defined yet (enabling phase 1 ended up being a big enough challenge for 3.4), but the ideas in this thread would definitely be worth exploring further in that context. As Nathaniel noted, once checked in, Argument Clinic code is just ordinary C code with some funny comments, so it introduces no additional build time dependencies. Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dw+python-ideas at hmmz.org Sat May 24 03:52:11 2014 From: dw+python-ideas at hmmz.org (dw+python-ideas at hmmz.org) Date: Sat, 24 May 2014 01:52:11 +0000 Subject: [Python-ideas] Faster PyArg_ParseTupleAndKeywords kwargs In-Reply-To: References: <20140523192220.GA20596@k2> <20140523202248.GB22671@k2> Message-ID: <20140524015211.GA28050@k2> On Sat, May 24, 2014 at 11:36:29AM +1000, Nick Coghlan wrote: > 1. Add introspection metadata to functions implemented in C without further > reducing maintainability (adding an arg to a C function already touched 6 > places, signature metadata would have been a 7th) > As Nathaniel noted, once checked in, Argument Clinic code is just ordinary C > code with some funny comments, so it introduces no additional build time > dependencies. Hadn't realized it was already in use! It isn't nearly as intrusive as I might have expected, it seems 'preprocessor' is just a scary word. :) > 2. Eventually switch the generated code to something faster than > PyArg_ParseTupleAndKeywords. > What phase 2 actually looks like hasn't been defined yet (enabling phase 1 > ended up being a big enough challenge for 3.4), but the ideas in this thread > would definitely be worth exploring further in that context. The previous mail's hint led to thinking about how to actually implement a no-API-changes internal cache for PyArg_ParseTupleAndKeywords. While not a perfect solution, that approach has the tremendous benefit of backwards compatibility with every existing extension. It seems after 20 years' evolution, getargs.c is quite resistant to change (read: it afflicts headaches and angst on the unweary), so instead I've spent my Friday evening exploring a rewrite. 
David

From z at etiol.net  Sat May 24 05:57:18 2014
From: z at etiol.net (Zero Piraeus)
Date: Fri, 23 May 2014 23:57:18 -0400
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: 
References: <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando>
Message-ID: <20140524035718.GA13305@piedra>

:

On Thu, May 22, 2014 at 12:49:32PM -0600, Eric Snow wrote:
>
> -O0 -- no optimizations at all
> [...]
> -OO -- same as -Os (deprecate)

Making no optimization so easily visually confused with maximum
optimization isn't terribly good UI ...

 -[]z.

--
Zero Piraeus: inter caetera
http://etiol.net/pubkey.asc

From ned at nedbatchelder.com  Tue May 27 02:27:20 2014
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Mon, 26 May 2014 20:27:20 -0400
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: 
References: <537C888D.7060903@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com>
Message-ID: <5383DBE8.6020309@nedbatchelder.com>

On 5/23/14 1:22 PM, Guido van Rossum wrote:
> On Fri, May 23, 2014 at 10:17 AM, Eric Snow
> wrote:
>
>     On Fri, May 23, 2014 at 10:49 AM, Guido van Rossum
>     wrote:
>     > Looking at my own (frequent) use of coverage.py, I would be
>     totally fine if
>     > disabling peephole optimization only affected my app's code, and
>     kept using
>     > the precompiled stdlib. (How exactly this would work is left as
>     an exercise
>     > for the reader.)
>
>     Would it be a problem if .pyc files weren't generated or used (a la -B
>     or PYTHONDONTWRITEBYTECODE) when you ran coverage?
>
>
> In first approximation that would probably be okay, although it would
> make coverage even slower. I was envisioning something where it would
> still use, but not write, pyc files for the stdlib or site-packages,
> because the code in whose coverage I am interested is puny compared to
> the stdlib code it imports.

I was concerned about losing any time in test suites that are already
considered too slow. But I tried to do some controlled measurements of
these scenarios, and found the worst case (no .pyc available, and none
written) was only 2.8% slower than with full .pyc files available. When
I tried to measure stdlib .pyc's available, and no .pyc's for my code,
the results were actually very slightly faster than the typical case. I
think this points to the difficulty of controlling all the variables!

In any case, it seems that the penalty for avoiding the .pyc files is
not burdensome.

>
> --
> --Guido van Rossum (python.org/~guido)
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
From ncoghlan at gmail.com  Tue May 27 04:40:37 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 27 May 2014 12:40:37 +1000
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: <5383DBE8.6020309@nedbatchelder.com>
References: <537C888D.7060903@nedbatchelder.com>
 <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com>
 <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com>
 <5383DBE8.6020309@nedbatchelder.com>
Message-ID:

On 27 May 2014 10:28, "Ned Batchelder" wrote:
> I was concerned about losing any time in test suites that are already
> considered too slow. But I tried to do some controlled measurements
> of these scenarios, and found that the worst case (no .pyc available,
> and none written) was only 2.8% slower than having full .pyc files
> available.
>
> In any case, it seems that the penalty for avoiding the .pyc files is
> not burdensome.

Along these lines, how about making the environment variable something
like "PYTHONANALYSINGSOURCE", with the effects:

- bytecode files are neither read nor written
- all bytecode and AST optimisations are disabled

A use-case-oriented flag like that lets us tweak the definition as
needed in the future, unlike an option that is specific to turning off
the CPython peephole optimiser (e.g. we don't have an AST optimiser
yet, but disabling one would still be covered by an "analysing source"
flag).

Cheers,
Nick.

From haoyi.sg at gmail.com  Tue May 27 04:45:21 2014
From: haoyi.sg at gmail.com (Haoyi Li)
Date: Mon, 26 May 2014 19:45:21 -0700
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To:
References: <537C888D.7060903@nedbatchelder.com>
 <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com>
 <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com>
 <5383DBE8.6020309@nedbatchelder.com>
Message-ID:

> - bytecode files are neither read nor written

Yay! That would be amazing...
On Mon, May 26, 2014 at 7:40 PM, Nick Coghlan wrote:
> Along these lines, how about making the environment variable
> something like "PYTHONANALYSINGSOURCE", with the effects:
>
> - bytecode files are neither read nor written
> - all bytecode and AST optimisations are disabled
>
> A use-case-oriented flag like that lets us tweak the definition as
> needed in the future, unlike an option that is specific to turning
> off the CPython peephole optimiser.
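For comparison, the closest existing knob only gives half of that
first bullet (a quick sketch; run it in a fresh interpreter):

    import sys

    # -B / PYTHONDONTWRITEBYTECODE (or setting this flag before your
    # own imports run) stops bytecode files from being *written*:
    sys.dont_write_bytecode = True

    import json  # no new .pyc is cached for this or later imports
    print(sys.dont_write_bytecode)

Existing .pyc files are still *read* if present, though, which is
precisely the extra behaviour the proposed flag would disable.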
From barry at python.org  Tue May 27 21:08:58 2014
From: barry at python.org (Barry Warsaw)
Date: Tue, 27 May 2014 15:08:58 -0400
Subject: [Python-ideas] Disable all peephole optimizations
References: <537C888D.7060903@nedbatchelder.com>
 <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com>
 <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com>
 <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com>
Message-ID: <20140527150858.2ba75c1c@anarchist.wooz.org>

On May 23, 2014, at 09:49 AM, Guido van Rossum wrote:

>I would also like to remind people the reason why there are separate
>pyc and pyo files: they are separate to support precompilation of the
>standard library and installed 3rd party packages for different
>optimization levels.

In fact, Debian (and I'm sure other OSes with package managers)
precompiles source files at installation time. We have a couple of
bugs languishing to provide -OO optimization levels as an option when
doing this precompilation. I haven't pushed this forward because I got
side-tracked by the overloading of .pyo files for both the -O and -OO
optimization levels.

I agree that the flags, mechanisms, and semantics should be worked out
first, but I also think that PEP 3147 tagging will provide a nice UI
for the file system representation of the optimization levels.

death-to-pyo-files-ly y'rs,
-Barry
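P.S. The current PEP 3147 naming is easy to inspect from Python, and
shows exactly the .pyo overloading in question (output shown for a
CPython 3.4 interpreter; the tag varies with implementation and
version):

    import importlib.util

    # The interpreter tag is baked into the cache file name:
    print(importlib.util.cache_from_source('spam.py'))
    # -> '__pycache__/spam.cpython-34.pyc'

    # ...but the optimization level is not: -O and -OO both collapse
    # onto the same '.pyo' suffix.
    print(importlib.util.cache_from_source('spam.py',
                                            debug_override=False))
    # -> '__pycache__/spam.cpython-34.pyo'

Extending that tag to spell out the optimization level is the "nice
UI" referred to above.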