From m.raesener at gmail.com  Sun Apr  1 03:39:31 2018
From: m.raesener at gmail.com (=?UTF-8?Q?Marius_R=C3=A4sener?=)
Date: Sun, 01 Apr 2018 07:39:31 +0000
Subject: [Python-ideas] Dart like multi line strings indentation
In-Reply-To: <20180401014805.GA16661@ando.pearwood.info>
References: <20180401014805.GA16661@ando.pearwood.info>
Message-ID:

Ok I see this is nothing for any 3.x release. I imagine this now as
either being "clean" for users, with a compatibility break, or just
leaving things as they are.

So, if at all, maybe something for Python 4 :)

By coincidence, I watched Armin Ronacher's talk yesterday, which is
related - it's about treating compatibility as the holy cow. An
interesting watch...

https://www.youtube.com/watch?v=xkcNoqHgNs8&feature=youtu.be&t=2890

Steven D'Aprano wrote on Sun., 1 Apr 2018 at 03:49:

> On Sun, Apr 01, 2018 at 02:20:16AM +0100, Rob Cliffe via Python-ideas
> wrote:
>
> > >New unordered 'd' and 'D' prefixes, for 'dedent', applied to multiline
> > >strings only, would multiply the number of alternatives by about 5 and
> > >would require another rewrite of all code (Python or not) that parses
> > >Python code (such as in syntax colorizers).
> >
> > I think you're exaggerating the difficulty somewhat.  Multiplying the
> > number of alternatives by 5 is not the same thing as increasing the
> > complexity of code to parse it by 5.
>
> Terry didn't say that it would increase the complexity of the code by a
> factor of five. He said it would multiply the number of alternatives by
> "about 5". There would be a significant increase in the complexity of
> the code too, but I wouldn't want to guess how much.
>
> Starting with r and f prefixes, in both upper and lower case, we have:
>
> 4 single letter prefixes
> (plus 2 more, u and U, that don't combine with others)
> 8 double letter prefixes
>
> making 14 in total. Adding one more prefix, d|D, increases it to:
>
> 6 single letter prefixes
> (plus 2 more, u and U)
> 24 double letter prefixes
> 48 triple letter prefixes
>
> making 80 prefixes in total. Terry actually underestimated the explosion
> in prefixes: it is closer to six times more than five (but who is
> counting? apart from me *wink*)
>
> [Aside: if we add a fourth, the total becomes 634 prefixes.]
>
> --
> Steve

From steve at pearwood.info  Sun Apr  1 04:53:36 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 1 Apr 2018 18:53:36 +1000
Subject: [Python-ideas] Dart like multi line strings indentation
In-Reply-To:
References: <20180401014805.GA16661@ando.pearwood.info>
Message-ID: <20180401085335.GB16661@ando.pearwood.info>

On Sun, Apr 01, 2018 at 07:39:31AM +0000, Marius Räsener wrote:

> Ok I see this is nothing for any 3.x release.
[...]
> So, if at all, maybe something for Python 4 :)

No, that's the wrong conclusion to draw. There are four options:

(1) Change the behaviour of triple-quoted strings, immediately as of
3.8. This is out. It will be out in 3.9, and 4.0.

(2) Change the behaviour of triple-quoted strings using a warning
period and a __future__ import. This would probably take a minimum of
three releases, but it could start in 3.8. However, anyone arguing in
favour of this would have to make a VERY good case for it.
(3) Leave the behaviour of triple-quoted strings alone, but introduce
new behaviour via a method, or a new prefix. Again, this could start as
early as 3.8 if someone makes a strong case for it.

(4) The status quo: nothing changes.

Python 4 will not be special like Python 3 was. Any new features in
Python 4 that break backwards compatibility will still be required to
go through a transition period, involving warnings and/or __future__
imports. Python will possibly never again go through a major break like
Python 2 to 3, but if it does, it may not be until Python 5 or 6.

So if you think that waiting a few years means we will be free to make
this change, no, option (1) will still be out, even in Python 4.

Personally, I find the situation with triple-quoted strings and
indentation to be a regular low-level annoyance, and I'd like to see a
nice solution sooner rather than later. Thank you for raising this
issue again, even if nothing comes from it.

--
Steve

From ncoghlan at gmail.com  Sun Apr  1 05:07:50 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 1 Apr 2018 19:07:50 +1000
Subject: [Python-ideas] PEP draft: Unifying function/method classes
In-Reply-To: <5ABFBE2B.3000308@UGent.be>
References: <5ABF9FD8.4030507@UGent.be>
 <9481860f6a1f4518b53bb7c19a742c34@xmail102.UGent.be>
 <5ABFBE2B.3000308@UGent.be>
Message-ID:

On 1 April 2018 at 02:58, Jeroen Demeyer wrote:
> On 2018-03-31 18:09, Steven D'Aprano wrote:
>> Seems to me that if you want a fast, exact (no subclasses) check, you
>> should use "type(obj) is Class" rather than isinstance. If the *only*
>> reason to prohibit subclassing is to make isinstance a bit faster,
>> I don't think that's a good enough reason.
>
> I didn't really mean "isinstance" literally, I was mostly thinking of
> the C API. I agree that it's not clear.
>
> Do you happen to know why the existing function classes in Python
> disallow subclassing? I assumed that it was for exactly this reason.

Disallowing subclasses is a simplifying assumption for builtin types,
since it means they don't need to account for subclasses potentially
failing to enforce class invariants.

That said, allowing Cython/CFFI/etc to use the existing fast paths for
native Python functions and/or builtin C functions is a reasonable
justification for flipping that switch and doing the extra work needed
to make it robust - it's just likely to involve adding a number of
`*_CheckExact()` calls in various places.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From eric at trueblade.com  Sun Apr  1 05:11:07 2018
From: eric at trueblade.com (Eric V. Smith)
Date: Sun, 1 Apr 2018 05:11:07 -0400
Subject: [Python-ideas] Dart like multi line strings indentation
In-Reply-To: <20180401014805.GA16661@ando.pearwood.info>
References: <20180401014805.GA16661@ando.pearwood.info>
Message-ID: <3c3933bd-5d5e-6b06-d820-abbc98ffabc9@trueblade.com>

On 3/31/2018 9:48 PM, Steven D'Aprano wrote:
> On Sun, Apr 01, 2018 at 02:20:16AM +0100, Rob Cliffe via Python-ideas
> wrote:
>
>>> New unordered 'd' and 'D' prefixes, for 'dedent', applied to multiline
>>> strings only, would multiply the number of alternatives by about 5 and
>>> would require another rewrite of all code (Python or not) that parses
>>> Python code (such as in syntax colorizers).
>>
>> I think you're exaggerating the difficulty somewhat. Multiplying the
>> number of alternatives by 5 is not the same thing as increasing the
>> complexity of code to parse it by 5.
>
> Terry didn't say that it would increase the complexity of the code by a
> factor of five. He said it would multiply the number of alternatives by
> "about 5". There would be a significant increase in the complexity of
> the code too, but I wouldn't want to guess how much.
>
> Starting with r and f prefixes, in both upper and lower case, we have:
>
> 4 single letter prefixes
> (plus 2 more, u and U, that don't combine with others)
> 8 double letter prefixes
>
> making 14 in total. Adding one more prefix, d|D, increases it to:
>
> 6 single letter prefixes
> (plus 2 more, u and U)
> 24 double letter prefixes
> 48 triple letter prefixes
>
> making 80 prefixes in total. Terry actually underestimated the explosion
> in prefixes: it is closer to six times more than five (but who is
> counting? apart from me *wink*)
>
> [Aside: if we add a fourth, the total becomes 634 prefixes.]

Not that it really matters, but there's some code I use whenever I feel
like playing with adding string prefixes. It usually encourages me to
not do that!

Lib/tokenize.py:_all_string_prefixes exists just for calculating string
prefixes. Since it's not what is actually used by the tokenizer, I
don't claim it's perfect (but I don't know of any errors in it).

According to it, and ignoring the empty string, there are currently 24
prefixes:
{'B', 'BR', 'Br', 'F', 'FR', 'Fr', 'R', 'RB', 'RF', 'Rb', 'Rf', 'U',
'b', 'bR', 'br', 'f', 'fR', 'fr', 'r', 'rB', 'rF', 'rb', 'rf', 'u'}

And if you add 'd', and it can't combine with 'b' or 'u', I count 90:
{'rdf', 'FR', 'dRF', 'rD', 'FrD', 'DFr', 'frd', 'RDf', 'u', 'DF', 'd',
'Frd', 'frD', 'dFr', 'rDF', 'fD', 'rB', 'dFR', 'FD', 'dr', 'Fr', 'DfR',
'fdR', 'Rb', 'dfr', 'rdF', 'rf', 'Drf', 'R', 'RB', 'BR', 'FdR', 'bR',
'DFR', 'RdF', 'dF', 'F', 'fd', 'Br', 'Dfr', 'Dr', 'r', 'rfd', 'RFd',
'Fdr', 'dfR', 'rb', 'fDr', 'rFD', 'fRd', 'Rfd', 'RDF', 'rFd', 'Rdf',
'rF', 'FDr', 'drF', 'dR', 'D', 'br', 'fr', 'drf', 'DrF', 'rd', 'DRF',
'DR', 'RFD', 'Rf', 'fR', 'RfD', 'Df', 'rDf', 'U', 'f', 'df', 'DRf',
'fdr', 'B', 'FRD', 'RF', 'Fd', 'Rd', 'fRD', 'FRd', 'b', 'dRf', 'FDR',
'RD', 'fDR', 'rfD'}

I guess it's debatable if you want to count prefixes that contain 'b'
as string prefixes or not, but the tokenizer thinks they are. If you
leave them out, you come up with the 14 and 80 that Steven mentions.

I agree with Terry that adding 'F' was a mistake. But since the upper
case versions of 'r', and 'b' already existed, it was included.

Interestingly, in 2.7 'ur' is a valid prefix, but not in 3.6. I don't
recall if that was deliberate or not. And 'ru' isn't valid in either
version.

Eric

From rosuav at gmail.com  Sun Apr  1 05:24:29 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 1 Apr 2018 19:24:29 +1000
Subject: [Python-ideas] Dart like multi line strings indentation
In-Reply-To: <3c3933bd-5d5e-6b06-d820-abbc98ffabc9@trueblade.com>
References: <20180401014805.GA16661@ando.pearwood.info>
 <3c3933bd-5d5e-6b06-d820-abbc98ffabc9@trueblade.com>
Message-ID:

On Sun, Apr 1, 2018 at 7:11 PM, Eric V. Smith wrote:
> Interestingly, in 2.7 'ur' is a valid prefix, but not in 3.6. I don't
> recall if that was deliberate or not. And 'ru' isn't valid in either
> version.

I believe it was. The 'ur' string literal in Py2 was a bizarre hybrid
of raw-but-allowing-Unicode-escapes, which makes no sense in the Py3
world.

$ python3
Python 3.8.0a0 (heads/literal_eval-exception:ddcb2eb331, Feb 21 2018,
04:32:23)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> print(u"\\ \u005c \\")
\ \ \
>>> print(r"\\ \u005c \\")
\\ \u005c \\
$ python2
Python 2.7.13 (default, Nov 24 2017, 17:33:09)
[GCC 6.3.0 20170516] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> print(u"\\ \u005c \\")
\ \ \
>>> print(r"\\ \u005c \\")
\\ \u005c \\
>>> print(ur"\\ \u005c \\")
\\ \ \\

In Py3, a normal Unicode literal (with or without the 'u' prefix, which
has no meaning) will interpret "\u005c" as a backslash. A raw literal
will treat it as a backslash followed by five other characters. In Py2,
the same semantics hold for normal Unicode literals, and for bytes
literals, the "\u" escape code has no meaning (and is parsed as "\\u").
But in a raw Unicode literal, the backslashes are treated literally...
unless they're starting a "\u" sequence, in which case they're parsed.

So if you use a raw string literal in Python 3 to store a Windows path
name, you're fine as long as it doesn't end with a backslash (a wart in
the design that probably can't be done any other way). But in Py2, a
raw Unicode literal will occasionally misbehave in the *exact* *same*
*way* as a non-raw string literal will - complete with it being
data-dependent.

Since the entire point of the Py3 u"..." prefix is compatibility with
Py2, the semantics have to be retained. There's no point supporting
ur"..." in Py3 if it's not going to produce the same result as in Py2.

ChrisA

From ncoghlan at gmail.com  Sun Apr  1 05:31:19 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 1 Apr 2018 19:31:19 +1000
Subject: [Python-ideas] Dart like multi line strings indentation
In-Reply-To:
References: <20180401014805.GA16661@ando.pearwood.info>
 <3c3933bd-5d5e-6b06-d820-abbc98ffabc9@trueblade.com>
Message-ID:

On 1 April 2018 at 19:24, Chris Angelico wrote:
> Since the entire point of the Py3 u"..." prefix is compatibility with
> Py2, the semantics have to be retained. There's no point supporting
> ur"..." in Py3 if it's not going to produce the same result as in Py2.

Right, "ur" strings were originally taken out in Python 3.0, and then
we made the decision *not* to add them back when PEP 414 restored other
uses of the "u" prefix:
https://www.python.org/dev/peps/pep-0414/#exclusion-of-raw-unicode-literals

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From desmoulinmichel at gmail.com  Sun Apr  1 07:41:16 2018
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Sun, 1 Apr 2018 13:41:16 +0200
Subject: [Python-ideas] Dart like multi line strings indentation
In-Reply-To:
References:
Message-ID: <896c5942-88a6-0969-b477-3e6459fc98a1@gmail.com>

A "d" prefix to do textwrap.dedent is something I've wished for for a
long time. It's like the "f" one: we can already do it, but hell, is it
convenient to have a shortcut.

This is especially true if, like me, you take a lot of care in the
error messages you give to the user. I write a LOT of them, very long,
very descriptive, and I have to either import textwrap or play the
concatenation game.

Having a str.dedent() method would be nice, but the d prefix has the
huge advantage of being able to dedent at parse time, and hence be more
performant.

On 31/03/2018 at 16:50, Marius Räsener wrote:
> Hey List,
>
> this is my very first approach to suggest a Python improvement I'd
> think worth discussing.
>
> At some point, maybe with Dart 2.0 or a little earlier, Dart is now
> supporting multiline strings with "proper" indentation (tried, but I
> can't find the corresponding docs at the moment,
> probably due to the rather large changes related to Dart 2.0 and
> outdated docs.)
>
> What I have in mind is probably best described with an Example:
>
> print("""
>     I am a
>     multiline
>     String.
>     """)
>
> the closing quote defines the "margin indentation" - so in this
> example all lines would get reduced by their leading 4 spaces,
> resulting in a "clean" and unindented string.
>
> anyways, Dart or not, it doesn't matter - I like the Idea and I think
> python3.x could benefit from it. If that's possible at all :)
>
> I could also imagine that this "indentation cleanup" only is applied
> if the last quotes are on their own line? Might be too complicated
> though, I can't estimate or understand this...
>
> thx for reading,
> Marius

From Richard at Damon-Family.org  Sun Apr  1 08:08:41 2018
From: Richard at Damon-Family.org (Richard Damon)
Date: Sun, 1 Apr 2018 08:08:41 -0400
Subject: [Python-ideas] Dart like multi line strings indentation
In-Reply-To: <3c3933bd-5d5e-6b06-d820-abbc98ffabc9@trueblade.com>
References: <20180401014805.GA16661@ando.pearwood.info>
 <3c3933bd-5d5e-6b06-d820-abbc98ffabc9@trueblade.com>
Message-ID: <011e10ab-5d27-a1b1-92e7-3c509cad90b2@Damon-Family.org>

On 4/1/18 5:11 AM, Eric V. Smith wrote:
> On 3/31/2018 9:48 PM, Steven D'Aprano wrote:
>> On Sun, Apr 01, 2018 at 02:20:16AM +0100, Rob Cliffe via Python-ideas
>> wrote:
>>
>>>> New unordered 'd' and 'D' prefixes, for 'dedent', applied to multiline
>>>> strings only, would multiply the number of alternatives by about 5 and
>>>> would require another rewrite of all code (Python or not) that parses
>>>> Python code (such as in syntax colorizers).
>>>
>>> I think you're exaggerating the difficulty somewhat. Multiplying the
>>> number of alternatives by 5 is not the same thing as increasing the
>>> complexity of code to parse it by 5.
>>
>> Terry didn't say that it would increase the complexity of the code by a
>> factor of five. He said it would multiply the number of alternatives by
>> "about 5". There would be a significant increase in the complexity of
>> the code too, but I wouldn't want to guess how much.
>>
>> Starting with r and f prefixes, in both upper and lower case, we have:
>>
>> 4 single letter prefixes
>> (plus 2 more, u and U, that don't combine with others)
>> 8 double letter prefixes
>>
>> making 14 in total. Adding one more prefix, d|D, increases it to:
>>
>> 6 single letter prefixes
>> (plus 2 more, u and U)
>> 24 double letter prefixes
>> 48 triple letter prefixes
>>
>> making 80 prefixes in total. Terry actually underestimated the explosion
>> in prefixes: it is closer to six times more than five (but who is
>> counting? apart from me *wink*)
>>
>> [Aside: if we add a fourth, the total becomes 634 prefixes.]
>
> Not that it really matters, but there's some code I use whenever I
> feel like playing with adding string prefixes. It usually encourages
> me to not do that!
>
> Lib/tokenize.py:_all_string_prefixes exists just for calculating
> string prefixes. Since it's not what is actually used by the
> tokenizer, I don't claim it's perfect (but I don't know of any errors
> in it).
>
> According to it, and ignoring the empty string, there are currently 24
> prefixes:
> {'B', 'BR', 'Br', 'F', 'FR', 'Fr', 'R', 'RB', 'RF', 'Rb', 'Rf', 'U',
> 'b', 'bR', 'br', 'f', 'fR', 'fr', 'r', 'rB', 'rF', 'rb', 'rf', 'u'}
>
> And if you add 'd', and it can't combine with 'b' or 'u', I count 90:
> {'rdf', 'FR', 'dRF', 'rD', 'FrD', 'DFr', 'frd', 'RDf', 'u', 'DF', 'd',
> 'Frd', 'frD', 'dFr', 'rDF', 'fD', 'rB', 'dFR', 'FD', 'dr', 'Fr',
> 'DfR', 'fdR', 'Rb', 'dfr', 'rdF', 'rf', 'Drf', 'R', 'RB', 'BR', 'FdR',
> 'bR', 'DFR', 'RdF', 'dF', 'F', 'fd', 'Br', 'Dfr', 'Dr', 'r', 'rfd',
> 'RFd', 'Fdr', 'dfR', 'rb', 'fDr', 'rFD', 'fRd', 'Rfd', 'RDF', 'rFd',
> 'Rdf', 'rF', 'FDr', 'drF', 'dR', 'D', 'br', 'fr', 'drf', 'DrF', 'rd',
> 'DRF', 'DR', 'RFD', 'Rf', 'fR', 'RfD', 'Df', 'rDf', 'U', 'f', 'df',
> 'DRf', 'fdr', 'B', 'FRD', 'RF', 'Fd', 'Rd', 'fRD', 'FRd', 'b', 'dRf',
> 'FDR', 'RD', 'fDR', 'rfD'}
>
> I guess it's debatable if you want to count prefixes that contain 'b'
> as string prefixes or not, but the tokenizer thinks they are. If you
> leave them out, you come up with the 14 and 80 that Steven mentions.
>
> I agree with Terry that adding 'F' was a mistake. But since the upper
> case versions of 'r', and 'b' already existed, it was included.
>
> Interestingly, in 2.7 'ur' is a valid prefix, but not in 3.6. I don't
> recall if that was deliberate or not. And 'ru' isn't valid in either
> version.
>
> Eric

One comment about the 'combinatorial explosion' is that it sort of
assumes that each individual combination case needs to be handled with
distinct code. My guess is that virtually all of the actual
implementation of these prefixes can be handled by setting a flag for
the presence of that prefix, and at the parsing of each character you
need to just check a flag or two to figure out how to process it. You
might get a bit more complication in determining if a given combination
is valid, but if that gets too complicated it is likely an indication
of an inconsistency in the language definition.

--
Richard Damon

From francismb at email.de  Sun Apr  1 08:29:43 2018
From: francismb at email.de (francismb)
Date: Sun, 1 Apr 2018 14:29:43 +0200
Subject: [Python-ideas] Dart like multi line strings indentation
In-Reply-To:
References: <1627c986568.2837.db5b03704c129196a4e9415e55413ce6@gmail.com>
Message-ID: <8f063ed8-670f-a440-bd9b-b92fae6b4aef@email.de>

Hi,

On 03/31/2018 09:47 PM, Terry Reedy wrote:
> With no padding, I would not argue with someone who prefers
> textwrap.dedent, but dedent cannot add the leading space.
Couldn't one use the 'indent' function from the 'textwrap' module for
that purpose?

Thanks,
--francis

From steve at pearwood.info  Sun Apr  1 08:36:13 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 1 Apr 2018 22:36:13 +1000
Subject: [Python-ideas] Dart like multi line strings indentation
In-Reply-To: <011e10ab-5d27-a1b1-92e7-3c509cad90b2@Damon-Family.org>
References: <20180401014805.GA16661@ando.pearwood.info>
 <3c3933bd-5d5e-6b06-d820-abbc98ffabc9@trueblade.com>
 <011e10ab-5d27-a1b1-92e7-3c509cad90b2@Damon-Family.org>
Message-ID: <20180401123613.GC16661@ando.pearwood.info>

On Sun, Apr 01, 2018 at 08:08:41AM -0400, Richard Damon wrote:

> One comment about the 'combinatorial explosion' is that it sort of
> assumes that each individual combination case needs to be handled
> with distinct code.

No -- as I said in an earlier post, Terry and I (and Eric) are talking
about the explosion in number of prefixes, not the complexity of the
code.
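(The totals from my earlier post are easy to cross-check mechanically,
by the way -- what follows is just a counting sketch, nothing like the
tokenizer's actual code:)

import itertools

def count_prefixes(letters):
    total = 2  # u and U, which don't combine with anything
    for k in range(1, len(letters) + 1):
        # orderings of k distinct letters, each upper or lower case
        total += sum(2 ** k for _ in itertools.permutations(letters, k))
    return total

print(count_prefixes("rf"))    # 14 (the status quo)
print(count_prefixes("rfd"))   # 80 (with a d prefix)
print(count_prefixes("rfdx"))  # 634 (with a fourth letter)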
You are right that many of the prefixes can be handled by the same code:

    rfd rfD rFd rFD rdf rdF rDf rDF
    Rfd RfD RFd RFD Rdf RdF RDf RDF
    frd frD fRd fRD fdr fdR fDr fDR
    Frd FrD FRd FRD Fdr FdR FDr FDR
    drf drF dRf dRF dfr dfR dFr dFR
    Drf DrF DRf DRF Dfr DfR DFr DFR
    # why did we support all these combinations? who uses them?

presumably will all be handled by the same "raw dedent f-string" code.
But the parser still has to handle all those cases, and so does the
person reading the code.

And that *mental complexity* is (in my opinion) the biggest issue with
adding a new d-prefix, and why I would rather make it a method.

Another big advantage of a method is that we can apply it to
non-literals too.

The number of code paths increases too, but not anywhere as fast:

# existing
- regular ("cooked") triple-quoted string;
- raw string;
- f-string
- raw f-string

# proposed additions
- dedent string
- raw dedent string
- dedent f-string
- raw dedent f-string

so roughly doubling the number of cases. I doubt that will double the
code complexity, but it will complicate it somewhat.

Apart from parsing, the actual complexity to the code will probably be
similar whether it is a method or a prefix. After all, whichever we do,
we still need built-in dedent code.

--
Steve

From rosuav at gmail.com  Sun Apr  1 08:55:34 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 1 Apr 2018 22:55:34 +1000
Subject: [Python-ideas] Dart like multi line strings indentation
In-Reply-To: <20180401123613.GC16661@ando.pearwood.info>
References: <20180401014805.GA16661@ando.pearwood.info>
 <3c3933bd-5d5e-6b06-d820-abbc98ffabc9@trueblade.com>
 <011e10ab-5d27-a1b1-92e7-3c509cad90b2@Damon-Family.org>
 <20180401123613.GC16661@ando.pearwood.info>
Message-ID:

On Sun, Apr 1, 2018 at 10:36 PM, Steven D'Aprano wrote:
> And that *mental complexity* is (in my opinion) the biggest issue with
> adding a new d-prefix, and why I would rather make it a method.
>
> Another big advantage of a method is that we can apply it to
> non-literals too.

I'd like to expand on this a bit more.

Current string prefix letters are:

* u/b: completely change the object you're creating
* f: change it from a literal to a kind of expression
* r: change the interpretation of backslashes
* triple quotes: change the interpretation of newlines

All of these are significant *to the parser*. You absolutely cannot do
any of these with methods (well, maybe you could have u/b done by
having a literal for one of them, and the other is an encode or decode
operation, but that's a pretty poor hack).

But dedenting a string doesn't change the way the source code is
interpreted. So it's free to be a method - which is far easier to add
to the language. All you need is ".dedent()" to be syntactically
parsed as a method call (which it already is), and every tool that
processes Python code will correctly interpret this.
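To make that concrete: textwrap.dedent already implements the
semantics, so the method would just be a new spelling of it. A sketch
(str has no .dedent() today, so the comment below is hypothetical):

import textwrap

text = """
    I am a
    multiline
    String.
    """
print(textwrap.dedent(text))  # what text.dedent() would return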
All you need is ".dedent()" to be syntactically parsed as a method call (which it already is), and every tool that processes Python code will correctly interpret this. So here's what, IMO, Marius can push for: 1) A method on Unicode strings which does the same as textwrap.dedent() 2) A peephole optimization wherein certain methods on literals get executed at compile time. The latter optimization would also apply to cases such as " spam ".strip() - as long as all it does is return another constant value, it can be done at compile time. Semantically, though, the part that matters is simply the new method. (Sadly, this can't be applied to Decimal("1.234"), as that's not a method and could be shadowed/mocked.) While I wouldn't use that method much myself, I think it's a Good Thing for features like that to be methods rather than functions stashed away in a module. (How do you know to look in "textwrap" for a line-by-line version of x.strip() ??) So I would be +1 on both the enhancements I mentioned above, and a solid -1 on this becoming a new form of literal. ChrisA From eric at trueblade.com Sun Apr 1 15:07:31 2018 From: eric at trueblade.com (Eric V. Smith) Date: Sun, 1 Apr 2018 15:07:31 -0400 Subject: [Python-ideas] Dart like multi line strings identation In-Reply-To: References: <20180401014805.GA16661@ando.pearwood.info> <3c3933bd-5d5e-6b06-d820-abbc98ffabc9@trueblade.com> <011e10ab-5d27-a1b1-92e7-3c509cad90b2@Damon-Family.org> <20180401123613.GC16661@ando.pearwood.info> Message-ID: <8b948ea8-e71c-a4a2-5327-280af04eb5de@trueblade.com> On 4/1/2018 8:55 AM, Chris Angelico wrote: > On Sun, Apr 1, 2018 at 10:36 PM, Steven D'Aprano wrote: >> And that *mental complexity* is (in my opinion) the biggest issue with >> adding a new d-prefix, and why I would rather make it a method. >> >> Another big advantage of a method is that we can apply it to >> non-literals too. > > I'd like to expand on this a bit more. > > Current string prefix letters are: > > * u/b: completely change the object you're creating > * f: change it from a literal to a kind of expression > * r: change the interpretation of backslashes > * triple quotes: change the interpretation of newlines > > All of these are significant *to the parser*. You absolutely cannot do > any of these with methods (well, maybe you could have u/b done by > having a literal for one of them, and the other is an encode or decode > operation, but that's a pretty poor hack). The one place where a dedented string would come in handy, and where it would need to be recognized by the parser (and could not be the result of a function or method) is a docstring. Well, I guess you could have the parser "know" about certain string methods, but that seems horrible. Eric > > But dedenting a string doesn't change the way the source code is > interpreted. So it's free to be a method - which is far easier to add > to the language. All you need is ".dedent()" to be syntactically > parsed as a method call (which it already is), and every tool that > processes Python code will correctly interpret this. > > So here's what, IMO, Marius can push for: > > 1) A method on Unicode strings which does the same as textwrap.dedent() > 2) A peephole optimization wherein certain methods on literals get > executed at compile time. > > The latter optimization would also apply to cases such as " spam > ".strip() - as long as all it does is return another constant value, > it can be done at compile time. Semantically, though, the part that > matters is simply the new method. 
> (Sadly, this can't be applied to
> Decimal("1.234"), as that's not a method and could be
> shadowed/mocked.)
>
> While I wouldn't use that method much myself, I think it's a Good
> Thing for features like that to be methods rather than functions
> stashed away in a module. (How do you know to look in "textwrap" for a
> line-by-line version of x.strip() ??) So I would be +1 on both the
> enhancements I mentioned above, and a solid -1 on this becoming a new
> form of literal.
>
> ChrisA

From tjreedy at udel.edu  Sun Apr  1 15:10:56 2018
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 1 Apr 2018 15:10:56 -0400
Subject: [Python-ideas] Dart like multi line strings indentation
In-Reply-To: <20180401123613.GC16661@ando.pearwood.info>
References: <20180401014805.GA16661@ando.pearwood.info>
 <3c3933bd-5d5e-6b06-d820-abbc98ffabc9@trueblade.com>
 <011e10ab-5d27-a1b1-92e7-3c509cad90b2@Damon-Family.org>
 <20180401123613.GC16661@ando.pearwood.info>
Message-ID:

On 4/1/2018 8:36 AM, Steven D'Aprano wrote:
> On Sun, Apr 01, 2018 at 08:08:41AM -0400, Richard Damon wrote:
>
>> One comment about the 'combinatorial explosion' is that it sort of
>> assumes that each individual combination case needs to be handled
>> with distinct code.
>
> No -- as I said in an earlier post, Terry and I (and Eric) are talking
> about the explosion in number of prefixes, not the complexity of the
> code.
>
> You are right that many of the prefixes can be handled by the same code:
>
>     rfd rfD rFd rFD rdf rdF rDf rDF
>     Rfd RfD RFd RFD Rdf RdF RDf RDF
>     frd frD fRd fRD fdr fdR fDr fDR
>     Frd FrD FRd FRD Fdr FdR FDr FDR
>     drf drF dRf dRF dfr dfR dFr dFR
>     Drf DrF DRf DRF Dfr DfR DFr DFR
>     # why did we support all these combinations? who uses them?
>
> presumably will all be handled by the same "raw dedent f-string" code.
> But the parser still has to handle all those cases, and so does the
> person reading the code.

IDLE's colorizer does its parsing with a giant regex. The new prefix
combinations would nearly double the number of alternatives in the
regex. I am sure that this would mean more nodes in the compiled
finite-state machine. Even though the non-re code of the colorizer
would not change, I am pretty sure that this would mean that coloring
takes longer. Since the colorizer is called with each keystroke*, and
since other events can be handled between keystrokes#, colorizing time
*could* become an issue, especially on older or slower machines than
mine. Noticeable delays between keystroke and character appearance on
screen are a real drag.

* Type 'i', 'i' appears 'normal'; type 'n', 'in' is colored 'keyword';
type 't', 'int' is colored 'builtin'; type 'o', 'into' becomes 'normal'
again.

# One can edit while a program is running in a separate process and
outputting to the shell window.

--
Terry Jan Reedy

From m.raesener at gmail.com  Sun Apr  1 15:25:32 2018
From: m.raesener at gmail.com (=?UTF-8?Q?Marius_R=C3=A4sener?=)
Date: Sun, 01 Apr 2018 19:25:32 +0000
Subject: [Python-ideas] Dart like multi line strings indentation
In-Reply-To:
References: <20180401014805.GA16661@ando.pearwood.info>
 <3c3933bd-5d5e-6b06-d820-abbc98ffabc9@trueblade.com>
 <011e10ab-5d27-a1b1-92e7-3c509cad90b2@Damon-Family.org>
 <20180401123613.GC16661@ando.pearwood.info>
Message-ID:

Hey again,

Thx all for the active discussion.
Since I'm the OP, I want to make clear that I didn't have a `d` string
literal in mind. The idea was to support this just as the default -
with any more effort needed to support it, I don't see a real
advantage, or wouldn't consider it "solved".

So I'm aware that there probably won't be a majority for this,
considering it's a breaking change - still I want to emphasize that I
wouldn't want yet another string literal prefix. I think this would be
really bad.

Actually I'd rather like to see Python develop backwards and remove
string literal prefixes rather than getting even more ... so maybe
just `r` and `b`?

Anyways, I think I've made my point clear.

Terry Reedy wrote on Sun., 1 Apr 2018 at 21:12:

> On 4/1/2018 8:36 AM, Steven D'Aprano wrote:
> > On Sun, Apr 01, 2018 at 08:08:41AM -0400, Richard Damon wrote:
> >
> >> One comment about the 'combinatorial explosion' is that it sort of
> >> assumes
> >> that each individual combination case needs to be handled with distinct
> >> code.
>
> > No -- as I said in an earlier post, Terry and I (and Eric) are talking
> > about the explosion in number of prefixes, not the complexity of the
> > code.
> >
> > You are right that many of the prefixes can be handled by the same code:
> >
> >     rfd rfD rFd rFD rdf rdF rDf rDF
> >     Rfd RfD RFd RFD Rdf RdF RDf RDF
> >     frd frD fRd fRD fdr fdR fDr fDR
> >     Frd FrD FRd FRD Fdr FdR FDr FDR
> >     drf drF dRf dRF dfr dfR dFr dFR
> >     Drf DrF DRf DRF Dfr DfR DFr DFR
> >     # why did we support all these combinations? who uses them?
> >
> > presumably will all be handled by the same "raw dedent f-string" code.
> > But the parser still has to handle all those cases, and so does the
> > person reading the code.
>
> IDLE's colorizer does its parsing with a giant regex.  The new prefix
> combinations would nearly double the number of alternatives in the
> regex.  I am sure that this would mean more nodes in the compiled
> finite-state machine.  Even though the non-re code of the colorizer
> would not change, I am pretty sure that this would mean that coloring
> takes longer.  Since the colorizer is called with each keystroke*, and
> since other events can be handled between keystrokes#, colorizing time
> *could* become an issue, especially on older or slower machines than
> mine.  Noticeable delays between keystroke and character appearance on
> screen are a real drag.
>
> * Type 'i', 'i' appears 'normal'; type 'n', 'in' is colored 'keyword';
> type 't', 'int' is colored 'builtin'; type 'o', 'into' becomes 'normal'
> again.
>
> # One can edit while a program is running in a separate process and
> outputting to the shell window.
>
> --
> Terry Jan Reedy
From Richard at Damon-Family.org  Sun Apr  1 15:42:18 2018
From: Richard at Damon-Family.org (Richard Damon)
Date: Sun, 1 Apr 2018 15:42:18 -0400
Subject: [Python-ideas] Dart like multi line strings indentation
In-Reply-To: <20180401123613.GC16661@ando.pearwood.info>
References: <20180401014805.GA16661@ando.pearwood.info>
 <3c3933bd-5d5e-6b06-d820-abbc98ffabc9@trueblade.com>
 <011e10ab-5d27-a1b1-92e7-3c509cad90b2@Damon-Family.org>
 <20180401123613.GC16661@ando.pearwood.info>
Message-ID: <44c30ae2-e138-dd4b-3dbc-7859511b4c8f@Damon-Family.org>

On 4/1/18 8:36 AM, Steven D'Aprano wrote:
> On Sun, Apr 01, 2018 at 08:08:41AM -0400, Richard Damon wrote:
>
>> One comment about the 'combinatorial explosion' is that it sort of
>> assumes that each individual combination case needs to be handled
>> with distinct code.
> No -- as I said in an earlier post, Terry and I (and Eric) are talking
> about the explosion in number of prefixes, not the complexity of the
> code.
>
> You are right that many of the prefixes can be handled by the same code:
>
>     rfd rfD rFd rFD rdf rdF rDf rDF
>     Rfd RfD RFd RFD Rdf RdF RDf RDF
>     frd frD fRd fRD fdr fdR fDr fDR
>     Frd FrD FRd FRD Fdr FdR FDr FDR
>     drf drF dRf dRF dfr dfR dFr dFR
>     Drf DrF DRf DRF Dfr DfR DFr DFR
>     # why did we support all these combinations? who uses them?
>
> presumably will all be handled by the same "raw dedent f-string" code.
> But the parser still has to handle all those cases, and so does the
> person reading the code.
>
> And that *mental complexity* is (in my opinion) the biggest issue with
> adding a new d-prefix, and why I would rather make it a method.
>
> Another big advantage of a method is that we can apply it to
> non-literals too.
>
> The number of code paths increases too, but not anywhere as fast:
>
> # existing
> - regular ("cooked") triple-quoted string;
> - raw string;
> - f-string
> - raw f-string
>
> # proposed additions
> - dedent string
> - raw dedent string
> - dedent f-string
> - raw dedent f-string
>
> so roughly doubling the number of cases. I doubt that will double the
> code complexity, but it will complicate it somewhat.
>
> Apart from parsing, the actual complexity to the code will probably be
> similar whether it is a method or a prefix. After all, whichever we do,
> we still need built-in dedent code.
>
I think you miss my point that we shouldn't be parsing by each
combination of prefixes (even collapsing equivalent ones), but instead
by each prefix adjusting the rules for parsing, which a single parsing
routine uses. Mentally, you should be doing the same. I think that the
grammar trying to exhaustively list the prefixes is awkward, and would
be better served by a simpler production that allows for an arbitrary
combination of the prefixes, combined with a rule properly limiting the
combinations of letters allowed, something like: at most one of a given
letter (case insensitive), at most one of b, u, and f, at most one of r
and u (for Python 3), then followed, as currently, by a description of
what each letter does.

This removes the combinatorial explosion that is already starting with
the addition of f.
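A minimal sketch of such a flag-based validity check (illustrative
only -- not how the real tokenizer is written):

def valid_prefix(prefix):
    seen = set()
    for ch in prefix.lower():
        if ch not in "bufr" or ch in seen:
            return False  # unknown letter, or a repeated letter
        seen.add(ch)
    if len(seen & {"b", "u", "f"}) > 1:
        return False  # at most one of b, u, and f
    if "r" in seen and "u" in seen:
        return False  # u does not combine with r in Python 3
    return True

print(valid_prefix("Rb"), valid_prefix("ur"))  # True False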
--
Richard Damon

From rosuav at gmail.com  Sun Apr  1 15:57:28 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 2 Apr 2018 05:57:28 +1000
Subject: [Python-ideas] Dart like multi line strings indentation
In-Reply-To:
References: <20180401014805.GA16661@ando.pearwood.info>
 <3c3933bd-5d5e-6b06-d820-abbc98ffabc9@trueblade.com>
 <011e10ab-5d27-a1b1-92e7-3c509cad90b2@Damon-Family.org>
 <20180401123613.GC16661@ando.pearwood.info>
Message-ID:

On Mon, Apr 2, 2018 at 5:25 AM, Marius Räsener wrote:
> Actually I'd rather like to see Python develop backwards and remove
> string literal prefixes rather than getting even more ... so maybe
> just `r` and `b`?

Yeah, that's not gonna happen :)

ChrisA

From brenbarn at brenbarn.net  Sun Apr  1 16:31:30 2018
From: brenbarn at brenbarn.net (Brendan Barnwell)
Date: Sun, 01 Apr 2018 13:31:30 -0700
Subject: [Python-ideas] Dart like multi line strings indentation
In-Reply-To: <20180401123613.GC16661@ando.pearwood.info>
References: <20180401014805.GA16661@ando.pearwood.info>
 <3c3933bd-5d5e-6b06-d820-abbc98ffabc9@trueblade.com>
 <011e10ab-5d27-a1b1-92e7-3c509cad90b2@Damon-Family.org>
 <20180401123613.GC16661@ando.pearwood.info>
Message-ID: <5AC141A2.10802@brenbarn.net>

On 2018-04-01 05:36, Steven D'Aprano wrote:
> On Sun, Apr 01, 2018 at 08:08:41AM -0400, Richard Damon wrote:
>
>> One comment about the 'combinatorial explosion' is that it sort of
>> assumes that each individual combination case needs to be handled
>> with distinct code.
>
> No -- as I said in an earlier post, Terry and I (and Eric) are talking
> about the explosion in number of prefixes, not the complexity of the
> code.
>
> You are right that many of the prefixes can be handled by the same code:
>
>     rfd rfD rFd rFD rdf rdF rDf rDF
>     Rfd RfD RFd RFD Rdf RdF RDf RDF
>     frd frD fRd fRD fdr fdR fDr fDR
>     Frd FrD FRd FRD Fdr FdR FDr FDR
>     drf drF dRf dRF dfr dfR dFr dFR
>     Drf DrF DRf DRF Dfr DfR DFr DFR
>     # why did we support all these combinations? who uses them?
>
> presumably will all be handled by the same "raw dedent f-string" code.
> But the parser still has to handle all those cases, and so does the
> person reading the code.
>
> And that *mental complexity* is (in my opinion) the biggest issue with
> adding a new d-prefix, and why I would rather make it a method.

    That doesn't seem a very reasonable argument to me. That is like
saying that a person reading code has to mentally slog through the
cognitive burden of understanding "all the combinations" of "a + b +
c", "a + b - c", "a * b + c", "a - b * c", etc. We don't. We know what
the operators mean and we build up our understanding of expressions by
combining them. Similarly, these string prefixes can mostly be thought
of as independent flags. You don't parse each combination separately;
you learn what each flag means and then build up your understanding of
a prefix by combining your understanding of the flags. (This is also
glossing over the fact that many of the combinations you list differ
only in case, which to my mind adds no extra cognitive load
whatsoever.)

--
Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no
path, and leave a trail."
--author unknown

From Richard at Damon-Family.org  Sun Apr  1 16:57:32 2018
From: Richard at Damon-Family.org (Richard Damon)
Date: Sun, 1 Apr 2018 16:57:32 -0400
Subject: [Python-ideas] Dart like multi line strings indentation
In-Reply-To: <5AC141A2.10802@brenbarn.net>
References: <20180401014805.GA16661@ando.pearwood.info>
 <3c3933bd-5d5e-6b06-d820-abbc98ffabc9@trueblade.com>
 <011e10ab-5d27-a1b1-92e7-3c509cad90b2@Damon-Family.org>
 <20180401123613.GC16661@ando.pearwood.info>
 <5AC141A2.10802@brenbarn.net>
Message-ID: <4cfc95df-c72d-5c75-39bb-0e1fe2d5c748@Damon-Family.org>

On 4/1/18 4:31 PM, Brendan Barnwell wrote:
> On 2018-04-01 05:36, Steven D'Aprano wrote:
>> On Sun, Apr 01, 2018 at 08:08:41AM -0400, Richard Damon wrote:
>>
>>> One comment about the 'combinatorial explosion' is that it sort of
>>> assumes
>>> that each individual combination case needs to be handled with distinct
>>> code.
>>
>> No -- as I said in an earlier post, Terry and I (and Eric) are talking
>> about the explosion in number of prefixes, not the complexity of the
>> code.
>>
>> You are right that many of the prefixes can be handled by the same code:
>>
>>     rfd rfD rFd rFD rdf rdF rDf rDF
>>     Rfd RfD RFd RFD Rdf RdF RDf RDF
>>     frd frD fRd fRD fdr fdR fDr fDR
>>     Frd FrD FRd FRD Fdr FdR FDr FDR
>>     drf drF dRf dRF dfr dfR dFr dFR
>>     Drf DrF DRf DRF Dfr DfR DFr DFR
>>     # why did we support all these combinations? who uses them?
>>
>> presumably will all be handled by the same "raw dedent f-string" code.
>> But the parser still has to handle all those cases, and so does the
>> person reading the code.
>>
>> And that *mental complexity* is (in my opinion) the biggest issue with
>> adding a new d-prefix, and why I would rather make it a method.
>
>     That doesn't seem a very reasonable argument to me. That is like
> saying that a person reading code has to mentally slog through the
> cognitive burden of understanding "all the combinations" of "a + b +
> c", "a + b - c", "a * b + c", "a - b * c", etc. We don't. We know
> what the operators mean and we build up our understanding of
> expressions by combining them. Similarly, these string prefixes can
> mostly be thought of as independent flags. You don't parse each
> combination separately; you learn what each flag means and then build
> up your understanding of a prefix by combining your understanding of
> the flags. (This is also glossing over the fact that many of the
> combinations you list differ only in case, which to my mind adds no
> extra cognitive load whatsoever.)
>
Actually ALL the variations listed were the exact same prefix (dfr),
with the 6 variations in possible order and the 8 variations of each of
those in case. Which just shows why you don't want to try to
exhaustively list prefixes.

--
Richard Damon

From g.rodola at gmail.com  Sun Apr  1 16:57:56 2018
From: g.rodola at gmail.com (Giampaolo Rodola')
Date: Sun, 1 Apr 2018 22:57:56 +0200
Subject: [Python-ideas] Generalized version of contextlib.closing
In-Reply-To:
References:
Message-ID:

On Mon, Mar 26, 2018 at 10:35 AM, Roberto Martínez
<robertomartinezp at gmail.com> wrote:

> Hi,
>
> sometimes I need to use contextlib.closing but over methods with a
> different name, for example stop(), halt(), etc. For those cases I
> have to write my own contextlib.closing specialized version with a
> hard-coded method name.
> I think adding a "method" argument to contextlib.closing can be very
> useful:
>
> @contextmanager
> def closing(thing, method="close"):
>     try:
>         yield thing
>     finally:
>         getattr(thing, method)()
>
> Or maybe something even more generic:
>
> @contextmanager
> def calling(fn, *args, **kwargs):
>     try:
>         yield
>     finally:
>         fn(*args, **kwargs)
>
> Best regards,
> Roberto

+1

--
Giampaolo - http://grodola.blogspot.com

From python at mrabarnett.plus.com  Sun Apr  1 21:45:56 2018
From: python at mrabarnett.plus.com (MRAB)
Date: Mon, 2 Apr 2018 02:45:56 +0100
Subject: [Python-ideas] Dart like multi line strings indentation
In-Reply-To:
References: <20180401014805.GA16661@ando.pearwood.info>
 <3c3933bd-5d5e-6b06-d820-abbc98ffabc9@trueblade.com>
 <011e10ab-5d27-a1b1-92e7-3c509cad90b2@Damon-Family.org>
 <20180401123613.GC16661@ando.pearwood.info>
Message-ID: <80b20ec4-4462-400b-a36d-832b1dce23cf@mrabarnett.plus.com>

On 2018-04-01 20:10, Terry Reedy wrote:
> On 4/1/2018 8:36 AM, Steven D'Aprano wrote:
>> On Sun, Apr 01, 2018 at 08:08:41AM -0400, Richard Damon wrote:
>>
>>> One comment about the 'combinatorial explosion' is that it sort of
>>> assumes that each individual combination case needs to be handled
>>> with distinct code.
>
>> No -- as I said in an earlier post, Terry and I (and Eric) are talking
>> about the explosion in number of prefixes, not the complexity of the
>> code.
>>
>> You are right that many of the prefixes can be handled by the same code:
>>
>>     rfd rfD rFd rFD rdf rdF rDf rDF
>>     Rfd RfD RFd RFD Rdf RdF RDf RDF
>>     frd frD fRd fRD fdr fdR fDr fDR
>>     Frd FrD FRd FRD Fdr FdR FDr FDR
>>     drf drF dRf dRF dfr dfR dFr dFR
>>     Drf DrF DRf DRF Dfr DfR DFr DFR
>>     # why did we support all these combinations? who uses them?
>>
>> presumably will all be handled by the same "raw dedent f-string" code.
>> But the parser still has to handle all those cases, and so does the
>> person reading the code.
>
> IDLE's colorizer does its parsing with a giant regex. The new prefix
> combinations would nearly double the number of alternatives in the
> regex. I am sure that this would mean more nodes in the compiled
> finite-state machine. Even though the non-re code of the colorizer
> would not change, I am pretty sure that this would mean that coloring
> takes longer. Since the colorizer is called with each keystroke*, and
> since other events can be handled between keystrokes#, colorizing time
> *could* become an issue, especially on older or slower machines than
> mine. Noticeable delays between keystroke and character appearance on
> screen are a real drag.
>
In Python 3.7 that part is now:

stringprefix = r"(?i:\br|u|f|fr|rf|b|br|rb)?"

(which looks slightly wrong to me!).
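(In isolation you can see part of the oddity: the \b binds only to the
bare r alternative, and the shorter alternatives win unless the rest
of the surrounding pattern forces backtracking:)

import re
stringprefix = re.compile(r"(?i:\br|u|f|fr|rf|b|br|rb)?")
print(stringprefix.match("rf'x'").group())  # r -- not rf
print(stringprefix.match("fr'x'").group())  # f -- not fr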
From python-ideas at mgmiller.net  Mon Apr  2 01:09:28 2018
From: python-ideas at mgmiller.net (Mike Miller)
Date: Sun, 1 Apr 2018 22:09:28 -0700
Subject: [Python-ideas] Dart like multi line strings indentation
In-Reply-To: <20180401123613.GC16661@ando.pearwood.info>
References: <20180401014805.GA16661@ando.pearwood.info>
 <3c3933bd-5d5e-6b06-d820-abbc98ffabc9@trueblade.com>
 <011e10ab-5d27-a1b1-92e7-3c509cad90b2@Damon-Family.org>
 <20180401123613.GC16661@ando.pearwood.info>
Message-ID:

On 2018-04-01 05:36, Steven D'Aprano wrote:
> You are right that many of the prefixes can be handled by the same code:
>
>     rfd rfD rFd rFD rdf rdF rDf rDF
>     Rfd RfD RFd RFD Rdf RdF RDf RDF
>     frd frD fRd fRD fdr fdR fDr fDR
>     Frd FrD FRd FRD Fdr FdR FDr FDR
>     drf drF dRf dRF dfr dfR dFr dFR
>     Drf DrF DRf DRF Dfr DfR DFr DFR
>     # why did we support all these combinations? who uses them?

In almost twenty years of using Python, I've not seen capital string
prefixes in real code, ever. Sounds like a great candidate for
deprecation?

-Mike

From ncoghlan at gmail.com  Mon Apr  2 03:20:57 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 2 Apr 2018 17:20:57 +1000
Subject: [Python-ideas] Generalized version of contextlib.closing
In-Reply-To:
References:
Message-ID:

On 26 March 2018 at 18:35, Roberto Martínez wrote:
> @contextmanager
> def calling(fn, *args, **kwargs):
>     try:
>         yield
>     finally:
>         fn(*args, **kwargs)

I'd be more amenable to a proposal along these lines (rather than
adding a parameter to closing), as it more closely resembles the way
that contextlib.ExitStack.callback already works:

    with contextlib.ExitStack() as stack:
        stack.callback(fn, *args, **kwds)
        ...

In cases where you just want to call a single operation and don't need
the extra complexity and overhead of the dynamic stack, then it would
be nice to be able to instead write:

    with contextlib.callback(fn, *args, **kwds):
        ...

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From desmoulinmichel at gmail.com  Mon Apr  2 05:53:18 2018
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Mon, 2 Apr 2018 11:53:18 +0200
Subject: [Python-ideas] Dart like multi line strings indentation
In-Reply-To:
References: <20180401014805.GA16661@ando.pearwood.info>
 <3c3933bd-5d5e-6b06-d820-abbc98ffabc9@trueblade.com>
 <011e10ab-5d27-a1b1-92e7-3c509cad90b2@Damon-Family.org>
 <20180401123613.GC16661@ando.pearwood.info>
Message-ID:

On 02/04/2018 at 07:09, Mike Miller wrote:
>
> On 2018-04-01 05:36, Steven D'Aprano wrote:
>> You are right that many of the prefixes can be handled by the same code:
>>
>>     rfd rfD rFd rFD rdf rdF rDf rDF
>>     Rfd RfD RFd RFD Rdf RdF RDf RDF
>>     frd frD fRd fRD fdr fdR fDr fDR
>>     Frd FrD FRd FRD Fdr FdR FDr FDR
>>     drf drF dRf dRF dfr dfR dFr dFR
>>     Drf DrF DRf DRF Dfr DfR DFr DFR
>>     # why did we support all these combinations? who uses them?
>
> In almost twenty years of using Python, I've not seen capital string
> prefixes in real code, ever. Sounds like a great candidate for
> deprecation?

+1

It's not like migrating would be hard: a replace is enough to fix the
rare projects doing that. And even if they missed the warning, it's a
syntax error anyway, so you will get the error as soon as you try to
run the program, not at a later point at runtime.

What about doing a poll, then suggesting a warning in 3.8, with removal
in 4.0?
From songofacandy at gmail.com  Mon Apr  2 06:39:31 2018
From: songofacandy at gmail.com (INADA Naoki)
Date: Mon, 2 Apr 2018 19:39:31 +0900
Subject: [Python-ideas] PEP draft: Unifying function/method classes
In-Reply-To: <5ABF9FD8.4030507@UGent.be>
References: <5ABF9FD8.4030507@UGent.be>
Message-ID:

Thanks for writing such a hard PEP.

At first glance, its new type hierarchy seems OK.
But I can't understand the rationale for the new flags.
And it's very difficult to estimate the runtime and maintenance cost of
the PEP without a draft implementation.

FASTCALL was introduced in a recent version, and it makes the
implementation complicated.
I'm afraid that this PEP makes it worse.
For making FASTCALL stable & public, I'm +1 if Victor and Serhiy agree.

Regards,

From gadgetsteve at live.co.uk  Mon Apr  2 08:08:47 2018
From: gadgetsteve at live.co.uk (Steve Barnes)
Date: Mon, 2 Apr 2018 12:08:47 +0000
Subject: [Python-ideas] Dart like multi line strings indentation
In-Reply-To: <20180401014805.GA16661@ando.pearwood.info>
References: <20180401014805.GA16661@ando.pearwood.info>
Message-ID:

On 01/04/2018 02:48, Steven D'Aprano wrote:
> On Sun, Apr 01, 2018 at 02:20:16AM +0100, Rob Cliffe via Python-ideas
> wrote:
>
>>> New unordered 'd' and 'D' prefixes, for 'dedent', applied to multiline
>>> strings only, would multiply the number of alternatives by about 5 and
>>> would require another rewrite of all code (Python or not) that parses
>>> Python code (such as in syntax colorizers).
>>
>> I think you're exaggerating the difficulty somewhat. Multiplying the
>> number of alternatives by 5 is not the same thing as increasing the
>> complexity of code to parse it by 5.
>
> Terry didn't say that it would increase the complexity of the code by a
> factor of five. He said it would multiply the number of alternatives by
> "about 5". There would be a significant increase in the complexity of
> the code too, but I wouldn't want to guess how much.
>
> Starting with r and f prefixes, in both upper and lower case, we have:
>
> 4 single letter prefixes
> (plus 2 more, u and U, that don't combine with others)
> 8 double letter prefixes
>
> making 14 in total. Adding one more prefix, d|D, increases it to:
>
> 6 single letter prefixes
> (plus 2 more, u and U)
> 24 double letter prefixes
> 48 triple letter prefixes
>
> making 80 prefixes in total. Terry actually underestimated the explosion
> in prefixes: it is closer to six times more than five (but who is
> counting? apart from me *wink*)
>
> [Aside: if we add a fourth, the total becomes 634 prefixes.]
>
Can I suggest, rather than another string prefix, which would require
the user to add the d flag to every string that they use, that we
consider a file-scope dedent_multiline or auto_dedent import, possibly
from __future__ or textwrap, that automatically applies the dedent
function to all multiline strings in the file?

This would reflect that, typically, a specific developer tends to want
either all or no multi-line text strings dedented. It should have
minimal impact on the language and operate at compile time, so it would
be low overhead and avoid cluttering strings up.

--
Steve (Gadget) Barnes
Any opinions in this message are my personal opinions and do not
reflect those of my employer.
From steve at pearwood.info  Mon Apr  2 09:06:24 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 2 Apr 2018 23:06:24 +1000
Subject: [Python-ideas] Dart like multi line strings indentation
In-Reply-To:
References: <20180401014805.GA16661@ando.pearwood.info>
Message-ID: <20180402130624.GH16661@ando.pearwood.info>

On Mon, Apr 02, 2018 at 12:08:47PM +0000, Steve Barnes wrote:

> This would reflect that, typically, a specific developer tends to want
> either all or no multi-line text strings dedented.

I don't know how you come to that conclusion.

I certainly would not want "All or Nothing" when it comes to dedenting
triple-quoted strings.

--
Steve

From ncoghlan at gmail.com  Mon Apr  2 10:46:01 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 3 Apr 2018 00:46:01 +1000
Subject: [Python-ideas] Dart like multi line strings indentation
In-Reply-To: <20180402130624.GH16661@ando.pearwood.info>
References: <20180401014805.GA16661@ando.pearwood.info>
 <20180402130624.GH16661@ando.pearwood.info>
Message-ID:

On 2 April 2018 at 23:06, Steven D'Aprano wrote:
> On Mon, Apr 02, 2018 at 12:08:47PM +0000, Steve Barnes wrote:
>
>> This would reflect that, typically, a specific developer tends to want
>> either all or no multi-line text strings dedented.
>
> I don't know how you come to that conclusion.
>
> I certainly would not want "All or Nothing" when it comes to dedenting
> triple-quoted strings.

If we did flip the default with a "from __future__ import auto_dedent"
though, there would be an opportunity to consider the available
approaches for *adding* indentation after the fact, such as:

    indented = textwrap.indent(text, " " * 8)

or:

    indent = " " * 8
    indented = "\n".join((indent + line if line else line)
                         for line in text.splitlines())

Adding indentation is generally easier than removing it, since you can
operate on each line in isolation, rather than having to work out the
common prefix.

To allow exact recreation of the current indented multi-line string
behaviour, we could offer an `__indent__` constant, which the compiler
replaced with the leading indent of the current code block (Note: not
necessarily the indent level of the current line).

So where today we have:

* leading indent by default
* "textwrap.dedent(text)" to strip the common leading whitespace

In an auto-dedent world, we'd have:

* the current block indent level stripped from each line after the
first in multi-line strings by default
* add it back by doing "textwrap.indent(text, __indent__)" in the same
code block

I mostly find the current behaviour irritating, and work around it by
way of module level constants, but even so, I'm still not sure it
qualifies as being annoying enough to be worth the hassle of changing
it.

One relevant point though is that passing an already dedented string
through textwrap.dedent() will be a no-op, so the compatibility cases
to worry about will be those where *all* of the leading whitespace in a
multiline string is significant, including the common prefix arising
from the code block indentation.

Cheers,
Nick.
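P.S. The round trip already works with today's tools, modulo the
proposed __indent__ constant (so the literal "    " below stands in
for it):

import textwrap

raw = """\
    line one
    line two
"""
dedented = textwrap.dedent(raw)  # strips the common 4-space prefix
# indent() skips whitespace-only lines by default, so re-adding the
# prefix recovers the original exactly:
assert textwrap.indent(dedented, "    ") == raw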
--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From J.Demeyer at UGent.be  Mon Apr  2 11:46:07 2018
From: J.Demeyer at UGent.be (Jeroen Demeyer)
Date: Mon, 2 Apr 2018 17:46:07 +0200
Subject: [Python-ideas] PEP draft: Unifying function/method classes
In-Reply-To: <63ae15c8dec5414495e3c5b18b0b424b@xmail102.UGent.be>
References: <5ABF9FD8.4030507@UGent.be>
 <63ae15c8dec5414495e3c5b18b0b424b@xmail102.UGent.be>
Message-ID: <5AC2503F.6040705@UGent.be>

On 2018-04-02 12:39, INADA Naoki wrote:
> Thanks for writing such a hard PEP.
>
> At first glance, its new type hierarchy seems OK.
> But I can't understand the rationale for the new flags.

Which flags in particular do you mean? I just pushed an update
explaining the rationale of METH_ARG0_FUNCTION:

https://github.com/jdemeyer/PEP-functions#replacing-tp_call-meth_arg0_function

> And it's very difficult to estimate the runtime and maintenance cost
> of the PEP without a draft implementation.

Runtime cost: the goal is no slowdowns at all.

Maintenance cost: IMHO, this PEP simplifies functions in CPython by
removing special classes like method_descriptor, so the effect should
only be positive.

> FASTCALL was introduced in a recent version, and it makes the
> implementation complicated.
> I'm afraid that this PEP makes it worse.

What do you mean? I am not making any changes to METH_FASTCALL. I only
mention it in my PEP to document it.

Jeroen.

From songofacandy at gmail.com  Mon Apr  2 12:22:04 2018
From: songofacandy at gmail.com (INADA Naoki)
Date: Tue, 3 Apr 2018 01:22:04 +0900
Subject: [Python-ideas] PEP draft: Unifying function/method classes
In-Reply-To: <5AC2503F.6040705@UGent.be>
References: <5ABF9FD8.4030507@UGent.be>
 <63ae15c8dec5414495e3c5b18b0b424b@xmail102.UGent.be>
 <5AC2503F.6040705@UGent.be>
Message-ID:

On Tue, Apr 3, 2018 at 12:46 AM, Jeroen Demeyer wrote:
> On 2018-04-02 12:39, INADA Naoki wrote:
>>
>> Thanks for writing such a hard PEP.
>>
>> At first glance, its new type hierarchy seems OK.
>> But I can't understand the rationale for the new flags.
>
> Which flags in particular do you mean? I just pushed an update
> explaining the rationale of METH_ARG0_FUNCTION:
>
> https://github.com/jdemeyer/PEP-functions#replacing-tp_call-meth_arg0_function
>

I meant all the new flags. Please note that most PEP readers don't read
the calling implementation every day. So it's unclear why
METH_ARG0_NO_SLICE and METH_ARG0_FUNCTION should be added.
Actual examples for the METH_USR* flags would be helpful too.

>> And it's very difficult to estimate the runtime and maintenance cost
>> of the PEP without a draft implementation.
>
> Runtime cost: the goal is no slowdowns at all.
>
Good.

> Maintenance cost: IMHO, this PEP simplifies functions in CPython by
> removing special classes like method_descriptor, so the effect should
> only be positive.
>
I can't imagine it until there is a PoC implementation.

>> FASTCALL was introduced in a recent version, and it makes the
>> implementation complicated.
>> I'm afraid that this PEP makes it worse.
>
> What do you mean? I am not making any changes to METH_FASTCALL.
>
When METH_FASTCALL was added, a lot of special casing was added to
support it. I'm afraid adding a new class means adding more special
cases.
From storchaka at gmail.com  Mon Apr 2 12:31:52 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Mon, 2 Apr 2018 19:31:52 +0300
Subject: [Python-ideas] PEP draft: Unifying function/method classes
In-Reply-To: <5ABF9FD8.4030507@UGent.be>
References: <5ABF9FD8.4030507@UGent.be>
Message-ID:

On 31.03.18 17:48, Jeroen Demeyer wrote:
> I have prepared a PEP draft for unifying function/method classes. You
> can find it at
>
> https://github.com/jdemeyer/PEP-functions
>
> This has not officially been submitted as PEP yet, I want to hear your
> comments first.

I once tried to move in this direction (unifying the set of attributes
for functions). But there are more than two kinds of function-like
objects, so I deferred this work until I have more time.

Do you have a working implementation? How much code should be modified
for passing all tests? Ideally only specific code in the inspect module
and the like should be modified.

From guido at python.org  Mon Apr 2 13:34:24 2018
From: guido at python.org (Guido van Rossum)
Date: Mon, 2 Apr 2018 10:34:24 -0700
Subject: [Python-ideas] PEP draft: Unifying function/method classes
In-Reply-To: <5ABF9FD8.4030507@UGent.be>
References: <5ABF9FD8.4030507@UGent.be>
Message-ID:

I want to support this work. I can't promise your PEP will be accepted,
but it looks like you've done your homework, and you're getting feedback
from core devs who care about this area. (One of them may end up
BDFL-delegate.)

It will be a long road to success, but I recommend that you start with a
PR to the peps repo. Merging new versions there will be easy (the
requirements for accepting edits to new PEPs in that repo are really low
-- it just needs to successfully generate HTML). Once your PEP is there
you should probably focus on an implementation.

On Sat, Mar 31, 2018 at 7:48 AM, Jeroen Demeyer wrote:

> I have prepared a PEP draft for unifying function/method classes. You can
> find it at
>
> https://github.com/jdemeyer/PEP-functions
>
> This has not officially been submitted as PEP yet, I want to hear your
> comments first.
>
>
> Thanks,
> Jeroen.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From python at mrabarnett.plus.com  Mon Apr 2 14:40:22 2018
From: python at mrabarnett.plus.com (MRAB)
Date: Mon, 2 Apr 2018 19:40:22 +0100
Subject: [Python-ideas] Dart like multi line strings identation
In-Reply-To:
References: <20180401014805.GA16661@ando.pearwood.info>
 <3c3933bd-5d5e-6b06-d820-abbc98ffabc9@trueblade.com>
 <011e10ab-5d27-a1b1-92e7-3c509cad90b2@Damon-Family.org>
 <20180401123613.GC16661@ando.pearwood.info>
Message-ID: <6948816c-5d04-8556-ad17-de1553771523@mrabarnett.plus.com>

On 2018-04-02 10:53, Michel Desmoulin wrote:
>
> On 02/04/2018 at 07:09, Mike Miller wrote:
>>
>> On 2018-04-01 05:36, Steven D'Aprano wrote:
>>> You are right that many of the prefixes can be handled by the same code:
>>>
>>>      rfd rfD rFd rFD rdf rdF rDf rDF
>>>      Rfd RfD RFd RFD Rdf RdF RDf RDF
>>>      frd frD fRd fRD fdr fdR fDr fDR
>>>      Frd FrD FRd FRD Fdr FdR FDr FDR
>>>      drf drF dRf dRF dfr dfR dFr dFR
>>>      Drf DrF DRf DRF Dfr DfR DFr DFR
>>>      # why did we support all these combinations? who uses them?
>>
>> In almost twenty years of using Python, I've not seen capital string
>> prefixes in real code, ever.  Sounds like a great candidate for
>> deprecation?
>
> +1
>
> It's not like migrating would be hard: a replace is enough to fix the
> rare projects doing that. And even if they missed the warning, it's a
> syntax error anyway, so you will get the error as soon as you try to
> run the program, not at a later point at runtime.
>
> What about doing a poll, then suggesting a warning in 3.8 and removal
> in 4.0?
>
Also, Python is case-sensitive elsewhere, so why not here too?

OTOH, it's not like it's causing a problem.
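[The table quoted above can be reproduced mechanically rather than by
hand; a small sketch, assuming a prefix is any ordering of the three
letters with each letter independently cased:

    import itertools

    combos = {"".join(cased)
              for order in itertools.permutations("rfd")
              for cased in itertools.product(*[(c, c.upper()) for c in order])}

    print(len(combos))  # 48 -- the six rows of eight in the table above
]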
From python-ideas at mgmiller.net  Mon Apr 2 16:51:50 2018
From: python-ideas at mgmiller.net (Mike Miller)
Date: Mon, 2 Apr 2018 13:51:50 -0700
Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings, take three!
In-Reply-To: <87in9d2xm3.fsf@vostro.rath.org>
References: <20180323150058.GU16661@ando.pearwood.info>
 <20180324044102.GV16661@ando.pearwood.info>
 <20180324144432.GW16661@ando.pearwood.info>
 <27fccc82-8833-d1a5-a589-8d1358a3887a@btinternet.com>
 <5AB6A081.5010503@stoneleaf.us>
 <87in9d2xm3.fsf@vostro.rath.org>
Message-ID: <2d7052b6-5912-c454-13f2-6595a32afa41@mgmiller.net>

Yes, I first came across := when learning (Turbo) Pascal in the early 90's.

However golang managed to screw it up: it only works there as a "short
declaration AND assignment" operator. You can't use it twice on the same
variable! Boggles the mind how experienced designers came up with that
one. ;-)  Maybe Algol did it that way? (before my time)

I found Pascal's syntax, := for assignment, = and <> for tests, about as
close to perfect in ease of learning/comprehension as it gets, from
someone who studied math before C anyway.

-Mike


On 2018-03-30 12:04, Nikolaus Rath wrote:

From python-ideas at mgmiller.net  Mon Apr 2 16:59:30 2018
From: python-ideas at mgmiller.net (Mike Miller)
Date: Mon, 2 Apr 2018 13:59:30 -0700
Subject: [Python-ideas] Dart like multi line strings identation
In-Reply-To: <6948816c-5d04-8556-ad17-de1553771523@mrabarnett.plus.com>
References: <20180401014805.GA16661@ando.pearwood.info>
 <3c3933bd-5d5e-6b06-d820-abbc98ffabc9@trueblade.com>
 <011e10ab-5d27-a1b1-92e7-3c509cad90b2@Damon-Family.org>
 <20180401123613.GC16661@ando.pearwood.info>
 <6948816c-5d04-8556-ad17-de1553771523@mrabarnett.plus.com>
Message-ID: <3b4177ca-4734-137f-9e47-e9b201b05d2f@mgmiller.net>

On 2018-04-02 11:40, MRAB wrote:
>
> OTOH, it's not like it's causing a problem.

Well, not a big one, but there are arguments for keeping a language as
simple as possible.

Also every time an idea comes up for a string prefix, the combinatorial
issue comes up again.  If we could factor out an unnecessary 2x it might
help there.

From guido at python.org  Mon Apr 2 18:03:31 2018
From: guido at python.org (Guido van Rossum)
Date: Mon, 2 Apr 2018 15:03:31 -0700
Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings, take three!
In-Reply-To: <2d7052b6-5912-c454-13f2-6595a32afa41@mgmiller.net>
References: <20180323150058.GU16661@ando.pearwood.info>
 <20180324044102.GV16661@ando.pearwood.info>
 <20180324144432.GW16661@ando.pearwood.info>
 <27fccc82-8833-d1a5-a589-8d1358a3887a@btinternet.com>
 <5AB6A081.5010503@stoneleaf.us>
 <87in9d2xm3.fsf@vostro.rath.org>
 <2d7052b6-5912-c454-13f2-6595a32afa41@mgmiller.net>
Message-ID:

IIRC Algol-68 (the lesser-known, more complicated version) used 'int x =
0;' to declare a constant and 'int x := 0;' to declare a variable.
And there was a lot more to it; see
https://en.wikipedia.org/wiki/ALGOL_68#mode:_Declarations. I'm guessing
Go reversed this because they want '=' to be the common assignment
(whereas in Algol-68 the common assignment was ':=').

My current thinking about Python is that if we're doing this, '=' and
':=' will mean the same thing but inside an expression you must use ':='.
Chris, Nick and I are working out some details off-list.

On Mon, Apr 2, 2018 at 1:51 PM, Mike Miller wrote:

> Yes, I first came across := when learning (Turbo) Pascal in the early 90's.
>
> However golang managed to screw it up: it only works there as a "short
> declaration AND assignment" operator. You can't use it twice on the same
> variable! Boggles the mind how experienced designers came up with that
> one. ;-)  Maybe Algol did it that way? (before my time)
>
> I found Pascal's syntax, := for assignment, = and <> for tests, about as
> close to perfect in ease of learning/comprehension as it gets, from
> someone who studied math before C anyway.
>
> -Mike
>
>
> On 2018-03-30 12:04, Nikolaus Rath wrote:
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From python-ideas at mgmiller.net  Mon Apr 2 18:23:16 2018
From: python-ideas at mgmiller.net (Mike Miller)
Date: Mon, 2 Apr 2018 15:23:16 -0700
Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings, take three!
In-Reply-To:
References: <20180323150058.GU16661@ando.pearwood.info>
 <20180324044102.GV16661@ando.pearwood.info>
 <20180324144432.GW16661@ando.pearwood.info>
 <27fccc82-8833-d1a5-a589-8d1358a3887a@btinternet.com>
 <5AB6A081.5010503@stoneleaf.us>
 <87in9d2xm3.fsf@vostro.rath.org>
 <2d7052b6-5912-c454-13f2-6595a32afa41@mgmiller.net>
Message-ID: <1d2264f2-007c-0212-f5c2-a3074a56eb7b@mgmiller.net>

Interesting, thanks.

On 2018-04-02 15:03, Guido van Rossum wrote:

From ncoghlan at gmail.com  Tue Apr 3 09:50:26 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 3 Apr 2018 23:50:26 +1000
Subject: [Python-ideas] PEP draft: Unifying function/method classes
In-Reply-To: <5ABF9FD8.4030507@UGent.be>
References: <5ABF9FD8.4030507@UGent.be>
Message-ID:

On 1 April 2018 at 00:48, Jeroen Demeyer wrote:
> I have prepared a PEP draft for unifying function/method classes. You can
> find it at
>
> https://github.com/jdemeyer/PEP-functions
>
> This has not officially been submitted as PEP yet, I want to hear your
> comments first.

I've only read the description of the proposed type hierarchy so far,
but I really like where you're heading with this.

A couple of specific naming suggestions:

* method -> bound_method

"method" is an overloaded term, and this will make it clearer that
these objects are specifically for bound methods.

* generic_function -> defined_function

"Generic function" already refers to functions decorated with
functools.singledispatch (as well as the general concept of generic
functions), so re-using it for a different purpose here would be
confusing.

I don't have a particularly great alternative name to suggest, but
"defined_function" at least takes its inspiration from the "def"
keyword, and the fact that these functions are significantly better
defined than builtin ones (from a runtime introspection perspective).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
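[Nick's point that "generic function" is already taken in the stdlib can
be seen with functools.singledispatch; a minimal sketch, where the
describe function is a made-up example:

    from functools import singledispatch

    @singledispatch
    def describe(obj):
        return "some object"

    @describe.register(int)
    def _(obj):
        return "an int"

    # This is what "generic function" already means in the stdlib:
    print(describe(42), "/", describe("hi"))  # an int / some object
]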
From nick at humrich.us  Wed Apr 4 12:25:49 2018
From: nick at humrich.us (Nick Humrich)
Date: Wed, 04 Apr 2018 16:25:49 +0000
Subject: [Python-ideas] Pypi private repo's
Message-ID:

I am sure this has been discussed before, and this might not even be the
best place for this discussion, but I just wanted to make sure this has
been thought about.

What if pypi.org supported private repos at a cost, similar to npm? This
would help support the cost of pypi, and hopefully make it better/more
reliable, thus in turn improving the python community.

If this discussion should happen somewhere else, let me know.

Nick
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tritium-list at sdamon.com  Wed Apr 4 16:55:07 2018
From: tritium-list at sdamon.com (Alex Walters)
Date: Wed, 4 Apr 2018 16:55:07 -0400
Subject: [Python-ideas] Pypi private repo's
In-Reply-To:
References:
Message-ID: <3075001d3cc57$370df900$a529eb00$@sdamon.com>

I am fairly sure if you give the PyPA that suggestion, they will just
deflate at the thought of the workload. Besides, we already offer private
repos for free, in several ways ranging from devpi to python -m
SimpleHTTPServer in a specially created directory.

From: Python-ideas On Behalf Of Nick Humrich
Sent: Wednesday, April 4, 2018 12:26 PM
To: python-ideas at python.org
Subject: [Python-ideas] Pypi private repo's

I am sure this has been discussed before, and this might not even be the
best place for this discussion, but I just wanted to make sure this has
been thought about.

What if pypi.org supported private repos at a cost, similar to npm? This
would help support the cost of pypi, and hopefully make it better/more
reliable, thus in turn improving the python community.

If this discussion should happen somewhere else, let me know.

Nick
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From peter.ed.oconnor at gmail.com  Thu Apr 5 12:52:17 2018
From: peter.ed.oconnor at gmail.com (Peter O'Connor)
Date: Thu, 5 Apr 2018 12:52:17 -0400
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
Message-ID:

Dear all,

In Python, I often find myself building lists where each element depends
on the last. This generally means making a for-loop, create an initial
list, and appending to it in the loop, or creating a generator-function.
Both of these feel more verbose than necessary.

I was thinking it would be nice to be able to encapsulate this common
type of operation into a more compact comprehension.

I propose a new "Reduce-Map" comprehension that allows us to write:

    signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
    smooth_signal = [average = (1-decay)*average + decay*x for x in signal from average=0.]

Instead of:

    def exponential_moving_average(signal: Iterable[float], decay: float, initial_value: float=0.):
        average = initial_value
        for xt in signal:
            average = (1-decay)*average + decay*xt
            yield average

    signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
    smooth_signal = list(exponential_moving_average(signal, decay=0.05))

I've created a complete proposal at:
https://github.com/petered/peps/blob/master/pep-9999.rst , (and a
pull-request) and I'd be interested to hear what people think of this
idea.

Combined with the new "last" builtin discussed in the proposal, this
would allow us to replace "reduce" with a more Pythonic
comprehension-style syntax.
- Peter -------------- next part -------------- An HTML attachment was scrubbed... URL: From rhodri at kynesim.co.uk Thu Apr 5 13:08:25 2018 From: rhodri at kynesim.co.uk (Rhodri James) Date: Thu, 5 Apr 2018 18:08:25 +0100 Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin In-Reply-To: References: Message-ID: On 05/04/18 17:52, Peter O'Connor wrote: > Dear all, > > In Python, I often find myself building lists where each element depends on > the last. This generally means making a for-loop, create an initial list, > and appending to it in the loop, or creating a generator-function. Both of > these feel more verbose than necessary. > > I was thinking it would be nice to be able to encapsulate this common type > of operation into a more compact comprehension. > > I propose a new "Reduce-Map" comprehension that allows us to write: > > signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)] > smooth_signal = [average = (1-decay)*average + decay*x for x in signal > from average=0.] Ew. This looks magic (and indeed is magic) and uses single equals inside the expression (inviting "=" vs "==" gumbies). I think you are trying to do too much in one go, and something like this is complex enough that it shouldn't be in a comprehension in the first place. > Instead of: > > def exponential_moving_average(signal: Iterable[float], decay: float, > initial_value: float=0.): > average = initial_value > for xt in signal: > average = (1-decay)*average + decay*xt > yield average > > signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)] > smooth_signal = list(exponential_moving_average(signal, decay=0.05)) Aside from unnecessarily being a generator, this reads better to me! -- Rhodri James *-* Kynesim Ltd From ethan at stoneleaf.us Thu Apr 5 13:40:21 2018 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 05 Apr 2018 10:40:21 -0700 Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin In-Reply-To: References: Message-ID: <5AC65F85.5010106@stoneleaf.us> On 04/05/2018 09:52 AM, Peter O'Connor wrote: > [snip html code snippets] Please don't use html markup. The code was very difficult to read. -- ~Ethan~ From clint.hepner at gmail.com Thu Apr 5 13:48:22 2018 From: clint.hepner at gmail.com (Clint Hepner) Date: Thu, 5 Apr 2018 13:48:22 -0400 Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin In-Reply-To: References: Message-ID: <4472AF8A-A43B-4D0D-BBC3-7F1B33A9F96A@gmail.com> > On 2018 Apr 5 , at 12:52 p, Peter O'Connor wrote: > > Dear all, > > In Python, I often find myself building lists where each element depends on the last. This generally means making a for-loop, create an initial list, and appending to it in the loop, or creating a generator-function. Both of these feel more verbose than necessary. > > I was thinking it would be nice to be able to encapsulate this common type of operation into a more compact comprehension. > > I propose a new "Reduce-Map" comprehension that allows us to write: > signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)] > smooth_signal = [average = (1-decay)*average + decay*x for x in signal from average=0.] 
> Instead of:
>
>     def exponential_moving_average(signal: Iterable[float], decay: float, initial_value: float=0.):
>         average = initial_value
>         for xt in signal:
>             average = (1-decay)*average + decay*xt
>             yield average
>
>     signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
>     smooth_signal = list(exponential_moving_average(signal, decay=0.05))
>
> I've created a complete proposal at:
> https://github.com/petered/peps/blob/master/pep-9999.rst , (and a
> pull-request) and I'd be interested to hear what people think of this idea.
>
> Combined with the new "last" builtin discussed in the proposal, this
> would allow us to replace "reduce" with a more Pythonic
> comprehension-style syntax.

See itertools.accumulate, comparing the rough implementation in the docs
to your exponential_moving_average function:

    signal = [math.sin(i*0.01) + random.normalvariate(0,0.1) for i in range(1000)]

    def compute_avg(avg, x):
        return (1 - decay)*avg + decay * x

    smooth_signal = accumulate([initial_average] + signal, compute_avg)

--
Clint

From peter.ed.oconnor at gmail.com  Thu Apr 5 17:26:03 2018
From: peter.ed.oconnor at gmail.com (Peter O'Connor)
Date: Thu, 5 Apr 2018 17:26:03 -0400
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To: <4472AF8A-A43B-4D0D-BBC3-7F1B33A9F96A@gmail.com>
References: <4472AF8A-A43B-4D0D-BBC3-7F1B33A9F96A@gmail.com>
Message-ID:

Ah, that's nice, I didn't know that itertools.accumulate now has an
optional "func" parameter. Although to get the exact same behaviour
(output the same length as input) you'd actually have to do:

    smooth_signal = itertools.islice(itertools.accumulate([initial_average] + signal, compute_avg), 1, None)

And you'd also have to use itertools.chain to concatenate the
initial_average to the rest if "signal" were a generator instead of a
list, so the fully general version would be:

    smooth_signal = itertools.islice(itertools.accumulate(itertools.chain([initial_average], signal), compute_avg), 1, None)

I find this a bit awkward, and maintain that it would be nice to have
this as a built-in language construct to do this natively. You have to
admit:

    smooth_signal = [average = (1-decay)*average + decay*x for x in signal from average=0.]

Is a lot cleaner and more intuitive than:

    def compute_avg(avg, x):
        return (1 - decay)*avg + decay * x

    smooth_signal = itertools.islice(itertools.accumulate(itertools.chain([initial_average], signal), compute_avg), 1, None)

Moreover, if added with the "last" builtin proposed in the link, it
could also kill the need for reduce, as you could instead use:

    last_smooth_signal = last(average = (1-decay)*average + decay*x for x in signal from average=0.)

On Thu, Apr 5, 2018 at 1:48 PM, Clint Hepner wrote:

>
> > On 2018 Apr 5 , at 12:52 p, Peter O'Connor wrote:
> >
> > Dear all,
> >
> > In Python, I often find myself building lists where each element depends
> > on the last. This generally means making a for-loop, create an initial
> > list, and appending to it in the loop, or creating a generator-function.
> > Both of these feel more verbose than necessary.
> >
> > I was thinking it would be nice to be able to encapsulate this common
> > type of operation into a more compact comprehension.
> >
> > I propose a new "Reduce-Map" comprehension that allows us to write:
> >     signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
> >     smooth_signal = [average = (1-decay)*average + decay*x for x in signal from average=0.]
> > Instead of: > > def exponential_moving_average(signal: Iterable[float], decay: float, > initial_value: float=0.): > > average = initial_value > > for xt in signal: > > average = (1-decay)*average + decay*xt > > yield average > > > > signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in > range(1000)] > > smooth_signal = list(exponential_moving_average(signal, decay=0.05)) > > I've created a complete proposal at: https://github.com/petered/ > peps/blob/master/pep-9999.rst , (and a pull-request) and I'd be > interested to hear what people think of this idea. > > > > Combined with the new "last" builtin discussed in the proposal, this > would allow u to replace "reduce" with a more Pythonic comprehension-style > syntax. > > > See itertools.accumulate, comparing the rough implementation in the docs > to your exponential_moving_average function: > > signal = [math.sin(i*0.01) + random.normalvariate(0,0.1) for i in > range(1000)] > > dev compute_avg(avg, x): > return (1 - decay)*avg + decay * x > > smooth_signal = accumulate([initial_average] + signal, compute_avg) > > -- > Clint -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Thu Apr 5 17:55:48 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 5 Apr 2018 22:55:48 +0100 Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin In-Reply-To: References: <4472AF8A-A43B-4D0D-BBC3-7F1B33A9F96A@gmail.com> Message-ID: On 5 April 2018 at 22:26, Peter O'Connor wrote: > I find this a bit awkward, and maintain that it would be nice to have this > as a built-in language construct to do this natively. You have to admit: > > smooth_signal = [average = (1-decay)*average + decay*x for x in signal > from average=0.] > > Is a lot cleaner and more intuitive than: > > dev compute_avg(avg, x): > return (1 - decay)*avg + decay * x > > smooth_signal = > itertools.islice(itertools.accumulate(itertools.chain([initial_average], > signal), compute_avg), 1, None) Not really, I don't... In fact, factoring out compute_avg() is the first step I'd take in converting the proposed syntax into something I'd find readable and maintainable. (It's worth remembering that when you understand the subject of the code very well, it's a lot easier to follow complex constructs, than when you're less familiar with it - and the person who's unfamiliar with it could easily be you in a few months). The string of itertools functions are *not* readable, but I'd fix that by expanding them into an explicit loop: smooth_signal = [] average = 0 for x in signal: average = compute_avg(average, x) smooth_signal.append(average) If I have that wrong, it's because I misread *both* the itertools calls *and* the proposed syntax. But I doubt anyone would claim that it's possible to misunderstand the explicit loop. > Moreover, if added with the "last" builtin proposed in the link, it could > also kill the need for reduce, as you could instead use: > > last_smooth_signal = last(average = (1-decay)*average + decay*x for x in > signal from average=0.) last_smooth_signal = 0 for x in signal: last_smooth_signal = compute_avg(last_smooth_signal, x) or functools.reduce(compute_avg, signal, 0), if you prefer reduce() - I'm not sure I do. Sorry, this example has pretty much confirmed for me that an explicit loop is *far* more readable. Paul. 
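[Since several variants are flying around in this subthread, here is a
runnable side-by-side of the two approaches under discussion; a sketch
assuming decay=0.05 and an initial average of 0. Both produce one output
per input and agree exactly:

    import itertools, math, random

    decay = 0.05

    def compute_avg(avg, x):
        return (1 - decay)*avg + decay*x

    signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]

    # Explicit loop, as in Paul's expansion:
    smooth_loop = []
    average = 0.0
    for x in signal:
        average = compute_avg(average, x)
        smooth_loop.append(average)

    # accumulate, seeded with the initial value, with the seed dropped:
    smooth_acc = list(itertools.islice(
        itertools.accumulate(itertools.chain([0.0], signal), compute_avg),
        1, None))

    assert smooth_loop == smooth_acc
]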
From peter.ed.oconnor at gmail.com Thu Apr 5 18:24:25 2018 From: peter.ed.oconnor at gmail.com (Peter O'Connor) Date: Thu, 5 Apr 2018 18:24:25 -0400 Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin In-Reply-To: References: <4472AF8A-A43B-4D0D-BBC3-7F1B33A9F96A@gmail.com> Message-ID: Well, whether you factor out the loop-function is a separate issue. Lets say we do: smooth_signal = [average = compute_avg(average, x) for x in signal from average=0] Is just as readable and maintainable as your expanded version, but saves 4 lines of code. What's not to love? On Thu, Apr 5, 2018 at 5:55 PM, Paul Moore wrote: > On 5 April 2018 at 22:26, Peter O'Connor > wrote: > > I find this a bit awkward, and maintain that it would be nice to have > this > > as a built-in language construct to do this natively. You have to admit: > > > > smooth_signal = [average = (1-decay)*average + decay*x for x in > signal > > from average=0.] > > > > Is a lot cleaner and more intuitive than: > > > > dev compute_avg(avg, x): > > return (1 - decay)*avg + decay * x > > > > smooth_signal = > > itertools.islice(itertools.accumulate(itertools.chain([initial_average], > > signal), compute_avg), 1, None) > > Not really, I don't... In fact, factoring out compute_avg() is the > first step I'd take in converting the proposed syntax into something > I'd find readable and maintainable. (It's worth remembering that when > you understand the subject of the code very well, it's a lot easier to > follow complex constructs, than when you're less familiar with it - > and the person who's unfamiliar with it could easily be you in a few > months). > > The string of itertools functions are *not* readable, but I'd fix that > by expanding them into an explicit loop: > > smooth_signal = [] > average = 0 > for x in signal: > average = compute_avg(average, x) > smooth_signal.append(average) > > If I have that wrong, it's because I misread *both* the itertools > calls *and* the proposed syntax. But I doubt anyone would claim that > it's possible to misunderstand the explicit loop. > > > Moreover, if added with the "last" builtin proposed in the link, it could > > also kill the need for reduce, as you could instead use: > > > > last_smooth_signal = last(average = (1-decay)*average + decay*x for > x in > > signal from average=0.) > > last_smooth_signal = 0 > for x in signal: > last_smooth_signal = compute_avg(last_smooth_signal, x) > > or functools.reduce(compute_avg, signal, 0), if you prefer reduce() - > I'm not sure I do. > > Sorry, this example has pretty much confirmed for me that an explicit > loop is *far* more readable. > > Paul. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Thu Apr 5 19:20:56 2018 From: mertz at gnosis.cx (David Mertz) Date: Thu, 05 Apr 2018 23:20:56 +0000 Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin In-Reply-To: References: <4472AF8A-A43B-4D0D-BBC3-7F1B33A9F96A@gmail.com> Message-ID: On Thu, Apr 5, 2018, 5:32 PM Peter O'Connor wrote: > I find this a bit awkward, and maintain that it would be nice to have this > as a built-in language construct to do this natively. You have to admit: > > smooth_signal = [average = (1-decay)*average + decay*x for x in signal > from average=0.] > > Is a lot cleaner and more intuitive than: > > dev compute_avg(avg, x): > return (1 - decay)*avg + decay * x > The proposed syntax strikes me as confusing and mysterious to do something I do only occasionally. 
In contrast, itertools.accumulate() is straightforward and far more
general. Definitely -100 on the proposal.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ethan at stoneleaf.us  Thu Apr 5 20:31:41 2018
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 05 Apr 2018 17:31:41 -0700
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To:
References: <4472AF8A-A43B-4D0D-BBC3-7F1B33A9F96A@gmail.com>
Message-ID: <5AC6BFED.40507@stoneleaf.us>

On 04/05/2018 03:24 PM, Peter O'Connor wrote:

> Well, whether you factor out the loop-function is a separate issue. Lets
> say we do:
>
>     smooth_signal = [average = compute_avg(average, x) for x in signal from average=0]
>
> Is just as readable and maintainable as your expanded version, but saves
> 4 lines of code. What's not to love?

It is not readable and it is not Python (and hopefully never will be).

-- 
~Ethan~

From steve at pearwood.info  Thu Apr 5 20:29:19 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 6 Apr 2018 10:29:19 +1000
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To:
References: <4472AF8A-A43B-4D0D-BBC3-7F1B33A9F96A@gmail.com>
Message-ID: <20180406002918.GR16661@ando.pearwood.info>

On Thu, Apr 05, 2018 at 06:24:25PM -0400, Peter O'Connor wrote:

> Well, whether you factor out the loop-function is a separate issue. Lets
> say we do:
>
>     smooth_signal = [average = compute_avg(average, x) for x in signal from
>     average=0]
>
> Is just as readable and maintainable as your expanded version, but saves 4
> lines of code. What's not to love?

Be careful about asking questions which you think are rhetorical but
aren't. I can think of at least half a dozen objections to this:

- I'd have no idea what it means without the context of reading this
  thread.

- That you call it "MapReduce" while apparently doing something
  different from what other people call MapReduce:

  https://en.wikipedia.org/wiki/MapReduce

- That it uses = as an expression, and the keyword `from` in a weird
  way that doesn't make sense to me.

- The fact that it requires new syntax, so it isn't backwards
  compatible. Even if I loved it and your proposal was accepted, I
  couldn't use it for at least two years. If I'm writing a library that
  has to work with older versions of Python, probably not for a decade.

- That there are no obvious search terms to google for if you come
  across this in code and don't know what it means ("that thing that
  looks like a list comprehension but has from in it"). (And yes,
  before you object, list comps have the same downside.)

- The fact that this uses a functional idiom in the first place, which
  many people don't like or get. Especially when they start getting
  complex.

If you haven't already done so, you ought to read the numerous threads
from last month on statement local name bindings:

https://mail.python.org/pipermail/python-ideas/2018-March/thread.html

The barrier to adding new syntax to the language is very high. I
suspect that the *only* chance you have for this sort of comprehension
will be if one of the name binding proposals is accepted. That will
give you *half* of what you want:

    [(compute_avg(average, x) as average) for x in signal]

    [(average := compute_avg(average, x)) for x in signal]

only needing a way to give it an initial value. Depending on the way
comprehensions work, this might be all you need:

    average = 0
    smooth_signal = [(average := compute_avg(average, x)) for x in signal]

assuming the := syntax is accepted.

An alternative would be to push for a variant of functools.reduce that
yields its values lazily, giving us:

    smooth_signal = list(lazy_reduce(compute_avg, signal, 0))

-- 
Steve
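[A present-day aside: PEP 572 was later accepted, and in Python 3.8 an
assignment expression inside a comprehension binds its target in the
enclosing scope, so the pattern Steven sketches now runs as-is on 3.8+;
a minimal sketch:

    decay = 0.05

    def compute_avg(avg, x):
        return (1 - decay)*avg + decay*x

    signal = [1.0, 2.0, 3.0]
    average = 0.0  # initial value fed in from the surrounding scope
    smooth_signal = [(average := compute_avg(average, x)) for x in signal]
    print(smooth_signal)  # approximately [0.05, 0.1475, 0.290125]
]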
From steve at pearwood.info  Thu Apr 5 20:37:32 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 6 Apr 2018 10:37:32 +1000
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To: <5AC6BFED.40507@stoneleaf.us>
References: <4472AF8A-A43B-4D0D-BBC3-7F1B33A9F96A@gmail.com>
 <5AC6BFED.40507@stoneleaf.us>
Message-ID: <20180406003732.GS16661@ando.pearwood.info>

On Thu, Apr 05, 2018 at 05:31:41PM -0700, Ethan Furman wrote:
> On 04/05/2018 03:24 PM, Peter O'Connor wrote:
>
> >Well, whether you factor out the loop-function is a separate issue. Lets
> >say we do:
> >
> >    smooth_signal = [average = compute_avg(average, x) for x in signal
> >    from average=0]
> >
> >Is just as readable and maintainable as your expanded version, but saves 4
> >lines of code. What's not to love?
>
> It is not readable and it is not Python (and hopefully never will be).

Be fair. Strip out the last "from average = 0" and we have little that
isn't either in Python or is currently being proposed elsewhere.
Change > the syntax for assignment within the comprehension to one of the > preferred syntax variants from last month's "Statement local name > bindings" thread, and we have something that is strongly being > considered: > > [(average := compute_avg(average, x)) for x in signal] > > [(compute_avg(average, x) as average) for x in signal] > > All we need now is a way to feed in the initial value for average. And > that could be as trival as assigning a local name for it: > > average = 0 > > before running the comprehension. That would only work if the comprehension is executed in the same context as the surrounding code, instead of (as currently) being in a nested function. Otherwise, there'd need to be an initializer inside the comprehension - but that can be done (although it won't be particularly beautiful). ChrisA From ethan at stoneleaf.us Thu Apr 5 21:06:40 2018 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 05 Apr 2018 18:06:40 -0700 Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin In-Reply-To: <20180406003732.GS16661@ando.pearwood.info> References: <4472AF8A-A43B-4D0D-BBC3-7F1B33A9F96A@gmail.com> <5AC6BFED.40507@stoneleaf.us> <20180406003732.GS16661@ando.pearwood.info> Message-ID: <5AC6C820.2090608@stoneleaf.us> On 04/05/2018 05:37 PM, Steven D'Aprano wrote: > On Thu, Apr 05, 2018 at 05:31:41PM -0700, Ethan Furman wrote: >> [snip unkind words] > > Be fair. Strip out the last "from average = 0" and we have little that > isn't either in Python or is currently being proposed elsewhere. Ugh. Thanks for reminding me, Steven. Peter, my apologies. It's been a frustrating day for me and I shouldn't have taken it out on you. -- ~Ethan~ From steve at pearwood.info Thu Apr 5 21:18:54 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 6 Apr 2018 11:18:54 +1000 Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin In-Reply-To: References: Message-ID: <20180406011854.GU16661@ando.pearwood.info> On Thu, Apr 05, 2018 at 12:52:17PM -0400, Peter O'Connor wrote: > I propose a new "Reduce-Map" comprehension that allows us to write: > > signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)] > smooth_signal = [average = (1-decay)*average + decay*x for x in signal > from average=0.] I've already commented on this proposed syntax. A few further comments below. > Instead of: > > def exponential_moving_average(signal: Iterable[float], decay: float, > initial_value: float=0.): > average = initial_value > for xt in signal: > average = (1-decay)*average + decay*xt > yield average What I like about this is that it is testable in isolation and re- usable. It can be documented, the implementation changed if needed without having to touch all the callers of that function, and the name is descriptive. (I don't understand why so many people have such an aversion to writing functions and seek to eliminate them from their code.) Here's another solution which I like, one based on what we used to call coroutines until that term was taken for async functions. So keeping in mind that this version of "coroutine" has nothing to do with async: import functools def coroutine(func): """Decorator to prime coroutines when they are initialised.""" @functools.wraps(func) def started(*args, **kwargs): cr = func(*args,**kwargs) cr.send(None) return cr return started @coroutine def exponential_moving_average(decay=0.5): """Exponentially weighted moving average (EWMA). 
Coroutine returning a moving average with exponentially decreasing
    weights. By default the decay factor is one half, which is
    equivalent to averaging each value (after the first) with the
    previous moving average:

    >>> aver = exponential_moving_average()
    >>> [aver.send(x) for x in [5, 1, 2, 4.5]]
    [5, 3.0, 2.5, 3.5]

    """
    average = (yield None)
    x = (yield average)
    while True:
        average = decay*x + (1-decay)*average
        x = (yield average)

I wish this sort of coroutine were better known and loved. You can run
more than one of them at once, you can feed values into them lazily,
they can be paused and put aside to come back to them later, and if you
want to use them eagerly, you can just drop them into a list
comprehension.

-- 
Steve

From steve at pearwood.info  Thu Apr 5 21:58:55 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 6 Apr 2018 11:58:55 +1000
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To:
References: <4472AF8A-A43B-4D0D-BBC3-7F1B33A9F96A@gmail.com>
 <5AC6BFED.40507@stoneleaf.us>
 <20180406003732.GS16661@ando.pearwood.info>
Message-ID: <20180406015855.GV16661@ando.pearwood.info>

On Fri, Apr 06, 2018 at 11:02:30AM +1000, Chris Angelico wrote:
> On Fri, Apr 6, 2018 at 10:37 AM, Steven D'Aprano wrote:
[...]
> > All we need now is a way to feed in the initial value for average. And
> > that could be as trival as assigning a local name for it:
> >
> > average = 0
> >
> > before running the comprehension.
>
> That would only work if the comprehension is executed in the same
> context as the surrounding code, instead of (as currently) being in a
> nested function. Otherwise, there'd need to be an initializer inside
> the comprehension - but that can be done (although it won't be
> particularly beautiful).

Not necessarily: we could keep the rule that comprehensions are
executed in their own scope. We just add the rule that if a name is
used as a sublocal name binding, then (and only then) it is initialised
from the surrounding scopes. If there is no such surrounding name, then
the sublocal remains uninitialised and trying to evaluate it will give
UnboundLocalError.

That's similar to how Lua works with locals/globals, and yes, I'm aware
of the irony that I'm proposing this. I don't like the way it works in
Lua where it applies *everywhere*, but I think it is justifiable and
useful if applied specifically to comprehensions.

A contrived example: suppose we want the running sum of a list, written
as a list comprehension. This runs, but doesn't do what we want:

    [((x as spam) + spam) for x in [1, 2, 3]]
    => returns [2, 4, 6]

This version fails as we try to evaluate spam before it is defined:

    [(spam + (x as spam)) for x in [1, 2, 3]]

But if spam was copied from the surrounding scope, this would work:

    spam = 0
    [(spam + (x as spam)) for x in [1, 2, 3]]
    => returns [1, 3, 5]

and of course this would allow Peter's reduce/map without the ugly and
awkward "from spam=0" initialiser syntax. (Sorry Peter.)

If you don't like that implicit copying, let's make it explicit:

    spam = 0
    [(spam + (x as nonlocal spam)) for x in [1, 2, 3]]

(Should we allow global spam as well? Works for me.)
Or if you prefer the Pascal-style assignment syntax that Guido favours:

    [(spam + (nonlocal spam := x)) for x in [1, 2, 3]]

-- 
Steve

From storchaka at gmail.com  Fri Apr 6 02:19:10 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Fri, 6 Apr 2018 09:19:10 +0300
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To:
References:
Message-ID:

On 05.04.18 19:52, Peter O'Connor wrote:
> I propose a new "Reduce-Map" comprehension that allows us to write:
>
>     signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
>     smooth_signal = [average = (1-decay)*average + decay*x for x in signal from average=0.]

Using currently supported syntax:

    smooth_signal = [average for average in [0]
                             for x in signal
                             for average in [(1-decay)*average + decay*x]]
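[Serhiy's nested-for trick is standard Python today, so it can be
checked directly; a small sketch assuming decay=0.05 and a short test
signal:

    decay = 0.05
    signal = [1.0, 2.0, 3.0, 4.0]

    # Each single-element inner loop emulates an assignment, one per x.
    smooth_signal = [average
                     for average in [0]
                     for x in signal
                     for average in [(1-decay)*average + decay*x]]

    expected = []
    average = 0
    for x in signal:
        average = (1-decay)*average + decay*x
        expected.append(average)

    assert smooth_signal == expected
]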
From ncoghlan at gmail.com  Fri Apr 6 08:34:34 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 6 Apr 2018 22:34:34 +1000
Subject: [Python-ideas] [Distutils] Pypi private repo's
In-Reply-To: <2A893F9F-C4CE-4470-BC1D-F556A7DA5253@me.com>
References: <3075001d3cc57$370df900$a529eb00$@sdamon.com>
 <2A893F9F-C4CE-4470-BC1D-F556A7DA5253@me.com>
Message-ID:

On 5 April 2018 at 07:58, Jannis Gebauer wrote:
> What if there was some kind of "blessed" entity that runs these services
> and puts the majority of the revenue into a fund that funds development
> on PyPI (maybe through the PSF)?

Having a wholly owned for-profit subsidiary that provides commercial
services as a revenue raising mechanism is certainly one way to approach
something like this without alienating sponsors or tax authorities
(although it may still alienate the vendors of now competing services).
It would require a big time commitment on the PSF side to get everything
set up though, as well as interest from key folks in joining what would
essentially be a single-language-focused start up in an already crowded
cross-language developer tools marketplace. When the PSF as a whole is
still operating with only a handful of full or part time employees, it's
far from clear that setting something like that up would be the most
effective possible use of their time and energy.

At a more basic level, that kind of arrangement technically doesn't
require anyone's blessing, it could be as straightforward as downstream
tooling vendors signing up as PSF sponsors and saying "please allocate
our sponsorship contribution to the Packaging WG's budget so that PyPI
keeps operating well and the PyPA tooling keeps improving, increasing
the level of demand for our commercial Python repository management
services".

Historically that wouldn't have helped much, since the PSF itself has
struggled with effective project management (for a variety of reasons),
but one of the things I think the success of the MOSS grant has shown is
the significant strides that the PSF has made in budget management in
recent years, such that if funding is made available, it can and will be
spent effectively.

Cheers,
Nick.

P.S. PyPA contributors are also free agents in their own right, so folks
offering Python-centric developer workflow management tools or features
may decide that it's worth their while to invest more directly in
smoothing out some of the rough edges that currently still exist.
It's a mercenary way of looking at things, but in many cases, it is
*absolutely* possible to pay for the time and attention of existing
contributors, and if you can persuade them that your proposals are
reasonable, they'll often have an easier time than most convincing other
community contributors that it's a good way to go :)

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From peter.ed.oconnor at gmail.com  Fri Apr 6 10:47:10 2018
From: peter.ed.oconnor at gmail.com (Peter O'Connor)
Date: Fri, 6 Apr 2018 10:47:10 -0400
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To:
References:
Message-ID:

Hi all, thank you for the feedback. I laughed, I cried, and I learned.

I looked over all your suggestions and recreated them here:
https://github.com/petered/peters_example_code/blob/master/peters_example_code/ways_to_skin_a_cat.py

I still favour my (y = f(y, x) for x in xs from y=initializer) syntax,
for a few reasons:

1) By adding an "initialized generator" as a special language construct,
we could add a "last" builtin (similar to "next") so that
"last(initialized_generator)" returns the initializer if the
initialized_generator yields no values (and thus replaces reduce).

2) Declaring the initial value as part of the generator lets us pass the
generator around so it can be run in other scopes without keeping the
scope it's defined in alive, and avoids awkward questions like "What if
the initializer variable in the scope that created the generator changes
after the generator is defined but before it is used?"

3) The idea that an assignment operation "a = f()" returns a value (a)
is already consistent with the "chained assignment" syntax of "b=a=f()"
(which can be thought of as "b=(a=f())"). I don't know why we feel the
need for new constructs like "(a:=f())" or "(f() as a)" when we could
just think of assignments as returning values (unless that breaks
something that I'm not aware of).

However, it looks like I'd be fighting a raging current if I were to try
and push this proposal. It's also encouraging that most of the work
would be done anyway if the ("Statement Local Name Bindings") thread
passes. So some more humble proposals would be:

1) An initializer to itertools.accumulate
functools.reduce already has an initializer, I can't see any controversy
to adding an initializer to itertools.accumulate.

2) Assignment returns a value (basically what's already in the
"Statement local name bindings" discussion)
`a=f()` returns a value of a
This would allow updating variables in a generator (I don't see the need
for ":=" or "f() as a") but that's another discussion.

Is there any interest (or disagreement) to these more humble proposals?

- Peter

On Fri, Apr 6, 2018 at 2:19 AM, Serhiy Storchaka wrote:

> On 05.04.18 19:52, Peter O'Connor wrote:
>
>> I propose a new "Reduce-Map" comprehension that allows us to write:
>>
>>     signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
>>     smooth_signal = [average = (1-decay)*average + decay*x for x in signal from average=0.]
>
> Using currently supported syntax:
>
>     smooth_signal = [average for average in [0]
>                              for x in signal
>                              for average in [(1-decay)*average + decay*x]]
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ericfahlgren at gmail.com  Fri Apr 6 10:53:53 2018
From: ericfahlgren at gmail.com (Eric Fahlgren)
Date: Fri, 6 Apr 2018 07:53:53 -0700
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To:
References:
Message-ID:

On Fri, Apr 6, 2018 at 7:47 AM, Peter O'Connor wrote:

> 3) The idea that an assignment operation "a = f()" returns a value (a)
> is already consistent with the "chained assignment" syntax of "b=a=f()"
> (which can be thought of as "b=(a=f())"). I don't know why we feel the
> need for new constructs like "(a:=f())" or "(f() as a)" when we could
> just think of assignments as returning values (unless that breaks
> something that I'm not aware of).

Consider:

    >>> if x = 1:
    >>>     print("What did I just do?")

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From peter.ed.oconnor at gmail.com  Fri Apr 6 10:58:10 2018
From: peter.ed.oconnor at gmail.com (Peter O'Connor)
Date: Fri, 6 Apr 2018 10:58:10 -0400
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To:
References:
Message-ID:

Ah, ok, I suppose that could easily lead to typo-bugs. Ok, then I agree
that "a:=f()" returning a is better.

On Fri, Apr 6, 2018 at 10:53 AM, Eric Fahlgren wrote:

> On Fri, Apr 6, 2018 at 7:47 AM, Peter O'Connor wrote:
>
>> 3) The idea that an assignment operation "a = f()" returns a value (a)
>> is already consistent with the "chained assignment" syntax of "b=a=f()"
>> (which can be thought of as "b=(a=f())"). I don't know why we feel the
>> need for new constructs like "(a:=f())" or "(f() as a)" when we could
>> just think of assignments as returning values (unless that breaks
>> something that I'm not aware of).
>
> Consider:
>
>     >>> if x = 1:
>     >>>     print("What did I just do?")

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guido at python.org  Fri Apr 6 11:06:45 2018
From: guido at python.org (Guido van Rossum)
Date: Fri, 6 Apr 2018 08:06:45 -0700
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To:
References:
Message-ID:

On Fri, Apr 6, 2018 at 7:47 AM, Peter O'Connor wrote:

> Hi all, thank you for the feedback. I laughed, I cried, and I learned.

You'll be a language designer yet. :-)

> However, it looks like I'd be fighting a raging current if I were to try
> and push this proposal. It's also encouraging that most of the work would
> be done anyway if the ("Statement Local Name Bindings") thread passes.
> So some more humble proposals would be:
>
> 1) An initializer to itertools.accumulate
> functools.reduce already has an initializer, I can't see any controversy
> to adding an initializer to itertools.accumulate

See if that's accepted in the bug tracker.

> 2) Assignment returns a value (basically what's already in the "Statement
> local name bindings" discussion)
> `a=f()` returns a value of a
> This would allow updating variables in a generator (I don't see the need
> for ":=" or "f() as a") but that's another discussion

Please join the PEP 572 discussion. The strongest contender currently is
`a := f()` and for good reasons.

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From ctaank at gmail.com Fri Apr 6 11:27:45 2018 From: ctaank at gmail.com (Cammil Taank) Date: Fri, 06 Apr 2018 15:27:45 +0000 Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin In-Reply-To: References: Message-ID: I'm not sure if my suggestion for 572 has been considered: ``name! expression`` I'm curious what the pros and cons of this form would be (?). My arguments for were in a previous message but there do not seem to be any responses to it. Cammil On Fri, 6 Apr 2018, 16:14 Guido van Rossum, wrote: > On Fri, Apr 6, 2018 at 7:47 AM, Peter O'Connor > wrote: > >> Hi all, thank you for the feedback. I laughed, I cried, and I learned. >> > > You'll be a language designer yet. :-) > > >> However, it looks like I'd be fighting a raging current if I were to try >> and push this proposal. It's also encouraging that most of the work would >> be done anyway if ("Statement Local Name Bindings") thread passes. So some >> more humble proposals would be: >> >> 1) An initializer to itertools.accumulate >> functools.reduce already has an initializer, I can't see any controversy >> to adding an initializer to itertools.accumulate >> > > See if that's accepted in the bug tracker. > > >> 2) Assignment returns a value (basically what's already in the "Statement >> local name bindings" discussion) >> `a=f()` returns a value of a >> This would allow updating variables in a generator (I don't see the need >> for ":=" or "f() as a") but that's another discussion >> > > Please join the PEP 572 discussion. The strongest contender currently is > `a := f()` and for good reasons. > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.ed.oconnor at gmail.com Fri Apr 6 12:52:27 2018 From: peter.ed.oconnor at gmail.com (Peter O'Connor) Date: Fri, 6 Apr 2018 12:52:27 -0400 Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin In-Reply-To: References: Message-ID: Seems to me it's much more obvious that "name:=expression" is assigning expression to name than "name!expression". The ! is also confusing because "!=" means "not equals", so the "!" symbol is already sort of associated with "not" On Fri, Apr 6, 2018 at 11:27 AM, Cammil Taank wrote: > I'm not sure if my suggestion for 572 has been considered: > > ``name! expression`` > > I'm curious what the pros and cons of this form would be (?). > > My arguments for were in a previous message but there do not seem to be > any responses to it. > > Cammil > > On Fri, 6 Apr 2018, 16:14 Guido van Rossum, wrote: > >> On Fri, Apr 6, 2018 at 7:47 AM, Peter O'Connor < >> peter.ed.oconnor at gmail.com> wrote: >> >>> Hi all, thank you for the feedback. I laughed, I cried, and I learned. >>> >> >> You'll be a language designer yet. :-) >> >> >>> However, it looks like I'd be fighting a raging current if I were to >>> try and push this proposal. It's also encouraging that most of the work >>> would be done anyway if ("Statement Local Name Bindings") thread passes. 
>>> So some more humble proposals would be:
>>>
>>> 1) An initializer to itertools.accumulate
>>> functools.reduce already has an initializer, I can't see any controversy
>>> to adding an initializer to itertools.accumulate
>>>
>>
>> See if that's accepted in the bug tracker.
>>
>>
>>> 2) Assignment returns a value (basically what's already in the "Statement
>>> local name bindings" discussion)
>>> `a=f()` returns a value of a
>>> This would allow updating variables in a generator (I don't see the need
>>> for ":=" or "f() as a") but that's another discussion
>>>
>>
>> Please join the PEP 572 discussion. The strongest contender currently is
>> `a := f()` and for good reasons.
>>
>> --
>> --Guido van Rossum (python.org/~guido)
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From steve at pearwood.info  Fri Apr 6 19:49:12 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 7 Apr 2018 09:49:12 +1000
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To:
References:
Message-ID: <20180406234912.GX16661@ando.pearwood.info>

On Fri, Apr 06, 2018 at 03:27:45PM +0000, Cammil Taank wrote:

> I'm not sure if my suggestion for 572 has been considered:
>
> ``name! expression``
>
> I'm curious what the pros and cons of this form would be (?).

I can't see any pros for it. In what way is ! associated with
assignment or binding? It might as well be an arbitrary symbol.

(Yes, I know that ultimately *everything* is an arbitrary symbol, but
some of them have very strong associations built on years or decades or
centuries of usage.)

As Peter says, ! is associated with negation, as in !=, and to those of
us with a maths background, name! simply *screams* "FACTORIAL" at the
top of its voice.

> My arguments for were in a previous message but there do not seem to be
> any responses to it.

Care to repeat those arguments?

-- 
Steve

From steve at pearwood.info  Fri Apr 6 19:50:49 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 7 Apr 2018 09:50:49 +1000
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To:
References:
Message-ID: <20180406235048.GY16661@ando.pearwood.info>

On Fri, Apr 06, 2018 at 08:06:45AM -0700, Guido van Rossum wrote:

> Please join the PEP 572 discussion. The strongest contender currently is
> `a := f()` and for good reasons.

Where has that discussion moved to? The threads on python-ideas seem to
have gone quiet, and the last I heard you said that you, Chris and Nick
were discussing some issues privately.

-- 
Steve

From rosuav at gmail.com  Fri Apr 6 19:54:17 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 7 Apr 2018 09:54:17 +1000
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To: <20180406235048.GY16661@ando.pearwood.info>
References: <20180406235048.GY16661@ando.pearwood.info>
Message-ID:

On Sat, Apr 7, 2018 at 9:50 AM, Steven D'Aprano wrote:
> On Fri, Apr 06, 2018 at 08:06:45AM -0700, Guido van Rossum wrote:
>
>> Please join the PEP 572 discussion. The strongest contender currently is `a
>> := f()` and for good reasons.
>
> Where has that discussion moved to?
> The threads on python-ideas seem to
> have gone quiet, and the last I heard you said that you, Chris and Nick
> were discussing some issues privately.
>

I'm still working on getting some code done, and I'm stuck due to a lack
of time on my part. It'll likely move forward this weekend, and if I can
do what I'm trying to do, I'll have a largely rewritten PEP to discuss.

(Never call ANYTHING "trivial" or "simple" unless you already know the
solution to it. Turns out that there are even more subtleties to "make
it behave like assignment" than I had thought.)

ChrisA

From raymond.hettinger at gmail.com  Fri Apr  6 21:02:05 2018
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Fri, 6 Apr 2018 18:02:05 -0700
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
Message-ID: 

> On Friday, April 6, 2018 at 8:14:30 AM UTC-7, Guido van Rossum wrote:
> On Fri, Apr 6, 2018 at 7:47 AM, Peter O'Connor wrote:
>> So some more humble proposals would be:
>>
>> 1) An initializer to itertools.accumulate
>> functools.reduce already has an initializer, I can't see any controversy
>> to adding an initializer to itertools.accumulate
>
> See if that's accepted in the bug tracker.

It did come up once but was closed for a number of reasons including
lack of use cases. However, Peter's signal processing example does sound
interesting, so we could re-open the discussion.

For those who want to think through the pluses and minuses, I've put
together a Q&A as food for thought (see below). Everybody's design
instincts are different -- I'm curious what you all think about the
proposal.

Raymond

---------------------------------------------

Q. Can it be done?
A. Yes, it wouldn't be hard.

    _sentinel = object()

    def accumulate(iterable, func=operator.add, start=_sentinel):
        it = iter(iterable)
        if start is _sentinel:
            try:
                total = next(it)
            except StopIteration:
                return
        else:
            total = start
        yield total
        for element in it:
            total = func(total, element)
            yield total

Q. Do other languages do it?
A. Numpy, no. R, no. APL, no. Mathematica, no. Haskell, yes.

    * http://docs.scipy.org/doc/numpy/reference/generated/numpy.ufunc.accumulate.html
    * https://stat.ethz.ch/R-manual/R-devel/library/base/html/cumsum.html
    * http://microapl.com/apl/apl_concepts_chapter5.html
      \+ 1 2 3 4 5
      1 3 6 10 15
    * https://reference.wolfram.com/language/ref/Accumulate.html
    * https://www.haskell.org/hoogle/?hoogle=mapAccumL

Q. How much work for a person to do it currently?
A. Almost zero effort to write a simple helper function:

    myaccum = lambda it, func, start: accumulate(chain([start], it), func)

Q. How common is the need?
A. Rare.

Q. Which would be better, a simple for-loop or a customized itertool?
A. The itertool is shorter but more opaque (especially with respect
   to the argument order for the function call):

    result = [start]
    for x in iterable:
        y = func(result[-1], x)
        result.append(y)

   versus:

    result = list(accumulate(iterable, func, start=start))

Q. How readable is the proposed code?
A. Look at the following code and ask yourself what it does:

    accumulate(range(4, 6), operator.mul, start=6)

   Now test your understanding:

    How many values are emitted?
    What is the first value emitted?
    Are the two sixes related?
    What is this code trying to accomplish?

Q. Are there potential surprises or oddities?
A. Is it readily apparent which of the assertions will succeed?
    a1 = sum(range(10))
    a2 = sum(range(10), 0)
    assert a1 == a2

    a3 = functools.reduce(operator.add, range(10))
    a4 = functools.reduce(operator.add, range(10), 0)
    assert a3 == a4

    a5 = list(accumulate(range(10), operator.add))
    a6 = list(accumulate(range(10), operator.add, start=0))
    assert a5 == a6

Q. What did the Python 3.0 Whatsnew document have to say about reduce()?
A. "Removed reduce(). Use functools.reduce() if you really need it;
   however, 99 percent of the time an explicit for loop is more readable."

Q. What would this look like in real code?
A. We have almost no real-world examples, but here is one from a
   StackExchange post:

    def wsieve():     # wheel-sieve, by Will Ness.  ideone.com/mqO25A->0hIE89
        wh11 = [ 2,4,2,4,6,2,6,4,2,4,6,6, 2,6,4,2,6,4,6,8,4,2,4,2,
                 4,8,6,4,6,2,4,6,2,6,6,4, 2,4,6,2,6,4,2,4,2,10,2,10]
        cs = accumulate(cycle(wh11), start=11)
        yield( next( cs))       # cf. ideone.com/WFv4f
        ps = wsieve()           # codereview.stackexchange.com/q/92365/9064
        p = next(ps)            # 11
        psq = p*p               # 121
        D = dict( zip( accumulate(wh11, start=0), count(0)))   # start from
        sieve = {}
        for c in cs:
            if c in sieve:
                wheel = sieve.pop(c)
                for m in wheel:
                    if not m in sieve:
                        break
                sieve[m] = wheel    # sieve[143] = wheel at 187
            elif c < psq:
                yield c
            else:       # (c==psq)
                # map (p*) (roll wh from p) = roll (wh*p) from (p*p)
                x = [p*d for d in wh11]
                i = D[ (p-11) % 210]
                wheel = accumulate(cycle(x[i:] + x[:i]), start=psq)
                p = next(ps) ; psq = p*p
                next(wheel) ; m = next(wheel)
                sieve[m] = wheel

From tim.peters at gmail.com  Sat Apr  7 00:06:30 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 6 Apr 2018 23:06:30 -0500
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: 
References: 
Message-ID: 

[Raymond Hettinger ]
> ...
> Q. How readable is the proposed code?
> A. Look at the following code and ask yourself what it does:
>
>     accumulate(range(4, 6), operator.mul, start=6)
>
>    Now test your understanding:
>
>     How many values are emitted?

3

>     What is the first value emitted?

6

>     Are the two sixes related?

No.

>     What is this code trying to accomplish?

It's quite obviously trying to bias the reader against the proposal by
presenting a senseless example ;-)

Assuming there's any real reason to write that code at all, a better
question is whether it's more comprehensible than

    accumulate(itertools.chain([6], range(4, 6)), operator.mul)

> ...
> Q. What would this look like in real code?
> A. We have almost no real-world examples, but here is one from a
>    StackExchange post:
>
>     def wsieve():     # wheel-sieve, by Will Ness. ideone.com/mqO25A->0hIE89
>     ...

By sheer coincidence, I happened to write another yesterday. This is
from a program looking for the smallest integers that yield new records
for Collatz sequence lengths. The details don't matter, except that -
like Will Ness's wheel sieve code - it needs to generate an unbounded
increasing sequence of integers with a periodic, but complicated,
sequence of deltas, starting at a more-or-less arbitrary point.

    def buildtab(SHIFT, LIMIT):
        ...
        # Candidates are of the form i*LIMIT + j, for i >= 1 and j in
        # goodix.  However, a new record can't be set for a number of
        # the form 3k+2:  that's two steps after 2k+1, so the smaller
        # 2k+1 has a glide 2 longer.  We want to arrange to never try
        # numbers of the form 3k+2 to begin with.
        base = 0
        ix2 = []
        for i in range(3):
            base += LIMIT
            for ix in goodix:
                num = base + ix
                if num % 3 != 2:
                    ix2.append(num)
        ix2.append(ix2[0] + 3 * LIMIT)
        assert len(ix2) == 2 * len(goodix) + 1
        del goodix
        deltas = tuple(ix2[i] - ix2[i-1] for i in range(1, len(ix2)))
        return tuple(result), ix2[0], deltas

A note on "complicated":  the tuple of deltas here can contain millions
of integers, and that's the smallest length at which it becomes periodic.

Later:

    def coll(SHIFT=24):
        ...
        from itertools import accumulate, chain, cycle
        ...
        LIMIT = 1 << SHIFT
        ...
        abc, first, deltas = buildtab(SHIFT, LIMIT)
        ...
        for num in accumulate(chain([first], cycle(deltas))):
            assert num % 3 != 2

As in Will's code, it would be more readable as:

        for num in accumulate(cycle(deltas), start=first):

That says what it does pretty clearly, whereas deducing the behavior
from "OK, it's chaining together a singleton list and a cycle, because
...?" is a bit of a head scratcher at first.

That said, if the need came up often, as you noted it's dead easy to
write a helper function to encapsulate the "head scratcher" part, and
with no significant loss of efficiency.

So I'd be -0 overall, _except_ that "chain together a singleton list
and a cycle" is so obscure on the face of it that I'm not sure most
programmers who wanted the functionality of `start=` would ever think
of it. I'm not sure that I would have, except that I studied Ness's
wheel sieve code a long time ago and the idea stuck. So that makes me
+0.4.

From raymond.hettinger at gmail.com  Sat Apr  7 03:44:37 2018
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sat, 7 Apr 2018 00:44:37 -0700
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: 
References: 
Message-ID: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com>

> On Apr 6, 2018, at 9:06 PM, Tim Peters wrote:
>
>> What is this code trying to accomplish?
>
> It's quite obviously trying to bias the reader against the proposal by
> presenting a senseless example ;-)

FWIW, the example was not from me. It was provided by the OP on the
tracker. I changed the start point from 10 to a 6 so it at least made
some sense as the continuation of a factorial sequence: 6 24 120

> By sheer coincidence, I happened to write another yesterday. This is
> from a program looking for the smallest integers that yield new
> records for Collatz sequence lengths.

Nice. That brings the number of real-world examples up to a total of
three (collatz, wheel sieve, and signal processing). Prior to today,
that total was only one (which was found after much digging).

> Later:
>
>     def coll(SHIFT=24):
>         ...
>         from itertools import accumulate, chain, cycle
>         ...
>         LIMIT = 1 << SHIFT
>         ...
>         abc, first, deltas = buildtab(SHIFT, LIMIT)
>         ...
>         for num in accumulate(chain([first], cycle(deltas))):
>             assert num % 3 != 2
>
> As in Will's code, it would be more readable as:
>
>     for num in accumulate(cycle(deltas), start=first):

That does read better. I am curious how you would have written it as a
plain for-loop before accumulate() was added (part of the argument
against reduce() was that a plain for-loop would be clearer 99% of the
time).

> That said, if the need came up often, as you noted it's dead easy to
> write a helper function to encapsulate the "head scratcher" part, and
> with no significant loss of efficiency.
>
> So I'd be -0 overall, _except_ that "chain together a singleton list
> and a cycle" is so obscure on the face of it that I'm not sure most
> programmers who wanted the functionality of `start=` would ever think
> of it. I'm not sure that I would have, except that I studied Ness's
> wheel sieve code a long time ago and the idea stuck. So that makes me
> +0.4.

Agreed that the "chain([x], it)" step is obscure. That's a bit of a
bummer -- one of the goals for the itertools module was to be a generic
toolkit for chopping-up, modifying, and splicing iterator streams (sort
of a CRISPR for iterators). The docs probably need another recipe to
show this pattern:

    def prepend(value, iterator):
        "prepend(1, [2, 3, 4]) -> 1 2 3 4"
        return chain([value], iterator)

Thanks for taking a look at the proposal. I was -0 when it came up once
before. Once I saw a use case pop-up on this list, I thought it might
be worth discussing again.

Raymond

From p.f.moore at gmail.com  Sat Apr  7 04:48:20 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Sat, 7 Apr 2018 09:48:20 +0100
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com>
References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com>
Message-ID: 

On 7 April 2018 at 08:44, Raymond Hettinger wrote:
> Agreed that the "chain([x], it)" step is obscure. That's a bit of a
> bummer -- one of the goals for the itertools module was to be a generic
> toolkit for chopping-up, modifying, and splicing iterator streams (sort
> of a CRISPR for iterators). The docs probably need another recipe to
> show this pattern:
>
>     def prepend(value, iterator):
>         "prepend(1, [2, 3, 4]) -> 1 2 3 4"
>         return chain([value], iterator)
>
> Thanks for taking a look at the proposal. I was -0 when it came up once
> before. Once I saw a use case pop-up on this list, I thought it might
> be worth discussing again.

I don't have much to add here - I typically agree that an explicit loop
is simpler, but my code tends not to be the sort that does this type of
operation, so my experience is either where it's not appropriate, or
where I'm unfamiliar with the algorithms, so terseness is more of a
problem to me than it would be to a domain expert.

Having said that, I find the arguments that it's easy to add and it
broadens the applicability of the function to be significant. Certainly,
writing a helper is simple, but as Tim pointed out, the trick to writing
that helper is obscure.

Also, in the light of the itertools design goal to be a toolkit for
iterators, I often find that the tools are just slightly *too* low level
for my use case - they are designed to be combined, certainly, but in
practice I find that building my own loop is often quicker than working
out how to combine them. (I don't have concrete examples, unfortunately
- this feeling comes from working back from the question of why I don't
use itertools more than I do). So I tend to favour such slight
extensions to the use cases of itertools functions.

A recipe would help, but I don't know how much use the recipes see in
practice. I see a lot of questions where "there's a recipe for that" is
the answer - indicating that people don't always spot the recipes.

So I guess I'm +0 on the change - but as a matter of principle, rather
than from a pressing need for it, so don't take that vote too seriously.
Paul

From ctaank at gmail.com  Sat Apr  7 04:54:33 2018
From: ctaank at gmail.com (Cammil Taank)
Date: Sat, 7 Apr 2018 09:54:33 +0100
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To: <20180406234912.GX16661@ando.pearwood.info>
References: <20180406234912.GX16661@ando.pearwood.info>
Message-ID: 

> Care to repeat those arguments?

Indeed.

*Minimal use of characters*

The primary benefit for me would be the minimal use of characters, which
within list comprehensions I think is not an insignificant benefit:

    stuff = [[(f(x) as y), x/y] for x in range(5)]  # seems quite syntactically busy

    stuff = [[y := f(x), x/y] for x in range(5)]  # better

    stuff = [[y! f(x), x/y] for x in range(5)]  # two fewer characters
    # (if you include the space after the identifier)

*Thoughts on odd usage of "!"*

In the English language, `!` signifies an exclamation, and I am imagining
a similar usage to that of introducing something by its name in an
energetic way. For example a boxer walking into the ring:
"Muhammed_Ali! ", "x! get_x()"

I get that `!` is associated with "not", and factorial, but I couldn't
think of another character already used that would work in this usage. I
also think `name! expression` would be hard to interpret as a comparison
or factorial.

I suppose the trade off here is efficiency vs. idiosyncrasy.

I very much appreciate this is all very tentative, but I wanted to
explain why this syntax does not sit terribly with me.

Cammil

On 7 April 2018 at 00:49, Steven D'Aprano wrote:

> On Fri, Apr 06, 2018 at 03:27:45PM +0000, Cammil Taank wrote:
> > I'm not sure if my suggestion for 572 has been considered:
> >
> > ``name! expression``
> >
> > I'm curious what the pros and cons of this form would be (?).
>
> I can't see any pros for it. In what way is ! associated with assignment
> or binding? It might as well be an arbitrary symbol.
>
> (Yes, I know that ultimately *everything* is an arbitrary symbol, but
> some of them have very strong associations built on years or decades or
> centuries of usage.)
>
> As Peter says, ! is associated with negation, as in !=, and to those of
> us with a maths background, name! simply *screams* "FACTORIAL" at the
> top of its voice.
>
> > My arguments for were in a previous message but there do not seem to be
> any
> > responses to it.
>
> Care to repeat those arguments?
>
> --
> Steve
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From yaoxiansamma at gmail.com  Sat Apr  7 13:26:20 2018
From: yaoxiansamma at gmail.com (thautwarm)
Date: Sun, 08 Apr 2018 01:26:20 +0800
Subject: [Python-ideas] Is there any idea about dictionary destructing?
Message-ID: 

We know that Python supports the destructing of iterable objects.

    m_iter = (_ for _ in range(10))
    a, *b, c = m_iter

That's pretty cool! It's really convenient when there are many corner
cases to handle with iterable collections. However, destructing in
Python could be more convenient if we supported dictionary destructing.

In my opinion, dictionary destructing is not difficult to implement and
makes the syntax more expressive.
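(For comparison, the closest approximation available today is a small
helper that walks a pattern and returns the bound names in a dict. The
`pluck` function below is only a hypothetical sketch, not an existing
API:

    def pluck(pattern, data):
        # Walk `pattern`, whose leaves are name strings, and bind each
        # leaf to the matching part of `data`.
        bound = {}
        if isinstance(pattern, dict):
            for key, sub in pattern.items():
                bound.update(pluck(sub, data[key]))
        elif isinstance(pattern, list):
            for sub, item in zip(pattern, data):
                bound.update(pluck(sub, item))
        else:
            bound[pattern] = data
        return bound

    sample = {"direct": "some data", "nested": {"lst_data": [1, 2, 3]}}
    names = pluck({"direct": "direct",
                   "nested": {"lst_data": ["a", "b", "c"]}}, sample)
    assert names == {"direct": "some data", "a": 1, "b": 2, "c": 3}

It works, but the names end up in a dict rather than as real local
variables, which is exactly what the proposed syntax would improve on.)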
A typical example is data access on nested data structures (just like
JSON); destructing a dictionary makes the logic quite clear:

    data = {
        "direct": "some data",
        "nested": {
            "lst_data": [1, 2, 3],
            "int_data": 1
        }
    }
    {
        "direct": direct,
        "nested": {
            "lst_data": [a, b, c],
        }
    } = data

Dictionary destructing might not be very well-known, but it really
helps. Operations on nested key-value collections are very frequent, and
the code for that kind of business logic is not readable enough today.
Moreover, Python is now popular in data processing, which would be
enhanced by full support for data destructing.

Here are some implementations in other languages:

Elixir, which is also a popular dynamic language nowadays:

    iex> %{} = %{:a => 1, 2 => :b}
    %{2 => :b, :a => 1}
    iex> %{:a => a} = %{:a => 1, 2 => :b}
    %{2 => :b, :a => 1}
    iex> a
    1
    iex> %{:c => c} = %{:a => 1, 2 => :b}
    ** (MatchError) no match of right hand side value: %{2 => :b, :a => 1}

And in F#, there is something similar to dictionary destructing
(actually, this destructs a `struct` instead):

    type MyRecord = { Name: string; ID: int }

    let IsMatchByName record1 (name: string) =
        match record1 with
        | { MyRecord.Name = nameFound; MyRecord.ID = _; } when nameFound = name -> true
        | _ -> false

    let recordX = { Name = "Parker"; ID = 10 }
    let isMatched1 = IsMatchByName recordX "Parker"
    let isMatched2 = IsMatchByName recordX "Hartono"

All of them partially destruct (or match) a dictionary.

thautwarm
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From nikolasrvanderhoof at gmail.com  Sat Apr  7 16:16:46 2018
From: nikolasrvanderhoof at gmail.com (Nikolas Vanderhoof)
Date: Sat, 7 Apr 2018 16:16:46 -0400
Subject: [Python-ideas] Is there any idea about dictionary destructing?
In-Reply-To: 
References: 
Message-ID: 

This would be a very handy feature, but Coconut (which is just python
with some extra functional-style features) also has support for this
kind of pattern-matching:
http://coconut-lang.org

Since Coconut will compile to Python (2 or 3) you can just write in
Coconut and use the resulting code in your Python.

Using your first example in coconut would be nearly identical, except I
believe the entire dictionary must be specified (I am not sure about
this).

    data = {
        'direct': 'some data',
        'nested': {
            'lst_data': [1, 2, 3],
            'int_data': 1
        }
    }

    {
        'direct': direct,
        'nested': {
            'lst_data': [a, b, c],
            'int_data': _
        }
    } = data

    print(direct)
    print(a)
    print(b)
    print(c)

And this should print:

On Sat, Apr 7, 2018 at 1:26 PM, thautwarm wrote:

> We know that Python supports the destructing of iterable objects.
>
> [...]

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: coconut.png
Type: image/png
Size: 7879 bytes
Desc: not available
URL:

From nikolasrvanderhoof at gmail.com  Sat Apr  7 16:19:00 2018
From: nikolasrvanderhoof at gmail.com (Nikolas Vanderhoof)
Date: Sat, 7 Apr 2018 16:19:00 -0400
Subject: [Python-ideas] Is there any idea about dictionary destructing?
In-Reply-To: 
References: 
Message-ID: 

Although that particular example once compiled to python will generate
many many lines of code:

On Sat, Apr 7, 2018 at 4:17 PM, Nikolas Vanderhoof <
nikolasrvanderhoof at gmail.com> wrote:

> And this should print:
>
> 'some data'
> 1
> 2
> 3
>
> [...]

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: coconut.png
Type: image/png
Size: 7879 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: coconut_example.png
Type: image/png
Size: 17698 bytes
Desc: not available
URL:

From nikolasrvanderhoof at gmail.com  Sat Apr  7 16:17:12 2018
From: nikolasrvanderhoof at gmail.com (Nikolas Vanderhoof)
Date: Sat, 7 Apr 2018 16:17:12 -0400
Subject: [Python-ideas] Is there any idea about dictionary destructing?
In-Reply-To: 
References: 
Message-ID: 

And this should print:

'some data'
1
2
3

On Sat, Apr 7, 2018 at 4:16 PM, Nikolas Vanderhoof <
nikolasrvanderhoof at gmail.com> wrote:

> This would be a very handy feature, but Coconut (which is just python
> with some extra functional-style features) also has support for this
> kind of pattern-matching:
> http://coconut-lang.org
>
> [...]

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: coconut.png
Type: image/png
Size: 7879 bytes
Desc: not available
URL:

From eric at trueblade.com  Sat Apr  7 17:39:09 2018
From: eric at trueblade.com (Eric V. Smith)
Date: Sat, 7 Apr 2018 17:39:09 -0400
Subject: [Python-ideas] Is there any idea about dictionary destructing?
In-Reply-To: 
References: 
Message-ID: <53a34231-2da4-8d2d-ed5a-3999c8f5781e@trueblade.com>

There was a long thread last year on this subject, titled "Dictionary
destructing and unpacking.":
https://mail.python.org/pipermail/python-ideas/2017-June/045963.html

You might want to read through it and see what ideas and problems were
raised then.
In that discussion, there's also a link to an older pattern matching
thread:
https://mail.python.org/pipermail/python-ideas/2015-April/032907.html

Eric

On 4/7/2018 1:26 PM, thautwarm wrote:
> We know that Python supports the destructing of iterable objects.
>
>     m_iter = (_ for _ in range(10))
>     a, *b, c = m_iter
>
> [...]

From tim.peters at gmail.com  Sat Apr  7 18:09:36 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 7 Apr 2018 17:09:36 -0500
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com>
References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com>
Message-ID: 

...

[Tim]
>> Later:
>>
>>     def coll(SHIFT=24):
>>         ...
>>         from itertools import accumulate, chain, cycle
>>         ...
>>         LIMIT = 1 << SHIFT
>>         ...
>>         abc, first, deltas = buildtab(SHIFT, LIMIT)
>>         ...
>>         for num in accumulate(chain([first], cycle(deltas))):
>>             assert num % 3 != 2
>>
>> As in Will's code, it would be more readable as:
>>
>>     for num in accumulate(cycle(deltas), start=first):

[Raymond]
> That does read better.
> I am curious how you would have
> written it as a plain for-loop before accumulate() was added

The original loop was quite different, a nested loop pair reflecting
directly that candidates are of the form i * LIMIT + j for i >= 1 and
j in goodix:

    for base in itertools.count(LIMIT, LIMIT):
        for ix in goodix:
            num = base + ix
            if num % 3 == 2:
                continue

It was later I noticed that, across every 3 full iterations of the
outer loop, exactly one third of the "num % 3 == 2" tests were true.
It took some thought & a bit of proof to show that all and only the
num % 3 != 2 candidates could be generated directly by the shorter
code. BTW, count(LIMIT, LIMIT) is a bit of a head-scratcher itself ;-)

Without `accumulate()`, I suppose I would have done this instead:

    num = first
    for delta in chain([0], cycle(deltas)):
        num += delta

That's worse to my eyes! The `chain()` trick is still needed, but in
this case to inject a 0 delta at the start so that `num` remains
`first` across the first iteration.

I should note that this is "a search loop" that rarely finds what it's
looking for. There are several places in the body that give up on the
current `num` and want to move on to the next candidate. So it's of
great pragmatic value that it be written in a way such that a plain
`continue` in the body does the right thing. For that reason, I would
_not_ have written it as, e.g.,

    num = first
    for delta in cycle(deltas):
        # masses of tests that may want to give up
        # early, excruciatingly nested so that "give up"
        # falls through to the end
        ...
        num += delta

> (part of the argument against reduce() was that a plain
> for-loop would be clearer 99% of the time).

Except that isn't true: 99% of `reduce()` instances were replaced by
`sum()` when the latter was introduced :-)

"Sum reduction" and "running-sum accumulation" are primitives in many
peoples' brains. In generalizing those to other dyadic operations, it's
the abstraction itself that's responsible for the loss of clarity - now
you're building a higher-order functional that's not a primitive in
anyone's brain except for Ken Iverson and Kirby Urner ;-) The rest of
us are better off seeing the moving pieces in a loop body. But that's
simply not so for addition, which is why introducing `sum()` was a
great idea.

BTW, note that `sum()` also supports an optional `start=` argument. I
expect (but don't know) that `accumulate()` is overwhelmingly used to
do running sums (the only use I've had for it), so it's a bit odd on
that count that it doesn't.

> ...
> Agreed that the "chain([x], it)" step is obscure. That's a bit of a
> bummer -- one of the goals for the itertools module was to be a generic
> toolkit for chopping-up, modifying, and splicing iterator streams (sort
> of a CRISPR for iterators).

I'd say it was overwhelmingly successful at that goal. The rub here
appears to be that `x` on its own is not a stream - it has to be
wrapped inside an iterable first to play nice with stream-based tools.
In a stream-based language (like Haskell), there's usually a "cons"
operation built in to prepend a scalar to a stream (like `x : it` in
Haskell is pretty much the same as `chain([x], it)`).

> The docs probably need another recipe to show this pattern:
>
>     def prepend(value, iterator):
>         "prepend(1, [2, 3, 4]) -> 1 2 3 4"
>         return chain([value], iterator)

+1. Whether `accumulate()` should grow a `start=` argument still seems
a distinct (albeit related) issue to me, though.
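For concreteness, the equivalence I have in mind - a quick sketch built
on the `prepend()` recipe above, with made-up numbers:

    from itertools import accumulate, chain

    def prepend(value, iterator):
        "prepend(1, [2, 3, 4]) -> 1 2 3 4"
        return chain([value], iterator)

    # What accumulate(deltas, start=first) is intended to compute,
    # spelled with today's tools:
    assert list(accumulate(prepend(11, [2, 4, 2]))) == [11, 13, 17, 19]

The proposed `start=` argument wouldn't add any new power - it would
just spare readers the head-scratcher spelling.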
From ncoghlan at gmail.com Sat Apr 7 22:02:28 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 8 Apr 2018 12:02:28 +1000 Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin In-Reply-To: <20180406235048.GY16661@ando.pearwood.info> References: <20180406235048.GY16661@ando.pearwood.info> Message-ID: On 7 April 2018 at 09:50, Steven D'Aprano wrote: > On Fri, Apr 06, 2018 at 08:06:45AM -0700, Guido van Rossum wrote: > >> Please join the PEP 572 discussion. The strongest contender currently is `a >> := f()` and for good reasons. > > Where has that discussion moved to? The threads on python-ideas seem to > have gone quiet, and the last I heard you said that you, Chris and Nick > were discussing some issues privately. Yeah, there were some intersecting questions between "What's technically feasible in CPython?" and "What stands even a remote chance of being accepted as a language change?" that Guido wanted to feed into the next iteration on the PEP, but were getting lost in the "Class scopes do what now?" subthreads on here. The next PEP update will have a lot more details on the related rationale, but the gist of what's going to change at the semantic level is: * the notion of hidden sublocal scopes is going away, so folks will need to use "del" or nested scopes to avoid unwanted name bindings at class and module scope (similar to iteration variables in for loops), but the proposed feature should be much easier to explain conceptually * comprehensions and generator expressions will switch to eagerly capturing referenced names from the scope where they're defined in order to eliminate most of their current class body scoping quirks (this does introduce some new name resolution quirks related to comprehensions-inside-regular-loops, but they'll at least be consistent across different scope types) * as a result of the name capturing change, the evaluation of the outermost expression in comprehensions and generator expressions can be moved inside the nested scope (so any name bindings there won't leak either) (At a syntactic level, the proposed spelling will also be switching to "name := expr") There will still be plenty of open design questions to discuss from that point, but it's a big enough shift from the previous draft that it makes sense to wait until Chris has a sufficiently complete reference implementation for the revised semantics to be confident that we can make things work the way the revised PEP proposes. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Apr 7 22:31:38 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 8 Apr 2018 12:31:38 +1000 Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin] In-Reply-To: References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com> Message-ID: On 8 April 2018 at 08:09, Tim Peters wrote: [Raymond wrote]: >> The docs probably need another recipe to show this pattern: >> >> def prepend(value, iterator): >> "prepend(1, [2, 3, 4]) -> 1 2 3 4" >> return chain([value], iterator) > > +1. Whether `accumulate()` should grow a `start=` argument still > seems a distinct (albeit related) issue to me, though. I didn't have a strong opinion either way until Tim mentioned sum() and then I went and checked the docs for both that and for accumulate. First sentence of the sum() docs: Sums *start* and the items of an *iterable* from left to right and returns the total. 
First sentence of the accumulate docs:

    Make an iterator that returns accumulated sums, ...

So I now think that having "start" as a parameter to one but not the
other counts as a genuine API discrepancy.

Providing start to accumulate would then mean the same thing as
providing it to sum(): it would change the basis point for the first
addition operation, but it wouldn't change the *number* of cumulative
sums produced.

By contrast, using the prepend() approach with accumulate() not only
changes the starting value, it also changes the number of cumulative
sums produced.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From tim.peters at gmail.com  Sat Apr  7 23:17:07 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 7 Apr 2018 22:17:07 -0500
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: 
References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com>
Message-ID: 

[Nick Coghlan ]
> I didn't have a strong opinion either way until Tim mentioned sum()
> and then I went and checked the docs for both that and for accumulate.
>
> First sentence of the sum() docs:
>
>     Sums *start* and the items of an *iterable* from left to right and
>     returns the total.
>
> First sentence of the accumulate docs:
>
>     Make an iterator that returns accumulated sums, ...
>
> So I now think that having "start" as a parameter to one but not the
> other counts as a genuine API discrepancy.

Genuine but minor ;-)

> Providing start to accumulate would then mean the same thing as
> providing it to sum(): it would change the basis point for the first
> addition operation, but it wouldn't change the *number* of cumulative
> sums produced.

That makes no sense to me. `sum()` with a `start` argument always
returns a single result, even if the iterable is empty.

    >>> sum([], 42)
    42

As the example shows, it's possible that `sum()` does no additions
whatsoever. It would be exceedingly bizarre if the same stuff passed to
`accumulate()` returned an empty iterator instead:

    >>> list(accumulate([], start=42))
    []

It should return [42].

It seems obvious to me that a sane implementation would maintain the
invariant:

    sum(xs, s) == list(accumulate(xs, start=s))[-1]

and there's nothing inherently special about `xs` being empty.

It seems also obviously desirable that

    accumulate(xs, start=s)

generate the same results as

    accumulate(chain([s], xs))

That's obviously desirable because it's _so_ obvious that Raymond
implicitly assumed that's how it would work in his first message ;-)

Or think of it this way: if you're adding N numbers, there are N-1
additions, and N partial sums. Whether it's `sum(xs)` or
`accumulate(xs)`, if len(xs)==K then specifying `start` too changes the
number of addends from K to K+1.

> By contrast, using the prepend() approach with accumulate() not only
> changes the starting value, it also changes the number of cumulative
> sums produced.

As it should :-)

Note that in the "real life" example code I gave, it was essential that
`accumulate()` with `start` yield the starting value first. There were
three uses in Will Ness's wheel sieve code, two of which wanted the
starting value on its own, and the last of which didn't. In that last
case, it was just a matter of doing next(wheel) on its own to discard
the (in that specific case) unwanted starting value.
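To make the intended semantics concrete, a quick sketch spelled with the
chain() workaround (the helper name here is invented for illustration;
`start=` itself is only proposed):

    from itertools import accumulate, chain

    def accumulate_start(xs, s):
        # What accumulate(xs, start=s) is intended to mean.
        return accumulate(chain([s], xs))

    assert list(accumulate_start([], 42)) == [42]
    assert list(accumulate_start([1, 2, 3], 10)) == [10, 11, 13, 16]
    # The invariant: the last accumulated value equals sum() with a start.
    assert sum([1, 2, 3], 10) == list(accumulate_start([1, 2, 3], 10))[-1]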
If you have to paste the starting value _in_ instead (when it is
wanted), then we're reintroducing a need for the "chain a singleton
list with the iterator" hack introducing `start=` is trying to
eliminate.

From ncoghlan at gmail.com  Sun Apr  8 00:14:52 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 8 Apr 2018 14:14:52 +1000
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: 
References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com>
Message-ID: 

On 8 April 2018 at 13:17, Tim Peters wrote:
> [Nick Coghlan ]
>> So I now think that having "start" as a parameter to one but not the
>> other counts as a genuine API discrepancy.
>
> Genuine but minor ;-)

Agreed :)

>> Providing start to accumulate would then mean the same thing as
>> providing it to sum(): it would change the basis point for the first
>> addition operation, but it wouldn't change the *number* of cumulative
>> sums produced.
>
> That makes no sense to me. `sum()` with a `start` argument always
> returns a single result, even if the iterable is empty.
>
>>>> sum([], 42)
> 42

Right, but if itertools.accumulate() had the semantics of starting with
a sum() over an empty iterable, then it would always start with an
initial zero.

It doesn't - it starts with "0+first_item", so the length of the output
iterator matches the number of items in the input iterable:

    >>> list(accumulate([]))
    []
    >>> list(accumulate([1, 2, 3, 4]))
    [1, 3, 6, 10]

That matches the output you'd get from a naive O(n^2) implementation of
cumulative sums:

    data = list(iterable)
    for stop in range(1, len(data) + 1):
        yield sum(data[:stop])

So if the new parameter were to be called start, then I'd expect the
semantics to be equivalent to:

    data = list(iterable)
    for stop in range(1, len(data) + 1):
        yield sum(data[:stop], start=start)

rather than the version Raymond posted at the top of the thread (where
setting start explicitly also implicitly increases the number of items
produced).

That concern mostly goes away if the new parameter is deliberately
called something *other than* "start" (e.g. "prepend=value", or
"first=value"), but it could also be addressed by offering a dedicated
"yield_start" toggle, such that the revised semantics were:

    def accumulate(iterable, func=operator.add, start=0, yield_start=False):
        it = iter(iterable)
        total = start
        if yield_start:
            yield total
        for element in it:
            total = func(total, element)
            yield total

That approach would have the advantage of making the default value of
"start" much easier to document (since it would just be zero, the same
as it is for sum()), and only the length of the input iterable and
"yield_start" would affect how many partial sums were produced.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From guido at python.org  Sun Apr  8 00:31:09 2018
From: guido at python.org (Guido van Rossum)
Date: Sat, 7 Apr 2018 21:31:09 -0700
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: 
References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com>
Message-ID: 

Given that two respected members of the community so strongly disagree
whether accumulate([], start=0) should behave like accumulate([]) or
like accumulate([0]), maybe in the end it's better not to add a start
argument. (The disagreement suggests that we can't trust users'
intuition here.)
On Sat, Apr 7, 2018 at 9:14 PM, Nick Coghlan wrote:

> Right, but if itertools.accumulate() had the semantics of starting with
> a sum() over an empty iterable, then it would always start with an
> initial zero.
>
> [...]

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tim.peters at gmail.com  Sun Apr  8 01:00:22 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 8 Apr 2018 00:00:22 -0500
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: 
References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com>
Message-ID: 

Nick, sorry, but your arguments still make little sense to me. I think
you're pushing an analogy between `sum()` details and `accumulate()`
waaaaay too far, changing a simple idea into a needlessly complicated
one.
`accumulate()` can do anything at all it wants to do with a `start`
argument (if it grows one), and a "default" of start=0 makes no sense:
unlike `sum()`, `accumulate()` is not specifically for use with numeric
values and may reject non-numeric types [from the `sum()` docs].

`accumulate()` accepts any two-argument function.

    >>> itertools.accumulate([1, 2, 3], lambda x, y: str(x) + str(y))
    >>> list(_)
    [1, '12', '123']

Arguing that it "has to do" something exactly the way `sum()` happens
to be implemented just doesn't follow - not even if they happen to give
the same name to an optional argument. If the function were named
`accumulate_sum()`, and restricted to numeric types, maybe - but it's
not.

[Nick Coghlan ]
> ...
> That concern mostly goes away if the new parameter is deliberately
> called something *other than* "start" (e.g. "prepend=value", or
> "first=value"), but it could also be addressed by offering a dedicated
> "yield_start" toggle, such that the revised semantics were:
>
>     def accumulate(iterable, func=operator.add, start=0, yield_start=False):
>         it = iter(iterable)
>         total = start
>         if yield_start:
>             yield total
>         for element in it:
>             total = func(total, element)
>             yield total
>
> That approach would have the advantage of making the default value of
> "start" much easier to document (since it would just be zero, the same
> as it is for sum()), and only the length of the input iterable and
> "yield_start" would affect how many partial sums were produced.

As above, start=0 is senseless for `accumulate` (despite that it makes
sense for `sum`). Raymond gave the obvious implementation in his
original message. If you reworked your implementation to accommodate
that NO sensible default for `start` exists except for the one Raymond
used (a unique object private to the implementation, so he knows for
sure whether or not `start` was passed), you'd end up with his
implementation ;-)

`yield_start` looks like a nuisance in any case. As already explained,
most uses want the `start` value if it's given, and in cases where it
isn't it's trivial to discard by doing `next()` once on the result. Of
course it could be added - but why bother?

From ncoghlan at gmail.com  Sun Apr  8 01:19:50 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 8 Apr 2018 15:19:50 +1000
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: 
References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com>
Message-ID: 

On 8 April 2018 at 14:31, Guido van Rossum wrote:
> Given that two respected members of the community so strongly disagree
> whether accumulate([], start=0) should behave like accumulate([]) or like
> accumulate([0]), maybe in the end it's better not to add a start argument.
> (The disagreement suggests that we can't trust users' intuition here.)

The potential ambiguity I see is created mainly by calling the proposed
parameter "start", while having it do more than just adjust the
individual partial sum calculations by adding an extra partial result
to the output series.

If it's called something else (e.g.
A name like "first_result" would also make it clearer to readers that passing that parameter has an impact on the length of the output series (since you're injecting an extra result), and also that the production of the first result skips calling func completely (as can be seen in Tim's str coercion example). So where I'd be -1 on: >>> list(accumulate(1, 2, 3)) [1, 3, 6] >>> list(accumulate(1, 2, 3, start=0)) [0, 1, 3, 6] >>> list(accumulate(1, 2, 3, start=1)) [1, 2, 4, 7] I'd be +1 on: >>> list(accumulate(1, 2, 3)) [1, 3, 6] >>> list(accumulate(1, 2, 3, first_result=0)) [0, 1, 3, 6] >>> list(accumulate(1, 2, 3, first_result=1)) [1, 2, 4, 7] Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sun Apr 8 01:26:01 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 8 Apr 2018 15:26:01 +1000 Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin] In-Reply-To: References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com> Message-ID: On 8 April 2018 at 15:00, Tim Peters wrote: > `accumulate()` accepts any two-argument function. > >>>> itertools.accumulate([1, 2, 3], lambda x, y: str(x) + str(y)) > >>>> list(_) > [1, '12', '123'] > > Arguing that it "has to do" something exactly the way `sum()` happens > to be implemented just doesn't follow - not even if they happen to > give the same name to an optional argument. If the function were > named `accumulate_sum()`, and restricted to numeric types, maybe - but > it's not. When first added in 3.2, it did have that restriction, and the default behaviour without a function argument still parallels repeated use of the sum() builtin. Extending the parallel to *also* include a parameter called "start" would create the expectation for me that that parameter would adjust the partial result calculations, without adding an extra result. Other parameter names (like the "first_result" I suggested in my reply to Guido) wouldn't have the same implications for me, so this objection is specific to the use of "start" as the parameter name, not to the entire idea. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tim.peters at gmail.com Sun Apr 8 01:26:07 2018 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 8 Apr 2018 00:26:07 -0500 Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin] In-Reply-To: References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com> Message-ID: Top-posting just to say I agree with Nick's bottom line (changing the name to `first_result=`). I remain just +0.5, although that is up a notch from yesterday's +0.4 ;-) --- nothing new below --- On Sun, Apr 8, 2018 at 12:19 AM, Nick Coghlan wrote: > On 8 April 2018 at 14:31, Guido van Rossum wrote: >> Given that two respected members of the community so strongly disagree >> whether accumulate([], start=0) should behave like accumulate([]) or like >> accumulate([0]), maybe in the end it's better not to add a start argument. >> (The disagreement suggests that we can't trust users' intuition here.) > > The potential ambiguity I see is created mainly by calling the > proposed parameter "start", while having it do more than just adjust > the individual partial sum calculations by adding an extra partial > result to the output series. > > If it's called something else (e.g. 
"first_result"), then the > potential "sum(iterable, start=start)" misinterpretation goes away, > and it can have Tim's desired effect of defining the first output > value (effectively prepending it to the input iterable, without the > boilerplate and overhead of actually doing so). > > A name like "first_result" would also make it clearer to readers that > passing that parameter has an impact on the length of the output > series (since you're injecting an extra result), and also that the > production of the first result skips calling func completely (as can > be seen in Tim's str coercion example). > > So where I'd be -1 on: > > >>> list(accumulate(1, 2, 3)) > [1, 3, 6] > >>> list(accumulate(1, 2, 3, start=0)) > [0, 1, 3, 6] > >>> list(accumulate(1, 2, 3, start=1)) > [1, 2, 4, 7] > > I'd be +1 on: > > >>> list(accumulate(1, 2, 3)) > [1, 3, 6] > >>> list(accumulate(1, 2, 3, first_result=0)) > [0, 1, 3, 6] > >>> list(accumulate(1, 2, 3, first_result=1)) > [1, 2, 4, 7] > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sun Apr 8 01:39:04 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 8 Apr 2018 15:39:04 +1000 Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin In-Reply-To: References: Message-ID: On 6 April 2018 at 02:52, Peter O'Connor wrote: > Combined with the new "last" builtin discussed in the proposal, this would > allow u to replace "reduce" with a more Pythonic comprehension-style syntax. I think this idea was overshadowed by the larger syntactic proposal in the rest of your email (I know I missed it initially and only noticed it in the thread subject line later). With the increased emphasis on iterators and generators in Python 3.x, the lack of a simple expression level equivalent to "for item in iterable: pass" is occasionally irritating, especially when demonstrating behaviour at the interactive prompt. Being able to reliably exhaust an iterator with "last(iterable)" or "itertools.last(iterable)" would be a nice reduction function to offer, in addition to our existing complement of builtin reducers like sum(), any() and all(). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tim.peters at gmail.com Sun Apr 8 01:40:10 2018 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 8 Apr 2018 00:40:10 -0500 Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin] In-Reply-To: References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com> Message-ID: Just a nit here: [Tim] >> ... >> Arguing that it "has to do" something exactly the way `sum()` happens >> to be implemented just doesn't follow - not even if they happen to >> give the same name to an optional argument. If the function were >> named `accumulate_sum()`, and restricted to numeric types, maybe - but >> it's not. [Nick] > When first added in 3.2, it did have that restriction, and the default > behaviour without a function argument still parallels repeated use of > the sum() builtin. They're not quite the same today. For example, if you have an object `x` of a custom numeric class, sum([x]) invokes x.__radd__ once (it really does do `x.__radd__(0)`), but list(itertools.accumulate([x])) just returns [x] without invoking x.__add__ or x.__radd__ at all. 
> Extending the parallel to *also* include a parameter called "start"
> would create the expectation for me that that parameter would adjust
> the partial result calculations, without adding an extra result.
>
> Other parameter names (like the "first_result" I suggested in my reply
> to Guido) wouldn't have the same implications for me, so this
> objection is specific to the use of "start" as the parameter name, not
> to the entire idea.

Yup!  That's fine by me too - and preferable even ;-)

From tim.peters at gmail.com  Sun Apr  8 03:02:58 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 8 Apr 2018 02:02:58 -0500
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: 
References: 
Message-ID: 

FYI:

[Raymond]
> ...
> Q. Do other languages do it?
> A. Numpy, no. R, no. APL, no. Mathematica, no. Haskell, yes.
>
> ...
> * https://www.haskell.org/hoogle/?hoogle=mapAccumL

Haskell has millions of functions ;-)  `mapAccumL` is a God-awful
mixture of Python's map(), reduce(), and accumulate() :-(  The
examples here should convince you it's nearly incomprehensible:

    http://zvon.org/other/haskell/Outputlist/mapAccumL_f.html

A more-than-less direct equivalent to Python's `accumulate` is
Haskell's `scanl1`:

    http://zvon.org/comp/r/ref-Haskell.html#Functions~Prelude.scanl1

That doesn't allow specifying an initial value.  But, Haskell being
Haskell, the closely related `scanl` function requires specifying an
initial value, which is also the first element of the list it returns:

    http://zvon.org/comp/r/ref-Haskell.html#Functions~Prelude.scanl

Of the two, `scanl` is more basic - indeed, the Standard Prelude
defines scanl1 by peeling off the first element of the list and
passing it as the initial value for scanl to use:

    scanl :: (a -> b -> a) -> a -> [b] -> [a]
    scanl f q xs = q : (case xs of
                          []   -> []
                          x:xs -> scanl f (f q x) xs)

    scanl1 :: (a -> a -> a) -> [a] -> [a]
    scanl1 f (x:xs) = scanl f x xs
    scanl1 _ []     = []

There are also analogous `scanr` and `scanr1` functions for doing
"right-to-left" accumulation.

So, bottom line: as usual, when looking to Haskell for prior art,
"all of the above - for a start" applies ;-)

From ncoghlan at gmail.com  Sun Apr  8 07:25:33 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 8 Apr 2018 21:25:33 +1000
Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings, take three!
In-Reply-To: 
References: 
Message-ID: 

On 23 March 2018 at 20:01, Chris Angelico wrote:
> Apologies for letting this languish; life has an annoying habit of
> getting in the way now and then.
>
> Feedback from the previous rounds has been incorporated. From here,
> the most important concern and question is: Is there any other syntax
> or related proposal that ought to be mentioned here? If this proposal
> is rejected, it should be rejected with a full set of alternatives.
I was writing a new stdlib test case today, and thinking about how I
might structure it differently in a PEP 572 world, and realised that a
situation the next version of the PEP should discuss is this one:

    # Dict display
    data = {
        key_a: 1,
        key_b: 2,
        key_c: 3,
    }

    # Set display with local name bindings
    data = {
        local_a := 1,
        local_b := 2,
        local_c := 3,
    }

    # List display with local name bindings
    data = [
        local_a := 1,
        local_b := 2,
        local_c := 3,
    ]

    # Dict display with local value name bindings
    data = {
        key_a: local_a := 1,
        key_b: local_b := 2,
        key_c: local_c := 3,
    }

    # Dict display with local key name bindings
    data = {
        local_a := key_a: 1,
        local_b := key_b: 2,
        local_c := key_c: 3,
    }

I don't think this is bad (although the interaction with dicts is a
bit odd), and I don't think it counts as a rationale either, but I do
think the fact that it becomes possible should be noted as an outcome
arising from the "No sublocal scoping" semantics.

Cheers,
Nick.

P.S. The specific test case is one where I want to test the three
different ways of spelling "the current directory" in some sys.path
manipulation code (the empty string, os.curdir, and os.getcwd()), and
it occurred to me that a version of PEP 572 that omits the sublocal
scoping concept will allow inline naming of parts of data structures
as you define them.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From steve at pearwood.info  Sun Apr  8 11:01:12 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 9 Apr 2018 01:01:12 +1000
Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings, take three!
In-Reply-To: 
References: 
Message-ID: <20180408150112.GH16661@ando.pearwood.info>

On Sun, Apr 08, 2018 at 09:25:33PM +1000, Nick Coghlan wrote:

> I was writing a new stdlib test case today, and thinking about how I
> might structure it differently in a PEP 572 world, and realised that a
> situation the next version of the PEP should discuss is this one:
>
>     # Dict display
>     data = {
>         key_a: 1,
>         key_b: 2,
>         key_c: 3,
>     }
>
>     # Set display with local name bindings
>     data = {
>         local_a := 1,
>         local_b := 2,
>         local_c := 3,
>     }

I don't understand the point of these examples. Sure, I guess they
would be legal, but unless you're actually going to use the name
bindings, what's the point in defining them?

This would make sense:

    data = {
        1,
        (spam := complex_expression),
        spam+1,
        spam*2,
    }

which I think is cleaner than the existing alternative of defining
spam outside of the set.

And for dicts:

    d = {
        'key': 'value',
        (spam := calculated_key): (eggs := calculated_value),
        spam.lower(): eggs.upper(),
    }

> I don't think this is bad (although the interaction with dicts is a
> bit odd), and I don't think it counts as a rationale either, but I do
> think the fact that it becomes possible should be noted as an outcome
> arising from the "No sublocal scoping" semantics.

If we really wanted to keep the sublocal scoping, we could make
list/set/dict displays their own scope too.

Personally, that's the only argument for sublocal scoping that I like
yet: what happens inside a display should remain inside the display,
and not leak out into the function.

So that has taken me from -1 on sublocal scoping to -0.5 if it applies
to displays.
-- 
Steve

From guido at python.org  Sun Apr  8 11:35:17 2018
From: guido at python.org (Guido van Rossum)
Date: Sun, 8 Apr 2018 08:35:17 -0700
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: 
References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com>
Message-ID: 

Well if you can get Raymond to agree on that too I suppose you can go
ahead. Personally I'm -0 but I don't really write this kind of
algorithmic code enough to know what's useful. I do think that the new
parameter name is ugly. But maybe that's the point.

On Sat, Apr 7, 2018 at 10:26 PM, Tim Peters wrote:

> Top-posting just to say I agree with Nick's bottom line (changing the
> name to `first_result=`).  I remain just +0.5, although that is up a
> notch from yesterday's +0.4 ;-)
>
> --- nothing new below ---
>
> On Sun, Apr 8, 2018 at 12:19 AM, Nick Coghlan wrote:
> > On 8 April 2018 at 14:31, Guido van Rossum wrote:
> >> Given that two respected members of the community so strongly disagree
> >> whether accumulate([], start=0) should behave like accumulate([]) or
> >> like accumulate([0]), maybe in the end it's better not to add a start
> >> argument.
> >> (The disagreement suggests that we can't trust users' intuition here.)
> >
> > The potential ambiguity I see is created mainly by calling the
> > proposed parameter "start", while having it do more than just adjust
> > the individual partial sum calculations by adding an extra partial
> > result to the output series.
> >
> > If it's called something else (e.g. "first_result"), then the
> > potential "sum(iterable, start=start)" misinterpretation goes away,
> > and it can have Tim's desired effect of defining the first output
> > value (effectively prepending it to the input iterable, without the
> > boilerplate and overhead of actually doing so).
> >
> > A name like "first_result" would also make it clearer to readers that
> > passing that parameter has an impact on the length of the output
> > series (since you're injecting an extra result), and also that the
> > production of the first result skips calling func completely (as can
> > be seen in Tim's str coercion example).
> >
> > So where I'd be -1 on:
> >
> >     >>> list(accumulate([1, 2, 3]))
> >     [1, 3, 6]
> >     >>> list(accumulate([1, 2, 3], start=0))
> >     [0, 1, 3, 6]
> >     >>> list(accumulate([1, 2, 3], start=1))
> >     [1, 2, 4, 7]
> >
> > I'd be +1 on:
> >
> >     >>> list(accumulate([1, 2, 3]))
> >     [1, 3, 6]
> >     >>> list(accumulate([1, 2, 3], first_result=0))
> >     [0, 1, 3, 6]
> >     >>> list(accumulate([1, 2, 3], first_result=1))
> >     [1, 2, 4, 7]
> >
> > Cheers,
> > Nick.
> >
> > --
> > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From guido at python.org  Sun Apr  8 12:28:30 2018
From: guido at python.org (Guido van Rossum)
Date: Sun, 8 Apr 2018 09:28:30 -0700
Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings, take three!
In-Reply-To: <20180408150112.GH16661@ando.pearwood.info>
References: <20180408150112.GH16661@ando.pearwood.info>
Message-ID: 

On Sun, Apr 8, 2018 at 8:01 AM, Steven D'Aprano wrote:

> If we really wanted to keep the sublocal scoping, we could make
> list/set/dict displays their own scope too.
>
> Personally, that's the only argument for sublocal scoping that I like
> yet: what happens inside a display should remain inside the display,
> and not leak out into the function.
>

That sounds like a reasonable proposal that we could at least
consider. But I think it will not fly. Presumably it doesn't apply to
tuple displays, because of reasonable examples like ((a := f(), a+1),
a+2), and because it would create an ugly discontinuity between
(a := f()) and (a := f(),). But then switching between [a := f(), a]
and (a := f(), a) would create a discontinuity. For comprehensions and
generator expressions there is no such discontinuity in the new
proposal, since these *already* introduce their own scope.

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tim.peters at gmail.com  Sun Apr  8 12:38:28 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 8 Apr 2018 11:38:28 -0500
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: 
References: 
Message-ID: 

Another bit of prior art:  the Python itertoolz package also supplies
`accumulate()`, with an optional `initial` argument.  I stumbled into
that when reading a Stackoverflow "how can I do Haskell's scanl in
Python?" question.

    https://toolz.readthedocs.io/en/latest/api.html#toolz.itertoolz.accumulate

On Fri, Apr 6, 2018 at 8:02 PM, Raymond Hettinger wrote:
>> On Friday, April 6, 2018 at 8:14:30 AM UTC-7, Guido van Rossum wrote:
>> On Fri, Apr 6, 2018 at 7:47 AM, Peter O'Connor wrote:
>>> So some more humble proposals would be:
>>>
>>> 1) An initializer to itertools.accumulate
>>> functools.reduce already has an initializer, I can't see any
>>> controversy to adding an initializer to itertools.accumulate
>>
>> See if that's accepted in the bug tracker.
>
> It did come up once but was closed for a number of reasons including
> lack of use cases.  However, Peter's signal processing example does
> sound interesting, so we could re-open the discussion.
>
> For those who want to think through the pluses and minuses, I've put
> together a Q&A as food for thought (see below).  Everybody's design
> instincts are different -- I'm curious what you all think about the
> proposal.
>
>
> Raymond
>
> ---------------------------------------------
>
> Q. Can it be done?
> A. Yes, it wouldn't be hard.
>
>     _sentinel = object()
>
>     def accumulate(iterable, func=operator.add, start=_sentinel):
>         it = iter(iterable)
>         if start is _sentinel:
>             try:
>                 total = next(it)
>             except StopIteration:
>                 return
>         else:
>             total = start
>         yield total
>         for element in it:
>             total = func(total, element)
>             yield total
>
> Q. Do other languages do it?
> A. Numpy, no. R, no. APL, no. Mathematica, no. Haskell, yes.
>
>     * http://docs.scipy.org/doc/numpy/reference/generated/numpy.ufunc.accumulate.html
>     * https://stat.ethz.ch/R-manual/R-devel/library/base/html/cumsum.html
>     * http://microapl.com/apl/apl_concepts_chapter5.html
>       \+ 1 2 3 4 5
>       1 3 6 10 15
>     * https://reference.wolfram.com/language/ref/Accumulate.html
>     * https://www.haskell.org/hoogle/?hoogle=mapAccumL
>
>
> Q. How much work for a person to do it currently?
> A. Almost zero effort to write a simple helper function:
>
>     myaccum = lambda it, func, start: accumulate(chain([start], it), func)
>
>
> Q. How common is the need?
> A. Rare.
>
>
> Q. Which would be better, a simple for-loop or a customized itertool?
> A. The itertool is shorter but more opaque (especially with respect
> to the argument order for the function call):
>
>     result = [start]
>     for x in iterable:
>         y = func(result[-1], x)
>         result.append(y)
>
> versus:
>
>     result = list(accumulate(iterable, func, start=start))
>
>
> Q. How readable is the proposed code?
> A. Look at the following code and ask yourself what it does:
>
>     accumulate(range(4, 6), operator.mul, start=6)
>
>    Now test your understanding:
>
>        How many values are emitted?
>        What is the first value emitted?
>        Are the two sixes related?
>        What is this code trying to accomplish?
>
>
> Q. Are there potential surprises or oddities?
> A. Is it readily apparent which of the assertions will succeed?
>
>     a1 = sum(range(10))
>     a2 = sum(range(10), 0)
>     assert a1 == a2
>
>     a3 = functools.reduce(operator.add, range(10))
>     a4 = functools.reduce(operator.add, range(10), 0)
>     assert a3 == a4
>
>     a5 = list(accumulate(range(10), operator.add))
>     a6 = list(accumulate(range(10), operator.add, start=0))
>     assert a5 == a6
>
>
> Q. What did the Python 3.0 Whatsnew document have to say about reduce()?
> A. "Removed reduce(). Use functools.reduce() if you really need it;
>    however, 99 percent of the time an explicit for loop is more readable."
>
>
> Q. What would this look like in real code?
> A. We have almost no real-world examples, but here is one from a
>    StackExchange post:
>
>     def wsieve():       # wheel-sieve, by Will Ness. ideone.com/mqO25A->0hIE89
>         wh11 = [ 2,4,2,4,6,2,6,4,2,4,6,6, 2,6,4,2,6,4,6,8,4,2,4,2,
>                  4,8,6,4,6,2,4,6,2,6,6,4, 2,4,6,2,6,4,2,4,2,10,2,10]
>         cs = accumulate(cycle(wh11), start=11)
>         yield( next( cs))       # cf. ideone.com/WFv4f
>         ps = wsieve()           # codereview.stackexchange.com/q/92365/9064
>         p = next(ps)            # 11
>         psq = p*p               # 121
>         D = dict( zip( accumulate(wh11, start=0), count(0)))   # start from
>         sieve = {}
>         for c in cs:
>             if c in sieve:
>                 wheel = sieve.pop(c)
>                 for m in wheel:
>                     if not m in sieve:
>                         break
>                 sieve[m] = wheel    # sieve[143] = wheel at 187
>             elif c < psq:
>                 yield c
>             else:          # (c==psq)
>                 # map (p*) (roll wh from p) = roll (wh*p) from (p*p)
>                 x = [p*d for d in wh11]
>                 i = D[ (p-11) % 210]
>                 wheel = accumulate(cycle(x[i:] + x[:i]), start=psq)
>                 p = next(ps) ; psq = p*p
>                 next(wheel) ; m = next(wheel)
>                 sieve[m] = wheel
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From klahnakoski at mozilla.com  Sun Apr  8 13:41:44 2018
From: klahnakoski at mozilla.com (Kyle Lahnakoski)
Date: Sun, 8 Apr 2018 13:41:44 -0400
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To: <20180406011854.GU16661@ando.pearwood.info>
References: <20180406011854.GU16661@ando.pearwood.info>
Message-ID: <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com>

On 2018-04-05 21:18, Steven D'Aprano wrote:
> (I don't understand why so many people have such an aversion to writing
> functions and seek to eliminate them from their code.)
>

I think I am one of those people that have an aversion to writing
functions!

I hope you do not mind that I attempt to explain my aversion here. I
want to clarify my thoughts on this, and maybe others will find
something useful in this explanation, maybe someone has wise words for
me. I think this is relevant to python-ideas because someone with this
aversion will make different language suggestions than those that
don't.
Here is why I have an aversion to writing functions: Every unread
function represents multiple unknowns in the code. Every function adds
to code complexity by mapping an inaccurate name to specific
functionality.

When I read code, this is what I see:

>    x = you_will_never_guess_how_corner_cases_are_handled(a, b, c)
>    y = you_dont_know_I_throw_a_BaseException_when_I_do_not_like_your_arguments(j, k, l)

Not everyone sees code this way: I see people read method calls, make
a number of wild assumptions about how those methods work, AND THEY
ARE CORRECT!  How do they do it!?  It is as if there is some unspoken
convention about how code should work that's opaque to me.

For example, before I read the docs on
itertools.accumulate(list_of_length_N, func), here are the unknowns I
see:

* Does it return N, or N-1 values?
* How are initial conditions handled?
* Must `func` perform the initialization by accepting just one
  parameter, and accumulate with more-than-one parameter?
* If `func` is a binary function, and `accumulate` returns N values,
  what's the Nth value?
* If `func` is a non-commutative binary function, what order are the
  arguments passed?
* Maybe accumulate expects func(*args)?
* Is there a window size? Is it equal to the number of arguments of
  `func`?

These are not all answered by reading the docs, they are answered by
reading the code. The code tells me the first value is a special case;
the first parameter of `func` is the accumulated `total`; `func` is
applied in order; and an iterator is returned. Despite all my
questions, notice I missed asking what `accumulate` returns! It is the
unknown unknowns that get me most.

So, `itertools.accumulate` is a kinda-inaccurate name given to a
specific functionality: Not a problem on its own, and even
delightfully useful if I need it often. What if I am in a domain where
I see `accumulate` only a few times a year? Or how about a program
that uses `accumulate` in only one place?

For me, I must (re)read the `accumulate` source (or run the caller
through the debugger) before I know what the code is doing. In these
cases I advocate for in-lining the function code to remove these
unknowns. Instead of an inaccurate name, there is explicit code. If we
are lucky, that explicit code follows idioms that make the increased
verbosity easier to read.

Consider Serhiy Storchaka's elegant solution, which I reformatted for
readability

> smooth_signal = [
>     average
>     for average in [0]
>     for x in signal
>     for average in [(1-decay)*average + decay*x]
> ]

We see the initial conditions, we see the primary function, we see how
the accumulation happens, we see the number of returned values, and we
see it's a list. It is a compact, easy read, from top to bottom. Yes,
we must know `for x in [y]` is an idiom for assignment, but we can
reuse that knowledge in all our other list comprehensions.

So, in the specific case of this Reduce-Map thread, I would advocate
using the list comprehension.

In general, all functions introduce non-trivial code debt: This debt
is worth it if the function is used enough; but, in single-use or
rare-use cases, functions can obfuscate.

Thank you for your time.
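For anyone wanting to run the comprehension above end to end, it works
as-is once `decay` and `signal` are bound; the sample values below are
arbitrary, chosen only for the illustration:

    decay = 0.5
    signal = [1.0, 2.0, 3.0]

    smooth_signal = [
        average
        for average in [0]                               # initial condition
        for x in signal                                  # main loop
        for average in [(1-decay)*average + decay*x]     # the "assignment" idiom
    ]

    print(smooth_signal)    # [0.5, 1.25, 2.125]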
From kirillbalunov at gmail.com  Sun Apr  8 15:03:41 2018
From: kirillbalunov at gmail.com (Kirill Balunov)
Date: Sun, 8 Apr 2018 22:03:41 +0300
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: 
References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com>
Message-ID: 

2018-04-08 8:19 GMT+03:00 Nick Coghlan :

> A name like "first_result" would also make it clearer to readers that
> passing that parameter has an impact on the length of the output
> series (since you're injecting an extra result), and also that the
> production of the first result skips calling func completely (as can
> be seen in Tim's str coercion example).
>
> So where I'd be -1 on:
>
>     >>> list(accumulate([1, 2, 3]))
>     [1, 3, 6]
>     >>> list(accumulate([1, 2, 3], start=0))
>     [0, 1, 3, 6]
>     >>> list(accumulate([1, 2, 3], start=1))
>     [1, 2, 4, 7]
>
> I'd be +1 on:
>
>     >>> list(accumulate([1, 2, 3]))
>     [1, 3, 6]
>     >>> list(accumulate([1, 2, 3], first_result=0))
>     [0, 1, 3, 6]
>     >>> list(accumulate([1, 2, 3], first_result=1))
>     [1, 2, 4, 7]
>

It is a fair point! But the usual way to understand how to use an
additional argument is to try it or to look at examples in the
documentation.

Concerning the relationship between `sum` and `accumulate`, I have
another point of view. If `start` means something to `sum`, there are
no grounds for believing that it should mean the same for
`accumulate`; these functions are not really comparable and are
fundamentally different. The closest friend of `sum` is
`functools.reduce`, which uses `initial` instead of `start` (the
documentation uses `initializer` and the docstring uses `initial`; as
for me, I prefer `initial`), and so there is already a discrepancy.

Having said this, I think it does not much matter whether it is named
`start`, `initial`, `first_result`, or `first`. I would prefer
`initial`, to be the same as in the `itertoolz` package. Regarding the
second point - should it yield one more element if provided - I think
everyone here agrees that yes.

With kind regards,
-gdg
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tim.peters at gmail.com  Sun Apr  8 15:22:21 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 8 Apr 2018 14:22:21 -0500
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: 
References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com>
Message-ID: 

[Guido]
> Well if you can get Raymond to agree on that too I suppose you can go ahead.
> Personally I'm -0 but I don't really write this kind of algorithmic code
> enough to know what's useful.

Actually, you do - but you don't _think_ of problems in these terms.
Neither do I.  For those who do:  consider any program that has state
and responds to inputs.  When you get a new input, the new state is a
function of the existing state and the input.  That's `accumulate`!
It generates a sequence of new states as the sequence of incrementally
updated states derived from a sequence of inputs according to the
function passed to accumulate.

In that view of the world, specifying a starting value is merely
specifying the program's initial state - and from that view of the
world, not allowing a starting value to be specified is bizarre.
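That "state machine" reading can be sketched directly; since the
argument doesn't exist yet, chain() stands in for the proposed initial
value here (an illustration only, not part of the original mail):

    from itertools import accumulate, chain

    def run(initial_state, inputs, step):
        # Emits initial_state followed by each updated state - the
        # moral equivalent of accumulate(inputs, step, initial=initial_state).
        return accumulate(chain([initial_state], inputs), step)

    # A tiny stateful "program": a running balance fed by deposits
    # and withdrawals.
    # list(run(100, [10, -30, 5], lambda balance, delta: balance + delta))
    # -> [100, 110, 80, 85]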
While Will Ness's wheel sieve program, and my Collatz
glide-record-finder, don't _derive_ from that view of the world, it
turns out that specifying (and returning) an initial value is also
useful for them, despite that they're just doing integer running sums.

A simple example from the broader view of the world is generating all
the prefixes of a list:

    >>> from itertools import *
    >>> list(accumulate(chain([[]], [8, 4, "k"]), lambda x, y: x + [y]))
    [[], [8], [8, 4], [8, 4, 'k']]

That's obviously easier to follow if written, e.g.,

    list(accumulate([8, 4, "k"], lambda x, y: x + [y], first_result=[]))

> I do think that the new parameter name ["first_result"] is ugly. But
> maybe that's the point.

I noted later that the `accumulate` in the Python itertoolz package
names its optional argument "initial".  That's not ugly, and works
just as well for me.

From adelfino at gmail.com  Sun Apr  8 17:18:15 2018
From: adelfino at gmail.com (=?UTF-8?Q?Andr=C3=A9s_Delfino?=)
Date: Sun, 8 Apr 2018 18:18:15 -0300
Subject: [Python-ideas] Accepting multiple mappings as positional arguments to create dicts
Message-ID: 

Hi!

I thought that maybe dict could accept several mappings as positional
arguments, like this:

    class Dict4(dict):
        def __init__(self, *args, **kwargs):
            if len(args) > 1:
                if not all([isinstance(arg, dict) for arg in args]):
                    raise TypeError('Dict4 expected instances of dict since '
                                    'multiple positional arguments were passed')

                temp = args[0].copy()

                for arg in args[1:]:
                    temp.update(arg)

                super().__init__(temp, **kwargs)
            else:
                super().__init__(*args, **kwargs)

AFAIK, this wouldn't create compatibility problems, since you can't
pass two positional arguments now anyways.

It would be useful to solve the "sum/union dicts" discussion, for
example:

    requests.get(url, params=dict(params, {'foo': bar}))

What are your thoughts?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From raymond.hettinger at gmail.com  Sun Apr  8 17:35:47 2018
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sun, 8 Apr 2018 14:35:47 -0700
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: 
References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com>
Message-ID: <83B52AC3-E640-484C-A892-A9724FAEBA26@gmail.com>

> On Apr 8, 2018, at 12:22 PM, Tim Peters wrote:
>
> [Guido]
>> Well if you can get Raymond to agree on that too I suppose you can go ahead.
>> Personally I'm -0 but I don't really write this kind of algorithmic code
>> enough to know what's useful.
>
> Actually, you do - but you don't _think_ of problems in these terms.
> Neither do I.  For those who do:  consider any program that has state
> and responds to inputs.  When you get a new input, the new state is a
> function of the existing state and the input.

The Bayesian world view isn't much different except they would prefer
"prior" instead of "initial" or "start" ;-)

    my_changing_beliefs = accumulate(stream_of_new_evidence, bayes_rule, prior=what_i_used_to_think)

Though the two analogies are cute, I'm not sure they tell us much.  In
running programs or Bayesian analysis, we care more about the result
than the accumulation of intermediate results.

My own experience with actually using accumulations in algorithmic
code falls neatly into two groups.
Many years ago, I used APL extensively in accounting work and my
recollection is that a part of the convenience of "\+" was that the
sequence length didn't change (so that the various data arrays could
interoperate with one another).

My other common case for accumulate() is building cumulative
probability distributions from probability mass functions (see the
code for random.choice() for example, or typical code for a K-S test).

For neither of those use case categories did I ever want an initial
value and it would have been distracting to even have had the option.
For example, when doing a discounted cash flow analysis, I was taught
to model the various flows as a single sequence of up and down arrows
rather than thinking of the initial balance as a distinct concept [1].

Because of this background, I was surprised to have the question ever
come up at all (other than the symmetry argument that sum() has
"start" so accumulate() must as well).

When writing itertools.accumulate(), I started by looking to see what
other languages had done.  Since accumulate() is primarily a numerical
tool, I expected that the experience of numeric-centric languages
would have something to teach us.  My reasoning was that if the need
hadn't arisen for APL, R, Numpy, Matlab [2], or Mathematica, perhaps
it really was just noise.

My views may be dated though.  Looking at the wheel sieve and collatz
glide record finder, I see something new, a desire to work with lazy,
potentially infinite accumulations (something that iterators do well
but almost never arises in the world of fixed-length sequences or
cumulative probability distributions).

So I had been warming up to the idea, but got concerned that Nick
could have had such a profoundly different idea about what the code
should do.  That cooled my interest a bit, especially when thinking
about two key questions, "Will it create more problems than it
solves?" and "Will anyone actually use it?".


Raymond


[1] http://www.chegg.com/homework-help/questions-and-answers/solve-present-worth-cash-flow-shown-using-three-interest-factors-10-interest-compounded-an-q878034

[2] https://www.mathworks.com/help/matlab/ref/accumarray.html

From steve.dower at python.org  Sun Apr  8 18:28:55 2018
From: steve.dower at python.org (Steve Dower)
Date: Sun, 8 Apr 2018 17:28:55 -0500
Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings, take three!
In-Reply-To: 
References: 
Message-ID: 

    # Dict display
    data = {
        key_a: local_a := 1,
        key_b: local_b := 2,
        key_c: local_c := 3,
    }

Isn't this a set display with local assignments and type annotations? :o)

(I'm -1 on all of these ideas, btw. None help readability for me, and
I read much more code than I write.)

Top-posted from my Windows phone

From: Nick Coghlan
Sent: Sunday, April 8, 2018 6:27
To: Chris Angelico
Cc: python-ideas
Subject: Re: [Python-ideas] PEP 572: Statement-Local Name Bindings, take three!

On 23 March 2018 at 20:01, Chris Angelico wrote:
> Apologies for letting this languish; life has an annoying habit of
> getting in the way now and then.
>
> Feedback from the previous rounds has been incorporated. From here,
> the most important concern and question is: Is there any other syntax
> or related proposal that ought to be mentioned here? If this proposal
> is rejected, it should be rejected with a full set of alternatives.
I was writing a new stdlib test case today, and thinking about how I
might structure it differently in a PEP 572 world, and realised that a
situation the next version of the PEP should discuss is this one:

    # Dict display
    data = {
        key_a: 1,
        key_b: 2,
        key_c: 3,
    }

    # Set display with local name bindings
    data = {
        local_a := 1,
        local_b := 2,
        local_c := 3,
    }

    # List display with local name bindings
    data = [
        local_a := 1,
        local_b := 2,
        local_c := 3,
    ]

    # Dict display with local value name bindings
    data = {
        key_a: local_a := 1,
        key_b: local_b := 2,
        key_c: local_c := 3,
    }

    # Dict display with local key name bindings
    data = {
        local_a := key_a: 1,
        local_b := key_b: 2,
        local_c := key_c: 3,
    }

I don't think this is bad (although the interaction with dicts is a
bit odd), and I don't think it counts as a rationale either, but I do
think the fact that it becomes possible should be noted as an outcome
arising from the "No sublocal scoping" semantics.

Cheers,
Nick.

P.S. The specific test case is one where I want to test the three
different ways of spelling "the current directory" in some sys.path
manipulation code (the empty string, os.curdir, and os.getcwd()), and
it occurred to me that a version of PEP 572 that omits the sublocal
scoping concept will allow inline naming of parts of data structures
as you define them.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

_______________________________________________
Python-ideas mailing list
Python-ideas at python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From greg.ewing at canterbury.ac.nz  Sun Apr  8 18:58:56 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 09 Apr 2018 10:58:56 +1200
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To: <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com>
References: <20180406011854.GU16661@ando.pearwood.info>
 <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com>
Message-ID: <5ACA9EB0.5000801@canterbury.ac.nz>

Kyle Lahnakoski wrote:

> Consider Serhiy Storchaka's elegant solution, which I reformatted for
> readability
>
>> smooth_signal = [
>>     average
>>     for average in [0]
>>     for x in signal
>>     for average in [(1-decay)*average + decay*x]
>> ]

"Elegant" isn't the word I would use, more like "clever".
Rather too clever, IMO -- it took me some head scratching
to figure out how it does what it does.

And it would have taken even more head scratching, except
there's a clue as to *what* it's supposed to be doing:
the fact that it's assigned to something called
"smooth_signal" -- one of those "inaccurate names" that
you disparage so much. :-)

-- 
Greg

From tim.peters at gmail.com  Sun Apr  8 21:43:38 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 8 Apr 2018 20:43:38 -0500
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: <83B52AC3-E640-484C-A892-A9724FAEBA26@gmail.com>
References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com>
 <83B52AC3-E640-484C-A892-A9724FAEBA26@gmail.com>
Message-ID: 
They're just intended to shake people loose from picturing nothing subtler than adding a list of 3 integers ;-) > My own experience with actually using accumulations in algorithmic > code falls neatly into two groups. Many years ago, I used APL > extensively in accounting work and my recollection is that a part > of the convenience of "\+" was that the sequence length didn't change > (so that the various data arrays could interoperate with one another). Sure. > My other common case for accumulate() is building cumulative > probability distributions from probability mass functions (see the > code for random.choice() for example, or typical code for a K-S test). So, a question: why wasn't itertools.accumulate() written to accept iterables of _only_ numeric types? Akin to `sum()`. I gather from one of Nick's messages that it was so restricted in 3.2. Then why was it generalized to allow any 2-argument function? Given that it was, `sum()` is no longer particularly relevant: the closest thing by far is now `functools.reduce()`, which does support an optional `initial` argument. Which it really should, because it's impossible for the implementation to guess a suitable starting value for an arbitrary user-supplied dyadic function. My example using accumulate() to generate list prefixes got snipped, but same thing there: it's impossible for that snippet to work unless an empty list is supplied as the starting value. And it's impossible for the accumulate() implementation to guess that. In short, for _general_ use `accumulate()` needs `initial` for exactly the same reasons `reduce()` needed it. BTW, the type signatures on the scanl (requires an initial value) and scanl1 (does not support an initial value) implementations I pasted from Haskell's Standard Prelude give a deeper reason: without an initial value, a list of values of type A can only produce another list of values of type A via scanl1. The dyadic function passed must map As to As. But with an initial value supplied of type B, scanl can transform a list of values of type A to a list of values of type B. While that may not have been obvious in the list prefix example I gave, that was at work: a list of As was transformed into a list _of_ lists of As. That's impossible for scanl1 to do, but easy for scanl. Or, in short, someone coming from a typed functional language background sees all sorts of things that rarely (if ever) come up in number-crunching languages. Their sensibilities should count too - although not much ;-) They should get _some_ extra consideration in this context, though, because `itertools` is one of the first things they dig into when they give Python a try. > For neither of those use case categories did I ever want an initial value As above, in all your related experiences "0" was a suitable base value, so you had no reason to care. > and it would have been distracting to even had the option. Distracting for how long? One second or two? ;-) > ... > Because of this background, I was surprised to have the question ever > come up at all (other than the symmetry argument that sum() has "start" > so accumulate() must as well). As above, the real parallel here is to reduce(). `sum()` became an historical red herring when `accumulate()` was generalized. With a different background, you may just as well have been surprised if the question _hadn't_ come up. 
For example, this is a standard example in the Haskell world for how
to define an infinite Fibonacci sequence with the initial two values
f0 and f1:

    fibs = f0 : scanl (+) f1 fibs

The part from `scanl` onward would be spelled in Python as

    accumulate(fibs, initial=f1)

although it requires some trickery to get the recursive reference to
work (details on request, but I'm sure you already know how to do
that).

> When writing itertools.accumulate(), I started by looking to see what
> other languages had done.  Since accumulate() is primarily a
> numerical tool, I expected that the experience of numeric-centric
> languages would have something to teach us.  My reasoning was
> that if the need hadn't arisen for APL, R, Numpy, Matlab, or
> Mathematica, perhaps it really was just noise.

In the itertools context, I also would have looked hard at Haskell
experience.

BTW, whoever wrote the current `accumulate()` docs also found a use
for `initial`, but hacked around it:

    """
    First-order recurrence relations can be modeled by supplying the
    initial value in the iterable and using only the accumulated total
    in func argument:
    """

followed by:

    >>> logistic_map = lambda x, _:  r * x * (1 - x)
    >>> r = 3.8
    >>> x0 = 0.4
    >>> inputs = repeat(x0, 36)     # only the initial value is used
    >>> [format(x, '.2f') for x in accumulate(inputs, logistic_map)]

That would be better on several counts to my eyes as:

    inputs = repeat(None, 35) # no values actually used
    ... for x in accumulate(inputs, logistic_map, initial=x0)

In particular, filling inputs with `None` would lead to an exception
if the programmer screwed up and the input values actually _were_
being used.  I expect we'll both overlook that writing a generator
using the obvious loop would be a lot clearer regardless ;-)

> My views may be dated though.  Looking at the wheel sieve and collatz
> glide record finder, I see something new, a desire to work with lazy,
> potentially infinite accumulations (something that iterators do well
> but almost never arises in the world of fixed-length sequences or
> cumulative probability distributions).

Amazingly enough, those are both just doing integer running sums - all
the stuff above about catering to general types doesn't apply in these
cases.  `initial` isn't needed for _correctness_ in these cases, it
would just add convenience and some clarity.

> So I had been warming up to the idea, but got concerned that Nick
> could have had such a profoundly different idea about what the code
> should do.

He appeared to withdraw his objections after agreement _not_ to name
the argument "start" was reached.  Something about `sum()` also having
an optional argument named "start" caused confusion.

> That cooled my interest a bit, especially when thinking about two
> key questions, "Will it create more problems than it solves?"

Like?  It's an optional argument.  One brief example in the docs is
all it takes to understand what it does.

    >>> list(accumulate([1, 2, 3]))
    [11, 13, 16]
    >>> list(accumulate([1, 2, 3], initial=10))
    [10, 11, 13, 16]

> and "Will anyone actually use it?".

As above, the docs could change to use it.  And I bet the test suite
too.  How much more could you want from a feature?!
;-)

From tim.peters at gmail.com  Sun Apr  8 21:48:36 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 8 Apr 2018 20:48:36 -0500
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: 
References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com>
 <83B52AC3-E640-484C-A892-A9724FAEBA26@gmail.com>
Message-ID: 

>     >>> list(accumulate([1, 2, 3]))
>     [11, 13, 16]

Wow!  I would have sworn that said

    [1, 3, 6]

when I sent it.  Damn Gmail ;-)

>     >>> list(accumulate([1, 2, 3], initial=10))
>     [10, 11, 13, 16]

From greg.ewing at canterbury.ac.nz  Sun Apr  8 19:49:57 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 09 Apr 2018 11:49:57 +1200
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: <83B52AC3-E640-484C-A892-A9724FAEBA26@gmail.com>
References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com>
 <83B52AC3-E640-484C-A892-A9724FAEBA26@gmail.com>
Message-ID: <5ACAAAA5.20004@canterbury.ac.nz>

Raymond Hettinger wrote:
> For neither of those use case categories did I ever want an initial value and
> it would have been distracting to even have had the option. For example, when
> doing a discounted cash flow analysis, I was taught to model the various
> flows as a single sequence of up and down arrows rather than thinking of the
> initial balance as a distinct concept [1]

There's always an initial value, even if it's implicit.

The way accumulate() works can be thought of in two ways:

(1) The initial value is implicitly the identity of whatever
operation the function performs.

(2) The first item in the list is the initial value, and the
rest are items to be accumulated.

Both of these are somewhat dodgy, IMO. The first one works only
if the assumed identity is what you actually want, *and* there
is always at least one item to accumulate.

If those conditions don't hold, you need to insert the initial
value as the first item. But this is almost certainly going to
require extra code. The initial value and the items are
conceptually different things, and are unlikely to start out
in the same list together.

What's more, the first thing the implementation of accumulate()
does is extract the first item and treat it differently from
the rest. So your code goes out of its way to insert the initial
value, and then accumulate() goes out of its way to pull it out
again. Something smells wrong about that.

As an example, suppose you have a list [1, 2, 3] and you want
to construct [], [1], [1, 2], [1, 2, 3]. To do that with
accumulate() you need to write something like:

    accumulate([[], 1, 2, 3], lambda x, y: x + [y])

The fact that the first element of the list doesn't even have
the same *type* as the rest should be a strong hint that
forcing them to occupy the same list is an unnatural thing
to do.

-- 
Greg

>
> Because of this background, I was surprised to have the question ever come up
> at all (other than the symmetry argument that sum() has "start" so
> accumulate() must as well).
>
> When writing itertools.accumulate(), I started by looking to see what other
> languages had done. Since accumulate() is primarily a numerical tool, I
> expected that the experience of numeric-centric languages would have
> something to teach us. My reasoning was that if the need hadn't arisen for
> APL, R, Numpy, Matlab [2], or Mathematica, perhaps it really was just noise.
>
> My views may be dated though.
> Looking at the wheel sieve and collatz glide
> record finder, I see something new, a desire to work with lazy, potentially
> infinite accumulations (something that iterators do well but almost never
> arises in the world of fixed-length sequences or cumulative probability
> distributions).
>
> So I had been warming up to the idea, but got concerned that Nick could have
> had such a profoundly different idea about what the code should do. That
> cooled my interest a bit, especially when thinking about two key questions,
> "Will it create more problems than it solves?" and "Will anyone actually use
> it?".
>
>
> Raymond
>
>
> [1] http://www.chegg.com/homework-help/questions-and-answers/solve-present-worth-cash-flow-shown-using-three-interest-factors-10-interest-compounded-an-q878034
>
> [2] https://www.mathworks.com/help/matlab/ref/accumarray.html

_______________________________________________
Python-ideas mailing list
Python-ideas at python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

From raymond.hettinger at gmail.com  Mon Apr  9 00:38:00 2018
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sun, 8 Apr 2018 21:38:00 -0700
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: 
References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com>
 <83B52AC3-E640-484C-A892-A9724FAEBA26@gmail.com>
Message-ID: 

> On Apr 8, 2018, at 6:43 PM, Tim Peters wrote:
>
>> My other common case for accumulate() is building cumulative
>> probability distributions from probability mass functions (see the
>> code for random.choice() for example, or typical code for a K-S test).
>
> So, a question: why wasn't itertools.accumulate() written to accept
> iterables of _only_ numeric types?  Akin to `sum()`.  I gather from
> one of Nick's messages that it was so restricted in 3.2.  Then why was
> it generalized to allow any 2-argument function?

Prior to 3.2, accumulate() was in the recipes section as pure Python
code.  It had no particular restriction to numeric types.

I received a number of requests for accumulate() to be promoted to a
real itertool (fast, tested, documented C code with a stable API).  I
agreed and accumulate() was added to itertools in 3.2.  It worked with
anything supporting __add__, including str, bytes, lists, and tuples.
More specifically, accumulate_next() called PyNumber_Add() without any
particular type restriction.

Subsequently, I got requests to generalize accumulate() to support any
arity-2 function (with operator.mul offered as the motivating
example).  Given that there were user requests and there were ample
precedents in other languages, I acquiesced despite having some
reservations (if used with a lambda, the function call overhead might
make accumulate() slower than a plain Python for-loop without the
function call).  So, that generalized API extension went into 3.3 and
has remained unchanged ever since.

Afterwards, I was greeted with the sound of crickets.  Either it was
nearly perfect or no one cared or both ;-)  It remains one of the
least used itertools.
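That progression - repeated addition by default, any arity-2 function
from 3.3 on - is easy to see at the interactive prompt (a small
illustration, not part of the original mail):

    >>> from itertools import accumulate
    >>> import operator
    >>> list(accumulate([1, 2, 3, 4]))                 # default: repeated addition
    [1, 3, 6, 10]
    >>> list(accumulate(['a', 'b', 'c']))              # anything supporting __add__
    ['a', 'ab', 'abc']
    >>> list(accumulate([1, 2, 3, 4], operator.mul))   # 3.3+: any two-argument function
    [1, 2, 6, 24]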
> > My example using accumulate() to generate list prefixes got snipped, > but same thing there: it's impossible for that snippet to work unless > an empty list is supplied as the starting value. And it's impossible > for the accumulate() implementation to guess that. Honestly, I couldn't immediately tell what this code was doing: list(accumulate([8, 4, "k"], lambda x, y: x + [y], first_result=[])) This may be a case where a person would be better-off without accumulate() at all. > In short, for _general_ use `accumulate()` needs `initial` for exactly > the same reasons `reduce()` needed it. The reduce() function had been much derided, so I've had it mentally filed in the anti-pattern category. But yes, there may be wisdom there. > BTW, the type signatures on the scanl (requires an initial value) and > scanl1 (does not support an initial value) implementations I pasted > from Haskell's Standard Prelude give a deeper reason: without an > initial value, a list of values of type A can only produce another > list of values of type A via scanl1. The dyadic function passed must > map As to As. But with an initial value supplied of type B, scanl can > transform a list of values of type A to a list of values of type B. > While that may not have been obvious in the list prefix example I > gave, that was at work: a list of As was transformed into a list _of_ > lists of As. That's impossible for scanl1 to do, but easy for scanl. Thanks for pointing that out. I hadn't considered that someone might want to transform one type into another using accumulate(). That is pretty far from my mental model of what accumulate() was intended for. Also, I'm still not sure whether we would want code like that buried in an accumulate() call rather than as a regular for-loop where I can see the logic and trace through it with pdb. As for scanl, I'm not sure what this code means without seeing some python equivalent. scanl :: (a -> b -> a) -> a -> [b] -> [a] scanl f q xs = q : (case xs of [] -> [] x:xs -> scanl f (f q x) xs) scanl1 :: (a -> a -> a) -> [a] -> [a] scanl1 f (x:xs) = scanl f x xs scanl1 _ [] = [] > Or, in short, someone coming from a typed functional language > background sees all sorts of things that rarely (if ever) come up in > number-crunching languages. Their sensibilities should count too - > although not much ;-) They should get _some_ extra consideration in > this context, though, because `itertools` is one of the first things > they dig into when they give Python a try. I concur. >> and it would have been distracting to even had the option. > > Distracting for how long? One second or two? ;-) Possibly forever. In my experience, if a person initially frames a problem wrong (or perhaps in a hard to solve way), it can take them a long time to recover. For example with discounted cash flows, people who think of the initial value as being special or distinct from the other cash flows will have a hard time adapting to problem variants like annuity due, balloon payments, internal rate of return, coupon stripping, valuing a transaction that takes place in the future, etc. I don't want to overstate the case, but I do think a function signature that offers a "first_value" option is an invitation to treat the first value as being distinct from the rest of the data stream. > With a different background, you may just as well have been surprised > if the question _hadn't_ come up. 
> For example, this is a standard example in the Haskell world for how
> to define an infinite Fibonacci sequence with the initial two values
> f0 and f1:
>
>     fibs = f0 : scanl (+) f1 fibs
>
> The part from `scanl` onward would be spelled in Python as
>
>     accumulate(fibs, initial=f1)
>
> although it requires some trickery to get the recursive reference to
> work (details on request, but I'm sure you already know how to do
> that).

Do we want the tool to encourage such trickery?

Don't get me wrong, I think it is cool that you could write such code,
but could and should aren't always the same.

>> When writing itertools.accumulate(), I started by looking to see what
>> other languages had done.  Since accumulate() is primarily a
>> numerical tool, I expected that the experience of numeric-centric
>> languages would have something to teach us.  My reasoning was
>> that if the need hadn't arisen for APL, R, Numpy, Matlab, or
>> Mathematica, perhaps it really was just noise.
>
> In the itertools context, I also would have looked hard at Haskell
> experience.

Haskell probably is a good source of inspiration, but I don't know the
language and find its docs to be inscrutable.  So, I have to just
trust when you say something like,

    """
    Haskell has millions of functions ;-)  `mapAccumL` is a God-awful
    mixture of Python's map(), reduce(), and accumulate() :-(  The
    examples here should convince you it's nearly incomprehensible:
    """

In fact, yes, you've convinced me that there is an intelligibility
issue ;-)

> That would be better on several counts to my eyes as:
>
>     inputs = repeat(None, 35) # no values actually used
>     ... for x in accumulate(inputs, logistic_map, initial=x0)
>
> In particular, filling inputs with `None` would lead to an exception
> if the programmer screwed up and the input values actually _were_
> being used.  I expect we'll both overlook that writing a generator
> using the obvious loop would be a lot clearer regardless ;-)

The winks make reading your posts fun, but I really can't tell whether
your position is, "yes, let's do this because someone can do wild
things with it", or "no, let's not do this because people would commit
atrocities with it".

>> and "Will anyone actually use it?".
>
> As above, the docs could change to use it.  And I bet the test suite
> too.  How much more could you want from a feature?! ;-)

I'm concerned that the total number of actual users will be exactly
two (you and the writer of the wheel-sieve) and that you each would
have used it exactly once in your life.  That's a pretty small user
base for a standard library feature ;-)

Tim, if you could muster an honest to goodness, real +1, that would be
good enough for me.  Otherwise, I'm back to -0 and prefer not to see
Pythonistas writing the Haskell magics described in this thread.

If this does go forward, I would greatly prefer "start" rather than
"first_value" or "initial".

The conversation has been enjoyable (perhaps because the stakes are so
low) and educational (I learn something new every time you post).
I'll leave this with a few random thoughts on itertools that don't
seem to fit anywhere else.

1) When itertools was created, they were one of the easiest ways to
get C-like performance without writing C.  However, when PyPy matured
we got other ways to do it.  And in the world of PyPy, plain Python
for-loops outperform their iterator chain equivalents, so we lost one
motivation to use itertools.
2) While I personally like function chains operating on iterators, my consulting and teaching experience has convinced me that very few people think that way. Accordingly, I almost never use compress, filterfalse, takewhile, dropwhile, etc. As people started adopting PEP 279 generator expressions, interest in itertool style thinking seems to have waned. Putting these two together has left me with a preference for itertools to only cover the simplest and most common cases, leaving the rest to be expressed as plain, everyday pure python. (The combinatoric itertools are an exception because they are more algorithmically interesting). Raymond From j.van.dorp at deonet.nl Mon Apr 9 02:58:47 2018 From: j.van.dorp at deonet.nl (Jacco van Dorp) Date: Mon, 9 Apr 2018 08:58:47 +0200 Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin In-Reply-To: <5ACA9EB0.5000801@canterbury.ac.nz> References: <20180406011854.GU16661@ando.pearwood.info> <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com> <5ACA9EB0.5000801@canterbury.ac.nz> Message-ID: > With the increased emphasis on iterators and generators in Python 3.x, > the lack of a simple expression level equivalent to "for item in > iterable: pass" is occasionally irritating, especially when > demonstrating behaviour at the interactive prompt. I've sometimes thought that exhaust(iterator) or iterator.exhaust() would be a good thing to have - I've often wrote code doing basically "call this function for every element in this container, and idc about return values", but find myself using a list comprehension instead of generator. I guess it's such an edge case that exhaust(iterator) as builtin would be overkill (but perhaps itertools could have it ?), and most people don't pass around iterators, so (f(x) for x in y).exhaust() might not look natural to most people. It could return the value for the last() semantics, but I think exhaustion would often be more important than the last value. 2018-04-09 0:58 GMT+02:00 Greg Ewing : > Kyle Lahnakoski wrote: > >> Consider Serhiy Storchaka's elegant solution, which I reformatted for >> readability >> >>> smooth_signal = [ >>> average >>> for average in [0] >>> for x in signal >>> for average in [(1-decay)*average + decay*x] >>> ] > > > "Elegant" isn't the word I would use, more like "clever". > Rather too clever, IMO -- it took me some head scratching > to figure out how it does what it does. > > And it would have taken even more head scratching, except > there's a clue as to *what* it's supposed to be doing: > the fact that it's assigned to something called > "smooth_signal" -- one of those "inaccurate names" that > you disparage so much. 
:-) > > -- > Greg > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From greg.ewing at canterbury.ac.nz Mon Apr 9 05:15:27 2018 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 09 Apr 2018 21:15:27 +1200 Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin] In-Reply-To: References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com> <83B52AC3-E640-484C-A892-A9724FAEBA26@gmail.com> Message-ID: <5ACB2F2F.1000704@canterbury.ac.nz> Raymond Hettinger wrote: > I don't want to overstate the case, but I do think a function signature that > offers a "first_value" option is an invitation to treat the first value as > being distinct from the rest of the data stream. I conjecture that the initial value is *always* special, and the only cases where it seems not to be are where you're relying on some implicit initial value such as zero. -- Greg From ncoghlan at gmail.com Mon Apr 9 06:55:08 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 9 Apr 2018 20:55:08 +1000 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings, take three! In-Reply-To: <20180408150112.GH16661@ando.pearwood.info> References: <20180408150112.GH16661@ando.pearwood.info> Message-ID: On 9 April 2018 at 01:01, Steven D'Aprano wrote: > On Sun, Apr 08, 2018 at 09:25:33PM +1000, Nick Coghlan wrote: > >> I was writing a new stdlib test case today, and thinking about how I >> might structure it differently in a PEP 572 world, and realised that a >> situation the next version of the PEP should discuss is this one: >> >> # Dict display >> data = { >> key_a: 1, >> key_b: 2, >> key_c: 3, >> } >> >> # Set display with local name bindings >> data = { >> local_a := 1, >> local_b := 2, >> local_c := 3, >> } > > I don't understand the point of these examples. Sure, I guess they would > be legal, but unless you're actually going to use the name bindings, > what's the point in defining them? That *would* be the point. In the case where it occurred to me, the actual code I'd written looked like this: curdir_import = "" curdir_relative = os.curdir curdir_absolute = os.getcwd() all_spellings = [curdir_import, curdir_relative, curdir_absolute] (Since I was testing the pydoc CLI's sys.path manipulation, and wanted to cover all the cases). >> I don't think this is bad (although the interaction with dicts is a >> bit odd), and I don't think it counts as a rationale either, but I do >> think the fact that it becomes possible should be noted as an outcome >> arising from the "No sublocal scoping" semantics. > > If we really wanted to keep the sublocal scoping, we could make > list/set/dict displays their own scope too. > > Personally, that's the only argument for sublocal scoping that I like > yet: what happens inside a display should remain inside the display, and > not leak out into the function. > > So that has taken me from -1 on sublocal scoping to -0.5 if it applies > to displays. Inflicting the challenges that comprehensions have at class scope on all container displays wouldn't strike me as a desirable outcome (plus there's also the problem that full nested scopes are relatively expensive at runtime). Cheers, Nick. 
-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From rhodri at kynesim.co.uk Mon Apr 9 06:52:45 2018
From: rhodri at kynesim.co.uk (Rhodri James)
Date: Mon, 9 Apr 2018 11:52:45 +0100
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To: 
References: <20180406234912.GX16661@ando.pearwood.info>
Message-ID: 

On 07/04/18 09:54, Cammil Taank wrote:
>> Care to repeat those arguments?
>
> Indeed.
>
> *Minimal use of characters*

Terseness is not necessarily a virtue. While it's good not to be needlessly verbose, Python is not Perl and we are not trying to do everything on one line. Overly terse code is much less readable, as all the obfuscation competitions demonstrate. I'm afraid I count this one *against* your proposal.

> *Thoughts on odd usage of "!"*
>
> In the English language, `!` signifies an exclamation, and I am
> imagining a similar usage to that of introducing something by its name
> in an energetic way. For example a boxer walking into the ring:
>
> "Muhammed_Ali! ", "x! get_x()"

I'm afraid that's a very personal interpretation. In particular, '!' normally ends a sentence very firmly, so expecting the expression to carry on is a little counter-intuitive. For me, my expectations of '!' run roughly as:

* factorial (from my maths degree)
* array dereference (because I am old: a!2 was the equivalent of a[2] in BCPL)
* an exclamation, much overused in writing
* the author was bitten by Yahoo! at an early age.

-- 
Rhodri James *-* Kynesim Ltd

From rhodri at kynesim.co.uk Mon Apr 9 07:01:16 2018
From: rhodri at kynesim.co.uk (Rhodri James)
Date: Mon, 9 Apr 2018 12:01:16 +0100
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To: 
References: <20180406234912.GX16661@ando.pearwood.info>
Message-ID: <15da2687-e8d7-b476-b925-8d686da98dac@kynesim.co.uk>

On 09/04/18 11:52, Rhodri James wrote:
> On 07/04/18 09:54, Cammil Taank wrote:
>>> Care to repeat those arguments?
>>
>> Indeed.
>>
>> *Minimal use of characters*
>
> Terseness is not necessarily a virtue. While it's good not to be
> needlessly verbose, Python is not Perl and we are not trying to do
> everything on one line. Overly terse code is much less readable, as all
> the obfuscation competitions demonstrate. I'm afraid I count this one
> *against* your proposal.
>
>> *Thoughts on odd usage of "!"*
>>
>> In the English language, `!` signifies an exclamation, and I am
>> imagining a similar usage to that of introducing something by its name
>> in an energetic way. For example a boxer walking into the ring:
>>
>> "Muhammed_Ali! ", "x! get_x()"
>
> I'm afraid that's a very personal interpretation. In particular, '!'
> normally ends a sentence very firmly, so expecting the expression to
> carry on is a little counter-intuitive. For me, my expectations of '!'
> run roughly as:
>
> * factorial (from my maths degree)
> * array dereference (because I am old: a!2 was the equivalent of a[2]
> in BCPL)
> * an exclamation, much overused in writing
> * the author was bitten by Yahoo! at an early age.

Also logical negation in C-like languages, of course. Sorry, I'm a bit sleep-deprived this morning.
-- Rhodri James *-* Kynesim Ltd From ncoghlan at gmail.com Mon Apr 9 07:14:59 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 9 Apr 2018 21:14:59 +1000 Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin] In-Reply-To: References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com> <83B52AC3-E640-484C-A892-A9724FAEBA26@gmail.com> Message-ID: On 9 April 2018 at 14:38, Raymond Hettinger wrote: >> On Apr 8, 2018, at 6:43 PM, Tim Peters wrote: >> In short, for _general_ use `accumulate()` needs `initial` for exactly >> the same reasons `reduce()` needed it. > > The reduce() function had been much derided, so I've had it mentally filed in the anti-pattern category. But yes, there may be wisdom there. Weirdly (or perhaps not so weirdly, given my tendency to model computational concepts procedurally), I find the operation of reduce() easier to understand when it's framed as "last(accumulate(iterable, binop, initial=value)))". Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From dmoisset at machinalis.com Mon Apr 9 07:23:30 2018 From: dmoisset at machinalis.com (Daniel Moisset) Date: Mon, 9 Apr 2018 12:23:30 +0100 Subject: [Python-ideas] Accepting multiple mappings as positional arguments to create dicts In-Reply-To: References: Message-ID: In which way would this be different to {**mapping1, **mapping2, **mapping3} ? On 8 April 2018 at 22:18, Andr?s Delfino wrote: > Hi! > > I thought that maybe dict could accept several mappings as positional > arguments, like this: > > class Dict4(dict): >> def __init__(self, *args, **kwargs): >> if len(args) > 1: >> if not all([isinstance(arg, dict) for arg in args]): >> raise TypeError('Dict4 expected instances of dict since >> multiple positional arguments were passed') >> >> temp = args[0].copy() >> >> for arg in args[1:]: >> temp.update(arg) >> >> super().__init__(temp, **kwargs) >> else: >> super().__init__(*args, **kwargs) >> > > AFAIK, this wouldn't create compatibility problems, since you can't pass > two positional arguments now anyways. > > It would be useful to solve the "sum/union dicts" discussion, for example: > requests.get(url, params=dict(params, {'foo': bar}) > > Whar are your thoughts? > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- Daniel F. Moisset - UK Country Manager - Machinalis Limited www.machinalis.co.uk Skype: @dmoisset T: + 44 7398 827139 1 Fore St, London, EC2Y 9DT Machinalis Limited is a company registered in England and Wales. Registered number: 10574987. -------------- next part -------------- An HTML attachment was scrubbed... URL: From adelfino at gmail.com Mon Apr 9 07:42:20 2018 From: adelfino at gmail.com (=?UTF-8?Q?Andr=C3=A9s_Delfino?=) Date: Mon, 9 Apr 2018 08:42:20 -0300 Subject: [Python-ideas] Accepting multiple mappings as positional arguments to create dicts In-Reply-To: References: Message-ID: Sorry, I didn't know that kwargs unpacking in dictionaries displays don't raise a TypeError exception. On Mon, Apr 9, 2018 at 8:23 AM, Daniel Moisset wrote: > In which way would this be different to {**mapping1, **mapping2, > **mapping3} ? > > On 8 April 2018 at 22:18, Andr?s Delfino wrote: > >> Hi! 
>> >> I thought that maybe dict could accept several mappings as positional >> arguments, like this: >> >> class Dict4(dict): >>> def __init__(self, *args, **kwargs): >>> if len(args) > 1: >>> if not all([isinstance(arg, dict) for arg in args]): >>> raise TypeError('Dict4 expected instances of dict since >>> multiple positional arguments were passed') >>> >>> temp = args[0].copy() >>> >>> for arg in args[1:]: >>> temp.update(arg) >>> >>> super().__init__(temp, **kwargs) >>> else: >>> super().__init__(*args, **kwargs) >>> >> >> AFAIK, this wouldn't create compatibility problems, since you can't pass >> two positional arguments now anyways. >> >> It would be useful to solve the "sum/union dicts" discussion, for >> example: requests.get(url, params=dict(params, {'foo': bar}) >> >> Whar are your thoughts? >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> > > > -- > Daniel F. Moisset - UK Country Manager - Machinalis Limited > www.machinalis.co.uk > Skype: @dmoisset T: + 44 7398 827139 > > 1 Fore St, London, EC2Y 9DT > > > Machinalis Limited is a company registered in England and Wales. > Registered number: 10574987. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yaoxiansamma at gmail.com Mon Apr 9 07:54:34 2018 From: yaoxiansamma at gmail.com (Thautwarm Zhao) Date: Mon, 9 Apr 2018 19:54:34 +0800 Subject: [Python-ideas] Fwd: Is there any idea about dictionary destructing? In-Reply-To: References: <53a34231-2da4-8d2d-ed5a-3999c8f5781e@trueblade.com> Message-ID: I'm sorry that I didn't send a copy of the discussions here. ---------- Forwarded message ---------- From: Thautwarm Zhao Date: 2018-04-09 1:24 GMT+08:00 Subject: Re: [Python-ideas] Is there any idea about dictionary destructing? To: "Eric V. Smith" Thank you, Eric. Your links really help me and I've investigated it carefully. After reading them, I found the discussion almost focused on a non-nested data structure. The flatten key-value pairs might be easily replaced by something like x, y, z = [some_dict[k] for k in ('a', 'b', 'c')] I couldn't agree more but, when it comes to nested, some_dict = { 'a': { 'b': { 'c': V1}, 'e': V2 }, 'f': V3 } I agree that there could be other ways as intuitive as the dict destructing to get `V1, V2, V3` instead, however dict destructing refers to the consistency of Python language behaviours. When I'm writing these codes: [a, *b] = [1, 2, 3] The LHS is actually equals to RHS, and if we implement a way to apply this on dictionary {'a': a, 'b': b, '@': c, **other} = {'a': 1, 'b': 2, '@': 3, '*': 4} It also presents that LHS equals to RHS. Dict destructing/constructing is totally compatible to Python unpack/pack, just as what iterable destructing/constructing does. It's neat when we talk about Python's data structures we can talk about the consistency, readability and the expression of intuition. In the real world, the following one could really help when it comes to the field of data storage. some_dict = {'a': [1, 2, 3, {"d": 4, "f": 5}]} {'a': [b, *c, {"d": e, **_}]} = some_dict The LHS doesn't only show the structure of some variable intuitively(this makes review easier, too), but also supplies a way to access data in fewer codes. 
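Today the closest workaround is an ordinary helper function. A rough sketch (the `destructure` name and the use of None as a placeholder are illustrative only, and it handles just the nested-dict part, not list patterns):

    def destructure(pattern, data):
        # Yield the values from `data` that sit at the positions marked
        # by placeholder (non-dict) values in `pattern`, recursing into
        # nested dicts; relies on dicts preserving insertion order.
        for key, sub in pattern.items():
            if isinstance(sub, dict):
                yield from destructure(sub, data[key])
            else:
                yield data[key]

    some_dict = {'a': {'b': {'c': 1}, 'e': 2}, 'f': 3}
    V1, V2, V3 = destructure({'a': {'b': {'c': None}, 'e': None}, 'f': None}, some_dict)
    # V1 == 1, V2 == 2, V3 == 3

But a helper like this cannot give the LHS = RHS symmetry described above.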
In the previous talk people have shown multiple usages of dict destructing in the real world:

- Django Rest Framework validate
- Load config files and use them, specifically Yaml/JSON data access.

In fact, any operation on a dictionary other than simply getting a value from a key might need dict destructing, once the task is complicated enough. I do think the usages are general enough now for us to allow similar syntax to do the above tasks.

P.S.: Some other suggestions in the previous talk looked like the following:

    'a' as x, 'b' as y, 'c' as z = some_dict
    'a': x, 'b': y, 'c': z = some_dict
    mode, height, width = **prefs

Each of them either conflicts with the current syntax, or mismatches the consistency of the Python language (LHS != RHS).

thautwarm

2018-04-08 5:39 GMT+08:00 Eric V. Smith :

> There was a long thread last year on a subject, titled "Dictionary
> destructing and unpacking.":
> https://mail.python.org/pipermail/python-ideas/2017-June/045963.html
>
> You might want to read through it and see what ideas and problems were
> raised then.
>
> In that discussion, there's also a link to an older pattern matching
> thread:
> https://mail.python.org/pipermail/python-ideas/2015-April/032907.html
>
> Eric
>
> On 4/7/2018 1:26 PM, thautwarm wrote:
>
>> We know that Python supports the destructing of iterable objects.
>>
>> m_iter = (_ for _ in range(10))
>> a, *b, c = m_iter
>>
>> That's pretty cool! It's really convenient when there're many corner
>> cases to handle with iterable collections.
>> However destructing in Python could be more convenient if we support
>> dictionary destructing.
>>
>> In my opinion, dictionary destructing is not difficult to implement and
>> makes the syntax more expressive. A typical example is data access on
>> nested data structures (just like JSON); destructing a dictionary makes
>> the logic quite clear:
>>
>> data = {
>>     "direct": "some data",
>>     "nested": {
>>         "lst_data": [1, 2, 3],
>>         "int_data": 1
>>     }
>> }
>> {
>>     "direct": direct,
>>     "nested": {
>>         "lst_data": [a, b, c],
>>     }
>> } = data
>>
>> Dictionary destructing might not be very well-known but it really helps.
>> The operations on nested key-value collections are very frequent, and the
>> code for business logic is not readable enough until now. Moreover Python
>> is now popular in data processing, which must be enhanced by entire
>> support for data destructing.
>>
>> Here are some implementations in other languages:
>> Elixir, which is also a popular dynamic language nowadays.
>>
>> iex> %{} = %{:a => 1, 2 => :b}
>> %{2 => :b, :a => 1}
>> iex> %{:a => a} = %{:a => 1, 2 => :b}
>> %{2 => :b, :a => 1}
>> iex> a
>> 1
>> iex> %{:c => c} = %{:a => 1, 2 => :b}
>> ** (MatchError) no match of right hand side value: %{2 => :b, :a => 1}
>>
>> And in F#, there is something similar to dictionary destructing (actually,
>> this destructs `struct` instead):
>>
>> type MyRecord = { Name: string; ID: int }
>> let IsMatchByName record1 (name: string) =
>>     match record1 with
>>     | { MyRecord.Name = nameFound; MyRecord.ID = _; } when nameFound = name -> true
>>     | _ -> false
>> let recordX = { Name = "Parker"; ID = 10 }
>> let isMatched1 = IsMatchByName recordX "Parker"
>> let isMatched2 = IsMatchByName recordX "Hartono"
>>
>> All of them partially destruct (or match) a dictionary.
>>
>> thautwarm
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>

From jsbueno at python.org.br Mon Apr 9 08:17:47 2018
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Mon, 9 Apr 2018 09:17:47 -0300
Subject: [Python-ideas] Is there any idea about dictionary destructing?
In-Reply-To: 
References: 
Message-ID: 

I have an idea for an innovative, unambiguous, straightforward and backwards compatible syntax for that, which even allows one to pass metadata along the operation so that the results can be tweaked according to each case's needs.

What about:

    new_data = dict_feed({
        "direct": "some data",
        "nested": {
            "lst_data": [1, 2, 3],
            "int_data": 1
        }
    },
    data
    )

we could even call this approach by a name such as "function call".

In other words, why bloat the language with hard-to-learn, error-prone, grit-looking syntax when a simple plain function call is perfectly good? All you need to do over your suggestion is to type the function name and a pair of parentheses.

On 7 April 2018 at 14:26, thautwarm wrote:
> We know that Python supports the destructing of iterable objects.
>
> m_iter = (_ for _ in range(10))
> a, *b, c = m_iter
>
> That's pretty cool! It's really convenient when there're many corner cases
> to handle with iterable collections.
> However destructing in Python could be more convenient if we support
> dictionary destructing.
>
> In my opinion, dictionary destructing is not difficult to implement and
> makes the syntax more expressive. A typical example is data access on nested
> data structures (just like JSON); destructing a dictionary makes the logic
> quite clear:
>
> data = {
>     "direct": "some data",
>     "nested": {
>         "lst_data": [1, 2, 3],
>         "int_data": 1
>     }
> }
> {
>     "direct": direct,
>     "nested": {
>         "lst_data": [a, b, c],
>     }
> } = data
>
> Dictionary destructing might not be very well-known but it really helps. The
> operations on nested key-value collections are very frequent, and the code
> for business logic is not readable enough until now. Moreover Python is now
> popular in data processing, which must be enhanced by entire support for
> data destructing.
>
> Here are some implementations in other languages:
> Elixir, which is also a popular dynamic language nowadays.
>
> iex> %{} = %{:a => 1, 2 => :b}
> %{2 => :b, :a => 1}
> iex> %{:a => a} = %{:a => 1, 2 => :b}
> %{2 => :b, :a => 1}
> iex> a
> 1
> iex> %{:c => c} = %{:a => 1, 2 => :b}
> ** (MatchError) no match of right hand side value: %{2 => :b, :a => 1}
>
> And in F#, there is something similar to dictionary destructing (actually,
> this destructs `struct` instead):
>
> type MyRecord = { Name: string; ID: int }
> let IsMatchByName record1 (name: string) =
>     match record1 with
>     | { MyRecord.Name = nameFound; MyRecord.ID = _; } when nameFound = name -> true
>     | _ -> false
> let recordX = { Name = "Parker"; ID = 10 }
> let isMatched1 = IsMatchByName recordX "Parker"
> let isMatched2 = IsMatchByName recordX "Hartono"
>
> All of them partially destruct (or match) a dictionary.
> > thautwarm > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From dmoisset at machinalis.com Mon Apr 9 10:09:21 2018 From: dmoisset at machinalis.com (Daniel Moisset) Date: Mon, 9 Apr 2018 15:09:21 +0100 Subject: [Python-ideas] Accepting multiple mappings as positional arguments to create dicts In-Reply-To: References: Message-ID: No worries, already implemented features happens so often in this list that there's a story about Guido going back in a time machine to implement them ;-) Just wanted to check that I had understood what you suggested correctly On 9 April 2018 at 12:42, Andr?s Delfino wrote: > Sorry, I didn't know that kwargs unpacking in dictionaries displays don't > raise a TypeError exception. > > On Mon, Apr 9, 2018 at 8:23 AM, Daniel Moisset > wrote: > >> In which way would this be different to {**mapping1, **mapping2, >> **mapping3} ? >> >> On 8 April 2018 at 22:18, Andr?s Delfino wrote: >> >>> Hi! >>> >>> I thought that maybe dict could accept several mappings as positional >>> arguments, like this: >>> >>> class Dict4(dict): >>>> def __init__(self, *args, **kwargs): >>>> if len(args) > 1: >>>> if not all([isinstance(arg, dict) for arg in args]): >>>> raise TypeError('Dict4 expected instances of dict since >>>> multiple positional arguments were passed') >>>> >>>> temp = args[0].copy() >>>> >>>> for arg in args[1:]: >>>> temp.update(arg) >>>> >>>> super().__init__(temp, **kwargs) >>>> else: >>>> super().__init__(*args, **kwargs) >>>> >>> >>> AFAIK, this wouldn't create compatibility problems, since you can't pass >>> two positional arguments now anyways. >>> >>> It would be useful to solve the "sum/union dicts" discussion, for >>> example: requests.get(url, params=dict(params, {'foo': bar}) >>> >>> Whar are your thoughts? >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >>> >> >> >> -- >> Daniel F. Moisset - UK Country Manager - Machinalis Limited >> www.machinalis.co.uk >> Skype: @dmoisset T: + 44 7398 827139 >> >> 1 Fore St, London, EC2Y 9DT >> >> >> Machinalis Limited is a company registered in England and Wales. >> Registered number: 10574987. >> > > -- Daniel F. Moisset - UK Country Manager - Machinalis Limited www.machinalis.co.uk Skype: @dmoisset T: + 44 7398 827139 1 Fore St, London, EC2Y 9DT Machinalis Limited is a company registered in England and Wales. Registered number: 10574987. -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.ed.oconnor at gmail.com Mon Apr 9 13:44:17 2018 From: peter.ed.oconnor at gmail.com (Peter O'Connor) Date: Mon, 9 Apr 2018 13:44:17 -0400 Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin] In-Reply-To: References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com> <83B52AC3-E640-484C-A892-A9724FAEBA26@gmail.com> Message-ID: It seems clear that the name "accumulate" has been kind of antiquated since the "func" argument was added and "sum" became just a default. And people seem to disagree about whether the result should have a length N or length N+1 (where N is the number of elements in the input iterable). 
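To make the length question concrete (the chain() trick is today's usual way to smuggle in a start value; this snippet runs as-is):

    from itertools import accumulate, chain

    assert list(accumulate([1, 2, 3])) == [1, 3, 6]                 # N values
    assert list(accumulate(chain([0], [1, 2, 3]))) == [0, 1, 3, 6]  # N+1 values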
The behaviour where the first element of the return is the same as the first element of the input can be weird and confusing. E.g. compare:

    >> list(itertools.accumulate([2, 3, 4], lambda accum, val: accum-val))
    [2, -1, -5]
    >> list(itertools.accumulate([2, 3, 4], lambda accum, val: val-accum))
    [2, 1, 3]

One might expect that since the second function returns the negative of the first function, and both are linear, the results of the second would be the negative of the first, but that is not the case.

Maybe we can instead let "accumulate" fall into deprecation, and add a new, more general itertools "reducemap" function:

    def reducemap(iterable: Iterable[Any], func: Callable[[Any, Any], Any],
                  initial: Any, include_initial_in_return: bool = False) -> Generator[Any, None, None]:
        ...

Benefits:
- The name is more descriptive of the operation (a reduce operation where we keep values at each step, like a map).
- The existence of include_initial_in_return=False makes it somewhat clear that the initial value will by default NOT be provided in the returned generator.
- The mandatory initial argument forces you to think about initial conditions.

Disadvantages:
- The most common use cases (summation, product) have a "natural" first element (0 and 1, respectively) which you'd now be required to write out (but we could just leave accumulate for sum).

I still prefer a built-in language comprehension syntax for this, like: (y := f(y, x) for x in x_vals from y=0), but for a huge discussion on that see the other thread.

------- More Examples (using "accumulate" as the name for now) -------

    # Kalman filters
    def kalman_filter_update(state, measurement):
        ...
        return state

    online_trajectory_estimate = accumulate(measurement_generator, func=kalman_filter_update, initial=initial_state)

    ---

    # Bayesian stats
    def update_model(prior, evidence):
        ...
        return posterior

    model_history = accumulate(evidence_generator, func=update_model, initial=prior_distribution)

    ---

    # Recurrent neural networks:
    def recurrent_network_layer_step(last_hidden, current_input):
        new_hidden = ....
        return new_hidden

    hidden_state_generator = accumulate(input_sequence, func=recurrent_network_layer_step, initial=initial_hidden_state)

On Mon, Apr 9, 2018 at 7:14 AM, Nick Coghlan wrote:

> On 9 April 2018 at 14:38, Raymond Hettinger wrote:
> >> On Apr 8, 2018, at 6:43 PM, Tim Peters wrote:
> >> In short, for _general_ use `accumulate()` needs `initial` for exactly
> >> the same reasons `reduce()` needed it.
> >
> > The reduce() function had been much derided, so I've had it mentally
> filed in the anti-pattern category. But yes, there may be wisdom there.
>
> Weirdly (or perhaps not so weirdly, given my tendency to model
> computational concepts procedurally), I find the operation of reduce()
> easier to understand when it's framed as "last(accumulate(iterable,
> binop, initial=value))".
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From peter.ed.oconnor at gmail.com Mon Apr 9 14:09:00 2018 From: peter.ed.oconnor at gmail.com (Peter O'Connor) Date: Mon, 9 Apr 2018 14:09:00 -0400 Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin] In-Reply-To: References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com> <83B52AC3-E640-484C-A892-A9724FAEBA26@gmail.com> Message-ID: Also Tim Peter's one-line example of: print(list(itertools.accumulate([1, 2, 3], lambda x, y: str(x) + str(y)))) I think makes it clear that itertools.accumulate is not the right vehicle for this change - we should make a new itertools function with a required "initial" argument. On Mon, Apr 9, 2018 at 1:44 PM, Peter O'Connor wrote: > It seems clear that the name "accumulate" has been kind of antiquated > since the "func" argument was added and "sum" became just a default. > > And people seem to disagree about whether the result should have a length > N or length N+1 (where N is the number of elements in the input iterable). > > The behaviour where the first element of the return is the same as the > first element of the input can be weird and confusing. E.g. compare: > > >> list(itertools.accumulate([2, 3, 4], lambda accum, val: accum-val)) > [2, -1, -5] > >> list(itertools.accumulate([2, 3, 4], lambda accum, val: val-accum)) > [2, 1, 3] > > One might expect that since the second function returned the negative of > the first function, and both are linear, that the results of the second > would be the negative of the first, but that is not the case. > > Maybe we can instead let "accumulate" fall into deprecation, and instead > add a new more general itertools "reducemap" method: > > def reducemap(iterable: Iterable[Any], func: Callable[(Any, Any), Any], > initial: Any, include_initial_in_return=False): -> Generator[Any] > > Benefits: > - The name is more descriptive of the operation (a reduce operation where > we keep values at each step, like a map) > - The existence of include_initial_in_return=False makes it somewhat > clear that the initial value will by default NOT be provided in the > returning generator > - The mandatory initial argument forces you to think about initial > conditions. > > Disadvantages: > - The most common use case (summation, product), has a "natural" first > element (0, and 1, respectively) when you'd now be required to write out. > (but we could just leave accumulate for sum). > > I still prefer a built-in language comprehension syntax for this like: (y > := f(y, x) for x in x_vals from y=0), but for a huge discussion on that see > the other thread. > > ------- More Examples (using "accumulate" as the name for now) ------- > > # Kalman filters > def kalman_filter_update(state, measurement): > ... > return state > > online_trajectory_estimate = accumulate(measurement_generator, func= > kalman_filter_update, initial = initial_state) > > --- > > # Bayesian stats > def update_model(prior, evidence): > ... > return posterior > > model_history = accumulate(evidence_generator, func=update_model, > initial = prior_distribution) > > --- > > # Recurrent Neural networks: > def recurrent_network_layer_step(last_hidden, current_input): > new_hidden = .... 
> return new_hidden > > hidden_state_generator = accumulate(input_sequence, func= > recurrent_network_layer_step, initial = initial_hidden_state) > > > > > On Mon, Apr 9, 2018 at 7:14 AM, Nick Coghlan wrote: > >> On 9 April 2018 at 14:38, Raymond Hettinger >> wrote: >> >> On Apr 8, 2018, at 6:43 PM, Tim Peters wrote: >> >> In short, for _general_ use `accumulate()` needs `initial` for exactly >> >> the same reasons `reduce()` needed it. >> > >> > The reduce() function had been much derided, so I've had it mentally >> filed in the anti-pattern category. But yes, there may be wisdom there. >> >> Weirdly (or perhaps not so weirdly, given my tendency to model >> computational concepts procedurally), I find the operation of reduce() >> easier to understand when it's framed as "last(accumulate(iterable, >> binop, initial=value)))". >> >> Cheers, >> Nick. >> >> -- >> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Mon Apr 9 16:33:22 2018 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 9 Apr 2018 15:33:22 -0500 Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin] In-Reply-To: References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com> <83B52AC3-E640-484C-A892-A9724FAEBA26@gmail.com> Message-ID: [Tim] >> Then why was [accumulate] generalized to allow any 2-argument function? [Raymond] > Prior to 3.2, accumulate() was in the recipes section as pure Python > code. It had no particular restriction to numeric types. > > I received a number of requests for accumulate() to be promoted > to a real itertool (fast, tested, documented C code with a stable API). > I agreed and accumulate() was added to itertools in 3.2. It worked > with anything supporting __add__, including str, bytes, lists, and > tuples. So that's the restriction Nick had in mind: a duck-typing kind, in that it would blow up on types that don't participate in PyNumber_Add(): > More specifically, accumulate_next() called PyNumber_Add() without < any particular type restriction. > > Subsequently, I got requests to generalize accumulate() to support any > arity-2 function (with operator.mul offered as the motivating example). Sucker ;-) > Given that there were user requests and there were ample precedents > in other languages, I acquiesced despite having some reservations (if used > with a lambda, the function call overhead might make accumulate() slower > than a plain Python for-loop without the function call). So, that generalized > API extension went into 3.3 and has remained unchanged ever since. > > Afterwards, I was greeted with the sound of crickets. Either it was nearly > perfect or no one cared or both ;-) Or nobody cared _enough_ to endure a 100-message thread arguing about an objectively minor change ;-) If you can borrow Guido's time machine, I'd suggest going back to the original implementation, except name it `cumsum()` instead, and leave `accumulate()` to the 3rd-party itertools packages (at least one of which (itertoolz) has supported an optional "initial" argument all along). > It remains one of the least used itertools. I don't see how that can be known. 
There are at least tens of thousands of Python programmers nobody on this list has ever heard about - or from - writing code that's invisible to search engines. I _believe_ it - I just don't see how it can be known.

> ...
> Honestly, I couldn't immediately tell what this code was doing:
>
>     list(accumulate([8, 4, 6], lambda x, y: x + [y], first_result=[]))

Of course you couldn't: you think of accumulate() as being about running sums, and _maybe_ some oddball out there using it for running products. But that's a statement about your background, seeing code you've never seen before, not about the function. Nobody knows immediately, at first sight, what

    list(accumulate([8, 4, 6], lambda x, y: x + y, first_result=0))

does either. It's learned. If your background were in, e.g., Haskell instead, then in the latter case you'd picture a list [a, b, c, ...] and figure it out from thinking about what the prefixes of 0 + a + b + c + ... compute. In exactly the same way, in the former case you'd think about what the prefixes of [] + [a] + [b] + [c] + ... compute. They're equally obvious _after_ undertaking that easy exercise, but clear as mud before doing so.

> This may be a case where a person would be better off without accumulate() at all.

De gustibus non est disputandum.

>> In short, for _general_ use `accumulate()` needs `initial` for exactly
>> the same reasons `reduce()` needed it.

> The reduce() function had been much derided, so I've had it mentally filed
> in the anti-pattern category. But yes, there may be wisdom there.

The current accumulate() isn't just akin to reduce(), it _is_ reduce(), except a drunken reduce() so nauseated it vomits its internal state out after each new element it eats ;-)

>> BTW, the type signatures on the scanl (requires an initial value) and
>> scanl1 (does not support an initial value) implementations I pasted
>> from Haskell's Standard Prelude give a deeper reason: without an
>> initial value, a list of values of type A can only produce another
>> list of values of type A via scanl1. The dyadic function passed must
>> map As to As. But with an initial value supplied of type B, scanl can
>> transform a list of values of type A to a list of values of type B.
>> While that may not have been obvious in the list prefix example I
>> gave, that was at work: a list of As was transformed into a list _of_
>> lists of As. That's impossible for scanl1 to do, but easy for scanl.

> Thanks for pointing that out. I hadn't considered that someone might
> want to transform one type into another using accumulate(). That is
> pretty far from my mental model of what accumulate() was intended for.

It's nevertheless what the current function supports - nothing being suggested changes that one whit. It's "worse" in Python because while only `scanl` in Haskell can "change types", the current `scanl1`-like Python `accumulate` can change types too. Perhaps the easiest way to see that is by noting that

    map(f, xs)

is generally equivalent to

    accumulate(xs, lambda x, y: f(y))

right now. That is, just ignore the "accumulator" argument, and you're left with a completely arbitrary transformation of the elements. If you like, you can, e.g., use accumulate() right now to generate a sequence of socket connections from a sequence of deques.
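A tiny demonstration of that, runnable as-is - note how the first input element passes through untouched, while the types change from there on:

    from itertools import accumulate

    # The "accumulator" argument is ignored, so this acts like map()
    # from the second element onward - ints become strs mid-stream.
    list(accumulate([1, 2, 3], lambda acc, y: str(y)))  # -> [1, '2', '3']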
That can't be done by Haskell's scanl1 because that language's strict static type system can't express the notion of that "well, given a list-of-A, the accumulation function f has arguments of types A and A on the first call, but types type-of-f(A) and A on subsequent calls. By supplying an initial value of type B, `scanl`'s accumulation function returns type B and always has arguments of type B and A. The type system has no problem with that. > Also, I'm still not sure whether we would want code like that buried in an > accumulate() call rather than as a regular for-loop where I can see the logic > and trace through it with pdb. De gustibus non est disputandum again ;-) These kinds of arguments belong in a style guide, wiki, tutorial, or nagging Stackoverflow answer. They really - to me - have nothing to do with the issue at hand. Again, nothing above depends in any way on whether or not accumulate grows an optional argument. All the things you see as "bad" are already not just possible, but easy. > As for scanl, I'm not sure what this code means without seeing some python equivalent. > > scanl :: (a -> b -> a) -> a -> [b] -> [a] > scanl f q xs = q : (case xs of > [] -> [] > x:xs -> scanl f (f q x) xs) > > > scanl1 :: (a -> a -> a) -> [a] -> [a] > scanl1 f (x:xs) = scanl f x xs > scanl1 _ [] = [] My apologies for that - I assumed you had a reading knowledge of Haskell, and were just unaware of these specific functions. The details don't really matter. You can just take my word for that `scanl1` is a type-safe-restricted form of Python's current `accumulate`, and that the more basic `scanl` is like what `accumulate` would be if an initial value were a _required_ argument. BTW, Haskell has no native notion of optional (or default, or variable, or keyword) arguments. All functions are "curried": they have exactly one argument. So, e.g., while there's generally no harm in viewing the Haskell f x y as being like the Python f(x, y) it's _really_ more like functools.partial(f, x)(y) In any case, "all functions have exactly one argument" is why, in Haskell, even the tiniest variation in functionality is usually implemented by creating a function with a new name for each variation, rather than pile on ever-growing sequences of required flag arguments. I'll also note that `scanl` and `scanl1` are defined in the Standard Prelude, which is akin to being a Python builtin: they're part of the core language, always available. As such, every experienced Haskell programmer is aware of them. >>> and it would have been distracting to even had the option. >> Distracting for how long? One second or two? ;-) > Possibly forever. In my experience, if a person initially frames a problem > wrong (or perhaps in a hard to solve way), it can take them a long time to > recover. For example with discounted cash flows, people who think of the > initial value as being special or distinct from the other cash flows will have > a hard time adapting to problem variants like annuity due, balloon > payments, internal rate of return, coupon stripping, valuing a transaction > that takes place in the future, etc. > > I don't want to overstate the case, but I do think a function signature that > offers a "first_value" option is an invitation to treat the first value as being > distinct from the rest of the data stream. For a _general_ accumulate, which we already have, _sometimes_ the first value really is distinct. Greg Ewing recently made that point eloquently, so I won't elaborate. 
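Since a Python spelling was requested earlier, here's roughly what those two Prelude functions amount to (a sketch - these generator spellings are mine, not anything in itertools):

    import operator

    def scanl(f, initial, xs):
        # Haskell's scanl: the initial value is required, is yielded
        # first, and the output has one more element than the input.
        acc = initial
        yield acc
        for x in xs:
            acc = f(acc, x)
            yield acc

    def scanl1(f, xs):
        # Haskell's scanl1: the first element seeds the scan, so an
        # empty input yields nothing - this is today's accumulate().
        it = iter(xs)
        for first in it:
            yield from scanl(f, first, it)

    list(scanl(operator.add, 0, [1, 2, 3]))  # [0, 1, 3, 6]
    list(scanl1(operator.add, [1, 2, 3]))    # [1, 3, 6]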
But again, one example in the docs is all it takes: >>> list(accumulate([1, 2, 3])) [1, 3, 6] >>> list(accumulate([1, 2, 3], initial=10)) [10, 11, 13, 16] 99% of programmers will see that, think "huh! why would I want initial=?" and never give it another thought. 0.001% of the remaining 1% will ask on Stackoverflow, and get an answer showing "advanced" uses for accumulate. The remaining 0.999% of the remaining 1% will eventually find one of those answers. >> With a different background, you may just as well have been surprised >> if the question _hadn't_ come up. For example, this is a standard >> example in the Haskell world for how to define an infinite Fibonacci >> sequence with the initial two values f0 and f1: >> >> fibs = f0 : scanl (+) f1 fibs >> >> The part from `scanl` onward would be spelled in Python as >> >> accumulate(fibs, initial=f1) >> >> although it requires some trickery to get the recursive reference to >> work ... > Do we want the tool to encourage such trickery? > > Don't get me wrong, I think it is cool that you could write such code, but could > and should aren't always the same. Sorry, the original point got lost in the weeds there: the point isn't that such code is desirable, it's just that Haskell programmers _are_ aware of scanl. and that one of its very-well-known toy uses exploits that it specifies an initial value. So _if_ you had that background too, it would be natural as death to wonder "huh - so how come Python's accumulate _doesn't_ have such an argument?". > ... > Haskell probably is a good source of inspiration, but I don't know the language > and find its docs to be inscrutable. That's what happens when a worldwide network of obsessed PhDs struggles for decades to push the state of the art ;-) >> ... >> That would be better on several counts to my eyes as: >> >> inputs = repeat(None, 35) # no values actually used >> ... for x in accumulate(inputs, logistic_map, initial=x0) >> >> In particular, filling inputs with `None` would lead to an exception >> if the programmer screwed up and the input values actually _were_ >> being used. I expect we'll both overlook that writing a generator >> using the obvious loop would be a lot clearer regardless ;-) > The winks may reading your posts fun, but I really can't tell whether > position is, "yes, let's do this because someone can do wild things > with it", or "no, let's don't this because people would commit atrocities > with it". I mean both, of course ;-) The world is absurd, and often all I can do is chuckle and accept that all options suck in their own endearing ways. Did you write those docs? If so, that's one of life's absurdities: half of the accumulate() docs are devoted to giving a weird example of an application that ignores the input sequence values entirely, taking from the input sequence _only_ its total length. I enjoyed the example, but I can't believe you'd _recommend_ it as good practice. It's just "a trick" you _can_ do, like (as above) you _can_ ignore the accumulator argument instead to make accumulate() work like map() instead. >>> and "Will anyone actually use it?". >> As above, the docs could change to use it. And I bet the test suite >> too. How much more could you want from a feature?! ;-) > I'm concerned that the total number of actual users will be exactly > two (you and the writer of the wheel-sieve) and that you each would > have used it exactly once in your life. That's a pretty small user base > for a standard library feature ;-) You can't know that, though. 
It's not my only use, but it's the only use I wrote about because I had just done it the day before this thread started. I also have a prime-generating function inspired by Will Ness's. BTW, that's a beautiful thing in real life, but not primarily because of the wheel gimmicks: you don't have to specify an upper limit in advance (or ever), but it nevertheless consumes memory proportional to the number of primes less than the _square root_ of the largest prime you've generated so far. That's easy if you know the upper limit in advance, but not if you don't. Will's solution to _that_ part is clever, elegant, and eminently practical. In any case, Will is "a functional language" person, and didn't even know itertools existed. Janne Karila pointed it out to him, and Will was like a kid in a candy store. I expect (but don't know) _he_ wondered "huh - how come accumulate() doesn't have an initial value like scanl has?", but rather than gripe about it on a Python list created the "chain a singleton list" workaround instead. Regardless, I wouldn't be surprised a bit if Janne Karila also had similar code - or any number of other Python programmers we don't know about writing code we'll never see. > Tim, if you could muster an honest to goodness, real +1, that would be good > enough for me. On purely _technical_ grounds, given that accumulate() is already thoroughly general, I think adding an optional start value is not just a +1, but a no-brainer. I've still seen no technical argument against it, and several sound technical arguments in its favor have been made by several people. OTOH ... meh. There appears to be scant demand, and code churn also carries costs. I've already wormed around it, so it doesn't scratch any practical itch I still have. It's your job to worry about future generations ;-) > Otherwise, I'm back to -0 and prefer not to see Pythonistas writing the Haskell > magics described in this thread. Whether or not the argument is added is simply irrelevant to what's possible to do with accumulate() already - it just affects how transparently a special initial value can be supplied when one is desired. In all of the (few!) "real life" current uses you've seen, it would obviously make already-sane code simpler & clearer. Against that, worrying about masses of Pythonistas who _could_ make absurd (in Python) code a bit _less_ absurd just doesn't engage me. > If this does go forward, I would greatly prefer "start" rather than "first_value" or "initial". Bike-shedding the name holds little interest for me. Against "start", Nick vehemently objects to that name. I _expected_ that you would too, because you've generally seen _no_ value to specifying an initial value for sums, and "start" is the name `sum()` gives to its optional starting value. In favor of "initial", the current accumulate() _is_ a generalization not of sum() but of reduce(), and: """ Help on built-in function reduce in module _functools: reduce(...) reduce(function, sequence[, initial]) -> value """ That is, the reduce docstring has used "initial" forever. And there's also that the itertoolz accumulate() already has the feature in question, and named its argument "initial". > The conversation has been enjoyable (perhaps because the stakes > are so low) and educational (I learn something new every time you post). Yup, I too prefer fighting to the death over things that don't matter ;-) > I'll leave this will a few random thoughts on itertools that don't seem to fit > anywhere else. 
> > 1) When itertools was created, they were one of easiest ways to get C-like > performance without writing C. However, when PyPy matured we got other > ways to do it. And in the world of PyPy, the plain python for-loops outperform > their iterator chain equivalents, so we lost one motivate to use itertools. I should try PyPy again. I got out of the habit years ago, when I kept running out of RAM. I have a lot more RAM now ;-) > 2) While I personally like function chains operating on iterators, my consulting > and teaching experience has convinced me that very few people think that way. Matches my less extensive experience. > Accordingly, I almost never use compress, filterfalse, takewhile, dropwhile, etc. While I've never used any of those, except once each when they were new. I'm aware of them, but they just never seem to fit the problem at hand. So, just to keep things interesting, let's add an optional `start=` argument to _all_ itertools functions ;-) > As people started adopting PEP 279 generator expressions, interest > in itertool style thinking seems to have waned. > > Putting these two together has left me with a preference for itertools to only cover > the simplest and most common cases, leaving the rest to be expressed as plain, > everyday pure python. (The combinatoric itertools are an exception because > they are more algorithmically interesting). Besides all the combinatoric ones, these are the ones I'd sorely miss: count cycle repeat accumulate chain[.from_iterable] islice tee I'm surprised I left groupby() off that list! I always liked that one "in theory", but in practice - for whatever reasons - whenever I use it I usually end up rewriting the code, in such a way that groupby() no longer makes sense to use. Curious! From mistersheik at gmail.com Mon Apr 9 16:43:15 2018 From: mistersheik at gmail.com (Neil Girdhar) Date: Mon, 9 Apr 2018 13:43:15 -0700 (PDT) Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin] In-Reply-To: References: Message-ID: <24d1b022-0b3b-46fe-87c2-7e606319b1c7@googlegroups.com> On Friday, April 6, 2018 at 9:03:05 PM UTC-4, Raymond Hettinger wrote: > > > On Friday, April 6, 2018 at 8:14:30 AM UTC-7, Guido van Rossum wrote: > > On Fri, Apr 6, 2018 at 7:47 AM, Peter O'Connor > wrote: > >> So some more humble proposals would be: > >> > >> 1) An initializer to itertools.accumulate > >> functools.reduce already has an initializer, I can't see any > controversy to adding an initializer to itertools.accumulate > > > > See if that's accepted in the bug tracker. > > It did come-up once but was closed for a number reasons including lack of > use cases. However, Peter's signal processing example does sound > interesting, so we could re-open the discussion. > > For those who want to think through the pluses and minuses, I've put > together a Q&A as food for thought (see below). Everybody's design > instincts are different -- I'm curious what you all think think about the > proposal. > > > Raymond > > --------------------------------------------- > > Q. Can it be done? > A. Yes, it wouldn't be hard. > > _sentinel = object() > > def accumulate(iterable, func=operator.add, start=_sentinel): > it = iter(iterable) > if start is _sentinel: > try: > total = next(it) > except StopIteration: > return > else: > total = start > yield total > for element in it: > total = func(total, element) > yield total > > Q. Do other languages do it? > A. Numpy, no. R, no. APL, no. Mathematica, no. 
Haskell, yes. > Isn't numpy a yes? https://docs.scipy.org/doc/numpy/reference/generated/numpy.ufunc.accumulate.html They definitely support it for add and multiply. It's defined, but doesn't seem to work on custum ufuncs (the result of frompyfunc). > > * > http://docs.scipy.org/doc/numpy/reference/generated/numpy.ufunc.accumulate.html > * https://stat.ethz.ch/R-manual/R-devel/library/base/html/cumsum.html > * http://microapl.com/apl/apl_concepts_chapter5.html > \+ 1 2 3 4 5 > 1 3 6 10 15 > * https://reference.wolfram.com/language/ref/Accumulate.html > * https://www.haskell.org/hoogle/?hoogle=mapAccumL > > > Q. How much work for a person to do it currently? > A. Almost zero effort to write a simple helper function: > > myaccum = lambda it, func, start: accumulate(chain([start], it), func) > > > Q. How common is the need? > A. Rare. > > > Q. Which would be better, a simple for-loop or a customized itertool? > A. The itertool is shorter but more opaque (especially with respect > to the argument order for the function call): > > result = [start] > for x in iterable: > y = func(result[-1], x) > result.append(y) > > versus: > > result = list(accumulate(iterable, func, start=start)) > > > Q. How readable is the proposed code? > A. Look at the following code and ask yourself what it does: > > accumulate(range(4, 6), operator.mul, start=6) > > Now test your understanding: > > How many values are emitted? > What is the first value emitted? > Are the two sixes related? > What is this code trying to accomplish? > > > Q. Are there potential surprises or oddities? > A. Is it readily apparent which of assertions will succeed? > > a1 = sum(range(10)) > a2 = sum(range(10), 0) > assert a1 == a2 > > a3 = functools.reduce(operator.add, range(10)) > a4 = functools.reduce(operator.add, range(10), 0) > assert a3 == a4 > > a4 = list(accumulate(range(10), operator.add)) > a5 = list(accumulate(range(10), operator.add, start=0)) > assert a5 == a6 > > > Q. What did the Python 3.0 Whatsnew document have to say about reduce()? > A. "Removed reduce(). Use functools.reduce() if you really need it; > however, 99 percent of the time an explicit for loop is more readable." > > > Q. What would this look like in real code? > A. We have almost no real-world examples, but here is one from a > StackExchange post: > > def wsieve(): # wheel-sieve, by Will Ness. > ideone.com/mqO25A->0hIE89 > wh11 = [ 2,4,2,4,6,2,6,4,2,4,6,6, 2,6,4,2,6,4,6,8,4,2,4,2, > 4,8,6,4,6,2,4,6,2,6,6,4, 2,4,6,2,6,4,2,4,2,10,2,10] > cs = accumulate(cycle(wh11), start=11) > yield( next( cs)) # cf. ideone.com/WFv4f > ps = wsieve() # > codereview.stackexchange.com/q/92365/9064 > p = next(ps) # 11 > psq = p*p # 121 > D = dict( zip( accumulate(wh11, start=0), count(0))) # start > from > sieve = {} > for c in cs: > if c in sieve: > wheel = sieve.pop(c) > for m in wheel: > if not m in sieve: > break > sieve[m] = wheel # sieve[143] = wheel at 187 > elif c < psq: > yield c > else: # (c==psq) > # map (p*) (roll wh from p) = roll (wh*p) from (p*p) > x = [p*d for d in wh11] > i = D[ (p-11) % 210] > wheel = accumulate(cycle(x[i:] + x[:i]), start=psq) > p = next(ps) ; psq = p*p > next(wheel) ; m = next(wheel) > sieve[m] = wheel > _______________________________________________ > Python-ideas mailing list > Python... at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From peter.ed.oconnor at gmail.com  Mon Apr  9 18:54:59 2018
From: peter.ed.oconnor at gmail.com (Peter O'Connor)
Date: Mon, 9 Apr 2018 18:54:59 -0400
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To: <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com>
References: <20180406011854.GU16661@ando.pearwood.info>
 <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com>
Message-ID: 

Kyle, you sounded so reasonable when you were trashing
itertools.accumulate (which I now agree is horrible).  But then you go
and support Serhiy's madness: "smooth_signal = [average for average in
[0] for x in signal for average in [(1-decay)*average + decay*x]]",
which I agree is clever, but reads more like a riddle than readable code.

Anyway, I continue to stand by:

    (y := f(y, x) for x in iter_x from y=initial_y)

And, if that's not offensive enough, to its extension:

    (z, y := f(z, x) -> y for x in iter_x from z=initial_z)

Which carries state "z" forward but only yields "y" at each iteration.
(see proposal: https://github.com/petered/peps/blob/master/pep-9999.rst)

Why am I so obsessed?  Because it will allow you to conveniently replace
classes with cleaner, more concise, functional code.  People who thought
they never needed such a construct may suddenly start finding it
indispensable once they get used to it.

How many times have you written something of the form?:

    class StatefulThing(object):

        def __init__(self, initial_state, param_1, param_2):
            self._param_1 = param_1
            self._param_2 = param_2
            self._state = initial_state

        def update_and_get_output(self, new_observation):  # (or just __call__)
            self._state = do_some_state_update(self._state, new_observation, self._param_1)
            output = transform_state_to_output(self._state, self._param_2)
            return output

    processor = StatefulThing(initial_state = initial_state, param_1 = 1, param_2 = 4)
    processed_things = [processor.update_and_get_output(x) for x in x_gen]

I've done this many times.  Video encoding, robot controllers, neural
networks, any iterative machine learning algorithm, and probably lots of
things I don't know about - they all tend to have this general form.

And how many times have I had issues like "Oh no, now I want to change
param_1 on the fly instead of just setting it on initialization, I guess
I have to refactor all usages of this class to pass param_1 into
update_and_get_output instead of __init__".

What if instead I could just write:

    def update_and_get_output(last_state, new_observation, param_1, param_2):
        new_state = do_some_state_update(last_state, new_observation, param_1)
        output = transform_state_to_output(last_state, param_2)
        return new_state, output

    processed_things = [state, output := update_and_get_output(state, x,
        param_1=1, param_2=4) -> output for x in observations from state=initial_state]

Now we have:
- No mutable objects (which cuts off a whole slew of potential bugs and
  anti-patterns familiar to people who do OOP.)
- Fewer lines of code
- Looser assumptions on usage and less refactoring.  (if I want to now
  pass in param_1 at each iteration instead of just initialization, I
  need to make no changes to update_and_get_output).
- No need for state getters/setters, since state is passed around
  explicitly.

I realize that calling for changes to syntax is a lot to ask - but I
still believe that the main objections to this syntax would also have
been raised as objections to the now-ubiquitous list-comprehensions -
they seem hostile and alien-looking at first, but very lovable once you
get used to them.
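For comparison, the closest I can get today is to hide the loop in a
small helper generator - a sketch only, with `scan` and `ema_step` being
names I just made up:

    from functools import partial

    def scan(func, iterable, state):
        # Repeatedly apply func(state, x) -> (new_state, output),
        # yielding only the outputs.
        for x in iterable:
            state, output = func(state, x)
            yield output

    # A concrete step function: an exponential moving average.
    def ema_step(avg, x, decay=0.5):
        new_avg = (1 - decay) * avg + decay * x
        return new_avg, new_avg

    smooth_signal = list(scan(ema_step, [1.0, 2.0, 3.0], 0.0))
    # -> [0.5, 1.25, 2.125]

    # And with the update_and_get_output above (binding the params):
    processed_things = list(scan(
        partial(update_and_get_output, param_1=1, param_2=4),
        observations, initial_state))

But you still have to define and thread the helper around - which is
exactly the boilerplate the proposed syntax is meant to remove.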
On Sun, Apr 8, 2018 at 1:41 PM, Kyle Lahnakoski wrote:

> On 2018-04-05 21:18, Steven D'Aprano wrote:
> > (I don't understand why so many people have such an aversion to writing
> > functions and seek to eliminate them from their code.)
>
> I think I am one of those people that have an aversion to writing
> functions!
>
> I hope you do not mind that I attempt to explain my aversion here.  I
> want to clarify my thoughts on this, and maybe others will find
> something useful in this explanation, maybe someone has wise words for
> me.  I think this is relevant to python-ideas because someone with this
> aversion will make different language suggestions than those that don't.
>
> Here is why I have an aversion to writing functions: Every unread
> function represents multiple unknowns in the code.  Every function adds
> to code complexity by mapping an inaccurate name to specific
> functionality.
>
> When I read code, this is what I see:
>
> > x = you_will_never_guess_how_corner_cases_are_handled(a, b, c)
> > y = you_dont_know_I_throw_a_BaseException_when_I_do_not_like_your_arguments(j, k, l)
>
> Not everyone sees code this way: I see people read method calls, make a
> number of wild assumptions about how those methods work, AND THEY ARE
> CORRECT!  How do they do it!?  It is as if there is some unspoken
> convention about how code should work that's opaque to me.
>
> For example, before I read the docs on
> itertools.accumulate(list_of_length_N, func), here are the unknowns I see:
>
> * Does it return N, or N-1 values?
> * How are initial conditions handled?
> * Must `func` perform the initialization by accepting just one
>   parameter, and accumulate with more-than-one parameter?
> * If `func` is a binary function, and `accumulate` returns N values,
>   what's the Nth value?
> * If `func` is a non-commutative binary function, what order are the
>   arguments passed?
> * Maybe accumulate expects func(*args)?
> * Is there a window size?  Is it equal to the number of arguments of `func`?
>
> These are not all answered by reading the docs, they are answered by
> reading the code.  The code tells me the first value is a special case;
> the first parameter of `func` is the accumulated `total`; `func` is
> applied in order; and an iterator is returned.  Despite all my
> questions, notice I missed asking what `accumulate` returns?  It is the
> unknown unknowns that get me most.
>
> So, `itertools.accumulate` is a kinda-inaccurate name given to a
> specific functionality: Not a problem on its own, and even delightfully
> useful if I need it often.
>
> What if I am in a domain where I see `accumulate` only a few times a
> year?  Or how about a program that uses `accumulate` in only one place?
> For me, I must (re)read the `accumulate` source (or run the caller
> through the debugger) before I know what the code is doing.  In these
> cases I advocate for in-lining the function code to remove these
> unknowns.  Instead of an inaccurate name, there is explicit code.  If we
> are lucky, that explicit code follows idioms that make the increased
> verbosity easier to read.
>
> Consider Serhiy Storchaka's elegant solution, which I reformatted for
> readability
>
> > smooth_signal = [
> >     average
> >     for average in [0]
> >     for x in signal
> >     for average in [(1-decay)*average + decay*x]
> > ]
>
> We see the initial conditions, we see the primary function, we see how
> the accumulation happens, we see the number of returned values, and we
> see it's a list.  It is a compact, easy read, from top to bottom.
> Yes, we must know `for x in [y]` is an idiom for assignment, but we can
> reuse that knowledge in all our other list comprehensions.  So, in the
> specific case of this Reduce-Map thread, I would advocate using the list
> comprehension.
>
> In general, all functions introduce non-trivial code debt: This debt is
> worth it if the function is used enough; but, in single-use or rare-use
> cases, functions can obfuscate.
>
> Thank you for your time.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tim.peters at gmail.com  Mon Apr  9 20:15:35 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 9 Apr 2018 19:15:35 -0500
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was:
 Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: 
References: 
Message-ID: 

Woo hoo!  Another coincidence.  I just happened to be playing with this
problem today:

    You have a large list - xs - of N numbers.  It's necessary to compute
    slice sums sum(xs[i:j]) for a great many slices, 0 <= i <= j <= N.

For concreteness, say xs is a time series representing a toll booth's
receipts by hour across years.  "Management" may ask for all sorts of
sums - by 24-hour period, by week, by month, by year, by season, ...

A little thought showed that sum(xs[i:j]) = sum(xs[:j]) - sum(xs[:i]), so
if we precomputed just the prefix sums, the sum across an arbitrary slice
could be computed thereafter in constant time.  Hard to beat that ;-)

But computing the prefix sums is exactly what accumulate() does!  With one
twist:  while we have N numbers, there are N+1 slice indices.  So
accumulate(xs) doesn't quite work.  It needs to also have a 0 inserted as
the first prefix sum (the empty prefix sum(xs[:0])).

Which is exactly what a this_is_the_initial_value=0 argument would do
for us.

As is, using the chain trick:

    class SliceSummer:
        def __init__(self, xs):
            from itertools import accumulate, chain
            self.N = N = len(xs)
            if not N:
                raise ValueError("need a non-empty sequence")
            self.prefixsum = list(accumulate(chain([0], xs)))
            assert len(self.prefixsum) == N+1

        def slicesum(self, i, j):
            N = self.N
            if not 0 <= i <= j <= N:
                raise ValueError(f"need 0 <= {i} <= {j} <= {N}")
            return self.prefixsum[j] - self.prefixsum[i]

    def test(N):
        from random import randrange
        xs = [randrange(-10, 11) for _ in range(N)]
        ntried = 0
        ss = SliceSummer(xs)
        NP1 = N + 1
        for i in range(NP1):
            for j in range(i, NP1):
                ntried += 1
                assert ss.slicesum(i, j) == sum(xs[i:j])
        assert ntried == N * NP1 // 2 + NP1, ntried

From greg.ewing at canterbury.ac.nz  Mon Apr  9 20:32:21 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 10 Apr 2018 12:32:21 +1200
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was:
 Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: 
References: <83B52AC3-E640-484C-A892-A9724FAEBA26@gmail.com>
Message-ID: <5ACC0615.7090002@canterbury.ac.nz>

Peter O'Connor wrote:
> The behaviour where the first element of the return is the same as the
> first element of the input can be weird and confusing.  E.g.
> compare:
>
> >> list(itertools.accumulate([2, 3, 4], lambda accum, val: accum-val))
> [2, -1, -5]
> >> list(itertools.accumulate([2, 3, 4], lambda accum, val: val-accum))
> [2, 1, 3]

This is another symptom of the fact that the first item
in the list is taken to be the initial value.  There's no
way to interpret these results in terms of an assumed
initial value, because neither of those functions has a
left identity.

-- 
Greg

From brett at python.org  Mon Apr  9 21:10:42 2018
From: brett at python.org (Brett Cannon)
Date: Tue, 10 Apr 2018 01:10:42 +0000
Subject: [Python-ideas] Is there any idea about dictionary destructing?
In-Reply-To: 
References: 
Message-ID: 

On Mon, 9 Apr 2018 at 05:18 Joao S. O. Bueno wrote:

> I have an idea for an innovative, unambiguous, straightforward and
> backwards compatible syntax for that, that even allows one to pass
> metadata along the operation so that the results can be tweaked
> according to each case's needs.
>
> What about:
>
>     new_data = dict_feed({
>         "direct": "some data",
>         "nested": {
>             "lst_data": [1, 2, 3],
>             "int_data": 1
>         }
>     },
>     data
>     )
>
> we could even call this approach a name such as "function call".

The harsh sarcasm is not really called for.

-Brett

>
> In other words, why bloat the language with hard-to-learn, error-prone,
> grit-looking syntax, when a simple plain function call is perfectly
> good?  All you need to do over your suggestion is to type the function
> name and a pair of parentheses.
>
> On 7 April 2018 at 14:26, thautwarm wrote:
> > We know that Python supports the destructing of iterable objects.
> >
> >     m_iter = (_ for _ in range(10))
> >     a, *b, c = m_iter
> >
> > That's pretty cool!  It's really convenient when there're many corner
> > cases to handle with iterable collections.  However destructing in
> > Python could be more convenient if we supported dictionary destructing.
> >
> > In my opinion, dictionary destructing is not difficult to implement and
> > makes the syntax more expressive.  A typical example is data access on
> > nested data structures (just like JSON); destructing a dictionary makes
> > the logic quite clear:
> >
> >     data = {
> >         "direct": "some data",
> >         "nested": {
> >             "lst_data": [1, 2, 3],
> >             "int_data": 1
> >         }
> >     }
> >     {
> >         "direct": direct,
> >         "nested": {
> >             "lst_data": [a, b, c],
> >         }
> >     } = data
> >
> > Dictionary destructing might not be very well-known but it really
> > helps.  The operations on nested key-value collections are very
> > frequent, and the code for business logic is not readable enough until
> > now.  Moreover Python is now popular in data processing, which must be
> > enhanced by the entire support of data destructing.
> >
> > Here are some implementations in other languages:
> > Elixir, which is also a popular dynamic language nowadays.
> >
> >     iex> %{} = %{:a => 1, 2 => :b}
> >     %{2 => :b, :a => 1}
> >     iex> %{:a => a} = %{:a => 1, 2 => :b}
> >     %{2 => :b, :a => 1}
> >     iex> a
> >     1
> >     iex> %{:c => c} = %{:a => 1, 2 => :b}
> >     ** (MatchError) no match of right hand side value: %{2 => :b, :a => 1}
> >
> > And in F#, there is something similar to dictionary destructing
> > (actually, this destructs `struct` instead):
> >
> >     type MyRecord = { Name: string; ID: int }
> >
> >     let IsMatchByName record1 (name: string) =
> >         match record1 with
> >         | { MyRecord.Name = nameFound; MyRecord.ID = _; } when nameFound = name -> true
> >         | _ -> false
> >
> >     let recordX = { Name = "Parker"; ID = 10 }
> >     let isMatched1 = IsMatchByName recordX "Parker"
> >     let isMatched2 = IsMatchByName recordX "Hartono"
> >
> > All of them partially destruct (or match) a dictionary.
> >
> > thautwarm
> >
> > _______________________________________________
> > Python-ideas mailing list
> > Python-ideas at python.org
> > https://mail.python.org/mailman/listinfo/python-ideas
> > Code of Conduct: http://python.org/psf/codeofconduct/
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From greg.ewing at canterbury.ac.nz  Mon Apr  9 21:12:41 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 10 Apr 2018 13:12:41 +1200
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was:
 Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: 
References: 
Message-ID: <5ACC0F89.9080507@canterbury.ac.nz>

Tim Peters wrote:
> while we have N numbers, there are N+1 slice indices.  So
> accumulate(xs) doesn't quite work.  It needs to also have a 0 inserted
> as the first prefix sum (the empty prefix sum(xs[:0])).
>
> Which is exactly what a this_is_the_initial_value=0 argument would do
> for us.

In this case, yes.  But that still doesn't mean it makes
sense to require the initial value to be passed *in* as
part of the input sequence.

Maybe the best idea is for the initial value to be a
separate argument, but be returned as the first item in
the list.

I can think of another example where this would make
sense.  Suppose you have an initial bank balance and a
list of transactions, and you want to produce a statement
with a list of running balances.

The initial balance and the list of transactions are
coming from different places, so the most natural way
to call it would be

    result = accumulate(transactions, initial = initial_balance)

If the initial value is returned as item 0, then the
result has the following properties:

    result[0] is the balance brought forward
    result[-1] is the current balance

and this remains true in the corner case where there are
no transactions.

-- 
Greg

From jsbueno at python.org.br  Mon Apr  9 21:23:04 2018
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Mon, 9 Apr 2018 22:23:04 -0300
Subject: [Python-ideas] Is there any idea about dictionary destructing?
In-Reply-To: 
References: 
Message-ID: 

On 9 April 2018 at 22:10, Brett Cannon wrote:
>
> On Mon, 9 Apr 2018 at 05:18 Joao S. O. Bueno wrote:
>>
>> we could even call this approach a name such as "function call".
>
> The harsh sarcasm is not really called for.

Indeed - on rereading, I have to agree on that.

I do apologize for the sarcasm. - really, I not only stand corrected:
I recognize I was incorrect to start with.
But my argument that this feature is needless language bloat stands. On the othe hand, as for getting variable names out of _shallow_ mappings, I've built that feature in a package I authored, using a context manager to abuse the import mechanism - In [96]: from extradict import MapGetter In [97]: data = {"A": None, "B": 10} In [98]: with MapGetter(data): ...: from data import A, B ...: In [99]: A, B Out[99]: (None, 10) That is on Pypi and can be used by anyone right now. From mertz at gnosis.cx Mon Apr 9 21:46:11 2018 From: mertz at gnosis.cx (David Mertz) Date: Tue, 10 Apr 2018 01:46:11 +0000 Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin In-Reply-To: References: <20180406011854.GU16661@ando.pearwood.info> <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com> Message-ID: I continue to find all this weird new syntax to create absurdly long one-liners confusing and mysterious. Python is not Perl for a reason. On Mon, Apr 9, 2018, 5:55 PM Peter O'Connor wrote: > Kyle, you sounded so reasonable when you were trashing > itertools.accumulate (which I now agree is horrible). But then you go and > support Serhiy's madness: "smooth_signal = [average for average in [0] for > x in signal for average in [(1-decay)*average + decay*x]]" which I agree > is clever, but reads more like a riddle than readable code. > > Anyway, I continue to stand by: > > (y:= f(y, x) for x in iter_x from y=initial_y) > > And, if that's not offensive enough, to its extension: > > (z, y := f(z, x) -> y for x in iter_x from z=initial_z) > > Which carries state "z" forward but only yields "y" at each iteration. > (see proposal: https://github.com/petered/peps/blob/master/pep-9999.rst) > > Why am I so obsessed? Because it will allow you to conveniently replace > classes with more clean, concise, functional code. People who thought they > never needed such a construct may suddenly start finding it indispensable > once they get used to it. > > How many times have you written something of the form?: > > class StatefulThing(object): > > def __init__(self, initial_state, param_1, param_2): > self._param_1= param_1 > self._param_2 = param_2 > self._state = initial_state > > def update_and_get_output(self, new_observation): # (or just > __call__) > self._state = do_some_state_update(self._state, > new_observation, self._param_1) > output = transform_state_to_output(self._state, self._param_2) > return output > > processor = StatefulThing(initial_state = initial_state, param_1 = 1, > param_2 = 4) > processed_things = [processor.update_and_get_output(x) for x in x_gen] > > I've done this many times. Video encoding, robot controllers, neural > networks, any iterative machine learning algorithm, and probably lots of > things I don't know about - they all tend to have this general form. > > And how many times have I had issues like "Oh no now I want to change > param_1 on the fly instead of just setting it on initialization, I guess I > have to refactor all usages of this class to pass param_1 into > update_and_get_output instead of __init__". 
> > What if instead I could just write: > > def update_and_get_output(last_state, new_observation, param_1, > param_2) > new_state = do_some_state_update(last_state, new_observation, > _param_1) > output = transform_state_to_output(last_state, _param_2) > return new_state, output > > processed_things = [state, output:= update_and_get_output(state, x, > param_1=1, param_2=4) -> output for x in observations from > state=initial_state] > > Now we have: > - No mutable objects (which cuts of a whole slew of potential bugs and > anti-patterns familiar to people who do OOP.) > - Fewer lines of code > - Looser assumptions on usage and less refactoring. (if I want to now pass > in param_1 at each iteration instead of just initialization, I need to make > no changes to update_and_get_output). > - No need for state getters/setters, since state is is passed around > explicitly. > > I realize that calling for changes to syntax is a lot to ask - but I still > believe that the main objections to this syntax would also have been raised > as objections to the now-ubiquitous list-comprehensions - they seem hostile > and alien-looking at first, but very lovable once you get used to them. > > > > > On Sun, Apr 8, 2018 at 1:41 PM, Kyle Lahnakoski > wrote: > >> >> >> On 2018-04-05 21:18, Steven D'Aprano wrote: >> > (I don't understand why so many people have such an aversion to writing >> > functions and seek to eliminate them from their code.) >> > >> >> I think I am one of those people that have an aversion to writing >> functions! >> >> I hope you do not mind that I attempt to explain my aversion here. I >> want to clarify my thoughts on this, and maybe others will find >> something useful in this explanation, maybe someone has wise words for >> me. I think this is relevant to python-ideas because someone with this >> aversion will make different language suggestions than those that don't. >> >> Here is why I have an aversion to writing functions: Every unread >> function represents multiple unknowns in the code. Every function adds >> to code complexity by mapping an inaccurate name to specific >> functionality. >> >> When I read code, this is what I see: >> >> > x = you_will_never_guess_how_corner_cases_are_handled(a, b, c) >> > y = >> you_dont_know_I_throw_a_BaseException_when_I_do_not_like_your_arguments(j, >> k, l) >> >> Not everyone sees code this way: I see people read method calls, make a >> number of wild assumptions about how those methods work, AND THEY ARE >> CORRECT! How do they do it!? It is as if there are some unspoken >> convention about how code should work that's opaque to me. >> >> For example before I read the docs on >> itertools.accumulate(list_of_length_N, func), here are the unknowns I see: >> >> * Does it return N, or N-1 values? >> * How are initial conditions handled? >> * Must `func` perform the initialization by accepting just one >> parameter, and accumulate with more-than-one parameter? >> * If `func` is a binary function, and `accumulate` returns N values, >> what's the Nth value? >> * if `func` is a non-cummutative binary function, what order are the >> arguments passed? >> * Maybe accumulate expects func(*args)? >> * Is there a window size? Is it equal to the number of arguments of >> `func`? >> >> These are not all answered by reading the docs, they are answered by >> reading the code. The code tells me the first value is a special case; >> the first parameter of `func` is the accumulated `total`; `func` is >> applied in order; and an iterator is returned. 
Despite all my >> questions, notice I missed asking what `accumulate` returns? It is the >> unknown unknowns that get me most. >> >> So, `itertools.accumulate` is a kinda-inaccurate name given to a >> specific functionality: Not a problem on its own, and even delightfully >> useful if I need it often. >> >> What if I am in a domain where I see `accumulate` only a few times a >> year? Or how about a program that uses `accumulate` in only one place? >> For me, I must (re)read the `accumulate` source (or run the caller >> through the debugger) before I know what the code is doing. In these >> cases I advocate for in-lining the function code to remove these >> unknowns. Instead of an inaccurate name, there is explicit code. If we >> are lucky, that explicit code follows idioms that make the increased >> verbosity easier to read. >> >> Consider Serhiy Storchaka's elegant solution, which I reformatted for >> readability >> >> > smooth_signal = [ >> > average >> > for average in [0] >> > for x in signal >> > for average in [(1-decay)*average + decay*x] >> > ] >> >> We see the initial conditions, we see the primary function, we see how >> the accumulation happens, we see the number of returned values, and we >> see it's a list. It is a compact, easy read, from top to bottom. Yes, we >> must know `for x in [y]` is an idiom for assignment, but we can reuse >> that knowledge in all our other list comprehensions. So, in the >> specific case of this Reduce-Map thread, I would advocate using the list >> comprehension. >> >> In general, all functions introduce non-trivial code debt: This debt is >> worth it if the function is used enough; but, in single-use or rare-use >> cases, functions can obfuscate. >> >> >> >> Thank you for your time. >> >> >> >> >> >> >> >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Mon Apr 9 22:30:42 2018 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 9 Apr 2018 21:30:42 -0500 Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin] In-Reply-To: <5ACC0F89.9080507@canterbury.ac.nz> References: <5ACC0F89.9080507@canterbury.ac.nz> Message-ID: [Tim] >> while we have N numbers, there are N+1 slice indices. So >> accumulate(xs) doesn't quite work. It needs to also have a 0 inserted >> as the first prefix sum (the empty prefix sum(xs[:0]). >> >> Which is exactly what a this_is_the_initial_value=0 argument would do >> for us. [Greg Ewing ] > In this case, yes. But that still doesn't mean it makes > sense to require the initial value to be passed *in* as > part of the input sequence. > > Maybe the best idea is for the initial value to be a > separate argument, but be returned as the first item in > the list. I'm not sure you've read all the messages in this thread, but that's exactly what's being proposed. That. 
e.g., a new optional argument: accumulate(xs, func, initial=S) act like the current accumulate(chain([S], xs), func) Note that in neither case is the original `xs` modified in any way, and in both cases the first value generated is S. Note too that the proposal is exactly the way Haskell's `scanl` works (although `scanl` always requires specifying an initial value - while the related `scanl1` doesn't allow specifying one). And that's all been so since the thread's first message, in which Raymond gave a proposed implementation: _sentinel = object() def accumulate(iterable, func=operator.add, start=_sentinel): it = iter(iterable) if start is _sentinel: try: total = next(it) except StopIteration: return else: total = start yield total for element in it: total = func(total, element) yield total > I can think of another example where this would make > sense. Suppose you have an initial bank balance and a > list of transactions, and you want to produce a statement > with a list of running balances. > > The initial balance and the list of transactions are > coming from different places, so the most natural way > to call it would be > > result = accumulate(transactions, initial = initial_balance) > > If the initial value is returned as item 0, then the > result has the following properties: > > result[0] is the balance brought forward > result[-1] is the current balance > > and this remains true in the corner case where there are > no transactions. Indeed, something quite similar often applies when parallelizing search loops of the form: for candidate in accumulate(chain([starting_value], cycle(deltas))): For a sequence that eventually becomes periodic in the sequence of deltas it cycles through, multiple processes can run independent searches starting at carefully chosen different starting values "far" apart. In effect, they're each a "balance brought forward" pretending that previous chunks have already been done. Funny: it's been weeks now since I wrote an accumulate() that _didn't_ want to specify a starting value - LOL ;-) From peter.ed.oconnor at gmail.com Mon Apr 9 23:55:55 2018 From: peter.ed.oconnor at gmail.com (Peter O'Connor) Date: Mon, 9 Apr 2018 23:55:55 -0400 Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin] In-Reply-To: References: <5ACC0F89.9080507@canterbury.ac.nz> Message-ID: Ok, so it seems everyone's happy with adding an initial_value argument. Now, I claim that while it should be an option, the initial value should NOT be returned by default. (i.e. the returned generator should by default yield N elements, not N+1). Example: suppose we're doing the toll booth thing, and we want to yield a cumulative sum of tolls so far. Suppose someone already made a reasonable-looking generator yielding the cumulative sum of tolls for today: def iter_cumsum_tolls_from_day(day, toll_amount_so_far): return accumulate(get_tolls_from_day(day, initial=toll_amount_so_far)) And now we want to make a way to get all tolls from the month. One might reasonably expect this to work: def iter_cumsum_tolls_from_month(month, toll_amount_so_far): for day in month: for cumsum_tolls in iter_cumsum_tolls_from_day(day, toll_amount_so_far = toll_amount_so_far): yield cumsum_tolls toll_amount_so_far = cumsum_tolls But this would actually DUPLICATE the last toll of every day - it appears both as the last element of the day's generator and as the first element of the next day's generator. 
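A toy run of that failure, with made-up toll amounts (emulating the
proposed initial= with the chain() trick for now):

    from itertools import accumulate, chain

    def cumsum_tolls(tolls, initial):
        # stand-in for the proposed accumulate(tolls, initial=initial)
        return accumulate(chain([initial], tolls))

    day1 = list(cumsum_tolls([1, 2], initial=0))       # [0, 1, 3]
    day2 = list(cumsum_tolls([4], initial=day1[-1]))   # [3, 7]
    month = day1 + day2
    # [0, 1, 3, 3, 7] - the boundary toll 3 is reported twice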
This is why I think that there should be an additional " include_initial_in_return=False" argument. I do agree that it should be an option to include the initial value (your "find tolls over time-span" example shows why), but that if you want that you should have to show that you thought about that by specifying "include_initial_in_return=True" On Mon, Apr 9, 2018 at 10:30 PM, Tim Peters wrote: > [Tim] > >> while we have N numbers, there are N+1 slice indices. So > >> accumulate(xs) doesn't quite work. It needs to also have a 0 inserted > >> as the first prefix sum (the empty prefix sum(xs[:0]). > >> > >> Which is exactly what a this_is_the_initial_value=0 argument would do > >> for us. > > [Greg Ewing ] > > In this case, yes. But that still doesn't mean it makes > > sense to require the initial value to be passed *in* as > > part of the input sequence. > > > > Maybe the best idea is for the initial value to be a > > separate argument, but be returned as the first item in > > the list. > > I'm not sure you've read all the messages in this thread, but that's > exactly what's being proposed. That. e.g., a new optional argument: > > accumulate(xs, func, initial=S) > > act like the current > > accumulate(chain([S], xs), func) > > Note that in neither case is the original `xs` modified in any way, > and in both cases the first value generated is S. > > Note too that the proposal is exactly the way Haskell's `scanl` works > (although `scanl` always requires specifying an initial value - while > the related `scanl1` doesn't allow specifying one). > > And that's all been so since the thread's first message, in which > Raymond gave a proposed implementation: > > _sentinel = object() > > def accumulate(iterable, func=operator.add, start=_sentinel): > it = iter(iterable) > if start is _sentinel: > try: > total = next(it) > except StopIteration: > return > else: > total = start > yield total > for element in it: > total = func(total, element) > yield total > > > I can think of another example where this would make > > sense. Suppose you have an initial bank balance and a > > list of transactions, and you want to produce a statement > > with a list of running balances. > > > > The initial balance and the list of transactions are > > coming from different places, so the most natural way > > to call it would be > > > > result = accumulate(transactions, initial = initial_balance) > > > > If the initial value is returned as item 0, then the > > result has the following properties: > > > > result[0] is the balance brought forward > > result[-1] is the current balance > > > > and this remains true in the corner case where there are > > no transactions. > > Indeed, something quite similar often applies when parallelizing > search loops of the form: > > for candidate in accumulate(chain([starting_value], cycle(deltas))): > > For a sequence that eventually becomes periodic in the sequence of > deltas it cycles through, multiple processes can run independent > searches starting at carefully chosen different starting values "far" > apart. In effect, they're each a "balance brought forward" pretending > that previous chunks have already been done. 
> > Funny: it's been weeks now since I wrote an accumulate() that > _didn't_ want to specify a starting value - LOL ;-) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.ed.oconnor at gmail.com Tue Apr 10 00:11:52 2018 From: peter.ed.oconnor at gmail.com (Peter O'Connor) Date: Tue, 10 Apr 2018 00:11:52 -0400 Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin] In-Reply-To: References: <5ACC0F89.9080507@canterbury.ac.nz> Message-ID: * correction to brackets from first example: def iter_cumsum_tolls_from_day(day, toll_amount_so_far): return accumulate(get_tolls_from_day(day), initial=toll_amount_so_far) On Mon, Apr 9, 2018 at 11:55 PM, Peter O'Connor wrote: > Ok, so it seems everyone's happy with adding an initial_value argument. > > Now, I claim that while it should be an option, the initial value should > NOT be returned by default. (i.e. the returned generator should by default > yield N elements, not N+1). > > Example: suppose we're doing the toll booth thing, and we want to yield a > cumulative sum of tolls so far. Suppose someone already made a > reasonable-looking generator yielding the cumulative sum of tolls for today: > > def iter_cumsum_tolls_from_day(day, toll_amount_so_far): > return accumulate(get_tolls_from_day(day, initial=toll_amount_so_far)) > > And now we want to make a way to get all tolls from the month. One might > reasonably expect this to work: > > def iter_cumsum_tolls_from_month(month, toll_amount_so_far): > for day in month: > for cumsum_tolls in iter_cumsum_tolls_from_day(day, > toll_amount_so_far = toll_amount_so_far): > yield cumsum_tolls > toll_amount_so_far = cumsum_tolls > > But this would actually DUPLICATE the last toll of every day - it appears > both as the last element of the day's generator and as the first element of > the next day's generator. > > This is why I think that there should be an additional " > include_initial_in_return=False" argument. I do agree that it should be > an option to include the initial value (your "find tolls over time-span" > example shows why), but that if you want that you should have to show that > you thought about that by specifying "include_initial_in_return=True" > > > > > > On Mon, Apr 9, 2018 at 10:30 PM, Tim Peters wrote: > >> [Tim] >> >> while we have N numbers, there are N+1 slice indices. So >> >> accumulate(xs) doesn't quite work. It needs to also have a 0 inserted >> >> as the first prefix sum (the empty prefix sum(xs[:0]). >> >> >> >> Which is exactly what a this_is_the_initial_value=0 argument would do >> >> for us. >> >> [Greg Ewing ] >> > In this case, yes. But that still doesn't mean it makes >> > sense to require the initial value to be passed *in* as >> > part of the input sequence. >> > >> > Maybe the best idea is for the initial value to be a >> > separate argument, but be returned as the first item in >> > the list. >> >> I'm not sure you've read all the messages in this thread, but that's >> exactly what's being proposed. That. 
e.g., a new optional argument: >> >> accumulate(xs, func, initial=S) >> >> act like the current >> >> accumulate(chain([S], xs), func) >> >> Note that in neither case is the original `xs` modified in any way, >> and in both cases the first value generated is S. >> >> Note too that the proposal is exactly the way Haskell's `scanl` works >> (although `scanl` always requires specifying an initial value - while >> the related `scanl1` doesn't allow specifying one). >> >> And that's all been so since the thread's first message, in which >> Raymond gave a proposed implementation: >> >> _sentinel = object() >> >> def accumulate(iterable, func=operator.add, start=_sentinel): >> it = iter(iterable) >> if start is _sentinel: >> try: >> total = next(it) >> except StopIteration: >> return >> else: >> total = start >> yield total >> for element in it: >> total = func(total, element) >> yield total >> >> > I can think of another example where this would make >> > sense. Suppose you have an initial bank balance and a >> > list of transactions, and you want to produce a statement >> > with a list of running balances. >> > >> > The initial balance and the list of transactions are >> > coming from different places, so the most natural way >> > to call it would be >> > >> > result = accumulate(transactions, initial = initial_balance) >> > >> > If the initial value is returned as item 0, then the >> > result has the following properties: >> > >> > result[0] is the balance brought forward >> > result[-1] is the current balance >> > >> > and this remains true in the corner case where there are >> > no transactions. >> >> Indeed, something quite similar often applies when parallelizing >> search loops of the form: >> >> for candidate in accumulate(chain([starting_value], cycle(deltas))): >> >> For a sequence that eventually becomes periodic in the sequence of >> deltas it cycles through, multiple processes can run independent >> searches starting at carefully chosen different starting values "far" >> apart. In effect, they're each a "balance brought forward" pretending >> that previous chunks have already been done. >> >> Funny: it's been weeks now since I wrote an accumulate() that >> _didn't_ want to specify a starting value - LOL ;-) >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Tue Apr 10 01:32:26 2018 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 10 Apr 2018 00:32:26 -0500 Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin] In-Reply-To: References: <5ACC0F89.9080507@canterbury.ac.nz> Message-ID: [Peter O'Connor ] > Ok, so it seems everyone's happy with adding an initial_value argument. Heh - that's not clear to me ;-) > Now, I claim that while it should be an option, the initial value should NOT > be returned by default. (i.e. the returned generator should by default > yield N elements, not N+1). -1 on that. - It goes against prior art. Haskell's scanl does return the initial value, and nobody on the planet has devoted more quality thought to how streams "should work" than those folks. The Python itertoolz package's accumulate already supports an optional `initial=` argument, and also returns it when specified. 
It requires truly compelling arguments to go against prior art.

- It's "obvious" that the first value should be returned if specified.
  The best evidence: in this thread's first message, it was "so obvious"
  to Raymond that the implementation he suggested did exactly that.  I
  doubt it even occurred to him to question whether it should.  It didn't
  occur to me either, but my mind is arguably "polluted" by significant
  prior exposure to functional languages.

- In all but one "real life" example so far (including the slice-summer
  class I stumbled into today), the code _wanted_ the initial value to
  be returned.  The sole exception was one of the three instances in
  Will Ness's wheel sieve code, where he discarded the unwanted (in that
  specific case) initial value via a plain

      next(wheel)

  Which is telling:  it's easy to discard a value you don't want, but to
  inject a value you _do_ want but don't get requires something like
  reintroducing the

      chain([value_i_want], the_iterable_that_didn't_give_the_value_i_want)

  trick the new optional argument is trying to get _away_ from.  Talk
  about ironic ;-)

I would like to see a simple thing added to itertools to make dropping
unwanted values easier, though:

    """
    drop(iterable, n=None)

    Return an iterator whose next() method returns all but the first `n`
    values from the iterable.  If specified, `n` must be an integer >= 0.
    By default (`n`=None), the iterator is run to exhaustion.
    """

Then, e.g.,

- drop(it, 0) would effectively be a synonym for iter(it).

- drop(it, 1) would skip over the first value from the iterable.

- drop(it) would give "the one obvious way" to consume an iterator
  completely (for some reason that's a semi-FAQ, and is usually answered
  by suggesting the excruciatingly obscure trick of feeding the iterable
  to a 0-size collections.deque constructor).

Of course Haskell has had `drop` all along, although not the "run to
exhaustion" part.
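FWIW, a reference implementation is only a few lines on top of the
documented itertools recipes (a sketch - this is just the consume()
recipe with the iterator handed back):

    from collections import deque
    from itertools import islice

    def drop(iterable, n=None):
        it = iter(iterable)
        if n is None:
            deque(it, maxlen=0)            # exhaust it entirely
        else:
            next(islice(it, n, n), None)   # advance past the first n values
        return it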
> Example: suppose we're doing the toll booth thing, and we want to yield a
> cumulative sum of tolls so far.  Suppose someone already made a
> reasonable-looking generator yielding the cumulative sum of tolls for today:
>
>     def iter_cumsum_tolls_from_day(day, toll_amount_so_far):
>         return accumulate(get_tolls_from_day(day, initial=toll_amount_so_far))
>
> And now we want to make a way to get all tolls from the month.  One might
> reasonably expect this to work:
>
>     def iter_cumsum_tolls_from_month(month, toll_amount_so_far):
>         for day in month:
>             for cumsum_tolls in iter_cumsum_tolls_from_day(day,
>                     toll_amount_so_far = toll_amount_so_far):
>                 yield cumsum_tolls
>             toll_amount_so_far = cumsum_tolls
>
> But this would actually DUPLICATE the last toll of every day - it appears
> both as the last element of the day's generator and as the first element of
> the next day's generator.

I didn't really follow the details there, but the suggestion would be the
same regardless:  drop the duplicates you don't want.

Note that making up an example in your head isn't nearly as persuasive as
"real life" code.  Code can be _contrived_ to "prove" anything.

> This is why I think that there should be an additional
> "include_initial_in_return=False" argument.  I do agree that it should be
> an option to include the initial value (your "find tolls over time-span"
> example shows why), but that if you want that you should have to show that
> you thought about that by specifying "include_initial_in_return=True"

It's generally poor human design to have a second optional argument modify
the behavior of yet another optional argument.  If the presence of the
latter can have two distinct modes of operation, then people _will_ forget
which one the default mode is, making code harder to write and harder to
read.

Since "return the value" is supported by all known prior art, and by the
bulk of "real life" Python codes known so far, "return the value" should
be the default.  But far better to make it the only mode rather than
_just_ the default mode.  Then there's nothing to be forgotten :-)

From turnbull.stephen.fw at u.tsukuba.ac.jp  Tue Apr 10 02:35:19 2018
From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Tue, 10 Apr 2018 15:35:19 +0900
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To: <5ACA9EB0.5000801@canterbury.ac.nz>
References: <20180406011854.GU16661@ando.pearwood.info>
 <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com>
 <5ACA9EB0.5000801@canterbury.ac.nz>
Message-ID: <23244.23335.302165.346462@turnbull.sk.tsukuba.ac.jp>

Greg Ewing writes:
> Kyle Lahnakoski wrote:
>
> > Consider Serhiy Storchaka's elegant solution, which I reformatted for
> > readability
> >
> >> smooth_signal = [
> >>     average
> >>     for average in [0]
> >>     for x in signal
> >>     for average in [(1-decay)*average + decay*x]
> >> ]
>
> "Elegant" isn't the word I would use, more like "clever".  Rather
> too clever, IMO -- it took me some head scratching to figure out
> how it does what it does.

After reading the thread where it was first mentioned (on what, I now
forget; I guess it was a PEP 572 precursor discussion?), I cannot
unsee the "variable for variable in singleton" initialization idiom.
YMMV, of course.  That's just my experience.

> And it would have taken even more head scratching, except there's a
> clue as to *what* it's supposed to be doing: the fact that it's
> assigned to something called "smooth_signal"

Of course that hint was welcome, and hand to scalp motion was
initiated.  But then I "got it" and scratched my dog's head instead of
my own. :-)

Could we find a better syntax to express this?  Probably, but none of
the ones I've seen so far (including PEP 572) grab me and make my
heart throb.  Is this TOOWTDI?  Not yet, and maybe never.  But for now
it works.

Steve

From turnbull.stephen.fw at u.tsukuba.ac.jp  Tue Apr 10 02:35:52 2018
From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Tue, 10 Apr 2018 15:35:52 +0900
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was:
 Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: 
References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com>
Message-ID: <23244.23368.479697.724223@turnbull.sk.tsukuba.ac.jp>

Tim Peters writes:

> "Sum reduction" and "running-sum accumulation" are primitives in
> many peoples' brains.

I wonder what Kahneman would say about that.  He goes to some length
to explain that people are quite good (as human abilities go) at
perceiving averages over sets but terrible at summing the same.  Maybe
they substitute the abstraction of summation for the ability to
perform the operation?

Steve

From yaoxiansamma at gmail.com  Tue Apr 10 03:29:08 2018
From: yaoxiansamma at gmail.com (Thautwarm Zhao)
Date: Tue, 10 Apr 2018 15:29:08 +0800
Subject: [Python-ideas] Is there any idea about dictionary destructing?
In-Reply-To: 
References: 
Message-ID: 

Your library seems to make it difficult to extract values from a nested
dictionary, and it is also awkward when the key is not an identifier.
For sure we can have a library using graphql syntax to extract data from
the dict of any schema, but that's not my point.

I'm focused on the consistency of the language itself.

    {key: value_pattern, **_} = {key: value, **_}

The reason why it's important is that, when destructing/constructing for
built-in data structures is not supported completely, people might ask
why "[a, *b] = c" is ok but "{"a": a, **b} = c" is not.

If only multiple assignment is supported, why is "(a, (b, c)) = d" ok?
It's exactly destructing!  And then if people think destructing of data
structures is ok, they might be curious about what the boundary of this
feature is.  If you just tell them, "oh, it's just special, it just
works for iterables; even though you can partially destruct the dict in
argument passing, you cannot use it in a simple statement!"

    >> def f(a, b, **c):
           print(a, b, c)
    >> f(a=1, **{'b': 2})
    1 2 {}

    >> {'a': a, 'b': b, **c} = {'a': 1, **{'b': 2}}
    SyntaxError: can't assign to literal

The above example could be confusing to some degree, I think.  If we
didn't have so many convenient helpers for function calls, in terms of
consistency, it might be even better...

2018-04-10 9:23 GMT+08:00 Joao S. O. Bueno :

> On 9 April 2018 at 22:10, Brett Cannon wrote:
> >
> > On Mon, 9 Apr 2018 at 05:18 Joao S. O. Bueno wrote:
> >>
> >> we could even call this approach a name such as "function call".
> >
> > The harsh sarcasm is not really called for.
>
> Indeed - on rereading, I have to agree on that.
>
> I do apologize for the sarcasm. - really, I not only stand corrected:
> I recognize I was incorrect to start with.
>
> But my argument that this feature is needless language bloat stands.
>
> On the other hand, as for getting variable names out of _shallow_ mappings,
> I've built that feature in a package I authored, using a context manager
> to abuse the import mechanism -
>
> In [96]: from extradict import MapGetter
>
> In [97]: data = {"A": None, "B": 10}
>
> In [98]: with MapGetter(data):
>     ...:     from data import A, B
>     ...:
>
> In [99]: A, B
> Out[99]: (None, 10)
>
> That is on Pypi and can be used by anyone right now.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From desmoulinmichel at gmail.com  Tue Apr 10 04:18:01 2018
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Tue, 10 Apr 2018 10:18:01 +0200
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To: 
References: <20180406011854.GU16661@ando.pearwood.info>
 <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com>
Message-ID: 

On 10/04/2018 at 00:54, Peter O'Connor wrote:
> Kyle, you sounded so reasonable when you were trashing
> itertools.accumulate (which I now agree is horrible).  But then you go
> and support Serhiy's madness: "smooth_signal = [average for average in
> [0] for x in signal for average in [(1-decay)*average + decay*x]]",
> which I agree is clever, but reads more like a riddle than readable code.
>
> Anyway, I continue to stand by:
>
>     (y := f(y, x) for x in iter_x from y=initial_y)
>
> And, if that's not offensive enough, to its extension:
>
>     (z, y := f(z, x) -> y for x in iter_x from z=initial_z)
>
> Which carries state "z" forward but only yields "y" at each iteration.
> (see proposal: https://github.com/petered/peps/blob/master/pep-9999.rst)
>
> Why am I so obsessed?  Because it will allow you to conveniently replace
> classes with cleaner, more concise, functional code.  People who thought
> they never needed such a construct may suddenly start finding it
> indispensable once they get used to it.
>
> How many times have you written something of the form?:
>
>     class StatefulThing(object):
>
>         def __init__(self, initial_state, param_1, param_2):
>             self._param_1 = param_1
>             self._param_2 = param_2
>             self._state = initial_state
>
>         def update_and_get_output(self, new_observation):  # (or just __call__)
>             self._state = do_some_state_update(self._state, new_observation, self._param_1)
>             output = transform_state_to_output(self._state, self._param_2)
>             return output
>
>     processor = StatefulThing(initial_state = initial_state, param_1 = 1, param_2 = 4)
>     processed_things = [processor.update_and_get_output(x) for x in x_gen]
>
> I've done this many times.  Video encoding, robot controllers, neural
> networks, any iterative machine learning algorithm, and probably lots of
> things I don't know about - they all tend to have this general form.

Personally I never have to do that very often.  But let's say for the
sake of the argument there is a class of problem a part of the Python
community often solves with this pattern.  After all, Python is a
versatile language with a very large and diverse user base.

First, why would a class be a bad thing?  It's clear, easy to
understand, debug and extend.  Besides, do_some_state_update and
transform_state_to_output may very well be methods.

Second, if you really don't want a class, use a coroutine; that's
exactly what they are for:

    def stateful_thing(state, param_1, param_2, output=None):
        while True:
            new_observation = yield output
            state = do_some_state_update(state, new_observation, param_1)
            output = transform_state_to_output(state, param_2)

    processor = stateful_thing(1, 1, 4)
    next(processor)
    processed_things = [processor.send(x) for x in x_gen]

If you have that much of a complex workflow, you really should not make
that a one-liner.

And before trying to ask for a new syntax in the language, try to solve
the problem with the existing tools.

I know, I get the frustration.  I've been trying to get slicing on
generators and inline try/except on this mailing list for years, and
I've been told no again and again.  It's hard.  But it's also why Python
has stayed sane for decades.

From steve at pearwood.info  Tue Apr 10 05:21:35 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 10 Apr 2018 19:21:35 +1000
Subject: [Python-ideas] Is there any idea about dictionary destructing?
In-Reply-To: 
References: 
Message-ID: <20180410092135.GL16661@ando.pearwood.info>

On Tue, Apr 10, 2018 at 03:29:08PM +0800, Thautwarm Zhao wrote:

> I'm focused on the consistency of the language itself.

Consistency is good, but it is not the only factor to consider.  We must
guard against *foolish* consistency: adding features just for the sake
of matching some other, often barely related, feature.  Each feature
must justify itself, and consistency with something else is merely one
possible attempt at justification.

> {key: value_pattern, **_} = {key: value, **_}

If I saw that, I would have no idea what it could even possibly do.
Let's pick the simplest concrete example I can think of:

    {'A': 1, **{}} = {'A': 0, **{}}

I cannot interpret what that should do.  Is it some sort of
pattern-matching?  An update?  What is the result?  It is obviously some
sort of binding operation, an assignment, but an assignment to what?
Sequence binding and unpacking was obvious the first time I saw it. I had no problem guessing what: a, b, c = 1, 2, 3 meant, and once I had seen that, it wasn't hard to guess what a, b, c = *sequence meant. From there it is easy to predict extended unpacking. But I can't say the same for this. I can almost see the point of: a, b, c, = **{'a': 1, 'b': 2, 'c': 3} but I'm having trouble thinking of a situation where I would actually use it. But your syntax above just confuses me. > The reason why it's important is that, when destructing/constructing for > built-in data structures are not supported completely, > people might ask why "[a, *b] = c" is ok but "{"a": a, **b} = c" not. People can ask all sorts of questions. I've seen people ask why Python doesn't support line numbers and GOTO. We're allowed to answer "Because it is a bad idea", or even "Because we don't think it is good enough to justify the cost". > If only multiple assignment is supported, why "(a, (b, c)) = d" could be > ok? It's exactly destructing! That syntax is supported. I don't understand your point here. > >> {'a': a, 'b': b, **c} = {'a': 1, **{'b': 2}} > SyntaxError: can't assign to literal > > Above example could be confusing in some degree, I think. I have no idea what you expect it to do. Even something simpler: {'a': a} = {'a': 2} leaves me in the dark. -- Steve From j.van.dorp at deonet.nl Tue Apr 10 05:52:40 2018 From: j.van.dorp at deonet.nl (Jacco van Dorp) Date: Tue, 10 Apr 2018 11:52:40 +0200 Subject: [Python-ideas] Is there any idea about dictionary destructing? In-Reply-To: <20180410092135.GL16661@ando.pearwood.info> References: <20180410092135.GL16661@ando.pearwood.info> Message-ID: I must say I can't really see the point either. If you say like: > {'a': a, 'b': b, **c} = {'a': 1, **{'b': 2}} Do you basically mean: c = {'a': 1, **{'b': 2}} a = c.pop("a") b = c.pop("b") # ? That's the only thing I could think of. I think most of these problems could be solved with pop and the occasional list comprehension like this: a, b, c = [{'a':1,'b':2,'c':3}.pop(key) for key in ('a', 'b', 'c')] or for your example: c = {'a': 1, **{'b': 2}} # I suppose this one would generally # be dynamic, but I need a name here. a, b = [c.pop(key) for key in ('a', 'b')] would extract all the keys you need, and has the advantage that you don't need hardcoded dict structure if you expand it to nested dicts. It's even less writing, and just as extensible to nested dicts. And if you dont actually want to destruct (tuples and lists aren't destroyed either), just use __getitem__ access instead of pop. 2018-04-10 11:21 GMT+02:00 Steven D'Aprano : > On Tue, Apr 10, 2018 at 03:29:08PM +0800, Thautwarm Zhao wrote: > >> I'm focused on the consistency of the language itself. > > Consistency is good, but it is not the only factor to consider. We must > guard against *foolish* consistency: adding features just for the sake > of matching some other, often barely related, feature. Each feature must > justify itself, and consistency with something else is merely one > possible attempt at justification. > > >> {key: value_pattern, **_} = {key: value, **_} > > If I saw that, I would have no idea what it could even possibly do. > Let's pick the simplest concrete example I can think of: > > {'A': 1, **{}} = {'A': 0, **{}} > > I cannot interpret what that should do. Is it some sort of > pattern-matching? An update? What is the result? It is obviously some > sort of binding operation, an assignment, but an assignment to what? 
2018-04-10 11:21 GMT+02:00 Steven D'Aprano:
> [full message quoted above]

From guido at python.org  Tue Apr 10 11:05:18 2018
From: guido at python.org (Guido van Rossum)
Date: Tue, 10 Apr 2018 08:05:18 -0700
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: <23244.23368.479697.724223@turnbull.sk.tsukuba.ac.jp>
References: <7F528F59-40D3-4036-A0FF-461C80AC1805@gmail.com> <23244.23368.479697.724223@turnbull.sk.tsukuba.ac.jp>
Message-ID:

On Mon, Apr 9, 2018 at 11:35 PM, Stephen J. Turnbull <
turnbull.stephen.fw at u.tsukuba.ac.jp> wrote:

> Tim Peters writes:
>
> > "Sum reduction" and "running-sum accumulation" are primitives in
> > many peoples' brains.
>
> I wonder what Kahneman would say about that. He goes to some length
> to explain that people are quite good (as human abilities go) at
> perceiving averages over sets but terrible at summing the same. Maybe
> they substitute the abstraction of summation for the ability to
> perform the operation?

[OT] How is that human ability tested? I am a visual learner and I would
propose that if you have a set of numbers, you can graph them in
different ways to make it easier to perceive one or the other (or maybe
both):

- to emphasize the average, draw a line graph -- in my mind I draw a
  line through the average (getting the trend for free)
- to emphasize the sum, draw a histogram -- in my mind I add up the
  sizes of the bars

--
--Guido van Rossum (python.org/~guido)
From guido at python.org  Tue Apr 10 11:20:11 2018
From: guido at python.org (Guido van Rossum)
Date: Tue, 10 Apr 2018 08:20:11 -0700
Subject: [Python-ideas] Is there any idea about dictionary destructing?
In-Reply-To:
References: <20180410092135.GL16661@ando.pearwood.info>
Message-ID:

Here's one argument why sequence unpacking is more important than dict
unpacking. Without sequence unpacking, getting at a specific item of a
long sequence requires indexing, and you often end up having to remember
the index for each piece of information. Say you have points of the form
(x, y, z, t); to get at the t coordinate you'd have to write p[3]. With
sequence unpacking you can write

    x, y, z, t = p

and then use the individual variables in the subsequent code. However,
if your point had the form {'x': x, 'y': y, 'z': z, 't': t}, you could
just write p['t'], which is much more mnemonic than p[3]. All the rest
follows -- after a while, extended forms of iterable unpacking start
making sense. But for dicts the use case is just much less common. (If
you're doing a lot of JSON you might have a different view on that. You
should probably use some kind of schema-guided parser, though.)

--
--Guido van Rossum (python.org/~guido)

From storchaka at gmail.com  Tue Apr 10 11:49:36 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 10 Apr 2018 18:49:36 +0300
Subject: [Python-ideas] Add more information in the header of pyc files
Message-ID:

The format of the header of pyc files has been stable for a long time
and has changed only a few times. It was changed first in 3.3: the size
of the corresponding source mod 2**32 was added. [1] It was changed a
second time in 3.7: a 32-bit flags field and support for hash-based pyc
files (PEP 552) were added. [2] [3]

I think it is worth making more changes.

1. A more stable file signature. Currently the magic number changes in
every feature release. Only the third and fourth bytes are stable
(b'\r\n'); the first two bytes change unpredictably. The 'py' launcher
and third-party software like the 'file' command have to maintain a list
of magic numbers for all existing Python releases, and they can't detect
pyc files from future versions. There is also a chance that the pyc file
signature will match the signature of some other file type by accident.
It would be better if the first 4 bytes of pyc files were the same for
all Python versions (or at least for all Python versions with the same
major number).

2. Include the Python version. Currently the 'py' launcher needs a table
that maps magic numbers to Python versions, and it can only recognize
Python versions released before the launcher itself was built. If the
two major numbers of the Python version were included in the header, no
such table would be needed.
3. A compatible-subversion number. Currently the interpreter supports
only a single magic number. If an updated version of the compiler
produces more optimal or more correct but compatible bytecode (like ),
there is no way to say that the new bytecode is preferable while the old
bytecode remains usable. Changing the magic number invalidates all pyc
files compiled by the old compiler (see [4] for an example of the
problems this causes). The header could contain two magic numbers: the
major magic number would be bumped for incompatible changes; the minor
magic number would be reset to 0 whenever the major magic number is
bumped, and bumped whenever the compiler starts producing different but
compatible bytecode. If the import system reads a pyc file whose minor
magic number is equal to or greater than the current one, it just uses
the pyc file. If it reads a pyc file whose minor magic number is less
than the current one, it can regenerate the pyc file if it is writeable.
And the compileall module should regenerate all pyc files whose minor
magic numbers are less than the current one.
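
To ground this in what exists today, here is a rough sketch of reading
the current header (this assumes the CPython 3.7 layout from PEP 552;
the stable signature, version field and major/minor magic pair proposed
above would extend this same 16-byte structure):

    import struct

    def read_pyc_header(path):
        # CPython 3.7+: 4-byte magic ending in b'\r\n', a 4-byte flags
        # word, then either source mtime + source size, or a source hash.
        with open(path, 'rb') as f:
            magic, flags, word1, word2 = struct.unpack('<4sLLL', f.read(16))
        assert magic[2:4] == b'\r\n'
        return magic, flags, word1, word2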
[1] https://bugs.python.org/issue13645
[2] https://bugs.python.org/issue31650
[3] http://www.python.org/dev/peps/pep-0552/
[4] https://bugs.python.org/issue27286

From solipsis at pitrou.net  Tue Apr 10 11:58:38 2018
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 10 Apr 2018 17:58:38 +0200
Subject: [Python-ideas] Add more information in the header of pyc files
References:
Message-ID: <20180410175838.52a693f6@fsol>

On Tue, 10 Apr 2018 18:49:36 +0300
Serhiy Storchaka wrote:
>
> 1. A more stable file signature. [...] It would be better if the first
> 4 bytes of pyc files were the same for all Python versions (or at
> least for all Python versions with the same major number).

+1.

> 2. Include the Python version. [...] If the two major numbers of the
> Python version were included in the header, no such table would be
> needed.

+1.

> 3. A compatible-subversion number. [...] the minor magic number would
> be reset to 0 whenever the major magic number is bumped, and bumped
> whenever the compiler starts producing different but compatible
> bytecode.

-1. This is a risky move (and costly, in maintenance terms). It's easy
to overlook subtle differences that may translate into incompatibilities
in some production uses. The rule "one Python feature release == one
bytecode version" is easy to remember and understand, and is generally
very well accepted.

Regards

Antoine.

From storchaka at gmail.com  Tue Apr 10 12:14:58 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 10 Apr 2018 19:14:58 +0300
Subject: [Python-ideas] Move optional data out of pyc files
Message-ID:

Currently pyc files contain data that is useful mostly for development
and is not needed in most normal cases in a stable program. There is
even an option that excludes a part of this information from pyc files.
Moving it out is expected to save memory, startup time, and disk space
(or the time of loading from a network). I propose to move this data out
of pyc files into a separate file or files; pyc files would contain only
external references to them. If the corresponding external file is
absent, or a specific option suppresses them, the references are
replaced with None or NULL at import time; otherwise they are loaded
from the external files.

1. Docstrings. They are needed mainly for development.

2. Line numbers (lnotab). They are helpful for formatting tracebacks,
for tracing, and for debugging with the debugger. Sources are helpful in
such cases too. If the program doesn't contain errors ;-) and is shipped
without sources, they could be removed.

3. Annotations. They are used mainly by third-party tools that
statically analyze sources. They are rarely used at runtime.

Docstrings would be read from the corresponding docstring file unless
-OO is supplied. This would also allow localizing docstrings: depending
on locale or other settings, a different docstring file could be used.

For suppressing line numbers and annotations, new options can be added.

From peter.ed.oconnor at gmail.com  Tue Apr 10 12:18:27 2018
From: peter.ed.oconnor at gmail.com (Peter O'Connor)
Date: Tue, 10 Apr 2018 12:18:27 -0400
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To:
References: <20180406011854.GU16661@ando.pearwood.info> <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com>
Message-ID:

> First, why would a class be a bad thing? It's clear, easy to
> understand, debug and extend.

- Lots of redundant-looking "frameworky" lines of code:
  "self._param_1 = param_1"
- Potential for opaque state changes: the caller doesn't know whether
  "y = my_object.do_something(x)" has any side effect, whereas with
  ("y, new_state = do_something(state, x)" / "y = do_something(state, x)")
  it's clear that there (is / is not).
- Makes more assumptions about usage (should I add "param_1" as an arg
  to "StatefulThing.__init__" or to "StatefulThing.update_and_get_output"?)

> And before trying to ask for a new syntax in the language, try to solve
> the problem with the existing tools.

Oh I have, and of course there are ways, but I find them all clunkier
than needed. I added your coroutine to the freak show:
https://github.com/petered/peters_example_code/blob/master/peters_example_code/ways_to_skin_a_cat.py#L106

> processor = stateful_thing(1, 1, 4)
> next(processor)
> processed_things = [processor.send(x) for x in x_gen]

I *almost* like the coroutine thing, but find it unusable because the
peculiarity of having to initialize the generator before you use it (you
do it with next(processor)) is pretty much guaranteed to lead to errors
when people forget to do it. Earlier in the thread Steven D'Aprano
showed how a @coroutine decorator can get around this:
https://github.com/petered/peters_example_code/blob/master/peters_example_code/ways_to_skin_a_cat.py#L63
Still, the whole coroutine thing feels a bit magical, hacky and
"clever", and the use of generator.send will probably confuse around 90%
of programmers.

> If you have that much of a complex workflow, you really should not make
> that a one-liner.

It's not a complex workflow, it's a moving average. It just seems
complex because we don't have a nice, compact way to describe it.

> I've been trying to get slicing on generators and inline try/except on
> this mailing list for years and I've been told no again and again. It's
> hard. But it's also why Python stayed sane for decades.

Hey, I'll support your campaign if you support mine.
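
(For completeness, the non-syntax version of what I'm asking for is a
tiny "scan" helper - the name and the decay/signal values below are made
up for illustration, but this runs as is:)

    def scan(func, iterable, state):
        # A reduce() that yields every intermediate state, not just the last.
        for x in iterable:
            state = func(state, x)
            yield state

    decay = 0.1
    signal = [1.0, 2.0, 3.0, 4.0]
    smooth_signal = list(scan(lambda avg, x: (1 - decay)*avg + decay*x, signal, 0.0))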
On Tue, Apr 10, 2018 at 4:18 AM, Michel Desmoulin wrote:
> [Michel's message, quoted in full above]
From solipsis at pitrou.net  Tue Apr 10 12:24:27 2018
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 10 Apr 2018 18:24:27 +0200
Subject: [Python-ideas] Move optional data out of pyc files
References:
Message-ID: <20180410182427.03ad0043@fsol>

On Tue, 10 Apr 2018 19:14:58 +0300
Serhiy Storchaka wrote:
> Currently pyc files contain data that is useful mostly for development
> and is not needed in most normal cases in a stable program. [...]
>
> 1. Docstrings. They are needed mainly for development.

Indeed, it may be nice to find a solution to ship them separately.

> 2. Line numbers (lnotab). They are helpful for formatting tracebacks,
> for tracing, and for debugging with the debugger. Sources are helpful
> in such cases too. If the program doesn't contain errors ;-) and is
> shipped without sources, they could be removed.

What is the weight of lnotab arrays? While docstrings can be large, I'm
somewhat skeptical that removing lnotab arrays would bring a significant
improvement. It would be nice to have more data about this.

> 3. Annotations. They are used mainly by third-party tools that
> statically analyze sources. They are rarely used at runtime.

Even less used than docstrings, probably.

Regards

Antoine.

From storchaka at gmail.com  Tue Apr 10 12:29:18 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 10 Apr 2018 19:29:18 +0300
Subject: [Python-ideas] Add more information in the header of pyc files
In-Reply-To: <20180410175838.52a693f6@fsol>
References: <20180410175838.52a693f6@fsol>
Message-ID:

On 10.04.18 18:58, Antoine Pitrou wrote:
> -1. This is a risky move (and costly, in maintenance terms). It's easy
> to overlook subtle differences that may translate into
> incompatibilities in some production uses. The rule "one Python
> feature release == one bytecode version" is easy to remember and
> understand, and is generally very well accepted.

A bugfix release can fix bugs in bytecode generation. See for example
issue27286. [1] The part of issue33041 backported to 3.7 and 3.6 is
another example. [2] There have been other examples of compatible
changes to the bytecode. Without bumping the magic number, these fixes
simply have no effect on existing pyc files generated by older
compilers. But bumping the magic number in a bugfix release can lead to
rebuilding every pyc file (even those unaffected by the fix) in
distributions.

[1] https://bugs.python.org/issue27286
[2] https://bugs.python.org/issue33041
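
(A sketch of the import-side check, with invented values, just to make
the asymmetry concrete:)

    MAJOR_MAGIC = 3400  # bumped only for incompatible bytecode changes
    MINOR_MAGIC = 2     # bumped for compatible fixes; reset on major bump

    def pyc_usable(major, minor):
        # An old-but-compatible pyc file still loads...
        return major == MAJOR_MAGIC

    def pyc_current(major, minor):
        # ...but only a current one is left alone; a stale one may be
        # regenerated if it is writeable.
        return major == MAJOR_MAGIC and minor >= MINOR_MAGIC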
From tim.peters at gmail.com  Tue Apr 10 12:58:48 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 10 Apr 2018 11:58:48 -0500
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To:
References:
Message-ID:

[Tim]
> Woo hoo!  Another coincidence.  I just happened to be playing with
> this problem today:
>
> You have a large list - xs - of N numbers.  It's necessary to compute
> slice sums
>
>     sum(xs[i:j])
>
> for a great many slices, 0 <= i <= j <= N.

Which brought to mind a different problem: we have a list of numbers,
`xs`. For each index position `i`, we want to know the largest sum among
all segments ending at xs[i], and the number of elements in a
maximal-sum slice ending at xs[i].

`accumulate()` is a natural way to approach this, for someone with a
functional language background. You'll have to trust me on that ;-)

But there are some twists:

- The identity element for max() is minus infinity, which accumulate()
  can't know.
- We want to generate a sequence of 2-tuples, despite that xs is a
  sequence of numbers.
- In this case, we do _not_ want to see the special initial value.

For example, given the input xs

    [-10, 3, -1, 7, -9, -7, -9, 7, 4]

we want to generate

    (-10, 1), (3, 1), (2, 2), (9, 3), (0, 4), (-7, 1), (-9, 1), (7, 1), (11, 2)

Note: the largest sum across all non-empty slices is then
max(that_result)[0]. The code below could easily be changed to keep
track of that incrementally too, but this is already so different from
"plain old running sum" that I don't want more novelty than necessary to
make the points (a special initial value is needed, and it's not
necessarily insane to want to produce results of a different type than
the inputs).

The key part is the state-updating function:

    def update(state, x):
        prevmax, count = state
        newsum = prevmax + x
        if newsum > x:
            return newsum, count + 1
        else:
            return x, 1

That's all there is to it! Then, e.g.,

    >>> from itertools import accumulate, chain
    >>> import math
    >>> xs = [-10, 3, -1, 7, -9, -7, -9, 7, 4]
    >>> initial = (-math.inf, 1)
    >>> result = accumulate(chain([initial], xs), update)
    >>> next(result)  # discard unwanted value
    (-inf, 1)
    >>> list(result)
    [(-10, 1), (3, 1), (2, 2), (9, 3), (0, 4), (-7, 1), (-9, 1), (7, 1), (11, 2)]

From solipsis at pitrou.net  Tue Apr 10 12:54:56 2018
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 10 Apr 2018 18:54:56 +0200
Subject: [Python-ideas] Add more information in the header of pyc files
References: <20180410175838.52a693f6@fsol>
Message-ID: <20180410185456.1ced81cc@fsol>

On Tue, 10 Apr 2018 19:29:18 +0300
Serhiy Storchaka wrote:
>
> A bugfix release can fix bugs in bytecode generation. [...] But
> bumping the magic number in a bugfix release can lead to rebuilding
> every pyc file (even those unaffected by the fix) in distributions.
Sure, but I don't think rebuilding every pyc file is a significant
problem. It's certainly less error-prone than cherry-picking which files
need rebuilding.

Regards

Antoine.
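
(Forcing such a rebuild is already a one-liner with the stdlib
compileall module; the path is just an example:

    python -m compileall -f /usr/lib/python3.6

where -f recompiles even when timestamps look up to date.)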
It just seems complex > because we don't have a nice, compact way to describe it. Indeed. But it seems to me that itertools.accumulate() with a initial value probably will solve that issue. Besides... moving averages aren't that common that they *necessarily* need syntactic support. Wrapping the complexity in a function, then calling the function, may be an acceptible solution instead of putting the complexity directly into the language itself. The Conservation Of Complexity Principle suggests that complexity cannot be created or destroyed, only moved around. If we reduce the complexity of the Python code needed to write a moving average, we invariably increase the complexity of the language, the interpreter, and the amount of syntax people need to learn in order to be productive with Python. -- Steve From rosuav at gmail.com Tue Apr 10 13:38:08 2018 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 11 Apr 2018 03:38:08 +1000 Subject: [Python-ideas] Move optional data out of pyc files In-Reply-To: References: Message-ID: On Wed, Apr 11, 2018 at 2:14 AM, Serhiy Storchaka wrote: > Currently pyc files contain data that is useful mostly for developing and is > not needed in most normal cases in stable program. There is even an option > that allows to exclude a part of this information from pyc files. It is > expected that this saves memory, startup time, and disk space (or the time > of loading from network). I propose to move this data from pyc files into > separate file or files. pyc files should contain only external references to > external files. If the corresponding external file is absent or specific > option suppresses them, references are replaced with None or NULL at import > time, otherwise they are loaded from external files. > > 1. Docstrings. They are needed mainly for developing. > > 2. Line numbers (lnotab). They are helpful for formatting tracebacks, for > tracing, and debugging with the debugger. Sources are helpful in such cases > too. If the program doesn't contain errors ;-) and is sipped without > sources, they could be removed. > > 3. Annotations. They are used mainly by third party tools that statically > analyze sources. They are rarely used at runtime. > > Docstrings will be read from the corresponding docstring file unless -OO is > supplied. This will allow also to localize docstrings. Depending on locale > or other settings different docstring file can be used. > > For suppressing line numbers and annotations new options can be added. A deployed Python distribution generally has .pyc files for all of the standard library. I don't think people want to lose the ability to call help(), and unless I'm misunderstanding, that requires docstrings. So this will mean twice as many files and twice as many file-open calls to import from the standard library. What will be the impact on startup time? ChrisA From zachary.ware+pydev at gmail.com Tue Apr 10 13:54:00 2018 From: zachary.ware+pydev at gmail.com (Zachary Ware) Date: Tue, 10 Apr 2018 12:54:00 -0500 Subject: [Python-ideas] Move optional data out of pyc files In-Reply-To: References: Message-ID: On Tue, Apr 10, 2018 at 12:38 PM, Chris Angelico wrote: > A deployed Python distribution generally has .pyc files for all of the > standard library. I don't think people want to lose the ability to > call help(), and unless I'm misunderstanding, that requires > docstrings. So this will mean twice as many files and twice as many > file-open calls to import from the standard library. What will be the > impact on startup time? 
From zachary.ware+pydev at gmail.com  Tue Apr 10 13:54:00 2018
From: zachary.ware+pydev at gmail.com (Zachary Ware)
Date: Tue, 10 Apr 2018 12:54:00 -0500
Subject: [Python-ideas] Move optional data out of pyc files
In-Reply-To:
References:
Message-ID:

On Tue, Apr 10, 2018 at 12:38 PM, Chris Angelico wrote:
> A deployed Python distribution generally has .pyc files for all of the
> standard library. I don't think people want to lose the ability to
> call help(), and unless I'm misunderstanding, that requires
> docstrings. So this will mean twice as many files and twice as many
> file-open calls to import from the standard library. What will be the
> impact on startup time?

What about, instead of separate files, turning the single file into a
pseudo-zip file containing all of the proposed files, and providing a
simple tool for removing whatever parts you don't want?

--
Zach
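
(A sketch of what such a container and its stripping tool could look
like, using zipfile purely for illustration - the .pycz name and the
section names are invented:)

    import zipfile

    def write_pycz(path, code_bytes, doc_bytes=None, ann_bytes=None):
        # One file: mandatory bytecode plus optional, strippable sections.
        with zipfile.ZipFile(path, 'w') as z:
            z.writestr('code', code_bytes)
            if doc_bytes is not None:
                z.writestr('doc', doc_bytes)
            if ann_bytes is not None:
                z.writestr('ann', ann_bytes)

    def strip_section(path, name):
        # The "simple tool": rewrite the container without one section.
        with zipfile.ZipFile(path) as z:
            kept = {n: z.read(n) for n in z.namelist() if n != name}
        with zipfile.ZipFile(path, 'w') as z:
            for n, data in kept.items():
                z.writestr(n, data)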
From ethan at stoneleaf.us  Tue Apr 10 14:13:01 2018
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 10 Apr 2018 11:13:01 -0700
Subject: [Python-ideas] Move optional data out of pyc files
In-Reply-To:
References:
Message-ID: <5ACCFEAD.4070700@stoneleaf.us>

On 04/10/2018 10:54 AM, Zachary Ware wrote:
> What about, instead of separate files, turning the single file into a
> pseudo-zip file containing all of the proposed files, and providing a
> simple tool for removing whatever parts you don't want?

-O and -OO already do some trimming; perhaps going that route instead of
having multiple files would be better.

--
~Ethan~

From stephanh42 at gmail.com  Tue Apr 10 14:11:53 2018
From: stephanh42 at gmail.com (Stephan Houben)
Date: Tue, 10 Apr 2018 20:11:53 +0200
Subject: [Python-ideas] Move optional data out of pyc files
In-Reply-To:
References:
Message-ID:

There are libraries out there like this:

https://docopt.readthedocs.io/en/0.2.0/

which use docstrings for runtime info.

Today we already have -OO, which allows us to create docstring-less
bytecode files in case we have, after careful consideration, established
that it is safe to do so. I think the current way (-OO) to avoid
docstring loading is the correct one: it pushes the responsibility onto
whoever did the packaging to decide if -OO is appropriate.

The ability to remove the docstrings after bytecode generation would be
kinda nice (similar to the Unix "strip" command), but given how fast
bytecode compilation is, frankly I don't think it is very important.

Stephan

2018-04-10 19:54 GMT+02:00 Zachary Ware:
> [quoted above]

From dmoisset at machinalis.com  Tue Apr 10 14:17:25 2018
From: dmoisset at machinalis.com (Daniel Moisset)
Date: Tue, 10 Apr 2018 19:17:25 +0100
Subject: [Python-ideas] Move optional data out of pyc files
In-Reply-To:
References:
Message-ID:

I'm not sure I understand the benefit of this; perhaps you can clarify.
What I see is two scenarios:

Scenario A) External files are present. In this case, the data is loaded
from the pyc and then from the external file, so there are no savings in
memory, startup time, disk space, or network load time; it's just the
same on-disk information and runtime structure with a different layout.

Scenario B) External files are not present. In this case, you get
runtime improvements exactly identical to not having the data in the
pyc, which is roughly what you get with -OO.

The only new capability I see this adds is the localization benefit; is
that what this proposal is about?

On 10 April 2018 at 17:14, Serhiy Storchaka wrote:
> [proposal quoted in full above]

--
Daniel F. Moisset - UK Country Manager - Machinalis Limited
www.machinalis.co.uk
Skype: @dmoisset  T: +44 7398 827139
1 Fore St, London, EC2Y 9DT

Machinalis Limited is a company registered in England and Wales.
Registered number: 10574987.
From solipsis at pitrou.net  Tue Apr 10 14:25:27 2018
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 10 Apr 2018 20:25:27 +0200
Subject: [Python-ideas] Move optional data out of pyc files
References: <5ACCFEAD.4070700@stoneleaf.us>
Message-ID: <20180410202527.435ae0f5@fsol>

On Tue, 10 Apr 2018 11:13:01 -0700
Ethan Furman wrote:
> On 04/10/2018 10:54 AM, Zachary Ware wrote:
> > What about, instead of separate files, turning the single file into
> > a pseudo-zip file containing all of the proposed files, and
> > providing a simple tool for removing whatever parts you don't want?
>
> -O and -OO already do some trimming; perhaps going that route instead
> of having multiple files would be better.

"python -O" and "python -OO" *do* generate different pyc files. If you
want to trim docstrings with those options, you need to regenerate pyc
files for all your dependencies (including third-party libraries and
standard library modules). Serhiy's proposal allows "-O" and "-OO" to
work without needing a custom bytecode generation step.

Regards

Antoine.

From peter.ed.oconnor at gmail.com  Tue Apr 10 14:25:33 2018
From: peter.ed.oconnor at gmail.com (Peter O'Connor)
Date: Tue, 10 Apr 2018 14:25:33 -0400
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To: <20180410173255.GM16661@ando.pearwood.info>
References: <20180406011854.GU16661@ando.pearwood.info> <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com> <20180410173255.GM16661@ando.pearwood.info>
Message-ID:

> But even I find your use of dysphemisms like "freak show" for non-FP
> solutions quite off-putting.

Ah, I'm sorry; "freak show" was not meant to be disparaging to the
authors or even the code itself, but to describe the variety of strange
solutions (my own included) to this simple problem.

> Indeed. But it seems to me that itertools.accumulate() with an initial
> value probably will solve that issue.

Kyle Lahnakoski made a pretty good case for not using
itertools.accumulate() earlier in this thread, and Tim Peters made the
point that its non-initialized behaviour can be extremely unintuitive
(try "print(list(itertools.accumulate([1, 2, 3], lambda x, y: str(x) +
str(y))))"). These convinced me that itertools.accumulate should be
avoided altogether.

Alternatively, if anyone has a proposed syntax that does the same thing
as Serhiy Storchaka's:

    smooth_signal = [average for average in [0] for x in signal
                     for average in [(1-decay)*average + decay*x]]

but in a way that more intuitively expresses the intent of the code, it
would be great to have more options on the market.

On Tue, Apr 10, 2018 at 1:32 PM, Steven D'Aprano wrote:
> [full message quoted above]

From rhodri at kynesim.co.uk  Tue Apr 10 13:40:26 2018
From: rhodri at kynesim.co.uk (Rhodri James)
Date: Tue, 10 Apr 2018 18:40:26 +0100
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To: <20180410173255.GM16661@ando.pearwood.info>
References: <20180406011854.GU16661@ando.pearwood.info> <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com> <20180410173255.GM16661@ando.pearwood.info>
Message-ID: <27be0146-51e3-1ad6-4069-f1afc8f99fd5@kynesim.co.uk>

On 10/04/18 18:32, Steven D'Aprano wrote:
> Peter, I realise that you're a fan of functional programming idioms,
> and I'm very sympathetic to that. I'm a fan of judicious use of FP
> too, and while I'm not keen on your specific syntax, I am interested
> in the general concept and would like it to have the best possible
> case made for it.
>
> But even I find your use of dysphemisms like "freak show" for non-FP
> solutions quite off-putting. (I think this is the second time you've
> used the term.)
Thank you for saying that, Steven. I must admit I was beginning to find
the implicit insults rather grating.

--
Rhodri James *-* Kynesim Ltd

From p.f.moore at gmail.com  Tue Apr 10 15:12:14 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 10 Apr 2018 20:12:14 +0100
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To:
References: <20180406011854.GU16661@ando.pearwood.info> <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com> <20180410173255.GM16661@ando.pearwood.info>
Message-ID:

On 10 April 2018 at 19:25, Peter O'Connor wrote:
> Kyle Lahnakoski made a pretty good case for not using
> itertools.accumulate() earlier in this thread

I wouldn't call it a "pretty good case". He argued that writing
*functions* was a bad thing, because the name of a function doesn't
provide all the details of what is going on in the same way that
explicitly writing the code inline would do. That seems to me to be a
somewhat bizarre argument - after all, encapsulation and abstraction are
pretty fundamental to programming. I'm not even sure he had any specific
comments about accumulate, other than his general point that as a named
function it's somehow worse than writing out the explicit loop.

> But in a way that more intuitively expresses the intent of the code,
> it would be great to have more options on the market.

It's worth adding a reminder here that "having more options on the
market" is pretty directly in contradiction to the Zen of Python -
"There should be one-- and preferably only one --obvious way to do it".

Paul

From solipsis at pitrou.net  Tue Apr 10 15:29:28 2018
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 10 Apr 2018 21:29:28 +0200
Subject: [Python-ideas] Move optional data out of pyc files
References:
Message-ID: <20180410212928.663cf57a@fsol>

On Tue, 10 Apr 2018 19:14:58 +0300
Serhiy Storchaka wrote:
> Currently pyc files contain data that is useful mostly for development
> and is not needed in most normal cases in a stable program. [...]
> Docstrings would be read from the corresponding docstring file unless
> -OO is supplied. This would also allow localizing docstrings.
An alternate proposal would be to have separate sections in a single
marshal file. The main section (containing the loadable module) would
have references to the other sections. This way it's easy for the loader
to say "all references to the docstring section and/or to the annotation
section are replaced with None", depending on how Python is started. It
would also be possible to do it on disk with a strip-like utility.

I'm not volunteering to do all this, so just my 2 cents ;-)

Regards

Antoine.
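
(A toy sketch of the write/read side of such a sectioned file, using
marshal; the layout and section names are invented, not a worked-out
format:)

    import marshal

    def dump_sections(f, code, docstrings=None, annotations=None):
        # Absent sections simply aren't stored; the loader substitutes
        # None for any section it is told to skip.
        sections = {'code': code}
        if docstrings is not None:
            sections['doc'] = docstrings
        if annotations is not None:
            sections['ann'] = annotations
        marshal.dump(sections, f)

    def load_sections(f, want=('code', 'doc', 'ann')):
        data = marshal.load(f)
        return {name: data.get(name) for name in want}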
From eric at trueblade.com  Tue Apr 10 15:50:33 2018
From: eric at trueblade.com (Eric V. Smith)
Date: Tue, 10 Apr 2018 15:50:33 -0400
Subject: [Python-ideas] Move optional data out of pyc files
In-Reply-To: <20180410182427.03ad0043@fsol>
References: <20180410182427.03ad0043@fsol>
Message-ID: <3D3ABFD0-5785-4F99-92A3-D526DCE5BD88@trueblade.com>

>> 3. Annotations. They are used mainly by third-party tools that
>> statically analyze sources. They are rarely used at runtime.
>
> Even less used than docstrings, probably.

typing.NamedTuple and dataclasses use annotations at runtime.

Eric

From steve at pearwood.info  Tue Apr 10 20:03:35 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 11 Apr 2018 10:03:35 +1000
Subject: [Python-ideas] Move optional data out of pyc files
In-Reply-To:
References:
Message-ID: <20180411000335.GN16661@ando.pearwood.info>

On Wed, Apr 11, 2018 at 03:38:08AM +1000, Chris Angelico wrote:
> A deployed Python distribution generally has .pyc files for all of the
> standard library. I don't think people want to lose the ability to
> call help(), and unless I'm misunderstanding, that requires
> docstrings. So this will mean twice as many files and twice as many
> file-open calls to import from the standard library. What will be the
> impact on startup time?

I shouldn't think that the number of files on disk is very important,
now that they're hidden away in the __pycache__ directory where they can
be ignored by humans. Even venerable old FAT32 has a limit of 65,534
files in a single folder, and 268,435,437 on the entire volume. So
unless the std lib expands to 16000+ modules, the number of files in the
__pycache__ directory ought to be well below that limit.

I think even MicroPython ought to be okay with that. (But it would be
nice to find out for sure: does it support file systems with *really*
tiny limits?)

The entire __pycache__ directory is supposed to be a black box except
under unusual circumstances, so it doesn't matter (at least not to me)
if we have:

    __pycache__/spam.cpython-38.pyc

alone or:

    __pycache__/spam.cpython-38.pyc
    __pycache__/spam.cpython-38-doc.pyc
    __pycache__/spam.cpython-38-lno.pyc
    __pycache__/spam.cpython-38-ann.pyc

(say). And if the external references are loaded lazily, on need, rather
than eagerly, this could save startup time, which I think is the
intention. The docstrings would still be available, just not loaded
until the first time you try to use them.

However, Python supports byte-code-only distribution, using .pyc files
external to the __pycache__. In that case, it would be annoying and
inconvenient to distribute four top-level files, so I think that the use
of external references has to be optional, and there has to be a way to
either compile to a single .pyc file containing all four parts, or an
external tool that can take the existing four files and merge them.

--
Steve

From rosuav at gmail.com  Tue Apr 10 20:08:58 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 11 Apr 2018 10:08:58 +1000
Subject: [Python-ideas] Move optional data out of pyc files
In-Reply-To: <20180411000335.GN16661@ando.pearwood.info>
References: <20180411000335.GN16661@ando.pearwood.info>
Message-ID:

On Wed, Apr 11, 2018 at 10:03 AM, Steven D'Aprano wrote:
> I shouldn't think that the number of files on disk is very important,
> now that they're hidden away in the __pycache__ directory where they
> can be ignored by humans. [...]
> I think even MicroPython ought to be okay with that. (But it would be
> nice to find out for sure: does it support file systems with *really*
> tiny limits?)

File system limits aren't usually an issue; as you say, even FAT32 can
store a metric ton of files in a single directory. I'm more interested
in how long it takes to open a file, and whether doubling that time will
have a measurable impact on Python startup time. Part of that cost can
be reduced by using openat(), on platforms that support it, but even
with a directory handle, there's still a definite non-zero cost to
opening and reading an additional file.

ChrisA
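
(Note that Python already exposes the openat() family via the dir_fd
parameter of os.open(), so the directory-handle approach can be measured
today; the stdlib path below is an assumption for illustration:)

    import os

    # Open the directory once, then open files relative to it; on
    # platforms that support it this uses openat() under the hood.
    dirfd = os.open('/usr/lib/python3.6', os.O_RDONLY)
    try:
        fd = os.open('os.py', os.O_RDONLY, dir_fd=dirfd)
        try:
            data = os.read(fd, 1 << 20)
        finally:
            os.close(fd)
    finally:
        os.close(dirfd)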
Smith) Date: Wed, 11 Apr 2018 00:19:40 +0000 Subject: [Python-ideas] Move optional data out of pyc files In-Reply-To: <3D3ABFD0-5785-4F99-92A3-D526DCE5BD88@trueblade.com> References: <20180410182427.03ad0043@fsol> <3D3ABFD0-5785-4F99-92A3-D526DCE5BD88@trueblade.com> Message-ID: On Tue, Apr 10, 2018 at 12:51 PM Eric V. Smith wrote: > > >> 3. Annotations. They are used mainly by third party tools that > >> statically analyze sources. They are rarely used at runtime. > > > > Even less used than docstrings probably. > > typing.NamedTuple and dataclasses use annotations at runtime. > > Eric > Yep. Everything accessible in any way at runtime is used by something at runtime. It's a public API, we can't just get rid of it. Several libraries rely on docstrings being available (additional case in point beyond the already linked to cli tool: ply ) Most of the world never appears to use -O and -OO. If they do, they don't use these libraries or jump through special hoops to prevent pyo compliation of any sources that need them. (unlikely) -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericfahlgren at gmail.com Tue Apr 10 20:43:42 2018 From: ericfahlgren at gmail.com (Eric Fahlgren) Date: Tue, 10 Apr 2018 17:43:42 -0700 Subject: [Python-ideas] Move optional data out of pyc files In-Reply-To: <20180411000335.GN16661@ando.pearwood.info> References: <20180411000335.GN16661@ando.pearwood.info> Message-ID: On Tue, Apr 10, 2018 at 5:03 PM, Steven D'Aprano wrote: > > __pycache__/spam.cpython-38.pyc > __pycache__/spam.cpython-38-doc.pyc > __pycache__/spam.cpython-38-lno.pyc > __pycache__/spam.cpython-38-ann.pyc > ?Our product uses the doc strings for auto-generated help, so we need to keep those. We also allow users to write plugins and scripts, so getting valid feedback in tracebacks is essential for our support people, so we'll keep the lno files, too. Annotations can probably go. Looking at one of our little pyc files, I see: -rwx------+ 1 efahlgren admins 9252 Apr 10 17:25 ./lm/lib/config.pyc*? Since disk blocks are typically 4096 bytes, that's really a 12k file. Let's say it's 8k of byte code, 1k of doc, a bit of lno. So the proposed layout would give: config.pyc -> 8k config-doc.pyc -> 4k config-lno.pyc -> 4k So now I've increased disk usage by 25% (yeah yeah, I know, I picked that small file on purpose to illustrate the point, but it's not unusual). These files are often opened over a network, at least for user plugins. This can take a really, really long time on some of our poorly connected machines, like 1-2 seconds per file (no kidding, it's horrible). Now instead of opening just one file in 1-2 seconds, we have increased the time by 300%, just to do the stat+open, probably another stat to make sure there's no "ann" file laying about. Ouch. -1 from me. -------------- next part -------------- An HTML attachment was scrubbed... URL: From python-ideas at mgmiller.net Tue Apr 10 21:19:21 2018 From: python-ideas at mgmiller.net (Mike Miller) Date: Tue, 10 Apr 2018 18:19:21 -0700 Subject: [Python-ideas] Accepting multiple mappings as positional arguments to create dicts In-Reply-To: References: Message-ID: On 2018-04-09 04:23, Daniel Moisset wrote: > In which way would this be different to {**mapping1, **mapping2, **mapping3} ? 
That's possible now, but I believe the form mentioned previously would be
more readable:

    dict(d1, d2, d3)

-Mike

From steve at pearwood.info  Tue Apr 10 23:02:05 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 11 Apr 2018 13:02:05 +1000
Subject: [Python-ideas] Move optional data out of pyc files
In-Reply-To: 
References: <20180411000335.GN16661@ando.pearwood.info>
Message-ID: <20180411030205.GO16661@ando.pearwood.info>

On Wed, Apr 11, 2018 at 10:08:58AM +1000, Chris Angelico wrote:

> File system limits aren't usually an issue; as you say, even FAT32 can
> store a metric ton of files in a single directory. I'm more interested
> in how long it takes to open a file, and whether doubling that time
> will have a measurable impact on Python startup time. Part of that
> cost can be reduced by using openat(), on platforms that support it,
> but even with a directory handle, there's still a definite non-zero
> cost to opening and reading an additional file.

Yes, it will double the number of files. Actually quadruple it, if the
annotations and line numbers are in separate files too. But if most of
those extra files never need to be opened, then there's no cost to them.
And whatever extra cost there is, is amortized over the lifetime of the
interpreter.

The expectation here is that this could lead to reducing startup time,
since the files which are read are smaller and less data needs to be read
and to traverse the network up front; the rest can be deferred until it's
actually needed.

Serhiy is experienced enough that I think we should assume he's not going
to push this optimization into production unless it actually does reduce
startup time. He has proven himself enough that we should assume
competence rather than incompetence :-)

Here is the proposal as I understand it:

- by default, change .pyc files to store annotations, docstrings and line
  numbers as references to external files which will be lazily loaded
  on-need;

- single-file .pyc files must still be supported, but this won't be the
  default and could rely on an external "merge" tool;

- objects that rely on docstrings or annotations, such as dataclass, may
  experience a (hopefully very small) increase in import time, since they
  may not be able to defer loading the extra files;

- but in general, most modules should (we expect) see a decrease in the
  load time;

- which will (we hope) reduce startup time;

- libraries which make eager use of docstrings and annotations might even
  ship with the single-file .pyc instead (the library installer can look
  after that aspect), and so avoid any extra cost.

Naturally pushing this into production will require benchmarks that prove
this actually does improve startup time. I believe that Serhiy's reason
for asking is to determine whether it is worth his while to experiment on
this. There's no point in implementing these changes and benchmarking
them, if there's no chance of it being accepted.

So on the assumptions that:

- benchmarking does demonstrate a non-trivial speedup of interpreter
  startup;

- single-file .pyc files are still supported, for the use of byte-code
  only libraries;

- and modules which are particularly badly impacted by this change are
  able to opt out and use a single .pyc file;

I see no reason not to support this idea if Serhiy (or someone else) is
willing to put in the work.
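To make the lazy-loading part concrete, here is a rough sketch of the sort
of thing I imagine. The "-doc" side-file suffix and the marshalled
{qualname: docstring} dict layout are my own guesses for illustration,
not necessarily Serhiy's actual design:

    import marshal

    class LazyDocs:
        # Nothing is read from disk at import time; the first lookup
        # pays the file-open cost, later lookups hit the cached table.
        def __init__(self, path):
            self._path = path   # e.g. "__pycache__/spam.cpython-38-doc.pyc"
            self._table = None

        def __getitem__(self, qualname):
            if self._table is None:
                with open(self._path, "rb") as f:
                    self._table = marshal.load(f)  # assumed {name: doc} dict
            return self._table[qualname]

A function's __doc__ could then be a cheap reference into such a table,
resolved only when help() or friends actually ask for it.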
-- Steve

From python-ideas at mgmiller.net  Tue Apr 10 23:15:10 2018
From: python-ideas at mgmiller.net (Mike Miller)
Date: Tue, 10 Apr 2018 20:15:10 -0700
Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings, take three!
In-Reply-To: 
References: <20180323150058.GU16661@ando.pearwood.info>
 <20180324044102.GV16661@ando.pearwood.info>
 <20180324144432.GW16661@ando.pearwood.info>
 <27fccc82-8833-d1a5-a589-8d1358a3887a@btinternet.com>
 <5AB6A081.5010503@stoneleaf.us>
 <87in9d2xm3.fsf@vostro.rath.org>
 <2d7052b6-5912-c454-13f2-6595a32afa41@mgmiller.net>
Message-ID: <0029eccf-38f8-9360-31da-d23aaea4344c@mgmiller.net>

If anyone is interested, I came across this same subject on a blog post and
discussion on HN today:

- https://www.hillelwayne.com/post/equals-as-assignment/
- https://news.ycombinator.com/item?id=16803874

On 2018-04-02 15:03, Guido van Rossum wrote:
> IIRC Algol-68 (the lesser-known, more complicated version) used 'int x = 0;' to
> declare a constant and 'int x := 0;' to declare a variable. And there was a lot
> more to it; see https://en.wikipedia.org/wiki/ALGOL_68#mode:_Declarations. I'm
> guessing Go reversed this because they want '=' to be the common assignment
> (whereas in Algol-68 the common assignment was ':=').

From tim.peters at gmail.com  Tue Apr 10 23:40:48 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 10 Apr 2018 22:40:48 -0500
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To: 
References: <20180406011854.GU16661@ando.pearwood.info>
 <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com>
 <5ACA9EB0.5000801@canterbury.ac.nz>
Message-ID: 

[Jacco van Dorp ]
> I've sometimes thought that exhaust(iterator) or iterator.exhaust() would be
> a good thing to have - I've often wrote code doing basically "call this function
> for every element in this container, and idc about return values", but find
> myself using a list comprehension instead of generator. I guess it's such an
> edge case that exhaust(iterator) as builtin would be overkill (but perhaps
> itertools could have it ?), and most people don't pass around iterators, so
> (f(x) for x in y).exhaust() might not look natural to most people.

"The standard" clever way to do this is to create a 0-sized deque:

    >>> from collections import deque
    >>> deque((i for i in range(1000)), 0)
    deque([], maxlen=0)

The deque constructor consumes the entire iterable "at C speed", but throws
all the results away because the deque's maximum size is too small to hold
any of them ;-)

> It could return the value for the last() semantics, but I think exhaustion
> would often be more important than the last value.

For last(),

    >>> deque((i for i in range(1000)), 1)[0]
    999

In that case the deque only has enough room to remember one element, and so
remembers the last one it sees. Of course this generalizes to larger values
too:

    >>> for x in deque((i for i in range(1000)), 5):
    ...     print(x)
    995
    996
    997
    998
    999

I think I'd like to see itertools add a `drop(iterable, n=None)` function.
If `n` is not given, it would consume the entire iterable. Else for an
integer n >= 0, it would return an iterator that skips over the first `n`
values of the input iterable.

`drop n xs` has been in Haskell forever, and is also in the Python
itertoolz package:

    http://toolz.readthedocs.io/en/latest/api.html#toolz.itertoolz.drop

I'm not happy about switching the argument order from those, but would
really like to omit `n` as a way to spell "pretend n is infinity", so there
would be no more need for the "empty deque" trick.
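A rough pure-Python sketch of what I have in mind (a real itertools version
would be in C; only the name and argument order are as proposed above):

    from collections import deque
    from itertools import islice

    def drop(iterable, n=None):
        if n is None:
            # consume everything "at C speed" - the 0-sized deque trick
            deque(iterable, maxlen=0)
            return iter(())
        it = iter(iterable)
        # the standard consume() recipe: advance n steps (or fewer, if
        # the iterable is exhausted first)
        next(islice(it, n, n), None)
        return it

So, e.g.,

    >>> list(drop(range(10), 7))
    [7, 8, 9]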
From steve at pearwood.info Tue Apr 10 23:41:15 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 11 Apr 2018 13:41:15 +1000 Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin In-Reply-To: References: <20180406011854.GU16661@ando.pearwood.info> <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com> <20180410173255.GM16661@ando.pearwood.info> Message-ID: <20180411034115.GP16661@ando.pearwood.info> On Tue, Apr 10, 2018 at 08:12:14PM +0100, Paul Moore wrote: > On 10 April 2018 at 19:25, Peter O'Connor wrote: > > Kyle Lahnakoski made a pretty good case for not using itertools.accumulate() earlier in this thread > > I wouldn't call it a "pretty good case". He argued that writing > *functions* was a bad thing, because the name of a function didn't > provide all the details of what was going on in the same way that > explicitly writing the code inline would do. That seems to me to be a > somewhat bizarre argument - after all, encapsulation and abstraction > are pretty fundamental to programming. I'm not even sure he had any > specific comments about accumulate other than his general point that > as a named function it's somehow worse than writing out the explicit > loop. I agree with Paul here -- I think that Kyle's argument is idiosyncratic. It isn't going to stop me from writing functions :-) > > But in a way that more intuitively expresses the intent of the code, it > > would be great to have more options on the market. > > It's worth adding a reminder here that "having more options on the > market" is pretty directly in contradiction to the Zen of Python - > "There should be one-- and preferably only one --obvious way to do > it". I'm afraid I'm going to (mildly) object here. At least you didn't misquote the Zen as "Only One Way To Do It" :-) The Zen here is not a prohibition against there being multiple ways to do something -- how could it, given that Python is a general purpose programming language there is always going to be multiple ways to write any piece of code? Rather, it exhorts us to make sure that there are one or more ways to "do it", at least one of which is obvious. And since "it" is open to interpretation, we can legitimately wonder whether (for example): - for loops - list comprehensions - list(generator expression) etc are three different ways to do "it", or three different "it"s. If we wish to dispute the old slander that Python has Only One Way to do anything, then we can emphasise the similarities and declare them three ways; if we want to defend the Zen, we can emphasise the differences and declare them to be three different "it"s. So I think Peter is on reasonable ground to suggest this, if he can make a good enough case for it. Personally, I still think the best approach here is a combination of itertools.accumulate, and the proposed name-binding as an expression feature: total = 0 running_totals = [(total := total + x) for x in values] # alternative syntax running_totals = [(total + x as total) for x in values] If you don't like the dependency on an external variable (or if that turns out not to be practical) then we could have: running_totals = [(total := total + x) for total in [0] for x in values] -- Steve From rosuav at gmail.com Tue Apr 10 23:57:18 2018 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 11 Apr 2018 13:57:18 +1000 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings, take three! 
In-Reply-To: <0029eccf-38f8-9360-31da-d23aaea4344c@mgmiller.net> References: <20180323150058.GU16661@ando.pearwood.info> <20180324044102.GV16661@ando.pearwood.info> <20180324144432.GW16661@ando.pearwood.info> <27fccc82-8833-d1a5-a589-8d1358a3887a@btinternet.com> <5AB6A081.5010503@stoneleaf.us> <87in9d2xm3.fsf@vostro.rath.org> <2d7052b6-5912-c454-13f2-6595a32afa41@mgmiller.net> <0029eccf-38f8-9360-31da-d23aaea4344c@mgmiller.net> Message-ID: On Wed, Apr 11, 2018 at 1:15 PM, Mike Miller wrote: > If anyone is interested I came across this same subject on a blog post and > discussion on HN today: > > - https://www.hillelwayne.com/post/equals-as-assignment/ > - https://news.ycombinator.com/item?id=16803874 Those people who say "x = x + 1" makes no sense... do they also get confused by the fact that you can multiply a string by a number? Programming is not algebra. The ONLY reason that "x = x + 1" can fail to make sense is if you start by assuming that there is no such thing as time. That's the case in algebra, but it simply isn't true in software. Functional programming languages are closer to algebra than imperative languages are, but that doesn't mean they _are_ algebraic, and they go to great lengths to lie about how you can have side-effect-free side effects and such. Fortunately, Python is not bound by such silly rules, and can do things because they are useful for real-world work. Thus the question of ":=" vs "=" vs "==" vs "===" comes down to what is actually worth doing, not what would look tidiest to someone who is trying to represent a mathematician's blackboard in ASCII. ChrisA From brenbarn at brenbarn.net Tue Apr 10 23:58:18 2018 From: brenbarn at brenbarn.net (Brendan Barnwell) Date: Tue, 10 Apr 2018 20:58:18 -0700 Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin In-Reply-To: <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com> References: <20180406011854.GU16661@ando.pearwood.info> <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com> Message-ID: <5ACD87DA.3020300@brenbarn.net> On 2018-04-08 10:41, Kyle Lahnakoski wrote: > For example before I read the docs on > itertools.accumulate(list_of_length_N, func), here are the unknowns I see: It sounds like you're saying you don't like using functions because you have to read documentation. That may be so, but I don't have much sympathy for that position. One of the most useful features of functions is that they exist as defined chunks of code that can be explicitly documented. Snippets of inline code are harder to document and harder to "address" in the sense of identifying precisely which chunk of code is being documented. If the documentation for accumulate doesn't give the information that people using it need to know, that's a documentation bug for sure, but it doesn't mean we should stop using functions. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." 
--author unknown

From rosuav at gmail.com  Wed Apr 11 00:16:08 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 11 Apr 2018 14:16:08 +1000
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To: <20180411034115.GP16661@ando.pearwood.info>
References: <20180406011854.GU16661@ando.pearwood.info>
 <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com>
 <20180410173255.GM16661@ando.pearwood.info>
 <20180411034115.GP16661@ando.pearwood.info>
Message-ID: 

On Wed, Apr 11, 2018 at 1:41 PM, Steven D'Aprano wrote:
> Personally, I still think the best approach here is a combination of
> itertools.accumulate, and the proposed name-binding as an expression
> feature:
>
> total = 0
> running_totals = [(total := total + x) for x in values]
> # alternative syntax
> running_totals = [(total + x as total) for x in values]
>
> If you don't like the dependency on an external variable (or if that
> turns out not to be practical) then we could have:
>
> running_totals = [(total := total + x) for total in [0] for x in values]

That last one works, but it's not exactly pretty. Using an additional 'for'
loop to initialize variables feels like a gross hack.

Unfortunately, the first one is equivalent to this (in a PEP 572 world):

    total = 0
    def <listcomp>():
        result = []
        for x in values:
            result.append(total := total + x)
        return result
    running_totals = <listcomp>()

Problem: it's still happening in a function, which means this bombs with
UnboundLocalError.

Solution 1: Use the extra loop to initialize 'total' inside the
comprehension. Ugly.

Solution 2: Completely redefine comprehensions to use subscopes instead of
a nested function. I used to think this was a good thing, but after the
discussions here, I've found that this creates as many problems as it
solves.

Solution 3: Have some way for a comprehension to request that a name be
imported from the surrounding context. Effectively this:

    total = 0
    def <listcomp>(total=total):
        result = []
        for x in values:
            result.append(total := total + x)
        return result
    running_totals = <listcomp>()

This is how, in a PEP 572 world, the oddities of class scope are resolved.
(I'll be posting a new PEP as soon as I fix up three failing CPython
tests.) It does have its own problems, though. How do you know which names
to import like that? What if 'total' wasn't assigned to right there, but
instead was being lifted from a scope further out?

Solution 4: Have *all* local variables in a comprehension get initialized
to None.

    def <listcomp>():
        result = []
        total = x = None
        for x in values:
            result.append(total := (total or 0) + x)
        return result
    running_totals = <listcomp>()

    running_totals = [(total := (total or 0) + x) for total in [0] for x in values]

That'd add to the run-time cost of every list comp, but probably not
measurably. (Did you know, for instance, that "except Exception as e:"
will set e to None before unbinding it?) It's still not exactly pretty,
though, and having to explain why you have "or 0" in a purely arithmetic
operation may not quite work.

Solution 5: Allow an explicit initializer syntax. Could work, but you'd
have to come up with one that people are happy with.

None is truly ideal IMO.
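(For anyone who wants to see the underlying problem without PEP 572 syntax,
the same scoping failure can be reproduced today with a plain nested
function standing in for the comprehension's implicit one:

    values = [1, 2, 3]
    total = 0

    def comp():  # stand-in for the comprehension's implicit function
        result = []
        for x in values:
            total = total + x  # the assignment makes 'total' local here
            result.append(total)
        return result

    comp()  # UnboundLocalError: local variable 'total' referenced before assignment

)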
ChrisA From rosuav at gmail.com Wed Apr 11 00:21:17 2018 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 11 Apr 2018 14:21:17 +1000 Subject: [Python-ideas] Move optional data out of pyc files In-Reply-To: <20180411030205.GO16661@ando.pearwood.info> References: <20180411000335.GN16661@ando.pearwood.info> <20180411030205.GO16661@ando.pearwood.info> Message-ID: On Wed, Apr 11, 2018 at 1:02 PM, Steven D'Aprano wrote: > On Wed, Apr 11, 2018 at 10:08:58AM +1000, Chris Angelico wrote: > >> File system limits aren't usually an issue; as you say, even FAT32 can >> store a metric ton of files in a single directory. I'm more interested >> in how long it takes to open a file, and whether doubling that time >> will have a measurable impact on Python startup time. Part of that >> cost can be reduced by using openat(), on platforms that support it, >> but even with a directory handle, there's still a definite non-zero >> cost to opening and reading an additional file. > > Yes, it will double the number of files. Actually quadruple it, if the > annotations and line numbers are in separate files too. But if most of > those extra files never need to be opened, then there's no cost to them. > And whatever extra cost there is, is amortized over the lifetime of the > interpreter. Yes, if they are actually not needed. My question was about whether that is truly valid. Consider a very common use-case: an OS-provided Python interpreter whose files are all owned by 'root'. Those will be distributed with .pyc files for performance, but you don't want to deprive the users of help() and anything else that needs docstrings etc. So... are the docstrings lazily loaded or eagerly loaded? If eagerly, you've doubled the number of file-open calls to initialize the interpreter. (Or quadrupled, if you need annotations and line numbers and they're all separate.) If lazily, things are a lot more complicated than the original description suggested, and there'd need to be some semantic changes here. > Serhiy is experienced enough that I think we should assume he's not > going to push this optimization into production unless it actually does > reduce startup time. He has proven himself enough that we should assume > competence rather than incompetence :-) Oh, I'm definitely assuming that he knows what he's doing :-) Doesn't mean I can't ask the question though. ChrisA From rosuav at gmail.com Wed Apr 11 00:22:08 2018 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 11 Apr 2018 14:22:08 +1000 Subject: [Python-ideas] Accepting multiple mappings as positional arguments to create dicts In-Reply-To: References: Message-ID: On Wed, Apr 11, 2018 at 11:19 AM, Mike Miller wrote: > > On 2018-04-09 04:23, Daniel Moisset wrote: >> >> In which way would this be different to {**mapping1, **mapping2, >> **mapping3} ? > > > That's possible now, but believe the form mentioned previously would be more > readable: > > dict(d1, d2, d3) > That's more readable than {**d1, **d2, **d3} ? Doesn't look materially different to me. ChrisA From steve at pearwood.info Wed Apr 11 00:44:41 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 11 Apr 2018 14:44:41 +1000 Subject: [Python-ideas] Accepting multiple mappings as positional arguments to create dicts In-Reply-To: References: Message-ID: <20180411044441.GQ16661@ando.pearwood.info> On Wed, Apr 11, 2018 at 02:22:08PM +1000, Chris Angelico wrote: > > dict(d1, d2, d3) > > That's more readable than {**d1, **d2, **d3} ? Doesn't look materially > different to me. It does to me. 
On the one hand, we have a function call (okay, technically a type...)
"dict()" that can be googled on, with three arguments; on the other hand,
we have syntax that looks like a set {...} and contains the obscure **
prefix operator which is hard to google for.

-- Steve

From rosuav at gmail.com  Wed Apr 11 01:22:28 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 11 Apr 2018 15:22:28 +1000
Subject: [Python-ideas] Accepting multiple mappings as positional arguments to create dicts
In-Reply-To: <20180411044441.GQ16661@ando.pearwood.info>
References: 
 <20180411044441.GQ16661@ando.pearwood.info>
Message-ID: 

On Wed, Apr 11, 2018 at 2:44 PM, Steven D'Aprano wrote:
> On Wed, Apr 11, 2018 at 02:22:08PM +1000, Chris Angelico wrote:
>
>> > dict(d1, d2, d3)
>>
>> That's more readable than {**d1, **d2, **d3} ? Doesn't look materially
>> different to me.
>
> It does to me.
>
> On the one hand, we have a function call (okay, technically a type...)
> "dict()" that can be googled on, with three arguments; on the other
> hand, we have syntax that looks like a set {...} and contains the
> obscure ** prefix operator which is hard to google for.

True, you can google 'dict'. But the double-star operator is exactly the
same as is used in kwargs, and actually, I *can* search for it.

https://www.google.com.au/search?q=python+**

Lots of results for kwargs, which is a good start. (DuckDuckGo is less
useful here, though it too is capable of searching for "**". It just gives
more results about exponentiation than about packing/unpacking.)

The googleability argument may have been a killer a few years ago, but
search engines get smarter every day [1], and it's most definitely
possible to search for operators. Or at least some of them; Google and DDG
don't give me anything useful for "python @".

ChrisA

[1] and a search engine can help you find SmarterEveryDay, not that he
talks about Python

From rosuav at gmail.com  Wed Apr 11 01:32:04 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 11 Apr 2018 15:32:04 +1000
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
Message-ID: 

Wholesale changes since the previous version. Statement-local name
bindings have been dropped (I'm still keeping the idea in the back of my
head; this PEP wasn't the first time I'd raised the concept), and we're
now focusing primarily on assignment expressions, but also with consequent
changes to comprehensions.

Sorry for the lengthy delays; getting a reference implementation going
took me longer than I expected or intended.

ChrisA

PEP: 572
Title: Assignment Expressions
Author: Chris Angelico
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 28-Feb-2018
Python-Version: 3.8
Post-History: 28-Feb-2018, 02-Mar-2018, 23-Mar-2018, 04-Apr-2018

Abstract
========

This is a proposal for creating a way to assign to names within an
expression. Additionally, the precise scope of comprehensions is adjusted,
to maintain consistency and follow expectations.

Rationale
=========

Naming the result of an expression is an important part of programming,
allowing a descriptive name to be used in place of a longer expression,
and permitting reuse. Currently, this feature is available only in
statement form, making it unavailable in list comprehensions and other
expression contexts.
Merely introducing a way to assign as an expression would create bizarre
edge cases around comprehensions, though, and to avoid the worst of the
confusions, we change the definition of comprehensions, causing some edge
cases to be interpreted differently, but maintaining the existing
behaviour in the majority of situations.

Syntax and semantics
====================

In any context where arbitrary Python expressions can be used, a **named
expression** can appear. This can be parenthesized for clarity, and is of
the form ``(target := expr)`` where ``expr`` is any valid Python
expression, and ``target`` is any valid assignment target.

The value of such a named expression is the same as the incorporated
expression, with the additional side-effect that the target is assigned
that value.

    # Similar to the boolean 'or' but checking for None specifically
    x = "default" if (eggs := spam().ham) is None else eggs

    # Even complex expressions can be built up piece by piece
    y = ((eggs := spam()), (cheese := eggs.method()), cheese[eggs])

Differences from regular assignment statements
----------------------------------------------

An assignment statement can assign to multiple targets::

    x = y = z = 0

To do the same with assignment expressions, they must be parenthesized::

    assert 0 == (x := (y := (z := 0)))

Augmented assignment is not supported in expression form::

    >>> x +:= 1
      File "<stdin>", line 1
        x +:= 1
            ^
    SyntaxError: invalid syntax

Otherwise, the semantics of assignment are unchanged by this proposal.

Alterations to comprehensions
-----------------------------

The current behaviour of list/set/dict comprehensions and generator
expressions has some edge cases that would behave strangely if an
assignment expression were to be used. Therefore the proposed semantics
are changed, removing the current edge cases, and instead altering their
behaviour *only* in a class scope.

As of Python 3.7, the outermost iterable of any comprehension is evaluated
in the surrounding context, and then passed as an argument to the implicit
function that evaluates the comprehension.

Under this proposal, the entire body of the comprehension is evaluated in
its implicit function. Names not assigned to within the comprehension are
located in the surrounding scopes, as with normal lookups. As one special
case, a comprehension at class scope will **eagerly bind** any name which
is already defined in the class scope.

A list comprehension can be unrolled into an equivalent function. With
Python 3.7 semantics::

    numbers = [x + y for x in range(3) for y in range(4)]
    # Is approximately equivalent to
    def <listcomp>(iterator):
        result = []
        for x in iterator:
            for y in range(4):
                result.append(x + y)
        return result
    numbers = <listcomp>(iter(range(3)))

Under the new semantics, this would instead be equivalent to::

    def <listcomp>():
        result = []
        for x in range(3):
            for y in range(4):
                result.append(x + y)
        return result
    numbers = <listcomp>()

When a class scope is involved, a naive transformation into a function
would prevent name lookups (as the function would behave like a method).
    class X:
        names = ["Fred", "Barney", "Joe"]
        prefix = "> "
        prefixed_names = [prefix + name for name in names]

With Python 3.7 semantics, this will evaluate the outermost iterable at
class scope, which will succeed; but it will evaluate everything else in a
function::

    class X:
        names = ["Fred", "Barney", "Joe"]
        prefix = "> "
        def <listcomp>(iterator):
            result = []
            for name in iterator:
                result.append(prefix + name)
            return result
        prefixed_names = <listcomp>(iter(names))

The name ``prefix`` is thus searched for at global scope, ignoring the
class name. Under the proposed semantics, this name will be eagerly bound,
being approximately equivalent to::

    class X:
        names = ["Fred", "Barney", "Joe"]
        prefix = "> "
        def <listcomp>(prefix=prefix):
            result = []
            for name in names:
                result.append(prefix + name)
            return result
        prefixed_names = <listcomp>()

With list comprehensions, this is unlikely to cause any confusion. With
generator expressions, this has the potential to affect behaviour, as the
eager binding means that the name could be rebound between the creation of
the genexp and the first call to ``next()``. It is, however, more closely
aligned to normal expectations. The effect is ONLY seen with names that
are looked up from class scope; global names (eg ``range()``) will still
be late-bound as usual.

One consequence of this change is that certain bugs in genexps will not be
detected until the first call to ``next()``, where today they would be
caught upon creation of the generator.

TODO: Discuss the merits and costs of amelioration proposals.

Recommended use-cases
=====================

Simplifying list comprehensions
-------------------------------

These list comprehensions are all approximately equivalent::

    # Calling the function twice
    stuff = [[f(x), x/f(x)] for x in range(5)]

    # External helper function
    def pair(x, value): return [value, x/value]
    stuff = [pair(x, f(x)) for x in range(5)]

    # Inline helper function
    stuff = [(lambda y: [y,x/y])(f(x)) for x in range(5)]

    # Extra 'for' loop - potentially could be optimized internally
    stuff = [[y, x/y] for x in range(5) for y in [f(x)]]

    # Iterating over a genexp
    stuff = [[y, x/y] for x, y in ((x, f(x)) for x in range(5))]

    # Expanding the comprehension into a loop
    stuff = []
    for x in range(5):
        y = f(x)
        stuff.append([y, x/y])

    # Wrapping the loop in a generator function
    def g():
        for x in range(5):
            y = f(x)
            yield [y, x/y]
    stuff = list(g())

    # Using a mutable cache object (various forms possible)
    c = {}
    stuff = [[c.update(y=f(x)) or c['y'], x/c['y']] for x in range(5)]

    # Using a temporary name
    stuff = [[y := f(x), x/y] for x in range(5)]

If calling ``f(x)`` is expensive or has side effects, the clean operation
of the list comprehension gets muddled. Using a short-duration name
binding retains the simplicity; while the extra ``for`` loop does achieve
this, it does so at the cost of dividing the expression visually, putting
the named part at the end of the comprehension instead of the beginning.
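As a quick concrete check of the equivalence, the extra-``for`` variant can
be run today with a stand-in ``f`` (purely hypothetical, for illustration;
only the last line requires this PEP)::

    def f(x):
        return x*x + 1  # stand-in for an expensive call

    # Works today:
    stuff = [[y, x/y] for x in range(5) for y in [f(x)]]
    # stuff == [[1, 0.0], [2, 0.5], [5, 0.4], [10, 0.3], [17, 0.23529411764705882]]

    # Same result with this PEP, with the name up front:
    stuff = [[y := f(x), x/y] for x in range(5)]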
Capturing condition values
--------------------------

Assignment expressions can be used to good effect in the header of an
``if`` or ``while`` statement::

    # Current Python, not caring about function return value
    while input("> ") != "quit":
        print("You entered a command.")

    # Current Python, capturing return value - four-line loop header
    while True:
        command = input("> ")
        if command == "quit": break
        print("You entered:", command)

    # Proposed alternative to the above
    while (command := input("> ")) != "quit":
        print("You entered:", command)

    # Capturing regular expression match objects
    # See, for instance, Lib/pydoc.py, which uses a multiline spelling
    # of this effect
    if match := re.search(pat, text):
        print("Found:", match.group(0))

    # Reading socket data until an empty string is returned
    while data := sock.read():
        print("Received data:", data)

Particularly with the ``while`` loop, this can remove the need to have an
infinite loop, an assignment, and a condition. It also creates a smooth
parallel between a loop which simply uses a function call as its
condition, and one which uses that as its condition but also uses the
actual value.

Rejected alternative proposals
==============================

Proposals broadly similar to this one have come up frequently on
python-ideas. Below are a number of alternative syntaxes, some of them
specific to comprehensions, which have been rejected in favour of the one
given above.

Alternative spellings
---------------------

Broadly the same semantics as the current proposal, but spelled
differently.

1. ``EXPR as NAME``, with or without parentheses::

       stuff = [[f(x) as y, x/y] for x in range(5)]

   Omitting the parentheses in this form of the proposal introduces many
   syntactic ambiguities. Requiring them in all contexts leaves open the
   option to make them optional in specific situations where the syntax is
   unambiguous (cf generator expressions as sole parameters in function
   calls), but there is no plausible way to make them optional everywhere.

   With the parentheses, this becomes a viable option, with its own
   tradeoffs in syntactic ambiguity. Since ``EXPR as NAME`` already has
   meaning in ``except`` and ``with`` statements (with different
   semantics), this would create unnecessary confusion or require
   special-casing.

2. Adorning statement-local names with a leading dot::

       stuff = [[(f(x) as .y), x/.y] for x in range(5)]  # with "as"
       stuff = [[(.y := f(x)), x/.y] for x in range(5)]  # with ":="

   This has the advantage that leaked usage can be readily detected,
   removing some forms of syntactic ambiguity. However, this would be the
   only place in Python where a variable's scope is encoded into its name,
   making refactoring harder.

   This syntax is quite viable, and could be promoted to become the
   current recommendation if its advantages are found to outweigh its
   cost.

3. Adding a ``where:`` to any statement to create local name bindings::

       value = x**2 + 2*x where:
           x = spam(1, 4, 7, q)

   Execution order is inverted (the indented body is performed first,
   followed by the "header"). This requires a new keyword, unless an
   existing keyword is repurposed (most likely ``with:``). See PEP 3150
   for prior discussion on this subject (with the proposed keyword being
   ``given:``).

Special-casing conditional statements
-------------------------------------

One of the most popular use-cases is ``if`` and ``while`` statements.
Instead of a more general solution, this proposal enhances the syntax of
these two statements to add a means of capturing the compared value::

    if re.search(pat, text) as match:
        print("Found:", match.group(0))

This works beautifully if and ONLY if the desired condition is based on
the truthiness of the captured value. It is thus effective for specific
use-cases (regex matches, socket reads that return `''` when done), and
completely useless in more complicated cases (eg where the condition is
``f(x) < 0`` and you want to capture the value of ``f(x)``). It also has
no benefit to list comprehensions.

Advantages: No syntactic ambiguities. Disadvantages: Answers only a
fraction of possible use-cases, even in ``if``/``while`` statements.

Special-casing comprehensions
-----------------------------

Another common use-case is comprehensions (list/set/dict, and genexps). As
above, proposals have been made for comprehension-specific solutions.

1. ``where``, ``let``, or ``given``::

       stuff = [(y, x/y) where y = f(x) for x in range(5)]
       stuff = [(y, x/y) let y = f(x) for x in range(5)]
       stuff = [(y, x/y) given y = f(x) for x in range(5)]

   This brings the subexpression to a location in between the 'for' loop
   and the expression. It introduces an additional language keyword, which
   creates conflicts. Of the three, ``where`` reads the most cleanly, but
   also has the greatest potential for conflict (eg SQLAlchemy and numpy
   have ``where`` methods, as does ``tkinter.dnd.Icon`` in the standard
   library).

2. ``with NAME = EXPR``::

       stuff = [(y, x/y) with y = f(x) for x in range(5)]

   As above, but reusing the `with` keyword. Doesn't read too badly, and
   needs no additional language keyword. Is restricted to comprehensions,
   though, and cannot as easily be transformed into "longhand" for-loop
   syntax. Has the C problem that an equals sign in an expression can now
   create a name binding, rather than performing a comparison. Would raise
   the question of why "with NAME = EXPR:" cannot be used as a statement
   on its own.

3. ``with EXPR as NAME``::

       stuff = [(y, x/y) with f(x) as y for x in range(5)]

   As per option 2, but using ``as`` rather than an equals sign. Aligns
   syntactically with other uses of ``as`` for name binding, but a simple
   transformation to for-loop longhand would create drastically different
   semantics; the meaning of ``with`` inside a comprehension would be
   completely different from the meaning as a stand-alone statement, while
   retaining identical syntax.

Regardless of the spelling chosen, this introduces a stark difference
between comprehensions and the equivalent unrolled long-hand form of the
loop. It is no longer possible to unwrap the loop into statement form
without reworking any name bindings. The only keyword that can be
repurposed to this task is ``with``, thus giving it sneakily different
semantics in a comprehension than in a statement; alternatively, a new
keyword is needed, with all the costs therein.

Migration path
==============

The semantic changes to list/set/dict comprehensions, and more so to
generator expressions, may potentially require migration of code. In many
cases, the changes simply make legal what used to raise an exception, but
there are some edge cases that were previously legal and are not, and a
few corner cases with altered semantics.

Yield inside comprehensions
---------------------------

As of Python 3.7, the outermost iterable in a comprehension is permitted
to contain a 'yield' expression.
If this is required, the iterable (or at least the yield) must be
explicitly elevated from the comprehension::

    # Python 3.7
    def g():
        return [x for x in [(yield 1)]]

    # With PEP 572
    def g():
        sent_item = (yield 1)
        return [x for x in [sent_item]]

This more clearly shows that it is g(), not the comprehension, which is
able to yield values (and is thus a generator function). The entire
comprehension is consistently in a single scope.

Name lookups in class scope
---------------------------

A comprehension inside a class previously was able to 'see' class members
ONLY from the outermost iterable. Other name lookups would ignore the
class and potentially locate a name at an outer scope::

    pattern = "<%d>"
    class X:
        pattern = "[%d]"
        numbers = [pattern % n for n in range(5)]

In Python 3.7, ``X.numbers`` would show angle brackets; with PEP 572, it
would show square brackets. Maintaining the current behaviour here is best
done by using distinct names for the different forms of ``pattern``, as
would be the case with functions.

Generator expression bugs can be caught later
---------------------------------------------

Certain types of bugs in genexps were previously caught more quickly. Some
are now detected only at first iteration::

    gen = (x for x in rage(10))     # NameError
    gen = (x for x in 10)           # TypeError (not iterable)
    gen = (x for x in range(1/0))   # Exception raised during evaluation

This brings such generator expressions in line with a simple translation
to function form::

    def <genexp>():
        for x in rage(10):
            yield x
    gen = <genexp>()  # No exception yet
    tng = next(gen)   # NameError

To detect these errors more quickly, ... TODO.

Open questions
==============

Can the outermost iterable still be evaluated early?
----------------------------------------------------

As of Python 3.7, the outermost iterable in a genexp is evaluated early,
and the result passed to the implicit function as an argument. With PEP
572, this would no longer be the case. Can we still, somehow, evaluate it
before moving on? One possible implementation would be::

    gen = (x for x in rage(10))
    # translates to
    def <genexp>():
        iterable = iter(rage(10))
        yield None
        for x in iterable:
            yield x
    gen = <genexp>()
    next(gen)

This would pump the iterable up to just before the loop starts, evaluating
exactly as much as is evaluated outside the generator function in Py3.7.
This would result in it being possible to call ``gen.send()`` immediately,
unlike with most generators, and may incur unnecessary overhead in the
common case where the iterable is pumped immediately (perhaps as part of a
larger expression).

Frequently Raised Objections
============================

Why not just turn existing assignment into an expression?
---------------------------------------------------------

C and its derivatives define the ``=`` operator as an expression, rather
than a statement as is Python's way. This allows assignments in more
contexts, including contexts where comparisons are more common. The
syntactic similarity between ``if (x == y)`` and ``if (x = y)`` belies
their drastically different semantics. Thus this proposal uses ``:=`` to
clarify the distinction.

This could be used to create ugly code!
---------------------------------------

So can anything else. This is a tool, and it is up to the programmer to
use it where it makes sense, and not use it where superior constructs can
be used.

With assignment expressions, why bother with assignment statements?
-------------------------------------------------------------------

The two forms have different flexibilities.
The ``:=`` operator can be used inside a larger expression; the ``=``
operator can be chained more conveniently, and closely parallels the
inline operations ``+=`` and friends. The assignment statement is a clear
declaration of intent: this value is to be assigned to this target, and
that's it.

Acknowledgements
================

The author wishes to thank Guido van Rossum and Nick Coghlan for their
considerable contributions to this proposal, and to members of the
core-mentorship mailing list for assistance with implementation.

References
==========

.. [1] Proof of concept / reference implementation
   (https://github.com/Rosuav/cpython/tree/assignment-expressions)

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

From ethan at stoneleaf.us  Wed Apr 11 01:54:47 2018
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 10 Apr 2018 22:54:47 -0700
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To: 
References: 
Message-ID: <5ACDA327.7050605@stoneleaf.us>

On 04/10/2018 10:32 PM, Chris Angelico wrote:

> Migration path
> ==============
>
> The semantic changes to list/set/dict comprehensions, and more so to generator
> expressions, may potentially require migration of code. In many cases, the
> changes simply make legal what used to raise an exception, but there are some
> edge cases that were previously legal and are not, and a few corner cases with
> altered semantics.

s/previously legal and are not/previously legal and now are not/

--
~Ethan~

From steve at pearwood.info  Wed Apr 11 02:06:42 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 11 Apr 2018 16:06:42 +1000
Subject: [Python-ideas] Move optional data out of pyc files
In-Reply-To: 
References: <20180411000335.GN16661@ando.pearwood.info>
 <20180411030205.GO16661@ando.pearwood.info>
Message-ID: <20180411060642.GR16661@ando.pearwood.info>

On Wed, Apr 11, 2018 at 02:21:17PM +1000, Chris Angelico wrote:
[...]
> > Yes, it will double the number of files. Actually quadruple it, if the
> > annotations and line numbers are in separate files too. But if most of
> > those extra files never need to be opened, then there's no cost to them.
> > And whatever extra cost there is, is amortized over the lifetime of the
> > interpreter.
>
> Yes, if they are actually not needed. My question was about whether
> that is truly valid.

We're never really going to know the effect on performance without
implementing and benchmarking the code. It might turn out that, to our
surprise, three quarters of the std lib relies on loading docstrings
during startup. But I doubt it.

> Consider a very common use-case: an OS-provided
> Python interpreter whose files are all owned by 'root'. Those will be
> distributed with .pyc files for performance, but you don't want to
> deprive the users of help() and anything else that needs docstrings
> etc. So... are the docstrings lazily loaded or eagerly loaded?

What relevance is it that they're owned by root?

> If eagerly, you've doubled the number of file-open calls to initialize
> the interpreter.

I do not understand why you think this is even an option. Has Serhiy said
something that I missed that makes this seem to be on the table? That's
not a rhetorical question -- I may have missed something. But I'm sure he
understands that doubling or quadrupling the number of file operations
during startup is not an optimization.
> (Or quadrupled, if you need annotations and line
> numbers and they're all separate.) If lazily, things are a lot more
> complicated than the original description suggested, and there'd need
> to be some semantic changes here.

What semantic change do you expect?

There's an implementation change, of course, but that's Serhiy's problem
to deal with and I'm sure that he has considered that. There should be no
semantic change. When you access obj.__doc__, then and only then are the
compiled docstrings for that module read from the disk.

I don't know the current implementation of .pyc files, but I like
Antoine's suggestion of laying it out in four separate areas (plus
header), each one marshalled:

    code
    docstrings
    annotations
    line numbers

Aside from code, which is mandatory, the three other sections could be
None to represent "not available", as is the case when you pass -OO to
the interpreter, or they could be some other sentinel that means "load
lazily from the appropriate file", or they could be the marshalled data
directly in place to support byte-code only libraries.

As for the in-memory data structures of objects themselves, I imagine
something like the __doc__ and __annotations__ slots pointing to a table
of strings, which is not initialised until you attempt to read from the
table. Or something -- don't pay too much attention to my wild guesses.

The bottom line is, is there some reason *aside from performance* to
avoid this? Because if the performance is worse, I'm sure Serhiy will be
the first to dump this idea.

-- Steve

From yaoxiansamma at gmail.com  Wed Apr 11 02:20:59 2018
From: yaoxiansamma at gmail.com (Thautwarm Zhao)
Date: Wed, 11 Apr 2018 14:20:59 +0800
Subject: [Python-ideas] Python-ideas Digest, Vol 137, Issue 40
In-Reply-To: 
References: 
Message-ID: 

I think Guido has given a direct answer why dict unpacking is not
supported at the syntax level. I can take it, and I think it's better to
implement a function for dict unpacking in the standard library, just
like:

    from dict_unpack import dict_unpack, pattern as pat

    some_dict = {'a': {'b': {'c': 1}, 'd': 2}, 'e': 3}

    extracted = dict_unpack(some_dict,
                            schema={'a': {'b': {'c': pat('V1')}, 'd': pat('V2')},
                                    'e': pat('V3')})
    # extract to a flattened dictionary

    v1, v2, v3 = (extracted[k] for k in ('V1', 'V2', 'V3'))
    assert (v1, v2, v3) == (1, 2, 3)

As for Steve's confusion,

> > {key: value_pattern, **_} = {key: value, **_}
> If I saw that, I would have no idea what it could even possibly do.
> Let's pick the simplest concrete example I can think of:
>
> {'A': 1, **{}} = {'A': 0, **{}}
>
> I cannot interpret what that should do. Is it some sort of
> pattern-matching? An update? What is the result? It is obviously some
> sort of binding operation, an assignment, but an assignment to what?

{'A': 1, **{}} = {'A': 0, **{}} should be just wrong, because for any k-v
pair at the LHS, the key should be an expression and the value is for
unpacking.

{'A': [*a, b]} = {'A': [1, 2, 3]} is welcome, but {'A': 1} = {'A': '1'}
is also something like pattern matching, which is out of our topic.

Anyway, this feature will not come true; let's forget it...

I think Jacco is totally correct in the following words.
> a, b = [c.pop(key) for key in ('a', 'b')] > > would extract all the keys you need, and has the advantage that > you don't need hardcoded dict structure if you expand it to nested > dicts. It's even less writing, and just as extensible to nested dicts. > And if you dont actually want to destruct (tuples and lists aren't > destroyed either), just use __getitem__ access instead of pop. But pop cannot work for a nested case. Feel free to end this topic. thautwarm 2018-04-10 23:20 GMT+08:00 : > Send Python-ideas mailing list submissions to > python-ideas at python.org > > To subscribe or unsubscribe via the World Wide Web, visit > https://mail.python.org/mailman/listinfo/python-ideas > or, via email, send a message with subject or body 'help' to > python-ideas-request at python.org > > You can reach the person managing the list at > python-ideas-owner at python.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of Python-ideas digest..." > > Today's Topics: > > 1. Re: Is there any idea about dictionary destructing? > (Steven D'Aprano) > 2. Re: Is there any idea about dictionary destructing? > (Jacco van Dorp) > 3. Re: Start argument for itertools.accumulate() [Was: Proposal: > A Reduce-Map Comprehension and a "last" builtin] (Guido van Rossum) > 4. Re: Is there any idea about dictionary destructing? > (Guido van Rossum) > > > ---------- ????? ---------- > From: "Steven D'Aprano" > To: python-ideas at python.org > Cc: > Bcc: > Date: Tue, 10 Apr 2018 19:21:35 +1000 > Subject: Re: [Python-ideas] Is there any idea about dictionary destructing? > On Tue, Apr 10, 2018 at 03:29:08PM +0800, Thautwarm Zhao wrote: > > > I'm focused on the consistency of the language itself. > > Consistency is good, but it is not the only factor to consider. We must > guard against *foolish* consistency: adding features just for the sake > of matching some other, often barely related, feature. Each feature must > justify itself, and consistency with something else is merely one > possible attempt at justification. > > > > {key: value_pattern, **_} = {key: value, **_} > > If I saw that, I would have no idea what it could even possibly do. > Let's pick the simplest concrete example I can think of: > > {'A': 1, **{}} = {'A': 0, **{}} > > I cannot interpret what that should do. Is it some sort of > pattern-matching? An update? What is the result? It is obviously some > sort of binding operation, an assignment, but an assignment to what? > > Sequence binding and unpacking was obvious the first time I saw it. I > had no problem guessing what: > > a, b, c = 1, 2, 3 > > meant, and once I had seen that, it wasn't hard to guess what > > a, b, c = *sequence > > meant. From there it is easy to predict extended unpacking. But I can't > say the same for this. > > I can almost see the point of: > > a, b, c, = **{'a': 1, 'b': 2, 'c': 3} > > but I'm having trouble thinking of a situation where I would actually > use it. But your syntax above just confuses me. > > > > The reason why it's important is that, when destructing/constructing for > > built-in data structures are not supported completely, > > people might ask why "[a, *b] = c" is ok but "{"a": a, **b} = c" not. > > People can ask all sorts of questions. I've seen people ask why Python > doesn't support line numbers and GOTO. We're allowed to answer "Because > it is a bad idea", or even "Because we don't think it is good enough to > justify the cost". 
> > > > If only multiple assignment is supported, why "(a, (b, c)) = d" could be > > ok? It's exactly destructing! > > That syntax is supported. I don't understand your point here. > > > > >> {'a': a, 'b': b, **c} = {'a': 1, **{'b': 2}} > > SyntaxError: can't assign to literal > > > > Above example could be confusing in some degree, I think. > > I have no idea what you expect it to do. Even something simpler: > > {'a': a} = {'a': 2} > > leaves me in the dark. > > > > > -- > Steve > > > > ---------- ????? ---------- > From: Jacco van Dorp > To: python-ideas at python.org > Cc: > Bcc: > Date: Tue, 10 Apr 2018 11:52:40 +0200 > Subject: Re: [Python-ideas] Is there any idea about dictionary destructing? > I must say I can't really see the point either. If you say like: > > > {'a': a, 'b': b, **c} = {'a': 1, **{'b': 2}} > > Do you basically mean: > > c = {'a': 1, **{'b': 2}} > a = c.pop("a") > b = c.pop("b") # ? > > That's the only thing I could think of. > > I think most of these problems could be solved with pop and the > occasional list comprehension like this: > > a, b, c = [{'a':1,'b':2,'c':3}.pop(key) for key in ('a', 'b', 'c')] > > or for your example: > > c = {'a': 1, **{'b': 2}} # I suppose this one would generally > # be dynamic, but I need a name here. > a, b = [c.pop(key) for key in ('a', 'b')] > > would extract all the keys you need, and has the advantage that > you don't need hardcoded dict structure if you expand it to nested > dicts. It's even less writing, and just as extensible to nested dicts. > And if you dont actually want to destruct (tuples and lists aren't > destroyed either), just use __getitem__ access instead of pop. > > 2018-04-10 11:21 GMT+02:00 Steven D'Aprano : > > On Tue, Apr 10, 2018 at 03:29:08PM +0800, Thautwarm Zhao wrote: > > > >> I'm focused on the consistency of the language itself. > > > > Consistency is good, but it is not the only factor to consider. We must > > guard against *foolish* consistency: adding features just for the sake > > of matching some other, often barely related, feature. Each feature must > > justify itself, and consistency with something else is merely one > > possible attempt at justification. > > > > > >> {key: value_pattern, **_} = {key: value, **_} > > > > If I saw that, I would have no idea what it could even possibly do. > > Let's pick the simplest concrete example I can think of: > > > > {'A': 1, **{}} = {'A': 0, **{}} > > > > I cannot interpret what that should do. Is it some sort of > > pattern-matching? An update? What is the result? It is obviously some > > sort of binding operation, an assignment, but an assignment to what? > > > > Sequence binding and unpacking was obvious the first time I saw it. I > > had no problem guessing what: > > > > a, b, c = 1, 2, 3 > > > > meant, and once I had seen that, it wasn't hard to guess what > > > > a, b, c = *sequence > > > > meant. From there it is easy to predict extended unpacking. But I can't > > say the same for this. > > > > I can almost see the point of: > > > > a, b, c, = **{'a': 1, 'b': 2, 'c': 3} > > > > but I'm having trouble thinking of a situation where I would actually > > use it. But your syntax above just confuses me. > > > > > >> The reason why it's important is that, when destructing/constructing for > >> built-in data structures are not supported completely, > >> people might ask why "[a, *b] = c" is ok but "{"a": a, **b} = c" not. > > > > People can ask all sorts of questions. I've seen people ask why Python > > doesn't support line numbers and GOTO. 
We're allowed to answer "Because > > it is a bad idea", or even "Because we don't think it is good enough to > > justify the cost". > > > > > >> If only multiple assignment is supported, why "(a, (b, c)) = d" could be > >> ok? It's exactly destructing! > > > > That syntax is supported. I don't understand your point here. > > > > > >> >> {'a': a, 'b': b, **c} = {'a': 1, **{'b': 2}} > >> SyntaxError: can't assign to literal > >> > >> Above example could be confusing in some degree, I think. > > > > I have no idea what you expect it to do. Even something simpler: > > > > {'a': a} = {'a': 2} > > > > leaves me in the dark. > > > > > > > > > > -- > > Steve > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > ---------- ????? ---------- > From: Guido van Rossum > To: "Stephen J. Turnbull" > Cc: Tim Peters , Python-Ideas < > python-ideas at python.org> > Bcc: > Date: Tue, 10 Apr 2018 08:05:18 -0700 > Subject: Re: [Python-ideas] Start argument for itertools.accumulate() > [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin] > On Mon, Apr 9, 2018 at 11:35 PM, Stephen J. Turnbull < > turnbull.stephen.fw at u.tsukuba.ac.jp> wrote: > >> Tim Peters writes: >> >> > "Sum reduction" and "running-sum accumulation" are primitives in >> > many peoples' brains. >> >> I wonder what Kahneman would say about that. He goes to some length >> to explain that people are quite good (as human abilities go) at >> perceiving averages over sets but terrible at summing the same. Maybe >> they substitute the abstraction of summation for the ability to >> perform the operation? >> > > [OT] How is that human ability tested? I am a visual learner and I would > propose that if you have a set of numbers, you can graph it in different > ways to make it easier to perceive one or the other (or maybe both): > > - to emphasize the average, draw a line graph -- in my mind I draw a line > through the average (getting the trend for free) > - to emphasize the sum, draw a histogram -- in my mind I add up the sizes > of the bars > > -- > --Guido van Rossum (python.org/~guido) > > > ---------- ????? ---------- > From: Guido van Rossum > To: Python-Ideas > Cc: > Bcc: > Date: Tue, 10 Apr 2018 08:20:11 -0700 > Subject: Re: [Python-ideas] Is there any idea about dictionary destructing? > Here's one argument why sequence unpacking is more important than dict > unpacking. > > Without sequence unpacking, you have a long sequence, to get at a specific > item you'd need to use indexing, where you often end up having to remember > the indices for each type of information. Say you have points of the form > (x, y, z, t), to get at the t coordinate you'd have to write p[3]. With > sequence unpacking you can write > > x, y, z, t = p > > and then you can use the individual variables in the subsequent code. > > However if your point had the form {'x': x, 'y': y, 'z': z, 't': t}, you > could just write p['t']. This is much more mnemonic than p[3]. > > All the rest follows -- after a while extended forms of iterable unpacking > start making sense. But for dicts the use case is just much less common. > > (If you're doing a lot of JSON you might have a different view on that. > You should probably use some kind of schema-guided parser though.) 
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rosuav at gmail.com Wed Apr 11 02:21:46 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 11 Apr 2018 16:21:46 +1000
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To: <5ACDA327.7050605@stoneleaf.us>
References: <5ACDA327.7050605@stoneleaf.us>
Message-ID:

On Wed, Apr 11, 2018 at 3:54 PM, Ethan Furman wrote:
> On 04/10/2018 10:32 PM, Chris Angelico wrote:
>
>> Migration path
>> ==============
>>
>> The semantic changes to list/set/dict comprehensions, and more so to
>> generator expressions, may potentially require migration of code. In many
>> cases, the changes simply make legal what used to raise an exception, but
>> there are some edge cases that were previously legal and are not, and a
>> few corner cases with altered semantics.
>
> s/previously legal and are not/previously legal and now are not/
>

Trivial change, easy fix. Thanks.

ChrisA

From gadgetsteve at live.co.uk Tue Apr 10 14:52:36 2018
From: gadgetsteve at live.co.uk (Steve Barnes)
Date: Tue, 10 Apr 2018 18:52:36 +0000
Subject: [Python-ideas] Move optional data out of pyc files
In-Reply-To:
References:
Message-ID:

On 10/04/2018 18:54, Zachary Ware wrote:
> On Tue, Apr 10, 2018 at 12:38 PM, Chris Angelico wrote:
>> A deployed Python distribution generally has .pyc files for all of the
>> standard library. I don't think people want to lose the ability to
>> call help(), and unless I'm misunderstanding, that requires
>> docstrings. So this will mean twice as many files and twice as many
>> file-open calls to import from the standard library. What will be the
>> impact on startup time?
>
> What about instead of separate files turning the single file into a
> pseudo-zip file containing all of the proposed files, and provide a
> simple tool for removing whatever parts you don't want?
>

Personally I quite like the idea of having the doc strings, and possibly
other optional components, in a zipped section after a marker for the end
of the operational code. Possibly the loader could stop reading at that
point, (reducing load time and memory impact), and only load and unzip on
demand. Zipping the doc strings should have a significant reduction in
file sizes, but it is worth remembering a couple of things:

- Python is already one of the most compact languages for what it can do -
I have had experts demanding to know where the rest of the program is
hidden, and how it is being downloaded, when they noticed the size of the
installed code versus the functionality provided.
- File size <> disk space consumed - on most file systems each file
typically occupies 1 + (file_size // allocation_size) clusters of the
drive, and with increasing disk sizes the allocation_size is generally
increasing; both of my NTFS drives currently have 4096 byte allocation
sizes, but I am offered up to 2 MB allocation sizes. Splitting a 10,052
byte .pyc file (picking a random example from my drive) into 5,052 and
5,000 byte files will change the disk space occupied from 3*4,096 to
4*4,096, plus the extra directory entry.
- Where absolute file size is critical (such as embedded systems), you can
always use the -O & -OO flags.
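To make the arithmetic in that comparison concrete, here is a minimal sketch of the cluster formula quoted above (the 4096-byte allocation size is just the example from Steve's drives, and the formula deliberately mirrors his approximation):

    def clusters(file_size, allocation_size=4096):
        # Steve's rule of thumb: 1 + (file_size // allocation_size).
        # It overcounts exact multiples by one cluster, but matches the email.
        return 1 + file_size // allocation_size

    whole = clusters(10052)                  # 3 clusters = 12,288 bytes on disk
    split = clusters(5052) + clusters(5000)  # 2 + 2 = 4 clusters = 16,384 bytes
    print(whole, split)                      # the split version costs one extra cluster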
-- Steve (Gadget) Barnes Any opinions in this message are my personal opinions and do not reflect those of my employer. --- This email has been checked for viruses by AVG. http://www.avg.com From python-ideas at mgmiller.net Wed Apr 11 03:44:52 2018 From: python-ideas at mgmiller.net (Mike Miller) Date: Wed, 11 Apr 2018 00:44:52 -0700 Subject: [Python-ideas] Accepting multiple mappings as positional arguments to create dicts In-Reply-To: References: <20180411044441.GQ16661@ando.pearwood.info> Message-ID: Ok, we can haggle the finer details and I admit once you learn the syntax it isn't substantially harder. Simply, I've found the dict() a bit easier to mentally parse at a glance. Also, to add I've always expected multiple args to work with it, and am always surprised when it doesn't. Would never have thought of this unpacking syntax if I didn't know that's the way its done now, but often have to think about it for a second or two. On 2018-04-10 22:22, Chris Angelico wrote: > On Wed, Apr 11, 2018 at 2:44 PM, Steven D'Aprano wrote: >> On Wed, Apr 11, 2018 at 02:22:08PM +1000, Chris Angelico wrote: From p.f.moore at gmail.com Wed Apr 11 03:55:28 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 11 Apr 2018 08:55:28 +0100 Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin In-Reply-To: <20180411034115.GP16661@ando.pearwood.info> References: <20180406011854.GU16661@ando.pearwood.info> <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com> <20180410173255.GM16661@ando.pearwood.info> <20180411034115.GP16661@ando.pearwood.info> Message-ID: On 11 April 2018 at 04:41, Steven D'Aprano wrote: >> > But in a way that more intuitively expresses the intent of the code, it >> > would be great to have more options on the market. >> >> It's worth adding a reminder here that "having more options on the >> market" is pretty directly in contradiction to the Zen of Python - >> "There should be one-- and preferably only one --obvious way to do >> it". > > I'm afraid I'm going to (mildly) object here. At least you didn't > misquote the Zen as "Only One Way To Do It" :-) > > The Zen here is not a prohibition against there being multiple ways to > do something -- how could it, given that Python is a general purpose > programming language there is always going to be multiple ways to write > any piece of code? Rather, it exhorts us to make sure that there are one > or more ways to "do it", at least one of which is obvious. I apologise if I came across as implying that I thought the Zen said that having multiple ways was prohibited. I don't (and certainly the Zen doesn't mean that). Rather, I was saying that using "it gives us an additional way to do something" is a bad argument in favour of a proposal for Python. At a minimum, the proposal needs to argue why the new feature is "more obvious" than the existing ways (bonus points if the proposer is Dutch - see the following Zen item ;-)), or why it offers a capability that isn't possible with the existing language. And I'm not even saying that the OP hasn't attempted to make such arguments (even if I disagree with them). All I was pointing out was that the comment "it would be great to have more options on the market" implies a misunderstanding of the design goals of Python (hence my "reminder" of the principle I think is relevant here). Sorry again if that's not what it sounded like. 
Paul

From encukou at gmail.com Wed Apr 11 04:26:15 2018
From: encukou at gmail.com (Petr Viktorin)
Date: Wed, 11 Apr 2018 10:26:15 +0200
Subject: [Python-ideas] Move optional data out of pyc files
In-Reply-To:
References: <20180411000335.GN16661@ando.pearwood.info>
 <20180411030205.GO16661@ando.pearwood.info>
Message-ID:

On 04/11/18 06:21, Chris Angelico wrote:
> On Wed, Apr 11, 2018 at 1:02 PM, Steven D'Aprano wrote:
>> On Wed, Apr 11, 2018 at 10:08:58AM +1000, Chris Angelico wrote:
>>
>>> File system limits aren't usually an issue; as you say, even FAT32 can
>>> store a metric ton of files in a single directory. I'm more interested
>>> in how long it takes to open a file, and whether doubling that time
>>> will have a measurable impact on Python startup time. Part of that
>>> cost can be reduced by using openat(), on platforms that support it,
>>> but even with a directory handle, there's still a definite non-zero
>>> cost to opening and reading an additional file.
>>
>> Yes, it will double the number of files. Actually quadruple it, if the
>> annotations and line numbers are in separate files too. But if most of
>> those extra files never need to be opened, then there's no cost to them.
>> And whatever extra cost there is, is amortized over the lifetime of the
>> interpreter.
>
> Yes, if they are actually not needed. My question was about whether
> that is truly valid. Consider a very common use-case: an OS-provided
> Python interpreter whose files are all owned by 'root'. Those will be
> distributed with .pyc files for performance, but you don't want to
> deprive the users of help() and anything else that needs docstrings
> etc.

Currently in Fedora, we ship *both* optimized and non-optimized pycs to
make sure both -O and non-O will work nicely without root privileges.
So splitting the docstrings into a separate file would be, for us, a
benefit in terms of file size.

> So... are the docstrings lazily loaded or eagerly loaded? If
> eagerly, you've doubled the number of file-open calls to initialize
> the interpreter. (Or quadrupled, if you need annotations and line
> numbers and they're all separate.) If lazily, things are a lot more
> complicated than the original description suggested, and there'd need
> to be some semantic changes here.
>
>> Serhiy is experienced enough that I think we should assume he's not
>> going to push this optimization into production unless it actually does
>> reduce startup time. He has proven himself enough that we should assume
>> competence rather than incompetence :-)
>
> Oh, I'm definitely assuming that he knows what he's doing :-) Doesn't
> mean I can't ask the question though.
>
> ChrisA
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>

From encukou at gmail.com Wed Apr 11 04:28:52 2018
From: encukou at gmail.com (Petr Viktorin)
Date: Wed, 11 Apr 2018 10:28:52 +0200
Subject: [Python-ideas] Move optional data out of pyc files
In-Reply-To: <20180411060642.GR16661@ando.pearwood.info>
References: <20180411000335.GN16661@ando.pearwood.info>
 <20180411030205.GO16661@ando.pearwood.info>
 <20180411060642.GR16661@ando.pearwood.info>
Message-ID: <37fa8c80-5c38-ab88-8de5-2f45ef24d54e@gmail.com>

On 04/11/18 08:06, Steven D'Aprano wrote:
> On Wed, Apr 11, 2018 at 02:21:17PM +1000, Chris Angelico wrote:
>
> [...]
>>> Yes, it will double the number of files.
Actually quadruple it, if the >>> annotations and line numbers are in separate files too. But if most of >>> those extra files never need to be opened, then there's no cost to them. >>> And whatever extra cost there is, is amortized over the lifetime of the >>> interpreter. >> >> Yes, if they are actually not needed. My question was about whether >> that is truly valid. > > We're never really going to know the affect on performance without > implementing and benchmarking the code. It might turn out that, to our > surprise, three quarters of the std lib relies on loading docstrings > during startup. But I doubt it. > > >> Consider a very common use-case: an OS-provided >> Python interpreter whose files are all owned by 'root'. Those will be >> distributed with .pyc files for performance, but you don't want to >> deprive the users of help() and anything else that needs docstrings >> etc. So... are the docstrings lazily loaded or eagerly loaded? > > What relevance is that they're owned by root? > > >> If eagerly, you've doubled the number of file-open calls to initialize >> the interpreter. > > I do not understand why you think this is even an option. Has Serhiy > said something that I missed that makes this seem to be on the table? > That's not a rhetorical question -- I may have missed something. But I'm > sure he understands that doubling or quadrupling the number of file > operations during startup is not an optimization. > > >> (Or quadrupled, if you need annotations and line >> numbers and they're all separate.) If lazily, things are a lot more >> complicated than the original description suggested, and there'd need >> to be some semantic changes here. > > What semantic change do you expect? > > There's an implementation change, of course, but that's Serhiy's problem > to deal with and I'm sure that he has considered that. There should be > no semantic change. When you access obj.__doc__, then and only then are > the compiled docstrings for that module read from the disk. > > I don't know the current implementation of .pyc files, but I like > Antoine's suggestion of laying it out in four separate areas (plus > header), each one marshalled: > > code > docstrings > annotations > line numbers > > Aside from code, which is mandatory, the three other sections could be > None to represent "not available", as is the case when you pass -00 to > the interpreter, or they could be some other sentinel that means "load > lazily from the appropriate file", or they could be the marshalled data > directly in place to support byte-code only libraries. > > As for the in-memory data structures of objects themselves, I imagine > something like the __doc__ and __annotation__ slots pointing to a table > of strings, which is not initialised until you attempt to read from the > table. Or something -- don't pay too much attention to my wild guesses. A __doc__ sentinel could even say something like "bytes 350--420 in the original .py file, as UTF-8". > > The bottom line is, is there some reason *aside from performance* to > avoid this? Because if the performance is worse, I'm sure Serhiy will be > the first to dump this idea. 
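As a rough illustration of the lazy loading Steven sketches here, a module's docstrings could live in a side table that is only read from disk on first use; this is purely a sketch of the idea, not the proposed pyc format (the JSON side file and its name are invented for illustration; the real thing would presumably use marshal):

    import json

    class LazyDocs:
        """Map of qualname -> docstring, loaded from a side file on first use."""
        def __init__(self, path):
            self.path = path
            self._table = None
        def __getitem__(self, name):
            if self._table is None:  # first lookup pays the file-open cost
                with open(self.path, 'r', encoding='utf-8') as f:
                    self._table = json.load(f)
            return self._table[name]

    # Hypothetical use: help() would consult the table only when asked,
    # so a normal import never opens the docstring file at all.
    docs = LazyDocs('mymodule.pyc_docstrings')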
From p.f.moore at gmail.com Wed Apr 11 04:55:13 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 11 Apr 2018 09:55:13 +0100
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References:
Message-ID:

On 11 April 2018 at 06:32, Chris Angelico wrote:
> When a class scope is involved, a naive transformation into a function would
> prevent name lookups (as the function would behave like a method).
>
> class X:
>     names = ["Fred", "Barney", "Joe"]
>     prefix = "> "
>     prefixed_names = [prefix + name for name in names]
>
> With Python 3.7 semantics, this will evaluate the outermost iterable at class
> scope, which will succeed; but it will evaluate everything else in a function::
>
> class X:
>     names = ["Fred", "Barney", "Joe"]
>     prefix = "> "
>     def <listcomp>(iterator):
>         result = []
>         for name in iterator:
>             result.append(prefix + name)
>         return result
>     prefixed_names = <listcomp>(iter(names))
>
> The name ``prefix`` is thus searched for at global scope, ignoring the class
> name. Under the proposed semantics, this name will be eagerly bound, being
> approximately equivalent to::
>
> class X:
>     names = ["Fred", "Barney", "Joe"]
>     prefix = "> "
>     def <listcomp>(prefix=prefix):
>         result = []
>         for name in names:
>             result.append(prefix + name)
>         return result
>     prefixed_names = <listcomp>()

Surely "names" would also be eagerly bound, for use in the "for" loop?

[...]

> This could be used to create ugly code!
> ---------------------------------------
>
> So can anything else. This is a tool, and it is up to the programmer to use it
> where it makes sense, and not use it where superior constructs can be used.

Related objection - when used to name subexpressions in a
comprehension (one of the original motivating use cases for this
proposal), this introduces an asymmetry which actually makes the
comprehension harder to read. As a result, it's quite possible that
people won't want to use assignment expressions in this case, and the
use case of precalculating expensive but multiply used results in
comprehensions will remain unanswered.

I think the response here is basically the same as the above - if you
don't like them, don't use them. But I do think the additional nuance
of "we might not have solved the original motivating use case" is
worth a specific response.

Overall, I like this much better than the previous proposal. I'm now
+1 on the semantic changes to comprehensions, and barely +0 on the
assignment expression itself (I still don't think assignment
expressions are worth it, and I worry about the confusion they may
cause for beginners in particular).

Paul

From rosuav at gmail.com Wed Apr 11 05:30:47 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 11 Apr 2018 19:30:47 +1000
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References:
Message-ID:

On Wed, Apr 11, 2018 at 6:55 PM, Paul Moore wrote:
> On 11 April 2018 at 06:32, Chris Angelico wrote:
>> The name ``prefix`` is thus searched for at global scope, ignoring the class
>> name. Under the proposed semantics, this name will be eagerly bound, being
>> approximately equivalent to::
>>
>> class X:
>>     names = ["Fred", "Barney", "Joe"]
>>     prefix = "> "
>>     def <listcomp>(prefix=prefix):
>>         result = []
>>         for name in names:
>>             result.append(prefix + name)
>>         return result
>>     prefixed_names = <listcomp>()
>
> Surely "names" would also be eagerly bound, for use in the "for" loop?

Yep, exactly. Have corrected the example, thanks.

>> This could be used to create ugly code!
>> ---------------------------------------
>>
>> So can anything else. This is a tool, and it is up to the programmer to use it
>> where it makes sense, and not use it where superior constructs can be used.
>
> Related objection - when used to name subexpressions in a
> comprehension (one of the original motivating use cases for this
> proposal), this introduces an asymmetry which actually makes the
> comprehension harder to read. As a result, it's quite possible that
> people won't want to use assignment expressions in this case, and the
> use case of precalculating expensive but multiply used results in
> comprehensions will remain unanswered.
>
> I think the response here is basically the same as the above - if you
> don't like them, don't use them. But I do think the additional nuance
> of "we might not have solved the original motivating use case" is
> worth a specific response.

The PEP has kinda pivoted a bit since its inception, so I'm honestly
not sure what "original motivating use case" matters. :D I'm just
lumping all the use-cases together at the same priority now.

> Overall, I like this much better than the previous proposal. I'm now
> +1 on the semantic changes to comprehensions, and barely +0 on the
> assignment expression itself (I still don't think assignment
> expressions are worth it, and I worry about the confusion they may
> cause for beginners in particular).

Now that they have the same semantics as any other form of assignment,
they're a bit less useful in some cases, a bit more useful in others,
and a lot easier to explain. The most confusing part, honestly, is "why
do we have two ways to do assignment", which is why that is specifically
answered in the PEP.

ChrisA

From p.f.moore at gmail.com Wed Apr 11 05:46:02 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 11 Apr 2018 10:46:02 +0100
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References:
Message-ID:

On 11 April 2018 at 10:30, Chris Angelico wrote:
> The PEP has kinda pivoted a bit since its inception, so I'm honestly
> not sure what "original motivating use case" matters. :D I'm just
> lumping all the use-cases together at the same priority now.

Fair point, and reading this PEP in isolation the comprehension use
case really isn't that prominent. So yes, I guess you're right.

Paul

From clint.hepner at gmail.com Wed Apr 11 08:23:57 2018
From: clint.hepner at gmail.com (Clint Hepner)
Date: Wed, 11 Apr 2018 08:23:57 -0400
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References:
Message-ID: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com>

> On 2018 Apr 11, at 1:32 AM, Chris Angelico wrote:
>
> Wholesale changes since the previous version. Statement-local name
> bindings have been dropped (I'm still keeping the idea in the back of
> my head; this PEP wasn't the first time I'd raised the concept), and
> we're now focusing primarily on assignment expressions, but also with
> consequent changes to comprehensions.

Overall, I'm slightly negative on this. I think named expressions will
be a good thing to have, but not in this form.

I'll say up front that, being fully aware of the issues surrounding the
introduction of a new keyword, something like a let expression in
Haskell would be more readable than embedded assignments in most cases.
In the end, I suspect my `let` proposal is a nonstarter and just useful
to list with the rest of the rejected alternatives, but I wanted to
include it anyway.

>
> Abstract
> ========
> [...]
>
>
> Rationale
> =========
> [...]
>
> Syntax and semantics
> ====================
>
> In any context where arbitrary Python expressions can be used, a **named
> expression** can appear. This can be parenthesized for clarity, and is of
> the form ``(target := expr)`` where ``expr`` is any valid Python expression,
> and ``target`` is any valid assignment target.
>
> The value of such a named expression is the same as the incorporated
> expression, with the additional side-effect that the target is assigned
> that value.
>
> # Similar to the boolean 'or' but checking for None specifically
> x = "default" if (eggs := spam().ham) is None else eggs
>
> # Even complex expressions can be built up piece by piece
> y = ((eggs := spam()), (cheese := eggs.method()), cheese[eggs])

I find the assignments make it difficult to pick out what the final
expression looks like. The first isn't too bad, but it took me a moment
to figure out what y was. Quick: is it

* (a, b, c)
* (a, (b, c))
* ((a, b), c)
* something else

First I thought it was (a, b, c), then I thought it was actually
((a, b), c), before carefully counting the parentheses showed that I was
right the first time.

These would be clearer if you could remove the assignment from the
expression itself. Assuming "let" were available as a keyword,

    x = (let eggs = spam().ham
         in
         "default" if eggs is None else eggs)
    y = (let eggs = spam(),
             cheese = eggs.method()
         in
         (eggs, cheese, cheese[eggs]))

Allowing for differences in how best to format such an expression, the
final expression is clearly separate from its component assignment.
(More on this in the Alternative Spellings section below.)

>
> Differences from regular assignment statements
> ----------------------------------------------
>
> An assignment statement can assign to multiple targets::
>
> x = y = z = 0
>
> To do the same with assignment expressions, they must be parenthesized::
>
> assert 0 == (x := (y := (z := 0)))

There's no rationale given for why this must be parenthesized.
If := were right-associative,

    assert 0 == (x := y := z := 0)

would work fine. (With high enough precedence, the remaining parentheses
could be dropped, but one would probably keep them for clarity.)
I think you need to spell out its associativity and precedence in more
detail, and explain the rationale for the choice made.

>
> Augmented assignment is not supported in expression form::
>
>>>> x +:= 1
> File "<stdin>", line 1
> x +:= 1
> ^
> SyntaxError: invalid syntax

There's no reason given for why this is invalid. I assume it's a
combination of 1) Having both += and +:=/:+= would be redundant and
2) not wanting to add 11+ new operators to the language.

>
> Otherwise, the semantics of assignment are unchanged by this proposal.

> [List comprehensions deleted]

>
> Recommended use-cases
> =====================
>
> Simplifying list comprehensions
> -------------------------------
>
> These list comprehensions are all approximately equivalent::

[existing alternatives redacted]

> # Using a temporary name
> stuff = [[y := f(x), x/y] for x in range(5)]

Again, this would be clearer if the assignment were separated from the
expression where it would be used.
stuff = [let y = f(x) in [y, x/y] for x in range(5)] > > Capturing condition values > -------------------------- > > Assignment expressions can be used to good effect in the header of > an ``if`` or ``while`` statement:: > > # Current Python, not caring about function return value > while input("> ") != "quit": > print("You entered a command.") > > # Current Python, capturing return value - four-line loop header > while True: > command = input("> "); > if command == "quit": > break > print("You entered:", command) > > # Proposed alternative to the above > while (command := input("> ")) != "quit": > print("You entered:", command) > > # Capturing regular expression match objects > # See, for instance, Lib/pydoc.py, which uses a multiline spelling > # of this effect > if match := re.search(pat, text): > print("Found:", match.group(0)) > > # Reading socket data until an empty string is returned > while data := sock.read(): > print("Received data:", data) > > Particularly with the ``while`` loop, this can remove the need to have an > infinite loop, an assignment, and a condition. It also creates a smooth > parallel between a loop which simply uses a function call as its condition, > and one which uses that as its condition but also uses the actual value. These are the most compelling examples so far, doing the most to push me towards a +1. In particular, my `let` expression is too verbose here: while let data = sock.read() in data: print("Received data:", data) I have an idea in the back of my head about `NAME := FOO` being syntactic sugar for `let NAME = FOO in FOO`, but it's not well thought out. > > > Rejected alternative proposals > ============================== > > Proposals broadly similar to this one have come up frequently on python-ideas. > Below are a number of alternative syntaxes, some of them specific to > comprehensions, which have been rejected in favour of the one given above. > > > Alternative spellings > --------------------- > > Broadly the same semantics as the current proposal, but spelled differently. > > 1. ``EXPR as NAME``, with or without parentheses:: > > stuff = [[f(x) as y, x/y] for x in range(5)] > > Omitting the parentheses in this form of the proposal introduces many > syntactic ambiguities. Requiring them in all contexts leaves open the > option to make them optional in specific situations where the syntax is > unambiguous (cf generator expressions as sole parameters in function > calls), but there is no plausible way to make them optional everywhere. > > With the parentheses, this becomes a viable option, with its own tradeoffs > in syntactic ambiguity. Since ``EXPR as NAME`` already has meaning in > ``except`` and ``with`` statements (with different semantics), this would > create unnecessary confusion or require special-casing. > > 2. Adorning statement-local names with a leading dot:: > > stuff = [[(f(x) as .y), x/.y] for x in range(5)] # with "as" > stuff = [[(.y := f(x)), x/.y] for x in range(5)] # with ":=" > > This has the advantage that leaked usage can be readily detected, removing > some forms of syntactic ambiguity. However, this would be the only place > in Python where a variable's scope is encoded into its name, making > refactoring harder. This syntax is quite viable, and could be promoted to > become the current recommendation if its advantages are found to outweigh > its cost. > > 3. 
Adding a ``where:`` to any statement to create local name bindings::
>
> value = x**2 + 2*x where:
>     x = spam(1, 4, 7, q)
>
> Execution order is inverted (the indented body is performed first, followed
> by the "header"). This requires a new keyword, unless an existing keyword
> is repurposed (most likely ``with:``). See PEP 3150 for prior discussion
> on this subject (with the proposed keyword being ``given:``).

4. Adding a ``let`` expression to create local bindings::

    value = let x = spam(1, 4, 7, q) in x**2 + 2*x

5. Adding a ``where`` expression to create local bindings::

    value = x**2 + 2*x where x = spam(1, 4, 7, q)

Both have the extra-keyword problem. Multiple bindings are a little
harder to add than they would be with the ``where:`` modifier, although
a few extra parentheses and judicious line breaks make it not so bad to
allow a comma-separated list, as shown in my first example at the top of
this reply.

>
>
> Special-casing conditional statements
> -------------------------------------
>
> One of the most popular use-cases is ``if`` and ``while`` statements. Instead
> of a more general solution, this proposal enhances the syntax of these two
> statements to add a means of capturing the compared value::
>
> if re.search(pat, text) as match:
>     print("Found:", match.group(0))
>
> This works beautifully if and ONLY if the desired condition is based on the
> truthiness of the captured value. It is thus effective for specific
> use-cases (regex matches, socket reads that return `''` when done), and
> completely useless in more complicated cases (eg where the condition is
> ``f(x) < 0`` and you want to capture the value of ``f(x)``). It also has
> no benefit to list comprehensions.
>
> Advantages: No syntactic ambiguities. Disadvantages: Answers only a fraction
> of possible use-cases, even in ``if``/``while`` statements.
>
>
> Special-casing comprehensions
> -----------------------------
>
> Another common use-case is comprehensions (list/set/dict, and genexps). As
> above, proposals have been made for comprehension-specific solutions.
>
> 1. ``where``, ``let``, or ``given``::
>
> stuff = [(y, x/y) where y = f(x) for x in range(5)]
> stuff = [(y, x/y) let y = f(x) for x in range(5)]
> stuff = [(y, x/y) given y = f(x) for x in range(5)]
>
> This brings the subexpression to a location in between the 'for' loop and
> the expression. It introduces an additional language keyword, which creates
> conflicts. Of the three, ``where`` reads the most cleanly, but also has the
> greatest potential for conflict (eg SQLAlchemy and numpy have ``where``
> methods, as does ``tkinter.dnd.Icon`` in the standard library).
>
> 2. ``with NAME = EXPR``::
>
> stuff = [(y, x/y) with y = f(x) for x in range(5)]
>
> As above, but reusing the `with` keyword. Doesn't read too badly, and needs
> no additional language keyword. Is restricted to comprehensions, though,
> and cannot as easily be transformed into "longhand" for-loop syntax. Has
> the C problem that an equals sign in an expression can now create a name
> binding, rather than performing a comparison. Would raise the question of
> why "with NAME = EXPR:" cannot be used as a statement on its own.
>
> 3. ``with EXPR as NAME``::
>
> stuff = [(y, x/y) with f(x) as y for x in range(5)]
>
> As per option 2, but using ``as`` rather than an equals sign.
Aligns > syntactically with other uses of ``as`` for name binding, but a simple > transformation to for-loop longhand would create drastically different > semantics; the meaning of ``with`` inside a comprehension would be > completely different from the meaning as a stand-alone statement, while > retaining identical syntax. > > Regardless of the spelling chosen, this introduces a stark difference between > comprehensions and the equivalent unrolled long-hand form of the loop. It is > no longer possible to unwrap the loop into statement form without reworking > any name bindings. The only keyword that can be repurposed to this task is > ``with``, thus giving it sneakily different semantics in a comprehension than > in a statement; alternatively, a new keyword is needed, with all the costs > therein. 4. `` let NAME = EXPR1 in EXPR2``:: stuff = [let y = f(x) in (y, x/y) for x in range(5)] I don't have anything new to say about this. It has the same keyword objections as similar proposals, and I think I've addressed the use case elsewhere. > > Frequently Raised Objections > ============================ > > Why not just turn existing assignment into an expression? > --------------------------------------------------------- > > C and its derivatives define the ``=`` operator as an expression, rather than > a statement as is Python's way. This allows assignments in more contexts, > including contexts where comparisons are more common. The syntactic similarity > between ``if (x == y)`` and ``if (x = y)`` belies their drastically different > semantics. Thus this proposal uses ``:=`` to clarify the distinction. > > > With assignment expressions, why bother with assignment statements? > ------------------------------------------------------------------- > > The two forms have different flexibilities. The ``:=`` operator can be used > inside a larger expression; the ``=`` operator can be chained more > conveniently, and closely parallels the inline operations ``+=`` and friends. > The assignment statement is a clear declaration of intent: this value is to > be assigned to this target, and that's it. I don't find this convincing. I don't really see chained assignments often enough to worry about how they are written, plus note my earlier question about the precedence and associativity of :=. The fact is, `x := 5` as an expression statement appears equivalent to the assignment statement `x = 5`, so I suspect people will start using it as such no matter how strongly you suggest they shouldn't. -- Clint From kirillbalunov at gmail.com Wed Apr 11 09:03:47 2018 From: kirillbalunov at gmail.com (Kirill Balunov) Date: Wed, 11 Apr 2018 16:03:47 +0300 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: Message-ID: Great work Chris! Thank you! I do not know whether this is good or bad, but this PEP considers so many different topics, although closely interrelated with each other. 2018-04-11 8:32 GMT+03:00 Chris Angelico : > > Alterations to comprehensions > ----------------------------- > > The current behaviour of list/set/dict comprehensions and generator > expressions has some edge cases that would behave strangely if an > assignment > expression were to be used. Therefore the proposed semantics are changed, > removing the current edge cases, and instead altering their behaviour > *only* > in a class scope. 
>
> As of Python 3.7, the outermost iterable of any comprehension is evaluated
> in the surrounding context, and then passed as an argument to the implicit
> function that evaluates the comprehension.
>
> Under this proposal, the entire body of the comprehension is evaluated in
> its implicit function. Names not assigned to within the comprehension are
> located in the surrounding scopes, as with normal lookups. As one special
> case, a comprehension at class scope will **eagerly bind** any name which
> is already defined in the class scope.
>

I think this change is an important one, no matter what the future of
the current PEP will be. And since it breaks backward compatibility it
deserves a separate PEP.

> Open questions
> ==============
>
> Can the outermost iterable still be evaluated early?
> ----------------------------------------------------
>
> As of Python 3.7, the outermost iterable in a genexp is evaluated early,
> and the result passed to the implicit function as an argument. With PEP
> 572, this would no longer be the case. Can we still, somehow, evaluate it
> before moving on? One possible implementation would be::
>
> gen = (x for x in range(10))
> # translates to
> def <genexp>():
>     iterable = iter(range(10))
>     yield None
>     for x in iterable:
>         yield x
> gen = <genexp>()
> next(gen)
>
> This would pump the iterable up to just before the loop starts, evaluating
> exactly as much as is evaluated outside the generator function in Py3.7.
> This would result in it being possible to call ``gen.send()`` immediately,
> unlike with most generators, and may incur unnecessary overhead in the
> common case where the iterable is pumped immediately (perhaps as part of a
> larger expression).
>

Previously, there was an alternative _operator form_ `->` proposed by
Steven D'Aprano. Is this option no longer considered? I see several
advantages with this variant:
1. It does not use the `:` symbol, which is already very visually
overloaded in Python.
2. It is clearly distinguishable from the usual assignment statement and
its `+=` friends.
There are others, but they are minor.

> Frequently Raised Objections
> ============================
>
> Why not just turn existing assignment into an expression?
> ---------------------------------------------------------
>
> C and its derivatives define the ``=`` operator as an expression, rather
> than a statement as is Python's way. This allows assignments in more
> contexts, including contexts where comparisons are more common. The
> syntactic similarity between ``if (x == y)`` and ``if (x = y)`` belies
> their drastically different semantics. Thus this proposal uses ``:=`` to
> clarify the distinction.
>
>
> This could be used to create ugly code!
> ---------------------------------------
>
> So can anything else. This is a tool, and it is up to the programmer to
> use it where it makes sense, and not use it where superior constructs can
> be used.
>

But the ugly code matters, especially when it comes to Python. For me, the
ideal option would be the combination of two rejected parts:

> Special-casing conditional statements
> -------------------------------------
>
> One of the most popular use-cases is ``if`` and ``while`` statements.
> Instead of a more general solution, this proposal enhances the syntax of
> these two statements to add a means of capturing the compared value::
>
> if re.search(pat, text) as match:
>     print("Found:", match.group(0))
>
> This works beautifully if and ONLY if the desired condition is based on the
> truthiness of the captured value.
It is thus effective for specific > use-cases (regex matches, socket reads that return `''` when done), and > completely useless in more complicated cases (eg where the condition is > ``f(x) < 0`` and you want to capture the value of ``f(x)``). It also has > no benefit to list comprehensions. > > Advantages: No syntactic ambiguities. Disadvantages: Answers only a > fraction > of possible use-cases, even in ``if``/``while`` statements. > (+ in `while`) combined with this part: 3. ``with EXPR as NAME``:: > > stuff = [(y, x/y) with f(x) as y for x in range(5)] > > As per option 2, but using ``as`` rather than an equals sign. Aligns > syntactically with other uses of ``as`` for name binding, but a simple > transformation to for-loop longhand would create drastically different > semantics; the meaning of ``with`` inside a comprehension would be > completely different from the meaning as a stand-alone statement, while > retaining identical syntax. > I see no benefit to have the assignment expression in other places. And all your provided examples use `while` or `if` or some form of comprehension. I also see no problem with `if (re.search(pat, text) as match) is not None:..`. What is the point of overloading language with expression that will be used only in `while` and `if` and will be rejected by style checkers in other places? With kind regards, -gdg -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Wed Apr 11 09:17:08 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 11 Apr 2018 14:17:08 +0100 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com> References: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com> Message-ID: On 11 April 2018 at 13:23, Clint Hepner wrote: >> # Even complex expressions can be built up piece by piece >> y = ((eggs := spam()), (cheese := eggs.method()), cheese[eggs]) > I find the assignments make it difficult to pick out what the final expression looks like. > The first isn't too bad, but it took me a moment to figure out what y was. Quick: is it > > * (a, b, c) > * (a, (b, c)) > * ((a, b), c) > * something else > > First I though it was (a, b, c), then I thought it was actually ((a, b), c), before > carefully counting the parentheses showed that I was right the first time. This is a reasonable concern, IMO. But it comes solidly under the frequently raised objection "This could be used to create ugly code!". Writing it as y = ( (eggs := spam()), (cheese := eggs.method()), cheese[eggs] ) makes it obvious what the structure is. Paul From rosuav at gmail.com Wed Apr 11 09:25:24 2018 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 11 Apr 2018 23:25:24 +1000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com> References: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com> Message-ID: On Wed, Apr 11, 2018 at 10:23 PM, Clint Hepner wrote: >> On 2018 Apr 11 , at 1:32 a, Chris Angelico wrote: >> # Similar to the boolean 'or' but checking for None specifically >> x = "default" if (eggs := spam().ham) is None else eggs > >> >> # Even complex expressions can be built up piece by piece >> y = ((eggs := spam()), (cheese := eggs.method()), cheese[eggs]) > > These would be clearer if you could remove the assignment from the expression itself. 
> Assuming "let" were available as a keyword, > > x = (let eggs = spam().ham > in > "default" if eggs is None else eggs) > y = (let eggs = spam(), > cheese = eggs.method() > in > (eggs, cheese, cheese[eggs])) > > Allowing for differences in how best to format such an expression, the final > expression is clearly separate from its component assignment. (More on this > in the Alternative Spellings section below.) I have no idea what the "in" keyword is doing here, but somehow it isn't being used for the meaning it currently has in Python. Does your alternative require not one but *two* new keywords? >> Differences from regular assignment statements >> ---------------------------------------------- >> >> An assignment statement can assign to multiple targets:: >> >> x = y = z = 0 >> >> To do the same with assignment expressions, they must be parenthesized:: >> >> assert 0 == (x := (y := (z := 0))) > > There's no rationale given for why this must be parenthesized. > If := were right-associative, > > assert 0 == (x := y := z := 0) > > would work fine. (With high enough precedence, the remaining parentheses > could be dropped, but one would probably keep them for clarity.) > I think you need to spell out its associativity and precedence in more detail, > and explain why the rationale for the choice made. It's partly because of other confusing possibilities, such as its use inside, or capturing, a lambda function. I'm okay with certain forms requiring parens. >> Augmented assignment is not supported in expression form:: >> >>>>> x +:= 1 >> File "", line 1 >> x +:= 1 >> ^ >> SyntaxError: invalid syntax > > There's no reason give for why this is invalid. I assume it's a combination > of 1) Having both += and +:=/:+= would be redundant and 2) not wanting > to add 11+ new operators to the language. And 3) there's no point. Can you give an example of where you would want an expression form of augmented assignment? > 4. Adding a ``let`` expression to create local bindings > > value = let x = spam(1, 4, 7, q) in x**2 + 2*x > > 5. Adding a ``where`` expression to create local bindings: > > value = x**2 + 2*x where x = spam(1, 4, 7, q) > > Both have the extra-keyword problem. Multiple bindings are little harder > to add than they would be with the ``where:`` modifier, although > a few extra parentheses and judicious line breaks make it not so bad to > allow a comma-separated list, as shown in my first example at the top of > this reply. Both also have the problem of "exactly how local ARE these bindings?", and the 'let' example either requires two new keywords, or requires repurposing 'in' to mean something completely different from its usual 'item in collection' boolean check. The 'where' example is broadly similar to rejected alternative 3, except that you're removing the colon and the suite, which means you can't create more than one variable without figuring some way to parenthesize. Let's suppose this were defined as: EXPR where NAME = EXPR as a five-component sequence. If you were to write this twice EXPR where NAME = EXPR where OTHERNAME = EXPR then it could just as logically be defined as "EXPR where NAME = (EXPR where OTHERNAME = EXPR)" as the other way. And even if it were to work as "(EXPR where NAME = EXPR) where OTHERNAME = EXPR", that still has the highly confusing semantics of being evaluated right-to-left. (Before you ask: no, you can't define it as "EXPR where NAME = EXPR , NAME = EXPR", because that would require looking a long way forward.) 
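For comparison, the binding behaviour being debated here can be imitated in current Python by applying a lambda immediately; a rough sketch only (f and spam are stand-ins for the thread's placeholder functions):

    def f(x):  # stand-in for the f(x) used in the examples above
        return x + 1

    # let y = f(x) in (y, x/y), via an immediately-applied lambda:
    stuff = [(lambda y: (y, x / y))(f(x)) for x in range(5)]

    class _Spam:  # stand-in object with a .ham attribute
        ham = None

    def spam():
        return _Spam()

    # let eggs = spam().ham in ("default" if eggs is None else eggs):
    x = (lambda eggs: "default" if eggs is None else eggs)(spam().ham)

This captures the semantics (a new scope whose name does not leak) but, of course, settles nothing about readability either way.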
>> Special-casing conditional statements >> ------------------------------------- >> > 4. `` let NAME = EXPR1 in EXPR2``:: > > stuff = [let y = f(x) in (y, x/y) for x in range(5)] > > I don't have anything new to say about this. It has the same keyword > objections as similar proposals, and I think I've addressed the use case > elsewhere. This section is specifically about proposals that ONLY solve this problem within list comprehensions. I don't think there's any point mentioning your proposal there, as the "let NAME = EXPR in EXPR" notation has nothing particularly to do with comprehensions. >> With assignment expressions, why bother with assignment statements? >> ------------------------------------------------------------------- >> >> The two forms have different flexibilities. The ``:=`` operator can be used >> inside a larger expression; the ``=`` operator can be chained more >> conveniently, and closely parallels the inline operations ``+=`` and friends. >> The assignment statement is a clear declaration of intent: this value is to >> be assigned to this target, and that's it. > > I don't find this convincing. I don't really see chained assignments often enough > to worry about how they are written, plus note my earlier question about the > precedence and associativity of :=. If you don't use them, why would you care either way? :) > The fact is, `x := 5` as an expression statement appears equivalent to the > assignment statement `x = 5`, so I suspect people will start using it as such > no matter how strongly you suggest they shouldn't. Probably. But when they run into problems, the solution will be "use an assignment statement, don't abuse the assignment expression". If you want to, you can write this: np = __import__("numpy") But it's much better to use the import statement, and people will rightly ask why you're using that expression form. Your "let... in" syntax is kinda interesting, but has a number of problems. Is that the exact syntax used in Haskell, and if so, does Haskell use 'in' to mean anything else in other contexts? ChrisA From p.f.moore at gmail.com Wed Apr 11 09:37:56 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 11 Apr 2018 14:37:56 +0100 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com> Message-ID: On 11 April 2018 at 14:25, Chris Angelico wrote: > On Wed, Apr 11, 2018 at 10:23 PM, Clint Hepner wrote: >>> Differences from regular assignment statements >>> ---------------------------------------------- >>> >>> An assignment statement can assign to multiple targets:: >>> >>> x = y = z = 0 >>> >>> To do the same with assignment expressions, they must be parenthesized:: >>> >>> assert 0 == (x := (y := (z := 0))) >> >> There's no rationale given for why this must be parenthesized. >> If := were right-associative, >> >> assert 0 == (x := y := z := 0) >> >> would work fine. (With high enough precedence, the remaining parentheses >> could be dropped, but one would probably keep them for clarity.) >> I think you need to spell out its associativity and precedence in more detail, >> and explain why the rationale for the choice made. > > It's partly because of other confusing possibilities, such as its use > inside, or capturing, a lambda function. I'm okay with certain forms > requiring parens. The only possible reading of x := y := z := 0 is as x := (y := (z := 0)) because an assignment expression isn't allowed on the LHS of :=. So requiring parentheses is unnecessary. 
In the case of an assignment statement, "assignment to multiple targets" is a special case, because assignment is a statement not an expression. But with assignment *expressions*, a := b := 0 is simply assigning the result of the expression b := 0 (which is 0) to a. No need for a special case - so enforced parentheses would *be* the special case. And you can't really argue that they are needed "for clarity" at the same time as having your comments about how "being able to write ugly code" isn't a valid objection :-) Paul From erik.m.bray at gmail.com Wed Apr 11 09:39:28 2018 From: erik.m.bray at gmail.com (Erik Bray) Date: Wed, 11 Apr 2018 15:39:28 +0200 Subject: [Python-ideas] Move optional data out of pyc files In-Reply-To: <3D3ABFD0-5785-4F99-92A3-D526DCE5BD88@trueblade.com> References: <20180410182427.03ad0043@fsol> <3D3ABFD0-5785-4F99-92A3-D526DCE5BD88@trueblade.com> Message-ID: On Tue, Apr 10, 2018 at 9:50 PM, Eric V. Smith wrote: > >>> 3. Annotations. They are used mainly by third party tools that >>> statically analyze sources. They are rarely used at runtime. >> >> Even less used than docstrings probably. > > typing.NamedTuple and dataclasses use annotations at runtime. Astropy uses annotations at runtime for optional unit checking on arguments that take dimensionful quantities: http://docs.astropy.org/en/stable/api/astropy.units.quantity_input.html#astropy.units.quantity_input From rosuav at gmail.com Wed Apr 11 09:50:44 2018 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 11 Apr 2018 23:50:44 +1000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: Message-ID: On Wed, Apr 11, 2018 at 11:03 PM, Kirill Balunov wrote: > Great work Chris! Thank you! > > I do not know whether this is good or bad, but this PEP considers so many > different topics, although closely interrelated with each other. > > 2018-04-11 8:32 GMT+03:00 Chris Angelico : >> >> >> Alterations to comprehensions >> ----------------------------- >> >> The current behaviour of list/set/dict comprehensions and generator >> expressions has some edge cases that would behave strangely if an >> assignment >> expression were to be used. Therefore the proposed semantics are changed, >> removing the current edge cases, and instead altering their behaviour >> *only* >> in a class scope. >> >> As of Python 3.7, the outermost iterable of any comprehension is evaluated >> in the surrounding context, and then passed as an argument to the implicit >> function that evaluates the comprehension. >> >> Under this proposal, the entire body of the comprehension is evaluated in >> its implicit function. Names not assigned to within the comprehension are >> located in the surrounding scopes, as with normal lookups. As one special >> case, a comprehension at class scope will **eagerly bind** any name which >> is already defined in the class scope. >> > > I think this change is important one no matter what will be the future of > the current PEP. And since it breaks backward compatibility it deserves a > separate PEP. Well, it was Guido himself who started the sub-thread about classes and comprehensions :) To be honest, the changes to comprehensions are mostly going to be under-the-hood tweaks. 
The only way you'll ever actually witness the changes are if you: 1) Use assignment expressions inside comprehensions (ie using both halves of this PEP); or 2) Put comprehensions at class scope (not inside methods, but actually at class scope), referring to other names from class scope, in places other than in the outermost iterable 3) Use 'yield' expressions in the outermost iterable of a list comprehension inside a generator function 4) Create a generator expression that refers to an external name, then change what that name is bound to before pumping the generator; depending on the one open question, this may occur ONLY if this external name is located at class scope. 5) Use generator expressions without iterating over them, in situations where iterating might fail (again, depends on the one open question). Aside from the first possibility, these are extremely narrow edge and corner cases, and the new behaviour is generally the more intuitive anyway. Class scope stops being so incredibly magical that it's completely ignored, and now becomes mildly magical such that name lookups are resolved eagerly instead of lazily; and the outermost iterable stops being magical in that it defies the weirdness of class scope and the precise definitions of generator functions. Special cases are being removed, not added. >> Open questions >> ============== >> >> Can the outermost iterable still be evaluated early? >> ---------------------------------------------------- >> > > Previously, there was an alternative _operator form_ `->` proposed by > Steven D'Aprano. This option is no longer considered? I see several > advantages with this variant: > 1. It does not use `:` symbol which is very visually overloaded in Python. > 2. It is clearly distinguishable from the usual assignment statement and > it's `+=` friends > There are others but they are minor. I'm not sure why you posted this in response to the open question, but whatever. The arrow operator is already a token in Python (due to its use in 'def' statements) and should not conflict with anything; however, apart from "it looks different", it doesn't have much to speak for it. The arrow faces the other way in languages like Haskell, but we can't use "<-" in Python due to conflicts with "<" and "-" as independent operators. >> This could be used to create ugly code! >> --------------------------------------- >> >> So can anything else. This is a tool, and it is up to the programmer to >> use it >> where it makes sense, and not use it where superior constructs can be >> used. >> > > But the ugly code matters, especially when it comes to Python. For me, the > ideal option would be the combination of two rejected parts: > > (+ in `while`) combined with this part: > > >> 3. ``with EXPR as NAME``:: >> >> stuff = [(y, x/y) with f(x) as y for x in range(5)] >> >> As per option 2, but using ``as`` rather than an equals sign. Aligns >> syntactically with other uses of ``as`` for name binding, but a simple >> transformation to for-loop longhand would create drastically different >> semantics; the meaning of ``with`` inside a comprehension would be >> completely different from the meaning as a stand-alone statement, while >> retaining identical syntax. > > > I see no benefit to have the assignment expression in other places. And all > your provided examples use `while` or `if` or some form of comprehension. I > also see no problem with `if (re.search(pat, text) as match) is not > None:..`. 
What is the point of overloading language with expression that > will be used only in `while` and `if` and will be rejected by style checkers > in other places? Can you give an example of how your syntax is superior to the more general option of simply allowing "as" bindings in any location? ChrisA From rosuav at gmail.com Wed Apr 11 09:54:40 2018 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 11 Apr 2018 23:54:40 +1000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com> Message-ID: On Wed, Apr 11, 2018 at 11:37 PM, Paul Moore wrote: > On 11 April 2018 at 14:25, Chris Angelico wrote: >> On Wed, Apr 11, 2018 at 10:23 PM, Clint Hepner wrote: >>>> Differences from regular assignment statements >>>> ---------------------------------------------- >>>> >>>> An assignment statement can assign to multiple targets:: >>>> >>>> x = y = z = 0 >>>> >>>> To do the same with assignment expressions, they must be parenthesized:: >>>> >>>> assert 0 == (x := (y := (z := 0))) >>> >>> There's no rationale given for why this must be parenthesized. >>> If := were right-associative, >>> >>> assert 0 == (x := y := z := 0) >>> >>> would work fine. (With high enough precedence, the remaining parentheses >>> could be dropped, but one would probably keep them for clarity.) >>> I think you need to spell out its associativity and precedence in more detail, >>> and explain why the rationale for the choice made. >> >> It's partly because of other confusing possibilities, such as its use >> inside, or capturing, a lambda function. I'm okay with certain forms >> requiring parens. > > The only possible reading of > > x := y := z := 0 > > is as > > x := (y := (z := 0)) > > because an assignment expression isn't allowed on the LHS of :=. So > requiring parentheses is unnecessary. In the case of an assignment > statement, "assignment to multiple targets" is a special case, because > assignment is a statement not an expression. But with assignment > *expressions*, a := b := 0 is simply assigning the result of the > expression b := 0 (which is 0) to a. No need for a special case - so > enforced parentheses would *be* the special case. Sure, if you're just assigning zero to everything. But you could do that with a statement. What about this: q = { lambda: x := lambda y: z := a := 0, } Yes, it's an extreme example, but look at all those colons and tell me if you can figure out what each one is doing. ChrisA From rosuav at gmail.com Wed Apr 11 10:09:38 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 12 Apr 2018 00:09:38 +1000 Subject: [Python-ideas] Move optional data out of pyc files In-Reply-To: <20180411060642.GR16661@ando.pearwood.info> References: <20180411000335.GN16661@ando.pearwood.info> <20180411030205.GO16661@ando.pearwood.info> <20180411060642.GR16661@ando.pearwood.info> Message-ID: On Wed, Apr 11, 2018 at 4:06 PM, Steven D'Aprano wrote: > On Wed, Apr 11, 2018 at 02:21:17PM +1000, Chris Angelico wrote: > > [...] >> > Yes, it will double the number of files. Actually quadruple it, if the >> > annotations and line numbers are in separate files too. But if most of >> > those extra files never need to be opened, then there's no cost to them. >> > And whatever extra cost there is, is amortized over the lifetime of the >> > interpreter. >> >> Yes, if they are actually not needed. My question was about whether >> that is truly valid. 
> > We're never really going to know the effect on performance without > implementing and benchmarking the code. It might turn out that, to our > surprise, three quarters of the std lib relies on loading docstrings > during startup. But I doubt it. > > >> Consider a very common use-case: an OS-provided >> Python interpreter whose files are all owned by 'root'. Those will be >> distributed with .pyc files for performance, but you don't want to >> deprive the users of help() and anything else that needs docstrings >> etc. So... are the docstrings lazily loaded or eagerly loaded? > > What relevance is that they're owned by root? You have to predict in advance what you'll want to have in your pyc files. Can't create them on the fly. >> If eagerly, you've doubled the number of file-open calls to initialize >> the interpreter. > > I do not understand why you think this is even an option. Has Serhiy > said something that I missed that makes this seem to be on the table? > That's not a rhetorical question -- I may have missed something. But I'm > sure he understands that doubling or quadrupling the number of file > operations during startup is not an optimization. > > >> (Or quadrupled, if you need annotations and line >> numbers and they're all separate.) If lazily, things are a lot more >> complicated than the original description suggested, and there'd need >> to be some semantic changes here. > > What semantic change do you expect? > > There's an implementation change, of course, but that's Serhiy's problem > to deal with and I'm sure that he has considered that. There should be > no semantic change. When you access obj.__doc__, then and only then are > the compiled docstrings for that module read from the disk. In other words, attempting to access obj.__doc__ can actually go and open a file. Does it need to check if the file exists as part of the import, or does it go back to sys.path? If the former, you're right back with the eager loading problem of needing to do 2-4 times as many stat calls; if the latter, it's semantically different in that a change to sys.path can influence something that normally is preloaded. > As for the in-memory data structures of objects themselves, I imagine > something like the __doc__ and __annotation__ slots pointing to a table > of strings, which is not initialised until you attempt to read from the > table. Or something -- don't pay too much attention to my wild guesses. > > The bottom line is, is there some reason *aside from performance* to > avoid this? Because if the performance is worse, I'm sure Serhiy will be > the first to dump this idea. Obviously it could be turned into just a performance question, but in that case everything has to be preloaded, and I doubt there's going to be any advantage. To be absolutely certain of retaining the existing semantics, there'd need to be some sort of anchoring to ensure that *this* .pyc file goes with *that* .pyc_docstrings file. Looking them up anew will mean that there's every possibility that you get the wrong file back. As a simple example, upgrading your Python installation while you have a Python script running can give you this effect already. Just import a few modules, then change everything on disk. If you now import a module that was already imported, you get it from cache (and the unmodified version); import something that wasn't imported already, and it goes to the disk.
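To make the hazard concrete, here's a minimal sketch of the kind of lazy side-file lookup being discussed (every name in it is hypothetical - this is purely to illustrate the semantics, not a proposed implementation):

    import marshal

    class LazyDocTable:
        """Load a module's docstrings from a side file on first access."""
        def __init__(self, path):
            self._path = path
            self._table = None

        def get(self, index):
            if self._table is None:
                # The disk read happens here, on first access, NOT at
                # import time - so whatever is on disk *now* is what wins.
                with open(self._path, 'rb') as f:
                    self._table = marshal.load(f)
            return self._table[index]

If the side file gets replaced after the module was imported (say, by a package upgrade), get() cheerfully returns the new file's strings - the docstring version of the stale line number problem below.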
At the granularity of modules, this is seldom a problem (I can imagine some package modules getting confused by this, but otherwise not usually), but if docstrings are looked up separately - and especially if lnotab is too - you could happily import and use something (say, in a web server), then run updates, and then an exception requires you to look up a line number. Oops, a few lines got inserted into that file, and now all the line numbers are straight-up wrong. That's a definite behavioural change. Maybe it's one that's considered acceptable, but it definitely is a change. And if mutations to sys.path can do this, it's definitely a semantic change in Python. ChrisA From p.f.moore at gmail.com Wed Apr 11 10:11:02 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 11 Apr 2018 15:11:02 +0100 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com> Message-ID: On 11 April 2018 at 14:54, Chris Angelico wrote: > Sure, if you're just assigning zero to everything. But you could do > that with a statement. What about this: > > q = { > lambda: x := lambda y: z := a := 0, > } > > Yes, it's an extreme example, but look at all those colons and tell me > if you can figure out what each one is doing. lambda: x := (lambda y: (z := (a := 0))) As I say, it's the only *possible* parsing. It's ugly, and it absolutely should be parenthesised, but there's no need to make the parentheses mandatory. (And actually, it didn't take me long to add those parentheses, it's not *hard* to parse correctly - for a human). Paul From ncoghlan at gmail.com Wed Apr 11 10:23:48 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 12 Apr 2018 00:23:48 +1000 Subject: [Python-ideas] Add more information in the header of pyc files In-Reply-To: <20180410185456.1ced81cc@fsol> References: <20180410175838.52a693f6@fsol> <20180410185456.1ced81cc@fsol> Message-ID: On 11 April 2018 at 02:54, Antoine Pitrou wrote: > On Tue, 10 Apr 2018 19:29:18 +0300 > Serhiy Storchaka > wrote: >> >> A bugfix release can fix bugs in bytecode generation. See for example >> issue27286. [1] The part of issue33041 backported to 3.7 and 3.6 is an >> other example. [2] There were other examples of compatible changing the >> bytecode. Without bumping the magic number these fixes can just not have >> any effect if existing pyc files were generated by older compilers. But >> bumping the magic number in a bugfix release can lead to rebuilding >> every pyc file (even unaffected by the fix) in distributives. > > Sure, but I don't think rebuilding every pyc file is a significant > problem. It's certainly less error-prone than cherry-picking which > files need rebuilding. And we need to handle the old bytecode format in the eval loop anyway, or else we'd be breaking compatibility with bytecode-only files, as well as introducing a significant performance regression for non-writable bytecode caches (if we were to ignore them). It's a subtle enough problem that I think the `compileall --force` option is a safer way of handling it, even if it regenerates some pyc files that could have been kept. For the "stable file signature" aspect, does that need to be specifically the first *four* bytes? One of the benefits of PEP 552 leaving those four bytes alone is that it meant that a lot of magic number checking code didn't need to change. If the stable marker could be placed later (e.g. 
after the PEP 552 header), then we'd similarly have the benefit that code checking the PEP 552 headers wouldn't need to change, at the expense of folks having to read 20 bytes to see the new signature byte (which shouldn't be a problem, given that file defaults to reading up to 1 MiB from files it is trying to identify).

Cheers,
Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From rosuav at gmail.com  Wed Apr 11 10:28:23 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 12 Apr 2018 00:28:23 +1000
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To: 
References: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com>
Message-ID: 

On Thu, Apr 12, 2018 at 12:11 AM, Paul Moore wrote: > On 11 April 2018 at 14:54, Chris Angelico wrote: >> Sure, if you're just assigning zero to everything. But you could do >> that with a statement. What about this: >> >> q = { >> lambda: x := lambda y: z := a := 0, >> } >> >> Yes, it's an extreme example, but look at all those colons and tell me >> if you can figure out what each one is doing. > > lambda: x := (lambda y: (z := (a := 0))) > > As I say, it's the only *possible* parsing. It's ugly, and it > absolutely should be parenthesised, but there's no need to make the > parentheses mandatory. (And actually, it didn't take me long to add > those parentheses, it's not *hard* to parse correctly - for a human).

Did you pick up on the fact that this was actually in a set? With very small changes, such as misspelling "lambda" at the beginning, this actually becomes a dict display. How much of the expression do you need to see before you can be 100% sure of the parsing? Could you do this if fed tokens one at a time, with permission to look no more than one token ahead?

ChrisA

From peter.ed.oconnor at gmail.com  Wed Apr 11 10:37:09 2018
From: peter.ed.oconnor at gmail.com (Peter O'Connor)
Date: Wed, 11 Apr 2018 10:37:09 -0400
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To: 
References: <20180406011854.GU16661@ando.pearwood.info> <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com> <20180410173255.GM16661@ando.pearwood.info> <20180411034115.GP16661@ando.pearwood.info>
Message-ID: 

> It's worth adding a reminder here that "having more options on the > market" is pretty directly in contradiction to the Zen of Python - > "There should be one-- and preferably only one --obvious way to do > it".

I've got to start minding my words more. By "options on the market" I meant it more in a "candidates for the job" sense: in the end we'd select just one, which would in retrospect (or if you're Dutch) seem like the obvious choice. Not that "everyone who uses Python should have more ways to do this". My reason for starting this is that there isn't "one obvious way" to do this type of operation now (as the diversity of the exponential-moving-average "zoo" attests).

------

Let's look at a task where there is "one obvious way".

Suppose someone asks: "How can I build a list of squares of the first 100 odd numbers [1, 9, 25, 49, ....] in Python?"  The answer is now obvious - few people would do this:

    list_of_odd_squares = []
    for i in range(100):
        list_of_odd_squares.append((i*2+1)**2)

or this:

    def iter_odd_squares(n):
        for i in range(n):
            yield (i*2+1)**2

    list_of_odd_squares = list(iter_odd_squares(100))

Because it's just cleaner, more compact, more readable and "obvious" to do:

    list_of_odd_squares = [(i*2+1)**2 for i in range(100)]

Maybe I'm being presumptuous, but I think most Python users would agree.
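(For what it's worth, the three spellings really are interchangeable - assuming the snippets above have been run as written, this quick check passes:

    assert list_of_odd_squares == list(iter_odd_squares(100)) == [(i*2+1)**2 for i in range(100)]

so the only axis of comparison left is readability.)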
-------

Now let's switch our task to computing the exponential moving average of a list. This is a stand-in for a HUGE range of tasks that involve carrying some state-variable forward while producing values.

Some would do this:

    smooth_signal = []
    average = 0
    for x in signal:
        average = (1-decay)*average + decay*x
        smooth_signal.append(average)

Some would do this:

    def moving_average(signal, decay, initial=0):
        average = initial
        for x in signal:
            average = (1-decay)*average + decay*x
            yield average

    smooth_signal = list(moving_average(signal, decay=decay))

Lovers of one-liners like Serhiy would do this:

    smooth_signal = [average for average in [0] for x in signal for average in [(1-decay)*average + decay*x]]

Some would scoff at the cryptic one-liner and do this:

    def update_moving_average(avg, x, decay):
        return (1-decay)*avg + decay*x

    smooth_signal = list(itertools.accumulate(itertools.chain([0], signal), func=functools.partial(update_moving_average, decay=decay)))

And others would scoff at that and make a class, or use coroutines.

------

There've been many suggestions in this thread (all documented here: https://github.com/petered/peters_example_code/blob/master/peters_example_code/ways_to_skin_a_cat.py) and that's good, but it seems clear that people do not agree on an "obvious" way to do things.

I claim that if

    smooth_signal = [average := (1-decay)*average + decay*x for x in signal from average=0.]

were allowed, it would become the "obvious" way.

Chris Angelico's suggestions are close to this and have the benefit of requiring no new syntax in a PEP 572 world:

    smooth_signal = [(average := (1-decay)*average + decay*x) for average in [0] for x in signal]

or

    smooth_signal = [(average := (1-decay)*(average or 0) + decay*x) for x in signal]

or

    average = 0
    smooth_signal = [(average := (1-decay)*average + decay*x) for x in signal]

But they all have oddities that detract from their "obviousness", and the oddities stem from there not being a built-in way to initialize. In the first, there is the odd "for average in [0]" initializer. The second relies on a hidden "average = None" which is not obvious at all, and the third has the problem that the initial value is bound to the defining scope instead of belonging to the generator. All seem to have oddly redundant brackets whose purpose is not obvious, but maybe there's a good reason for that.

If people are happy with these solutions and still see no need for the initialization syntax, we can stop this, but as I see it there is a "hole" in the language that needs to be filled.

On Wed, Apr 11, 2018 at 3:55 AM, Paul Moore wrote: > On 11 April 2018 at 04:41, Steven D'Aprano wrote: > >> > But in a way that more intuitively expresses the intent of the code, > it > >> > would be great to have more options on the market. > >> > >> It's worth adding a reminder here that "having more options on the > >> market" is pretty directly in contradiction to the Zen of Python - > >> "There should be one-- and preferably only one --obvious way to do > >> it". > > > > I'm afraid I'm going to (mildly) object here. At least you didn't > > misquote the Zen as "Only One Way To Do It" :-) > > > > The Zen here is not a prohibition against there being multiple ways to > > do something -- how could it, given that Python is a general purpose > > programming language there is always going to be multiple ways to write > > any piece of code? Rather, it exhorts us to make sure that there are one > > or more ways to "do it", at least one of which is obvious.
> > I apologise if I came across as implying that I thought the Zen said > that having multiple ways was prohibited. I don't (and certainly the > Zen doesn't mean that). Rather, I was saying that using "it gives us > an additional way to do something" is a bad argument in favour of a > proposal for Python. At a minimum, the proposal needs to argue why the > new feature is "more obvious" than the existing ways (bonus points if > the proposer is Dutch - see the following Zen item ;-)), or why it > offers a capability that isn't possible with the existing language. > And I'm not even saying that the OP hasn't attempted to make such > arguments (even if I disagree with them). All I was pointing out was > that the comment "it would be great to have more options on the > market" implies a misunderstanding of the design goals of Python > (hence my "reminder" of the principle I think is relevant here). > > Sorry again if that's not what it sounded like. > Paul > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Wed Apr 11 10:41:04 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 11 Apr 2018 15:41:04 +0100 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com> Message-ID: On 11 April 2018 at 15:28, Chris Angelico wrote: > On Thu, Apr 12, 2018 at 12:11 AM, Paul Moore wrote: >> On 11 April 2018 at 14:54, Chris Angelico wrote: >>> Sure, if you're just assigning zero to everything. But you could do >>> that with a statement. What about this: >>> >>> q = { >>> lambda: x := lambda y: z := a := 0, >>> } >>> >>> Yes, it's an extreme example, but look at all those colons and tell me >>> if you can figure out what each one is doing. >> >> lambda: x := (lambda y: (z := (a := 0))) >> >> As I say, it's the only *possible* parsing. It's ugly, and it >> absolutely should be parenthesised, but there's no need to make the >> parentheses mandatory. (And actually, it didn't take me long to add >> those parentheses, it's not *hard* to parse correctly - for a human). > > Did you pick up on the fact that this was actually in a set? With very > small changes, such as misspelling "lambda" at the beginning, this > actually becomes a dict display. How much of the expression do you > need to see before you can be 100% sure of the parsing? Could you do > this if fed tokens one at a time, with permission to look no more than > one token ahead? Yes. It's not relevant to the parsing. It is relevant to the possibility of errors, as you point out. But once again, it's not the role of the PEP to prevent people writing bad code. Anyway, this is mostly nitpicking. I'm not trying to argue that this is good code, or robust code, or even code that I'd permit within a mile of one of my programs. All I'm trying to say is that *if* you want to state that the parentheses are mandatory in chained assignment expressions, then I think you need to justify it (and my suspicion is that you don't have a good justification other than "it prevents bad code" - which is already covered by the part of the PEP that points out that it's not the job of this PEP to prevent people writing bad code ;-)). 
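(As an aside, the statement form's "multiple targets" handling that I mentioned earlier is directly visible in the AST today - this is plain stdlib introspection, nothing PEP-specific, so take it as illustration rather than argument:

    import ast

    stmt = ast.parse("x = y = z = 0").body[0]
    print(ast.dump(stmt))
    # One Assign node with *three* targets, not three nested assignments:
    # Assign(targets=[Name(id='x', ...), Name(id='y', ...), Name(id='z', ...)], value=...)

whereas a chained := would, under the PEP's grammar, just be ordinary nested expressions.)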
Personally, I dislike the tendency with recent syntax proposals to mandate parentheses "to remove ambiguity". In my experience, all that happens is that I end up never knowing whether parentheses are required or not - and as a result end up with too many parentheses, making my code look ugly and encouraging cargo cult style "better add parens just in case" behaviour (as opposed to the reasonable rule "add parens for readability").

Paul

From ncoghlan at gmail.com  Wed Apr 11 10:42:24 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 12 Apr 2018 00:42:24 +1000
Subject: [Python-ideas] Move optional data out of pyc files
In-Reply-To: 
References: 
Message-ID: 

On 11 April 2018 at 02:14, Serhiy Storchaka wrote: > Currently pyc files contain data that is useful mostly for developing and is > not needed in most normal cases in a stable program. There is even an option > that allows to exclude a part of this information from pyc files. It is > expected that this saves memory, startup time, and disk space (or the time > of loading from network). I propose to move this data from pyc files into a > separate file or files. pyc files should contain only external references to > external files. If the corresponding external file is absent or a specific > option suppresses them, references are replaced with None or NULL at import > time, otherwise they are loaded from external files. > > 1. Docstrings. They are needed mainly for developing. > > 2. Line numbers (lnotab). They are helpful for formatting tracebacks, for > tracing, and debugging with the debugger. Sources are helpful in such cases > too. If the program doesn't contain errors ;-) and is shipped without > sources, they could be removed. > > 3. Annotations. They are used mainly by third party tools that statically > analyze sources. They are rarely used at runtime.

While I don't think the default inline pyc format should change, in my ideal world I'd like to see the optimized format change to a side-loading model where these things are still emitted, but they're placed in a separate metadata file that isn't loaded by default. The metadata file would then be lazily loaded at runtime, such that `-O` gave you the memory benefits of `-OO`, but docstrings/annotations/source line references/etc could still be loaded on demand if something actually needed them.

This approach would also mitigate the valid points Chris Angelico raises around hot reloading support - we could just declare that it requires even more care than usual to use hot reloading in combination with `-O`.

Bonus points if the sideloaded metadata file could be designed in such a way that an extension module compiler like Cython or an alternate pyc compiler frontend like Hylang could use it to provide relevant references back to the original source code (JavaScript's source maps may provide inspiration on that front).

Cheers,
Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From clint.hepner at gmail.com  Wed Apr 11 10:46:58 2018
From: clint.hepner at gmail.com (Clint Hepner)
Date: Wed, 11 Apr 2018 10:46:58 -0400
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To: 
References: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com>
Message-ID: <93B1C6E1-BD86-4628-8438-A553EBA9B0CD@gmail.com>

> On 2018 Apr 11 , at 9:25 a, Chris Angelico wrote: > > On Wed, Apr 11, 2018 at 10:23 PM, Clint Hepner wrote: >>> On 2018 Apr 11 , at 1:32 a, Chris Angelico wrote: >>> # Similar to the boolean 'or' but checking for None specifically >>> x = "default" if (eggs := spam().ham) is None else eggs >> >>> >>> # Even complex expressions can be built up piece by piece >>> y = ((eggs := spam()), (cheese := eggs.method()), cheese[eggs]) >> >> These would be clearer if you could remove the assignment from the expression itself.
>> Assuming "let" were available as a keyword,
>>
>> x = (let eggs = spam().ham
>>      in
>>      "default" if eggs is None else eggs)
>> y = (let eggs = spam(),
>>      cheese = eggs.method()
>>      in
>>      (eggs, cheese, cheese[eggs]))
>>
>> Allowing for differences in how best to format such an expression, the final >> expression is clearly separate from its component assignment. (More on this >> in the Alternative Spellings section below.) > > I have no idea what the "in" keyword is doing here, but somehow it > isn't being used for the meaning it currently has in Python. Does your > alternative require not one but *two* new keywords?

Just one; I don't see using ``in`` here to be any more of a problem than was reusing ``if``, ``then``, and ``else`` for conditional expressions. Its purpose is to separate the bindings from the expression they are used in; eggs and cheese are only valid in the expression that follows "in". Syntactically, this is pretty much identical to Haskell's "let" expression, but the idea of an expression with local bindings is much older. (https://en.wikipedia.org/wiki/Let_expression)

>>> Augmented assignment is not supported in expression form::
>>>
>>>     >>> x +:= 1
>>>     File "<stdin>", line 1
>>>     x +:= 1
>>>     ^
>>>     SyntaxError: invalid syntax
>>
>> There's no reason given for why this is invalid. I assume it's a combination >> of 1) Having both += and +:=/:+= would be redundant and 2) not wanting >> to add 11+ new operators to the language. > And 3) there's no point. Can you give an example of where you would > want an expression form of augmented assignment?

I wouldn't want one :). I'm just suggesting that the PEP include something to the effect of "We're not adding augmented assignment expressions because...".

>> 4. Adding a ``let`` expression to create local bindings
>>
>>     value = let x = spam(1, 4, 7, q) in x**2 + 2*x
>>
>> 5. Adding a ``where`` expression to create local bindings:
>>
>>     value = x**2 + 2*x where x = spam(1, 4, 7, q)
>>
>> Both have the extra-keyword problem. Multiple bindings are a little harder >> to add than they would be with the ``where:`` modifier, although >> a few extra parentheses and judicious line breaks make it not so bad to >> allow a comma-separated list, as shown in my first example at the top of >> this reply. > > Both also have the problem of "exactly how local ARE these bindings?", > and the 'let' example either requires two new keywords, or requires > repurposing 'in' to mean something completely different from its usual > 'item in collection' boolean check.
``in`` already has two different uses: as a Boolean operator (two, actually, with ``not in``) and as part of the various ``for`` constructs. IMO, I don't see adding this third meaning to be a problem.

With ``let``, the scope extends as far right of ``in`` as possible:

    let NAME = EXPR in let OTHERNAME = EXPR in EXPR

is equivalent to

    let NAME = EXPR in (let OTHERNAME = EXPR in EXPR)

> The 'where' example is broadly > similar to rejected alternative 3, except that you're removing the > colon and the suite, which means you can't create more than one > variable without figuring some way to parenthesize. I agree that `where` *should* be rejected; I don't really like them in Haskell, either. I only listed ``let`` here because I assume it *will* be rejected due to its requiring a new keyword, no matter how much I think adding a new keyword is warranted. >>> Special-casing conditional statements >>> ------------------------------------- >>> >> 4. `` let NAME = EXPR1 in EXPR2``:: >> >> stuff = [let y = f(x) in (y, x/y) for x in range(5)] >> >> I don't have anything new to say about this. It has the same keyword >> objections as similar proposals, and I think I've addressed the use case >> elsewhere. > This section is specifically about proposals that ONLY solve this > problem within list comprehensions. I don't think there's any point > mentioning your proposal there, as the "let NAME = EXPR in EXPR" > notation has nothing particularly to do with comprehensions. Fair enough, although that suggests a proper let expression precludes the need for any special handling. > >>> With assignment expressions, why bother with assignment statements? >>> ------------------------------------------------------------------- >>> >>> The two forms have different flexibilities. The ``:=`` operator can be used >>> inside a larger expression; the ``=`` operator can be chained more >>> conveniently, and closely parallels the inline operations ``+=`` and friends. >>> The assignment statement is a clear declaration of intent: this value is to >>> be assigned to this target, and that's it. >> >> I don't find this convincing. I don't really see chained assignments often enough >> to worry about how they are written, plus note my earlier question about the >> precedence and associativity of :=. > If you don't use them, why would you care either way? :) I just mean this seems like a weak argument if you are trying to convince someone to use assignment statements. "Assuming I never use chained assignments, why should I use ``=`` instead of ``:=``?" > >> The fact is, `x := 5` as an expression statement appears equivalent to the >> assignment statement `x = 5`, so I suspect people will start using it as such >> no matter how strongly you suggest they shouldn't. > Probably. But when they run into problems, the solution will be "use > an assignment statement, don't abuse the assignment expression". If > you want to, you can write this: > > np = __import__("numpy") > > But it's much better to use the import statement, and people will > rightly ask why you're using that expression form. All the dunder methods exist to support a higher-level syntax, and are not really intended to be used directly, so I think a stronger argument than "Don't use this, even though you could" would be preferable. > > Your "let... in" syntax is kinda interesting, but has a number of > problems. Is that the exact syntax used in Haskell, and if so, does > Haskell use 'in' to mean anything else in other contexts?
I don't believe ``in`` is used elsewhere in Haskell, although Python already has at least two distinct uses as noted earlier.

In Haskell, the ``let`` expression (like much of Haskell's syntax) is syntactic sugar for an application of lambda abstraction. The general form

    let n1 = e1
        n2 = e2
    in e3

is syntactic sugar for

    let n1 = e1
    in let n2 = e2
    in e3

where multiple bindings are expanded to a series of nested expressions. The single expression

    let n1 = e1 in e2

itself is transformed into

    (\n1 -> e2) e1

(or translated to Python, (lambda n1: e2)(e1)).

--
Clint

From rosuav at gmail.com  Wed Apr 11 10:48:19 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 12 Apr 2018 00:48:19 +1000
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To: 
References: <20180406011854.GU16661@ando.pearwood.info> <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com> <20180410173255.GM16661@ando.pearwood.info> <20180411034115.GP16661@ando.pearwood.info>
Message-ID: 

On Thu, Apr 12, 2018 at 12:37 AM, Peter O'Connor wrote: > Let's look at a task where there is "one obvious way" > > Suppose someone asks: "How can I build a list of squares of the first 100 > odd numbers [1, 9, 25, 49, ....] in Python?" The answer is now obvious - > few people would do this: > > list_of_odd_squares = [] > for i in range(100): > list_of_odd_squares.append((i*2+1)**2) > > or this: > > def iter_odd_squares(n)): > for i in range(n): > yield (i*2+1)**2 > > list_of_odd_squares = list(iter_odd_squares(100)) > > Because it's just more clean, compact, readable and "obvious" to do: > > list_of_even_squares = [(i*2+1)**2 for i in range(100)] > > Maybe I'm being presumptuous, but I think most Python users would agree. >

Or:

    squares = [i**2 for i in range(1, 200, 2)]

So maybe even the obvious examples aren't quite as obvious as you might think.

ChrisA

From p.f.moore at gmail.com  Wed Apr 11 10:50:59 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 11 Apr 2018 15:50:59 +0100
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To: 
References: <20180406011854.GU16661@ando.pearwood.info> <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com> <20180410173255.GM16661@ando.pearwood.info> <20180411034115.GP16661@ando.pearwood.info>
Message-ID: 

On 11 April 2018 at 15:37, Peter O'Connor wrote: > If people are happy with these solutions and still see no need for the > initialization syntax, we can stop this, but as I see it there is a "hole" > in the language that needs to be filled.

Personally, I'm happy with those solutions and see no need for the initialisation syntax. In particular, I'm happiest with the named moving_average() function, which may reflect to some extent my lack of familiarity with the subject area. I don't *care* how it's implemented internally - an explicit loop is fine with me, but if a domain expert wants to be clever and use something more complex, I don't need to know. An often missed disadvantage of one-liners is that they get put inline, meaning that people looking for a higher level overview of what the code does get confronted with all the gory details.
Paul

From kirillbalunov at gmail.com  Wed Apr 11 11:01:54 2018
From: kirillbalunov at gmail.com (Kirill Balunov)
Date: Wed, 11 Apr 2018 18:01:54 +0300
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To: 
References: 
Message-ID: 

2018-04-11 16:50 GMT+03:00 Chris Angelico : > > Can you give an example of how your syntax is superior to the more > general option of simply allowing "as" bindings in any location? >

This is not my syntax :) And not even my idea. I just do not understand, and am even a little skeptical about, allowing "as" bindings in **any location** with global scoping. All the examples in this thread and the previous ones, as well as almost all of the PEP's examples, show how this feature will be useful in `if`, `while` statements and comprehension/generator expressions. And it excellently solves this problem. This feature increases the capabilities of these statements and also positively affects the readability of the code, and it seems to me that everyone understands what this means in this context, without ambiguity about its meaning in `while` or `with` statements.

The remaining examples (general ones) are far-fetched, and I do not have much desire to discuss them :) These include:

    lambda: x := lambda y: z := a := 0
    y = ((eggs := spam()), (cheese := eggs.method()), cheese[eggs])

and others of this kind...

Thus, I do not understand why to solve such a general and complex problem, when this syntax is convenient only in specific cases. In addition, previously the concept of a Statement-Local Name Bindings was discussed, which I basically like (and it fits the above idea). In this version, it was abandoned completely, but it is unclear for what reasons.

p.s.: Maybe someone has use-cases outside `if`, `while` and comprehensions, but so far no one has demonstrated them.

With kind regards,
-gdg
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rosuav at gmail.com  Wed Apr 11 11:10:25 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 12 Apr 2018 01:10:25 +1000
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To: <93B1C6E1-BD86-4628-8438-A553EBA9B0CD@gmail.com>
References: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com> <93B1C6E1-BD86-4628-8438-A553EBA9B0CD@gmail.com>
Message-ID: 

On Thu, Apr 12, 2018 at 12:46 AM, Clint Hepner wrote: > >> On 2018 Apr 11 , at 9:25 a, Chris Angelico wrote: >> I have no idea what the "in" keyword is doing here, but somehow it >> isn't being used for the meaning it currently has in Python. Does your >> alternative require not one but *two* new keywords? > > Just one; I don't see using ``in`` here to be any more of a problem than was > reusing ``if``, ``then``, and ``else`` for conditional expressions. > ``in`` already has two different uses: as a Boolean operator (two, actually, with ``not in``) > and as part of the various ``for`` constructs. IMO, I don't see adding this third meaning > to be a problem. > > With ``let``, the scope extends as far right of ``in`` as possible: > > let NAME = EXPR in let OTHERNAME = EXPR in EXPR > > is equivalent to > > let NAME = EXPR in (let OTHERNAME = EXPR in EXPR)

A 'for' loop has the following structure:

    for targets in expr:

To the left of the 'in', you have a list of assignment targets. You can't assign to anything with an 'in' in it:

    >>> for (x in y) in [1]: pass
    ...
  File "<stdin>", line 1
SyntaxError: can't assign to comparison

(Removing the parentheses makes this "for x in (y in [1])", which is perfectly legal, but has no bearing on this discussion.) In contrast, the way you're using it here, it's simply between two arbitrary expressions. There's nothing to stop you from using the 'in' operator on both sides of it:

    let x = (a in b) in (b in c)

This would make the precedence tables extremely complicated, or else require some messy magic to make this work.

>>>> Augmented assignment is not supported in expression form:: >>>> >>>>>>> x +:= 1 >>>> File "<stdin>", line 1 >>>> x +:= 1 >>>> ^ >>>> SyntaxError: invalid syntax >>> >>> There's no reason given for why this is invalid. I assume it's a combination >>> of 1) Having both += and +:=/:+= would be redundant and 2) not wanting >>> to add 11+ new operators to the language. >> >> And 3) there's no point. Can you give an example of where you would >> want an expression form of augmented assignment? > > I wouldn't want one :). I'm just suggesting that the PEP include something to the > effect of "We're not adding augmented assignment expressions because...".

Does the document really need to say that it isn't needed?

>> The 'where' example is broadly >> similar to rejected alternative 3, except that you're removing the >> colon and the suite, which means you can't create more than one >> variable without figuring some way to parenthesize. > > I agree that `where` *should* be rejected; I don't really like them in Haskell, either. > I only listed ``let`` here because I assume it *will* be rejected due to its requiring > a new keyword, no matter how much I think adding a new keyword is warranted.

In round 3 of this PEP, I was focusing on listing all plausible variants. I'm now focusing more on an actually-viable proposal, so myriad alternatives aren't as important any more.

>>>> With assignment expressions, why bother with assignment statements? >>>> ------------------------------------------------------------------- >>>> >>>> The two forms have different flexibilities. The ``:=`` operator can be used >>>> inside a larger expression; the ``=`` operator can be chained more >>>> conveniently, and closely parallels the inline operations ``+=`` and friends. >>>> The assignment statement is a clear declaration of intent: this value is to >>>> be assigned to this target, and that's it. >>> >>> I don't find this convincing. I don't really see chained assignments often enough >>> to worry about how they are written, plus note my earlier question about the >>> precedence and associativity of :=. >> >> If you don't use them, why would you care either way? :) > > I just mean this seems like a weak argument if you are trying to convince > someone to use assignment statements. "Assuming I never use chained assignments, > why should I use ``=`` instead of ``:=``?"

Fair enough. The most important part is the declaration of intent. By using an assignment *statement*, you're clearly showing that this was definitely intentional.

>> Your "let... in" syntax is kinda interesting, but has a number of >> problems. Is that the exact syntax used in Haskell, and if so, does >> Haskell use 'in' to mean anything else in other contexts? > > I don't believe ``in`` is used elsewhere in Haskell, although Python > already has at least two distinct uses as noted earlier. > > In Haskell, the ``let`` expression (like much of Haskell's syntax) is > syntactic sugar for an application of lambda abstraction. The general form
>
>     let n1 = e1
>         n2 = e2
>     in e3
>
> is syntactic sugar for
>
>     let n1 = e1
>     in let n2 = e2
>     in e3
>
> where multiple bindings are expanded to a series of nested expressions. The single
> expression
>
>     let n1 = e1 in e2
>
> itself is transformed into
>
>     (\n1 -> e2) e1
>
> (or translated to Python, (lambda n1: e2)(e1)).

Makes sense. And if someone actually wants expression-local name bindings, this is the one obvious way to do it (modulo weirdness around class scope). This would not solve the if/while situation, and it wouldn't solve several of the other problems, but it does have the advantage of logically being expression-local.

ChrisA

From ncoghlan at gmail.com  Wed Apr 11 11:22:12 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 12 Apr 2018 01:22:12 +1000
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To: 
References: 
Message-ID: 

On 11 April 2018 at 15:32, Chris Angelico wrote: > Wholesale changes since the previous version. Statement-local name > bindings have been dropped (I'm still keeping the idea in the back of > my head; this PEP wasn't the first time I'd raised the concept), and > we're now focusing primarily on assignment expressions, but also with > consequent changes to comprehensions.

Thanks for putting this revised version together! You've already incorporated my feedback on semantics, so my comments below are mostly about the framing of the proposal in the context of the PEP itself.

> Syntax and semantics > ==================== > > In any context where arbitrary Python expressions can be used, a **named > expression** can appear. This can be parenthesized for clarity, and is of > the form ``(target := expr)`` where ``expr`` is any valid Python expression, > and ``target`` is any valid assignment target. > > The value of such a named expression is the same as the incorporated > expression, with the additional side-effect that the target is assigned > that value. > > # Similar to the boolean 'or' but checking for None specifically > x = "default" if (eggs := spam().ham) is None else eggs > > # Even complex expressions can be built up piece by piece > y = ((eggs := spam()), (cheese := eggs.method()), cheese[eggs]) >

Leading with these kinds of examples really doesn't help to sell the proposal, since they're hard to read, and don't offer much, if any, benefit over the status quo where assignments (and hence the order of operations) need to be spelled out as separate lines.

Instead, I'd suggest going with the kinds of examples that folks tend to bring up when requesting this capability:

    # Handle a matched regex
    if (match := pattern.search(data)) is not None:
        ...

    # A more explicit alternative to the 2-arg form of iter() invocation
    while (value := read_next_item()) is not None:
        ...

    # Share a subexpression between a comprehension filter clause and its output
    filtered_data = [y for x in data if (y := f(x)) is not None]

All three of those examples share the common characteristic that there's no ambiguity about the order of operations, and the latter two aren't amenable to simply being split out into separate assignment statements due to the fact they're part of a loop.

A good proposal should have readers nodding to themselves and thinking "I could see myself using that construct, and being happy about doing so", rather than going "Eugh, my eyes, what did I just read?" :)

> The name ``prefix`` is thus searched for at global scope, ignoring the class
> name.
> Under the proposed semantics, this name will be eagerly bound, being
> approximately equivalent to::
>
>     class X:
>         names = ["Fred", "Barney", "Joe"]
>         prefix = "> "
>         def <listcomp>(prefix=prefix):
>             result = []
>             for name in names:
>                 result.append(prefix + name)
>             return result
>         prefixed_names = <listcomp>()

"names" would also be eagerly bound here.

> Recommended use-cases
> =====================
>
> Simplifying list comprehensions
> -------------------------------
>
> These list comprehensions are all approximately equivalent::
>
>     # Calling the function twice
>     stuff = [[f(x), x/f(x)] for x in range(5)]
>
>     # External helper function
>     def pair(x, value): return [value, x/value]
>     stuff = [pair(x, f(x)) for x in range(5)]
>
>     # Inline helper function
>     stuff = [(lambda y: [y,x/y])(f(x)) for x in range(5)]
>
>     # Extra 'for' loop - potentially could be optimized internally
>     stuff = [[y, x/y] for x in range(5) for y in [f(x)]]
>
>     # Iterating over a genexp
>     stuff = [[y, x/y] for x, y in ((x, f(x)) for x in range(5))]
>
>     # Expanding the comprehension into a loop
>     stuff = []
>     for x in range(5):
>         y = f(x)
>         stuff.append([y, x/y])
>
>     # Wrapping the loop in a generator function
>     def g():
>         for x in range(5):
>             y = f(x)
>             yield [y, x/y]
>     stuff = list(g())
>
>     # Using a mutable cache object (various forms possible)
>     c = {}
>     stuff = [[c.update(y=f(x)) or c['y'], x/c['y']] for x in range(5)]
>
>     # Using a temporary name
>     stuff = [[y := f(x), x/y] for x in range(5)]

The example using the PEP syntax could be listed first in its own section, and then the others given as "These are the less obvious alternatives that this new capability aims to displace".

Similar to my suggestion above, you may also want to consider making this example a filtered comprehension in order to show the proposal in its best light:

    results = [(x, y, x/y) for x in input_data if (y := f(x))]

> Capturing condition values
> --------------------------
>
> Assignment expressions can be used to good effect in the header of
> an ``if`` or ``while`` statement::

Similar to the comprehension section, I think this part could benefit from switching the order of presentation.

> Frequently Raised Objections
> ============================

There needs to be a subsection here regarding the need to call `del` at class and module scope, just as there is for loop iteration variables at those scopes.

> This could be used to create ugly code!
> ---------------------------------------
>
> So can anything else. This is a tool, and it is up to the programmer to use it
> where it makes sense, and not use it where superior constructs can be used.

This argument will be strengthened by making the examples used in the PEP itself more attractive, as well as proposing suitable additions to PEP 8, such as:

1. If either assignment statements or assignment expressions can be used, prefer statements
2. If using assignment expressions would lead to ambiguity about execution order, restructure to use statements instead

Cheers,
Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Wed Apr 11 11:34:37 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 11 Apr 2018 16:34:37 +0100 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: Message-ID: On 11 April 2018 at 16:22, Nick Coghlan wrote: > Similar to my suggestion above, you may also want to consider making > this example a filtered comprehension in order to show the proposal in > its best light: > > results = [(x, y, x/y) for x in input_data if (y := f(x) )] Agreed, this is a *much* better motivating example. >> This could be used to create ugly code! >> --------------------------------------- >> >> So can anything else. This is a tool, and it is up to the programmer to use it >> where it makes sense, and not use it where superior constructs can be used. > > This argument will be strengthened by making the examples used in the > PEP itself more attractive, as well as proposing suitable additions to > PEP 8, such as: > > 1. If either assignment statements or assignment expressions can be > used, prefer statements > 2. If using assignment expressions would lead to ambiguity about > execution order, restructure to use statements instead +1 on explicitly suggesting additions to PEP 8. Bonus points for PEP 8 additions that can be automatically checked by linters/style checkers (For example "avoid chained assignment expressions"). Paul From python at mrabarnett.plus.com Wed Apr 11 11:47:14 2018 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 11 Apr 2018 16:47:14 +0100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings, take three! In-Reply-To: <0029eccf-38f8-9360-31da-d23aaea4344c@mgmiller.net> References: <20180323150058.GU16661@ando.pearwood.info> <20180324044102.GV16661@ando.pearwood.info> <20180324144432.GW16661@ando.pearwood.info> <27fccc82-8833-d1a5-a589-8d1358a3887a@btinternet.com> <5AB6A081.5010503@stoneleaf.us> <87in9d2xm3.fsf@vostro.rath.org> <2d7052b6-5912-c454-13f2-6595a32afa41@mgmiller.net> <0029eccf-38f8-9360-31da-d23aaea4344c@mgmiller.net> Message-ID: <1fe89773-31b7-7b00-960a-99bf8c1dce6f@mrabarnett.plus.com> On 2018-04-11 04:15, Mike Miller wrote: > If anyone is interested I came across this same subject on a blog post and > discussion on HN today: > > - https://www.hillelwayne.com/post/equals-as-assignment/ It says "BCPL also introduced braces as a means of defining blocks.". That bit is wrong, unless "braces" is being used as a generic term. BCPL used $( and $). From tjreedy at udel.edu Wed Apr 11 12:24:10 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 11 Apr 2018 12:24:10 -0400 Subject: [Python-ideas] Move optional data out of pyc files In-Reply-To: References: <20180411000335.GN16661@ando.pearwood.info> <20180411030205.GO16661@ando.pearwood.info> Message-ID: On 4/11/2018 4:26 AM, Petr Viktorin wrote: > Currently in Fedora, we ship *both* optimized and non-optimized pycs to > make sure both -O and non--O will work nicely without root privilieges. > So splitting the docstrings into a separate file would be, for us, a > benefit in terms of file size. Currently, the Windows installer has an option to pre-compile stdlib modules. (At least it does if one does an all-users installation.) If one selects this, it creates normal, -O, and -OO versions of each. Since, like most people, I never run with -O or -OO, replacing this redundancy with 1 segmented file or 2 non-redundant files might be a win for most people. 
-- Terry Jan Reedy

From brenbarn at brenbarn.net  Wed Apr 11 13:49:54 2018
From: brenbarn at brenbarn.net (Brendan Barnwell)
Date: Wed, 11 Apr 2018 10:49:54 -0700
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com>
References: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com>
Message-ID: <5ACE4AC2.4030503@brenbarn.net>

On 2018-04-11 05:23, Clint Hepner wrote: > I find the assignments make it difficult to pick out what the final expression looks like.

I strongly agree with this, and for me I think this is enough to push me to -1 on the whole proposal. For me the classic example case is still the quadratic formula type of thing:

    x1, x2 = (-b + sqrt(b**2 - 4*a*c))/2, (-b - sqrt(b**2 - 4*a*c))/2

It just doesn't seem worth it to me to create an expression-level assignment unless it can make things like this not just less verbose but at the same time more readable. I don't consider this more readable:

    x1, x2 = (-b + sqrt(D := b**2 - 4*a*c))/2, (-b - sqrt(D))/2

. . . because having to put the assignment inline creates a visual asymmetry, when for me the entire goal of an expression-level assignment is to make the symmetry between such things MORE obvious. I want to be able to write:

    x1, x2 = (-b + sqrt(D))/2, (-b - sqrt(D))/2 ...

. . . where "..." stands for "the part of the expression where I define the variables I'm re-using in multiple places in the expression".

The new proposal does at least have the advantage that it would help with things like this:

    while x := some_function_call():
        # do stuff

So maybe I'm -0.5 rather than -1. But it's not just that this proposal "could be used to create ugly code". It's that using it for expression-internal assignments WILL create ugly code, and there's no way to avoid it. I just don't see how this proposal provides any way to make things like the quadratic formula example above MORE readable.

--
Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail."
   --author unknown

From ethan at stoneleaf.us  Wed Apr 11 14:38:23 2018
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 11 Apr 2018 11:38:23 -0700
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To: 
References: 
Message-ID: <5ACE561F.4000707@stoneleaf.us>

On 04/10/2018 10:32 PM, Chris Angelico wrote: > Title: Assignment Expressions

Thank you, Chris, for doing all this!

---

Personally, I'm likely to only use this feature in `if` and `while` statements; if the syntax were easier to read inside longer expressions then I might use this elsewhere -- but as has been noted by others, the on-the-spot assignment creates asymmetries that further clutter the overall expression.

As Paul noted, I don't think parentheses should be mandatory if the parser itself does not require them.

For myself, I prefer the EXPR as NAME variant for two reasons:

- puts the calculation first, which is what we are used to seeing in if/while statements; and
- matches already existing expression-level assignments (context managers, try/except blocks)

+0.5 from me.
-- ~Ethan~

From kirillbalunov at gmail.com  Wed Apr 11 15:24:24 2018
From: kirillbalunov at gmail.com (Kirill Balunov)
Date: Wed, 11 Apr 2018 22:24:24 +0300
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To: 
References: 
Message-ID: 

2018-04-11 18:01 GMT+03:00 Kirill Balunov : > > > 2018-04-11 16:50 GMT+03:00 Chris Angelico : > >> >> Can you give an example of how your syntax is superior to the more >> general option of simply allowing "as" bindings in any location? >> >> > This is not my syntax :) And not even my idea. I just do not understand, > and am even a little skeptical about, allowing "as" bindings in **any > location** with global scoping. All the examples in this thread and the > previous ones, as well as almost all of the PEP's examples, show how this feature > will be useful in `if`, `while` statements and comprehension/generator > expressions. And it excellently solves this problem. This feature > increases the capabilities of these statements and also positively affects > the readability of the code, and it seems to me that everyone understands > what this means in this context, without ambiguity about its meaning in > `while` or `with` statements. > > The remaining examples (general ones) are far-fetched, and I do not have > much desire to discuss them :) These include: > > lambda: x := lambda y: z := a := 0 > y = ((eggs := spam()), (cheese := eggs.method()), cheese[eggs]) > and others of this kind... > > Thus, I do not understand why to solve such a general and complex problem, > when this syntax is convenient only in specific cases. In addition, > previously the concept of a Statement-Local Name Bindings was discussed, which > I basically like (and it fits the above idea). In this version, it was > abandoned completely, but it is unclear for what reasons. > > p.s.: Maybe someone has use-cases outside `if`, `while` and > comprehensions, but so far no one has demonstrated them. > >

I realize that what I wrote was very vague, so I'll try to add some specifics in reply to my own message. In general, I find this idea to be something that has been missing from the language, and thank you for trying to fix that! In my opinion it only has meaning in certain constructs such as `while`, `if`, `elif` and maybe comprehensions/generators. As a general form "anywhere" it can be _useful_, but it makes the code unreadable and difficult to perceive while giving little benefit.

What I find nice to have:

Extend while statement syntax:

    while (input("> ") as command) != "quit":
        print("You entered:", command)

Extend if statement syntax:

    if re.search(pat, text) as match:
        print("Found:", match.group(0))

    if (re.search(pat, text) as match) is not None:
        print("Found:", match.group(0))

also `elif` clauses should be extended to support this.

Extend comprehensions syntax:

    # Since comprehensions have an if clause
    [y for x in data if (f(x) as y) is not None]

    # Also this form without `if` clause
    [(y, x/y) with f(x) as y for x in range(5)]

Extend ternary expression syntax:

    data = y/x if (f(x) as y) > 0 else 0

I think that is all. And it seems to me that it covers 99% of all the use-cases of this feature. In my own world I would like these to make a local _statement_ binding (but this is certainly a very controversial point). I even like that this syntax matches the `with` and `except` statements' syntax, although it has different semantics. But I do not think that anyone will have problems with the perception of this.

With kind regards,
-gdg
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mertz at gnosis.cx  Wed Apr 11 14:05:56 2018
From: mertz at gnosis.cx (David Mertz)
Date: Wed, 11 Apr 2018 18:05:56 +0000
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To: <5ACE4AC2.4030503@brenbarn.net>
References: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com> <5ACE4AC2.4030503@brenbarn.net>
Message-ID: 

How about this, Brendan?

    _, x1, x2 = (D := b**2 - 4*a*c), (-b + sqrt(D))/2, (-b - sqrt(D))/2

I'm not sure I love this, but I don't hate it.

On Wed, Apr 11, 2018, 12:50 PM Brendan Barnwell wrote: > On 2018-04-11 05:23, Clint Hepner wrote: > > I find the assignments make it difficult to pick out what the final > expression looks like. > > I strongly agree with this, and for me I think this is enough to > push > me to -1 on the whole proposal. For me the classic example case is > still the quadratic formula type of thing: > > x1, x2 = (-b + sqrt(b**2 - 4*a*c))/2, (-b - sqrt(b**2 - 4*a*c))/2 > > It just doesn't seem worth it to me to create an expression-level > assignment unless it can make things like this not just less verbose but > at the same time more readable. I don't consider this more readable: > > x1, x2 = (-b + sqrt(D := b**2 - 4*a*c))/2, (-b - sqrt(D))/2 > > . . . because having to put the assignment inline creates a visual > asymmetry, when for me the entire goal of an expression-level assignment > is to make the symmetry between such things MORE obvious. I want to be > able to write: > > x1, x2 = (-b + sqrt(D))/2, (-b - sqrt(D))/2 ... > > . . . where "..." stands for "the part of the expression where I define > the variables I'm re-using in multiple places in the expression". > > The new proposal does at least have the advantage that it would > help > with things like this: > > while x := some_function_call(): > # do stuff > > So maybe I'm -0.5 rather than -1. But it's not just that this > proposal > "could be used to create ugly code". It's that using it for > expression-internal assignments WILL create ugly code, and there's no > way to avoid it. I just don't see how this proposal provides any way to > make things like the quadratic formula example above MORE readable. > > -- > Brendan Barnwell > "Do not follow where the path may lead. Go, instead, where there is no > path, and leave a trail." > --author unknown > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ >
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From p.f.moore at gmail.com  Wed Apr 11 17:03:30 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 11 Apr 2018 22:03:30 +0100
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To: 
References: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com> <5ACE4AC2.4030503@brenbarn.net>
Message-ID: 

On 11 April 2018 at 19:05, David Mertz wrote: > How about this, Brendan? > > _, x1, x2 = (D := b**2 - 4*a*c), (-b + sqrt(D))/2, (-b - sqrt(D))/2 > > I'm not sure I love this, but I don't hate it.

Seriously, how is this in any way better than

    D = b**2 - 4*a*c
    x1, x2 = (-b + sqrt(D))/2, (-b - sqrt(D))/2

? There are good use cases for this feature, but this simply isn't one.
Paul

From brenbarn at brenbarn.net Wed Apr 11 17:09:02 2018
From: brenbarn at brenbarn.net (Brendan Barnwell)
Date: Wed, 11 Apr 2018 14:09:02 -0700
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com>
 <5ACE4AC2.4030503@brenbarn.net>
Message-ID: <5ACE796E.7050401@brenbarn.net>

On 2018-04-11 11:05, David Mertz wrote:
> How about this, Brendan?
>
> _, x1, x2 = (D := b**2 - 4*a*c), (-b + sqrt(D))/2, (-b - sqrt(D))/2
>
> I'm not sure I love this, but I don't hate it.

That's clever, but why bother? I can already do this with existing
Python:

D = b**2 - 4*a*c
x1, x2 = (-b + sqrt(D))/2, (-b - sqrt(D))/2

If the new feature encourages people to do something like your example
(or my earlier examples with the D definition inline in the expression
for x1), then I'd consider that another mark against it.

-- 
Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no
path, and leave a trail."
   --author unknown

From rosuav at gmail.com Wed Apr 11 17:28:28 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 12 Apr 2018 07:28:28 +1000
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References:
Message-ID:

On Thu, Apr 12, 2018 at 1:22 AM, Nick Coghlan wrote:
>> # Similar to the boolean 'or' but checking for None specifically
>> x = "default" if (eggs := spam().ham) is None else eggs
>>
>> # Even complex expressions can be built up piece by piece
>> y = ((eggs := spam()), (cheese := eggs.method()), cheese[eggs])
>>
> Leading with these kinds of examples really doesn't help to sell the
> proposal, since they're hard to read, and don't offer much, if any,
> benefit over the status quo where assignments (and hence the order of
> operations) need to be spelled out as separate lines.
>
> Instead, I'd suggest going with the kinds of examples that folks
> tend to bring up when requesting this capability:

Cool, thanks. I've snagged these (and your other examples) and
basically tossed them into the PEP unchanged.

>> The name ``prefix`` is thus searched for at global scope, ignoring the class
>> name. Under the proposed semantics, this name will be eagerly bound, being
>> approximately equivalent to::
>>
>>     class X:
>>         names = ["Fred", "Barney", "Joe"]
>>         prefix = "> "
>>         def <listcomp>(prefix=prefix):
>>             result = []
>>             for name in names:
>>                 result.append(prefix + name)
>>             return result
>>         prefixed_names = <listcomp>()
>
> "names" would also be eagerly bound here.

Yep, that was a clerical error on my part, now corrected.

>> Frequently Raised Objections
>> ============================
>
> There needs to be a subsection here regarding the need to call `del`
> at class and module scope, just as there is for loop iteration
> variables at those scopes.

Hmm, I'm not sure I follow. Are you saying that this is an objection
to assignment expressions, or an objection to them not being
statement-local? If the latter, it's really more about "rejected
alternative proposals".

>> This could be used to create ugly code!
>> ---------------------------------------
>>
>> So can anything else. This is a tool, and it is up to the programmer to use it
>> where it makes sense, and not use it where superior constructs can be used.
>
> This argument will be strengthened by making the examples used in the
> PEP itself more attractive, as well as proposing suitable additions to
> PEP 8, such as:
>
> 1. If either assignment statements or assignment expressions can be
> used, prefer statements
> 2. 
If using assignment expressions would lead to ambiguity about > execution order, restructure to use statements instead Fair enough. Also adding that chained assignment expressions should generally be avoided. Thanks for the recommendations! ChrisA From waksman at gmail.com Wed Apr 11 17:34:13 2018 From: waksman at gmail.com (George Leslie-Waksman) Date: Wed, 11 Apr 2018 21:34:13 +0000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: <5ACE4AC2.4030503@brenbarn.net> References: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com> <5ACE4AC2.4030503@brenbarn.net> Message-ID: I really like this proposal in the context of `while` loops but I'm lukewarm in other contexts. I specifically like what this would do for repeated calls. Being able to replace all of these md5 = hashlib.md5() with open(filename, 'rb') as file_reader: for chunk in iter(lambda: file_reader.read(1024), b''): md5.update(chunk) md5 = hashlib.md5() with open(filename, 'rb') as file_reader: while True: chunk = file_reader.read(1024) if not chunk: break md5.update(chunk) md5 = hashlib.md5() with open(filename, 'rb') as file_reader: chunk = file_reader.read(1024) while chunk: md5.update(chunk) chunk = file_reader.read(1024) with md5 = hashlib.md5() with open(filename, 'rb') as file_reader: while chunk := file_reader.read(1024): md5.update(chunk) seems really nice. I'm not sure the other complexity is justified by this nicety and I'm really wary of anything that makes comprehensions more complicated; I already see enough comprehension abuse to the point of illegibility. --George On Wed, Apr 11, 2018 at 10:51 AM Brendan Barnwell wrote: > On 2018-04-11 05:23, Clint Hepner wrote: > > I find the assignments make it difficult to pick out what the final > expression looks like. > > I strongly agree with this, and for me I think this is enough to > push > me to -1 on the whole proposal. For me the classic example case is > still the quadratic formula type of thing: > > x1, x2 = (-b + sqrt(b**2 - 4*a*c))/2, (-b - sqrt(b**2 - 4*a*c))/2 > > It just doesn't seem worth it to me to create an expression-level > assignment unless it can make things like this not just less verbose but > at the same time more readable. I don't consider this more readable: > > x1, x2 = (-b + sqrt(D := b**2 - 4*a*c)))/2, (-b - sqrt(D))/2 > > . . . because having to put the assignment inline creates a visual > asymmetry, when for me the entire goal of an expression-level statement > is to make the symmetry between such things MORE obvious. I want to be > able to write: > > x1, x2 = (-b + sqrt(D)))/2, (-b - sqrt(D))/2 ... > > . . . where "..." stands for "the part of the expression where I define > the variables I'm re-using in multiple places in the expression". > > The new proposal does at least have the advantage that it would > help > with things like this: > > while x := some_function_call(): > # do stuff > > So maybe I'm -0.5 rather than -1. But it's not just that this > proposal > "could be used to create ugly code". It's that using it for > expression-internal assignments WILL create ugly code, and there's no > way to avoid it. I just don't see how this proposal provides any way to > make things like the quadratic formula example above MORE readable. > > -- > Brendan Barnwell > "Do not follow where the path may lead. Go, instead, where there is no > path, and leave a trail." 
> --author unknown > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Wed Apr 11 17:44:15 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 12 Apr 2018 07:44:15 +1000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: <5ACE4AC2.4030503@brenbarn.net> References: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com> <5ACE4AC2.4030503@brenbarn.net> Message-ID: On Thu, Apr 12, 2018 at 3:49 AM, Brendan Barnwell wrote: > On 2018-04-11 05:23, Clint Hepner wrote: >> >> I find the assignments make it difficult to pick out what the final >> expression looks like. > > > I strongly agree with this, and for me I think this is enough to > push me to -1 on the whole proposal. For me the classic example case is > still the quadratic formula type of thing: > > x1, x2 = (-b + sqrt(b**2 - 4*a*c))/2, (-b - sqrt(b**2 - 4*a*c))/2 > > It just doesn't seem worth it to me to create an expression-level > assignment unless it can make things like this not just less verbose but at > the same time more readable. I don't consider this more readable: > > x1, x2 = (-b + sqrt(D := b**2 - 4*a*c)))/2, (-b - sqrt(D))/2 > > . . . because having to put the assignment inline creates a visual > asymmetry, when for me the entire goal of an expression-level statement is > to make the symmetry between such things MORE obvious. I want to be able to > write: > > x1, x2 = (-b + sqrt(D)))/2, (-b - sqrt(D))/2 ... > > . . . where "..." stands for "the part of the expression where I define the > variables I'm re-using in multiple places in the expression". What if you want to use it THREE times? roots = [((-b + sqrt(D))/2/a, (-b - sqrt(D))/2/a) for a,b,c in triangles if (D := b**2 - 4*a*c) >= 0] Now it's matching again, without any language changes. (I've reinstated the omitted division by 'a', in case anyone's confused by the translation. It has no bearing on the PEP discussion.) Same if you're using an if statement. > The new proposal does at least have the advantage that it would help > with things like this: > > while x := some_function_call(): > # do stuff > > So maybe I'm -0.5 rather than -1. But it's not just that this > proposal "could be used to create ugly code". It's that using it for > expression-internal assignments WILL create ugly code, and there's no way to > avoid it. I just don't see how this proposal provides any way to make > things like the quadratic formula example above MORE readable. I don't think it's as terrible as you're saying. You've picked a specific example that is ugly; okay. This new syntax is not meant to *replace* normal assignment, but to complement it. There are times when it's much better to use the existing syntax. ChrisA From rosuav at gmail.com Wed Apr 11 17:55:21 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 12 Apr 2018 07:55:21 +1000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: <5ACE561F.4000707@stoneleaf.us> References: <5ACE561F.4000707@stoneleaf.us> Message-ID: On Thu, Apr 12, 2018 at 4:38 AM, Ethan Furman wrote: > On 04/10/2018 10:32 PM, Chris Angelico wrote: > >> Title: Assignment Expressions > > > Thank you, Chris, for doing all this! 
> > --- > > Personally, I'm likely to only use this feature in `if` and `while` > statements; if the syntax was easier to read inside longer expressions then > I might use this elsewhere -- but as has been noted by others, the > on-the-spot assignment creates asymmetries that further clutter the overall > expression. > > As Paul noted, I don't think parenthesis should be mandatory if the parser > itself does not require them. > > For myself, I prefer the EXPR as NAME variant for two reasons: > > - puts the calculation first, which is what we are used to seeing in > if/while statements; and > - matches already existing expression-level assignments (context managers, > try/except blocks) > > +0.5 from me. Context managers and except blocks don't do the same thing though, so it's a false parallel. The 'as' keyword occurs in Python's grammar thus: 1) "with EXPR as target:" captures the return value from __enter__ 2) "except EXPR as NAME:" captures the exception value, not the type(s) 3) "import NAME as NAME" loads a module object given by a token (name) and captures that 4) "from NAME import NAME as NAME" captures an attribute of a module Three of them use a name, one allows arbitrary assignment targets. (You can "with spam as ham[0]:" but you can't "except Exception as ham[0]:".) Two of them have no expressions at all, so assignment expressions wouldn't logically be usable. The other two are *dangerous* false parallels, because "with foo as bar:" is semantically different from "with (foo as bar):"; it was so awkward to try to explain that away that I actually forbade any use of "(expr as name)" in the header of a with/except block. The parallel is not nearly as useful as it appears to be on first blush. The parallel with assignment statements is far closer; in fact, many situations will behave identically whether you use the colon or not. > - puts the calculation first, which is what we are used to seeing in > if/while statements; and Not sure what you mean here; if and while statements don't have anything BUT the calculation. A 'for' loop puts the targets before the evaluated expression. > - matches already existing expression-level assignments (context managers, > try/except blocks) They're not expression-level assignments though, or else I'm misunderstanding something here? ChrisA From neatnate at gmail.com Wed Apr 11 18:01:39 2018 From: neatnate at gmail.com (Nathan Schneider) Date: Wed, 11 Apr 2018 18:01:39 -0400 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: <5ACE4AC2.4030503@brenbarn.net> References: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com> <5ACE4AC2.4030503@brenbarn.net> Message-ID: On Wed, Apr 11, 2018 at 1:49 PM, Brendan Barnwell wrote: > On 2018-04-11 05:23, Clint Hepner wrote: > >> I find the assignments make it difficult to pick out what the final >> expression looks like. >> > > I strongly agree with this, and for me I think this is enough to > push me to -1 on the whole proposal. For me the classic example case is > still the quadratic formula type of thing: > > x1, x2 = (-b + sqrt(b**2 - 4*a*c))/2, (-b - sqrt(b**2 - 4*a*c))/2 > > It just doesn't seem worth it to me to create an expression-level > assignment unless it can make things like this not just less verbose but at > the same time more readable. 
I don't consider this more readable: > > x1, x2 = (-b + sqrt(D := b**2 - 4*a*c)))/2, (-b - sqrt(D))/2 > > I'd probably write this as: x1, x2 = [(-b + s*sqrt(b**2 - 4*a*c))/(2*a) for s in (1,-1)] Agreed that the PEP doesn't really help for this use case, but I don't think it has to. The main use cases in the PEP seem compelling enough to me. Nathan -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Wed Apr 11 18:43:05 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 12 Apr 2018 08:43:05 +1000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: Message-ID: On Thu, Apr 12, 2018 at 5:24 AM, Kirill Balunov wrote: > I find that I wrote very vague, so I'll try in response to my answer to add > some specifics. In general, I find this idea missed in the language and > thank you for trying to fix this! In my opinion it has only a meaning in > certain constructions such as `while`, `if`, `elif` and maybe > comprehensions\generators. As a general form "anywhere" it can be _useful_, > but makes the code unreadable and difficult to perceive while giving not so > much benefit. What I find nice to have: > > Extend while statement syntax: > > while (input("> ") as command) != "quit": > print("You entered:", command) What you're writing there is not truly an extension of the while statement, but a special feature of an expression *within* the while header. Syntactically, this "expr as name" notation must be able to be combined with other operators (as in what you've written here), so it isn't like the way the 'with' or 'import' statement specifically makes a feature available as its own syntax. > Extend ternary expression syntax: > > data = y/x if (f(x) as y) > 0 else 0 Again, this is doing further operations after the capturing, so it's not like you can incorporate it in the syntax. You can't write: expr2 if expr1 as NAME else expr3 and explain its semantics that way, because then you can't put the "> 0" part in anywhere. So if, syntactically, this is a modification to expressions in general, why restrict them to certain contexts? Why can't I lift the condition out of the 'while' and give it a name? while (input("> ") as command) != "quit": # becomes # cond = (input("> ") as command) != "quit" print(cond) while cond: But I can't if this is magic in the 'while' statement. What do you gain by forbidding it? > I think that is all. And it seems to me that it covers 99% of all the > use-cases of this feature. In my own world I would like them to make a local > _statement_ binding (but this is certainly a very controversial point). I > even like that this syntax matches the `with` an `except` statements syntax, > although it has a different semantic. But I do not think that anyone will > have problems with perception of this. Statement-local names, controversial? You don't say! I actually think the parallel with 'with' and 'except' works *against* that version of the proposal, precisely because of the different semantics (as you mention). The difference between: except Exception as e: except (Exception as e): is significant and fairly easy to spot; as soon as you try to use 'e', you'll figure out that it's the Exception class, not the instance that got thrown. But in a 'with' statement? with open(fn) as f: with (open(fn) as f): These will do the same thing, because Python's file objects return self from __enter__. So do a lot of context managers. 
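For example (a made-up context manager, purely to illustrate the trap;
"connect" is hypothetical):

class Transaction:
    def __enter__(self):
        self.conn = connect()
        return self.conn    # note: NOT self
    def __exit__(self, *exc):
        self.conn.close()

with Transaction() as t:    # t is the connection (__enter__'s result)
    ...
with (Transaction() as t):  # t would be the Transaction object itself
    ...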
You can go a VERY long way down this rabbit-hole, doing things like:

with (open(infile) as read and
        open(outfile, "w") as write):
    write.write(read.read())

In CPython, you likely won't notice anything wrong here. And hey, it's
backslash-free multi-line context management! In fact, you might even
be able to use this in *Jython* without noticing a problem. Until you
have two output files, and then stuff breaks badly. Thus I sought to
outright forbid 'as' in the expressions used in a 'with' or 'except'
statement. The problem doesn't exist with ':=', because it's clear
that the different semantics go with different syntax:

with open(fn) as f:
with f := open(fn):

And since there's no reason to restrict it, it's open to all contexts
where an expression is needed.

ChrisA

From ethan at stoneleaf.us Wed Apr 11 19:03:27 2018
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 11 Apr 2018 16:03:27 -0700
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References: <5ACE561F.4000707@stoneleaf.us>
Message-ID: <5ACE943F.9070409@stoneleaf.us>

On 04/11/2018 02:55 PM, Chris Angelico wrote:
> On Thu, Apr 12, 2018 at 4:38 AM, Ethan Furman wrote:

> Context managers and except blocks don't do the same thing though, so
> it's a false parallel.

They assign things to names using `as`.  Close enough.  ;)

> The other two are
> *dangerous* false parallels, because "with foo as bar:" is
> semantically different from "with (foo as bar):";

If `with` is the first token on the line, then the context manager
syntax should be enforced -- you get one or the other, not both.

>> - puts the calculation first, which is what we are used to seeing in
>> if/while statements; and
>
> Not sure what you mean here; if and while statements don't have
> anything BUT the calculation.

Which is why we're used to seeing it first.  ;)

> A 'for' loop puts the targets before the evaluated expression.

Hmm, good point -- but it still uses a key word, `in`, to delineate
between the two.

>> - matches already existing expression-level assignments (context managers,
>> try/except blocks)
>
> They're not expression-level assignments though, or else I'm
> misunderstanding something here?

No, just me being sloppy with vocabulary.

-- 
~Ethan~

From rosuav at gmail.com Wed Apr 11 19:46:31 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 12 Apr 2018 09:46:31 +1000
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To: <5ACE943F.9070409@stoneleaf.us>
References: <5ACE561F.4000707@stoneleaf.us>
 <5ACE943F.9070409@stoneleaf.us>
Message-ID:

On Thu, Apr 12, 2018 at 9:03 AM, Ethan Furman wrote:
> On 04/11/2018 02:55 PM, Chris Angelico wrote:
>>
>> Context managers and except blocks don't do the same thing though, so
>> it's a false parallel.
>
> They assign things to names using `as`. Close enough. ;)

Okay, but because they don't assign the thing just before the word
'as', it's a dangerous parallel. Worse, a 'with' statement OFTEN
assigns the thing just before the word 'as', making the parenthesized
form frequently, but not always, the same as the unparenthesized.

>> The other two are
>> *dangerous* false parallels, because "with foo as bar:" is
>> semantically different from "with (foo as bar):";
>
> If `with` is the first token on the line, then the context manager syntax
> should be enforced -- you get one or the other, not both.
>

That's what I had in the previous version: that 'as' bindings are
straight-up not permitted in 'with' and 'except' headers. I wasn't
able to implement that in the grammar, so I can't demo it for you, but
the fact that this was a special case was *itself* a turn-off to some.
Are special cases special enough to break the rules? Do we want
something that is a subtle bug magnet?

>>> - puts the calculation first, which is what we are used to seeing in
>>> if/while statements; and
>>
>> Not sure what you mean here; if and while statements don't have
>> anything BUT the calculation.
>
> Which is why we're used to seeing it first. ;)

Heh. You could just as easily say you're used to seeing it next to the
colon :)

>> A 'for' loop puts the targets before the evaluated expression.
>
> Hmm, good point -- but it still uses a key word, `in`, to delineate
> between the two.

It does, yes. Fortunately we can't have a competing proposal for name
bindings to be spelled "NAME in EXPR", because that's already a thing
:D

>>> - matches already existing expression-level assignments (context
>>> managers,
>>> try/except blocks)
>>
>> They're not expression-level assignments though, or else I'm
>> misunderstanding something here?
>
> No, just me being sloppy with vocabulary.
>

Gotcha, no problem.

For myself, I've been back and forth a bit about whether "as" or ":="
is the better option. Both of them have problems. Both of them create
edge cases that could cause problems. Since the problems caused by
":=" are well known from other languages (and are less serious than
they would be if "=" were the operator), I'm pushing that form.
However, the 'as' syntax is a close contender (unlike most of the
other contenders), so if someone comes up with a strong argument in
its favour, I could switch.

ChrisA

From ethan at stoneleaf.us Wed Apr 11 20:44:44 2018
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 11 Apr 2018 17:44:44 -0700
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References: <5ACE561F.4000707@stoneleaf.us>
 <5ACE943F.9070409@stoneleaf.us>
Message-ID: <5ACEABFC.1040004@stoneleaf.us>

On 04/11/2018 04:46 PM, Chris Angelico wrote:

>> For myself, I've been back and forth a bit about whether "as" or ":="
>> is the better option. Both of them have problems. Both of them create
>> edge cases that could cause problems. 
Since the problems caused by >> ":=" are well known from other languages (and are less serious than >> they would be if "=" were the operator), I'm pushing that form. >> However, the 'as' syntax is a close contender (unlike most of the >> other contenders), so if someone comes up with a strong argument in >> its favour, I could switch. > > > While I strongly prefer "as", if it can't be made to work in the grammar > then that option is pretty much dead, isn't it? In which case, I'll take > ":=". > It can; and in fact, I have a branch where I had exactly that (with the SLNB functionality as well): https://github.com/Rosuav/cpython/tree/statement-local-variables But it creates enough edge cases that I was swayed by the pro-:= lobby. ChrisA From steve at pearwood.info Wed Apr 11 21:59:26 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 12 Apr 2018 11:59:26 +1000 Subject: [Python-ideas] Move optional data out of pyc files In-Reply-To: References: <20180411000335.GN16661@ando.pearwood.info> <20180411030205.GO16661@ando.pearwood.info> <20180411060642.GR16661@ando.pearwood.info> Message-ID: <20180412015926.GT16661@ando.pearwood.info> On Thu, Apr 12, 2018 at 12:09:38AM +1000, Chris Angelico wrote: [...] > >> Consider a very common use-case: an OS-provided > >> Python interpreter whose files are all owned by 'root'. Those will be > >> distributed with .pyc files for performance, but you don't want to > >> deprive the users of help() and anything else that needs docstrings > >> etc. So... are the docstrings lazily loaded or eagerly loaded? > > > > What relevance is that they're owned by root? > > You have to predict in advance what you'll want to have in your pyc > files. Can't create them on the fly. How is that different from the situation right now? > > What semantic change do you expect? > > > > There's an implementation change, of course, but that's Serhiy's problem > > to deal with and I'm sure that he has considered that. There should be > > no semantic change. When you access obj.__doc__, then and only then are > > the compiled docstrings for that module read from the disk. > > In other words, attempting to access obj.__doc__ can actually go and > open a file. Does it need to check if the file exists as part of the > import, or does it go back to sys.path? That's implementation, so I don't know, but I imagine that the module object will have a link pointing directly to the expected file on disk. No need to search the path, you just go directly to the expected file. Apart from handling the case when it doesn't exist, in which case the docstring or annotations get set to None, it should be relatively straight-forward. That link could be an explicit pathname: /path/to/__pycache__/foo.cpython-33-doc.pyc or it could be implicitly built when required from the "master" .pyc file's path, since the differences are likely to be deterministic. > If the former, you're right > back with the eager loading problem of needing to do 2-4 times as many > stat calls; Except that's not eager loading. When you open the file on demand, it might never be opened at all. If it is opened, it is likely to be a long time after interpreter startup. > > As for the in-memory data structures of objects themselves, I imagine > > something like the __doc__ and __annotation__ slots pointing to a table > > of strings, which is not initialised until you attempt to read from the > > table. Or something -- don't pay too much attention to my wild guesses. 
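To illustrate the sort of thing I mean (a sketch only -- the "-doc"
suffix is made up, not a proposed spelling):

import os

def _metadata_path(pyc_path, kind):
    # __pycache__/foo.cpython-33.pyc -> __pycache__/foo.cpython-33-doc.pyc
    base, ext = os.path.splitext(pyc_path)
    return '%s-%s%s' % (base, kind, ext)

def _load_docstrings(pyc_path):
    path = _metadata_path(pyc_path, 'doc')
    if not os.path.exists(path):
        return None  # metadata stripped; docstrings become None
    with open(path, 'rb') as f:
        return f.read()  # really: unmarshal a table of strings

None of this would run at import time; it would run only when __doc__
(say) is first read.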
> > > > The bottom line is, is there some reason *aside from performance* to > > avoid this? Because if the performance is worse, I'm sure Serhiy will be > > the first to dump this idea. > > Obviously it could be turned into just a performance question, but in > that case everything has to be preloaded You don't need to preload things to get a performance benefit. Preloading things that you don't need immediately and may never need at all, like docstrings, annotations and line numbers, is inefficient. I fear that you have completely failed to understand the (potential) performance benefit here. The point, or at least *a* point, of the exercise is to speed up interpreter startup by deferring some of the work until it is needed. When you defer work, the pluses are that it reduces startup time, and sometimes you can avoid doing it at all; the minus is that if you do end up needing to do it, you have to do a little bit extra. So let's look at a few common scenarios: 1. You run a script. Let's say that the script ends up loading, directly or indirectly, 200 modules, none of which need docstrings or annotations during runtime, and the script runs to completion without needing to display a traceback. You save loading 200 sets of docstrings, annotations and line numbers ("metadata" for brevity) so overall the interpreter starts up quicker and the script runs faster. 2. You run the same script, but this time it raises an exception and displays a traceback. So now you have to load, let's say, 20 sets of line numbers, which is a bit slower, but that doesn't happen until the exception is raised and the traceback printed, which is already a slow and exceptional case so who cares if it takes an extra few milliseconds? It is still an overall win because of the 180 sets of metadata you didn't need to load. 3. You have a long-running server application which runs for days or weeks between restarts. Let's say it loads 1000 modules, so you get significant savings during start up (let's say, hypothetically shaving off 2 seconds from a 30 second start up time), but over the course of the week it ends up eventually loading all 1000 sets of metadata. Since that is deferred until needed, it doesn't happen all at once, but spread out a little bit at a time. Overall, you end up doing four times as many file system operations, but since they're amortized over the entire week, not startup, it is still a win. (And remember that this extra cost only applies the first time a module's metadata is needed. It isn't a cost you keep paying over and over again.) We're (hopefully!) not going to care too much if the first few times the server needs to log a traceback, it hits the file system a few extra times. Logging tracebacks are already expensive, but they're also exceptional and so making them a bit more expensive is nevertheless likely to be an overall win if it makes startup faster. The cost/benefit accounting here is: we care far more about saving 2 seconds out of the 30 second startup (6% saving) than we care about spending an extra 8 seconds spread over a week (0.001% cost). 4. You're running the interactive interpreter. You probably aren't even going to notice the fact that it starts up a millisecond faster, or even 10 ms, but on the other hand you aren't going to notice either if the first time you call help(obj) it makes an extra four file system accesses and takes an extra few milliseconds. Likewise for tracebacks, you're not going to notice or care if it takes 350ms instead of 300ms to print a traceback. 
(Or however long it actually takes -- my care factor is too low to even try to measure it.) These are, in my opinion, typical scenarios. If you're in an atypical scenario, say all your modules are loaded over a network running over a piece of string stuck between two tin cans *wink*, then you probably will feel a lot more pain, but honestly that's not our problem. We're not obliged to optimize Python for running on broken networks. And besides, since we have to support byte-code only modules, and we want them to be a single .pyc file not four, people with atypical scenarios or make different cost/benefit tradeoffs can always opt-in to the single .pyc mode. [...] > As a simple example, upgrading your Python installation while you have > a Python script running can give you this effect already. Right -- so we're not adding any failure modes that don't already exist. It is *already* a bad idea to upgrade your Python installation, or even modify modules, while Python is running, since the source code may get out of sync with the cached line numbers and the tracebacks will become inaccurate. This is especially a problem when running in the interactive interpreter while editing the file you are running. > if docstrings are looked up > separately - and especially if lnotab is too - you could happily > import and use something (say, in a web server), then run updates, and > then an exception requires you to look up a line number. Oops, a few > lines got inserted into that file, and now all the line numbers are > straight-up wrong. That's a definite behavioural change. Indeed, but that's no different from what happens now when the same line number might point to a different line of source code. > Maybe it's > one that's considered acceptable, but it definitely is a change. I don't think it is a change, and I think it is acceptable. I think the solution is, don't upgrade your modules while you're still running them! -- Steve From rosuav at gmail.com Thu Apr 12 00:44:40 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 12 Apr 2018 14:44:40 +1000 Subject: [Python-ideas] Move optional data out of pyc files In-Reply-To: <20180412015926.GT16661@ando.pearwood.info> References: <20180411000335.GN16661@ando.pearwood.info> <20180411030205.GO16661@ando.pearwood.info> <20180411060642.GR16661@ando.pearwood.info> <20180412015926.GT16661@ando.pearwood.info> Message-ID: On Thu, Apr 12, 2018 at 11:59 AM, Steven D'Aprano wrote: > On Thu, Apr 12, 2018 at 12:09:38AM +1000, Chris Angelico wrote: > > [...] >> >> Consider a very common use-case: an OS-provided >> >> Python interpreter whose files are all owned by 'root'. Those will be >> >> distributed with .pyc files for performance, but you don't want to >> >> deprive the users of help() and anything else that needs docstrings >> >> etc. So... are the docstrings lazily loaded or eagerly loaded? >> > >> > What relevance is that they're owned by root? >> >> You have to predict in advance what you'll want to have in your pyc >> files. Can't create them on the fly. > > How is that different from the situation right now? If the files aren't owned by root (more specifically, if they're owned by you, and you can write to the pycache directory), you can do everything at runtime. Otherwise, you have to do everything at installation time. >> > What semantic change do you expect? >> > >> > There's an implementation change, of course, but that's Serhiy's problem >> > to deal with and I'm sure that he has considered that. There should be >> > no semantic change. 
When you access obj.__doc__, then and only then are >> > the compiled docstrings for that module read from the disk. >> >> In other words, attempting to access obj.__doc__ can actually go and >> open a file. Does it need to check if the file exists as part of the >> import, or does it go back to sys.path? > > That's implementation, so I don't know, but I imagine that the module > object will have a link pointing directly to the expected file on disk. > No need to search the path, you just go directly to the expected file. > Apart from handling the case when it doesn't exist, in which case the > docstring or annotations get set to None, it should be relatively > straight-forward. > > That link could be an explicit pathname: > > /path/to/__pycache__/foo.cpython-33-doc.pyc > > or it could be implicitly built when required from the "master" .pyc > file's path, since the differences are likely to be deterministic. Referencing a path name requires that each directory in it be opened. Checking to see if the file exists requires, at absolute best, one more stat call, and that's assuming you have an open handle to the directory. >> If the former, you're right >> back with the eager loading problem of needing to do 2-4 times as many >> stat calls; > > Except that's not eager loading. When you open the file on demand, it > might never be opened at all. If it is opened, it is likely to be a long > time after interpreter startup. I have no idea what you mean here. Eager loading != opening the file on demand. Eager statting != opening on demand. If you're not going to hold open handles to heaps of directories, you have to reference everything by path name. >> > As for the in-memory data structures of objects themselves, I imagine >> > something like the __doc__ and __annotation__ slots pointing to a table >> > of strings, which is not initialised until you attempt to read from the >> > table. Or something -- don't pay too much attention to my wild guesses. >> > >> > The bottom line is, is there some reason *aside from performance* to >> > avoid this? Because if the performance is worse, I'm sure Serhiy will be >> > the first to dump this idea. >> >> Obviously it could be turned into just a performance question, but in >> that case everything has to be preloaded > > You don't need to preload things to get a performance benefit. > Preloading things that you don't need immediately and may never need at > all, like docstrings, annotations and line numbers, is inefficient. Right, and if you DON'T preload everything, you have a potential semantic difference. Which is exactly what you were asking me, and I was answering. > So let's look at a few common scenarios: > > > 1. You run a script. Let's say that the script ends up loading, directly > or indirectly, 200 modules, none of which need docstrings or annotations > during runtime, and the script runs to completion without needing to > display a traceback. You save loading 200 sets of docstrings, > annotations and line numbers ("metadata" for brevity) so overall the > interpreter starts up quicker and the script runs faster. > > > 2. You run the same script, but this time it raises an exception and > displays a traceback. So now you have to load, let's say, 20 sets of > line numbers, which is a bit slower, but that doesn't happen until the > exception is raised and the traceback printed, which is already a slow > and exceptional case so who cares if it takes an extra few milliseconds? 
> It is still an overall win because of the 180 sets of metadata you > didn't need to load. Does this loading happen when the exception is constructed or when it's printed? How much can you do with an exception without triggering the loading of metadata? Is it now possible for the mere formatting of a traceback to fail because of disk/network errors? > These are, in my opinion, typical scenarios. If you're in an atypical > scenario, say all your modules are loaded over a network running over a > piece of string stuck between two tin cans *wink*, then you probably > will feel a lot more pain, but honestly that's not our problem. We're > not obliged to optimize Python for running on broken networks. People DO run Python over networks, though, and people DO upgrade their Python installations. >> As a simple example, upgrading your Python installation while you have >> a Python script running can give you this effect already. > > Right -- so we're not adding any failure modes that don't already exist. > > It is *already* a bad idea to upgrade your Python installation, or even > modify modules, while Python is running, since the source code may get > out of sync with the cached line numbers and the tracebacks will become > inaccurate. This is especially a problem when running in the interactive > interpreter while editing the file you are running. Do you terminate every single Python process on your system before you upgrade Python? Let's say you're running a server on Red Hat Enterprise Linux or Debian Stable, and you go to apply all the latest security updates. Is that best done by shutting down every single application, THEN applying all updates, and only when that's all done, starting everything up? Or do you update everything on the disk, then pick one process at a time and signal it to restart? I don't know for sure about RHEL, but I do know that Debian's package management system involves a lot of Python. So it'd be a bit tricky to build your updater such that no Python is running during updates - you'd have to deploy a brand-new Python tree somewhere to use for installation, or something. And if you have any tiny little wrapper scripts written in Python, they could easily still be running across an update, even if the rest of the app is written in C. So, no. You should NOT have to take a blanket rule of "don't update while it's running". Instead, what you have is: "Binaries can safely be unlinked, and Python modules only get loaded when you import them". >> if docstrings are looked up >> separately - and especially if lnotab is too - you could happily >> import and use something (say, in a web server), then run updates, and >> then an exception requires you to look up a line number. Oops, a few >> lines got inserted into that file, and now all the line numbers are >> straight-up wrong. That's a definite behavioural change. > > Indeed, but that's no different from what happens now when the same line > number might point to a different line of source code. Yes, this is true; but at least the mapping from byte code to line number is trustworthy. Worst case, you look at the traceback, and then interpret it based on an older copy of the .py file. If lnotab is loaded lazily, you don't even have that. Something's going to have to try to figure out what the mapping is. >> Maybe it's >> one that's considered acceptable, but it definitely is a change. > > I don't think it is a change, and I think it is acceptable. 
I think the
> solution is, don't upgrade your modules while you're still running them!

If you need a solution to it, then it IS a change. Doesn't mean it
can't be done, but it definitely is a change. (Look at the PEP 572
changes to list comprehensions at class scope. Nobody's denying that
the semantics are changing; but normal usage won't ever witness the
changes.)

I don't think this is purely a performance question.

ChrisA

From gey3933 at gmail.com Thu Apr 12 02:25:11 2018
From: gey3933 at gmail.com (delta114514)
Date: Wed, 11 Apr 2018 23:25:11 -0700 (PDT)
Subject: [Python-ideas] flatten multi-nested list/set/tuple and certain
 types.
Message-ID:

I thought that itertools.chain.from_iterable isn't very useful, because
it only accepts singly-nested iterables -- it raises an error when the
argument has a non-nested element -- like below:

>>> from itertools import chain
>>> chain.from_iterable([1])
<itertools.chain object at 0x...>
>>> list(_)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable

and it can't unpack anything nested more than one level deep:

>>> chain.from_iterable([[[1, 2]], [[4, 5]]])
<itertools.chain object at 0x...>
>>> list(_)
[[1, 2], [4, 5]]

So I wanted to make a "true chain.from_iterable", and this is it:

def flatten(iterables, unpack=(list, tuple, set), peep=(list, tuple, set)):
    for element in iterables:
        try:
            if isinstance(element, unpack):
                if isinstance(element, peep):
                    yield from flatten(element, unpack=unpack, peep=peep)
                else:
                    yield from flatten(element, unpack=(), peep=())
            elif isinstance(element, peep):
                yield type(element)(flatten(element, unpack=unpack, peep=peep))
            else:
                raise TypeError
        except TypeError:
            yield element

The reason I didn't use type() is that I wanted to unpack type objects
like "range", and I wanted to unpack user-defined classes/functions;
this is also why I didn't use collections.Iterable to check instances.

I know this would be broken by itertools.count :( -- if the checks were
widened to accept arbitrary iterables, flatten would try to walk an
infinite iterator and never finish. Please give me advice.
And I'd also like to know why a function like this is not in the
standard library. Thanks.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From turnbull.stephen.fw at u.tsukuba.ac.jp Thu Apr 12 02:39:11 2018
From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Thu, 12 Apr 2018 15:39:11 +0900
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References:
Message-ID: <23246.65295.891029.399833@turnbull.sk.tsukuba.ac.jp>

Nick Coghlan writes:

 > >     # Similar to the boolean 'or' but checking for None specifically
 > >     x = "default" if (eggs := spam().ham) is None else eggs
 > >
 > >     # Even complex expressions can be built up piece by piece
 > >     y = ((eggs := spam()), (cheese := eggs.method()), cheese[eggs])

My immediate take was "this syntax is too ugly to live", but so are
Gila monsters, and I don't understand the virtues that lead Nick and
Guido to take this thread seriously.  So I will just leave that
statement here.  (no vote yet)

More constructively, I found it amusing that the results were stuffed
into generic one-character variables, while the temporaries got actual
words, presumably standing in for mnemonic identifiers.  Besides moving
the examples, that should be fixed if these examples are to be used at
all.  I'm also with Paul (IIRC), who suggested formatting the second
example on multiple lines.

I suggest s/x/filling/ (sandwich) and s/y/omelet/.

Steve

From desmoulinmichel at gmail.com Thu Apr 12 02:45:18 2018
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Thu, 12 Apr 2018 08:45:18 +0200
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com>
 <5ACE4AC2.4030503@brenbarn.net>
Message-ID: <67d083aa-0549-cb39-71bb-87247cfb5d0b@gmail.com>

On 11/04/2018 23:34, George Leslie-Waksman wrote:
> I really like this proposal in the context of `while` loops but I'm
> lukewarm in other contexts.
>
> I specifically like what this would do for repeated calls.
>
> ...
>
> md5 = hashlib.md5()
> with open(filename, 'rb') as file_reader:
>     while chunk := file_reader.read(1024):
>         md5.update(chunk)
>
> seems really nice. I'm not sure the other complexity is justified by
> this nicety and I'm really wary of anything that makes comprehensions
> more complicated; I already see enough comprehension abuse to the point
> of illegibility.
>
> --George
>

I like the new syntax, but you can already do what you want with iter():

md5 = hashlib.md5()
with open('/etc/fstab', 'rb') as file_reader:
    for chunk in iter(lambda: file_reader.read(1024), b''):
        md5.update(chunk)

Anyway, both use cases fall short IRL, because you would wrap read in a
huge try/except to deal with the mess that is letting a user access the
filesystem.

From rosuav at gmail.com Thu Apr 12 02:49:42 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 12 Apr 2018 16:49:42 +1000
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To: <67d083aa-0549-cb39-71bb-87247cfb5d0b@gmail.com>
References: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com>
 <5ACE4AC2.4030503@brenbarn.net>
 <67d083aa-0549-cb39-71bb-87247cfb5d0b@gmail.com>
Message-ID:

On Thu, Apr 12, 2018 at 4:45 PM, Michel Desmoulin wrote:
>
> On 11/04/2018 23:34, George Leslie-Waksman wrote:
>> I really like this proposal in the context of `while` loops but I'm
>> lukewarm in other contexts.
>>
>> I specifically like what this would do for repeated calls.
>>
>> ...
>>
>> md5 = hashlib.md5()
>> with open(filename, 'rb') as file_reader:
>>     while chunk := file_reader.read(1024):
>>         md5.update(chunk)
>>
>> seems really nice. I'm not sure the other complexity is justified by
>> this nicety and I'm really wary of anything that makes comprehensions
>> more complicated; I already see enough comprehension abuse to the point
>> of illegibility.
>>
>> --George
>>
>
> I like the new syntax, but you can already do what you want with iter():
>
> md5 = hashlib.md5()
> with open('/etc/fstab', 'rb') as file_reader:
>     for chunk in iter(lambda: file_reader.read(1024), b''):
>         md5.update(chunk)
>
> Anyway, both use cases fall short IRL, because you would wrap read in a
> huge try/except to deal with the mess that is letting a user access the
> filesystem.

That works ONLY if you're trying to check for a sentinel condition via
equality. So that'll work for the file read situation, but it won't
work if you're watching for any negative number (from APIs that use
negative values to signal failure), nor for something where you want
the condition to be "is not None", etc, etc, etc. Also, it doesn't
read nearly as well.

ChrisA

From evpok.padding at gmail.com Thu Apr 12 03:06:40 2018
From: evpok.padding at gmail.com (Evpok Padding)
Date: Thu, 12 Apr 2018 09:06:40 +0200
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To: <5ACE796E.7050401@brenbarn.net>
References: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com>
 <5ACE4AC2.4030503@brenbarn.net>
 <5ACE796E.7050401@brenbarn.net>
Message-ID:

On 11 April 2018 at 23:09, Brendan Barnwell wrote:
> On 2018-04-11 11:05, David Mertz wrote:
>> How about this, Brendan?
>>
>> _, x1, x2 = (D := b**2 - 4*a*c), (-b + sqrt(D))/2, (-b - sqrt(D))/2
>>
>> I'm not sure I love this, but I don't hate it.
>>
> That's clever, but why bother?  I can already do this with
> existing Python:

Well, you were the one who suggested using an assignment expression for
that case.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From kirillbalunov at gmail.com Thu Apr 12 05:21:40 2018
From: kirillbalunov at gmail.com (Kirill Balunov)
Date: Thu, 12 Apr 2018 12:21:40 +0300
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References:
Message-ID:

2018-04-12 1:43 GMT+03:00 Chris Angelico :

> On Thu, Apr 12, 2018 at 5:24 AM, Kirill Balunov
> wrote:
> > I realize that what I wrote was very vague, so in reply to my own answer
> > I'll try to add some specifics. In general, I find this idea to be
> > missing from the language, and thank you for trying to fix that! In my
> > opinion it only has meaning in certain constructs such as `while`, `if`,
> > `elif` and maybe comprehensions/generators. As a general form usable
> > "anywhere" it can be _useful_, but it makes the code unreadable and
> > difficult to perceive while giving not much benefit. What I would find
> > nice to have:
> >
> > Extend while statement syntax:
> >
> > while (input("> ") as command) != "quit":
> >     print("You entered:", command)
>
> What you're writing there is not truly an extension of the while
> statement, but a special feature of an expression *within* the while
> header.

All right! You caught me :) For lack of a more thought-out alternative in
my head, let it be an expression, but one that can only be used
(evaluated) in a boolean context: `while` and `if` statements (let's skip
comprehensions and generators for some time, except for their `if`
clause).
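To make that restriction concrete (an illustrative sketch; read_next,
handle and f are placeholders):

while (read_next() as item) is not None:   # fine: feeds a boolean test
    handle(item)

pair = (f(x) as y, y ** 2)   # not allowed: no boolean context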
I agree that it contradicts the way that, in 'with' or 'import'
statements, it is a part of their own syntax. But nonetheless they have
the same syntax with, strictly speaking, different semantics! And this is
normal, because they are different statements.

> Syntactically, this "expr as name" notation must be able to be
> combined with other operators (as in what you've written here), so it
> isn't like the way the 'with' or 'import' statement specifically makes
> a feature available as its own syntax.
>

The ability to combine `expr as name` with other operators is actually a
matter of operator precedence. In my mind it should have the highest
precedence:

while input("> ") as command != "quit":

is equivalent to

while (input("> ") as command) != "quit":

Suppose that some function can return an `empty tuple` or `None` (which
are both False in a boolean context); for this case you can write:

while func(something) as value is not None:

which is equivalent to

while (func(something) as value) is not None:

and can also be written as:

while (func(something) as value) or (value is not None):

In the last snippet the parentheses were used only for readability.
Another example:

while f1(x) + f2(x) as value:

will be parsed as:

while f1(x) + (f2(x) as value):

> > Extend ternary expression syntax:
> >
> > data = y/x if (f(x) as y) > 0 else 0
>
> Again, this is doing further operations after the capturing, so it's
> not like you can incorporate it in the syntax. You can't write:
>
> expr2 if expr1 as NAME else expr3
>
> and explain its semantics that way, because then you can't put the
> "> 0" part in anywhere.
>

I do not quite understand why I cannot? :) (The parentheses were used
only for readability.)

> So if, syntactically, this is a modification to expressions in
> general, why restrict them to certain contexts? Why can't I lift the
> condition out of the 'while' and give it a name?
>
> while (input("> ") as command) != "quit":
> # becomes #
> cond = (input("> ") as command) != "quit"
> print(cond)
> while cond:
>
> But I can't if this is magic in the 'while' statement. What do you
> gain by forbidding it?
>

I gain readability! I don't see any reason to use it in other
contexts... because it makes the code unreadable and difficult to
perceive while giving not much benefit. I may be wrong, but so far I
have not seen a single example that has even slightly changed my mind.
Concerning your example, I did not understand it... `cond` is evaluated
only once... why do you need a while loop in this case?

> > I think that is all. And it seems to me that this covers 99% of all the
> > use-cases of this feature. In my own world I would like these to create
> > a local _statement_ binding (but this is certainly a very controversial
> > point). I even like that this syntax matches the `with` and `except`
> > statement syntax, although it has different semantics. But I do not
> > think that anyone will have problems with the perception of this.
>
> Statement-local names, controversial? You don't say!
>
> I actually think the parallel with 'with' and 'except' works *against*
> that version of the proposal, precisely because of the different
> semantics (as you mention). The difference between:
>
> except Exception as e:
> except (Exception as e):
>

It **will not be allowed** in `except`, since there is no boolean
context... There are no parallels, only the same syntax; it seems to me
that no one will have problems understanding what it means in different
contexts. In addition, at the moment everyone copes with the differences
between `import`, `with` and `except`.

> is significant and fairly easy to spot; as soon as you try to use 'e',
> you'll figure out that it's the Exception class, not the instance that
> got thrown. But in a 'with' statement?
>
> with open(fn) as f:
> with (open(fn) as f):
>
> These will do the same thing, because Python's file objects return
> self from __enter__. So do a lot of context managers. You can go a
> VERY long way down this rabbit-hole, doing things like:
>
> with (open(infile) as read and
>         open(outfile, "w") as write):
>     write.write(read.read())
>
> In CPython, you likely won't notice anything wrong here. And hey, it's
> backslash-free multi-line context management! In fact, you might even
> be able to use this in *Jython* without noticing a problem. Until you
> have two output files, and then stuff breaks badly. Thus I sought to
> outright forbid 'as' in the expressions used in a 'with' or 'except'
> statement.

The same goes for `with` statements: I see no reason to use both `:=`
and `expr as name` in with statements. How can this feature be used
here, especially in the context you mentioned above?

> The problem doesn't exist with ':=', because it's clear
> that the different semantics go with different syntax:
>
> with open(fn) as f:
> with f := open(fn):
>
> And since there's no reason to restrict it, it's open to all contexts
> where an expression is needed.
>

As for me, this example just shows that `:=` should not be used in
`with` statements at all.

I ask you to understand me correctly, but what surprised me most was
that different versions of the PEP (3 and 4) speak about absolutely
different things. I think it would be great to have two competing
proposals:

1. This PEP in the form it is now (with general assignment expressions)
2. Another PEP which discusses only the changes in if and while
statements.

I understand that it is much easier to advise than to do! I also
understand how much time all this takes. I myself still cannot find the
time (English comes to me with great difficulty :)) to write the PEP
about a partial assignment statement... But still, as I see it, these
two PEPs would allow us to look at the two approaches side by side, and
also to work through the problems arising in each of them separately.

I do not want to repeat it again, but I have not yet seen any reasonable
example where the current `:=` feature can be useful, except in `while`
and `if`. So I do not understand why everything needs to be complicated,
instead of concentrating on what can actually be useful.

With kind regards,
-gdg
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From j.van.dorp at deonet.nl Thu Apr 12 05:48:28 2018
From: j.van.dorp at deonet.nl (Jacco van Dorp)
Date: Thu, 12 Apr 2018 11:48:28 +0200
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References:
Message-ID:

Wouldn't these local name bindings make the current "as" clause of
"with f(x) as y" completely obsolete?

It's probably good to know my background, and my background is that I
know absolutely nothing of the implementation; I'm only a junior
software engineer, and Python was my first programming experience, and
is still the one I have by far the most experience with.

To me, the entire proposal sounds mostly like an expansion of the as
syntax as we know it from "with".
There will be no difference between:

    with open(filename) as f:
        // code

and

    with f := open(filename):
        // code

or at least as far as I can see. (That is, if := will be allowed in the
with statement, and it sounds like it will?)

From this (outsider's?) point of view, it'd make a lot more sense to
keep using "as" as the local binding operator - that's exactly what it
already seems to do, even if it looks different under the hood. This
would keep it to just one way to do stuff, and that happens to be the
way everyone's used to. Of course, it should still be extended further.
So right now

    with open(os.path.join(path, filename) as full_file_path) as f:
        // Do stuff with both full_file_path and f

won't work, but it'd still be awesome/useful if it did.

(overall +1 for the idea, for what it's worth)

Jacco

From storchaka at gmail.com  Thu Apr 12 05:54:35 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 12 Apr 2018 12:54:35 +0300
Subject: [Python-ideas] Default values in multi-target assignment
Message-ID:

Yet one crazy idea. What if we allow default values for targets in
multi-target assignment?

    >>> (a, b=0) = (1, 2)
    >>> a, b
    (1, 2)
    >>> (a, b=0) = (1,)
    >>> a, b
    (1, 0)
    >>> (a, b=0) = ()
    Traceback (most recent call last):
      File "", line 1, in
    ValueError: not enough values to unpack (expected at least 1, got 0)
    >>> (a, b=0) = (1, 2, 3)
    Traceback (most recent call last):
      File "", line 1, in
    ValueError: too many values to unpack (expected at most 2)

Currently you need either to explicitly check the length of the
right-hand part (if it is a sequence and not an arbitrary iterator),

    if len(c) == 1:
        a, = c
        b = 0
    elif len(c) == 2:
        a, b = c
    else:
        raise TypeError

or to use an intermediate function:

    def f(a, b=0):
        return a, b
    a, b = f(*c)

The latter can be written as an ugly one-liner:

    a, b = (lambda a, b=0: (a, b))(*c)

From storchaka at gmail.com  Thu Apr 12 06:14:23 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 12 Apr 2018 13:14:23 +0300
Subject: [Python-ideas] Move optional data out of pyc files
In-Reply-To:
References:
Message-ID:

10.04.18 20:38, Chris Angelico wrote:
> On Wed, Apr 11, 2018 at 2:14 AM, Serhiy Storchaka wrote:
> A deployed Python distribution generally has .pyc files for all of the
> standard library. I don't think people want to lose the ability to
> call help(), and unless I'm misunderstanding, that requires
> docstrings. So this will mean twice as many files and twice as many
> file-open calls to import from the standard library. What will be the
> impact on startup time?

Yes, this will mean more syscalls when importing with docstrings. But
the startup time doesn't matter for the interactive shell in which you
call help(). It was expected that programs which need to gain the
benefit from separating optional components will run without loading
them (as with the -OO option). The overhead can be reduced by packing
multiple files in a single archive.

Finally, loading docstrings and other optional components can be made
lazy. This was not in my original idea, and this will significantly
complicate the implementation, but in principle it is possible. This
will require larger changes in the marshal format and bytecode. This
can open a door for further enhancements: loading the code and building
classes and other complex data (especially heavy namedtuples, enums and
dataclasses) on demand. Often you need to use just a single attribute
or function from a large module. But this is a different change, out of
the scope of this topic.
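[To make the lazy-loading idea concrete, here is a toy sketch, not the
proposed pyc-level mechanism: docstrings live in a side table, and a
data descriptor on a metaclass resolves type.__doc__ only when it is
first accessed. The DOCS dict stands in for the separate file; all
names here are hypothetical.]

    # Toy illustration of lazily resolved docstrings.  DOCS stands in
    # for a side file / "text section" that would be read on demand.
    DOCS = {"Point": "A 2-D point."}

    class LazyDocMeta(type):
        @property
        def __doc__(cls):
            # A real implementation would read the pyc side data here;
            # this sketch just consults the stand-in table.
            return DOCS.get(cls.__qualname__)

    class Point(metaclass=LazyDocMeta):
        pass

    print(Point.__doc__)   # -> "A 2-D point.", fetched on first access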
From storchaka at gmail.com  Thu Apr 12 06:21:32 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 12 Apr 2018 13:21:32 +0300
Subject: [Python-ideas] Move optional data out of pyc files
In-Reply-To: <20180410182427.03ad0043@fsol>
References: <20180410182427.03ad0043@fsol>
Message-ID:

10.04.18 19:24, Antoine Pitrou wrote:
>> 2. Line numbers (lnotab). They are helpful for formatting tracebacks,
>> for tracing, and debugging with the debugger. Sources are helpful in
>> such cases too. If the program doesn't contain errors ;-) and is
>> shipped without sources, they could be removed.
>
> What is the weight of lnotab arrays? While docstrings can be large,
> I'm somehow skeptical that removing lnotab arrays would bring a
> significant improvement. It would be nice to have more data about
> this.

Maybe it is low. I just mentioned three kinds of data in pyc files that
can be optional. If we move out docstrings and annotations, why not
move lnotab too? It would be easy if we have already implemented the
infrastructure for the other two.

>> 3. Annotations. They are used mainly by third party tools that
>> statically analyze sources. They are rarely used at runtime.
>
> Even less used than docstrings probably.

And since there is a way of providing annotations in human-readable
format separately from the source code, it looks natural to provide a
way for compiling them into separate files.

From songofacandy at gmail.com  Thu Apr 12 06:48:07 2018
From: songofacandy at gmail.com (INADA Naoki)
Date: Thu, 12 Apr 2018 19:48:07 +0900
Subject: [Python-ideas] Move optional data out of pyc files
In-Reply-To:
References:
Message-ID:

> Finally, loading docstrings and other optional components can be made
> lazy. This was not in my original idea, and this will significantly
> complicate the implementation, but in principle it is possible. This
> will require larger changes in the marshal format and bytecode.

I'm +1 on this idea.

* The new pyc format has a code section (same as current) and a text
  section. The text section stores UTF-8 strings and is not loaded at
  import time.
* Function annotations (only when PEP 563 is used) and docstrings are
  stored as integers, pointing to offsets in the text section.
* When type.__doc__, PyFunction.__doc__, PyFunction.__annotations__
  are integers, the text is loaded from the text section lazily.

PEP 563 will reduce some startup time, but __annotations__ is still a
dict. The memory overhead is not negligible:

    In [1]: def foo(a: int, b: int) -> int:
       ...:     return a + b
       ...:

    In [2]: import sys
    In [3]: sys.getsizeof(foo)
    Out[3]: 136

    In [4]: sys.getsizeof(foo.__annotations__)
    Out[4]: 240

When PEP 563 is used, there are no side effects while building the
annotations. So the annotations can be serialized as text, like
{"a":"int","b":"int","return":"int"}.

This change will require a new pyc format, and descriptors for
PyFunction.__doc__, PyFunction.__annotations__ and type.__doc__.

Regards,

--
INADA Naoki

From p.f.moore at gmail.com  Thu Apr 12 07:09:22 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 12 Apr 2018 12:09:22 +0100
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References:
Message-ID:

On 11 April 2018 at 22:28, Chris Angelico wrote:
> On Thu, Apr 12, 2018 at 1:22 AM, Nick Coghlan wrote:
>> This argument will be strengthened by making the examples used in the
>> PEP itself more attractive, as well as proposing suitable additions
>> to PEP 8, such as:
>>
>> 1. If either assignment statements or assignment expressions can be
>> used, prefer statements
>> 2. If using assignment expressions would lead to ambiguity about
>> execution order, restructure to use statements instead
>
> Fair enough. Also adding that chained assignment expressions should
> generally be avoided.

Another one I think should be included (I'm a bit sad that it's not so
obvious that no-one would ever even think of it, but the current
discussion pretty much killed that hope for me):

* Assignment expressions should never be used standalone - assignment
statements should *always* be used in that case.

That's also one that I'd like to see implemented as a warning in the
common linters and style checkers.

I'm still not convinced that this whole proposal is a good thing (the
PEP 8 suggestions feel like fighting a rearguard action against
something that's inevitable but ill-advised), but if it does get
accepted it's in a lot better place now than it was when the
discussions started - so thanks for all the work you've done on
incorporating feedback.

Paul

From kirillbalunov at gmail.com  Thu Apr 12 07:28:21 2018
From: kirillbalunov at gmail.com (Kirill Balunov)
Date: Thu, 12 Apr 2018 14:28:21 +0300
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References:
Message-ID:

2018-04-12 12:48 GMT+03:00 Jacco van Dorp:

> Wouldn't these local name bindings make the current "as" clause of
> "with f(x) as y" completely obsolete?
>
> It's probably good to know my background, and my background is that I
> know completely nothing of the implementation; I'm only a junior
> software engineer, and Python was my first programming experience, and
> still the one I have by far the most experience with.
>
> To me, the entire proposal sounds mostly like an expansion of the as
> syntax as we know it from "with". There will be no difference between:
>
>     with open(filename) as f:
>         // code
>
> and
>
>     with f := open(filename):
>         // code
>
> or at least as far as I can see. (That is, if := will be allowed in
> the with statement, and it sounds like it will?)

Thank you Jacco! I do not know if I understood correctly how you
understand what is happening here. But you are just demonstrating my
fears about this proposal...

    with f := open(filename):

This will only be valid if the object returned by (f := open(filename))
defines __enter__ and __exit__ methods (nevertheless, in this situation
it does). But in other cases it will raise an error. Generally,
`with name := expr:` is not equivalent to `with expr as name:`. In
other places, for example the `except` clause, `f := something` is
valid only if the bound object inherits from BaseException. So in these
two situations, in my opinion, it will not be used much.

Your example under the current proposal should look like
`with open(full_file_path := os.path.join(path, filename)) as f:`.

With kind regards,
-gdg

From j.van.dorp at deonet.nl  Thu Apr 12 08:22:28 2018
From: j.van.dorp at deonet.nl (Jacco van Dorp)
Date: Thu, 12 Apr 2018 14:22:28 +0200
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References:
Message-ID:

2018-04-12 13:28 GMT+02:00 Kirill Balunov:
>
> 2018-04-12 12:48 GMT+03:00 Jacco van Dorp:
>>
>> Wouldn't these local name bindings make the current "as" clause of
>> "with f(x) as y" completely obsolete?
>>
>> It's probably good to know my background, and my background is that I
>> know completely nothing of the implementation; I'm only a junior
>> software engineer, and Python was my first programming experience,
>> and still the one I have by far the most experience with.
>>
>> To me, the entire proposal sounds mostly like an expansion of the as
>> syntax as we know it from "with". There will be no difference
>> between:
>>
>>     with open(filename) as f:
>>         // code
>>
>> and
>>
>>     with f := open(filename):
>>         // code
>>
>> or at least as far as I can see. (That is, if := will be allowed in
>> the with statement, and it sounds like it will?)
>
> Thank you Jacco! I do not know if I understood correctly how you
> understand what is happening here. But you are just demonstrating my
> fears about this proposal...
>
>     with f := open(filename):
>
> This will only be valid if the object returned by
> (f := open(filename)) defines __enter__ and __exit__ methods
> (nevertheless, in this situation it does). But in other cases it will
> raise an error. Generally, `with name := expr:` is not equivalent to
> `with expr as name:`. In other places, for example the `except`
> clause, `f := something` is valid only if the bound object inherits
> from BaseException. So in these two situations, in my opinion, it
> will not be used much.
>
> Your example under the current proposal should look like
> `with open(full_file_path := os.path.join(path, filename)) as f:`.
>
> With kind regards,
> -gdg

No, it will not raise an error to replace all these "as" usages with
name binding, if we choose operator priority right. Even if we consider
"with (y := f(x))", the f(x) the with gets is the same as the one bound
to the name - that's the point of the binding expression. The only
difference seems to be whether the name binding is done before or after
the __enter__ method - depending on operator priority (see below).

I've looked through PEP 343 and the contextlib docs
( https://docs.python.org/3/library/contextlib.html ), and I couldn't
find a single case where "with (y := f(x))" would be invalid. The only
difference I can think of is objects where __enter__ doesn't return
self. Inexperienced programmers might forget that, so we're giving them
the "as y" part working instead of binding y to None. There might be
libraries out there that return a non-self value from __enter__, which
would alter behaviour. I honestly can't imagine why you might do that,
though. And this could be entirely solved by giving "with" a higher
priority than "as". Then if "as" was used instead of ":=", you could
just drop "as" as part of the with statement, and it'd work the exact
same way everyone's used to.

And it's basically the same with "except x as y"; if except gets a
higher priority than as, it'd do the same thing (i.e., the exception
object gets bound to y, not the tuple of types). At least, that's what
it'd look like from the outside. If people complained "but the local
binding behaves differently behind except", you'd explain the same
story as when they ask about the difference between + and * priority in
basic math - and they'd be free to use parentheses as well if they
really want to, for some reason I'm unable to comprehend.

    except (errors := (TypeError, ValueError)) as e:
    # Or of course:
    except ( (TypeError, ValueError) as errors ) as e:
        logger.info(f"Error {e} is included in types: {errors}")

Where it all comes together is that if as is chosen instead of :=, it
might just be far easier to comprehend for people how it works.
"same as in with" might be wrong technically, but it's simple and correct conceptually. Note: I'm talking about operator priority here as if with and except return anything - which I know they don't. That probably complicates it a lot more than it sounds like what I've tried to explain my thoughts here, but I hope I made sense. (it's obvious to me, but well, im dutch..(jk)) Jacco From clint.hepner at gmail.com Thu Apr 12 08:28:11 2018 From: clint.hepner at gmail.com (Clint Hepner) Date: Thu, 12 Apr 2018 08:28:11 -0400 Subject: [Python-ideas] Default values in multi-target assignment In-Reply-To: References: Message-ID: <876D3CA4-375C-43F4-99D9-3105B70B4ACC@gmail.com> > On 2018 Apr 12 , at 5:54 a, Serhiy Storchaka wrote: > > Yet one crazy idea. What if allow default values for targets in multi-target assignment? > > >>> (a, b=0) = (1, 2) > >>> a, b > (1, 2) > >>> (a, b=0) = (1,) > >>> a, b > (1, 0) > >>> (a, b=0) = () > Traceback (most recent call last): > File "", line 1, in > ValueError: not enough values to unpack (expected at least 1, got 0) > >>> (a, b=0) = (1, 2, 3) > Traceback (most recent call last): > File "", line 1, in > ValueError: too many values to unpack (expected at most 2) > > Currently you need either explicitly check the length of the right-hand part (if it is a sequence and not an arbitrary iterator), > > if len(c) == 1: > a, = c > b = 0 > elif len(c) == 2: > a, b = c > else: > raise TypeError > > or use an intermediate function: > > def f(a, b=0): > return a, b > a, b = f(*c) > > The latter can be written as an ugly one-liner: > > a, b = (lambda a, b=0: (a, b))(*c) I think the best comparison would be (a, b=0, *_) = t vs a, b, *_ = (*t, 0) (a, b, *_) = (*t, 0) In both, I've added *_ to capture any trailing elements. It's more necessary with the current syntax, since adding defaults will make the tuple too long when they aren't needed. Given that, I'm +1 on the proposal, since 1. Defaults are more closely associated with their intended name 2. A new tuple doesn't need to be constructed on the RH The one minor downside, IMO, is that if you choose to omit the *_ guard, you *must* use parentheses on the LHS, as a, b = 0 = t # parsed as a, b = (0 = t) raises a SyntaxError on the assignment to 0. -- Clint From ncoghlan at gmail.com Thu Apr 12 09:02:01 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 12 Apr 2018 23:02:01 +1000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: Message-ID: On 12 April 2018 at 22:22, Jacco van Dorp wrote: > I've looked through PEP 343, contextlib docs ( > https://docs.python.org/3/library/contextlib.html ), and I couldn't > find a single case where "with (y := f(x))" would be invalid. Consider this custom context manager: @contextmanager def simple_cm(): yield 42 Given that example, the following code: with cm := simple_cm() as value: print(cm.func.__name__, value) would print "'simple_cm 42", since the assignment expression would reference the context manager itself, while the with statement binds the yielded value. Another relevant example would be `contextlib.closing`: that returns the passed in argument from __enter__, *not* self. And that's why earlier versions of PEP 572 (which used the "EXPR as NAME" spelling) just flat out prohibited top level name binding expressions in with statements: "with (expr as name):" and "with expr as name:" were far too different semantically for the only syntactic difference to be a surrounding set of parentheses. Cheers, Nick. 
From rosuav at gmail.com  Thu Apr 12 09:08:06 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 12 Apr 2018 23:08:06 +1000
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References:
Message-ID:

On Thu, Apr 12, 2018 at 9:09 PM, Paul Moore wrote:
> On 11 April 2018 at 22:28, Chris Angelico wrote:
>> On Thu, Apr 12, 2018 at 1:22 AM, Nick Coghlan wrote:
>>> This argument will be strengthened by making the examples used in
>>> the PEP itself more attractive, as well as proposing suitable
>>> additions to PEP 8, such as:
>>>
>>> 1. If either assignment statements or assignment expressions can be
>>> used, prefer statements
>>> 2. If using assignment expressions would lead to ambiguity about
>>> execution order, restructure to use statements instead
>>
>> Fair enough. Also adding that chained assignment expressions should
>> generally be avoided.
>
> Another one I think should be included (I'm a bit sad that it's not so
> obvious that no-one would ever even think of it, but the current
> discussion pretty much killed that hope for me).
>
> * Assignment expressions should never be used standalone - assignment
> statements should *always* be used in that case.

That's covered by the first point. If it's a standalone statement, then
the statement form could be used, ergo you should prefer the statement
form.

ChrisA

From rosuav at gmail.com  Thu Apr 12 09:14:34 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 12 Apr 2018 23:14:34 +1000
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References:
Message-ID:

On Thu, Apr 12, 2018 at 7:21 PM, Kirill Balunov wrote:
>
> I gain readability! I don't see any reason to use it in other
> contexts... Because it makes the code unreadable and difficult to
> perceive while giving not so much benefit. I may be wrong, but so far
> I have not seen a single example that at least slightly changed my
> mind.

This is, in effect, your entire argument for permitting assignments
only in certain contexts. "I can't think of any useful reason for doing
this, so we shouldn't do it". But that means making the language
grammar more complicated (both in the technical sense of the parser's
definitions, and in the colloquial sense of how you'd explain Python to
a new programmer), because there are these magic constructs that can be
used anywhere in an expression, but ONLY if that expression is inside
an if or while statement. You lose the ability to refactor your code
simply to satisfy an arbitrary restriction to appease someone's feeling
of "it can't be useful anywhere else".

There are basically two clean ways to do this:

1) Create actual syntax as part of the while statement, in the same way
that the 'with EXPR as NAME:' statement does. This means you cannot put
any additional operators after the 'as NAME' part. It's as much a part
of the statement's syntax as the word 'in' is in a for loop.

2) Make this a feature of expressions in general. Then they can be used
anywhere that an expression can be.

I've gone for option 2. If you want to push for option 1, go ahead, but
it's a nerfed solution just because you personally cannot think of any
good use for this.
ChrisA

From ncoghlan at gmail.com  Thu Apr 12 09:19:59 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 12 Apr 2018 23:19:59 +1000
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References:
Message-ID:

On 12 April 2018 at 07:28, Chris Angelico wrote:
> On Thu, Apr 12, 2018 at 1:22 AM, Nick Coghlan wrote:
>>> Frequently Raised Objections
>>> ============================
>>
>> There needs to be a subsection here regarding the need to call `del`
>> at class and module scope, just as there is for loop iteration
>> variables at those scopes.
>
> Hmm, I'm not sure I follow. Are you saying that this is an objection
> to assignment expressions, or an objection to them not being
> statement-local? If the latter, it's really more about "rejected
> alternative proposals".

It's both - accidentally polluting class and module namespaces is an
argument against expression-level assignments in general, and sublocal
namespaces aimed to eliminate that downside.

Since feedback on the earlier versions of the PEP has moved sublocal
namespaces into the "rejected due to excessive conceptual complexity"
box, that means accidental namespace pollution comes back as a downside
that the PEP should mention.

I don't think it needs to say much, just point out that they share the
downside of regular for loops: if you use one at class or module scope,
and don't want to export the name, you need to delete it explicitly.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From j.van.dorp at deonet.nl  Thu Apr 12 09:31:59 2018
From: j.van.dorp at deonet.nl (Jacco van Dorp)
Date: Thu, 12 Apr 2018 15:31:59 +0200
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References:
Message-ID:

2018-04-12 15:02 GMT+02:00 Nick Coghlan:
> On 12 April 2018 at 22:22, Jacco van Dorp wrote:
>> I've looked through PEP 343, contextlib docs (
>> https://docs.python.org/3/library/contextlib.html ), and I couldn't
>> find a single case where "with (y := f(x))" would be invalid.
>
> Consider this custom context manager:
>
>     @contextmanager
>     def simple_cm():
>         yield 42
>
> Given that example, the following code:
>
>     with cm := simple_cm() as value:
>         print(cm.func.__name__, value)
>
> would print "simple_cm 42", since the assignment expression would
> reference the context manager itself, while the with statement binds
> the yielded value.
>
> Another relevant example would be `contextlib.closing`: that returns
> the passed-in argument from __enter__, *not* self.
>
> And that's why earlier versions of PEP 572 (which used the "EXPR as
> NAME" spelling) just flat out prohibited top-level name binding
> expressions in with statements: "with (expr as name):" and "with expr
> as name:" were far too different semantically for the only syntactic
> difference to be a surrounding set of parentheses.
>
> Cheers,
> Nick.

Makes sense. However, couldn't you prevent that by giving with priority
over the binding? As in "(with simple_cm) as value", where we consider
the "as" as binding operator instead of part of the with statement?
Sure, you could commit suicide by parenthesis, but by default it'd do
exactly what the "with simple_cm as value" currently does. This does
require use of as instead of :=, though.
(which was the point I was trying to make, apologies for the confusion)

From rosuav at gmail.com  Thu Apr 12 09:41:49 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 12 Apr 2018 23:41:49 +1000
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References:
Message-ID:

On Thu, Apr 12, 2018 at 11:31 PM, Jacco van Dorp wrote:
> 2018-04-12 15:02 GMT+02:00 Nick Coghlan:
>> On 12 April 2018 at 22:22, Jacco van Dorp wrote:
>>> I've looked through PEP 343, contextlib docs (
>>> https://docs.python.org/3/library/contextlib.html ), and I couldn't
>>> find a single case where "with (y := f(x))" would be invalid.
>>
>> Consider this custom context manager:
>>
>>     @contextmanager
>>     def simple_cm():
>>         yield 42
>>
>> Given that example, the following code:
>>
>>     with cm := simple_cm() as value:
>>         print(cm.func.__name__, value)
>>
>> would print "simple_cm 42", since the assignment expression would
>> reference the context manager itself, while the with statement binds
>> the yielded value.
>>
>> Another relevant example would be `contextlib.closing`: that returns
>> the passed-in argument from __enter__, *not* self.
>>
>> And that's why earlier versions of PEP 572 (which used the "EXPR as
>> NAME" spelling) just flat out prohibited top-level name binding
>> expressions in with statements: "with (expr as name):" and "with
>> expr as name:" were far too different semantically for the only
>> syntactic difference to be a surrounding set of parentheses.
>>
>> Cheers,
>> Nick.
>
> Makes sense. However, couldn't you prevent that by giving with
> priority over the binding? As in "(with simple_cm) as value", where
> we consider the "as" as binding operator instead of part of the with
> statement? Sure, you could commit suicide by parenthesis, but by
> default it'd do exactly what the "with simple_cm as value" currently
> does. This does require use of as instead of :=, though. (which was
> the point I was trying to make, apologies for the confusion)

If you want this to be a generic name-binding operation, then no; most
objects cannot be used as context managers. You'll get an exception if
you try to use "with 1 as x:", for instance.

As Nick mentioned, there are context managers that return something
other than 'self', and for those, "with expr as name:" has an important
meaning that cannot easily be captured with an assignment operator.

ChrisA

From adelfino at gmail.com  Thu Apr 12 09:46:20 2018
From: adelfino at gmail.com (=?UTF-8?Q?Andr=C3=A9s_Delfino?=)
Date: Thu, 12 Apr 2018 10:46:20 -0300
Subject: [Python-ideas] Accepting multiple mappings as positional
 arguments to create dicts
In-Reply-To:
References: <20180411044441.GQ16661@ando.pearwood.info>
Message-ID:

Extending the original idea, IMHO it would make sense for the dict
constructor to create a new dictionary not only from several mappings,
but mixing mappings and iterables too.

Consider this example:

    x = [(1, 'one')]
    y = {2: 'two'}

Now: {**dict(x), **y}
Proposed: dict(x, y)

I think this extension makes the call ostensibly easier to read and
grep. I believe we are safe regarding compatibility issues, right?

What do you guys think?

On Wed, Apr 11, 2018 at 4:44 AM, Mike Miller wrote:
> Ok, we can haggle the finer details and I admit once you learn the
> syntax it isn't substantially harder. Simply, I've found the dict() a
> bit easier to mentally parse at a glance. Also, to add, I've always
> expected multiple args to work with it, and am always surprised when
> it doesn't.
> Would never have thought of this unpacking syntax if I didn't know
> that's the way it's done now, but often have to think about it for a
> second or two.
>
> On 2018-04-10 22:22, Chris Angelico wrote:
>> On Wed, Apr 11, 2018 at 2:44 PM, Steven D'Aprano wrote:
>>> On Wed, Apr 11, 2018 at 02:22:08PM +1000, Chris Angelico wrote:

From rosuav at gmail.com  Thu Apr 12 09:39:23 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 12 Apr 2018 23:39:23 +1000
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References:
Message-ID:

On Thu, Apr 12, 2018 at 11:19 PM, Nick Coghlan wrote:
> On 12 April 2018 at 07:28, Chris Angelico wrote:
>> On Thu, Apr 12, 2018 at 1:22 AM, Nick Coghlan wrote:
>>>> Frequently Raised Objections
>>>> ============================
>>>
>>> There needs to be a subsection here regarding the need to call
>>> `del` at class and module scope, just as there is for loop
>>> iteration variables at those scopes.
>>
>> Hmm, I'm not sure I follow. Are you saying that this is an objection
>> to assignment expressions, or an objection to them not being
>> statement-local? If the latter, it's really more about "rejected
>> alternative proposals".
>
> It's both - accidentally polluting class and module namespaces is an
> argument against expression-level assignments in general, and
> sublocal namespaces aimed to eliminate that downside.
>
> Since feedback on the earlier versions of the PEP has moved sublocal
> namespaces into the "rejected due to excessive conceptual complexity"
> box, that means accidental namespace pollution comes back as a
> downside that the PEP should mention.
>
> I don't think it needs to say much, just point out that they share
> the downside of regular for loops: if you use one at class or module
> scope, and don't want to export the name, you need to delete it
> explicitly.

Ah, makes sense. Thanks. Have added that to the latest version.

ChrisA

From storchaka at gmail.com  Thu Apr 12 09:57:01 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 12 Apr 2018 16:57:01 +0300
Subject: [Python-ideas] Add the method decorator
Message-ID:

There is a difference between functions implemented in Python and in C.
Functions implemented in Python are descriptors. They can be used for
defining methods in Python classes. Functions implemented in C are not
descriptors. When you set a class attribute to a function implemented
in C, it will not become a bound method.

    from _noddy import noddy_name

    class Noddy:
        name = noddy_name

    noddy = Noddy()

If noddy_name is a Python function, noddy.name() will call
noddy_name(noddy), but if it is a C function, noddy.name() will call
noddy_name(). The same is true for classes and custom callables.

If a function is a descriptor, it can be converted into a
non-descriptor function by wrapping it with the staticmethod decorator.
I suggest adding the method decorator, which converts an arbitrary
callable into a descriptor.

    class Noddy:
        name = method(noddy_name)

This will help to implement only the performance-critical methods of a
class in C. Currently you need to implement a base class in C, and
inherit a Python class from the C class.
But this doesn't work when the class should inherit from another C
class, or when an existing class should be patched, as in
total_ordering. This will also help to use custom callables as methods.

From dmoisset at machinalis.com  Thu Apr 12 10:16:31 2018
From: dmoisset at machinalis.com (Daniel Moisset)
Date: Thu, 12 Apr 2018 15:16:31 +0100
Subject: [Python-ideas] Move optional data out of pyc files
In-Reply-To:
References:
Message-ID:

One implementation difficulty specifically related to annotations is
that they are quite hard to find/extract from the code objects. Both
docstrings and lnotab are within specific fields of the code object for
their function/class/module; annotations are spread as individual
constants (assuming PEP 563), which are loaded in bytecode through
separate LOAD_CONST statements before creating the function object, and
that can happen in the middle of the bytecode for the higher-level
object (the module or class containing a function definition). So the
change for achieving that will be more significant than just "add a
couple of descriptors to function objects and change the module
marshalling code".

Probably making annotations fit a single structure that can live in
co_consts could make this change easier, and also make startup of
annotated modules faster (because you just load a single constant
instead of one per argument); this might be a valuable change by
itself.

On 12 April 2018 at 11:48, INADA Naoki wrote:
> > Finally, loading docstrings and other optional components can be
> > made lazy. This was not in my original idea, and this will
> > significantly complicate the implementation, but in principle it is
> > possible. This will require larger changes in the marshal format
> > and bytecode.
>
> I'm +1 on this idea.
>
> * The new pyc format has a code section (same as current) and a text
>   section. The text section stores UTF-8 strings and is not loaded at
>   import time.
> * Function annotations (only when PEP 563 is used) and docstrings are
>   stored as integers, pointing to offsets in the text section.
> * When type.__doc__, PyFunction.__doc__, PyFunction.__annotations__
>   are integers, the text is loaded from the text section lazily.
>
> PEP 563 will reduce some startup time, but __annotations__ is still a
> dict. The memory overhead is not negligible:
>
>     In [1]: def foo(a: int, b: int) -> int:
>        ...:     return a + b
>        ...:
>
>     In [2]: import sys
>     In [3]: sys.getsizeof(foo)
>     Out[3]: 136
>
>     In [4]: sys.getsizeof(foo.__annotations__)
>     Out[4]: 240
>
> When PEP 563 is used, there are no side effects while building the
> annotations. So the annotations can be serialized as text, like
> {"a":"int","b":"int","return":"int"}.
>
> This change will require a new pyc format, and descriptors for
> PyFunction.__doc__, PyFunction.__annotations__ and type.__doc__.
>
> Regards,
>
> --
> INADA Naoki

--
Daniel F. Moisset - UK Country Manager - Machinalis Limited
www.machinalis.co.uk
Skype: @dmoisset
T: +44 7398 827139

1 Fore St, London, EC2Y 9DT

Machinalis Limited is a company registered in England and Wales.
Registered number: 10574987.
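[The bytecode spreading Daniel describes is easy to observe yourself.
A quick, version-dependent check - nothing here is part of the
proposal, it just disassembles a small annotated function and lets you
look for the constants used to build __annotations__:]

    # Disassemble a module defining an annotated function; the exact
    # opcodes used to assemble the annotations vary by Python version.
    import dis

    src = "def foo(a: int, b: int) -> int:\n    return a + b\n"
    dis.dis(compile(src, "<demo>", "exec"))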
From J.Demeyer at UGent.be  Thu Apr 12 10:32:37 2018
From: J.Demeyer at UGent.be (Jeroen Demeyer)
Date: Thu, 12 Apr 2018 16:32:37 +0200
Subject: [Python-ideas] Add the method decorator
In-Reply-To: <46c85cad04c143f0b485b82944333428@xmail101.UGent.be>
References: <46c85cad04c143f0b485b82944333428@xmail101.UGent.be>
Message-ID: <5ACF6E05.30507@UGent.be>

On 2018-04-12 15:57, Serhiy Storchaka wrote:
> There is a difference between functions implemented in Python and in
> C. Functions implemented in Python are descriptors. They can be used
> for defining methods in Python classes. Functions implemented in C
> are not descriptors. When you set a class attribute to a function
> implemented in C, it will not become a bound method.

As it happens, the recently-created PEP 575 allows creating functions
in C which behave more like Python functions. It doesn't fix your use
case with existing functions implemented in C or arbitrary callables,
but PEP 575 would fix your use case for user-implemented C functions.

From e+python-ideas at kellett.im  Thu Apr 12 10:34:00 2018
From: e+python-ideas at kellett.im (Ed Kellett)
Date: Thu, 12 Apr 2018 15:34:00 +0100
Subject: [Python-ideas] Accepting multiple mappings as positional
 arguments to create dicts
In-Reply-To:
References: <20180411044441.GQ16661@ando.pearwood.info>
Message-ID: <828fc8fa-f88d-4acf-73c5-5931fbeeaa96@kellett.im>

On 2018-04-12 14:46, Andrés Delfino wrote:
> Extending the original idea, IMHO it would make sense for the dict
> constructor to create a new dictionary not only from several
> mappings, but mixing mappings and iterables too.
>
> Consider this example:
>
>     x = [(1, 'one')]
>     y = {2: 'two'}
>
> Now: {**dict(x), **y}
> Proposed: dict(x, y)
>
> I think this extension makes the call ostensibly easier to read and
> grep.

It allows for creating a flattened dict from an iterable of dicts, too,
which I've occasionally wanted:

    >>> configs = {'a': 'yes'}, {'b': 'no'}, {'c': 3}
    >>> dict(*configs)
    {'a': 'yes', 'b': 'no', 'c': 3}

versus:

    >>> dict(chain.from_iterable(c.items() for c in configs))
    {'a': 'yes', 'b': 'no', 'c': 3}

Ed

From e+python-ideas at kellett.im  Thu Apr 12 10:30:12 2018
From: e+python-ideas at kellett.im (Ed Kellett)
Date: Thu, 12 Apr 2018 15:30:12 +0100
Subject: [Python-ideas] Add the method decorator
In-Reply-To:
References:
Message-ID: <9126b502-7765-b6da-69d0-88e0e164a8e2@kellett.im>

On 2018-04-12 14:57, Serhiy Storchaka wrote:
> If noddy_name is a Python function, noddy.name() will call
> noddy_name(noddy), but if it is a C function, noddy.name() will call
> noddy_name().
>
> The same is true for classes and custom callables.

FWIW, you could (almost) do this in py2:

    >>> class Str(str): pass
    ...
    >>> Str.print = types.MethodType(print, None, Str)
    >>> Str("hello").print()
    hello

> If a function is a descriptor, it can be converted into a
> non-descriptor function by wrapping it with the staticmethod
> decorator. I suggest adding the method decorator, which converts an
> arbitrary callable into a descriptor.
>
>     class Noddy:
>         name = method(noddy_name)
>
> This will help to implement only the performance-critical methods of
> a class in C.

Does the method decorator need to be written in C for the performance
benefit? If you can stand __get__ being Python, it's pretty easy to
write and doesn't need to change the language.
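[For reference, a minimal pure-Python sketch of such a decorator - an
illustration of the idea, not Serhiy's proposed C implementation:]

    import types

    class method:
        """Wrap any callable so that class-attribute access binds it."""
        def __init__(self, func):
            self.__func__ = func

        def __call__(self, *args, **kwargs):
            # Calling the wrapper directly behaves like the raw callable.
            return self.__func__(*args, **kwargs)

        def __get__(self, obj, objtype=None):
            if obj is None:
                return self                       # accessed on the class
            return types.MethodType(self.__func__, obj)  # bound to obj

[With this, `name = method(noddy_name)` in a class body makes
`noddy.name()` call `noddy_name(noddy)` even when noddy_name is
implemented in C.]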
This does remind me of my favourite silly functional Python trick: as
long as foo is implemented in Python, foo.__get__(x)(y, z) is
equivalent to foo(x, y, z), which is useful if you find Python's
standard partial application syntax too ugly.

> Currently you need to implement a base class in C, and inherit a
> Python class from the C class. But this doesn't work when the class
> should inherit from another C class, or when an existing class should
> be patched, as in total_ordering.
>
> This will also help to use custom callables as methods.

I wonder if it wouldn't make more sense to make the behaviour
consistent between Python and C functions... that
someclass.blah = a_python_function already does a frequently-wrong
thing suggests (to me) that maybe the proper solution would be to bring
back unbound method objects.

Ed

From guido at python.org  Thu Apr 12 11:08:09 2018
From: guido at python.org (Guido van Rossum)
Date: Thu, 12 Apr 2018 08:08:09 -0700
Subject: [Python-ideas] Accepting multiple mappings as positional
 arguments to create dicts
In-Reply-To: <828fc8fa-f88d-4acf-73c5-5931fbeeaa96@kellett.im>
References: <20180411044441.GQ16661@ando.pearwood.info>
 <828fc8fa-f88d-4acf-73c5-5931fbeeaa96@kellett.im>
Message-ID:

On Thu, Apr 12, 2018 at 7:34 AM, Ed Kellett wrote:
> On 2018-04-12 14:46, Andrés Delfino wrote:
> > Extending the original idea, IMHO it would make sense for the dict
> > constructor to create a new dictionary not only from several
> > mappings, but mixing mappings and iterables too.
> >
> > Consider this example:
> >
> >     x = [(1, 'one')]
> >     y = {2: 'two'}
> >
> > Now: {**dict(x), **y}
> > Proposed: dict(x, y)
> >
> > I think this extension makes the call ostensibly easier to read and
> > grep.
>
> It allows for creating a flattened dict from an iterable of dicts,
> too, which I've occasionally wanted:
>
>     >>> configs = {'a': 'yes'}, {'b': 'no'}, {'c': 3}
>     >>> dict(*configs)
>     {'a': 'yes', 'b': 'no', 'c': 3}
>
> versus:
>
>     >>> dict(chain.from_iterable(c.items() for c in configs))
>     {'a': 'yes', 'b': 'no', 'c': 3}

Yes, this all sounds totally reasonable.

--
--Guido van Rossum (python.org/~guido)

From guido at python.org  Thu Apr 12 11:12:32 2018
From: guido at python.org (Guido van Rossum)
Date: Thu, 12 Apr 2018 08:12:32 -0700
Subject: [Python-ideas] Default values in multi-target assignment
In-Reply-To:
References:
Message-ID:

I hear where you're coming from but I really don't think we should do
this. If you don't have the right expectation already, it's hard to
guess what it means. I would much rather spend effort on a proper
matching statement.

On Thu, Apr 12, 2018 at 2:54 AM, Serhiy Storchaka wrote:
> Yet one crazy idea. What if we allow default values for targets in
> multi-target assignment?
>
>     >>> (a, b=0) = (1, 2)
>     >>> a, b
>     (1, 2)
>     >>> (a, b=0) = (1,)
>     >>> a, b
>     (1, 0)
>     >>> (a, b=0) = ()
>     Traceback (most recent call last):
>       File "", line 1, in
>     ValueError: not enough values to unpack (expected at least 1,
>     got 0)
>     >>> (a, b=0) = (1, 2, 3)
>     Traceback (most recent call last):
>       File "", line 1, in
>     ValueError: too many values to unpack (expected at most 2)
>
> Currently you need either to explicitly check the length of the
> right-hand part (if it is a sequence and not an arbitrary iterator),
>
>     if len(c) == 1:
>         a, = c
>         b = 0
>     elif len(c) == 2:
>         a, b = c
>     else:
>         raise TypeError
>
> or to use an intermediate function:
>
>     def f(a, b=0):
>         return a, b
>     a, b = f(*c)
>
> The latter can be written as an ugly one-liner:
>
>     a, b = (lambda a, b=0: (a, b))(*c)

--
--Guido van Rossum (python.org/~guido)

From storchaka at gmail.com  Thu Apr 12 11:15:58 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 12 Apr 2018 18:15:58 +0300
Subject: [Python-ideas] Accepting multiple mappings as positional
 arguments to create dicts
In-Reply-To: <828fc8fa-f88d-4acf-73c5-5931fbeeaa96@kellett.im>
References: <20180411044441.GQ16661@ando.pearwood.info>
 <828fc8fa-f88d-4acf-73c5-5931fbeeaa96@kellett.im>
Message-ID:

12.04.18 17:34, Ed Kellett wrote:
> It allows for creating a flattened dict from an iterable of dicts,
> too, which I've occasionally wanted:
>
>>>> configs = {'a': 'yes'}, {'b': 'no'}, {'c': 3}
>>>> dict(*configs)
> {'a': 'yes', 'b': 'no', 'c': 3}
>
> versus:
>
>>>> dict(chain.from_iterable(c.items() for c in configs))
> {'a': 'yes', 'b': 'no', 'c': 3}

Or {**x for x in configs}.

From storchaka at gmail.com  Thu Apr 12 11:45:51 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 12 Apr 2018 18:45:51 +0300
Subject: [Python-ideas] Accepting multiple mappings as positional
 arguments to create dicts
In-Reply-To:
References:
Message-ID:

09.04.18 00:18, Andrés Delfino wrote:
> I thought that maybe dict could accept several mappings as positional
> arguments, like this:
>
>     class Dict4(dict):
>         def __init__(self, *args, **kwargs):
>             if len(args) > 1:
>                 if not all([isinstance(arg, dict) for arg in args]):
>                     raise TypeError('Dict4 expected instances of dict
>                         since multiple positional arguments were
>                         passed')
>
>                 temp = args[0].copy()
>
>                 for arg in args[1:]:
>                     temp.update(arg)
>
>                 super().__init__(temp, **kwargs)
>             else:
>                 super().__init__(*args, **kwargs)
>
> AFAIK, this wouldn't create compatibility problems, since you can't
> pass two positional arguments now anyways.
>
> It would be useful to solve the "sum/union dicts" discussion, for
> example: requests.get(url, params=dict(params, {'foo': bar}))
>
> What are your thoughts?

It is easy to make the dict constructor merge several positional
arguments. But this is not a tiny harmless change; it will start a
cascade of other changes. After changing the dict constructor, we will
need to update the dict.update() method too. Constructors and update()
methods of dict subclasses (OrderedDict, defaultdict, Counter, and
more specialized classes) should be updated too.
UserDict, WeakKeyDictionary and WeakValueDictionary are next. After
that we will be under pressure to update the constructors and update()
methods of the abstract classes Mapping and MutableMapping. This change
will break a lot of third-party code that implements concrete
subclasses of these classes, because adding support for new arguments
in a method of an abstract class breaks the interface. We will be able
to walk this path (we have already walked it), but we must realize how
long it is.

From storchaka at gmail.com  Thu Apr 12 12:00:45 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 12 Apr 2018 19:00:45 +0300
Subject: [Python-ideas] Default values in multi-target assignment
In-Reply-To:
References:
Message-ID:

12.04.18 18:12, Guido van Rossum wrote:
> I hear where you're coming from but I really don't think we should do
> this. If you don't have the right expectation already, it's hard to
> guess what it means. I would much rather spend effort on a proper
> matching statement.

There are few applications of this syntax, and it is not totally new;
it originates from the syntax of function parameters. But I agree with
you. I prefer to keep the syntax of Python simpler.

From yaoxiansamma at gmail.com  Thu Apr 12 13:35:56 2018
From: yaoxiansamma at gmail.com (Thautwarm Zhao)
Date: Fri, 13 Apr 2018 01:35:56 +0800
Subject: [Python-ideas] Python-ideas Digest, Vol 137, Issue 67
In-Reply-To:
References:
Message-ID:

> Makes sense. However, couldn't you prevent that by giving with
> priority over the binding? As in "(with simple_cm) as value", where
> we consider the "as" as binding operator instead of part of the with
> statement? Sure, you could commit suicide by parenthesis, but by
> default it'd do exactly what the "with simple_cm as value" currently
> does. This does require use of as instead of :=, though. (which was
> the point I was trying to make, apologies for the confusion)

Does "(with simple_cm) as value" mean "with (simple_cm as value)"? If
so, it's impossible to give "with ... as ..." priority over the `as`
binding.

This is the grammar of the current syntax related to the with
statement:

    with_stmt: 'with' with_item (',' with_item)* ':' suite
    with_item: test ['as' expr]

If the `as` binding could be used in a general expression, then since
`test` is the top-level expression, an expression using the `as`
binding must sit inside a `test` structure. In other words, if you
write

    with expr as name:
        # do stuff

without doubt it's equivalent to `with (expr as name)`. Or you would
have to completely change the grammar design of CPython :)

thautwarm
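[A quick way to experiment with such grammar questions is to feed
candidate spellings to compile() and see which ones the running
interpreter accepts. A small sketch; which lines are accepted depends
entirely on the Python version used:]

    # Probe the grammar of the running interpreter with compile().
    candidates = [
        'with open("f") as g:\n    pass',             # always valid
        'while (x := input()) != "quit":\n    pass',  # PEP 572 spelling
        'while (input() as x) != "quit":\n    pass',  # 'as' spelling
    ]
    for src in candidates:
        try:
            compile(src, "<probe>", "exec")
            print("accepted:", src.splitlines()[0])
        except SyntaxError:
            print("rejected:", src.splitlines()[0])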
---------- > From: Chris Angelico > To: python-ideas > Cc: > Bcc: > Date: Thu, 12 Apr 2018 23:41:49 +1000 > Subject: Re: [Python-ideas] PEP 572: Assignment Expressions (post #4) > On Thu, Apr 12, 2018 at 11:31 PM, Jacco van Dorp > wrote: > > 2018-04-12 15:02 GMT+02:00 Nick Coghlan : > >> On 12 April 2018 at 22:22, Jacco van Dorp wrote: > >>> I've looked through PEP 343, contextlib docs ( > >>> https://docs.python.org/3/library/contextlib.html ), and I couldn't > >>> find a single case where "with (y := f(x))" would be invalid. > >> > >> Consider this custom context manager: > >> > >> @contextmanager > >> def simple_cm(): > >> yield 42 > >> > >> Given that example, the following code: > >> > >> with cm := simple_cm() as value: > >> print(cm.func.__name__, value) > >> > >> would print "'simple_cm 42", since the assignment expression would > >> reference the context manager itself, while the with statement binds > >> the yielded value. > >> > >> Another relevant example would be `contextlib.closing`: that returns > >> the passed in argument from __enter__, *not* self. > >> > >> And that's why earlier versions of PEP 572 (which used the "EXPR as > >> NAME" spelling) just flat out prohibited top level name binding > >> expressions in with statements: "with (expr as name):" and "with expr > >> as name:" were far too different semantically for the only syntactic > >> difference to be a surrounding set of parentheses. > >> > >> Cheers, > >> Nick. > > > > Makes sense. However, couldn't you prevent that by giving with > > priority over the binding ? As in "(with simple_cm) as value", where > > we consider the "as" as binding operator instead of part of the with > > statement ? Sure, you could commit suicide by parenthesis, but by > > default it'd do exactly what the "with simple_cm as value" currently > > does. This does require use of as instead of :=, though. (which was > > the point I was trying to make, apologies for the confusion) > > If you want this to be a generic name-binding operation, then no; most > objects cannot be used as context managers. You'll get an exception if > you try to use "with 1 as x:", for instance. > > As Nick mentioned, there are context managers that return something > other than 'self', and for those, "with expr as name:" has an > important meaning that cannot easily be captured with an assignment > operator. > > ChrisA > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yaoxiansamma at gmail.com Thu Apr 12 14:01:38 2018 From: yaoxiansamma at gmail.com (Thautwarm Zhao) Date: Fri, 13 Apr 2018 02:01:38 +0800 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) Message-ID: > > Makes sense. However, couldn't you prevent that by giving with > priority over the binding ? As in "(with simple_cm) as value", where > > we consider the "as" as binding operator instead of part of the with > > statement ? Sure, you could commit suicide by parenthesis, but by > > default it'd do exactly what the "with simple_cm as value" currently > > does. This does require use of as instead of :=, though. (which was > > the point I was trying to make, apologies for the confusion) > > Does "(with simple_cm) as value" means "with (simple_cm as value)"? > If so, it's impossible to let the priority of "with ... as ..." over `as` binding. 
> > This is the grammar of current syntax related to with statement: > > with_stmt: 'with' with_item (',' with_item)* ':' suite > with_item: test ['as' expr] > > If `as` binding could be used in a general expression, just as > `test` is the top of expression, an expression using `as` binding must be in the structure > `test`. > In other words, if you write > > with expr as name: > # do stuff > > Without doubt it's equivalent to `with (expr as name)`. > > Or you want to completely change the grammar design of CPython :) > > thautwarm Additionally, here is an evidence. I've just had a look at Chris Angelico's implementation about expression assignment, it's cool, however the problem is still raised. https://github.com/Rosuav/cpython/blob/0f237048b7665720b5165a40de0ed601c1e82c39/Grammar/Grammar `as` binding is added at line 111, obviously you cannot separate it from the `test` structure(because `test` is the top expr). testlist_comp: (test|star_expr) ( comp_for | 'as' NAME | (',' (test|star_expr))* [','] ) It seems that if we're to support expression assignment, `as` binding should be declined. To be honest I feel upset because I think `expr as name` is really cool and pythonic. thautwarm 2018-04-13 1:35 GMT+08:00 Thautwarm Zhao : > > Makes sense. However, couldn't you prevent that by giving with > > priority over the binding ? As in "(with simple_cm) as value", where > > we consider the "as" as binding operator instead of part of the with > > statement ? Sure, you could commit suicide by parenthesis, but by > > default it'd do exactly what the "with simple_cm as value" currently > > does. This does require use of as instead of :=, though. (which was > > the point I was trying to make, apologies for the confusion) > > Does "(with simple_cm) as value" means "with (simple_cm as value)"? > If so, it's impossible to let the priority of "with ... as ..." over `as` > binding. > > This is the grammar of current syntax related to with statement: > > with_stmt: 'with' with_item (',' with_item)* ':' suite > with_item: test ['as' expr] > > If `as` binding could be used in a general expression, just as > `test` is the top of expression, an expression using `as` binding must be > in the structure `test`. > In other words, if you write > > with expr as name: > # do stuff > > Without doubt it's equivalent to `with (expr as name)`. > > Or you want to completely change the grammar design of CPython :) > > thautwarm > > > > 2018-04-12 21:41 GMT+08:00 : > >> Send Python-ideas mailing list submissions to >> python-ideas at python.org >> >> To subscribe or unsubscribe via the World Wide Web, visit >> https://mail.python.org/mailman/listinfo/python-ideas >> or, via email, send a message with subject or body 'help' to >> python-ideas-request at python.org >> >> You can reach the person managing the list at >> python-ideas-owner at python.org >> >> When replying, please edit your Subject line so it is more specific >> than "Re: Contents of Python-ideas digest..." >> >> Today's Topics: >> >> 1. Re: PEP 572: Assignment Expressions (post #4) (Chris Angelico) >> 2. Re: PEP 572: Assignment Expressions (post #4) (Chris Angelico) >> 3. Re: PEP 572: Assignment Expressions (post #4) (Nick Coghlan) >> 4. Re: PEP 572: Assignment Expressions (post #4) (Jacco van Dorp) >> 5. Re: PEP 572: Assignment Expressions (post #4) (Chris Angelico) >> >> >> ---------- ????? 
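The `contextlib.closing` example mentioned earlier in the thread can be checked with a short, runnable snippet; it shows why binding the whole with-expression and binding the `as` target are two different things:

    import contextlib
    import io

    buf = io.StringIO("payload")
    cm = contextlib.closing(buf)
    with cm as f:
        print(f is buf)   # True: 'as' binds what __enter__ returned
        print(f is cm)    # False: a hypothetical 'cm := closing(buf)' would bind this
    print(buf.closed)     # True: the wrapped object was closed on exit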
---------- >> From: Chris Angelico >> To: python-ideas >> Cc: >> Bcc: >> Date: Thu, 12 Apr 2018 23:08:06 +1000 >> Subject: Re: [Python-ideas] PEP 572: Assignment Expressions (post #4) >> On Thu, Apr 12, 2018 at 9:09 PM, Paul Moore wrote: >> > On 11 April 2018 at 22:28, Chris Angelico wrote: >> >> On Thu, Apr 12, 2018 at 1:22 AM, Nick Coghlan >> wrote: >> >>> This argument will be strengthened by making the examples used in the >> >>> PEP itself more attractive, as well as proposing suitable additions to >> >>> PEP 8, such as: >> >>> >> >>> 1. If either assignment statements or assignment expressions can be >> >>> used, prefer statements >> >>> 2. If using assignment expressions would lead to ambiguity about >> >>> execution order, restructure to use statements instead >> >> >> >> Fair enough. Also adding that chained assignment expressions should >> >> generally be avoided. >> > >> > Another one I think should be included (I'm a bit sad that it's not so >> > obvious that no-one would ever even think of it, but the current >> > discussion pretty much killed that hope for me). >> > >> > * Assignment expressions should never be used standalone - assignment >> > statements should *always* be used in that case. >> >> That's covered by the first point. If it's a standalone statement, >> then the statement form could be used, ergo you should prefer the >> statement form. >> >> ChrisA >> >> >> >> ---------- ????? ---------- >> From: Chris Angelico >> To: python-ideas >> Cc: >> Bcc: >> Date: Thu, 12 Apr 2018 23:14:34 +1000 >> Subject: Re: [Python-ideas] PEP 572: Assignment Expressions (post #4) >> On Thu, Apr 12, 2018 at 7:21 PM, Kirill Balunov >> wrote: >> > >> > I gain readability! I don't see any reason to use it in other >> contexts... >> > Because it makes the code unreadable and difficult to perceive while >> giving >> > not so much benefit. I may be wrong, but so far I have not seen a >> single >> > example that at least slightly changed my mind. >> >> This is, in effect, your entire argument for permitting assignments >> only in certain contexts. "I can't think of any useful reason for >> doing this, so we shouldn't do it". But that means making the language >> grammar more complicated (both in the technical sense of the parser's >> definitions, and in the colloquial sense of how you'd explain Python >> to a new programmer), because there are these magic constructs that >> can be used anywhere in an expression, but ONLY if that expression is >> inside an if or while statement. You lose the ability to refactor your >> code simply to satisfy an arbitrary restriction to appease someone's >> feeling of "it can't be useful anywhere else". >> >> There are basically two clean ways to do this: >> >> 1) Create actual syntax as part of the while statement, in the same >> way that the 'with EXPR as NAME:' statement does. This means you >> cannot put any additional operators after the 'as NAME' part. It's as >> much a part of the statement's syntax as the word 'in' is in a for >> loop. >> >> 2) Make this a feature of expressions in general. Then they can be >> used anywhere that an expression can be. >> >> I've gone for option 2. If you want to push for option 1, go ahead, >> but it's a nerfed solution just because you personally cannot think of >> any good use for this. >> >> ChrisA >> >> >> >> ---------- ????? 
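For readers weighing Chris's two options, the classic loop-and-a-half shows what is at stake. In current Python the binding forces either duplication or a break; a self-contained sketch (io.StringIO and print stand in for a real file and real work):

    import io
    f = io.StringIO("spam\neggs\n")   # stand-in for a real file
    process = print                    # stand-in for real work

    line = f.readline()
    while line:              # today: bind, test, then rebind at the bottom
        process(line)
        line = f.readline()

    f.seek(0)                # rewind for the alternative spelling
    while True:              # or the while-True-break idiom
        line = f.readline()
        if not line:
            break
        process(line)

Under the proposed spelling (not valid in any released Python at the time of these messages) both collapse to "while line := f.readline():", and option 2 additionally allows the same binding anywhere an expression can appear.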
---------- >> From: Nick Coghlan >> To: Chris Angelico >> Cc: python-ideas >> Bcc: >> Date: Thu, 12 Apr 2018 23:19:59 +1000 >> Subject: Re: [Python-ideas] PEP 572: Assignment Expressions (post #4) >> On 12 April 2018 at 07:28, Chris Angelico wrote: >> > On Thu, Apr 12, 2018 at 1:22 AM, Nick Coghlan >> wrote: >> >>> Frequently Raised Objections >> >>> ============================ >> >> >> >> There needs to be a subsection here regarding the need to call `del` >> >> at class and module scope, just as there is for loop iteration >> >> variables at those scopes. >> > >> > Hmm, I'm not sure I follow. Are you saying that this is an objection >> > to assignment expressions, or an objection to them not being >> > statement-local? If the latter, it's really more about "rejected >> > alternative proposals". >> >> It's both - accidentally polluting class and module namespaces is an >> argument against expression level assignments in general, and sublocal >> namespaces aimed to eliminate that downside. >> >> Since feedback on the earlier versions of the PEP has moved sublocal >> namespaces into the "rejected due to excessive conceptual complexity" >> box, that means accidental namespace pollution comes back as a >> downside that the PEP should mention. >> >> I don't think it needs to say much, just point out that they share the >> downside of regular for loops: if you use one at class or module >> scope, and don't want to export the name, you need to delete it >> explicitly. >> >> Cheers, >> Nick. >> >> -- >> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >> >> >> >> ---------- ????? ---------- >> From: Jacco van Dorp >> To: python-ideas >> Cc: >> Bcc: >> Date: Thu, 12 Apr 2018 15:31:59 +0200 >> Subject: Re: [Python-ideas] PEP 572: Assignment Expressions (post #4) >> 2018-04-12 15:02 GMT+02:00 Nick Coghlan : >> > On 12 April 2018 at 22:22, Jacco van Dorp wrote: >> >> I've looked through PEP 343, contextlib docs ( >> >> https://docs.python.org/3/library/contextlib.html ), and I couldn't >> >> find a single case where "with (y := f(x))" would be invalid. >> > >> > Consider this custom context manager: >> > >> > @contextmanager >> > def simple_cm(): >> > yield 42 >> > >> > Given that example, the following code: >> > >> > with cm := simple_cm() as value: >> > print(cm.func.__name__, value) >> > >> > would print "'simple_cm 42", since the assignment expression would >> > reference the context manager itself, while the with statement binds >> > the yielded value. >> > >> > Another relevant example would be `contextlib.closing`: that returns >> > the passed in argument from __enter__, *not* self. >> > >> > And that's why earlier versions of PEP 572 (which used the "EXPR as >> > NAME" spelling) just flat out prohibited top level name binding >> > expressions in with statements: "with (expr as name):" and "with expr >> > as name:" were far too different semantically for the only syntactic >> > difference to be a surrounding set of parentheses. >> > >> > Cheers, >> > Nick. >> >> Makes sense. However, couldn't you prevent that by giving with >> priority over the binding ? As in "(with simple_cm) as value", where >> we consider the "as" as binding operator instead of part of the with >> statement ? Sure, you could commit suicide by parenthesis, but by >> default it'd do exactly what the "with simple_cm as value" currently >> does. This does require use of as instead of :=, though. (which was >> the point I was trying to make, apologies for the confusion) >> >> >> >> ---------- ????? 
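Nick's `simple_cm` example is easy to verify today by binding the context manager with an ordinary statement; the two objects involved really are different:

    from contextlib import contextmanager

    @contextmanager
    def simple_cm():
        yield 42

    cm = simple_cm()          # the context-manager object itself
    with cm as value:         # 'value' is what __enter__ returned
        print(cm.func.__name__, value)   # -> simple_cm 42

So a top-level "cm := simple_cm()" in the with header and the "as value" clause would name two different things, which is exactly the trap under discussion.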
---------- >> From: Chris Angelico >> To: python-ideas >> Cc: >> Bcc: >> Date: Thu, 12 Apr 2018 23:41:49 +1000 >> Subject: Re: [Python-ideas] PEP 572: Assignment Expressions (post #4) >> On Thu, Apr 12, 2018 at 11:31 PM, Jacco van Dorp >> wrote: >> > 2018-04-12 15:02 GMT+02:00 Nick Coghlan : >> >> On 12 April 2018 at 22:22, Jacco van Dorp >> wrote: >> >>> I've looked through PEP 343, contextlib docs ( >> >>> https://docs.python.org/3/library/contextlib.html ), and I couldn't >> >>> find a single case where "with (y := f(x))" would be invalid. >> >> >> >> Consider this custom context manager: >> >> >> >> @contextmanager >> >> def simple_cm(): >> >> yield 42 >> >> >> >> Given that example, the following code: >> >> >> >> with cm := simple_cm() as value: >> >> print(cm.func.__name__, value) >> >> >> >> would print "'simple_cm 42", since the assignment expression would >> >> reference the context manager itself, while the with statement binds >> >> the yielded value. >> >> >> >> Another relevant example would be `contextlib.closing`: that returns >> >> the passed in argument from __enter__, *not* self. >> >> >> >> And that's why earlier versions of PEP 572 (which used the "EXPR as >> >> NAME" spelling) just flat out prohibited top level name binding >> >> expressions in with statements: "with (expr as name):" and "with expr >> >> as name:" were far too different semantically for the only syntactic >> >> difference to be a surrounding set of parentheses. >> >> >> >> Cheers, >> >> Nick. >> > >> > Makes sense. However, couldn't you prevent that by giving with >> > priority over the binding ? As in "(with simple_cm) as value", where >> > we consider the "as" as binding operator instead of part of the with >> > statement ? Sure, you could commit suicide by parenthesis, but by >> > default it'd do exactly what the "with simple_cm as value" currently >> > does. This does require use of as instead of :=, though. (which was >> > the point I was trying to make, apologies for the confusion) >> >> If you want this to be a generic name-binding operation, then no; most >> objects cannot be used as context managers. You'll get an exception if >> you try to use "with 1 as x:", for instance. >> >> As Nick mentioned, there are context managers that return something >> other than 'self', and for those, "with expr as name:" has an >> important meaning that cannot easily be captured with an assignment >> operator. >> >> ChrisA >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Thu Apr 12 14:14:24 2018 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 13 Apr 2018 04:14:24 +1000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: Message-ID: On Fri, Apr 13, 2018 at 4:01 AM, Thautwarm Zhao wrote: >> > Makes sense. However, couldn't you prevent that by giving with >> priority over the binding ? As in "(with simple_cm) as value", where >> > we consider the "as" as binding operator instead of part of the with >> > statement ? Sure, you could commit suicide by parenthesis, but by >> > default it'd do exactly what the "with simple_cm as value" currently >> > does. This does require use of as instead of :=, though. 
(which was >> > the point I was trying to make, apologies for the confusion) >> >> Does "(with simple_cm) as value" means "with (simple_cm as value)"? >> If so, it's impossible to let the priority of "with ... as ..." over `as` >> binding. >> >> This is the grammar of current syntax related to with statement: >> >> with_stmt: 'with' with_item (',' with_item)* ':' suite >> with_item: test ['as' expr] >> >> If `as` binding could be used in a general expression, just as >> `test` is the top of expression, an expression using `as` binding must be >> in the structure > `test`. >> In other words, if you write >> >> with expr as name: >> # do stuff >> >> Without doubt it's equivalent to `with (expr as name)`. >> >> Or you want to completely change the grammar design of CPython :) >> >> thautwarm > > Additionally, here is an evidence. > > I've just had a look at Chris Angelico's implementation about expression > assignment, it's cool, however the problem is still raised. > > https://github.com/Rosuav/cpython/blob/0f237048b7665720b5165a40de0ed601c1e82c39/Grammar/Grammar > > `as` binding is added at line 111, obviously you cannot separate it from the > `test` structure(because `test` is the top expr). > > testlist_comp: (test|star_expr) ( comp_for | 'as' NAME | (',' > (test|star_expr))* [','] ) > > It seems that if we're to support expression assignment, `as` binding should > be declined. > To be honest I feel upset because I think `expr as name` is really cool and > pythonic. > You're looking at a very early commit there. I suggest looking at the most recent commits on one of two branches: https://github.com/Rosuav/cpython/blob/statement-local-variables/Grammar/Grammar https://github.com/Rosuav/cpython/blob/assignment-expressions/Grammar/Grammar Those are the two most recent states in my progress towards (a) statement-local name bindings with "EXPR as NAME", and (b) assignment expressions with "target := value". ChrisA From dmoisset at machinalis.com Thu Apr 12 14:32:05 2018 From: dmoisset at machinalis.com (Daniel Moisset) Date: Thu, 12 Apr 2018 19:32:05 +0100 Subject: [Python-ideas] Move optional data out of pyc files In-Reply-To: References: Message-ID: I've been playing a bit with this trying to collect some data and measure how useful this would be. You can take a look at the script I'm using at: https://github.com/dmoisset/pycstats What I'm measuring is: 1. Number of objects in the pyc, and how many of those are: * docstrings (I'm using a heuristic here which I'm not 100% sure it is correct) * lnotabs * Duplicate objects; these have not been discussed in this thread before but are another source of optimization I noticed while writing this. Essentially I'm refering to immutable constants that are instanced more than once and could be shared. You can also measure the effect of this optimization across modules and within a single module[1] 2. Bytes used in memory by the categories above (sum of sys.getsizeof() for each category). I'm not measuring anything related to annotations because, as I mentioned before, they are generated piecemeal by executable bytecode so they are hard to separate Running this on my python 3.6 pyc cache I get: $ find /usr/lib/python3.6 -name '*.pyc' |xargs python3.6 pycstats.py 8645 docstrings, 1705441B 19060 lineno tables, 941702B 59382/202898 duplicate objects for 3101287/18582807 memory size So this means around ~10% of the memory used after loading is used for docstrings, ~5% for lnotabs, and ~15% for objects that could be shared. 
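For the curious, here is a rough sketch of the kind of duplicate measurement described above (illustrative only, and deliberately simpler than the actual pycstats script): walk co_consts recursively and count constants that compare equal but are distinct objects.

    import types

    def iter_consts(code):
        # recursively yield non-code constants from a code object
        for const in code.co_consts:
            if isinstance(const, types.CodeType):
                yield from iter_consts(const)
            else:
                yield const

    def duplicate_count(code):
        seen = {}
        dupes = 0
        for const in iter_consts(code):
            try:
                first = seen.setdefault((type(const), const), const)
            except TypeError:    # unhashable constant, skip it
                continue
            if first is not const:
                dupes += 1       # equal value, distinct object: shareable
        return dupes

    # usage sketch, with a hypothetical source file:
    # duplicate_count(compile(open("mod.py").read(), "mod.py", "exec"))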
The sharing assumes we can share between modules, but even doing it within modules, you can get to ~7%.

In short, this could mean a 25%-35% reduction in memory use for code objects if the stdlib is a good benchmark.

Best,
D.

[1] Regarding duplicates, I've found some unexpected things within loaded code objects, for example instances of the small integer "1" with different id() than the singleton that cpython normally uses for "1", although most duplicates are some small strings, tuples with argument names, or . Something that could be interesting to write is a "pyc optimizer" that removes duplicates, this should be a gain at a minimal preprocessing cost.

On 12 April 2018 at 15:16, Daniel Moisset wrote:

> One implementation difficulty specifically related to annotations, is that
> they are quite hard to find/extract from the code objects. Both docstrings
> and lnotab are within specific fields of the code object for their
> function/class/module; annotations are spread as individual constants
> (assuming PEP 563), which are loaded in bytecode through separate
> LOAD_CONST statements before creating the function object, and that can
> happen in the middle of bytecode for the higher level object (the module or
> class containing a function definition). So the change for achieving that
> will be more significant than just "add a couple of descriptors to function
> objects and change the module marshalling code".
>
> Probably making annotations fit a single structure that can live in
> co_consts could make this change easier, and also make startup of annotated
> modules faster (because you just load a single constant instead of one per
> argument), this might be a valuable change by itself.
>
> On 12 April 2018 at 11:48, INADA Naoki wrote:
>
>> > Finally, loading docstrings and other optional components can be made
>> lazy.
>> > This was not in my original idea, and this will significantly
>> complicate the
>> > implementation, but in principle it is possible. This will require
>> larger
>> > changes in the marshal format and bytecode.
>>
>> I'm +1 on this idea.
>>
>> * New pyc format has code section (same as current) and text section.
>> text section stores UTF-8 strings and not loaded at import time.
>> * Function annotation (only when PEP 563 is used) and docstring are
>> stored as integer, point to offset in the text section.
>> * When type.__doc__, PyFunction.__doc__, PyFunction.__annotation__ are
>> integer, text is loaded from the text section lazily.
>>
>> PEP 563 will reduce some startup time, but __annotation__ is still
>> dict. Memory overhead is negligible.
>>
>> In [1]: def foo(a: int, b: int) -> int:
>> ...: return a + b
>> ...:
>> ...:
>>
>> In [2]: import sys
>> In [3]: sys.getsizeof(foo)
>> Out[3]: 136
>>
>> In [4]: sys.getsizeof(foo.__annotations__)
>> Out[4]: 240
>>
>> When PEP 563 is used, there are no side effects while building the
>> annotation.
>> So the annotation can be serialized in text, like
>> {"a":"int","b":"int","return":"int"}.
>>
>> This change will require new pyc format, and descriptor for
>> PyFunction.__doc__, PyFunction.__annotation__
>> and type.__doc__.
>>
>> Regards,
>>
>> --
>> INADA Naoki
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
> --
> Daniel F.
Moisset - UK Country Manager - Machinalis Limited > www.machinalis.co.uk > Skype: @dmoisset T: + 44 7398 827139 > > 1 Fore St, London, EC2Y 9DT > > Machinalis Limited is a company registered in England and Wales. > Registered number: 10574987. > -- Daniel F. Moisset - UK Country Manager - Machinalis Limited www.machinalis.co.uk Skype: @dmoisset T: + 44 7398 827139 1 Fore St, London, EC2Y 9DT Machinalis Limited is a company registered in England and Wales. Registered number: 10574987. -------------- next part -------------- An HTML attachment was scrubbed... URL: From python-ideas at mgmiller.net Thu Apr 12 15:15:56 2018 From: python-ideas at mgmiller.net (Mike Miller) Date: Thu, 12 Apr 2018 12:15:56 -0700 Subject: [Python-ideas] Accepting multiple mappings as positional arguments to create dicts In-Reply-To: References: <20180411044441.GQ16661@ando.pearwood.info> Message-ID: <838a4818-a530-f112-9505-b5f2f09d9a91@mgmiller.net> While we're on the subject, I've tried to add dicts a few times over the years to get a new one but it doesn't work: d3 = d1 + d2 # TypeError Thinking a bit, set union is probably a better analogue, but it doesn't work either: d3 = d1 | d2 # TypeError Where the last value of any duplicate keys should win. -Mike On 2018-04-12 06:46, Andr?s Delfino wrote: > Extending the original idea, IMHO it would make sense for the dict constructor > to create a new dictionary not only from several mappings, but mixing mappings > and iterables too. > > Consider this example: > > x = [(1, 'one')] > y = {2: 'two'} > > Now: {**dict(x), **y} > Proposed: dict(x, y) From peter.ed.oconnor at gmail.com Thu Apr 12 15:37:27 2018 From: peter.ed.oconnor at gmail.com (Peter O'Connor) Date: Thu, 12 Apr 2018 15:37:27 -0400 Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin In-Reply-To: References: <20180406011854.GU16661@ando.pearwood.info> <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com> <20180410173255.GM16661@ando.pearwood.info> <20180411034115.GP16661@ando.pearwood.info> Message-ID: On Wed, Apr 11, 2018 at 10:50 AM, Paul Moore wrote: > In particular, I'm happiest with the named moving_average() function, > which may reflect to some extent my lack of familiarity with the > subject area. I don't *care* how it's implemented internally - an > explicit loop is fine with me, but if a domain expert wants to be > clever and use something more complex, I don't need to know. An often > missed disadvantage of one-liners is that they get put inline, meaning > that people looking for a higher level overview of what the code does > get confronted with all the gory details. I'm all in favour of hiding things away into functions - I just think those functions should be as basic as possible, without implicit assumptions about how they will be used. Let me give an example: ---- Lets look at your preferred method (A): def moving_average(signal_iterable, decay, initial=0): last_average = initial for x in signal_iterable: last_average = (1-decay)*last_average + decay*x yield last_average moving_average_gen = moving_average(signal, decay=decay, initial=initial) And compare it with (B), which would require the proposed syntax: def moving_average_step(last_average, x, decay): return (1-decay)*last_average + decay*x moving_average_gen = (average:= moving_average_step(average, x, decay=decay) for x in signal from x=initial) ----- Now, suppose we want to change things so that the "decay" changes with every step. 
The moving_average function (A) now has to be changed, because what we once thought would be a fixed parameter is now a variable that changes between calls. Our options are: - Make "decay" another iterable (in which case other functions calling "moving_average" need to be changed). - Leave an option for "decay" to be a float which gets transformed to an iterable with "decay_iter = (decay for _ in itertools.count(0)) if isinstance(decay, (int, float)) else decay". (awkward because 95% of usages don't need this. If you do this for more parameters you suddenly have this weird implementation with iterators everywhere even though in most cases they're not needed). - Factor out the "pure" "moving_average_step" from "moving_average", and create a new "moving_average_with_dynamic_decay" wrapper (but now we have to maintain two wrappers - with the duplicated arguments - which starts to require a lot of maintenance when you're passing down several parameters (or you can use the dreaded **kwargs). With approach (B) on the other hand, "moving_average_step" and all the functions calling it, can stay the same: we just change the way we call it in this instance to: moving_average_gen = (average:= moving_average_step(average, x, decay=decay) for x, decay in zip(signal, decay_schedule) from x=initial) ---- Now lets imagine this were a more complex function with 10 parameters. I see these kind of examples a lot in machine-learning and robotics programs, where you'll have parameters like "learning rate", "regularization", "minibatch_size", "maximum_speed", "height_of_camera" which might initially be considered initialization parameters, but then later it turns out they need to be changed dynamically. This is why I think the "(y:=f(y, x) for x in xs from y=initial)" syntax can lead to cleaner, more maintainable code. On Wed, Apr 11, 2018 at 10:50 AM, Paul Moore wrote: > On 11 April 2018 at 15:37, Peter O'Connor > wrote: > > > If people are happy with these solutions and still see no need for the > > initialization syntax, we can stop this, but as I see it there is a > "hole" > > in the language that needs to be filled. > > Personally, I'm happy with those solutions and see no need for the > initialisation syntax. > > In particular, I'm happiest with the named moving_average() function, > which may reflect to some extent my lack of familiarity with the > subject area. I don't *care* how it's implemented internally - an > explicit loop is fine with me, but if a domain expert wants to be > clever and use something more complex, I don't need to know. An often > missed disadvantage of one-liners is that they get put inline, meaning > that people looking for a higher level overview of what the code does > get confronted with all the gory details. > > Paul > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From adelfino at gmail.com Thu Apr 12 15:32:41 2018 From: adelfino at gmail.com (=?UTF-8?Q?Andr=C3=A9s_Delfino?=) Date: Thu, 12 Apr 2018 16:32:41 -0300 Subject: [Python-ideas] Accepting multiple mappings as positional arguments to create dicts In-Reply-To: <838a4818-a530-f112-9505-b5f2f09d9a91@mgmiller.net> References: <20180411044441.GQ16661@ando.pearwood.info> <838a4818-a530-f112-9505-b5f2f09d9a91@mgmiller.net> Message-ID: There's a long thread about the subject: https://mail.python.org/pipermail/python-ideas/2015-February/031748.html I suggest to avoid the matter altogether :) On Thu, Apr 12, 2018 at 4:15 PM, Mike Miller wrote: > While we're on the subject, I've tried to add dicts a few times over the > years to get a new one but it doesn't work: > > d3 = d1 + d2 # TypeError > > Thinking a bit, set union is probably a better analogue, but it doesn't > work either: > > d3 = d1 | d2 # TypeError > > Where the last value of any duplicate keys should win. > > -Mike > > > > On 2018-04-12 06:46, Andr?s Delfino wrote: > >> Extending the original idea, IMHO it would make sense for the dict >> constructor to create a new dictionary not only from several mappings, but >> mixing mappings and iterables too. >> >> Consider this example: >> >> x = [(1, 'one')] >> y = {2: 'two'} >> >> Now: {**dict(x), **y} >> Proposed: dict(x, y) >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From adelfino at gmail.com Thu Apr 12 15:42:26 2018 From: adelfino at gmail.com (=?UTF-8?Q?Andr=C3=A9s_Delfino?=) Date: Thu, 12 Apr 2018 16:42:26 -0300 Subject: [Python-ideas] Accepting multiple mappings as positional arguments to create dicts In-Reply-To: References: Message-ID: I think the update method can (and personally, should) stay unchanged: spam.update(dict(x, y)) seems succinct and elegant enough, with the proposed constructor syntax. Sorry my ignorance, do (Mutable)Mapping ABC say anything about constructors? On Thu, Apr 12, 2018 at 12:45 PM, Serhiy Storchaka wrote: > 09.04.18 00:18, Andr?s Delfino ????: > >> I thought that maybe dict could accept several mappings as positional >> arguments, like this: >> >> class Dict4(dict): >> def __init__(self, *args, **kwargs): >> if len(args) > 1: >> if not all([isinstance(arg, dict) for arg in args]): >> raise TypeError('Dict4 expected instances of dict >> since multiple positional arguments were passed') >> >> temp = args[0].copy() >> >> for arg in args[1:]: >> temp.update(arg) >> >> super().__init__(temp, **kwargs) >> else: >> super().__init__(*args, **kwargs) >> >> >> AFAIK, this wouldn't create compatibility problems, since you can't pass >> two positional arguments now anyways. >> >> It would be useful to solve the "sum/union dicts" discussion, for >> example: requests.get(url, params=dict(params, {'foo': bar}) >> >> Whar are your thoughts? >> > > It is easy to make the dict constructor merging several positional > arguments. But this is not a tiny harmless change, it will start a cascade > of other changes. > > After changing the dict constructor, we will need to update the > dict.update() method too. Constructors and update() methods of dict > subclasses (OrderedDict, defaultdict, Counter, and more specialized > classes) should be updated too. UserDict, WeakKeyDictionary, > WeakValueDictionary are next. 
After that we will have a pressure of > updating constructors and update() methods of abstract classes Mapping and > MutableMapping. This change will break a lot of third-party code that > implement concrete implementations of these classes, because adding support > of new arguments in the method of abstract class breaks an interface. > > We will be able to pass this path (we have already passed it), but we must > realize how long it is. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.ed.oconnor at gmail.com Thu Apr 12 15:43:54 2018 From: peter.ed.oconnor at gmail.com (Peter O'Connor) Date: Thu, 12 Apr 2018 15:43:54 -0400 Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin In-Reply-To: References: <20180406011854.GU16661@ando.pearwood.info> <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com> <20180410173255.GM16661@ando.pearwood.info> <20180411034115.GP16661@ando.pearwood.info> Message-ID: * correction to example: moving_average_gen = (average:= moving_average_step(average, x, decay=decay) for x in signal from average=initial) On Thu, Apr 12, 2018 at 3:37 PM, Peter O'Connor wrote: > On Wed, Apr 11, 2018 at 10:50 AM, Paul Moore wrote: > >> In particular, I'm happiest with the named moving_average() function, >> which may reflect to some extent my lack of familiarity with the >> subject area. I don't *care* how it's implemented internally - an >> explicit loop is fine with me, but if a domain expert wants to be >> clever and use something more complex, I don't need to know. An often >> missed disadvantage of one-liners is that they get put inline, meaning >> that people looking for a higher level overview of what the code does >> get confronted with all the gory details. > > > I'm all in favour of hiding things away into functions - I just think > those functions should be as basic as possible, without implicit > assumptions about how they will be used. Let me give an example: > > ---- > > Lets look at your preferred method (A): > > def moving_average(signal_iterable, decay, initial=0): > last_average = initial > for x in signal_iterable: > last_average = (1-decay)*last_average + decay*x > yield last_average > > moving_average_gen = moving_average(signal, decay=decay, > initial=initial) > > And compare it with (B), which would require the proposed syntax: > > def moving_average_step(last_average, x, decay): > return (1-decay)*last_average + decay*x > > moving_average_gen = (average:= moving_average_step(average, x, > decay=decay) for x in signal from x=initial) > > ----- > > Now, suppose we want to change things so that the "decay" changes with > every step. > > The moving_average function (A) now has to be changed, because what we > once thought would be a fixed parameter is now a variable that changes > between calls. Our options are: > - Make "decay" another iterable (in which case other functions calling > "moving_average" need to be changed). > - Leave an option for "decay" to be a float which gets transformed to an > iterable with "decay_iter = (decay for _ in itertools.count(0)) if > isinstance(decay, (int, float)) else decay". (awkward because 95% of > usages don't need this. 
If you do this for more parameters you suddenly > have this weird implementation with iterators everywhere even though in > most cases they're not needed). > - Factor out the "pure" "moving_average_step" from "moving_average", and > create a new "moving_average_with_dynamic_decay" wrapper (but now we have > to maintain two wrappers - with the duplicated arguments - which starts to > require a lot of maintenance when you're passing down several parameters > (or you can use the dreaded **kwargs). > > With approach (B) on the other hand, "moving_average_step" and all the > functions calling it, can stay the same: we just change the way we call it > in this instance to: > > moving_average_gen = (average:= moving_average_step(average, x, > decay=decay) for x, decay in zip(signal, decay_schedule) from x=initial) > > ---- > > Now lets imagine this were a more complex function with 10 parameters. I > see these kind of examples a lot in machine-learning and robotics programs, > where you'll have parameters like "learning rate", "regularization", > "minibatch_size", "maximum_speed", "height_of_camera" which might initially > be considered initialization parameters, but then later it turns out they > need to be changed dynamically. > > This is why I think the "(y:=f(y, x) for x in xs from y=initial)" syntax > can lead to cleaner, more maintainable code. > > > > On Wed, Apr 11, 2018 at 10:50 AM, Paul Moore wrote: > >> On 11 April 2018 at 15:37, Peter O'Connor >> wrote: >> >> > If people are happy with these solutions and still see no need for the >> > initialization syntax, we can stop this, but as I see it there is a >> "hole" >> > in the language that needs to be filled. >> >> Personally, I'm happy with those solutions and see no need for the >> initialisation syntax. >> >> In particular, I'm happiest with the named moving_average() function, >> which may reflect to some extent my lack of familiarity with the >> subject area. I don't *care* how it's implemented internally - an >> explicit loop is fine with me, but if a domain expert wants to be >> clever and use something more complex, I don't need to know. An often >> missed disadvantage of one-liners is that they get put inline, meaning >> that people looking for a higher level overview of what the code does >> get confronted with all the gory details. >> >> Paul >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mal at egenix.com Thu Apr 12 15:46:34 2018 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 12 Apr 2018 21:46:34 +0200 Subject: [Python-ideas] Move optional data out of pyc files In-Reply-To: References: Message-ID: <15602430-b133-239a-1aa4-b3bc9973f44a@egenix.com> I think moving data out of pyc files is going in a wrong direction: more stat calls means slower import and slower startup time. Trying to make pycs smaller also isn't really worth it (they compress quite well). Saving memory could be done by disabling reading objects lazily from the file - without removing anything from the pyc file. Whether the few 100kB RAM this saves is worth the effort depends on the application space. This leaves the proposal to restructure pyc files into a sectioned file and possibly indexed file to make access to (lazily) loaded parts faster. More structure would add ways to more easily update the content going forward (similar to how PE executable files are structured) and allow us to get rid of extra pyc file variants (e.g. for special optimized versions). 
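To make the sectioned-file idea concrete, here is one possible shape such a format could take; this is purely hypothetical, not an existing or proposed CPython layout. A small section table after the fixed file header would let a loader locate (and lazily load or skip) docstrings and line-number tables without unmarshalling everything:

    import struct

    HEADER_SIZE = 16                    # assumed size of the fixed pyc header
    SECTION = struct.Struct("<4sII")    # tag, byte offset, byte length

    def read_section_table(data):
        count, = struct.unpack_from("<I", data, HEADER_SIZE)
        table, pos = {}, HEADER_SIZE + 4
        for _ in range(count):
            tag, offset, length = SECTION.unpack_from(data, pos)
            table[tag] = (offset, length)   # e.g. b'CODE', b'DOCS', b'LNOT'
            pos += SECTION.size
        return table

A loader could then unmarshal the b'CODE' section eagerly and resolve b'DOCS' or b'LNOT' entries lazily on first access.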
So that's an interesting approach :-) BTW: In all this, please remember that quite a few applications do use doc strings as part of the code, not only for documentation. Most prominent are probably parsers which keep the parsing definitions in doc strings. On 12.04.2018 20:32, Daniel Moisset wrote: > I've been playing a bit with this trying to collect some data and > measure how useful this would be. You can take a look at the script I'm > using at:?https://github.com/dmoisset/pycstats? > > What I'm measuring is: > 1. Number of objects in the pyc, and how many of those are: > ? ?* docstrings (I'm using a heuristic here which I'm not 100% sure it > is correct) > ? ?* lnotabs > ? ?* Duplicate objects; these have not been discussed in this thread > before but are another source of optimization I noticed while writing > this. Essentially I'm refering to immutable constants that are instanced > more than once and could be shared. You can also measure the effect of > this optimization across modules and within a single module[1] > 2. Bytes used in memory by the categories above (sum of sys.getsizeof() > for each category). > > I'm not measuring anything related to annotations because, as I > mentioned before, they are generated piecemeal by executable bytecode so > they are hard to separate > > Running this on my python 3.6 pyc cache I get: > > $ find /usr/lib/python3.6 -name '*.pyc' |xargs python3.6 pycstats.py? > 8645 docstrings, 1705441B > 19060 lineno tables, 941702B > 59382/202898 duplicate objects for 3101287/18582807 memory size > > So this means around ~10% of the memory used after loading is used for > docstrings, ~5% for lnotabs, and ~15% for objects that could be shared. > The sharing assumes we can share betwwen modules, but even doing it > within modules, you can get to ~7%.? > > In short, this could mean a 25%-35% reduction in memory use for code > objects if the stdlib is a good benchmark. > > Best, > D. > > [1] Regarding duplicates, I've found some unexpected things within > loaded code objects, for example instances of the small integer "1" with > different id() than the singleton that cpython normally uses for "1", > although most duplicates are some small strings, tuples with argument > names, or . Something that could be interesting to write is a "pyc > optimizer" that removes duplicates, this should be a gain at a minimal > preprocessing cost. > > > On 12 April 2018 at 15:16, Daniel Moisset > wrote: > > One implementation difficulty specifically related to annotations, > is that they are quite hard to find/extract from the code objects. > Both docstrings and lnotab are within specific fields of the code > object for their function/class/module; annotations are spread as > individual constants (assuming PEP 563), which are loaded in > bytecode through separate LOAD_CONST statements before creating the > function object, and that can happen in the middle of bytecode for > the higher level object (the module or class containing a function > definition). So the change for achieving that will be more > significant than just "add a couple of descriptors to function > objects and change the module marshalling code". > > Probably making annotations fit a single structure that can live in > co_consts could make this change easier, and also make startup of > annotated modules faster (because you just load a single constant > instead of one per argument), this might be a valuable change by itself. 
> > > > On 12 April 2018 at 11:48, INADA Naoki > wrote: > > > Finally, loading docstrings and other optional components can be made lazy. > > This was not in my original idea, and this will significantly complicate the > > implementation, but in principle it is possible. This will require larger > > changes in the marshal format and bytecode. > > I'm +1 on this idea. > > * New pyc format has code section (same to current) and text > section. > text section stores UTF-8 strings and not loaded at import time. > * Function annotation (only when PEP 563 is used) and docstring are > stored as integer, point to offset in the text section. > * When type.__doc__, PyFunction.__doc__, > PyFunction.__annotation__ are > integer, text is loaded from the text section lazily. > > PEP 563 will reduce some startup time, but __annotation__ is still > dict.? Memory overhead is negligible. > > In [1]: def foo(a: int, b: int) -> int: > ? ?...:? ? ?return a + b > ? ?...: > ? ?...: > > In [2]: import sys > In [3]: sys.getsizeof(foo) > Out[3]: 136 > > In [4]: sys.getsizeof(foo.__annotations__) > Out[4]: 240 > > When PEP 563 is used, there are no side effect while building > the annotation. > So the annotation can be serialized in text, like > {"a":"int","b":"int","return":"int"}. > > This change will require new pyc format, and descriptor for > PyFunction.__doc__, PyFunction.__annotation__ > and type.__doc__. > > Regards, > > -- > INADA Naoki? > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > > > -- > Daniel F. Moisset -?UK Country Manager - Machinalis Limited > www.machinalis.co.uk > Skype: @dmoisset T:?+ 44 7398 827139 > > 1 Fore St, London, EC2Y 9DT > > Machinalis Limited is a company registered in England and Wales. > Registered number: 10574987. > > > > > -- > Daniel F. Moisset -?UK Country Manager - Machinalis Limited > www.machinalis.co.uk > Skype: @dmoisset T:?+ 44 7398 827139 > > 1 Fore St, London, EC2Y 9DT > > Machinalis Limited is a company registered in England and Wales. > Registered number: 10574987. > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Apr 12 2018) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ ________________________________________________________________________ ::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/
http://www.malemburg.com/

From rosuav at gmail.com Thu Apr 12 16:26:27 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 13 Apr 2018 06:26:27 +1000
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To: <20180411034115.GP16661@ando.pearwood.info>
References: <20180406011854.GU16661@ando.pearwood.info> <3657bd13-cf42-509e-b4ef-2737326b13f3@mozilla.com> <20180410173255.GM16661@ando.pearwood.info> <20180411034115.GP16661@ando.pearwood.info>
Message-ID: 

On Wed, Apr 11, 2018 at 1:41 PM, Steven D'Aprano wrote:
> Personally, I still think the best approach here is a combination of
> itertools.accumulate, and the proposed name-binding as an expression
> feature:
>
> total = 0
> running_totals = [(total := total + x) for x in values]
> # alternative syntax
> running_totals = [(total + x as total) for x in values]
>
> If you don't like the dependency on an external variable (or if that
> turns out not to be practical) then we could have:
>
> running_totals = [(total := total + x) for total in [0] for x in values]

Linking this to the PEP 572 thread, this is an open question now:
https://www.python.org/dev/peps/pep-0572/#importing-names-into-comprehensions

Anyone who's interested in (or intrigued by) this potential syntax is
very much welcome to hop over to the PEP 572 threads and join in.

ChrisA

From mertz at gnosis.cx Thu Apr 12 18:50:06 2018
From: mertz at gnosis.cx (David Mertz)
Date: Thu, 12 Apr 2018 22:50:06 +0000
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To: <5ACE796E.7050401@brenbarn.net>
References: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com> <5ACE4AC2.4030503@brenbarn.net> <5ACE796E.7050401@brenbarn.net>
Message-ID: 

Fair enough. I wouldn't actually do what I suggested either. But then, I
also wouldn't ever write:

x1, x2 = (-b + sqrt(D))/2, (-b - sqrt(D))/2 [with/where/whatever D=...]

If your goal is simply to have symmetry in the plus-or-minus clauses, I
was simply pointing out you can have that with the ':=' syntax. Inasmuch
as I might like assignment expressions, it would only be in while or if
statements, personally.

On Wed, Apr 11, 2018, 5:09 PM Brendan Barnwell wrote:
> On 2018-04-11 11:05, David Mertz wrote:
> > How about this, Brendan?
> >
> > _, x1, x2 = (D := b**2 - 4*a*c), (-b + sqrt(D))/2, (-b - sqrt(D))/2
> >
> > I'm not sure I love this, but I don't hate it.
>
> That's clever, but why bother? I can already do this with
> existing Python:
>
> D = b**2 - 4*a*c
> x1, x2 = (-b + sqrt(D))/2, (-b - sqrt(D))/2
>
> If the new feature encourages people to do something like your
> example
> (or my earlier examples with the D definition inline in the expression
> for x1), then I'd consider that another mark against it.
>
> --
> Brendan Barnwell
> "Do not follow where the path may lead. Go, instead, where there is no
> path, and leave a trail."
> --author unknown
> -------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From g.rodola at gmail.com Thu Apr 12 20:38:23 2018
From: g.rodola at gmail.com (Giampaolo Rodola')
Date: Fri, 13 Apr 2018 00:38:23 +0000
Subject: [Python-ideas] Move optional data out of pyc files
In-Reply-To: <15602430-b133-239a-1aa4-b3bc9973f44a@egenix.com>
References: <15602430-b133-239a-1aa4-b3bc9973f44a@egenix.com>
Message-ID: 

On Fri, 13 Apr 2018 at 03:47, M.-A.
Lemburg wrote: > I think moving data out of pyc files is going in a wrong direction: > more stat calls means slower import and slower startup time. > > Trying to make pycs smaller also isn't really worth it (they > compress quite well). > > Saving memory could be done by disabling reading objects lazily > from the file - without removing anything from the pyc file. > Whether the few 100kB RAM this saves is worth the effort depends > on the application space. > > This leaves the proposal to restructure pyc files into a sectioned > file and possibly indexed file to make access to (lazily) loaded > parts faster. +1. With this in place -O and -OO cmdline options would become even less useful (which is good). -- Giampaolo - http://grodola.blogspot.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu Apr 12 21:03:15 2018 From: guido at python.org (Guido van Rossum) Date: Thu, 12 Apr 2018 18:03:15 -0700 Subject: [Python-ideas] Accepting multiple mappings as positional arguments to create dicts In-Reply-To: References: Message-ID: It's a slippery slope indeed. While having to change update() alone wouldn't worry me, the subclass constructors do seem like they are going to want changing too, and that's indeed a bit much. So let's back off a bit. Not every three lines of code need a built-in shorthand. On Thu, Apr 12, 2018 at 8:45 AM, Serhiy Storchaka wrote: > 09.04.18 00:18, Andr?s Delfino ????: > >> I thought that maybe dict could accept several mappings as positional >> arguments, like this: >> >> class Dict4(dict): >> def __init__(self, *args, **kwargs): >> if len(args) > 1: >> if not all([isinstance(arg, dict) for arg in args]): >> raise TypeError('Dict4 expected instances of dict >> since multiple positional arguments were passed') >> >> temp = args[0].copy() >> >> for arg in args[1:]: >> temp.update(arg) >> >> super().__init__(temp, **kwargs) >> else: >> super().__init__(*args, **kwargs) >> >> >> AFAIK, this wouldn't create compatibility problems, since you can't pass >> two positional arguments now anyways. >> >> It would be useful to solve the "sum/union dicts" discussion, for >> example: requests.get(url, params=dict(params, {'foo': bar}) >> >> Whar are your thoughts? >> > > It is easy to make the dict constructor merging several positional > arguments. But this is not a tiny harmless change, it will start a cascade > of other changes. > > After changing the dict constructor, we will need to update the > dict.update() method too. Constructors and update() methods of dict > subclasses (OrderedDict, defaultdict, Counter, and more specialized > classes) should be updated too. UserDict, WeakKeyDictionary, > WeakValueDictionary are next. After that we will have a pressure of > updating constructors and update() methods of abstract classes Mapping and > MutableMapping. This change will break a lot of third-party code that > implement concrete implementations of these classes, because adding support > of new arguments in the method of abstract class breaks an interface. > > We will be able to pass this path (we have already passed it), but we must > realize how long it is. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From yaoxiansamma at gmail.com Thu Apr 12 23:00:47 2018 From: yaoxiansamma at gmail.com (Thautwarm Zhao) Date: Fri, 13 Apr 2018 11:00:47 +0800 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) Message-ID: > You're looking at a very early commit there. I suggest looking at the > most recent commits on one of two branches: https://github.com/Rosuav/cpython/blob/statement-local- variables/Grammar/Grammar https://github.com/Rosuav/cpython/blob/assignment- expressions/Grammar/Grammar > Those are the two most recent states in my progress towards (a) > statement-local name bindings with "EXPR as NAME", and (b) assignment > expressions with "target := value". Okay, and the syntax "target := value" seems to be much easier to handle with. I do support this feature, but now I'm worried about the meaning of ':=', it seems to be a lazy assignment in some degree. I'm not sure using ':=' is proper here. > Inasmuch as I might like assignment expressions, it would only be in while or if statements, personally. Not exactly, assignment expression also works for "if expression", we can now have null checking. var.method() if var:= function() else None Null checking is importance enough to have a specific syntax in many other languages(C#, kotlin, Ruby and so on), and we can even have more than null checking by adding expression assignment. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Thu Apr 12 23:22:02 2018 From: mertz at gnosis.cx (David Mertz) Date: Fri, 13 Apr 2018 03:22:02 +0000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: Message-ID: Yes, I should have added ternary expressions to if statements. I can definitely see the use there. However, your example is not null checking. You'd have to modify it slightly to get that: None if var:= function() is None else var.method() Still not bad looking. On Thu, Apr 12, 2018, 11:01 PM Thautwarm Zhao wrote: > > > You're looking at a very early commit there. I suggest looking at the > > most recent commits on one of two branches: > > > https://github.com/Rosuav/cpython/blob/statement-local-variables/Grammar/Grammar > > https://github.com/Rosuav/cpython/blob/assignment-expressions/Grammar/Grammar > > > Those are the two most recent states in my progress towards (a) > > statement-local name bindings with "EXPR as NAME", and (b) assignment > > expressions with "target := value". > > Okay, and the syntax "target := value" seems to be much easier to handle > with. > > I do support this feature, but now I'm worried about the meaning of ':=', > it seems to be a lazy assignment in some degree. I'm not sure using ':=' is > proper here. > > > > Inasmuch as I might like assignment expressions, it would only be in > while or if statements, personally. > > Not exactly, assignment expression also works for "if expression", we can > now have null checking. > > var.method() if var:= function() else None > > Null checking is importance enough to have a specific syntax in many other > languages(C#, kotlin, Ruby and so on), and we can even have more than null > checking by adding expression assignment. > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Thu Apr 12 23:27:36 2018 From: guido at python.org (Guido van Rossum) Date: Thu, 12 Apr 2018 20:27:36 -0700 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: Message-ID: On Thu, Apr 12, 2018 at 8:22 PM, David Mertz wrote: > Yes, I should have added ternary expressions to if statements. I can > definitely see the use there. > > However, your example is not null checking. You'd have to modify it > slightly to get that: > > None if var:= function() is None else var.method() > > Still not bad looking. > Though a long way from function()?.method() per PEP 505. :-) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From yaoxiansamma at gmail.com Thu Apr 12 23:35:11 2018 From: yaoxiansamma at gmail.com (Thautwarm Zhao) Date: Fri, 13 Apr 2018 11:35:11 +0800 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: Message-ID: > None if var:= function() is None else var.method() Make sense. In some specific scenes(maybe general, I'm not sure), var.method() if var:=function() else var looks cool although it's not really null checking. It works for Python, just like use `if lst` to check if the list `lst` is empty. 2018-04-13 11:22 GMT+08:00 David Mertz : > Yes, I should have added ternary expressions to if statements. I can > definitely see the use there. > > However, your example is not null checking. You'd have to modify it > slightly to get that: > > None if var:= function() is None else var.method() > > Still not bad looking. > > On Thu, Apr 12, 2018, 11:01 PM Thautwarm Zhao > wrote: > >> >> > You're looking at a very early commit there. I suggest looking at the >> > most recent commits on one of two branches: >> >> https://github.com/Rosuav/cpython/blob/statement-local- >> variables/Grammar/Grammar >> https://github.com/Rosuav/cpython/blob/assignment- >> expressions/Grammar/Grammar >> >> > Those are the two most recent states in my progress towards (a) >> > statement-local name bindings with "EXPR as NAME", and (b) assignment >> > expressions with "target := value". >> >> Okay, and the syntax "target := value" seems to be much easier to handle >> with. >> >> I do support this feature, but now I'm worried about the meaning of ':=', >> it seems to be a lazy assignment in some degree. I'm not sure using ':=' is >> proper here. >> >> >> > Inasmuch as I might like assignment expressions, it would only be in >> while or if statements, personally. >> >> Not exactly, assignment expression also works for "if expression", we can >> now have null checking. >> >> var.method() if var:= function() else None >> >> Null checking is importance enough to have a specific syntax in many >> other languages(C#, kotlin, Ruby and so on), and we can even have more than >> null checking by adding expression assignment. >> >> >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Thu Apr 12 23:55:47 2018 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 12 Apr 2018 22:55:47 -0500 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: Message-ID: [David Mertz ] > Yes, I should have added ternary expressions to if statements. 
From tim.peters at gmail.com Thu Apr 12 23:55:47 2018 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 12 Apr 2018 22:55:47 -0500 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: Message-ID: [David Mertz ] > Yes, I should have added ternary expressions to if statements. I can > definitely see the use there. > > However, your example is not null checking. You'd have to modify it slightly > to get that: > > None if var:= function() is None else var.method() > > Still not bad looking. I couldn't find text in the PEP spelling out precedence, but there are two plausible ways that could be grouped: 1. None if (var:= function()) is None else var.method() which is what I believe you intended, and 2. None if var:= (function() is None) else var.method() which from earlier iterations of this thread I believe is the actual meaning - but isn't at all what was intended. The most clearly related example in the PEP appears to be: x = "default" if (eggs := spam().ham) is None else eggs which forced the intended meaning as in #1 above. While "assignment" is currently a statement type rather than "an operator", viewing the current situation as if "=" were an operator, it has very low precedence, so it would be just as surprising at times to boost the precedence of ":=": if x := i+1: That is, in _that_ example, if x := (i+1): is "clearly intended", not the more tightly binding if (x := i) + 1: On the third hand, requiring parentheses all the time would also feel strained: while m := someregexp.match(somestring): is already impossible to misread. Annoying ;-) From mertz at gnosis.cx Fri Apr 13 00:14:01 2018 From: mertz at gnosis.cx (David Mertz) Date: Fri, 13 Apr 2018 04:14:01 +0000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: Message-ID: Yes, I have not read all the iterations of the PEP, and none of them extremely closely. I had thought it "obvious" that ':=' should have a very high operator precedence. But of course if it doesn't then expressions like the one I proposed could mean something quite different. I find the third hand argument rather compelling. All those potentially required parentheses turn elegant looking code into an ugly mess. On Thu, Apr 12, 2018, 11:56 PM Tim Peters wrote: > [David Mertz ] > > Yes, I should have added ternary expressions to if statements. I can > > definitely see the use there. > > > > However, your example is not null checking. You'd have to modify it > slightly > > to get that: > > > > None if var:= function() is None else var.method() > > > > Still not bad looking. > > I couldn't find text in the PEP spelling out precedence, but there are > two plausible ways that could be grouped: > > 1. None if (var:= function()) is None else var.method() > > which is what I believe you intended, and > > 2. None if var:= (function() is None) else var.method() > > which from earlier iterations of this thread I believe is the actual > meaning - but isn't at all what was intended. > > The most clearly related example in the PEP appears to be: > > x = "default" if (eggs := spam().ham) is None else eggs > > which forced the intended meaning as in #1 above. > > While "assignment" is currently a statement type rather than "an > operator", viewing the current situation as if "=" were an operator, it > has very low precedence, so it would be just as surprising at times to > boost the precedence of ":=": > > if x := i+1: > > That is, in _that_ example, > > if x := (i+1): > > is "clearly intended", not the more tightly binding > > if (x := i) + 1: > > On the third hand, requiring parentheses all the time would also feel > strained: > > while m := someregexp.match(somestring): > > is already impossible to misread. > > Annoying ;-) > -------------- next part -------------- An HTML attachment was scrubbed... URL:
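
Tim's two candidate groupings can be spelled out with explicit parentheses. A sketch (function() is an invented stand-in; runnable on Python 3.8+, where the operator ultimately landed):

    def function():
        return None   # stand-in for a call that may return None

    # Grouping 1: bind the call's result, then compare it with None.
    print(None if (var := function()) is None else var.method())     # -> None

    # Grouping 2: bind the *comparison's* result, a bool (rarely wanted).
    print(None if (var := (function() is None)) else var.method())   # -> None
    print(var)   # -> True: var captured the comparison, not the call
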
From rosuav at gmail.com Fri Apr 13 00:26:48 2018 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 13 Apr 2018 14:26:48 +1000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: Message-ID: On Fri, Apr 13, 2018 at 1:00 PM, Thautwarm Zhao wrote: > I do support this feature, but now I'm worried about the meaning of ':=': it > seems to be a lazy assignment to some degree. I'm not sure using ':=' is > proper here. What do you mean by "lazy"? The assignment happens at the exact point that it is reached. Is there another language that uses := to mean something where the assignment doesn't happen until later? Or where the expression is assigned unevaluated, and is evaluated later? What part of it is lazy? ChrisA From tritium-list at sdamon.com Fri Apr 13 00:32:49 2018 From: tritium-list at sdamon.com (Alex Walters) Date: Fri, 13 Apr 2018 00:32:49 -0400 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: Message-ID: <021101d3d2e0$7b44cc90$71ce65b0$@sdamon.com> > > On the third hand, requiring parentheses all the time would also feel > strained: > > while m := someregexp.match(somestring): > > is already impossible to misread. > > Annoying ;-) While adding parens to that would be superfluous for the reader of the module, the tradeoff of requiring explicitness, rather than doing the implicitly wrong (depending on context) thing, is probably worth it. From guido at python.org Fri Apr 13 00:37:10 2018 From: guido at python.org (Guido van Rossum) Date: Thu, 12 Apr 2018 21:37:10 -0700 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: Message-ID: Clearly the PEP should spell out the precedence of :=. I'm sure Chris intended to give := the lowest possible precedence: the analogy with = (both in Python and in other languages), and the original design of the PEP, where the syntax was (EXPR as NAME), with mandatory parentheses. (Probably his implementation helps to guess his intention too, but I can't be bothered yet to clone and build it just to answer this question definitively. :-) I find the analogy argument compelling, and I think we should just make it a style rule that parentheses should be used whenever it could be misread by a human who's not sure about the precedence. Too bad for that one example. On Thu, Apr 12, 2018 at 9:14 PM, David Mertz wrote: > Yes, I have not read all the iterations of the PEP, and none of them > extremely closely. I had thought it "obvious" that ':=' should have a very > high operator precedence. But of course if it doesn't then expressions like > the one I proposed could mean something quite different. > > I find the third hand argument rather compelling. All those potentially > required parentheses turn elegant looking code into an ugly mess. > > > On Thu, Apr 12, 2018, 11:56 PM Tim Peters wrote: > >> [David Mertz ] >> > Yes, I should have added ternary expressions to if statements. I can >> > definitely see the use there. >> > >> > However, your example is not null checking. You'd have to modify it >> slightly >> > to get that: >> > >> > None if var:= function() is None else var.method() >> > >> > Still not bad looking. >> >> I couldn't find text in the PEP spelling out precedence, but there are >> two plausible ways that could be grouped: >> >> 1. None if (var:= function()) is None else var.method() >> >> which is what I believe you intended, and >> >> 2.
None if var:= (function() is None) else var.method() >> >> which from earlier iterations of this thread I believe is the actual >> meaning - but isn't at all what was intended. >> >> The most clearly related example in the PEP appears to be: >> >> x = "default" if (eggs := spam().ham) is None else eggs >> >> which forced the intended meaning as in #1 above. >> >> While "assignment" is currently a statement type rather than "an >> operator", viewing the current situation as if "=" were an operator, it >> has very low precedence, so it would be just as surprising at times to >> boost the precedence of ":=": >> >> if x := i+1: >> >> That is, in _that_ example, >> >> if x := (i+1): >> >> is "clearly intended", not the more tightly binding >> >> if (x := i) + 1: >> >> On the third hand, requiring parentheses all the time would also feel >> strained: >> >> while m := someregexp.match(somestring): >> >> is already impossible to misread. >> >> Annoying ;-) >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Fri Apr 13 00:44:41 2018 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 13 Apr 2018 14:44:41 +1000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: Message-ID: On Fri, Apr 13, 2018 at 1:55 PM, Tim Peters wrote: > [David Mertz ] >> Yes, I should have added ternary expressions to if statements. I can >> definitely see the use there. >> >> However, your example is not null checking. You'd have to modify it slightly >> to get that: >> >> None if var:= function() is None else var.method() >> >> Still not bad looking. > > I couldn't find text in the PEP spelling out precedence, Oops, that's a mistake. I'll add that in a bit. It has extremely low precedence. > but there are > two plausible ways that could be grouped: > > 1. None if (var:= function()) is None else var.method() > > which is what I believe you intended, and > > 2. None if var:= (function() is None) else var.method() > > which from earlier iterations of this thread I believe is the actual > meaning - but isn't at all what was intended. I just went to test it, and the unparenthesized version is actually bombing with SyntaxError. I'm going to call that a bug, though, and see about fixing it tonight. Currently, the precedence is so low that basically anything will get captured, and if you want to capture anything less than the entire expression, you parenthesize. I'll experiment with placing it between '|' and comparison operators, which would mean that an unparenthesized assignment expression generally won't capture a boolean, but will instead capture the value itself. That's likely to be more useful, I think. (Obviously you can always use parens to overrule the precedence table.) > While "assignment" is currently a statement type rather than "an > operator", viewing the current situation as if "=" were an operator, it > has very low precedence, so it would be just as surprising at times to > boost the precedence of ":=": > > if x := i+1: > > That is, in _that_ example, > > if x := (i+1): > > is "clearly intended", not the more tightly binding > > if (x := i) + 1: Agreed as regards addition. Less certain as regards comparisons.
if x := i > 1: is currently "if x := (i > 1)" and not "if (x := i) > 1". Assuming I don't run into implementation difficulties, I'll play with this tonight, then document it one way or the other, and maybe have an open question about the alternative precedence. > On the third hand, requiring parentheses all the time would also feel strained: > > while m := someregexp.match(somestring): > > is already impossible to misread. > > Annoying ;-) Agreed. There's no reason to mandate the parens here. Neither the PEP nor the reference implementation has any problem with this expression being unparenthesized. ChrisA From rosuav at gmail.com Fri Apr 13 00:52:10 2018 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 13 Apr 2018 14:52:10 +1000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: Message-ID: On Fri, Apr 13, 2018 at 2:14 PM, David Mertz wrote: > Yes, I have not read all the iterations of the PEP, and none of them > extremely closely. I had thought it "obvious" that ':=' should have a very > high operator precedence. But of course if it doesn't then expressions like > the one I proposed could mean something quite different. Interesting. How high, exactly? https://docs.python.org/3/reference/expressions.html#operator-precedence (note: "higher precedence" doesn't mean higher on that table - the table's from lowest to highest) Currently, I've placed ':=' up with if-else expressions. That's extremely low precedence. I'm contemplating moving it to just higher than comparison operators (ie just below the bitwise operators), putting it below all the arithmetic operators. The highest precedence I would consider putting it is just below the unary operators; that would mean it binds more tightly than arithmetic operators, but you can still say "x := y[1]" without having to parenthesize the latter. foo := a > b # does this capture 'a', or 'a > b'? bar := c + d # 'c' or 'c + d'? I'm open to argument here, but my thinking is that these should capture 'a' and 'c + d'. ChrisA From tim.peters at gmail.com Fri Apr 13 00:52:35 2018 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 12 Apr 2018 23:52:35 -0500 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: Message-ID: [David Mertz ] > Yes, I have not read all the iterations of the PEP, and none of them > extremely closely. I had thought it "obvious" that ':=' should have a very > high operator precedence. But you don't really believe that ;-) That's the rub. In your specific None if var:= function() is None else var.method() example it was obvious := should have very high precedence _there_, but in an earlier _, x1, x2 = (D := b**2 - 4*a*c), (-b + sqrt(D))/(2*a), (-b - sqrt(D))/(2*a) it was just as obvious that you wanted := to have a very low precedence _there_ (else D would be bound to the value of plain old `b`). I don't at all mean to be picking on you there - I suffer the same illusions. It appears that "very low precedence" is actually wanted in nearly all examples, _except_ when comparison is involved. So maybe := "should have" precedence between comparison operators (<=, is, etc) and bitwise OR (|). Then both your examples above would do what you expected them to do (and what I expected them to do at first glance). OTOH, that wouldn't be at all like C, whose precedence rules Python largely follows (only the comma operator has lower precedence than assignment in C), and - worse - wouldn't be at all like Python assignment statements appear to behave.
> But of course if it doesn't then expressions like > the one I proposed could mean something quite different. Whether very high or very low, one of your two examples above is screwed. > I find the third hand argument rather compelling. All those potentially > required parentheses turn elegant looking code into an ugly mess. It is a hangup for me - but I'm not yet sure how severe. From rosuav at gmail.com Fri Apr 13 00:55:35 2018 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 13 Apr 2018 14:55:35 +1000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: Message-ID: On Fri, Apr 13, 2018 at 2:37 PM, Guido van Rossum wrote: > Clearly the PEP should spell out the precedence of :=. I'm sure Chris > intended to give := the lowest possible precedence: the analogy with = (both > in Python and in other languages), and the original design of the PEP, where > the syntax was (EXPR as NAME), with mandatory parentheses. (Probably his > implementation helps to guess his intention too, but I can't be bothered yet > to clone and build it just to answer this question definitively. :-) > > I find the analogy argument compelling, and I think we should just make it a > style rule that parentheses should be used whenever it could be misread by a > human who's not sure about the precedence. > > Too bad for that one example. Yes, that's the current intention, and (modulo a bug or two) the current reference implementation. This may change tonight. (If anyone's curious, and/or wants to join the discussion in real-time, I'll be streaming live on https://twitch.tv/rosuav as I tinker with the precedence.) ChrisA From storchaka at gmail.com Fri Apr 13 01:11:47 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 13 Apr 2018 08:11:47 +0300 Subject: [Python-ideas] Accepting multiple mappings as positional arguments to create dicts In-Reply-To: References: Message-ID: 12.04.18 22:42, Andrés Delfino wrote: > I think the update method can (and personally, should) stay unchanged: > > spam.update(dict(x, y)) > > seems succinct and elegant enough, with the proposed constructor syntax. > > Sorry for my ignorance: do the (Mutable)Mapping ABCs say anything about > constructors? Mapping and MutableMapping ABCs don't have constructors, but many dict-like objects imitate the dict constructor: accept a single mapping or a sequence of pairs as a positional argument, and accept other dicts as kwargs. From ethan at stoneleaf.us Fri Apr 13 02:27:11 2018 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 12 Apr 2018 23:27:11 -0700 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: Message-ID: <5AD04DBF.50704@stoneleaf.us> On 04/12/2018 09:52 PM, Chris Angelico wrote: > foo := a > b # does this capture 'a', or 'a > b'? > bar := c + d # 'c' or 'c + d'? > > I'm open to argument here, but my thinking is that these should > capture 'a' and 'c + d'. I think := should act the same as = or there will be plenty of confusion. If one wants to capture less, then parentheses can be used to narrow it down: (foo := a) > b Looked at another way -- you already have the value named as 'a', so why would you also name it as 'foo'? -- ~Ethan~
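
For the record, the grammar that eventually shipped in Python 3.8 sided with Ethan on this question: an unparenthesized := binds looser than a comparison, so it captures the whole expression, just as = would. A sketch:

    a, b = 3, 5

    if foo := a > b:        # foo is bound to (a > b), i.e. False here
        pass
    print(foo)              # -> False

    if (foo := a) > b:      # parentheses narrow the capture to just a
        pass
    print(foo)              # -> 3
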
From rosuav at gmail.com Fri Apr 13 02:47:17 2018 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 13 Apr 2018 16:47:17 +1000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: <5AD04DBF.50704@stoneleaf.us> References: <5AD04DBF.50704@stoneleaf.us> Message-ID: On Fri, Apr 13, 2018 at 4:27 PM, Ethan Furman wrote: > On 04/12/2018 09:52 PM, Chris Angelico wrote: > >> foo := a > b # does this capture 'a', or 'a > b'? >> bar := c + d # 'c' or 'c + d'? >> >> I'm open to argument here, but my thinking is that these should >> capture 'a' and 'c + d'. > > I think := should act the same as = or there will be plenty of confusion. > If one wants to capture less, then parentheses can be used to narrow it down: > > (foo := a) > b > > Looked at another way -- you already have the value named as 'a', so why > would you also name it as 'foo'? > More likely, 'a' won't be a simple name lookup, but a function call. Consider: pos = -1 while pos := buffer.find(search_term, pos + 1) >= 0: ... Once find() returns -1, the loop terminates. Should this need to be parenthesized? ChrisA
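
A runnable version of Chris's loop under the semantics that eventually shipped (Python 3.8+), where the parentheses are needed for the intended meaning:

    buffer = "spam ham spam"
    search_term = "spam"

    pos = -1
    hits = []
    while (pos := buffer.find(search_term, pos + 1)) >= 0:
        hits.append(pos)
    print(hits)   # -> [0, 9]

Without the parentheses, := binds looser than >=, so pos would capture the boolean result of the comparison rather than the index, and this particular loop would never terminate.
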
From steve at pearwood.info Fri Apr 13 06:45:42 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 13 Apr 2018 20:45:42 +1000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: <3903AF63-331B-45BF-8B30-008E2F0C0218@gmail.com> Message-ID: <20180413104541.GA11616@ando.pearwood.info> On Thu, Apr 12, 2018 at 12:28:23AM +1000, Chris Angelico wrote: > On Thu, Apr 12, 2018 at 12:11 AM, Paul Moore wrote: > > On 11 April 2018 at 14:54, Chris Angelico wrote: > >> Sure, if you're just assigning zero to everything. But you could do > >> that with a statement. What about this: > >> > >> q = { > >> lambda: x := lambda y: z := a := 0, > >> } > >> > >> Yes, it's an extreme example, but look at all those colons and tell me > >> if you can figure out what each one is doing. > > > > lambda: x := (lambda y: (z := (a := 0))) > > > > As I say, it's the only *possible* parsing. It's ugly, and it > > absolutely should be parenthesised, but there's no need to make the > > parentheses mandatory. (And actually, it didn't take me long to add > > those parentheses, it's not *hard* to parse correctly - for a human). I agree with Paul, except I think he's added too many parens. Chained assignments ought to be obvious enough that we can dispense with the extras: lambda: x := (lambda y: (z := a := 0)) I know that they are legal, but I really dislike *pointless* examples that bind to a name and then never use it. If we're to get a good feel for how complex these expressions are going to be, they ought to be realistic -- even if that makes them more complex. And I'm not terribly disturbed by excessively obfuscated examples. The answer to obfuscated code is, Don't Do That. So we should consider complex examples which are *realistic*, not ones designed intentionally as obfuscated code. So, with that advice, let's take your q example from above, and re-work it into something which is at least potentially realistic, of a sort. We want q to be a set consisting of a factory function which takes a single argument (different from your example, I know), builds an inner function, then returns that function and the result of that function called with the original argument: def factory(arg): def inner(y): a := z := y + 1 # seems kinda pointless to me, but okay... return (a, a+z, a*z) return (inner, inner(arg)) q = {1, 2, factory, 3, 4} Now let's re-write it using expression assignment: q = {1, 2, (lambda arg: lambda y: (a := (z := y + 1), a+z, a*z) ), 3, 4, } Not too awful, although it is kinda pointless and not really a great justification for the feature. Let's obfuscate it: q = {1, 2, (lambda arg: lambda y: a := z := y + 1, a+z, a*z), 3, 4} I've seen worse :-) -- Steve From steve at pearwood.info Fri Apr 13 07:17:00 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 13 Apr 2018 21:17:00 +1000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: Message-ID: <20180413111700.GC11616@ando.pearwood.info> On Thu, Apr 12, 2018 at 07:28:28AM +1000, Chris Angelico wrote: > Fair enough. Also adding that chained assignment expressions should > generally be avoided. So long as it's not a blanket prohibition, I'm good with that. Also, something like this: spam := 2*(eggs := expression) + 1 should not be considered a chained assignment. -- Steve From rosuav at gmail.com Fri Apr 13 07:28:17 2018 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 13 Apr 2018 21:28:17 +1000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: <20180413111700.GC11616@ando.pearwood.info> References: <20180413111700.GC11616@ando.pearwood.info> Message-ID: On Fri, Apr 13, 2018 at 9:17 PM, Steven D'Aprano wrote: > On Thu, Apr 12, 2018 at 07:28:28AM +1000, Chris Angelico wrote: > >> Fair enough. Also adding that chained assignment expressions should >> generally be avoided. > > So long as it's not a blanket prohibition, I'm good with that. > > Also, something like this: > > spam := 2*(eggs := expression) + 1 > > should not be considered a chained assignment. >
> > With the parentheses, this becomes a viable option, with its own tradeoffs > in syntactic ambiguity. Since ``EXPR as NAME`` already has meaning in > ``except`` and ``with`` statements (with different semantics), this would > create unnecessary confusion or require special-casing. The special casing you refer to would be to prohibit name binding expressions in "except" and "with" statements. You should explicitly say so in the PEP. I don't think that prohibiting those two forms is a big loss. I think any form of except (name := expression) as err: do_something(name) is going to be contrived. Likewise for `with` statements. I don't especially dislike := but I really think that putting the expression first is a BIG win for readability. If that requires parens to disambiguate it, so be it. You also missed the "arrow assignment operator" from various languages, including R: expression -> name (In an earlier post, I suggested R's other arrow operator, name <- expr, but of course that already evaluates as unary minus expr.) I think that there should be more attention paid to the idea of putting the expression first, rather than the name. -- Steve From rosuav at gmail.com Fri Apr 13 07:56:35 2018 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 13 Apr 2018 21:56:35 +1000 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) In-Reply-To: <20180413110432.GB11616@ando.pearwood.info> References: <20180413110432.GB11616@ando.pearwood.info> Message-ID: On Fri, Apr 13, 2018 at 9:04 PM, Steven D'Aprano wrote: > On Wed, Apr 11, 2018 at 03:32:04PM +1000, Chris Angelico wrote: > >> In any context where arbitrary Python expressions can be used, a **named >> expression** can appear. This can be parenthesized for clarity, and is of >> the form ``(target := expr)`` where ``expr`` is any valid Python expression, >> and ``target`` is any valid assignment target. > > Have we really decided on spelling this as `target := expression`? You > list this as a rejected spelling: > > >> 1. ``EXPR as NAME``, with or without parentheses:: >> >> stuff = [[f(x) as y, x/y] for x in range(5)] > > but I don't think the objections given should be fatal: > >> Omitting the parentheses in this form of the proposal introduces many >> syntactic ambiguities. Requiring them in all contexts leaves open the >> option to make them optional in specific situations where the syntax is >> unambiguous (cf generator expressions as sole parameters in function >> calls), but there is no plausible way to make them optional everywhere. >> >> With the parentheses, this becomes a viable option, with its own tradeoffs >> in syntactic ambiguity. Since ``EXPR as NAME`` already has meaning in >> ``except`` and ``with`` statements (with different semantics), this would >> create unnecessary confusion or require special-casing. > > The special casing you refer to would be to prohibit name binding > expressions in "except" and "with" statements. You should explicitly say > so in the PEP. Parenthesis added to the rejection paragraph. > I don't think that prohibiting those two forms is a big loss. I think > any form of > > except (name := expression) as err: > do_something(name) > > is going to be contrived. Likewise for `with` statements. I agree as regards except statements. Not so much the with statements, though. How many times have people asked for "with (expr as name):" to be supported, allowing the statement to spread over multiple lines? 
With this syntax, it would suddenly be permitted - with dangerously similar semantics. For many MANY context managers, "with (expr as name):" would do the exact same thing as "with expr as name:". There is a general expectation that adding parentheses to an expression usually doesn't change the behaviour, and if it's legal, people will assume that the behaviour is the same. It isn't, and it's such a sneaky difference that I would call it a bug magnet. So if it's a bug magnet, what do we do? 1) Permit the subtly different semantics, and tell people to be careful 2) Forbid any use of "(expr as name)" in the header of a 'with' statement 3) Forbid it at top level, but permit it deeper down 4) Something else?? > I don't especially dislike := but I really think that putting the > expression first is a BIG win for readability. If that requires parens > to disambiguate it, so be it. There's a mild parallel between "(expr as name)" and other uses of 'as', which bind to that name. Every other use of 'as' is part of a special syntactic form ('import', 'with', and 'except'), but they do all bind to that name. (Point of interest: You can "with expr as x[0]:" but none of the other forms allow anything other than a simple name.) There's a strong parallel between "target := value" and "target = value"; in fact, the section on the differences is incredibly short and could become shorter. The only really important difference is that augmented assignment is not supported (you can't say "x +:= 5"), partly because it'd mean creating a boatload of three-character operators for very little value, and partly because augmented-assignment-as-expression is hard to explain. Which is better? A weak parallel or a strong one? How important is putting the expression first? On balance, I'm currently in favour of the := syntax, but it's only a small difference. > You also missed the "arrow assignment operator" from various languages, > including R: > > expression -> name > > (In an earlier post, I suggested R's other arrow operator, name <- expr, > but of course that already evaluates as unary minus expr.) I actually can't find anything about the -> operator, only the <- one. (Not that I looked very hard.) Is it a truly viable competitor, or just one that you'd like to see mentioned for completeness? > I think that there should be more attention paid to the idea of putting > the expression first, rather than the name. How many ways are there to bind a value to a name? Name last: * import x as y * from x import y as z * except x as y * with x as y Name first: * x = y * x += y # etc * for x in y * def x(.....) .... * def f(x=1) - arg defaults * class X: ... I'm seeing consistency here in that *EVERY* name binding where the name is at the end uses "as target" as its syntax. Everything else starts with the target, then defines what's being assigned to it. So I don't see much value in a "->" operator, except for the mere fact that it's different (and thus won't conflict in except/with); and the bulk of name bindings in Python put the name first. I don't have any really strong arguments in favour of :=, but I have a few weak ones and a few not quite so weak. So far, I have only weak arguments in favour of 'as'. ChrisA
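
The "dangerously similar semantics" can be demonstrated in a few lines that run on any Python 3: in a with statement, "as" binds whatever __enter__() returns, whereas a PEP 572-style "(expr as name)" would bind the value of the expression itself. A sketch:

    class CM:
        def __enter__(self):
            return "enter result"   # deliberately not the CM instance
        def __exit__(self, *exc):
            return False

    with CM() as name:
        print(name)   # -> 'enter result': bound by __enter__(), not to CM()

A "(CM() as name)" assignment expression, by contrast, would bind name to the CM instance, so the parenthesized and unparenthesized spellings would quietly disagree.
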
From j.van.dorp at deonet.nl Fri Apr 13 08:02:23 2018 From: j.van.dorp at deonet.nl (Jacco van Dorp) Date: Fri, 13 Apr 2018 14:02:23 +0200 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) In-Reply-To: <20180413110432.GB11616@ando.pearwood.info> References: <20180413110432.GB11616@ando.pearwood.info> Message-ID: Before, I briefly mentioned the idea of whether this could be unified with except/with's "as". To the casual observer, they're really similar. However, their semantics would be totally different, and people don't seem to like a unification attempt. A huge argument against "as" would be to prevent confusion, especially for new people. I must admit I like putting the expression first, though. Even if it's just to make it harder to mix it up with normal assignment. Perhaps => could be used - it's a new token, unlike -> which is used to annotate return values, it's not legal syntax now (so no backwards compatibility issues), and is used for similar purposes in, for example, PHP when declaring associative arrays ($arr = array("key" => "value");). I'm not convinced myself, though. From steve at pearwood.info Fri Apr 13 08:22:10 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 13 Apr 2018 22:22:10 +1000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: Message-ID: <20180413122209.GD11616@ando.pearwood.info> On Wed, Apr 11, 2018 at 11:50:44PM +1000, Chris Angelico wrote: > > Previously, there was an alternative _operator form_ `->` proposed by > > Steven D'Aprano. This option is no longer considered? I see several > > advantages with this variant: > > 1. It does not use `:` symbol which is very visually overloaded in Python. > > 2. It is clearly distinguishable from the usual assignment statement and > > it's `+=` friends > > There are others but they are minor. > > I'm not sure why you posted this in response to the open question, but > whatever. The arrow operator is already a token in Python (due to its > use in 'def' statements) and should not conflict with anything; > however, apart from "it looks different", it doesn't have much to > speak for it. On the contrary, it puts the expression first, where it belongs *semi-wink*. The expression is the most important part of the assignment expression, and because we read from left to right, it should come first. Let's take a simple example: pair = (first_value := x + y + z, a + b + first_value ) What's the first item of the pair? If you're like me, and I think most people are similar, when skimming the code, you read only so far across each line to get an idea of whether it is relevant or not. In this case, when skimming, you have to read past the name, past the assignment operator, and only then do you see the relevant information. Contrast:
Or I just draw in an arrow pointing to the name on the right: x + y + z ----> name > The arrow faces the other way in languages like Haskell, Indeed, but in R, it faces to the right. (Actually, R allows both direction.) There's also apparently a language BETA which uses -> for assignment, although I've never used it. HP calculator "RPN" language also includes a -> assignment operator for binding named parameters (taken off the stack) inside functions, except they use a custom encoding with an arrow symbol, not a literal hyphen+greater-than sign. Likewise for TI Nspire calculators, which also use an right-pointing arrow assignment operator. (They also have a Pascal-style := operator, so you're covered both ways.) This comes from various dialects of calculator BASIC. -- Steve From rosuav at gmail.com Fri Apr 13 08:35:59 2018 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 13 Apr 2018 22:35:59 +1000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: <20180413122209.GD11616@ando.pearwood.info> References: <20180413122209.GD11616@ando.pearwood.info> Message-ID: On Fri, Apr 13, 2018 at 10:22 PM, Steven D'Aprano wrote: > On Wed, Apr 11, 2018 at 11:50:44PM +1000, Chris Angelico wrote: > >> > Previously, there was an alternative _operator form_ `->` proposed by >> > Steven D'Aprano. This option is no longer considered? I see several >> > advantages with this variant: >> > 1. It does not use `:` symbol which is very visually overloaded in Python. >> > 2. It is clearly distinguishable from the usual assignment statement and >> > it's `+=` friends >> > There are others but they are minor. >> >> I'm not sure why you posted this in response to the open question, but >> whatever. The arrow operator is already a token in Python (due to its >> use in 'def' statements) and should not conflict with anything; >> however, apart from "it looks different", it doesn't have much to >> speak for it. > > On the contrary, it puts the expression first, where it belongs > *semi-wink*. The 'as' syntax already has that going for it. What's the advantage of the arrow over the two front-runners, ':=' and 'as'? > The expression is the most important part of the assignment expression, > and because we read from left to right, it should come first. Let's take > a simple example: > > pair = (first_value := x + y + z, > a + b + first_value > ) > > What's the first item of the pair? If you're like me, and I think most > people are similar, when skimming the code, you read only far across > each line to get an idea of whether it is relevant or not. Yet Python has an if/else operator that, in contrast to C-inspired languages, violates that rule. So it's not a showstopper. :) >> The arrow faces the other way in languages like Haskell, > > Indeed, but in R, it faces to the right. (Actually, R allows both > direction.) There's also apparently a language BETA which uses -> for > assignment, although I've never used it. I looked up R's Wikipedia page and saw only the left-facing arrow. How common is the right-facing arrow? Will people automatically associate it with name binding? > HP calculator "RPN" language also includes a -> assignment operator for > binding named parameters (taken off the stack) inside functions, except > they use a custom encoding with an arrow symbol, not a literal > hyphen+greater-than sign. > > Likewise for TI Nspire calculators, which also use an right-pointing > arrow assignment operator. (They also have a Pascal-style := operator, > so you're covered both ways.) 
This comes from various dialects of > calculator BASIC. So we have calculators, and possibly R, and sorta-kinda Haskell, recommending some form of arrow. We have Pascal and its derivatives recommending colon-equals. And we have other usage in Python, with varying semantics, recommending 'as'. I guess that's enough to put the arrow in as another rejected alternate spelling, but not to seriously consider it. ChrisA From gmarcel.plch at gmail.com Fri Apr 13 08:42:29 2018 From: gmarcel.plch at gmail.com (Marcel Plch) Date: Fri, 13 Apr 2018 14:42:29 +0200 Subject: [Python-ideas] PEP 573: Module State Access from C Extension Methods Message-ID: I have prepared a PEP for better support of PEP 489 multiphase initialization. It makes it possible for extension types (their methods, to be more specific) to access the state of the module they are defined in. conversation - https://mail.python.org/pipermail/import-sig/2015-July/001022.html PEP - https://github.com/python/peps/blob/master/pep-0573.rst Implementation - https://github.com/Traceur759/cpython/pull/4 From steve at pearwood.info Fri Apr 13 09:18:59 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 13 Apr 2018 23:18:59 +1000 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) In-Reply-To: References: <20180413110432.GB11616@ando.pearwood.info> Message-ID: <20180413131859.GE11616@ando.pearwood.info> On Fri, Apr 13, 2018 at 09:56:35PM +1000, Chris Angelico wrote: > How many times have people asked for "with (expr as name):" to > be supported, allowing the statement to spread over multiple lines? > With this syntax, it would suddenly be permitted - with dangerously > similar semantics. I see your point, but why don't we support "with (expr as name):" to allow multiple lines? No, don't answer that... it's off-topic. Forget I asked. In any case, we already allow similar syntax with different meaning in different places, for example, something that looks just like assignment inside expressions: functions = [len, ord, map, lambda x, y=1: x+y] But it's not really an assignment as such, it's a parameter declaration. If we agree that the benefit of putting the expression first is sufficiently large, or that the general Pythonic look of "expr as name" is sufficiently desirable (it just looks and reads nicely), then we can afford certain compromises. Namely, we can rule that: except expr as name: with expr as name: continue to have the same meaning that they have now and never mean assignment expressions. Adding parens should not change that. If you try to write something like: except (spam or eggs as cheese) or function(cheese) as name: with (spam or eggs as cheese) or function(cheese) as name: etc (or any other assignment expression, effectively anything which isn't currently allowed) then you get a syntax error. So this: with expr as name: process(name) will only work if expr returns an object with a context manager. But that's the way it works now, so nothing really changes. In other words, the rule is that "expr as name" keeps its current, older semantics in with and except statements, and NEVER means the new, PEP 572 assignment expression. Yes, that's a special case that breaks the rules, and I accept that it is a point against "as". But the Zen is a guideline, not a law of physics, and I think the benefits of "as" are sufficient that even losing a point it still wins. > For many MANY context managers, "with (expr as > name):" would do the exact same thing as "with expr as name:".
There > is a general expectation that adding parentheses to an expression > usually doesn't change the behaviour, and if it's legal, people will > assume that the behaviour is the same. It isn't, and it's such a > sneaky difference that I would call it a bug magnet. Indeed. I wouldn't allow such a subtle difference in behaviour due to parens. That reminds me of the Python 1 and early 2.x except clauses, where except ValueError, TypeError: except (ValueError, TypeError): meant different things. I still shudder at that one. > So if it's a bug magnet, what do we do? > > 1) Permit the subtly different semantics, and tell people to be careful No. > 2) Forbid any use of "(expr as name)" in the header of a 'with' statement You can't forbid it, because it is currently allowed syntax (albeit currently without the parens). So the rule is, it is allowed, but it means what it meant pre-PEP 572. > 3) Forbid it at top level, but permit it deeper down I don't know what that means. But whatever it means, probably no :-) > 4) Something else?? Well, there's always the hypothetical -> arrow binding operator, or the Pascal := assignment operator (the current preference). I don't hate the := choice, I just think it is more Pascal-esque than Pythonic :-) > > I don't especially dislike := but I really think that putting the > > expression first is a BIG win for readability. If that requires parens > > to disambiguate it, so be it. > > There's a mild parallel between "(expr as name)" and other uses of > 'as', which bind to that name. Every other use of 'as' is part of a > special syntactic form ('import', 'with', and 'except'), but they do > all bind to that name. (Point of interest: You can "with expr as > x[0]:" but none of the other forms allow anything other than a simple > name.) I disagree: I think it is a strong parallel. They're both name bindings. How much stronger do you want? True, we don't currently allow such things as import math as maths, mathematics, spam.modules[0] but we could if we wanted to and there was a sensible use-case for it. > There's a strong parallel between "target := value" and "target > = value"; Sure. And for a statement, either form would be fine. I just think that in an expression, it is important enough to bring the expression to the front, even if it requires compromise elsewhere. [...] > I actually can't find anything about the -> operator, only the <- one. > (Not that I looked very hard.) Is it a truly viable competitor, or > just one that you'd like to see mentioned for completeness? Yes, as I mentioned in another post, R allows both -> and <-, some language called BETA uses ->, various calculator BASICs use -> (albeit with a special single character, not a digraph) as does HP RPN. Here's an example from R: > c(1, 2, 3+4, 5) -> data > data [1] 1 2 7 5 But whether it is viable or not depends on *us*, not what other languages do. No other language chose the syntax of the ternary if expression before Python used it. We aren't limited to only using syntax some other language used first. > > I think that there should be more attention paid to the idea of putting > > the expression first, rather than the name. > > How many ways are there to bind a value to a name? [...] > I'm seeing consistency here in that *EVERY* name binding where the > name is at the end uses "as target" as its syntax. Everything else > starts with the target, then defines what's being assigned to it.
So I > don't see much value in a "->" operator, except for the mere fact that > it's different (and thus won't conflict in except/with); and the bulk > of name bindings in Python put the name first. We shouldn't be choosing syntax because other syntax does the same. We should pick the syntax which is most readable and avoids the most problems. That's why Guido bucked the trends of half a century of programming languages, dozens of modern languages, and everything else in Python, to put the conditional in the middle of ternary if instead of the beginning or end. (And he was right to do so -- I like Python's ternary operator, even if other people think it is weird.) If people agree with me that it is important to put the expression first rather than the target name, then the fact that statements and for loops put the name first shouldn't matter. And if they don't, then I'm outvoted :-) -- Steve From ethan at stoneleaf.us Fri Apr 13 09:29:05 2018 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 13 Apr 2018 06:29:05 -0700 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) In-Reply-To: <20180413131859.GE11616@ando.pearwood.info> References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info> Message-ID: <5AD0B0A1.9080905@stoneleaf.us> On 04/13/2018 06:18 AM, Steven D'Aprano wrote: > On Fri, Apr 13, 2018 at 09:56:35PM +1000, Chris Angelico wrote: > If we agree that the benefit of putting the expression first is > sufficiently large, or that the general Pythonic look of "expr as name" > is sufficiently desirable (it just looks and reads nicely), then we can > afford certain compromises. Namely, we can rule that: > > except expr as name: > with expr as name: > > continue to have the same meaning that they have now and never mean > assignment expressions. Adding parens should not change that. +1 > In other words, the rule is that "expr as name" keeps its current, older > semantics in with and except statements, and NEVER means the new, PEP > 572 assignment expression. > > Yes, that's a special case that breaks the rules, and I accept that it > is a point against "as". But the Zen is a guideline, not a law of > physics, and I think the benefits of "as" are sufficient that even > losing a point it still wins. +1 >> 2) Forbid any use of "(expr as name)" in the header of a 'with' statement > > You can't forbid it, because it is currently allowed syntax (albeit > currently without the parens). So the rule is, it is allowed, but it > means what it meant pre-PEP 572. +1 > If people agree with me that it is important to put the expression first > rather than the target name, then the fact that statements and for loops > put the name first shouldn't matter. +1 to expression coming first! 
;) -- ~Ethan~ From p.f.moore at gmail.com Fri Apr 13 09:30:24 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 13 Apr 2018 14:30:24 +0100 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) In-Reply-To: <20180413131859.GE11616@ando.pearwood.info> References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info> Message-ID: On 13 April 2018 at 14:18, Steven D'Aprano wrote: > On Fri, Apr 13, 2018 at 09:56:35PM +1000, Chris Angelico wrote: >> 2) Forbid any use of "(expr as name)" in the header of a 'with' statement > > You can't forbid it, because it is currently allowed syntax It's not currently allowed: >>> with (12 as x): File "<stdin>", line 1 with (12 as x): ^ SyntaxError: invalid syntax > (albeit currently without the parens). Well, yes, but we're talking about *with* the parens. > So the rule is, it is allowed, but it means what it meant pre-PEP 572. So it's a syntax error, because that's what it is pre-PEP 572. So it's allowed, but as a syntax error (which is what "not allowed" means). Huh? In any event it's a special case, because "with EXPR:" is valid, and "(12 as x)" is an example of an EXPR, but put these two together and you're saying you should get a syntax error. (Or if you're not, you haven't stated what you *are* proposing...) And there's no good justification for making this a special case (unless you argue in circles: it's worth being a special case because "as" is a good syntax for assignment expressions, and "as" is a good syntax because it's unambiguous...) > If people agree with me that it is important to put the expression first > rather than the target name, then the fact that statements and for loops > put the name first shouldn't matter. I agree that having the expression first is better. And I think that -> deserves consideration on that basis. I think "as" is *not* a good option for a "puts the expression first" option, because of the ambiguities that Chris has explained. But I also think that it's a relatively minor point, and I have bigger reservations than this about the PEP, so arguing over this level of detail isn't crucial to me. Paul From peter.ed.oconnor at gmail.com Fri Apr 13 09:30:44 2018 From: peter.ed.oconnor at gmail.com (Peter O'Connor) Date: Fri, 13 Apr 2018 09:30:44 -0400 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) In-Reply-To: <5AD0B0A1.9080905@stoneleaf.us> References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info> <5AD0B0A1.9080905@stoneleaf.us> Message-ID: Well this may be crazy sounding, but we could allow left or right assignment with name := expr expr =: name Although it would seem to violate the "only one obvious way" maxim, at least it avoids this overloaded meaning with the "as" of "except" and "with" On Fri, Apr 13, 2018 at 9:29 AM, Ethan Furman wrote: > On 04/13/2018 06:18 AM, Steven D'Aprano wrote: > >> On Fri, Apr 13, 2018 at 09:56:35PM +1000, Chris Angelico wrote: >> > > If we agree that the benefit of putting the expression first is >> sufficiently large, or that the general Pythonic look of "expr as name" >> is sufficiently desirable (it just looks and reads nicely), then we can >> afford certain compromises. Namely, we can rule that: >> >> except expr as name: >> with expr as name: >> >> continue to have the same meaning that they have now and never mean >> assignment expressions. Adding parens should not change that.
>> > > +1 > > In other words, the rule is that "expr as name" keeps its current, older >> semantics in with and except statements, and NEVER means the new, PEP >> 572 assignment expression. >> >> Yes, that's a special case that breaks the rules, and I accept that it >> is a point against "as". But the Zen is a guideline, not a law of >> physics, and I think the benefits of "as" are sufficient that even >> losing a point it still wins. >> > > +1 > > 2) Forbid any use of "(expr as name)" in the header of a 'with' statement >>> >> >> You can't forbid it, because it is currently allowed syntax (albeit >> currently without the parens). So the rule is, it is allowed, but it >> means what it meant pre-PEP 572. >> > > +1 > > If people agree with me that it is important to put the expression first >> rather than the target name, then the fact that statements and for loops >> put the name first shouldn't matter. >> > > +1 to expression coming first! ;) > > -- > ~Ethan~ > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Fri Apr 13 10:11:27 2018 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 13 Apr 2018 07:11:27 -0700 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) In-Reply-To: References: <20180413110432.GB11616@ando.pearwood.info> Message-ID: <5AD0BA8F.7070306@stoneleaf.us> On 04/13/2018 05:02 AM, Jacco van Dorp wrote: > I must admit I like putting the expression first, though. Even if it's > just to make it harder to mix it up with normal assignment. Perhaps => > could be used - it's a new token, unlike -> which is used to annotate > return values, it's not legal syntax now (so no backwards compatibility > issues), and is used for similar purposes in, for example, PHP when > declaring associative arrays ($arr = array("key" => "value");). I'm not > convinced myself, though. The problem with => is that it's the opposite of >= which means typos would not cause SyntaxError and be hard to spot. -- ~Ethan~ From ncoghlan at gmail.com Fri Apr 13 10:06:52 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 14 Apr 2018 00:06:52 +1000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: <5AD04DBF.50704@stoneleaf.us> Message-ID: On 13 April 2018 at 16:47, Chris Angelico wrote: > Consider: > > pos = -1 > while pos := buffer.find(search_term, pos + 1) >= 0: > ... > > Once find() returns -1, the loop terminates. Should this need to be > parenthesized? I've certainly been assuming that cases like that would need to be written as: pos = -1 while (pos := buffer.find(search_term, pos + 1)) >= 0: ... I'd write the equivalent C while loop the same way: int pos = -1; while ((pos = find(buffer, search_term, pos + 1)) >= 0) ... The parentheses around the assignment in C are technically redundant, but I consider finding the matching parenthesis to be straightforward (especially with text editor assistance), while I consider figuring out where the next lower precedence operator appears difficult (since I don't have the C operator precedence table memorized, and there isn't any simple way for my text editor to help me out). Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Fri Apr 13 10:36:09 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 14 Apr 2018 00:36:09 +1000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: <20180413122209.GD11616@ando.pearwood.info> Message-ID: On 13 April 2018 at 22:35, Chris Angelico wrote: > On Fri, Apr 13, 2018 at 10:22 PM, Steven D'Aprano wrote: >> On Wed, Apr 11, 2018 at 11:50:44PM +1000, Chris Angelico wrote: >> >>> > Previously, there was an alternative _operator form_ `->` proposed by >>> > Steven D'Aprano. This option is no longer considered? I see several >>> > advantages with this variant: >>> > 1. It does not use `:` symbol which is very visually overloaded in Python. >>> > 2. It is clearly distinguishable from the usual assignment statement and >>> > it's `+=` friends >>> > There are others but they are minor. >>> >>> I'm not sure why you posted this in response to the open question, but >>> whatever. The arrow operator is already a token in Python (due to its >>> use in 'def' statements) and should not conflict with anything; >>> however, apart from "it looks different", it doesn't have much to >>> speak for it. >> >> On the contrary, it puts the expression first, where it belongs >> *semi-wink*. > > The 'as' syntax already has that going for it. What's the advantage of > the arrow over the two front-runners, ':=' and 'as'? I stumbled across https://www.hillelwayne.com/post/equals-as-assignment/ earlier this week, and I think it provides grounds to reconsider the suitability of ":=", as that symbol has historically referred to *re*binding an already declared name. That isn't the way we're proposing to use it here: we're using it to mean both implicit local variable declaration *and* rebinding of an existing name, the same as we do for "=" and "as". I think the "we already use colons in too many unrelated places" argument also has merit, as we already use the colon as: 1. the header terminator when introducing a nested suite 2. the key:value separator in dictionary displays and comprehensions 3. the name:annotation separator in function parameter declarations 4. the name:annotation separator in variable declarations and assignment statements 5. the parameter:result separator in lambda expressions 6. the start:stop:step separator in slice syntax "as" is at least more consistently associated with name binding, and has fewer existing uses in the first place, but has the notable downside of being thoroughly misleading in with statement header lines, as well as being *so* syntactically unobtrusive that it's easy to miss entirely (especially in expressions that use other keywords). The symbolic "right arrow" operator would be a more direct alternative to the "as" variant that was more visually distinct: # Handle a matched regex if (pattern.search(data) -> match) is not None: ... # More flexible alternative to the 2-arg form of iter() invocation while (read_next_item() -> item) is not None: ... # Share a subexpression between a comprehension filter clause and its output filtered_data = [y for x in data if (f(x) -> y) is not None] # Visually and syntactically unambiguous in with statement headers with create_cm() -> cm as enter_result: ...
(Pronunciation-wise, if we went with that option, I'd probably
pronounce "->" as "as" most of the time, but there are some cases like
the "while" example above where I'd pronounce it as "into")

The connection with function declarations would be a little tenuous,
but could be rationalised as:

Given the function declaration:

    def f(...) -> Annotation:
        ...

Then in the named subexpression:

    (f(...) -> name)

the inferred type of "name" is "Annotation"

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From steve at pearwood.info  Fri Apr 13 10:50:15 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 14 Apr 2018 00:50:15 +1000
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To:
References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info>
Message-ID: <20180413145014.GF11616@ando.pearwood.info>

On Fri, Apr 13, 2018 at 02:30:24PM +0100, Paul Moore wrote:
> On 13 April 2018 at 14:18, Steven D'Aprano wrote:
> > On Fri, Apr 13, 2018 at 09:56:35PM +1000, Chris Angelico wrote:
> >
> >> 2) Forbid any use of "(expr as name)" in the header of a 'with' statement
> >
> > You can't forbid it, because it is currently allowed syntax
>
> It's not currently allowed:

It is without the parens, as I said:

[...]
> > (albeit currently without the parens).
>
> Well, yes, but we're talking about *with* the parens.

You might be, but I'm not. I'm talking about the conflict between:

# pre-PEP 572 context manager
with expr as name:

# post-PEP 572, is it a context manager or an assignment expression?
with expr as name:

and I'm saying, don't worry about the syntactical ambiguity, just make a
ruling that it is a context manager, never an assignment expression.

I'm saying, don't even try to distinguish between the forms with or
without parens. If we add parens:

with (expr as name):

it may or may not be allowed some time in the future (since it isn't
allowed now, but there are many requests for it) but if it is allowed,
it will still mean a context manager and not an assignment expression.

(In case it isn't obvious, I'm saying that we need not *require* parens
for this feature, at least not if the only reason for doing so is to
make the with/except case unambiguous.)

[...]
> (unless you argue in
> circles: it's worth being a special case because "as" is a good syntax
> for assignment expressions, and "as" is a good syntax because it's
> unambiguous...)

I never said that "as" was unambiguous. I just spent a rather long post
attempting to explain how to deal with that ambiguity. Sorry if I was
not clear enough.

I'm saying we can deal with it by simply defining what we want it to
mean, and using the surrounding context to distinguish the cases. If it
is in a with or except clause (just the header clause, not the
following block), then it means the same thing it means now. Everywhere
else, it means an assignment expression.

Just like "x = 1" can mean assignment, or it can mean a parameter
declaration. We don't have different syntax for those two distinct
actions, we simply state that if "x = 1" is inside a lambda or function
def, then it always means a parameter declaration and never an illegal
assignment.

> > If people agree with me that it is important to put the expression first
> > rather than the target name, then the fact that statements and for loops
> > put the name first shouldn't matter.
>
> I agree that having the expression first is better. And I think that
> -> deserves consideration on that basis.

Indeed.
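To make the "x = 1" comparison above concrete, a minimal runnable sketch
of the two readings (both are valid today; the names are illustrative
only):

x = 1            # statement context: an assignment

def f(x=1):      # def header: the same spelling declares a default parameter
    return x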
> I think "as" is *not* a good option for a "puts the expression first"
> option, because of the ambiguities that Chris has explained.

I can see why people may dislike the "as" option. I think it comes down
to personal taste. My aim is to set out what I see as a way around the
with/except problem with a compromise:

- allow "expr as name" anywhere;

- require parentheses only to avoid syntactic ambiguity;

- and simply ban assignment expressions in with/except clauses and keep
the existing with/except semantics.

I acknowledge that it isn't ideal, but compromises between conflicting
requirements rarely are.

If people don't agree with me, you're all wrong, er, that is to say, I
understand your position even if I disagree *wink*

--
Steve

From j.van.dorp at deonet.nl  Fri Apr 13 11:04:00 2018
From: j.van.dorp at deonet.nl (Jacco van Dorp)
Date: Fri, 13 Apr 2018 17:04:00 +0200
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: <20180413145014.GF11616@ando.pearwood.info>
References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info> <20180413145014.GF11616@ando.pearwood.info>
Message-ID:

> I'm saying, don't even try to distinguish between the forms with or
> without parens. If we add parens:
>
> with (expr as name):
>
> it may or may not be allowed some time in the future (since it isn't
> allowed now, but there are many requests for it) but if it is allowed,
> it will still mean a context manager and not assignment expression.
>
> (In case it isn't obvious, I'm saying that we need not *require* parens
> for this feature, at least not if the only reason for doing so is to
> make the with/except case unambiguous.)

So if I read this correctly, you're making an argument to ignore parens ?

If I'd type with (expr as name) as othername:, I'd expect the original value
of expr in my name and the context manager's __enter__ return value in
othername. I don't really see any ambiguity in that case.

Without parens -> old syntax + meaning
with parens -> bind expr to name, because that's what the parens say.

Ignoring parens, in any way, just seems like a bad idea. If you want to
avoid with (expr as name):, shouldn't you make parens illegal ?

From adelfino at gmail.com  Fri Apr 13 11:08:08 2018
From: adelfino at gmail.com (=?UTF-8?Q?Andr=C3=A9s_Delfino?=)
Date: Fri, 13 Apr 2018 12:08:08 -0300
Subject: [Python-ideas] Accepting multiple mappings as positional arguments to create dicts
In-Reply-To:
References:
Message-ID:

Oh, I get it now, thanks!

On Fri, Apr 13, 2018 at 2:11 AM, Serhiy Storchaka wrote:

> 12.04.18 22:42, Andrés Delfino wrote:
>
>> I think the update method can (and personally, should) stay unchanged:
>>
>> spam.update(dict(x, y))
>>
>> seems succinct and elegant enough, with the proposed constructor syntax.
>>
>> Sorry my ignorance, do (Mutable)Mapping ABC say anything about
>> constructors?
>
> Mapping and MutableMapping ABCs don't have constructors, but many
> dict-like objects imitate the dict constructor: accept a single mapping or
> a sequence of pairs as a positional argument, and accept other dict as
> kwargs.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From guido at python.org Fri Apr 13 11:13:20 2018 From: guido at python.org (Guido van Rossum) Date: Fri, 13 Apr 2018 08:13:20 -0700 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: <5AD04DBF.50704@stoneleaf.us> Message-ID: Regarding the precedence of :=, at this point the only acceptable outcome for me is one where at the statement level, := and = can be used interchangeably and will have the same meaning (except probably we wouldn't allow combining both in a chained assignment). Because then we can say "and you can use assignment in expressions too, except there you must use := because = would be too easily confused with ==". Ideally this would allow all forms of = to be replaced with :=, even extended structure unpacking (a, b, *c := foo()). -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Fri Apr 13 11:34:05 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 13 Apr 2018 16:34:05 +0100 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: <5AD04DBF.50704@stoneleaf.us> Message-ID: On 13 April 2018 at 16:13, Guido van Rossum wrote: > Regarding the precedence of :=, at this point the only acceptable outcome > for me is one where at the statement level, := and = can be used > interchangeably and will have the same meaning (except probably we wouldn't > allow combining both in a chained assignment). Because then we can say "and > you can use assignment in expressions too, except there you must use := > because = would be too easily confused with ==". > > Ideally this would allow all forms of = to be replaced with :=, even > extended structure unpacking (a, b, *c := foo()). If we're going this far, why not just replace assignment statements with assignment expressions using := completely, and leave the current assignment statement for backward compatibility only (to be removed at some point following a deprecation period)? That's not rhetorical - if you're taking this position, I genuinely don't know if you see any remaining advantages to the assignment statement other than backward compatibility. Personally, I don't like the idea - but I'm not sure if that's just because I'm so used to assignment as a statement that I'm reluctant to accept change... Paul From steve at pearwood.info Fri Apr 13 11:44:15 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 14 Apr 2018 01:44:15 +1000 Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4) In-Reply-To: References: <20180413122209.GD11616@ando.pearwood.info> Message-ID: <20180413154415.GG11616@ando.pearwood.info> On Fri, Apr 13, 2018 at 10:35:59PM +1000, Chris Angelico wrote: > > On the contrary, it puts the expression first, where it belongs > > *semi-wink*. > > The 'as' syntax already has that going for it. What's the advantage of > the arrow over the two front-runners, ':=' and 'as'? Personally, I like "as" better than -> since English-like expressions and syntax is nicer than symbols. (Up to a point of course -- we surely don't want COBOL-like "ADD 1 TO x" syntax.) The arrow also completely bypasses the entire with/except problem. 
So my position is:

- "as" is more Pythonic and looks nicer;

- but it requires a compromise to avoid the with/except problem;

- I'm okay with that compromise, but if others aren't, my second
preference is the arrow binding operator ->

- which also has the advantage that it is completely unused apart from
function annotations;

- and if people don't like that, I'm okay with := as a distant third
choice. (But see below.)

> > The expression is the most important part of the assignment expression,
> > and because we read from left to right, it should come first. Let's take
> > a simple example:
> >
> > pair = (first_value := x + y + z,
> >         a + b + first_value
> >         )
> >
> > What's the first item of the pair? If you're like me, and I think most
> > people are similar, when skimming the code, you read only so far across
> > each line to get an idea of whether it is relevant or not.
>
> Yet Python has an if/else operator that, in contrast to C-inspired
> languages, violates that rule. So it's not a showstopper. :)

A ternary operator can't put *all* three clauses first. Only one can go
first. And you know which one we picked?

Not the condition, as C went with. Not the alternative "else"
expression. But the primary "if" clause, in other words, the expression
that we consider to be the usual case.

So the ternary if supports my position, it isn't in opposition :-)

But of course your general observation is correct. "Expression first" is
violated by regular assignment, by long tradition. I believe that was
inherited from mathematics, where it may or may not make sense, but
either way it is acceptable (if only because of long familiarity!) for
assignment statements. But we should reconsider it for expressions.

Analogy: we write "with expression as name" for context managers. We
could have put the name first and written "using name from expression",
but that puts the name and expression in the wrong order. Similarly we
don't write "import as np numpy", rather we use "import numpy as np". We
put the entity being imported first, the name it is bound to last.

But again, I acknowledge that there are exceptions, like for loops, and
they've been around a long time. Back to the 1950s and the invention of
Fortran. So no, this isn't an absolute showstopper.

> >> The arrow faces the other way in languages like Haskell,
> >
> > Indeed, but in R, it faces to the right. (Actually, R allows both
> > directions.) There's also apparently a language BETA which uses -> for
> > assignment, although I've never used it.
>
> I looked up R's Wikipedia page and saw only the left-facing arrow. How
> common is the right-facing arrow? Will people automatically associate
> it with name binding?

I don't interact with the R community enough to know how commonly people
use -> versus <- but you can certainly try it for yourself in the R
interpreter if you have any doubts that it works.

The statistician John Cook says -> is "uncommon":

https://www.johndcook.com/blog/r_language_for_programmers/#assignment

but in any case, I think that the idea of arrows as pointers, motion,
"putting into" and by analogy assignment shouldn't be hard to grasp.

The first time I saw pseudo-code using <- for assignment, I was confused
by why the arrow was pointing to the left instead of the right but I had
no trouble understanding that it implied taking the value at the
non-pointy end and moving it into the variable at the pointy end.

> So we have calculators, and possibly R, and sorta-kinda Haskell,
> recommending some form of arrow.
> We have Pascal and its derivatives
> recommending colon-equals. And we have other usage in Python, with
> varying semantics, recommending 'as'. I guess that's enough to put the
> arrow in as another rejected alternate spelling, but not to seriously
> consider it.

Well, it's your PEP, and I can't force you to treat my suggestion
seriously, but I am serious about it and a few other people have agreed
with me.

I grew up with Pascal and I like it, but it's 2018 and a good twenty to
thirty years since Pascal was a mainstream language outside of academia.
In 1988, even academia was slowly moving away from Pascal. I think the
death knell of Pascal as a serious mainstream language was when Apple
stopped using Pascal for their OS and swapped to C for System 7 in 1991.

The younger generation of programmers today mostly know Pascal only as
one of those old-timer languages that is "considered harmful", if even
that. And there is an entire generation of kids growing up using CAS
calculators for high school maths who will be familiar with -> as
assignment.

So in my opinion, while := is a fine third choice, I really think that
the arrow operator is better.

--
Steve

From guido at python.org  Fri Apr 13 11:45:32 2018
From: guido at python.org (Guido van Rossum)
Date: Fri, 13 Apr 2018 08:45:32 -0700
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References: <20180413122209.GD11616@ando.pearwood.info>
Message-ID:

On Fri, Apr 13, 2018 at 7:36 AM, Nick Coghlan wrote:

> On 13 April 2018 at 22:35, Chris Angelico wrote:
> > The 'as' syntax already has that going for it. What's the advantage of
> > the arrow over the two front-runners, ':=' and 'as'?
>
> I stumbled across
> https://www.hillelwayne.com/post/equals-as-assignment/ earlier this
> week, and I think it provides grounds to reconsider the suitability of
> ":=", as that symbol has historically referred to *re*binding an
> already declared name. That isn't the way we're proposing to use it
> here: we're using it to mean both implicit local variable declaration
> *and* rebinding of an existing name, the same as we do for "=" and
> "as".

I've not done much research about this topic, but I lived through it (my
first languages were Algol-60 and Fortran IV, in 1974, soon followed by
Pascal and Algol-68) and I think that blog post is overly biased by one
particular thread of C's ancestry. CPL, BCPL and B were bit players in
the world of languages (I suspect mostly focused on Bell Labs and/or MIT,
at least US east coast). I should probably blog about my own view of this
history, but basically I don't believe that the distinction between
initialization and re-assignment is the big decider here.

Python historically doesn't care, all its assignments work like
dict[key] = value (with a slight exception for the analyses related to
local scopes and closures).

> I think the "we already use colons in too many unrelated places"
> argument also has merit, as we already use the colon as:
>
> 1. the header terminator when introducing a nested suite
> 2. the key:value separator in dictionary displays and comprehensions
> 3. the name:annotation separator in function parameter declarations
> 4. the name:annotation separator in variable declarations and
> assignment statements
> 5. the parameter:result separator in lambda expressions
> 6. the start:stop:step separator in slice syntax

But := is not a colon -- it's a new symbol spelled as two characters. The
lexer returns it as a single symbol, like it does != and ==.
And we're lucky in the sense that no expression or statement can start
with =, so there is no context where : = and := would both be legal.

> "as" is at least more consistently associated with name binding, and
> has fewer existing uses in the first place, but has the notable
> downside of being thoroughly misleading in with statement header
> lines, as well as being *so* syntactically unobtrusive that it's easy
> to miss entirely (especially in expressions that use other keywords).

Right -- 'as' signals that there's something funky happening to its left
argument before the assignment is made.

> The symbolic "right arrow" operator would be a more direct alternative
> to the "as" variant that is more visually distinct:
>
> # Handle a matched regex
> if (pattern.search(data) -> match) is not None:
>     ...
>
> # More flexible alternative to the 2-arg form of iter() invocation
> while (read_next_item() -> item) is not None:
>     ...
>
> # Share a subexpression between a comprehension filter clause and its
> output
> filtered_data = [y for x in data if (f(x) -> y) is not None]
>
> # Visually and syntactically unambiguous in with statement headers
> with create_cm() -> cm as enter_result:
>     ...
>
> (Pronunciation-wise, if we went with that option, I'd probably
> pronounce "->" as "as" most of the time, but there are some cases like
> the "while" example above where I'd pronounce it as "into")
>
> The connection with function declarations would be a little tenuous,
> but could be rationalised as:
>
> Given the function declaration:
>
> def f(...) -> Annotation:
>     ...
>
> Then in the named subexpression:
>
> (f(...) -> name)
>
> the inferred type of "name" is "Annotation"

I am not excited about (expr -> var) at all, because the existing use of
-> in annotations is so entirely different.

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tim.peters at gmail.com  Fri Apr 13 12:14:18 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 13 Apr 2018 11:14:18 -0500
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To:
References: <20180413110432.GB11616@ando.pearwood.info>
Message-ID:

[Chris]
> ...
> So I don't see much value in a "->" operator, except for the
> mere fact that it's different (and thus won't conflict in
> except/with); and the bulk of name bindings in Python put
> the name first.

It does give a natural solution to one of the problematic examples,
because as a very-low-precedence operator it would suck up "everything
to the left" instead of "everything to the right" as "the expression
part":

    if f(x) -> a is not None:

can't be misunderstood, but:

    if a := f(x) is not None:

is routinely misunderstood.

On the other hand, if assignment expressions are expanded to support all
the forms of unpacking syntax (as Guido appears to favor), then other
cases arise:

    if f() -> a, b > (3, 6):

Is that:

    if ((f() -> a), b) > (3, 6):

or

    if (f() -> (a, b)) > (3, 6):

?  Which is an argument in favor of ":=" to me: an assignment statement
can be pretty complex, and if an assignment operator can become just as
complex then it's best if it looks and works (as much as possible)
exactly like an assignment statement.

If someone writes

    if a, b := f() > (3, 6):

it's easy to explain why that's broken by referring to the assignment
statement

    a, b = f() > (3, 6)

"You're trying to unpack a Boolean into a 2-tuple - add parens to
express what you really want."
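Tim's precedence point can be checked today with the ordinary assignment
statement, which binds just as loosely -- a small runnable sketch, with
"f" standing in for any function that may return None:

def f(x):
    return x if x > 0 else None

a = f(5) is not None    # "=" grabs everything to its right: a is True, not 5
assert a is True

a = f(5)                # the two-step spelling makes the intent explicit
assert a == 5 and a is not None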
From srkunze at mail.de  Fri Apr 13 12:05:58 2018
From: srkunze at mail.de (Sven R. Kunze)
Date: Fri, 13 Apr 2018 18:05:58 +0200
Subject: [Python-ideas] Accepting multiple mappings as positional arguments to create dicts
In-Reply-To:
References: <20180411044441.GQ16661@ando.pearwood.info> <838a4818-a530-f112-9505-b5f2f09d9a91@mgmiller.net>
Message-ID: <1067507e-c2ad-79b3-88c5-0255415d4ea1@mail.de>

I usually go with + only to find out that dict was something special
here. ;-) Then, I use .update only to find out that it's in-place and I
need to join more than two. Then, I use some sort of dict comprehension
or the dict constructor etc.

Naively, I would say + is what comes to mind easily. :D Then, even
sum(my_dicts) would work. ;-)

on-topic: Multiple arguments to dict(*my_dicts) just complement the
alternative {**...} comprehension. So, it seems legit. +1

On 12.04.2018 21:32, Andrés Delfino wrote:
> There's a long thread about the subject:
> https://mail.python.org/pipermail/python-ideas/2015-February/031748.html
>
> I suggest to avoid the matter altogether :)
>
> On Thu, Apr 12, 2018 at 4:15 PM, Mike Miller wrote:
>
>     While we're on the subject, I've tried to add dicts a few times
>     over the years to get a new one but it doesn't work:
>
>         d3 = d1 + d2  # TypeError
>
>     Thinking a bit, set union is probably a better analogue, but it
>     doesn't work either:
>
>         d3 = d1 | d2  # TypeError
>
>     Where the last value of any duplicate keys should win.
>
>     -Mike
>
>     On 2018-04-12 06:46, Andrés Delfino wrote:
>
>         Extending the original idea, IMHO it would make sense for the
>         dict constructor to create a new dictionary not only from
>         several mappings, but mixing mappings and iterables too.
>
>         Consider this example:
>
>         x = [(1, 'one')]
>         y = {2: 'two'}
>
>         Now: {**dict(x), **y}
>         Proposed: dict(x, y)
>
>     _______________________________________________
>     Python-ideas mailing list
>     Python-ideas at python.org
>     https://mail.python.org/mailman/listinfo/python-ideas
>     Code of Conduct: http://python.org/psf/codeofconduct/
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From python-ideas at mgmiller.net  Fri Apr 13 12:27:35 2018
From: python-ideas at mgmiller.net (Mike Miller)
Date: Fri, 13 Apr 2018 09:27:35 -0700
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To: <20180413154415.GG11616@ando.pearwood.info>
References: <20180413122209.GD11616@ando.pearwood.info> <20180413154415.GG11616@ando.pearwood.info>
Message-ID: <5cd0610c-c766-3ba4-adbd-b499d7b90b4f@mgmiller.net>

On 2018-04-13 08:44, Steven D'Aprano wrote:
> So my position is:
>
> - "as" is more Pythonic and looks nicer;

Yes, perhaps if typing annotations had not chosen the colon but used a
whitespace delimiter instead, adding a few more colons to the source
would not be an issue. But in combination, it feels like there will be
way too many colons in a potential future Python.

One of the things I liked about it historically was its relative lack of
punctuation and use of short words like in, and, not, as, etc.

As mentioned before, I liked := for assignment in Pascal, but presumably
since we are keeping == for testing, there's not as strong of an
argument for that spelling.
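Returning to the dict-construction thread above, a runnable sketch of
the spellings that already work today (the proposed dict(x, y) form is
the subject of the thread, not valid 3.6 code):

x = [(1, 'one')]                  # an iterable of key/value pairs
y = {2: 'two'}                    # a mapping

merged = {**dict(x), **y}         # the current spelling
assert merged == {1: 'one', 2: 'two'}

d1 = {'a': 1}
d2 = {'a': 2, 'b': 3}
assert {**d1, **d2} == {'a': 2, 'b': 3}   # last value wins for duplicate keys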
-Mike

From steve at pearwood.info  Fri Apr 13 12:43:57 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 14 Apr 2018 02:43:57 +1000
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To:
References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info> <20180413145014.GF11616@ando.pearwood.info>
Message-ID: <20180413164357.GH11616@ando.pearwood.info>

On Fri, Apr 13, 2018 at 05:04:00PM +0200, Jacco van Dorp wrote:

> > I'm saying, don't even try to distinguish between the forms with or
> > without parens. If we add parens:
> >
> > with (expr as name):
> >
> > it may or may not be allowed some time in the future (since it isn't
> > allowed now, but there are many requests for it) but if it is allowed,
> > it will still mean a context manager and not assignment expression.
> >
> > (In case it isn't obvious, I'm saying that we need not *require* parens
> > for this feature, at least not if the only reason for doing so is to
> > make the with/except case unambiguous.)
>
> So if I read this correctly, you're making an argument to ignore parens ?

You can always add unneeded parentheses for grouping that have no
effect:

py> value = ((((( 1 )) + (((1))))))
py> value
2

One should never be penalised by the interpreter for unnecessary
parentheses. If they do nothing, it shouldn't be an error. Do you really
want an error just because you can't remember operator precedence?

flag = (spam is None) or (len(spam) == 0)

The parens are unnecessary, but I still want to write them. That
shouldn't be an error.

> If I'd type with (expr as name) as othername:, I'd expect the original value
> of expr in my name and the context manager's __enter__ return value in
> othername. I don't really see any ambiguity in that case.

That case would be okay. But the ambiguity comes from this case:

with expr as name:

That could mean either of:

1. Assignment-as-expression, in which case <name> gets set to the value
of <expr> and __enter__ is never called;

2. With-statement context manager, in which case <name> gets set to the
value of <expr>.__enter__().

(This assumes that assignment-expressions don't require parens. For the
case where they are required, see below.)

That's a subtle but important difference, and it is especially awful
because most of the time it *seems* to work. Until suddenly it doesn't.

The problem is that the most common context manager objects return
themselves from __enter__, so it doesn't matter whether <name> is set to
<expr> or <expr>.__enter__(), the result will be the same.

But some context managers don't work like that, and so your code will
have a non-obvious bug just waiting to bite.

How about if we require parentheses? That will mean that we treat these
two statements as different:

with expr as name:  # 1

with (expr as name):  # 2

Case #1 is of course the current syntax, and it is fine as it is.

It is the second case that has problems. Suppose the expression is
really long, and so you innocently intend to write:

with (really
      long
      expression) as name:

but you accidentally put the closing parenthesis in the wrong place:

with (really
      long
      expression as name):

as is easy enough to do. And now you have a subtle bug: name will no
longer be set to the result of calling __enter__ as you expect.

It is generally a bad idea to have the presence or absence of
parentheses *alone* make a semantic difference. With very few
exceptions, it leads to problems.
For example, if you remember except clauses back in the early versions
of Python 2, or 1.5, you will remember the problems caused by treating:

except NameError, ValueError:

except (NameError, ValueError):

as ever-so-subtly different. If you don't remember Python that long
ago, would you like to guess the difference?

> Without parens -> old syntax + meaning
> with parens -> bind expr to name, because that's what the parens say.

It isn't the parentheses that cause the binding. It is the "as". So if
you move a perfectly innocent assignment-expression into a with
statement, the result will depend on whether or not it came with
parentheses:

# these do the same thing
while expression as name:
while (expression as name):

# so do these
result = [1, expression as name, name + 2]
result = [1, (expression as name), name + 2]

# but these are subtly different and will be a trap for the unwary
with expression as name:  # name is set to __enter__()
with (expression as name):  # name is not set to __enter__()

Of course, we could insist that parens are ALWAYS required around
assignment-expressions, but that will be annoying.

--
Steve

From rosuav at gmail.com  Fri Apr 13 13:10:36 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 14 Apr 2018 03:10:36 +1000
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References: <20180413122209.GD11616@ando.pearwood.info>
Message-ID:

On Sat, Apr 14, 2018 at 12:36 AM, Nick Coghlan wrote:
> On 13 April 2018 at 22:35, Chris Angelico wrote:
>> On Fri, Apr 13, 2018 at 10:22 PM, Steven D'Aprano wrote:
>>> On Wed, Apr 11, 2018 at 11:50:44PM +1000, Chris Angelico wrote:
>>>
>>>> > Previously, there was an alternative _operator form_ `->` proposed by
>>>> > Steven D'Aprano. This option is no longer considered? I see several
>>>> > advantages with this variant:
>>>> > 1. It does not use `:` symbol which is very visually overloaded in Python.
>>>> > 2. It is clearly distinguishable from the usual assignment statement and
>>>> > it's `+=` friends
>>>> > There are others but they are minor.
>>>>
>>>> I'm not sure why you posted this in response to the open question, but
>>>> whatever. The arrow operator is already a token in Python (due to its
>>>> use in 'def' statements) and should not conflict with anything;
>>>> however, apart from "it looks different", it doesn't have much to
>>>> speak for it.
>>>
>>> On the contrary, it puts the expression first, where it belongs
>>> *semi-wink*.
>>
>> The 'as' syntax already has that going for it. What's the advantage of
>> the arrow over the two front-runners, ':=' and 'as'?
>
> I stumbled across
> https://www.hillelwayne.com/post/equals-as-assignment/ earlier this
> week, and I think it provides grounds to reconsider the suitability of
> ":=", as that symbol has historically referred to *re*binding an
> already declared name. That isn't the way we're proposing to use it
> here: we're using it to mean both implicit local variable declaration
> *and* rebinding of an existing name, the same as we do for "=" and
> "as".

I'm not bothered by that. Assignment semantics vary from one language
to another; the fact that Python marks as local anything that's
assigned to is independent of the way you assign to it. ("print(x)"
followed by "for x in ..." is going to bomb with UnboundLocalError,
for instance.) If Python had any form of local variable declarations,
it wouldn't change the behaviour of the := operator.
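Chris's parenthetical example, spelled out as a minimal runnable sketch
(the function and names are illustrative):

x = "global"

def demo():
    try:
        print(x)    # UnboundLocalError: the loop below makes x local to demo()
    except UnboundLocalError as exc:
        print("caught:", exc)
    for x in range(3):
        pass

demo()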
ChrisA

From rosuav at gmail.com  Fri Apr 13 13:28:24 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 14 Apr 2018 03:28:24 +1000
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To: <20180413154415.GG11616@ando.pearwood.info>
References: <20180413122209.GD11616@ando.pearwood.info> <20180413154415.GG11616@ando.pearwood.info>
Message-ID:

On Sat, Apr 14, 2018 at 1:44 AM, Steven D'Aprano wrote:
> On Fri, Apr 13, 2018 at 10:35:59PM +1000, Chris Angelico wrote:
>> Yet Python has an if/else operator that, in contrast to C-inspired
>> languages, violates that rule. So it's not a showstopper. :)
>
> A ternary operator can't put *all* three clauses first. Only one can go
> first. And you know which one we picked?
>
> Not the condition, as C went with.

Which is also the most important clause in some situations, since it's
the governing clause. In an if *statement*, it's the one in the header,
before the indented block. I don't think there's enough logic here to
draw a pattern from.

> But of course your general observation is correct. "Expression first" is
> violated by regular assignment, by long tradition. I believe that was
> inherited from mathematics, where it may or may not make sense, but
> either way it is acceptable (if only because of long familiarity!) for
> assignment statements. But we should reconsider it for expressions.

I don't know about professional-level mathematics, but certainly what I
learned in grade school was conventionally written with the dependent
variable before the equals sign. You'd write "y = x²" to graph a
parabola, not "x² = y". So the programming language convention of
putting the assignment target first was logical to me.

> but in any case, I think that the idea of arrows as pointers, motion,
> "putting into" and by analogy assignment shouldn't be hard to grasp.

Agreed. Whichever way the arrow points, it's saying "the data flows
thattaway". C++ took this to a cute level with "cout << blah", abusing
the shift operators to signal data flow, and it's a nuisance because of
operator precedence sometimes, but nobody can deny that it makes
intuitive sense.

> The first time I saw pseudo-code using <- for assignment, I was confused
> by why the arrow was pointing to the left instead of the right but I had
> no trouble understanding that it implied taking the value at the
> non-pointy end and moving it into the variable at the pointy end.

(Now I want to use "pointy end" somewhere in the PEP. Just because.)

>> So we have calculators, and possibly R, and sorta-kinda Haskell,
>> recommending some form of arrow. We have Pascal and its derivatives
>> recommending colon-equals. And we have other usage in Python, with
>> varying semantics, recommending 'as'. I guess that's enough to put the
>> arrow in as another rejected alternate spelling, but not to seriously
>> consider it.
>
> Well, it's your PEP, and I can't force you to treat my suggestion
> seriously, but I am serious about it and a few other people have agreed
> with me.
>
> I grew up with Pascal and I like it, but it's 2018 and a good twenty to
> thirty years since Pascal was a mainstream language outside of academia.
> In 1988, even academia was slowly moving away from Pascal. I think the
> death knell of Pascal as a serious mainstream language was when Apple
> stopped using Pascal for their OS and swapped to C for System 7 in 1991.
>
> The younger generation of programmers today mostly know Pascal only as
> one of those old-timer languages that is "considered harmful", if even
> that. And there is an entire generation of kids growing up using CAS
> calculators for high school maths who will be familiar with -> as
> assignment.
>
> So in my opinion, while := is a fine third choice, I really think that
> the arrow operator is better.

If I were to take a poll of Python developers, offering just these three
options:

1) TARGET := EXPR
2) EXPR as TARGET
3) EXPR -> TARGET

and ask them to rank them in preferential order, I fully expect that all
six arrangements would have passionate supporters.

So far, I'm still seeing a strong parallel between expression-assignment
and statement-assignment, to the point of definitely wanting to preserve
that.

ChrisA

From python at mrabarnett.plus.com  Fri Apr 13 13:37:22 2018
From: python at mrabarnett.plus.com (MRAB)
Date: Fri, 13 Apr 2018 18:37:22 +0100
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To:
References: <5AD04DBF.50704@stoneleaf.us>
Message-ID:

On 2018-04-13 16:13, Guido van Rossum wrote:
> Regarding the precedence of :=, at this point the only acceptable
> outcome for me is one where at the statement level, := and = can be used
> interchangeably and will have the same meaning (except probably we
> wouldn't allow combining both in a chained assignment). Because then we
> can say "and you can use assignment in expressions too, except there you
> must use := because = would be too easily confused with ==".
>
> Ideally this would allow all forms of = to be replaced with :=, even
> extended structure unpacking (a, b, *c := foo()).

+1

It would be confusing to have 2 forms of assignment which look alike but
with different precedences.

From rosuav at gmail.com  Fri Apr 13 15:24:08 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 14 Apr 2018 05:24:08 +1000
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: <20180413131859.GE11616@ando.pearwood.info>
References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info>
Message-ID:

On Fri, Apr 13, 2018 at 11:18 PM, Steven D'Aprano wrote:
> On Fri, Apr 13, 2018 at 09:56:35PM +1000, Chris Angelico wrote:
>
>> How many times have people asked for "with (expr as name):" to
>> be supported, allowing the statement to spread over multiple lines?
>> With this syntax, it would suddenly be permitted - with dangerously
>> similar semantics.
>
> I see your point, but why don't we support "with (expr as name):" to
> allow multiple lines? No, don't answer that... its off-topic. Forget I
> asked.

The answer is simple: for the same reason that you can't parenthesize
*most* statements.

>>> for (x in
...     range(5)):
  File "<stdin>", line 2
    range(5)):
             ^
SyntaxError: invalid syntax

It's not an expression, so it doesn't follow the rules of expressions.
There is special grammar around the import statement to permit this, and
anywhere else, if it's not an expression, parens don't work. (Backslashes
do, but that's because they function at a different level.)

> If we agree that the benefit of putting the expression first is
> sufficiently large, or that the general Pythonic look of "expr as name"
> is sufficiently desirable (it just looks and reads nicely), then we can
> afford certain compromises.
> Namely, we can rule that:
>
> except expr as name:
> with expr as name:
>
> continue to have the same meaning that they have now and never mean
> assignment expressions. Adding parens should not change that.

That's a different sort of special case from what I had in mind, which
is that illegal parenthesization should raise SyntaxError. Either way,
though...

> Yes, that's a special case that breaks the rules, and I accept that it
> is a point against "as". But the Zen is a guideline, not a law of
> physics, and I think the benefits of "as" are sufficient that even
> losing a point it still wins.

... yes, the false parallel is a strike against the "as" syntax.

>> So if it's a bug magnet, what do we do?
>>
>> 1) Permit the subtly different semantics, and tell people to be careful
>
> No.

Agreed.

>> 2) Forbid any use of "(expr as name)" in the header of a 'with' statement
>
> You can't forbid it, because it is currently allowed syntax (albeit
> currently without the parens). So the rule is, it is allowed, but it
> means what it meant pre-PEP 572.

It isn't currently-allowed syntax precisely because of those parens. So
"what it meant pre-PEP 572" is "raise SyntaxError".

>> 3) Forbid it at top level, but permit it deeper down
>
> I don't know what that means. But whatever it means, probably no :-)

That would mean that this is a SyntaxError:

with (get_file() as f):

But this is allowed:

with open(get_filename() as name):

It's a more complicated rule, but a narrower special case (the ONLY
thing disallowed is the one thing that would be interpreted differently
with and without parens), and there's no ambiguity here.

>> 4) Something else??
>
> Well, there's always the hypothetical -> arrow binding operator, or the
> Pascal := assignment operator (the current preference).
>
> I don't hate the := choice, I just think it is more Pascal-esque than
> Pythonic :-)

That's fair. And since Python has the "==" operator for comparison,
using a Pascal-style ":=" for assignment isn't really paralleling
properly. But purely within Python, any one of the three ('as', ':=',
'->') is going to be at least somewhat viable, and it comes down to what
they parallel *in Python*:

1) 'as' parallels 'import', 'with', and 'except', which perform name
bindings based on some language magic using what's to the left of the
'as'

2) ':=' parallels '=', which takes what's on its right and assigns it to
the target on the left

3) '->' parallels function return value annotations, but looks like
data's moving from the left to the right.

>> > I don't especially dislike := but I really think that putting the
>> > expression first is a BIG win for readability. If that requires parens
>> > to disambiguate it, so be it.
>>
>> There's a mild parallel between "(expr as name)" and other uses of
>> 'as', which bind to that name. Every other use of 'as' is part of a
>> special syntactic form ('import', 'with', and 'except'), but they do
>> all bind to that name. (Point of interest: You can "with expr as
>> x[0]:" but none of the other forms allow anything other than a simple
>> name.)
>
> I disagree: I think it is a strong parallel. They're both name bindings.
> How much stronger do you want?

True, they're name bindings. So are non-renaming import statements
("import os", "from pprint import pprint"), function and class
definitions, and for loops. None of those use 'as'. So it's a weak
parallel.
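For reference, the as-free name bindings Chris lists, as a runnable
sketch:

import os                    # binds "os"
from pprint import pprint    # binds "pprint"

def func():                  # binds "func"
    pass

class Cls:                   # binds "Cls"
    pass

for i in range(3):           # binds "i"
    pass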
> True, we don't currently allow such things as > > import math as maths, mathematics, spam.modules[0] > > but we could if we wanted to and there was a sensible use-case for it. I'm not entirely sure what this would do, partly because I'm unsure whether it means to import "math" under the name "maths", and the other two separately, or to import "math" with three targets. >> There's a strong parallel between "target := value" and "target >> = value"; > > Sure. And for a statement, either form would be fine. I just think that > in an expression, it is important enough to bring the expression to the > front, even if it requires compromise elsewhere. Why is it okay for a statement but not an expression? Genuine question, not scorning it. Particularly since expressions can be used as statements, so "value -> target" could be used as a statement. > But whether it is viable or not depends on *us*, not what other > languages do. No other language choose the syntax of ternary if > expression before Python used it. We aren't limited to only using syntax > some other language used first. Indeed, but for the sake of the PEP, it's useful to cite prior art. Just wanted to clarify. > If people agree with me that it is important to put the expression first > rather than the target name, then the fact that statements and for loops > put the name first shouldn't matter. > > And if they don't, then I'm outvoted :-) Fair enough. That said, though, consistency DOES have value. The language fits into people's heads far better if parts of it behave the same way as other parts. The only question is, which consistencies matter the most? ChrisA From rosuav at gmail.com Fri Apr 13 15:33:57 2018 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 14 Apr 2018 05:33:57 +1000 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) In-Reply-To: References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info> <5AD0B0A1.9080905@stoneleaf.us> Message-ID: On Fri, Apr 13, 2018 at 11:30 PM, Peter O'Connor wrote: > Well this may be crazy sounding, but we could allow left or right assignment > with > > name := expr > expr =: name > > Although it would seem to violate the "only one obvious way" maxim, at least > it avoids this overloaded meaning with the "as" of "except" and "with" > Hah. It took me multiple readings to even notice the change in the operator there, so I think this would cause a lot of confusion. (Don't forget, by the way, that an expression can be a simple name, and an assignment target can be more complicated than a single name, so it won't always be obvious on that basis.) It probably wouldn't *technically* conflict with anything, but it would get extremely confusing! 
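A minimal sketch of the parenthetical point above -- either side of an
assignment may be a bare name, so the operands alone don't reveal which
way the data flows (the names are illustrative):

data = {"key": 0}
value = 42
data["key"] = value    # complex target, simple-name source
value = data["key"]    # simple-name target, complex source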
ChrisA From rosuav at gmail.com Fri Apr 13 15:41:23 2018 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 14 Apr 2018 05:41:23 +1000 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) In-Reply-To: <20180413145014.GF11616@ando.pearwood.info> References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info> <20180413145014.GF11616@ando.pearwood.info> Message-ID: On Sat, Apr 14, 2018 at 12:50 AM, Steven D'Aprano wrote: > On Fri, Apr 13, 2018 at 02:30:24PM +0100, Paul Moore wrote: >> On 13 April 2018 at 14:18, Steven D'Aprano wrote: >> > On Fri, Apr 13, 2018 at 09:56:35PM +1000, Chris Angelico wrote: >> >> >> 2) Forbid any use of "(expr as name)" in the header of a 'with' statement >> > >> > You can't forbid it, because it is currently allowed syntax >> >> It's not currently allowed: > > It is without the parens, as I said: > > [...] >> > (albeit currently without the parens). >> >> Well, yes, but we're talking about *with* the parens. > > You might be, but I'm not. Then we're talking about unrelated points, and "currently-allowed" is immaterial. > I'm talking about the conflict between: > > # pre-PEP 572 context manager > with expr as name: > > # post-PEP 572, is it a context manager or an assignment expression? > with expr as name: > > > and I'm saying, don't worry about the syntactical ambiguity, just make a > ruling that it is a context manager, never an assignment expression. That part isn't at all in question. The meaning of "with expr as name:" MUST NOT change, because that would massively break backward compatibility. > I'm saying, don't even try to distinguish between the forms with or > without parens. If we add parens: > > with (expr as name): > > it may or may not be allowed some time in the future (since it isn't > allowed now, but there are many requests for it) but if it is allowed, > it will still mean a context manager and not assignment expression. For that to work, the assignment expression MUST be disallowed in the header of a 'with' statement. Which is one of the suggestions I made. In fact, it was the PEP's recommendation as of the last point when 'as' was the preferred syntax. I never got as far as implementing that, but it'd be the safest solution. And it's still a special case. > (In case it isn't obvious, I'm saying that we need not *require* parens > for this feature, at least not if the only reason for doing so is to > make the with/except case unambiguous.) Definitely not; again, that would break backward compatibility. We are in agreement on that. > If it is in a with or except clause (just the header clause, not the > following block) , then it means the same thing it means now. Everywhere > else, it means assignment expression. Then it HAS to be a SyntaxError. Because that's what it means now. Or are you saying "it means the same thing it would mean now if the parentheses were omitted"? Because that is definitely not an easy thing to explain. > If people don't agree with me, you're all wrong, er, that is to say, I > understand your position even if I disagree *wink* Funnily enough, that's my stance too. 
I guess we're all just wrong, wronger, and wrongest today :)

ChrisA

From rosuav at gmail.com  Fri Apr 13 16:31:31 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 14 Apr 2018 06:31:31 +1000
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: <20180413164357.GH11616@ando.pearwood.info>
References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info> <20180413145014.GF11616@ando.pearwood.info> <20180413164357.GH11616@ando.pearwood.info>
Message-ID:

On Sat, Apr 14, 2018 at 2:43 AM, Steven D'Aprano wrote:
> On Fri, Apr 13, 2018 at 05:04:00PM +0200, Jacco van Dorp wrote:
>
>> > I'm saying, don't even try to distinguish between the forms with or
>> > without parens. If we add parens:
>> >
>> > with (expr as name):
>> >
>> > it may or may not be allowed some time in the future (since it isn't
>> > allowed now, but there are many requests for it) but if it is allowed,
>> > it will still mean a context manager and not assignment expression.
>> >
>> > (In case it isn't obvious, I'm saying that we need not *require* parens
>> > for this feature, at least not if the only reason for doing so is to
>> > make the with/except case unambiguous.)
>>
>> So if I read this correctly, you're making an argument to ignore parens ?
>
> You can always add unneeded parentheses for grouping that have no
> effect:
>
> py> value = ((((( 1 )) + (((1))))))
> py> value
> 2

That is true ONLY in an expression, not in all forms of syntax. Any
place where you can have an expression, you can have (expression). You
cannot, however, add parentheses around non-expression units:

(x = 1)  # SyntaxError
(for x in range(2)):  # SyntaxError
assert (False, "ham")  # Won't raise
assert False, "spam"  # Will raise

There is a special case for import statements:

from sys import (modules)

but for everything else, you can only parenthesize expressions (aka
"slabs of syntax that evaluate, eventually, to a single object").

> One should never be penalised by the interpreter for unnecessary
> parentheses. If they do nothing, it shouldn't be an error. Do you really
> want an error just because you can't remember operator precedence?
>
> flag = (spam is None) or (len(spam) == 0)
>
> The parens are unnecessary, but I still want to write them. That
> shouldn't be an error.

The RHS of an assignment is an expression. But you can't move that
first '(' any further to the left.

>> If I'd type with (expr as name) as othername:, I'd expect the original value
>> of expr in my name and the context manager's __enter__ return value in
>> othername. I don't really see any ambiguity in that case.
>
> That case would be okay. But the ambiguity comes from this case:
>
> with expr as name:
>
> That could mean either of:
>
> 1. Assignment-as-expression, in which case <name> gets set to the value
> of <expr> and __enter__ is never called;
>
> 2. With-statement context manager, in which case <name> gets set to the
> value of <expr>.__enter__().

Since it's pre-existing syntax, the only valid interpretation is #2. But
if parenthesized, both meanings are plausible, and #1 is far more sane
(since #2 would demand special handling in the grammar).

> The problem is that the most common context manager objects return
> themselves from __enter__, so it doesn't matter whether <name> is set to
> <expr> or <expr>.__enter__(), the result will be the same.
>
> But some context managers don't work like that, and so your code will
> have a non-obvious bug just waiting to bite.

Right.

> How about if we require parentheses?
> That will mean that we treat these
> two statements as different:
>
> with expr as name:  # 1
>
> with (expr as name):  # 2
>
> Case #1 is of course the current syntax, and it is fine as it is.
>
> It is the second case that has problems. Suppose the expression is
> really long, and so you innocently intend to write:
>
> with (really
>       long
>       expression) as name:
>
> but you accidentally put the closing parenthesis in the wrong place:
>
> with (really
>       long
>       expression as name):
>
> as is easy enough to do. And now you have a subtle bug: name will no
> longer be set to the result of calling __enter__ as you expect.

Sure. And that's a good reason to straight-up *forbid* expression-as-name
in a 'with' statement.

> It is generally a bad idea to have the presence or absence of
> parentheses *alone* make a semantic difference. With very few
> exceptions, it leads to problems. For example, if you remember except
> clauses back in the early versions of Python 2, or 1.5, you will
> remember the problems caused by treating:
>
> except NameError, ValueError:
>
> except (NameError, ValueError):
>
> as ever-so-subtly different. If you don't remember Python that long
> ago, would you like to guess the difference?

The 'as' syntax was around when I started using Python seriously, but
the comma was still legal right up until 2.7, so I was aware of it. And
yes, the parens here make a drastic difference, and that's bad.

> It isn't the parentheses that cause the binding. It is the "as". So if
> you move a perfectly innocent assignment-expression into a with
> statement, the result will depend on whether or not it came with
> parentheses:
>
> # these do the same thing
> while expression as name:
> while (expression as name):
>
> # so do these
> result = [1, expression as name, name + 2]
> result = [1, (expression as name), name + 2]
>
> # but these are subtly different and will be a trap for the unwary
> with expression as name:  # name is set to __enter__()
> with (expression as name):  # name is not set to __enter__()

And that's a good reason to reject the last one with a SyntaxError, but
that creates an odd discrepancy where something that makes perfect
logical sense is rejected.

> Of course, we could insist that parens are ALWAYS required around
> assignment-expressions, but that will be annoying.

Such a mandate would, IMO, come under the heading of "foolish
consistency". Unless, of course, I'm deliberately trying to get this
proposal rejected, in which case I'd add the parens to make it look
uglier :)

ChrisA

From kirillbalunov at gmail.com  Fri Apr 13 17:33:49 2018
From: kirillbalunov at gmail.com (Kirill Balunov)
Date: Sat, 14 Apr 2018 00:33:49 +0300
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To:
References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info> <20180413145014.GF11616@ando.pearwood.info> <20180413164357.GH11616@ando.pearwood.info>
Message-ID:

2018-04-13 23:31 GMT+03:00 Chris Angelico :
>
> > # but these are subtly different and will be a trap for the unwary
> > with expression as name:  # name is set to __enter__()
> > with (expression as name):  # name is not set to __enter__()
>
> And that's a good reason to reject the last one with a SyntaxError,
> but that creates an odd discrepancy where something that makes perfect
> logical sense is rejected.

Maybe it does not suit you, but what do you think about `SyntaxWarning`
instead of `SyntaxError` for both `with` and `except`?
By analogy with how it was done for `global name` in a function body
prior to Python 3.6?

With kind regards,
-gdg

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rosuav at gmail.com  Fri Apr 13 17:44:35 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 14 Apr 2018 07:44:35 +1000
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To:
References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info> <20180413145014.GF11616@ando.pearwood.info> <20180413164357.GH11616@ando.pearwood.info>
Message-ID:

On Sat, Apr 14, 2018 at 7:33 AM, Kirill Balunov wrote:
>
> 2018-04-13 23:31 GMT+03:00 Chris Angelico :
>>
>> > # but these are subtly different and will be a trap for the unwary
>> > with expression as name:  # name is set to __enter__()
>> > with (expression as name):  # name is not set to __enter__()
>>
>> And that's a good reason to reject the last one with a SyntaxError,
>> but that creates an odd discrepancy where something that makes perfect
>> logical sense is rejected.
>
> Maybe it does not suit you, but what do you think about `SyntaxWarning`
> instead of `SyntaxError` for both `with` and `except`? By analogy with
> how it was done for `global name` in a function body prior to Python 3.6?

Warnings are often not seen. For an error this subtle, a warning
wouldn't be enough. Good call though; that was one of the
considerations, and if we knew for sure that warnings could be seen by
the right people, they could be more useful for these cases.

ChrisA

From george at fischhof.hu  Fri Apr 13 18:11:55 2018
From: george at fischhof.hu (George Fischhof)
Date: Sat, 14 Apr 2018 00:11:55 +0200
Subject: [Python-ideas] Move optional data out of pyc files
In-Reply-To: <20180411000335.GN16661@ando.pearwood.info>
References: <20180411000335.GN16661@ando.pearwood.info>
Message-ID:

2018-04-11 2:03 GMT+02:00 Steven D'Aprano :

[snip]
> I shouldn't think that the number of files on disk is very important,
> now that they're hidden away in the __pycache__ directory where they can
> be ignored by humans. Even venerable old FAT32 has a limit of 65,534
> files in a single folder, and 268,435,437 on the entire volume. So
> unless the std lib expands to 16000+ modules, the number of files in the
> __pycache__ directory ought to be well below that limit.
[snip]

Hi all,

Just some background information for everyone: I was a VMS system
manager more than a decade ago, and I know that Windows NT (at least its
core) was developed by a former VMS engineer. NTFS was created on the
basis of the Files-11 (Files-11B) file system. In both file systems the
directory is a tree (in Files-11 a B-tree; NTFS may use a different kind
of tree, but it is still a tree) that keeps the files ordered
alphabetically. If there are "too many" files, accessing them becomes
slower (check, for example, the windows\system32 folder). This doesn't
matter with a few hundred or 1-2 thousand files, but beyond that it does.
I did a little measurement (I intentionally avoided functions, so as
not to distort the result):

import os
import time

# ignore the error if the directory already exists
try:
    os.mkdir('tmp_thousands_of_files')
except FileExistsError:
    pass

name1 = 10001

start = time.time()

file_name = 'tmp_thousands_of_files/' + str(name1)
f = open(file_name, 'w')
f.write('aaa')
f.close()

stop = time.time()

file_time = stop-start
print(f'one file time {file_time} \n {start} \n {stop}')

for i in range(10002, 20000):
    file_name = 'tmp_thousands_of_files/' + str(i)
    f = open(file_name, 'w')
    f.write('aaa')
    f.close()

name2 = 10000

start = time.time()

file_name = 'tmp_thousands_of_files/' + str(name2)
f = open(file_name, 'w')
f.write('aaa')
f.close()

stop = time.time()

file_time = stop-start
print(f'after 10k, name before {file_time} \n {start} \n {stop}')

name3 = 20010

start = time.time()

file_name = 'tmp_thousands_of_files/' + str(name3)
f = open(file_name, 'w')
f.write('aaa')
f.close()

stop = time.time()

file_time = stop-start
print(f'after 10k, name after {file_time} \n {start} \n {stop}')

"""
result

c:\>python several_files_in_one_folder.py
one file time 0.0
 1523476699.5144918
 1523476699.5144918
after 10k, name before 0.015625953674316406
 1523476714.622918
 1523476714.6385438
after 10k, name after 0.0
 1523476714.6385438
 1523476714.6385438
"""

used: Python 3.6.1, Windows 8.1, SSD drive

As you can see, when there is an insertion at the beginning of the tree
it is much slower than adding to the end.
(Yes, I know list insertion is slow as well, but I saw a VMS directory
with 50k files, and the dir command gave 5-10 files, then waited some
seconds before the next 5-10 files ... ;-) )

BR,
George
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From kenlhilton at gmail.com  Fri Apr 13 23:28:57 2018
From: kenlhilton at gmail.com (Ken Hilton)
Date: Sat, 14 Apr 2018 11:28:57 +0800
Subject: [Python-ideas] Idea: Importing from arbitrary filenames
Message-ID: 

Hi all,

First of all, please excuse me if I'm presenting this idea in the wrong
way or at the wrong time - I'm new to this mailing list and haven't seen
anyone propose a new idea on it yet, so I don't know the customs.

I have an idea for importing files with arbitrary names. Currently, the
"official" way to import arbitrary files is to use the "imp" module, as
shown by this answer: https://stackoverflow.com/a/3137914/6605349
However, this method takes two function calls and is not as
(aesthetically pleasing? is that the word?) as a simple "import"
statement.

Therefore, my idea is to allow the "import" statement to accept one of
three targets.
First, the normal "import":

    import antigravity

which simply imports from sys.path.

Second, importing with a string literal specifying the path to a file:

    import '/home/pi/anti-gravity.py' *as antigravity*

Note the "as antigravity" in this statement - this is to avoid
ambiguities when choosing the global name to bind to. Should "import
'/home/pi/anti-gravity.py'" import to the name "/home/pi/anti-gravity.py",
"anti-gravity.py", "anti-gravity", or "anti_gravity"? None of those are
really ideal. Therefore, when the import target is a string literal, the
statement must include "as NAME".

Third, importing with an expression providing a value castable to a
string, specifying the path to a file:

    def file_in_home(filename):
        return '/home/pi/' + filename
    import *$*file_in_home('anti-gravity.py') *as antigravity*

Once again, for the same reasons, import statements like this must
include "as NAME" to avoid ambiguities.
Notice that the expression is preceded by a dollar sign ($) to indicate
that what follows is an expression rather than a name - imagine a
scenario like this:

    antigravity_file = '/home/pi/anti-gravity.py'
    import antigravity_file as antigravity

Should it look for a sys.path module with the name "antigravity_file" or
should it use the value of the variable "antigravity_file"? Looking for
the sys.path module first before trying a variable's value would waste
processing time and potentially be unexpected behavior. Trying a
variable's value first before looking for a sys.path module would be
even less expected behavior. Therefore, a dollar sign must come before
expression imports to indicate that the import target is an expression.
Side note: the dollar sign was chosen because it mimics other languages'
conventions of preceding variable names with dollar signs, but any
arbitrary character not present at the start of an expression would work.
One more thing about expression imports: if the final returned value of
the expression is not a string, I believe the statement should raise a
TypeError (the same way that __repr__ or __str__ raise TypeError if they
return a non-string). Why? If the statement attempted to cast the return
value to a string, and the return value's __str__ method raised an
error, then should the statement allow the error to pass through, or
should it attempt to use a parent class's __str__ method? Allowing the
error to pass through would almost certainly be unexpected behavior;
attempting to use a parent class's __str__ method would take more time
and more processing power (though it would eventually reach "object"'s
__str__ method and succeed). Therefore, non-string expression values
should raise TypeError.

What are your thoughts?

Regards,
Ken Hilton
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From wes.turner at gmail.com  Sat Apr 14 01:12:58 2018
From: wes.turner at gmail.com (Wes Turner)
Date: Sat, 14 Apr 2018 01:12:58 -0400
Subject: [Python-ideas] Idea: Importing from arbitrary filenames
In-Reply-To: 
References: 
Message-ID: 

I'm fairly certain similar changes have been discussed in the past.
Someone else can probably find / link / rehash the reasons why imports
deliberately use dot notation instead of path?

I can think of a few:

1) Portability. dotted imports looked up from sys.path are
platform-portable

# $HOME/\this\/__init__.py
sys.path.append(
    os.path.expanduser(
        os.path.join('~', 'this',)))

2) Flexibility. importlib Loaders (3.1+) are abstract;
they only know what to do with dotted paths.
They can import from the filesystem, zip files, ...
git repos, https://docs.python.org/3/library/importlib.html https://docs.python.org/3/library/importlib.html#importlib.abc.SourceLoader https://pypi.org/search/?q=importlib https://docs.python.org/3/library/imp.html#imp.find_module https://docs.python.org/3/library/imp.html#imp.load_module sys.path (`python -m site`) can be configured with: - $PYTHONSTARTUP='~/.pythonrc.py' - modules.py and dirs/__init__.py in site-packages/ - .pth files in site-packages/ - idempotent sys.path config at the top of a .py source file - sys.USER_SITE in sys.path - ~/.local/lib/python*X.Y*/site-packages - ~/Library/Python/*X.Y*/lib/python/site-packages - *%APPDATA%*\Python\Python*XY*\site-packages - https://docs.python.org/3/library/site.html - https://docs.python.org/3/using/cmdline.html#envvar-PYTHONSTARTUP - https://docs.python.org/3/using/cmdline.html#envvar-PYTHONUSERBASE - https://docs.python.org/3/library/site.html#site.USER_SITE Is there a good write-up of how, where, and in what order sys.path is configured, by default, in Python? TLDR dotted names are preferable for sharing code with people who don't have the same paths, os.pathsep, or os.platform_info. (Stevedore and Jupyter Notebook take different approaches to handling plugins, if that's your use case?) Though I could just be arguing for the status quo; there are probably good reasons to consider changing EVERYTHING On Friday, April 13, 2018, Ken Hilton wrote: > Hi all, > > First of all, please excuse me if I'm presenting this idea in the wrong > way or at the wrong time - I'm new to this mailing list and haven't seen > anyone propose a new idea on it yet, so I don't know the customs. > > I have an idea for importing files with arbitrary names. Currently, the > "official" way to import arbitrary files is to use the "imp" module, as > shown by this answer: https://stackoverflow.com/a/3137914/6605349 > However, this method takes two function calls and is not as (aesthetically > pleasing? is that the word?) as a simple "import" statement. > > Therefore, my idea is to allow the "import" statement to accept one of > three targets. > First, the normal "import": > > import antigravity > > which simply imports from sys.path. > > Second, importing with a string literal specifying the path to a file: > > import '/home/pi/anti-gravity.py' *as antigravity* > > Note the "as antigravity" in this statement - this is to avoid ambiguities > when choosing the global name to bind to. Should "import > '/home/pi/anti-gravity.py'" import to the name "/home/pi/anti-gravity.py", > "anti-gravity.py", "anti-gravity", or "anti_gravity"? None of those are > really ideal. Therefore, when the import target is a string literal, the > statement must include "as NAME". > > Third, importing with an expression providing a value castable to a > string, specifying the path to a file: > > def file_in_home(filename): > return '/home/pi/' + filename > import *$*file_in_home('anti-gravity.py') *as antigravity* > > Once again, for the same reasons, import statements like this must include > "as NAME" to avoid ambiguities. Notice that the expression is preceded by a > dollar sign ($) to indicate that what follows is an expression rather than > a name - imagine a scenario like this: > > antigravity_file = '/home/pi/anti-gravity.py' > import antigravity_file as antigravity > > Should it look for a sys.path module with the name "antigravity_file" or > should it use the value of the variable "antigravity_file"? 
Looking for the
> sys.path module first before trying a variable's value would waste
> processing time and potentially be unexpected behavior. Trying a variable's
> value first before looking for a sys.path module would be even less
> expected behavior. Therefore, a dollar sign must come before expression
> imports to indicate that the import target is an expression.
> Side note: the dollar sign was chosen because it mimics other languages'
> conventions of preceding variable names with dollar signs, but any
> arbitrary character not present at the start of an expression would work.
> One more thing about expression imports: if the final returned value of
> the expression is not a string, I believe the statement should raise a
> TypeError (the same way that __repr__ or __str__ raise TypeError if they
> return a non-string). Why? If the statement attempted to cast the return
> value to a string, and the return value's __str__ method raised an error,
> then should the statement allow the error to pass through, or should it
> attempt to use a parent class's __str__ method? Allowing the error to pass
> through would almost certainly be unexpected behavior; attempting to use a
> parent class's __str__ method would take more time and more processing
> power (though it would eventually reach "object"'s __str__ method and
> succeed). Therefore, non-string expression values should raise TypeError.
>
> What are your thoughts?
>
> Regards,
> Ken Hilton
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ncoghlan at gmail.com  Sat Apr 14 01:27:28 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 14 Apr 2018 15:27:28 +1000
Subject: [Python-ideas] Idea: Importing from arbitrary filenames
In-Reply-To: 
References: 
Message-ID: 

On 14 April 2018 at 13:28, Ken Hilton  wrote:
> Hi all,
>
> First of all, please excuse me if I'm presenting this idea in the wrong way
> or at the wrong time - I'm new to this mailing list and haven't seen anyone
> propose a new idea on it yet, so I don't know the customs.
>
> I have an idea for importing files with arbitrary names. Currently, the
> "official" way to import arbitrary files is to use the "imp" module, as
> shown by this answer: https://stackoverflow.com/a/3137914/6605349
> However, this method takes two function calls and is not as (aesthetically
> pleasing? is that the word?) as a simple "import" statement.

Modules aren't required to be stored on the filesystem, so we have no
plans to offer this.

`runpy.run_path()` exists to let folks run arbitrary Python files and
collect the resulting namespace, while if folks really want to
implement pseudo-imports based on filenames we expose the necessary
building blocks in importlib
(https://docs.python.org/3/library/importlib.html#importing-a-source-file-directly)

The fact that run_path() has a nice straightforward invocation model,
and the import emulation recipe doesn't is intended as a hint :)

Cheers,
Nick.
-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From Nikolaus at rath.org  Sat Apr 14 11:46:24 2018
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Sat, 14 Apr 2018 08:46:24 -0700
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To:  (Chris Angelico's message of "Sat, 14 Apr 2018 05:33:57 +1000")
References: <20180413110432.GB11616@ando.pearwood.info>
 <20180413131859.GE11616@ando.pearwood.info>
 <5AD0B0A1.9080905@stoneleaf.us>
Message-ID: <877ep9ssdr.fsf@thinkpad.rath.org>

On Apr 14 2018, Chris Angelico  wrote:
> On Fri, Apr 13, 2018 at 11:30 PM, Peter O'Connor
>  wrote:
>> Well this may be crazy sounding, but we could allow left or right assignment
>> with
>>
>>     name := expr
>>     expr =: name
>>
>> Although it would seem to violate the "only one obvious way" maxim, at least
>> it avoids this overloaded meaning with the "as" of "except" and "with"
>>
>
> Hah. It took me multiple readings to even notice the change in the
> operator there, so I think this would cause a lot of confusion.

Well, if putting the expression first is generally considered better,
the reasonable thing to do would be to allow *only* =:.

Best,
-Nikolaus

-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

"Time flies like an arrow, fruit flies like a Banana."

From gadgetsteve at live.co.uk  Sat Apr 14 05:22:29 2018
From: gadgetsteve at live.co.uk (Steve Barnes)
Date: Sat, 14 Apr 2018 09:22:29 +0000
Subject: [Python-ideas] Idea: Importing from arbitrary filenames
In-Reply-To: 
References: 
Message-ID: 

On 14/04/2018 06:27, Nick Coghlan wrote:
> On 14 April 2018 at 13:28, Ken Hilton  wrote:
>> Hi all,
>>
>> First of all, please excuse me if I'm presenting this idea in the wrong way
>> or at the wrong time - I'm new to this mailing list and haven't seen anyone
>> propose a new idea on it yet, so I don't know the customs.
>>
>> I have an idea for importing files with arbitrary names. Currently, the
>> "official" way to import arbitrary files is to use the "imp" module, as
>> shown by this answer: https://stackoverflow.com/a/3137914/6605349
>> However, this method takes two function calls and is not as (aesthetically
>> pleasing? is that the word?) as a simple "import" statement.
>
> Modules aren't required to be stored on the filesystem, so we have no
> plans to offer this.
>
> `runpy.run_path()` exists to let folks run arbitrary Python files and
> collect the resulting namespace, while if folks really want to
> implement pseudo-imports based on filenames we expose the necessary
> building blocks in importlib
> (https://docs.python.org/3/library/importlib.html#importing-a-source-file-directly)
>
> The fact that run_path() has a nice straightforward invocation model,
> and the import emulation recipe doesn't is intended as a hint :)
>
> Cheers,
> Nick.
>

I generally love the current import system for "just working" regardless
of platform, installation details, etc., but what I would like to see is
a clear 'import local' (as opposed to the 'import from wherever you can
find something to satisfy' mechanism). This is the one thing that I miss
from C/C++, where #include <x> is for system includes and #include "x"
searches differing include paths, (if used well).

-- 
Steve (Gadget) Barnes
Any opinions in this message are my personal opinions and do not reflect
those of my employer.
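For reference, a minimal sketch of the two approaches Nick points to
above for loading a Python source file by an explicit path -
`runpy.run_path()` and the importlib recipe from the linked docs. The
helper name `import_from_path` and the file path (borrowed from Ken's
example) are illustrative only, not an existing API:

    import importlib.util
    import runpy

    # Option 1: execute the file and collect its top-level names as a dict.
    namespace = runpy.run_path('/home/pi/anti-gravity.py')

    # Option 2: the documented importlib recipe, which yields a real
    # module object (this sketch does not register it in sys.modules).
    def import_from_path(name, path):
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        return module

    antigravity = import_from_path('antigravity', '/home/pi/anti-gravity.py')

Note that option 1 gives you a plain dictionary, while option 2 gives an
ordinary module object; neither requires any new syntax.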
From tjreedy at udel.edu  Sat Apr 14 15:26:52 2018
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 14 Apr 2018 15:26:52 -0400
Subject: [Python-ideas] Idea: Importing from arbitrary filenames
In-Reply-To: 
References: 
Message-ID: 

On 4/13/2018 11:28 PM, Ken Hilton wrote:
> Hi all,
>
> First of all, please excuse me if I'm presenting this idea in the wrong
> way or at the wrong time - I'm new to this mailing list and haven't seen
> anyone propose a new idea on it yet, so I don't know the customs.
>
> I have an idea for importing files with arbitrary names. Currently, the
> "official"

Alex Martelli intentionally put that in quotes.

> way to import arbitrary files is to use the "imp" module, as
> shown by this answer: https://stackoverflow.com/a/3137914/6605349

Read the first comment -- the above is deprecated. There was always the
__import__(name) function, but importlib.import_module is recommended now.

> However, this method takes two function calls and is not as
> (aesthetically pleasing? is that the word?) as a simple "import" statement.

Only one is needed for most purposes. importlib has separate find and
load functions, which are used by 'import', and which are available to
those who need them.

> Second, importing with a string literal specifying the path to a file:
>
>     import '/home/pi/anti-gravity.py' *as antigravity*

antigravity = import_module('/home/pi/anti-gravity.py')

-- 
Terry Jan Reedy

From danilo.bellini at gmail.com  Sat Apr 14 15:57:27 2018
From: danilo.bellini at gmail.com (Danilo J. S. Bellini)
Date: Sat, 14 Apr 2018 16:57:27 -0300
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To: 
References: 
Message-ID: 

On 5 April 2018 at 13:52, Peter O'Connor  wrote:

> I was thinking it would be nice to be able to encapsulate this common type
> of operation into a more compact comprehension.
>
> I propose a new "Reduce-Map" comprehension that allows us to write:
>
> signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
> smooth_signal = [average = (1-decay)*average + decay*x for x in signal from average=0.]
>
> Instead of:
>
> def exponential_moving_average(signal: Iterable[float], decay: float, initial_value: float=0.):
>     average = initial_value
>     for xt in signal:
>         average = (1-decay)*average + decay*xt
>         yield average
>
> signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
> smooth_signal = list(exponential_moving_average(signal, decay=0.05))
>

I wrote to this mailing list the very same proposal some time ago. I was
trying to let the scan higher order function (itertools.accumulate with a
lambda, or what was done in the example above) fit into a simpler list
comprehension.

As a result, I wrote this project, which adds the "scan" feature to Python
comprehensions using a decorator that performs bytecode manipulation (and
it had to fit in with valid Python syntax):
https://github.com/danilobellini/pyscanprev

On that GitHub page I've written several examples and a rationale on why
this would be useful.

-- 
Danilo J. S. Bellini
---------------
"*It is not our business to set up prohibitions, but to arrive at
conventions.*" (R. Carnap)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From nas-python-ideas at arctrix.com  Sat Apr 14 19:54:05 2018
From: nas-python-ideas at arctrix.com (Neil Schemenauer)
Date: Sat, 14 Apr 2018 17:54:05 -0600
Subject: [Python-ideas] Move optional data out of pyc files
In-Reply-To: <15602430-b133-239a-1aa4-b3bc9973f44a@egenix.com>
References: <15602430-b133-239a-1aa4-b3bc9973f44a@egenix.com>
Message-ID: <20180414235405.wft73xwtbwwcwwme@python.ca>

On 2018-04-12, M.-A. Lemburg wrote:
> This leaves the proposal to restructure pyc files into a sectioned
> file and possibly indexed file to make access to (lazily) loaded
> parts faster.

I would like to see a format that can hold one or more modules in a
single file. Something like the zip format but optimized for fast
interpreter startup time. It should support lazy loading of module
parts (e.g. maybe my lazy bytecode execution idea[1]). Obviously a lot
of details to work out.

The design should also take into account the widespread use of virtual
environments. So, it should be easy and space efficient to build
virtual environments using this format (e.g. maybe allow overlays so
that the stdlib package is not copied into the virtual environment;
virtual packages would be overlaid on the stdlib file). Also, it should
be easy to bundle all modules into an "uber" package and append it to
the Python executable. CPython should provide out-of-box support for
single-file executables.

1. https://github.com/python/cpython/pull/6194

From ncoghlan at gmail.com  Sat Apr 14 23:08:20 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 15 Apr 2018 13:08:20 +1000
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: <20180413131859.GE11616@ando.pearwood.info>
References: <20180413110432.GB11616@ando.pearwood.info>
 <20180413131859.GE11616@ando.pearwood.info>
Message-ID: 

On 13 April 2018 at 23:18, Steven D'Aprano  wrote:
> On Fri, Apr 13, 2018 at 09:56:35PM +1000, Chris Angelico wrote:
>
>> How many times have people asked for "with (expr as name):" to
>> be supported, allowing the statement to spread over multiple lines?
>> With this syntax, it would suddenly be permitted - with dangerously
>> similar semantics.
>
> I see your point, but why don't we support "with (expr as name):" to
> allow multiple lines? No, don't answer that... it's off-topic. Forget I
> asked.

It's not completely off topic, as it's due to the fact we use "," to
separate both context managers and items in a tuple, so "with (cm1,
cm2, cm3):" is currently legal syntax that means something quite
different from "with cm1, cm2, cm3:". While using the parenthesised
form is *pointless* (since it will blow up at runtime due to tuples
not being context managers), the fact it's syntactically valid makes
us more hesitant to add the special case around parentheses handling
than we were for import statements. The relevance to PEP 572 is as a
reminder that since we *really* don't like to add yet more different
cases to "What do parentheses indicate in Python?", we should probably
show similar hesitation when it comes to giving ":" yet another
meaning.

This morning, I remembered a syntax idea I had a while back in
relation to lambda expressions that could perhaps be better applied to
assignment expressions, so I'll quickly recap the current options, and
then go on to discussing that.

Since the threads are pretty sprawling, I've also included a
postscript going into more detail on my current view of the pros and
cons of the various syntax proposals presented so far.
Expression first proposals:

    while (read_next_item() as value) is not None:
        ...

    while (read_next_item() -> value) is not None:
        ...

Target first proposal (current PEP):

    while (value := read_next_item()) is not None:
        ...

New keyword based target first proposal:

    while (value from read_next_item()) is not None:
        ...

The new one is a fairly arbitrary repurposing of the import system's
"from" keyword, but it avoids all the ambiguities of "as", is easier
to visually distinguish from other existing expression level keywords
than "as", avoids giving ":" yet another meaning, still lets us use a
keyword instead of a symbol, and gives the new expression type a more
clearly self-evident name ("from expressions", as opposed to the "from
statements" used for imports).

It also more easily lends itself to skipping over the details of the
defining expression when reading code aloud or in your head (e.g.
"while value is not None, where value comes from read_next_item()"
would be a legitimate way of reading the above while loop header, and
you could drop the trailing clause completely when the details aren't
particularly relevant).

Avoiding the use of a colon as part of the syntax also means that if
we wanted to, we could potentially allow optional type annotations in
from-expressions ("target: annotation from expression"), and even
adopt them as a shorthand for the sentinel pattern in function
declarations (more on that below).

As far as the connection with "from module import name" goes, given
the proposed PEP 572 semantics, these three statements would all be
equivalent:

    from dotted.module import name
    name = __import__("dotted.module", fromlist=["name"]).name
    name from __import__("dotted.module", fromlist=["name"]).name

Other examples from the PEP:

    # Handle a matched regex
    if (match from pattern.search(data)) is not None:
        ...

    # Share a subexpression between a comprehension filter clause and its output
    filtered_data = [y for x in data if (y from f(x)) is not None]

    # Nested assignments
    assert 0 == (x from (y from (z from 0)))

    # Re-using fields in a container display
    stuff = [[y from f(x), x/y] for x in range(5)]

And the set/dict examples display where ":=" could be visually confusing:

    # Set display with local name bindings
    data = {
        value_a from 1,
        value_b from 2,
        value_c from 3,
    }

    # Dict display with local key & value name bindings
    data = {
        key_a from 'a': value_a from 1,
        key_b from 'b': value_b from 2,
        key_c from 'c': value_c from 3,
    }

Potential extension to simplifying the optional
non-shared-mutable-default-argument pattern:

    # Shared mutable default (stored directly in f.__defaults__)
    def f(shared = []):
        ....

    # Unshared mutable default (implementation details TBD)
    def f(unshared from []):
        ....

That last part would only be a potential extension beyond the scope of
PEP 572 (since it would go against the grain of "name = expression"
and "name from expression" otherwise being functionally equivalent in
their behaviour), but it's an opportunity that wouldn't arise if a
colon is part of the expression level name binding syntax.

Cheers,
Nick.

P.S. The pros and cons of the current syntax proposals, as I see them:

=== Expression first, 'as' keyword ===

    while (read_next_item() as value) is not None:
        ...
Pros: * typically reads nicely as pseudocode * "as" is already associated with namebinding operations Cons: * syntactic ambiguity in with statement headers (major concern) * encourages a common misunderstanding of how with statements work (major concern) * visual similarity between "as" and "and" makes name bindings easy to miss * syntactic ambiguity in except clause headers theoretically exists, but is less of a concern due to the consistent type difference that makes the parenthesised form pointless === Expression first, '->' symbol === while (read_next_item() -> value) is not None: ... Pros: * avoids the syntactic ambiguity of "as" * "->" is used for name bindings in at least some other languages (but this is irrelevant to users for whom Python is their first, and perhaps only, programming language) Cons: * doesn't read like pseudocode (you need to interpret an arbitrary non-arithmetic symbol) * invites the question "Why doesn't this use the 'as' keyword?" * symbols are typically harder to look up than keywords * symbols don't lend themselves to easy mnemonics * somewhat arbitrary repurposing of "->" compared to its use in function annotations === Target first, ':=' symbol === while (value := read_next_item()) is not None: ... Pros: * avoids the syntactic ambiguity of "as" * being target first provides an obvious distinction from the "as" keyword * ":=" is used for name bindings in at least some other languages (but this is irrelevant to users for whom Python is their first, and perhaps only, language) Cons: * symbols are typically harder to look up than keywords * symbols don't lend themselves to easy mnemonics * subject to a visual "line noise" phenomenon when combined with other uses of ":" as a syntactic marker (e.g. slices, dict key/value pairs, lambda expressions, type annotations) === Target first, 'from' keyword === while (value from read_next_item()) is not None: # New ... Pros: * avoids the syntactic ambiguity of "as" * being target first provides an obvious distinction from the "as" keyword * typically reads nicely as pseudocode * "from" is already associated with a namebinding operation ("from module import name") Cons: * I'm sure we'll think of some more, but all I have so far is that the association with name binding is relatively weak and would need to be learned -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rosuav at gmail.com Sat Apr 14 23:54:49 2018 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 15 Apr 2018 13:54:49 +1000 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) In-Reply-To: References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info> Message-ID: On Sun, Apr 15, 2018 at 1:08 PM, Nick Coghlan wrote: > === Target first, 'from' keyword === > > while (value from read_next_item()) is not None: # New > ... > > Pros: > > * avoids the syntactic ambiguity of "as" > * being target first provides an obvious distinction from the "as" keyword > * typically reads nicely as pseudocode > * "from" is already associated with a namebinding operation ("from > module import name") > > Cons: > > * I'm sure we'll think of some more, but all I have so far is that > the association with name binding is relatively weak and would need to > be learned > Cons: Syntactic ambiguity with "raise exc from otherexc", probably not serious. 
ChrisA From tim.peters at gmail.com Sun Apr 15 01:05:58 2018 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 15 Apr 2018 00:05:58 -0500 Subject: [Python-ideas] A cute Python implementation of itertools.tee Message-ID: Just for fun - no complaint, no suggestion, just sharing a bit of code that tickled me. The docs for `itertools.tee()` contain a Python work-alike, which is easy to follow. It gives each derived generator its own deque, and when a new value is obtained from the original iterator it pushes that value onto each of those deques. Of course it's possible for them to share a single deque, but the code gets more complicated. Is it possible to make it simpler instead? What it "really" needs is a shared singly-linked list of values, pointing from oldest value to newest. Then each derived generator can just follow the links, and yield its next result in time independent of the number of derived generators. But how do you know when a new value needs to be obtained from the original iterator, and how do you know when space for an older value can be recycled (because all of the derived generators have yielded it)? I ended up with almost a full page of code to do that, storing with each value and link a count of the number of derived generators that had yet to yield the value, effectively coding my own reference-count scheme by hand, along with "head" and "tail" pointers to the ends of the linked list that proved remarkably tricky to keep correct in all cases. Then I thought "this is stupid! Python already does reference counting." Voila! Vast swaths of tedious code vanished, giving this remarkably simple implementation: def mytee(xs, n): last = [None, None] def gen(it, mylast): nonlocal last while True: mylast = mylast[1] if not mylast: mylast = last[1] = last = [next(it), None] yield mylast[0] it = iter(xs) return tuple(gen(it, last) for _ in range(n)) There's no need to keep a pointer to the start of the shared list at all - we only need a pointer to the end of the list ("last"), and each derived generator only needs a pointer to its own current position in the list ("mylast"). What I find kind of hilarious is that it's no help at all as a prototype for a C implementation: Python recycles stale `[next(it), None]` pairs all by itself, when their internal refcounts fall to 0. That's the hardest part. BTW, I certainly don't suggest adding this to the itertools docs either. While it's short and elegant, it's too subtle to grasp easily - if you think "it's obvious", you haven't yet thought hard enough about the problem ;-) From ethan at stoneleaf.us Sun Apr 15 01:52:52 2018 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 14 Apr 2018 22:52:52 -0700 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) In-Reply-To: References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info> Message-ID: <5AD2E8B4.7020108@stoneleaf.us> On 04/14/2018 08:08 PM, Nick Coghlan wrote: > New keyword based target first proposal: > > while (value from read_next_item()) is not None: > ... I could get behind this. 
Current preferences:

  "as"    +1
  "from"  +0.85
  ":="    +0.5

--
~Ethan~

From neatnate at gmail.com  Sun Apr 15 02:01:28 2018
From: neatnate at gmail.com (Nathan Schneider)
Date: Sun, 15 Apr 2018 02:01:28 -0400
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: 
References: <20180413110432.GB11616@ando.pearwood.info>
 <20180413131859.GE11616@ando.pearwood.info>
Message-ID: 

On Sat, Apr 14, 2018 at 11:54 PM, Chris Angelico  wrote:

> On Sun, Apr 15, 2018 at 1:08 PM, Nick Coghlan  wrote:
> > === Target first, 'from' keyword ===
> >
> >     while (value from read_next_item()) is not None:  # New
> >         ...
> >
> > Pros:
> >
> >   * avoids the syntactic ambiguity of "as"
> >   * being target first provides an obvious distinction from the "as"
> keyword
> >   * typically reads nicely as pseudocode
> >   * "from" is already associated with a namebinding operation ("from
> > module import name")
> >
> > Cons:
> >
> >   * I'm sure we'll think of some more, but all I have so far is that
> > the association with name binding is relatively weak and would need to
> > be learned
> >
>
> Cons: Syntactic ambiguity with "raise exc from otherexc", probably not
> serious.
>

To me, "from" strongly suggests that an element is being obtained from a
container/collection of elements. This is how I conceptualize "from module
import name": "name" refers to an object INSIDE the module, not the module
itself.

If I saw

    if (match from pattern.search(data)) is not None:
        ...

I would guess that it is equivalent to

    m = next(pattern.search(data))
    if m is not None:
        ...

i.e. that the target is bound to the next item from an iterable (the
"container").

Cheers,
Nathan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ncoghlan at gmail.com  Sun Apr 15 03:12:35 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 15 Apr 2018 17:12:35 +1000
Subject: [Python-ideas] Idea: Importing from arbitrary filenames
In-Reply-To: 
References: 
Message-ID: 

On 14 April 2018 at 19:22, Steve Barnes  wrote:
> I generally love the current import system for "just working" regardless
> of platform, installation details, etc., but what I would like to see is
> a clear 'import local' (as opposed to the 'import from wherever you can
> find something to satisfy' mechanism). This is the one thing that I miss
> from C/C++, where #include <x> is for system includes and #include "x"
> searches differing include paths, (if used well).

For the latter purpose, we prefer that folks use either explicit relative
imports (if they want to search the current package specifically), or
else direct manipulation of package.__path__.

That is, if you do:

    from . import custom_imports  # Definitely from your own project
    custom_imports.__path__[:] = (some_directory, some_other_directory)

then:

    from .custom_imports import name

will search those directories for packages & modules to import, while
still cleanly mapping to a well-defined location in the module namespace
for the process as a whole (and hence being able to use all the same
caches as other imports, without causing name conflicts or other
problems).
If you want to do this dynamically relative to the current module,
then it's possible to do:

    global __path__
    __path__[:] = (some_directory, some_other_directory)
    custom_mod = importlib.import_module(".name", package=__name__)

The discoverability of these kinds of techniques could definitely stand
to be improved, but the benefit of adopting them is that they work on
all currently supported versions of Python (even importlib.import_module
exists in Python 2.7 as a convenience wrapper around __import__), rather
than needing to wait for new language level syntax for them.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sun Apr 15 03:20:31 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 15 Apr 2018 17:20:31 +1000
Subject: [Python-ideas] Idea: Importing from arbitrary filenames
In-Reply-To: 
References: 
Message-ID: 

On 15 April 2018 at 17:12, Nick Coghlan  wrote:
> If you want to do this dynamically relative to the current module,
> then it's possible to do:
>
>     global __path__
>     __path__[:] = (some_directory, some_other_directory)
>     custom_mod = importlib.import_module(".name", package=__name__)

Copy and paste error there: to make this work in non-package modules,
drop the "[:]" from the __path__ assignment.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sun Apr 15 03:35:06 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 15 Apr 2018 17:35:06 +1000
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: 
References: <20180413110432.GB11616@ando.pearwood.info>
 <20180413131859.GE11616@ando.pearwood.info>
Message-ID: 

On 15 April 2018 at 13:54, Chris Angelico  wrote:
> On Sun, Apr 15, 2018 at 1:08 PM, Nick Coghlan  wrote:
>> === Target first, 'from' keyword ===
>>
>>     while (value from read_next_item()) is not None:  # New
>>         ...
>>
>> Pros:
>>
>>   * avoids the syntactic ambiguity of "as"
>>   * being target first provides an obvious distinction from the "as" keyword
>>   * typically reads nicely as pseudocode
>>   * "from" is already associated with a namebinding operation ("from
>> module import name")
>>
>> Cons:
>>
>>   * I'm sure we'll think of some more, but all I have so far is that
>> the association with name binding is relatively weak and would need to
>> be learned
>>
>
> Cons: Syntactic ambiguity with "raise exc from otherexc", probably not serious.

Ah, I forgot about that usage. The keyword usage is at least somewhat
consistent, in that it's short for:

    _exc = exc
    _exc.__cause__ from otherexc
    raise _exc

However, someone writing "raise (ExcType from otherexc)" could be
confusing, since it would end up re-raising "otherexc" instead of
wrapping it in a new ExcType instance. If "otherexc" was also an
ExcType instance, that would be a *really* subtle bug to try and
catch, so this would likely need the same kind of special casing as
was proposed for "as" (i.e. prohibiting the top level parentheses).

I also agree with Nathan that if you hadn't encountered from
expressions before, it would be reasonable to assume they were
semantically comparable to "target = next(expr)" rather than just
"target = expr".

Cheers,
Nick.
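For readers less familiar with the construct being referenced: a rough
sketch of what "raise exc from otherexc" already does in current Python
(simplified - assigning to __cause__ also implicitly sets
__suppress_context__, which this sketch relies on):

    try:
        {}['missing']
    except KeyError as otherexc:
        exc = ValueError('bad value')
        exc.__cause__ = otherexc   # what the "from otherexc" clause arranges
        raise exc

The resulting traceback reports the KeyError as the direct cause of the
ValueError, which is the existing behaviour any new "from" expression
syntax would have to avoid colliding with.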
-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ashrub at yandex.ru  Sun Apr 15 04:57:00 2018
From: ashrub at yandex.ru (Alexey Shrub)
Date: Sun, 15 Apr 2018 11:57:00 +0300
Subject: [Python-ideas] Rewriting file - pythonic way
Message-ID: <1523782620.2055.0@smtp.yandex.ru>

Hi all,

I am new to Python (I am moving from the Perl world), but I have always
loved Python for its high-level, beautiful and clean syntax.
Now I have a question/idea about working with files.
In my opinion it is a very popular use case:
1. Open a file (for read and write)
2. Read data from the file
3. Modify the data.
4. Rewrite the file with the modified data.

But now it looks not so pythonic:

with open(filename, 'r+') as file:
    data = file.read()
    data = data.replace('old', 'new')
    file.seek(0)
    file.write(data)
    file.truncate()

or something like this:

with open(filename) as file:
    data = file.read()
data = data.replace('old', 'new')
with open(filename, 'w') as file:
    file.write(data)

I think the best way would be something like this:

with open(filename, 'r+') as file:
    data = file.read()
    data = data.replace('old', 'new')
    file.rewrite(data)

but for this io.BufferedIOBase must contain a rewrite method.

What do you think about this?

From kirillbalunov at gmail.com  Sun Apr 15 05:19:46 2018
From: kirillbalunov at gmail.com (Kirill Balunov)
Date: Sun, 15 Apr 2018 12:19:46 +0300
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: 
References: <20180413110432.GB11616@ando.pearwood.info>
 <20180413131859.GE11616@ando.pearwood.info>
Message-ID: 

2018-04-15 6:08 GMT+03:00 Nick Coghlan :

>
> It's not completely off topic, as it's due to the fact we use "," to
> separate both context managers and items in a tuple, so "with (cm1,
> cm2, cm3):" is currently legal syntax that means something quite
> different from "with cm1, cm2, cm3:". While using the parenthesised
> form is *pointless* (since it will blow up at runtime due to tuples
> not being context managers), the fact it's syntactically valid makes
> us more hesitant to add the special case around parentheses handling
> than we were for import statements. The relevance to PEP 572 is as a
> reminder that since we *really* don't like to add yet more different
> cases to "What do parentheses indicate in Python?"

Although "with (cm1, cm2, cm3):" is currently legal syntax, as you said -
and as was also noted before in this thread - it is "pointless" in 99% of
cases (in the context of a with statement) and will fail at runtime.
Therefore, regardless of this PEP, maybe it is fair to make it at least a
`SyntaxWarning` (or `SyntaxError`)? Better to fail sooner than later.

we should probably
> show similar hesitation when it comes to giving ":" yet another
> meaning.
>
>
Yes, `:` is used (as a symbol) in a lot of places (in fact there is not
much in it), but in some places Python looks like a combination of words
and colons.

> P.S. The pros and cons of the current syntax proposals, as I see them:
>
> === Expression first, 'as' keyword ===
>
>     while (read_next_item() as value) is not None:
>         ...
>
> Pros:
>
>   * typically reads nicely as pseudocode
>   * "as" is already associated with namebinding operations
>

I understand that this list is subjective. But for me it will be a huge
PRO that the expression comes first.
> Cons:
>
>   * syntactic ambiguity in with statement headers (major concern)
>   * encourages a common misunderstanding of how with statements work
> (major concern)
>   * visual similarity between "as" and "and" makes name bindings easy to
> miss
>   * syntactic ambiguity in except clause headers theoretically exists,
> but is less of a concern due to the consistent type difference that
> makes the parenthesised form pointless
>

In reality, the first two points can be explained (if that is required at
all). Misunderstanding is a consequence of a lack of experience. I don't
understand the point about "visual similarity between "as" and "and"" -
can you elaborate on this point a little bit more?

> === Expression first, '->' symbol ===
>
>     while (read_next_item() -> value) is not None:
>         ...
>
> Pros:
>
>   * avoids the syntactic ambiguity of "as"
>   * "->" is used for name bindings in at least some other languages
> (but this is irrelevant to users for whom Python is their first, and
> perhaps only, programming language)
>

The same as before: that the expression comes first is a huge PRO for me,
and I'm sure for many others too. With the second point I agree that it is
somewhat irrelevant.

> Cons:
>
>   * doesn't read like pseudocode (you need to interpret an arbitrary
> non-arithmetic symbol)
>

Here I somewhat disagree with you. The most common form of assignment in
formal pseudo-code is `name <- expr`. The second most common form, to my
regret, is `:=`. The `<-` form is not possible in Python, and that is why
`expr -> name` was suggested.

>   * invites the question "Why doesn't this use the 'as' keyword?"
>

All forms invite this question :)))

>   * symbols are typically harder to look up than keywords
>   * symbols don't lend themselves to easy mnemonics
>   * somewhat arbitrary repurposing of "->" compared to its use in
> function annotations
>

The last one is a major concern. I think that is why Guido is so skeptical
about this form.

> === Target first, ':=' symbol ===
>
>     while (value := read_next_item()) is not None:
>         ...
>
> Pros:
>
>   * avoids the syntactic ambiguity of "as"
>   * being target first provides an obvious distinction from the "as"
> keyword
>

For me it is a CON. Originally the rationale of this PEP was to reduce the
number of unnecessary calculations and to provide a useful syntax to make
a name binding in appropriate places. It should not, in any way, replace
the existing `=` usual way to make a name binding. Therefore, as I see it,
it is one of the design goals to make the syntax forms of the `assignment
statement` and the `assignment expression` distinct, and `:=` does not
help with this. This does not mean that this new syntax form should not be
convenient, but it should be different from the usual `=` form.

>   * ":=" is used for name bindings in at least some other languages
> (but this is irrelevant to users for whom Python is their first, and
> perhaps only, language)
>
> Cons:
>
>   * symbols are typically harder to look up than keywords
>   * symbols don't lend themselves to easy mnemonics
>   * subject to a visual "line noise" phenomenon when combined with
> other uses of ":" as a syntactic marker (e.g. slices, dict key/value
> pairs, lambda expressions, type annotations)
>

Totally agree with the last point!

>
>
> === Target first, 'from' keyword ===
>
>     while (value from read_next_item()) is not None:  # New
>         ...
>
> Pros:
>
>   * avoids the syntactic ambiguity of "as"
>   * being target first provides an obvious distinction from the "as"
> keyword
>

As above.
>   * typically reads nicely as pseudocode
>

As for me, this form implies _extraction_ + binding (finding inside +
binding) instead of just binding.

>   * "from" is already associated with a namebinding operation ("from
> module import name")
>

but a module is a namespace, and `from` means _extract_ and bind.

>
> Cons:
>
>   * I'm sure we'll think of some more, but all I have so far is that
> the association with name binding is relatively weak and would need to
> be learned
>
>
With kind regards,
-gdg
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From storchaka at gmail.com  Sun Apr 15 05:40:57 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sun, 15 Apr 2018 12:40:57 +0300
Subject: [Python-ideas] Rewriting file - pythonic way
In-Reply-To: <1523782620.2055.0@smtp.yandex.ru>
References: <1523782620.2055.0@smtp.yandex.ru>
Message-ID: 

15.04.18 11:57, Alexey Shrub wrote:
> I am new to Python (I am moving from the Perl world), but I have always
> loved Python for its high-level, beautiful and clean syntax.
> Now I have a question/idea about working with files.
> In my opinion it is a very popular use case:
> 1. Open a file (for read and write)
> 2. Read data from the file
> 3. Modify the data.
> 4. Rewrite the file with the modified data.
>
> But now it looks not so pythonic:
>
> with open(filename, 'r+') as file:
>     data = file.read()
>     data = data.replace('old', 'new')
>     file.seek(0)
>     file.write(data)
>     file.truncate()

What do you mean by calling this not pythonic?

> I think the best way would be something like this:
>
> with open(filename, 'r+') as file:
>     data = file.read()
>     data = data.replace('old', 'new')
>     file.rewrite(data)
>
> but for this io.BufferedIOBase must contain a rewrite method

If the problem is that you want to use a single line instead of three
lines, you can add a function:

def file_rewrite(file, data):
    file.seek(0)
    file.write(data)
    file.truncate()

and use it. This looks pretty pythonic to me.

From mikhailwas at gmail.com  Sun Apr 15 05:41:44 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Sun, 15 Apr 2018 12:41:44 +0300
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: 
References: <20180413110432.GB11616@ando.pearwood.info>
 <20180413131859.GE11616@ando.pearwood.info>
Message-ID: 

On Sun, Apr 15, 2018 at 12:19 PM, Kirill Balunov  wrote:
>
>
> 2018-04-15 6:08 GMT+03:00 Nick Coghlan :
>>
>> >
>>
>> P.S. The pros and cons of the current syntax proposals, as I see them:
>>
>> === Expression first, 'as' keyword ===
>>
>>     while (read_next_item() as value) is not None:
>>         ...
>>
>> Pros:
>>
>>   * typically reads nicely as pseudocode
>>   * "as" is already associated with namebinding operations
>>
>
> I understand that this list is subjective. But for me it will be a huge PRO
> that the expression comes first.
>
[...]
>>
>> === Expression first, '->' symbol ===
>>
>>     while (read_next_item() -> value) is not None:
>>         ...
>>
>> Pros:
>>
>>   * avoids the syntactic ambiguity of "as"
>>   * "->" is used for name bindings in at least some other languages
>> (but this is irrelevant to users for whom Python is their first, and
>> perhaps only, programming language)
[...]
>
>>
>>   * invites the question "Why doesn't this use the 'as' keyword?"
>
>
> All forms invite this question :)))

Exactly, all forms invite this and other questions.
First of all, coming back to the original spelling choice arguments
[sorry in advance if I've missed some points in this huge thread] -
a citation from the PEP: "Differences from regular assignment
statements" [...]
"Otherwise, the semantics of assignment are unchanged by this proposal." So basically it's the same Python assignment? Then obvious solution seems just to propose "=". But I see Chris have put this in FAQ section: "The syntactic similarity between ``if (x == y)`` and ``if (x = y)`` ...." So IIUC, the *only* reason is to avoid '==' ad '=' similarity? If so, then it does not sound convincing at all. Of course Python does me a favor showing an error, when I make a typo like this: if (x = y) But still, if this is the only real reason, it is not convincing. Syntactically seen, I feel strong that normal '=' would be the way to go. Just look at this: y = ((eggs := spam()), (cheese := eggs.method()) y = ((eggs = spam()), (cheese = eggs.method()) The latter is so much cleaner, and already so common to any old or new Python user. And does not raise a question what this ":=" should really mean. (Or probably it should raise such question?) Given the fact that the PEP gives quite edge-case usage examples only, this should be really more convincing. And as a side note: I personally find the look of ":=" a bit 'noisy'. Another point: *Target first vs Expression first* ======================= Well, this is nice indeed. Don't you find that first of all it must be decided what should be the *overall tendency for Python*? Now we have common "x = a + b" everywhere. Then there are comprehensions (somewhat mixed direction) and "foo as bar" things. But wait, is the tendency to "give the freedom"? Then you should introduce something like "<--" in the first place so that we can write normal assignment in both directions. Or is the tendency to convert Python to the "expression first" generally? So if this question can be answered first, then I think it will be more constructive to discuss the choice of particular spellings. Mikhail From ashrub at yandex.ru Sun Apr 15 05:49:54 2018 From: ashrub at yandex.ru (Alexey Shrub) Date: Sun, 15 Apr 2018 12:49:54 +0300 Subject: [Python-ideas] Rewriting file - pythonic way In-Reply-To: References: <1523782620.2055.0@smtp.yandex.ru> Message-ID: <1523785794.2055.2@smtp.yandex.ru> ? ???????????, 15 ???. 2018 ? 12:40 , Serhiy Storchaka ???????: > If the problem is that you want to use a single line instead of three > line, you can add a function Yes, I think that single line with word 'rewrite' is much more readable than those three lines. And yes, I can make my own function, but it is typical task - maybe it must be in standard library? From solipsis at pitrou.net Sun Apr 15 05:55:46 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 15 Apr 2018 11:55:46 +0200 Subject: [Python-ideas] A cute Python implementation of itertools.tee References: Message-ID: <20180415115546.33d66718@fsol> On Sun, 15 Apr 2018 00:05:58 -0500 Tim Peters wrote: > Just for fun - no complaint, no suggestion, just sharing a bit of code > that tickled me. > > The docs for `itertools.tee()` contain a Python work-alike, which is > easy to follow. It gives each derived generator its own deque, and > when a new value is obtained from the original iterator it pushes that > value onto each of those deques. > > Of course it's possible for them to share a single deque, but the code > gets more complicated. Is it possible to make it simpler instead? > > What it "really" needs is a shared singly-linked list of values, > pointing from oldest value to newest. Then each derived generator can > just follow the links, and yield its next result in time independent > of the number of derived generators. 
But how do you know when a new
> value needs to be obtained from the original iterator, and how do you
> know when space for an older value can be recycled (because all of the
> derived generators have yielded it)?
>
> I ended up with almost a full page of code to do that, storing with
> each value and link a count of the number of derived generators that
> had yet to yield the value, effectively coding my own reference-count
> scheme by hand, along with "head" and "tail" pointers to the ends of
> the linked list that proved remarkably tricky to keep correct in all
> cases.
>
> Then I thought "this is stupid!  Python already does reference
> counting."  Voila!  Vast swaths of tedious code vanished, giving this
> remarkably simple implementation:

This implementation doesn't work with Python 3.7 or 3.8.
I've tried it here:
https://gist.github.com/pitrou/b3991f638300edb6d06b5be23a4c66d6

and get:

    Traceback (most recent call last):
      File "mytee.py", line 14, in gen
        mylast = last[1] = last = [next(it), None]
    StopIteration

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "mytee.py", line 47, in <module>
        run(mytee1)
      File "mytee.py", line 36, in run
        lists[i].append(next(iters[i]))
    RuntimeError: generator raised StopIteration

(Yuck!)

In short, you want the following instead:

    try:
        mylast = last[1] = last = [next(it), None]
    except StopIteration:
        return

>     def mytee(xs, n):
>         last = [None, None]
>
>         def gen(it, mylast):
>             nonlocal last
>             while True:
>                 mylast = mylast[1]
>                 if not mylast:
>                     mylast = last[1] = last = [next(it), None]

That's smart and obscure :-o
The way it works is that the `last` assignment changes the `last` value
seen by all derived generators, while the `last[1]` assignment updates
the bindings made in the other generators' `mylast` lists...  It's
difficult to find the words to explain it.

The chained assignment makes it more difficult to parse as well (when I
read this I don't know if `last[1]` or `last` gets assigned first;
apparently the answer is `last[1]`, otherwise the recipe wouldn't work
correctly).

Perhaps like this:

    while True:
        mylast = mylast[1]
        if not mylast:
            try:
                # Create new list link
                mylast = [next(it), None]
            except StopIteration:
                return
            else:
                # Append to other generators `mylast` linked lists
                last[1] = mylast
                # Update shared list link
                last = last[1]
        yield mylast[0]

Regards

Antoine.

From storchaka at gmail.com  Sun Apr 15 06:12:13 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sun, 15 Apr 2018 13:12:13 +0300
Subject: [Python-ideas] Rewriting file - pythonic way
In-Reply-To: <1523785794.2055.2@smtp.yandex.ru>
References: <1523782620.2055.0@smtp.yandex.ru>
 <1523785794.2055.2@smtp.yandex.ru>
Message-ID: 

15.04.18 12:49, Alexey Shrub wrote:
> On Sunday, 15 Apr 2018 at 12:40, Serhiy Storchaka wrote:
>> If the problem is that you want to use a single line instead of three
>> lines, you can add a function
>
> Yes, I think that a single line with the word 'rewrite' is much more
> readable than those three lines.
> And yes, I can make my own function, but it is a typical task - maybe it
> must be in the standard library?

Not every three lines of code must be a function in the standard
library. And these three lines don't look common enough.

Actually, reliable code should write into a separate file and replace
the original file with the new file only if writing was successful. Or
back up the old file and restore it if writing fails. Or do both. And
handle hard and soft links if necessary.
And use file locks if needed to
prevent race conditions when reading/writing from different processes.
Depending on the specifics of the application you may need different
code. Your three lines are enough for a one-time script if the risk of
a power blackout or disk space exhaustion is insignificant or if the
data is not critical.

From elazarg at gmail.com  Sun Apr 15 06:22:54 2018
From: elazarg at gmail.com (Elazar)
Date: Sun, 15 Apr 2018 10:22:54 +0000
Subject: [Python-ideas] Rewriting file - pythonic way
In-Reply-To: 
References: <1523782620.2055.0@smtp.yandex.ru>
 <1523785794.2055.2@smtp.yandex.ru>
Message-ID: 

This pitfall sounds like a good reason to have such a function in the
standard library.

Elazar

On Sun, 15 Apr 2018, 13:13, Serhiy Storchaka <storchaka at gmail.com> wrote:

> 15.04.18 12:49, Alexey Shrub wrote:
> > On Sunday, 15 Apr 2018 at 12:40, Serhiy Storchaka wrote:
> >> If the problem is that you want to use a single line instead of three
> >> lines, you can add a function
> >
> > Yes, I think that a single line with the word 'rewrite' is much more
> > readable than those three lines.
> > And yes, I can make my own function, but it is a typical task - maybe it
> > must be in the standard library?
>
> Not every three lines of code must be a function in the standard library.
> And these three lines don't look common enough.
>
> Actually, reliable code should write into a separate file and replace
> the original file with the new file only if writing was successful. Or
> back up the old file and restore it if writing fails. Or do both. And
> handle hard and soft links if necessary. And use file locks if needed to
> prevent race conditions when reading/writing from different processes.
> Depending on the specifics of the application you may need different
> code. Your three lines are enough for a one-time script if the risk of a
> power blackout or disk space exhaustion is insignificant or if the data
> is not critical.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From p.f.moore at gmail.com  Sun Apr 15 06:39:12 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Sun, 15 Apr 2018 11:39:12 +0100
Subject: [Python-ideas] Rewriting file - pythonic way
In-Reply-To: <1523785794.2055.2@smtp.yandex.ru>
References: <1523782620.2055.0@smtp.yandex.ru>
 <1523785794.2055.2@smtp.yandex.ru>
Message-ID: 

On 15 April 2018 at 10:49, Alexey Shrub  wrote:
> On Sunday, 15 Apr 2018 at 12:40, Serhiy Storchaka wrote:
>>
>> If the problem is that you want to use a single line instead of three
>> lines, you can add a function
>
> Yes, I think that a single line with the word 'rewrite' is much more
> readable than those three lines.
> And yes, I can make my own function, but it is a typical task - maybe it
> must be in the standard library?

I don't think it's *that* typical. I don't recall even having wanted to
do this in all the time I've been using Python...

Paul

From p.f.moore at gmail.com  Sun Apr 15 06:47:17 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Sun, 15 Apr 2018 11:47:17 +0100
Subject: [Python-ideas] Rewriting file - pythonic way
In-Reply-To: 
References: <1523782620.2055.0@smtp.yandex.ru>
 <1523785794.2055.2@smtp.yandex.ru>
Message-ID: 

On 15 April 2018 at 11:22, Elazar  wrote:
> On Sun, 15 Apr 2018, 13:13,
>> Actually, reliable code should write into a separate file and replace
>> the original file with the new file only if writing is successful. Or
>> back up the old file and restore it if writing fails. Or do both. And
>> handle hard and soft links if necessary. And use file locks if needed
>> to prevent race conditions when different processes read and write the
>> same file. Depending on the specifics of the application you may need
>> different code. Your three lines are enough for a one-time script if
>> the risk of a power blackout or disk space exhaustion is insignificant
>> or if the data is not critical.
>
> This pitfall sounds like a good reason to have such a function in the
> standard library.

It certainly sounds like a good reason for someone to write a "safe
file rewrite" library function. But I don't think that it's such a
common need that it needs to be a stdlib function. It may well even be
the case that there's such a function already available on PyPI - has
anyone actually checked?

And if there isn't, then writing a module and publishing it there would
seem like a *very* good starting point - as well as allowing the
developer to thrash out the best API, it would also provide for lots of
testing in unusual scenarios that the developer may not have thought
about (Windows file locking is very different from Unix, what counts as
an atomic operation differs between platforms, error handling and
retries may be something to consider, etc). The result would be a useful
package, and the download and activity stats for it would be a great
indication of whether it's a frequent enough need to justify inclusion
in core Python. IMO, it probably isn't. I suspect that most uses would
be fine with the quoted 3-liner, but very few people would need the
sort of robustness that Serhiy is describing (and that level of
robustness *would* be needed for a stdlib implementation). So PyPI is
likely a better home for the "bulletproof" version, and 3 lines of code
is a perfectly acceptable and Pythonic solution for people with simpler
needs.

Paul

From ncoghlan at gmail.com  Sun Apr 15 07:01:55 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 15 Apr 2018 21:01:55 +1000
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: 
References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info>
Message-ID: 

On 15 April 2018 at 19:41, Mikhail V wrote:
> So IIUC, the *only* reason is to avoid '==' and '=' similarity?
> If so, then it does not sound convincing at all.
> Of course Python does me a favor showing an error,
> when I make a typo like this:
> if (x = y)
>
> But still, if this is the only real reason, it is not convincing.

It's thoroughly convincing, because we're already familiar with the
consequences of folks confusing "=" and "==" when writing C & C++ code.
It's an eternal bug magnet, so it's not a design we're ever going to
port over to Python. (And that's even before we get into the parsing
ambiguity problems that attempting to reuse "=" for this purpose in
Python would introduce, especially when it comes to keyword arguments.)
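To make the keyword argument problem concrete, consider a call like this
(a sketch - the comments describe the two competing readings, not any
current or proposed behaviour):

def f(x=0):
    return x

f(x = 1)  # today this can only mean "pass 1 as the keyword argument x";
          # if "=" were also an expression-level assignment, it could
          # equally mean "bind a local name x to 1, then pass 1 as a
          # positional argument"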
The only way Python will ever gain expression level name binding support
is with a spelling *other than* "=", as when that's the proposed
spelling, the answer will be an unequivocal "No, we're not adding that".

Even if the current discussion does come up with a potentially plausible
spelling, the final determination on python-dev may *still* be "No,
we're not going to add that". That isn't a predetermined answer though -
it will depend on whether or not a proposal can be developed that
threads the small gap between "this adds too much new cognitive overhead
to reading and writing the language" and "while this does add more
complexity to the base language, it provides sufficient compensation in
allowing common ideas to be expressed more simply".

> Syntactically seen, I feel strongly that normal '=' would be the way
> to go.
>
> Just look at this:
> y = ((eggs := spam()), (cheese := eggs.method()))
> y = ((eggs = spam()), (cheese = eggs.method()))
>
> The latter is so much cleaner, and already so common to any
> old or new Python user.

Consider how close the second syntax is to
"y = f(eggs=spam(), cheese=fromage())", though.

> Given the fact that the PEP gives quite edge-case
> usage examples only, this should be really more convincing.

The examples in the PEP have been updated to better reflect some of the
key motivating use cases (embedded assignments in if and while statement
conditions, generator expressions, and container comprehensions).

> And as a side note: I personally find the look of ":=" a bit 'noisy'.

You're not alone in that, which is one of the reasons finding a keyword
based option that's less syntactically ambiguous than "as" could be an
attractive alternative.

> Another point:
>
> *Target first vs Expression first*
> =======================
>
> Well, this is nice indeed. Don't you find that first of all it must be
> decided what should be the *overall tendency for Python*?
> Now we have common "x = a + b" everywhere. Then there
> are comprehensions (somewhat mixed direction) and
> "foo as bar" things.
> But wait, is the tendency to "give the freedom"? Then you should
> introduce something like "<--" in the first place so that we can
> write normal assignment in both directions.
> Or is the tendency to convert Python to the "expression first" style
> generally?

There's no general tendency towards expression first syntax, nor towards
offering flexibility in whether ordinary assignments are target first.
All the current cases where we use the "something as target" form are
*not* direct equivalents to "target = something":

* "import dotted.modname as name": also prevents "dotted" getting bound
  in the current scope the way it normally would
* "from dotted import modname as name": also prevents "modname" getting
  bound in the current scope the way it normally would
* "except exc_filter as exc": binds the caught exception, not the
  exception filter
* "with cm as name": binds the result of __enter__ (which may be self),
  not the cm directly

Indeed, https://www.python.org/dev/peps/pep-0343/#motivation-and-summary
points out that it's this "not an ordinary assignment" aspect that led
to with statements using the "with cm as name:" structure in the first
place - the original proposal in PEP 310 was for "with name = cm:" and
ordinary assignment semantics.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From kirillbalunov at gmail.com  Sun Apr 15 07:05:32 2018
From: kirillbalunov at gmail.com (Kirill Balunov)
Date: Sun, 15 Apr 2018 14:05:32 +0300
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: 
References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info>
Message-ID: 

2018-04-15 12:41 GMT+03:00 Mikhail V:

> Exactly, all forms invite this and other questions.
> First of all, coming back to original spelling choice arguments
> [Sorry in advance if I've missed some points in this huge thread]
>
> citation from PEP:
> "Differences from regular assignment statements" [...]
> "Otherwise, the semantics of assignment are unchanged by this proposal."
>
> So basically it's the same Python assignment?
> Then the obvious solution seems to be just to propose "=".
> But I see Chris has put this in the FAQ section:
> "The syntactic similarity between ``if (x == y)`` and ``if (x = y)`` ...."

[OT] To be honest I never liked the fact that `=` was used in various
programming languages for assignment. But it became so common that I got
used to it and even stopped taking a sedative :)

> So IIUC, the *only* reason is to avoid '==' and '=' similarity?
> If so, then it does not sound convincing at all.
> Of course Python does me a favor showing an error,
> when I make a typo like this:
> if (x = y)
>
> But still, if this is the only real reason, it is not convincing.
> Syntactically seen, I feel strongly that normal '=' would be the way
> to go.
>
> Just look at this:
> y = ((eggs := spam()), (cheese := eggs.method()))
> y = ((eggs = spam()), (cheese = eggs.method()))
>
> The latter is so much cleaner, and already so common to any
> old or new Python user. And does not raise a
> question what this ":=" should really mean.
> (Or probably it should raise such a question?)
>
> Given the fact that the PEP gives quite edge-case
> usage examples only, this should be really more convincing.
> And as a side note: I personally find the look of ":=" a bit 'noisy'.

You are not alone. On the other hand, it is one of the strengths of
Python that it does not allow this kind of common and hard-to-find bug.
For me personally, `:=` looks and feels just like a normal assignment
statement which can be used interchangeably, but in many more places in
the code. And if the main goal of the PEP were to offer this `assignment
expression` as a future replacement for the `assignment statement`, the
`:=` syntax form would be a very reasonable proposal (of course in that
case there would be a lot more other questions). But somehow this PEP
does not say that! And with the current rationale of this PEP it's a
huge CON for me that `=` and `:=` feel and look the same.

> Another point:
>
> *Target first vs Expression first*
> =======================
>
> Well, this is nice indeed. Don't you find that first of all it must be
> decided what should be the *overall tendency for Python*?
> Now we have common "x = a + b" everywhere. Then there
> are comprehensions (somewhat mixed direction) and
> "foo as bar" things.
> But wait, is the tendency to "give the freedom"? Then you should
> introduce something like "<--" in the first place so that we can
> write normal assignment in both directions.

As it was noted previously, `<-` would not work because of unary minus
on the right (it parses as `x < -5`):

>>> x = 10
>>> x <- 5
False

> Or is the tendency to convert Python to the "expression first" style
> generally?
>
> So if this question can be answered first, then I think it will be
> more constructive to discuss the choice of particular spellings.

If the idea of the whole PEP was to replace the `assignment statement`
with an `assignment expression`, I would choose name first. If the idea
was to offer an expression with a name-binding side effect, which can be
used in the appropriate places, I would choose expression first.

With kind regards,
-gdg
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ncoghlan at gmail.com  Sun Apr 15 07:40:33 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 15 Apr 2018 21:40:33 +1000
Subject: [Python-ideas] Rewriting file - pythonic way
In-Reply-To: 
References: <1523782620.2055.0@smtp.yandex.ru> <1523785794.2055.2@smtp.yandex.ru>
Message-ID: 

On 15 April 2018 at 20:47, Paul Moore wrote:
> On 15 April 2018 at 11:22, Elazar wrote:
>> On Sun, 15 Apr 2018 at 13:13, Serhiy Storchaka wrote:
>>> Actually, reliable code should write into a separate file and replace
>>> the original file with the new file only if writing is successful. Or
>>> back up the old file and restore it if writing fails. Or do both. And
>>> handle hard and soft links if necessary. And use file locks if needed
>>> to prevent race conditions when different processes read and write
>>> the same file. Depending on the specifics of the application you may
>>> need different code. Your three lines are enough for a one-time
>>> script if the risk of a power blackout or disk space exhaustion is
>>> insignificant or if the data is not critical.
>>
>> This pitfall sounds like a good reason to have such a function in the
>> standard library.
>
> It certainly sounds like a good reason for someone to write a "safe
> file rewrite" library function. But I don't think that it's such a
> common need that it needs to be a stdlib function. It may well even be
> the case that there's such a function already available on PyPI - has
> anyone actually checked?

There wasn't last time I checked (which admittedly was several years ago
now).

The issue is that it's painfully difficult to write a robust
cross-platform "atomic rewrite" operation that can cleanly handle a wide
range of arbitrary use cases - instead, folks are more likely to write
simpler alternatives that work well enough given whichever simplifying
assumptions are applicable to their use case (which may even include "I
don't care about atomicity, and am quite happy to let a poorly timed
Ctrl-C or unexpected system shutdown corrupt the file I'm rewriting").

https://bugs.python.org/issue8604#msg174104 is the relevant tracker
discussion (deliberately linking into the middle of it, since the early
part is akin to this thread: reactions mostly along the lines of "that's
easy, and doesn't need to be in the standard library". It definitely
*isn't* easy, but it's also challenging to publish on PyPI, since it's a
quagmire of platform specific complexity and edge cases, if you mess it
up you can cause significant data loss, and anyone that already knows
they need atomic rewrites is likely to be able to come up with their own
purpose specific implementation in less time than it would take them to
assess the suitability of 3rd party alternatives).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From rosuav at gmail.com  Sun Apr 15 08:21:02 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 15 Apr 2018 22:21:02 +1000
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: 
References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info>
Message-ID: 

On Sun, Apr 15, 2018 at 7:19 PM, Kirill Balunov wrote:
>> === Expression first, 'as' keyword ===
>>
>> while (read_next_item() as value) is not None:
>>     ...
>>
>> Pros:
>>
>> * typically reads nicely as pseudocode
>> * "as" is already associated with namebinding operations
>>
>
> I understand that this list is subjective. But as for me it will be a
> huge PRO that the expression comes first.
I don't think we're ever going to unify everyone on an arbitrary
question of "expression first" or "name first". But to all the
"expression first" people, a question: what if the target is not just a
simple name?

while (read_next_item() -> items[i + 1 -> i]) is not None:
    print("%d/%d..." % (i, len(items)), end="\r")

Does this make sense? With the target coming first, it perfectly
parallels the existing form of assignment:

>>> items = [None] * 10
>>> i = -1
>>> i, items[i] = i+1, input("> ")
> asdf
>>> i, items[i] = i+1, input("> ")
> qwer
>>> i, items[i] = i+1, input("> ")
> zxcv
>>> items
['asdf', 'qwer', 'zxcv', None, None, None, None, None, None, None]

The unpacking syntax is a bit messy, but with expression assignment, we
can do this:

>>> items = [None] * 10
>>> i = -1
>>> items[i := i + 1] = input("> ")
> asdf
>>> items[i := i + 1] = input("> ")
> qwer
>>> items[i := i + 1] = input("> ")
> zxcv
>>>
>>> items
['asdf', 'qwer', 'zxcv', None, None, None, None, None, None, None]

Okay, it's not quite as simple as C's "items[i++]" (since you have to
start i off at negative one so you can pre-increment), but it's still
logical and sane.

Are you as happy with that sort of complex expression coming after 'as'
or '->'? Not a rhetorical question. I'm genuinely curious as to whether
people are expecting "expression -> NAME" or "expression -> TARGET",
where TARGET can be any valid assignment target.

ChrisA

From ashrub at yandex.ru  Sun Apr 15 10:15:55 2018
From: ashrub at yandex.ru (Alexey Shrub)
Date: Sun, 15 Apr 2018 17:15:55 +0300
Subject: [Python-ideas] Rewriting file - pythonic way
In-Reply-To: 
References: <1523782620.2055.0@smtp.yandex.ru> <1523785794.2055.2@smtp.yandex.ru>
Message-ID: <1523801755.2055.4@smtp.yandex.ru>

On Sunday, 15 Apr 2018 at 2:40, Nick Coghlan wrote:
> https://bugs.python.org/issue8604#msg174104 is the relevant tracker
> discussion

Thanks all, I agree that a universal and absolutely safe solution is
very difficult, but as an experiment I made a draft:
https://github.com/worldmind/scripts/tree/master/filerewrite

The main code is here:
https://github.com/worldmind/scripts/blob/master/filerewrite/filerewrite.py#L46

From kirillbalunov at gmail.com  Sun Apr 15 10:17:53 2018
From: kirillbalunov at gmail.com (Kirill Balunov)
Date: Sun, 15 Apr 2018 17:17:53 +0300
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: 
References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info>
Message-ID: 

2018-04-15 15:21 GMT+03:00 Chris Angelico:

> On Sun, Apr 15, 2018 at 7:19 PM, Kirill Balunov wrote:
> >> === Expression first, 'as' keyword ===
> >>
> >> while (read_next_item() as value) is not None:
> >>     ...
> >>
> >> Pros:
> >>
> >> * typically reads nicely as pseudocode
> >> * "as" is already associated with namebinding operations
> >>
> >
> > I understand that this list is subjective. But as for me it will be a
> > huge PRO that the expression comes first.
>
> I don't think we're ever going to unify everyone on an arbitrary
> question of "expression first" or "name first". But to all the
> "expression first" people, a question: what if the target is not just
> a simple name?
>
> while (read_next_item() -> items[i + 1 -> i]) is not None:
>     print("%d/%d..." % (i, len(items)), end="\r")
>
> [...]
>
> Not a rhetorical question. I'm genuinely curious as to whether people
> are expecting "expression -> NAME" or "expression -> TARGET", where
> TARGET can be any valid assignment target.
I completely agree with you that it is impossible to unify everyone's
opinion - we all have different backgrounds. But this example is more
likely to play against this PEP. It is extra complexity within one line
and it can fail hard in at least three obvious places :) And I am
against this usage no matter `name first` or `expression first`. But I
will re-ask this with the following snippets. Which do you choose from
these examples:

0.

while (items[i := i+1] := read_next_item()) is not None:
    print(r'%d/%d' % (i, len(items)), end='\r')

1.

while (read_next_item() -> items[(i+1) -> i]) is not None:
    print(r'%d/%d' % (i, len(items)), end='\r')

2.

while (item := read_next_item()) is not None:
    items[i := (i+1)] = item
    print(r'%d/%d' % (i, len(items)), end='\r')

3.

while (read_next_item() -> item) is not None:
    items[(i+1) -> i] = item
    print(r'%d/%d' % (i, len(items)), end='\r')

4.

while (item := read_next_item()) is not None:
    i = i+1
    items[i] = item
    print(r'%d/%d' % (i, len(items)), end='\r')

5.

while (read_next_item() -> item) is not None:
    i = i+1
    items[i] = item
    print(r'%d/%d' % (i, len(items)), end='\r')

I am definitely OK with both 2 and 3 here. But as it was noted, `:=`
produces additional noise in other places, and I am also an `expression
first` guy :) So I still prefer variant 3 to 2. But to be completely
honest, I would write it in the following way:

for item in iter(read_next_item, None):
    items.append(item)
    print(r'%d/%d' % (i, len(items)), end='\r')

With kind regards,
-gdg
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From kirillbalunov at gmail.com  Sun Apr 15 10:22:09 2018
From: kirillbalunov at gmail.com (Kirill Balunov)
Date: Sun, 15 Apr 2018 17:22:09 +0300
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: 
References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info>
Message-ID: 

2018-04-15 17:17 GMT+03:00 Kirill Balunov:

> for item in iter(read_next_item, None):
>     items.append(item)
>     print(r'%d/%d' % (i, len(items)), end='\r')
>
> With kind regards,
> -gdg

Oh, I forgot about `i`:

for item in iter(read_next_item, None):
    i += 1
    items.append(item)
    print(r'%d/%d' % (i, len(items)), end='\r')

With kind regards,
-gdg
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From yaoxiansamma at gmail.com  Sun Apr 15 10:27:09 2018
From: yaoxiansamma at gmail.com (Thautwarm Zhao)
Date: Sun, 15 Apr 2018 22:27:09 +0800
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
Message-ID: 

> To me, "from" strongly suggests that an element is being obtained from
> a container/collection of elements. This is how I conceptualize "from
> module import name": "name" refers to an object INSIDE the module, not
> the module itself. If I saw
>
>     if (match from pattern.search(data)) is not None: ...
>
> I would guess that it is equivalent to
>
>     m = next(pattern.search(data))
>     if m is not None: ...

+1, although unpacking seems reasonable too: `[elem1, *elems] from
contains`.

Now we have
- "expr as name"
- "name := expr"
- "expr -> name"
- "name from expr"

Personally I prefer "as", but I think without a big change to the Python
Grammar file it's impossible to avoid parsing "with expr as name" into
"with (expr as name)", because "expr as name" would itself be an "expr".
I have mentioned this in previous discussions and it seems it's better
to warn you all again. I don't think the people of Python-Dev are
willing to implement a totally new Python compiler.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From yaoxiansamma at gmail.com  Sun Apr 15 11:11:37 2018
From: yaoxiansamma at gmail.com (Thautwarm Zhao)
Date: Sun, 15 Apr 2018 23:11:37 +0800
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
Message-ID: 

> 0.
>
> while (items[i := i+1] := read_next_item()) is not None:
>     print(r'%d/%d' % (i, len(items)), end='\r')
>
> 1.
>
> while (read_next_item() -> items[(i+1) -> i]) is not None:
>     print(r'%d/%d' % (i, len(items)), end='\r')
>
> 2.
>
> while (item := read_next_item()) is not None:
>     items[i := (i+1)] = item
>     print(r'%d/%d' % (i, len(items)), end='\r')
>
> 3.
>
> while (read_next_item() -> item) is not None:
>     items[(i+1) -> i] = item
>     print(r'%d/%d' % (i, len(items)), end='\r')
>
> 4.
>
> while (item := read_next_item()) is not None:
>     i = i+1
>     items[i] = item
>     print(r'%d/%d' % (i, len(items)), end='\r')
>
> 5.
>
> while (read_next_item() -> item) is not None:
>     i = i+1
>     items[i] = item
>     print(r'%d/%d' % (i, len(items)), end='\r')

Also 2 or 3 for me. The 3rd one reads in the order of natural language:
get the next item and assign it to `item`, and while it's not None, do
some stuff. However, just as we have pointed out, the semantics of '->'
is quite different from the cases it's currently used in, so it should
be handled much more carefully.

I think maybe we can use unicode characters like ≜ (\triangleq) and add
support for unicode completion to the Python REPL. The unicode
completion of editors and IDEs is already quite mature.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From phd at phdru.name  Sun Apr 15 11:19:49 2018
From: phd at phdru.name (Oleg Broytman)
Date: Sun, 15 Apr 2018 17:19:49 +0200
Subject: [Python-ideas] Rewriting file - pythonic way
In-Reply-To: <1523801755.2055.4@smtp.yandex.ru>
References: <1523782620.2055.0@smtp.yandex.ru> <1523785794.2055.2@smtp.yandex.ru> <1523801755.2055.4@smtp.yandex.ru>
Message-ID: <20180415151949.r42vffyunhix36na@phdru.name>

On Sun, Apr 15, 2018 at 05:15:55PM +0300, Alexey Shrub wrote:
> On Sunday, 15 Apr 2018 at 2:40, Nick Coghlan wrote:
> > https://bugs.python.org/issue8604#msg174104 is the relevant tracker
> > discussion
>
> Thanks all, I agree that a universal and absolutely safe solution is
> very difficult, but as an experiment I made a draft:
> https://github.com/worldmind/scripts/tree/master/filerewrite

Good!

> The main code is here:
> https://github.com/worldmind/scripts/blob/master/filerewrite/filerewrite.py#L46

Can I recommend catching exceptions in `backuper.backup()`, cleaning up
the backuper and unlocking the locker?

Oleg.
-- 
Oleg Broytman http://phdru.name/ phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.

From steve at pearwood.info  Sun Apr 15 11:58:06 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 16 Apr 2018 01:58:06 +1000
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: 
References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info>
Message-ID: <20180415155805.GI11616@ando.pearwood.info>

On Sun, Apr 15, 2018 at 10:21:02PM +1000, Chris Angelico wrote:

> I don't think we're ever going to unify everyone on an arbitrary
> question of "expression first" or "name first". But to all the
> "expression first" people, a question: what if the target is not just
> a simple name?
>
> while (read_next_item() -> items[i + 1 -> i]) is not None:
>     print("%d/%d..." % (i, len(items)), end="\r")
I don't see why it would make a difference. It doesn't to me.

> Does this make sense? With the target coming first, it perfectly
> parallels the existing form of assignment:

Yes, except this isn't ordinary assignment-as-a-statement.

I've been mulling over the question of why I think the expression needs
to come first here, whereas I'm satisfied with the target coming first
for assignment statements, and I think I've finally got the words to
explain it. It is not just long familiarity with maths and languages
that put the variable first (although that's also part of it). It has to
do with what we're looking for when we read code, specifically what is
the primary piece of information we're initially looking for.

In assignment STATEMENTS the primary piece of information is the target.
Yes, of course the value assigned to the target is important, but often
we don't care what the value is, at least not at first. We're hunting
for a known target, and only when we find it do we care about the value
it gets.

A typical scenario: I'm reading a function, and I scan down the block
looking at the start of each line until I find the variable I want:

    spam = don't care
    eggs = don't care
    self.method(don't care)
    cheese = ...   <<<==== HERE IT IS

so it actually helps to have the name up front. Copying standard maths
notation for assignment (variable first, value second) is a good thing
for statements.

With assignment-statements, if you're scanning the code for a variable
name, you're necessarily interested in the name and it will be helpful
to have it on the left.

But with assignment-expressions, there's an additional circumstance:
sometimes you don't care about the name, you only care what the value
is. (I expect this will be more common.) The name is just something to
skip over when you're scanning the code looking for the value.

    # what did I pass as the fifth argument to the function?
    result = some_func(don't care, spam := don't care, eggs := don't care, self.method(don't care), cheese := HERE IT IS, ...)

Of course it's hard counting commas so it's probably better to add a bit
of structure to your function call:

    result = some_func(don't care,
                       spam := don't care,
                       eggs := don't care,
                       self.method(don't care),
                       cheese := HERE IT IS,
                       ...)

But this time we don't care about the name. It's the value we care
about:

    result = some_func(don't care,
                       don't care -> don't care
                       don't care -> don't care
                       don't care(don't care),
                       HERE IT IS .... ,
                       ...)

The target is just one more thing you have to ignore, and it is helpful
to have expression first and the target second.

Some more examples:

    # what am I adding to the total?
    total += don't care := expression

    # what key am I looking up?
    print(mapping[don't care := key])

    # how many items did I just skip?
    self.skip(don't care := obj.start + extra)

versus

    total += expression -> don't care
    print(mapping[key -> don't care])
    self.skip(obj.start + extra -> don't care)

It is appropriate for assignment statements and expressions to be
written differently because they are used differently.

[...]
> >>> items = [None] * 10
> >>> i = -1
> >>> items[i := i + 1] = input("> ")
> > asdf
> >>> items[i := i + 1] = input("> ")
> > qwer
> >>> items[i := i + 1] = input("> ")
> > zxcv
> >>>
> >>> items
> ['asdf', 'qwer', 'zxcv', None, None, None, None, None, None, None]

I don't know why you would write that instead of:

    items = [None]*10
    for i in range(3):
        items[i] = input("> ")

or even for that matter:

    items = [input("> ") for i in range(3)] + [None]*7

but whatever floats your boat. (Python isn't just not Java. It's also
not C *wink*)

> Are you as happy with that sort of complex
> expression coming after 'as' or '->'?

Sure. Ignoring the output of the calls to input():

    items = [None] * 10
    i = -1
    items[i + 1 -> i] = input("> ")
    items[i + 1 -> i] = input("> ")
    items[i + 1 -> i] = input("> ")

which isn't really such a complex target. How about this instead?

    obj = SimpleNamespace(spam=None, eggs=None,
                          aardvark={'key': [None, None, -1]})

    items[obj.aardvark['key'][2] + 1 -> obj.aardvark['key'][2]] = input("> ")

versus:

    items[obj.aardvark['key'][2] := obj.aardvark['key'][2] + 1] = input("> ")

Neither is exactly a shining exemplar of great code designed for
readability. But putting the target on the right doesn't make it worse.

-- 
Steve

From mahmoud at hatnote.com  Sun Apr 15 12:10:57 2018
From: mahmoud at hatnote.com (Mahmoud Hashemi)
Date: Sun, 15 Apr 2018 09:10:57 -0700
Subject: [Python-ideas] Rewriting file - pythonic way
In-Reply-To: <20180415151949.r42vffyunhix36na@phdru.name>
References: <1523782620.2055.0@smtp.yandex.ru> <1523785794.2055.2@smtp.yandex.ru> <1523801755.2055.4@smtp.yandex.ru> <20180415151949.r42vffyunhix36na@phdru.name>
Message-ID: 

Depending on how firm your requirements around locking are, you may find
this code useful:
https://github.com/mahmoud/boltons/blob/6b0721b6aeda6d3ec6f5d31be7c741bc7fcc4635/boltons/fileutils.py#L303

(docs here:
http://boltons.readthedocs.io/en/latest/fileutils.html#atomic-file-saving )

Basically every operating system has _some_ way of doing an atomic file
replacement, letting us guarantee that a file at a given location is
always valid. atomic_save provides a unified interface to that
cross-platform behavior.

The code does not do locking, as neither I nor its other users have
wanted it, but I'd be happy to extend it if there's a sensible default.
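Basic usage is roughly the following (a sketch - the exact signature
here is an assumption, the docs linked above are authoritative, and the
filename is made up):

from boltons.fileutils import atomic_save

# atomic_save() yields a writable (binary) file object backed by a
# temporary file; on a clean exit the temporary file atomically
# replaces 'example.txt', on an exception the original is untouched.
with atomic_save('example.txt') as f:
    f.write(b'all or nothing\n')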
On Sun, Apr 15, 2018 at 8:19 AM, Oleg Broytman wrote:

> On Sun, Apr 15, 2018 at 05:15:55PM +0300, Alexey Shrub wrote:
> > On Sunday, 15 Apr 2018 at 2:40, Nick Coghlan wrote:
> > > https://bugs.python.org/issue8604#msg174104 is the relevant
> > > tracker discussion
> >
> > Thanks all, I agree that a universal and absolutely safe solution is
> > very difficult, but as an experiment I made a draft:
> > https://github.com/worldmind/scripts/tree/master/filerewrite
>
> Good!
>
> > The main code is here:
> > https://github.com/worldmind/scripts/blob/master/filerewrite/filerewrite.py#L46
>
> Can I recommend catching exceptions in `backuper.backup()`, cleaning
> up the backuper and unlocking the locker?
>
> Oleg.
> -- 
> Oleg Broytman http://phdru.name/ phd at phdru.name
> Programmers don't die, they just GOSUB without RETURN.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tim.peters at gmail.com  Sun Apr 15 12:15:18 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 15 Apr 2018 11:15:18 -0500
Subject: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
In-Reply-To: 
References: 
Message-ID: 

[Raymond Hettinger]
> Q. Do other languages do it?
> A. Numpy, no. R, no. APL, no. Mathematica, no. Haskell, yes.
>
> * http://docs.scipy.org/doc/numpy/reference/generated/numpy.ufunc.accumulate.html
> * https://stat.ethz.ch/R-manual/R-devel/library/base/html/cumsum.html
> * http://microapl.com/apl/apl_concepts_chapter5.html
>   \+ 1 2 3 4 5
>   1 3 6 10 15
> * https://reference.wolfram.com/language/ref/Accumulate.html
> * https://www.haskell.org/hoogle/?hoogle=mapAccumL

There's also C++, which is pretty much "yes" to every variation
discussed so far:

* partial_sum() is like Python's current accumulate(), including
defaulting to doing addition.

  http://en.cppreference.com/w/cpp/algorithm/partial_sum

* inclusive_scan() is also like accumulate(), but allows an optional
"init" argument (which is returned if specified), and there's no
guarantee of "left-to-right" evaluation (it's intended for associative
binary functions, and wants to allow parallelism in the
implementation).

  http://en.cppreference.com/w/cpp/algorithm/inclusive_scan

* exclusive_scan() is like inclusive_scan(), but _requires_ an "init"
argument (which is not returned).

  http://en.cppreference.com/w/cpp/algorithm/exclusive_scan

* accumulate() is like Python's functools.reduce(), but the operation
is optional and defaults to addition, and an "init" argument is
required.

  http://en.cppreference.com/w/cpp/algorithm/accumulate
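For reference, a pure-Python sketch of what an accumulate() with an
optional start value could look like - the keyword name "initial", and
the choice to yield the start value first (mirroring inclusive_scan
above), are placeholders for discussion, not a settled design:

import operator

def accumulate(iterable, func=operator.add, *, initial=None):
    # Like itertools.accumulate(), but optionally seeds the running
    # total with `initial`, which is then also yielded first.
    it = iter(iterable)
    total = initial
    if total is None:
        try:
            total = next(it)
        except StopIteration:
            return
    yield total
    for element in it:
        total = func(total, element)
        yield total

With that, list(accumulate([1, 2, 3], initial=100)) would give
[100, 101, 103, 106].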
From steve at pearwood.info  Sun Apr 15 12:15:59 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 16 Apr 2018 02:15:59 +1000
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: 
References: 
Message-ID: <20180415161559.GJ11616@ando.pearwood.info>

On Sun, Apr 15, 2018 at 11:11:37PM +0800, Thautwarm Zhao wrote:

> I think maybe we can use unicode characters like ≜ (\triangleq) and add
> support for unicode completion to the Python REPL. The unicode
> completion of editors and IDEs is already quite mature.

What key combination do I need to type to get ≜ in the following editors
please? I tried typing \triangleq but all I got was \triangleq.

Notepad (Windows)
Brackets (Mac)
BBEdit (Mac)
kwrite (Linux)
kate
nano
geany
gedit

as well as IDLE, my mail client (kmail, Thunderbird or mutt), my web
browsers (Firefox, Opera and Chromium), the interactive interpreter in
various different consoles, my Usenet client (Pan and KNode) and IRC
(pidgin).

Oh, having it work in LibreOffice and GoogleApps too would be nice,
although not essential since I don't often write code in them.

And what decent fonts do I need to install for ≜ to show up as something
other than a square box ("missing glyph")?

-- 
Steve

From phd at phdru.name  Sun Apr 15 12:49:16 2018
From: phd at phdru.name (Oleg Broytman)
Date: Sun, 15 Apr 2018 18:49:16 +0200
Subject: [Python-ideas] Rewriting file - pythonic way
In-Reply-To: 
References: <1523782620.2055.0@smtp.yandex.ru> <1523785794.2055.2@smtp.yandex.ru> <1523801755.2055.4@smtp.yandex.ru> <20180415151949.r42vffyunhix36na@phdru.name>
Message-ID: <20180415164916.ea5g4gvurqweu2xx@phdru.name>

On Sun, Apr 15, 2018 at 09:10:57AM -0700, Mahmoud Hashemi wrote:
> Depending on how firm your requirements around locking are, you may
> find this code useful:
> https://github.com/mahmoud/boltons/blob/6b0721b6aeda6d3ec6f5d31be7c741bc7fcc4635/boltons/fileutils.py#L303
>
> (docs here:
> http://boltons.readthedocs.io/en/latest/fileutils.html#atomic-file-saving )
>
> Basically every operating system has _some_ way of doing an atomic file
> replacement, letting us guarantee that a file at a given location is
> always valid. atomic_save provides a unified interface to that
> cross-platform behavior.
>
> The code does not do locking, as neither I nor its other users have
> wanted it, but I'd be happy to extend it if there's a sensible default.

I don't like that it renames the file at the end. Renaming could lead to
changed file ownership and permissions; restoring permissions is not
always possible, and restoring ownership is almost never possible.
Renaming is also not always possible due to restricted directory
permissions.

> On Sun, Apr 15, 2018 at 8:19 AM, Oleg Broytman wrote:
>
> > On Sun, Apr 15, 2018 at 05:15:55PM +0300, Alexey Shrub wrote:
> > > On Sunday, 15 Apr 2018 at 2:40, Nick Coghlan wrote:
> > > > https://bugs.python.org/issue8604#msg174104 is the relevant
> > > > tracker discussion
> > >
> > > Thanks all, I agree that a universal and absolutely safe solution
> > > is very difficult, but as an experiment I made a draft:
> > > https://github.com/worldmind/scripts/tree/master/filerewrite
> >
> > Good!
> >
> > > The main code is here:
> > > https://github.com/worldmind/scripts/blob/master/filerewrite/filerewrite.py#L46
> >
> > Can I recommend catching exceptions in `backuper.backup()`, cleaning
> > up the backuper and unlocking the locker?

Oleg.
-- 
Oleg Broytman http://phdru.name/ phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.

From tim.peters at gmail.com  Sun Apr 15 12:52:49 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 15 Apr 2018 11:52:49 -0500
Subject: [Python-ideas] A cute Python implementation of itertools.tee
In-Reply-To: <20180415115546.33d66718@fsol>
References: <20180415115546.33d66718@fsol>
Message-ID: 

[Antoine Pitrou]
> This implementation doesn't work with Python 3.7 or 3.8.
> I've tried it here:
> https://gist.github.com/pitrou/b3991f638300edb6d06b5be23a4c66d6
>
> and get:
> Traceback (most recent call last):
>   File "mytee.py", line 14, in gen
>     mylast = last[1] = last = [next(it), None]
> StopIteration
>
> The above exception was the direct cause of the following exception:
>
> Traceback (most recent call last):
>   File "mytee.py", line 47, in <module>
>     run(mytee1)
>   File "mytee.py", line 36, in run
>     lists[i].append(next(iters[i]))
> RuntimeError: generator raised StopIteration
>
> (Yuck!)

Thanks for trying! I wonder whether that will break other code.
I wrote PEP 255, and this part was intentional at the time:

"""
If an unhandled exception-- including, but not limited to, StopIteration
--is raised by, OR PASSES THROUGH [emphasis added], a generator function,
then the exception is passed on to the caller in the usual way, and
subsequent attempts to resume the generator function raise StopIteration.
"""

I've exploited that a number of times.

> In short, you want the following instead:
>
>     try:
>         mylast = last[1] = last = [next(it), None]
>     except StopIteration:
>         return

No, I don't ;-) If I have to catch StopIteration myself now, then I want
the entire "while True:" loop in the "try" block. Setting up try/except
machinery anew on each iteration would add significant overhead; doing
it just once per derived generator wouldn't.

>> def mytee(xs, n):
>>     last = [None, None]
>>
>>     def gen(it, mylast):
>>         nonlocal last
>>         while True:
>>             mylast = mylast[1]
>>             if not mylast:
>>                 mylast = last[1] = last = [next(it), None]

> That's smart and obscure :-o
> The way it works is that the `last` assignment changes the `last` value
> seen by all derived generators, while the `last[1]` assignment updates
> the bindings made in the other generators' `mylast` lists... It's
> difficult to find the words to explain it.

Which is why I didn't even try - I did warn people that if they thought
it "was obvious", they hadn't yet thought hard enough ;-) Good job!

> The chained assignment makes it more difficult to parse as well (when I
> read this I don't know if `last[1]` or `last` gets assigned first;
> apparently the answer is `last[1]`, otherwise the recipe wouldn't work
> correctly).

Ya, I had to look it up too :-) Although, like almost everything else in
Python, chained assignments proceed "left to right". I was just trying
to make it as short as possible, to increase the "huh - can something
that tiny really work?!" healthy skepticism factor :-)

> Perhaps like this:
>
>     while True:
>         mylast = mylast[1]
>         if not mylast:
>             try:
>                 # Create new list link
>                 mylast = [next(it), None]
>             except StopIteration:
>                 return
>             else:
>                 # Append to other generators' `mylast` linked lists
>                 last[1] = mylast
>                 # Update shared list link
>                 last = last[1]
>         yield mylast[0]

I certainly agree that's easier to follow. But that wasn't really the
point ;-)

From kirillbalunov at gmail.com  Sun Apr 15 13:07:39 2018
From: kirillbalunov at gmail.com (Kirill Balunov)
Date: Sun, 15 Apr 2018 20:07:39 +0300
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: <20180415155805.GI11616@ando.pearwood.info>
References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info> <20180415155805.GI11616@ando.pearwood.info>
Message-ID: 

2018-04-15 18:58 GMT+03:00 Steven D'Aprano:

> [...] But this time we don't care about the name. It's the value we
> care about:
>
>     result = some_func(don't care,
>                        don't care -> don't care
>                        don't care -> don't care
>                        don't care(don't care),
>                        HERE IT IS .... ,
>                        ...)

This made my day! :) The programming style when you absolutely don't
care :))) I understand that this is a typo but it turned out to be very
funny. In general, I agree with everything you've said. And I think you
found a very clear way to explain why the expression should go first in
an assignment expression.

With kind regards,
-gdg
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From guido at python.org  Sun Apr 15 13:19:57 2018
From: guido at python.org (Guido van Rossum)
Date: Sun, 15 Apr 2018 10:19:57 -0700
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: 
References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info>
Message-ID: 

On Sun, Apr 15, 2018 at 4:05 AM, Kirill Balunov wrote:

> [...] For me personally, `:=` looks and feels just like a normal
> assignment statement which can be used interchangeably but in many more
> places in the code. And if the main goal of the PEP were to offer this
> `assignment expression` as a future replacement for the `assignment
> statement` the `:=` syntax form would be a very reasonable proposal (of
> course in that case there would be a lot more other questions).

I haven't kept up with what's in the PEP (or much of this thread), but
this is the key reason I strongly prefer := as inline assignment
operator.

> But somehow this PEP does not say that! And with the current rationale
> of this PEP it's a huge CON for me that `=` and `:=` feel and look the
> same.

Then maybe the PEP needs to be updated.

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From storchaka at gmail.com  Sun Apr 15 13:35:51 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sun, 15 Apr 2018 20:35:51 +0300
Subject: [Python-ideas] A cute Python implementation of itertools.tee
In-Reply-To: 
References: <20180415115546.33d66718@fsol>
Message-ID: 

15.04.18 19:52, Tim Peters wrote:
> No, I don't ;-) If I have to catch StopIteration myself now, then I
> want the entire "while True:" loop in the "try" block. Setting up
> try/except machinery anew on each iteration would add significant
> overhead; doing it just once per derived generator wouldn't.

This overhead is around 10% of the time for calling `next(it)`. It may
be less than 1-2% of the whole step of mytee iteration. I have ideas
about implementing zero-overhead try/except, but I have doubts that it
is worth it. The benefit seems too small.
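A rough way to measure the difference under discussion - a try/except
set up once per next() call versus once around the whole loop (numbers
are of course machine- and version-dependent):

import timeit

setup = "data = list(range(10000))"

per_call = """
it = iter(data)
while True:
    try:
        x = next(it)
    except StopIteration:
        break
"""

per_loop = """
it = iter(data)
try:
    while True:
        x = next(it)
except StopIteration:
    pass
"""

print(timeit.timeit(per_call, setup, number=1000))
print(timeit.timeit(per_loop, setup, number=1000))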
URL: From guido at python.org Sun Apr 15 13:19:57 2018 From: guido at python.org (Guido van Rossum) Date: Sun, 15 Apr 2018 10:19:57 -0700 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) In-Reply-To: References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info> Message-ID: On Sun, Apr 15, 2018 at 4:05 AM, Kirill Balunov wrote: > [...] For me personally, `: =` looks and feels just like normal assignment > statement which can be used interchangeable but in many more places in the > code. And if the main goal of the PEP was to offer this `assignment > expression` as a future replacement for `assignment statement` the `:=` > syntax form would be the very reasonable proposal (of course in this case > there will be a lot more other questions). > I haven't kept up with what's in the PEP (or much of this thread), but this is the key reason I strongly prefer := as inline assignment operator. > But somehow this PEP does not mean it! And with the current rationale of > this PEP it's a huge CON for me that `=` and `:=` feel and look the same. > Then maybe the PEP needs to be updated. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Sun Apr 15 13:35:51 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 15 Apr 2018 20:35:51 +0300 Subject: [Python-ideas] A cute Python implementation of itertools.tee In-Reply-To: References: <20180415115546.33d66718@fsol> Message-ID: 15.04.18 19:52, Tim Peters ????: > No, I don't ;-) If I have to catch StopIteration myself now, then I > want the entire "white True:" loop in the "try" block. Setting up > try/except machinery anew on each iteration would add significant > overhead; doing it just once per derived generator wouldn't. This overhead is around 10% of the time for calling `next(it)`. It may be less than 1-2% of the whole step of mytee iteration. I have ideas about implementing zero-overhead try/except, but I have doubts that it is worth. The benefit seems too small. From gadgetsteve at live.co.uk Sun Apr 15 13:45:20 2018 From: gadgetsteve at live.co.uk (Steve Barnes) Date: Sun, 15 Apr 2018 17:45:20 +0000 Subject: [Python-ideas] Idea: Importing from arbitrary filenames In-Reply-To: References: Message-ID: On 15/04/2018 08:12, Nick Coghlan wrote: > On 14 April 2018 at 19:22, Steve Barnes wrote: >> I generally love the current import system for "just working" regardless >> of platform, installation details, etc., but what I would like to see is >> a clear import local, (as opposed to import from wherever you can find >> something to satisfy mechanism). This is the one thing that I miss from >> C/C++ where #include is system includes and #include "x" search >> differing include paths, (if used well). > > For the latter purpose, we prefer that folks use either explicit > relative imports (if they want to search the current package > specifically), or else direct manipulation of package.__path__. > > That is, if you do: > > from . 
import custom_imports # Definitely from your own project > custom_imports.__path__[:] = (some_directory, some_other_directory) > > then: > > from .custom_imports import name > > will search those directories for packages & modules to import, while > still cleanly mapping to a well-defined location in the module > namespace for the process as a whole (and hence being able to use all > the same caches as other imports, without causing name conflicts or > other problems). > > If you want to do this dynamically relative to the current module, > then it's possible to do: > > global __path__ > __path__[:] = (some_directory, some_other_directory) > custom_mod = importlib.import_module(".name", package=__name__) > > The discoverability of these kinds of techniques could definitely > stand to be improved, but the benefit of adopting them is that they > work on all currently supported versions of Python (even > importlib.import_module exists in Python 2.7 as a convenience wrapper > around __import__), rather than needing to wait for new language level > syntax for them. > > Cheers, > Nick. > Thanks Nick, As you say not too discoverable at the moment - I have just reread PEP328 & https://docs.python.org/3/library/importlib.html but did not find any mention of these mechanisms or even that setting an external __path__ variable existed as a possibility. Maybe a documentation enhancement proposal would be in order? -- Steve (Gadget) Barnes Any opinions in this message are my personal opinions and do not reflect those of my employer. --- This email has been checked for viruses by AVG. http://www.avg.com From yaoxiansamma at gmail.com Sun Apr 15 14:58:22 2018 From: yaoxiansamma at gmail.com (Thautwarm Zhao) Date: Sun, 15 Apr 2018 18:58:22 +0000 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) Message-ID: Dear Steve, I'm sorry to annoy you by my proposal, but I do think using unicode might be wise in current stage. \triangleq could be print with unicode number \u225c, and adding plugins to support typing this in editors could be easy, just simply map \xxx to the specific unicode char when we press the tab after typing it. People using Julia language are proud of it but I think it's just something convenient could be used in any other language. There are other reasons to support unicode but it's out of this topic. Although ':=' and '->' are not perfect, in the range of ASCII it seems to be impossible to find a better one. ? 2018?4?16??? ??12:53??? > Send Python-ideas mailing list submissions to > python-ideas at python.org > > To subscribe or unsubscribe via the World Wide Web, visit > https://mail.python.org/mailman/listinfo/python-ideas > or, via email, send a message with subject or body 'help' to > python-ideas-request at python.org > > You can reach the person managing the list at > python-ideas-owner at python.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of Python-ideas digest..." > Today's Topics: > > 1. Re: Rewriting file - pythonic way (Mahmoud Hashemi) > 2. Re: Start argument for itertools.accumulate() [Was: Proposal: > A Reduce-Map Comprehension and a "last" builtin] (Tim Peters) > 3. Re: Spelling of Assignment Expressions PEP 572 (was post #4) > (Steven D'Aprano) > 4. Re: Rewriting file - pythonic way (Oleg Broytman) > 5. 
Re: A cute Python implementation of itertools.tee (Tim Peters) > > > > ---------- Forwarded message ---------- > From: Mahmoud Hashemi > To: python-ideas > Cc: > Bcc: > Date: Sun, 15 Apr 2018 09:10:57 -0700 > Subject: Re: [Python-ideas] Rewriting file - pythonic way > Depending on how firm your requirements around locking are, you may find > this code useful: > https://github.com/mahmoud/boltons/blob/6b0721b6aeda6d3ec6f5d31be7c741bc7fcc4635/boltons/fileutils.py#L303 > > (docs here: > http://boltons.readthedocs.io/en/latest/fileutils.html#atomic-file-saving > ) > > Basically every operating system has _some_ way of doing an atomic file > replacement, letting us guarantee that a file at a given location is always > valid. atomic_save provides a unified interface to that cross-platform > behavior. > > The code does not do locking, as neither I nor its other users have wanted > it, but I'd be happy to extend it if there's a sensible default. > > On Sun, Apr 15, 2018 at 8:19 AM, Oleg Broytman wrote: > >> On Sun, Apr 15, 2018 at 05:15:55PM +0300, Alexey Shrub >> wrote: >> > ? ???????????, 15 ???. 2018 ? 2:40 , Nick Coghlan >> > ???????: >> > > https://bugs.python.org/issue8604#msg174104 is the relevant tracker >> > > discussion >> > >> > Thanks all, I agree that universal and absolutly safe solution is very >> > difficult, but for experiment I made some draft >> > https://github.com/worldmind/scripts/tree/master/filerewrite >> >> Good! >> >> > main code here >> > >> https://github.com/worldmind/scripts/blob/master/filerewrite/filerewrite.py#L46 >> >> Can I recommend to catch exceptions in `backuper.backup()`, >> cleanup backuper and unlock locker? >> >> Oleg. >> -- >> Oleg Broytman http://phdru.name/ >> phd at phdru.name >> Programmers don't die, they just GOSUB without RETURN. >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > > > > ---------- Forwarded message ---------- > From: Tim Peters > To: Raymond Hettinger > Cc: Python-Ideas > Bcc: > Date: Sun, 15 Apr 2018 11:15:18 -0500 > Subject: Re: [Python-ideas] Start argument for itertools.accumulate() > [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin] > [Raymond Hettinger ] > > Q. Do other languages do it? > > A. Numpy, no. R, no. APL, no. Mathematica, no. Haskell, yes. > > > > * > http://docs.scipy.org/doc/numpy/reference/generated/numpy.ufunc.accumulate.html > > * > https://stat.ethz.ch/R-manual/R-devel/library/base/html/cumsum.html > > * http://microapl.com/apl/apl_concepts_chapter5.html > > \+ 1 2 3 4 5 > > 1 3 6 10 15 > > * https://reference.wolfram.com/language/ref/Accumulate.html > > * https://www.haskell.org/hoogle/?hoogle=mapAccumL > > There's also C++, which is pretty much "yes" to every variation > discussed so far: > > * partial_sum() is like Python's current accumulate(), including > defaulting to doing addition. > > http://en.cppreference.com/w/cpp/algorithm/partial_sum > > * inclusive_scan() is also like accumulate(), but allows an optional > "init" argument (which is returned if specified), and there's no > guarantee of "left-to-right" evaluation (it's intended for associative > binary functions, and wants to allow parallelism in the > implementation). > > http://en.cppreference.com/w/cpp/algorithm/inclusive_scan > > * exclusive_scan() is like inclusive_scan(), but _requires_ an "init" > argument (which is not returned). 
> > http://en.cppreference.com/w/cpp/algorithm/exclusive_scan > > * accumulate() is like Python's functools.reduce(), but the operation > is optional and defaults to addition, and an "init" argument is > required. > > http://en.cppreference.com/w/cpp/algorithm/accumulate > > > > > ---------- Forwarded message ---------- > From: "Steven D'Aprano" > To: python-ideas at python.org > Cc: > Bcc: > Date: Mon, 16 Apr 2018 02:15:59 +1000 > Subject: Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 > (was post #4) > On Sun, Apr 15, 2018 at 11:11:37PM +0800, Thautwarm Zhao wrote: > > > I think maybe we can use unicode characters like ? (\triangleq) and add > the > > support of unicode completion to python repl. The unicode completion of > > editors or ides has been quite mature. > > What key combination do I need to type to get ? in the following editors > please? I tried typing \triangleq but all I got was \triangleq. > > Notepad (Windows) > Brackets (Mac) > BBEdit (Mac) > kwrite (Linux) > kate > nano > geany > gedit > > as well as IDLE, my mail client (kmail, Thunderbird or mutt), my web > browsers (Firefox, Opera and Chromium), the interactive interpreter in > various different consoles, my Usenet client (Pan and KNode) and IRC > (pidgin). > > Oh, having it work in LibreOffice and GoogleApps too would be nice, > although not essential since I don't often write code in them. > > And what decent fonts do I need to install for ? to show up as something > other than a square box ("missing glyph")? > > > -- > Steve > > > > > ---------- Forwarded message ---------- > From: Oleg Broytman > To: python-ideas at python.org > Cc: > Bcc: > Date: Sun, 15 Apr 2018 18:49:16 +0200 > Subject: Re: [Python-ideas] Rewriting file - pythonic way > On Sun, Apr 15, 2018 at 09:10:57AM -0700, Mahmoud Hashemi < > mahmoud at hatnote.com> wrote: > > Depending on how firm your requirements around locking are, you may find > > this code useful: > > > https://github.com/mahmoud/boltons/blob/6b0721b6aeda6d3ec6f5d31be7c741bc7fcc4635/boltons/fileutils.py#L303 > > > > (docs here: > > > http://boltons.readthedocs.io/en/latest/fileutils.html#atomic-file-saving > ) > > > > Basically every operating system has _some_ way of doing an atomic file > > replacement, letting us guarantee that a file at a given location is > always > > valid. atomic_save provides a unified interface to that cross-platform > > behavior. > > > > The code does not do locking, as neither I nor its other users have > wanted > > it, but I'd be happy to extend it if there's a sensible default. > > I don't like it renames the file at the end. Renaming could lead to > changed file ownership and permissions; restoring permissions is not > always possible, restoring ownership is almost never possible. Renaming > is also not always possible due to restricted directory permissions. > > > On Sun, Apr 15, 2018 at 8:19 AM, Oleg Broytman wrote: > > > > > On Sun, Apr 15, 2018 at 05:15:55PM +0300, Alexey Shrub < > ashrub at yandex.ru> > > > wrote: > > > > ? ???????????, 15 ???. 2018 ? 2:40 , Nick Coghlan < > ncoghlan at gmail.com> > > > > ???????: > > > > > https://bugs.python.org/issue8604#msg174104 is the relevant > tracker > > > > > discussion > > > > > > > > Thanks all, I agree that universal and absolutly safe solution is > very > > > > difficult, but for experiment I made some draft > > > > https://github.com/worldmind/scripts/tree/master/filerewrite > > > > > > Good! 
> > > > > > > main code here > > > > https://github.com/worldmind/scripts/blob/master/ > > > filerewrite/filerewrite.py#L46 > > > > > > Can I recommend to catch exceptions in `backuper.backup()`, > > > cleanup backuper and unlock locker? > > > Oleg. > -- > Oleg Broytman http://phdru.name/ phd at phdru.name > Programmers don't die, they just GOSUB without RETURN. > > > > > ---------- Forwarded message ---------- > From: Tim Peters > To: Antoine Pitrou > Cc: Python-Ideas > Bcc: > Date: Sun, 15 Apr 2018 11:52:49 -0500 > Subject: Re: [Python-ideas] A cute Python implementation of itertools.tee > [Antoine Pitrou ] > > This implementation doesn't work with Python 3.7 or 3.8. > > I've tried it here: > > https://gist.github.com/pitrou/b3991f638300edb6d06b5be23a4c66d6 > > > > and get: > > Traceback (most recent call last): > > File "mytee.py", line 14, in gen > > mylast = last[1] = last = [next(it), None] > > StopIteration > > > > The above exception was the direct cause of the following exception: > > > > Traceback (most recent call last): > > File "mytee.py", line 47, in > > run(mytee1) > > File "mytee.py", line 36, in run > > lists[i].append(next(iters[i])) > > RuntimeError: generator raised StopIteration > > > > (Yuck!) > > Thanks for trying! I wonder whether that will break other code. I > wrote PEP 255, and this part was intentional at the time: > > """ > If an unhandled exception-- including, but not limited to, > StopIteration --is raised by, OR PASSES THROUGH [emphasis added], a > generator function, then the exception is passed on to the caller in > the usual way, and subsequent attempts to resume the generator > function raise StopIteration. > """ > > I've exploited that a number of times. > > > > In short, you want the following instead: > > > > try: > > mylast = last[1] = last = [next(it), None] > > except StopIteration: > > return > > No, I don't ;-) If I have to catch StopIteration myself now, then I > want the entire "white True:" loop in the "try" block. Setting up > try/except machinery anew on each iteration would add significant > overhead; doing it just once per derived generator wouldn't. > > > >> def mytee(xs, n): > >> last = [None, None] > >> > >> def gen(it, mylast): > >> nonlocal last > >> while True: > >> mylast = mylast[1] > >> if not mylast: > >> mylast = last[1] = last = [next(it), None] > > > That's smart and obscure :-o > > The way it works is that the `last` assignment changes the `last` value > > seen by all derived generators, while the `last[1]` assignment updates > > the bindings made in the other generators' `mylast` lists... It's > > difficult to find the words to explain it. > > Which is why I didn't even try - I did warn people that if they > thought it "was obvious", they hadn't yet thought hard enough ;-) > Good job! > > > > The chained assignment makes it more difficult to parse as well (when I > > read this I don't know if `last[i]` or `last` gets assigned first; > > apparently the answer is `last[i]`, otherwise the recipe wouldn't work > > correctly). > > Ya, I had to look it up too :-) Although, like almost everything else > in Python, chained assignments proceed "left to right". I was just > trying to make it as short as possible, to increase the "huh - can > something that tiny really work?!" 
healthy skepticism factor :-) > > > > Perhaps like this: > > > > while True: > > mylast = mylast[1] > > if not mylast: > > try: > > # Create new list link > > mylast = [next(it), None] > > except StopIteration: > > return > > else: > > # Append to other generators `mylast` linked lists > > last[1] = mylast > > # Update shared list link > > last = last[1] > > yield mylast[0] > > I certainly agree that's easier to follow. But that wasn't really the > point ;-) > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikhailwas at gmail.com Sun Apr 15 15:42:52 2018 From: mikhailwas at gmail.com (Mikhail V) Date: Sun, 15 Apr 2018 22:42:52 +0300 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) In-Reply-To: <20180415161559.GJ11616@ando.pearwood.info> References: <20180415161559.GJ11616@ando.pearwood.info> Message-ID: On Sun, Apr 15, 2018 at 7:15 PM, Steven D'Aprano wrote: > On Sun, Apr 15, 2018 at 11:11:37PM +0800, Thautwarm Zhao wrote: > >> I think maybe we can use unicode characters like ? (\triangleq) and add the >> support of unicode completion to python repl. The unicode completion of >> editors or ides has been quite mature. > > What key combination do I need to type to get ? in the following editors > please? I tried typing \triangleq but all I got was \triangleq. > > Notepad (Windows) > Brackets (Mac) > BBEdit (Mac) > kwrite (Linux) > kate > nano > geany > gedit > > as well as IDLE, my mail client (kmail, Thunderbird or mutt), my web > browsers (Firefox, Opera and Chromium), the interactive interpreter in > various different consoles, my Usenet client (Pan and KNode) and IRC > (pidgin). > > Oh, having it work in LibreOffice and GoogleApps too would be nice, > although not essential since I don't often write code in them. Typing should not be a problem generally. There are a lot of 3d-party apps which can bind a key to specific char input, system-wide. On windows I use Autohotkey. But no 100% guarantee of course for any editor. > And what decent fonts do I need to install for ? to show up as something > other than a square box ("missing glyph")? Well, here it is way less optimistic :) The chances to see that "delta equal to" sign in some random font / random app is not so big. It's only if you have fonts fallback system setup, and by default on my windows it seems to work only in Firefox browser. Mikhail From george at fischhof.hu Sun Apr 15 15:47:23 2018 From: george at fischhof.hu (George Fischhof) Date: Sun, 15 Apr 2018 21:47:23 +0200 Subject: [Python-ideas] Rewriting file - pythonic way In-Reply-To: <1523782620.2055.0@smtp.yandex.ru> References: <1523782620.2055.0@smtp.yandex.ru> Message-ID: Hi, some similar thing already exist in standard: https://docs.python.org/3/library/fileinput.html fileinput(... inplace=True...) BR, George 2018-04-15 10:57 GMT+02:00 Alexey Shrub : > Hi all, > > I am new in python (i am moving from Perl world), but I always love Python > for hight level, beatuful and clean syntax. > Now I have question/idea about working with files. > On mine opinion it very popular use case: > 1. Open file (for read and write) > 2. Read data from file > 3. Modify data. > 4. Rewrite file by modified data. 
> > But now it is looks not so pythonic: > > with open(filename, 'r+') as file: > data = file.read() > data = data.replace('old', 'new') > file.seek(0) > file.write(data) > file.truncate() > > or something like this > > with open(filename) as file: > data = file.read() > data = data.replace('old', 'new') > with open(filename) as file: > file.write(data) > > I think best way is something like this > > with open(filename, 'r+') as file: > data = file.read() > data = data.replace('old', 'new') > file.rewrite(data) > > but for this io.BufferedIOBase must contain rewrite method > > what you think about this? > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Sun Apr 15 16:16:46 2018 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 16 Apr 2018 06:16:46 +1000 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) In-Reply-To: <20180415155805.GI11616@ando.pearwood.info> References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info> <20180415155805.GI11616@ando.pearwood.info> Message-ID: On Mon, Apr 16, 2018 at 1:58 AM, Steven D'Aprano wrote: > On Sun, Apr 15, 2018 at 10:21:02PM +1000, Chris Angelico wrote: > >> I don't think we're ever going to unify everyone on an arbitrary >> question of "expression first" or "name first". But to all the >> "expression first" people, a question: what if the target is not just >> a simple name? >> >> while (read_next_item() -> items[i + 1 -> i]) is not None: >> print("%d/%d..." % (i, len(items)), end="\r") > > I don't see why it would make a difference. It doesn't to me. Okay, that's good. I just hear people saying "name" a lot, but that would imply restricting the grammar to just a name, and I don't know how comfortable people are with more complex targets. >> Does this make sense? With the target coming first, it perfectly >> parallels the existing form of assignment: > > Yes, except this isn't ordinary assignment-as-a-statement. > > I've been mulling over the question why I think the expression needs to > come first here, whereas I'm satisfied with the target coming first for > assignment statements, and I think I've finally got the words to explain > it. It is not just long familiarity with maths and languages that put > the variable first (although that's also part of it). It has to do with > what we're looking for when we read code, specifically what is the > primary piece of information we're initially looking for. > > In assignment STATEMENTS the primary piece of information is the target. > Yes, of course the value assigned to the target is important, but often > we don't care what the value is, at least not at first. We're hunting > for a known target, and only when we find it do we care about the value > it gets. > [chomp details] > It is appropriate for assignment statements and expressions to be > written differently because they are used differently. I don't know that assignment expressions are inherently going to be used in ways where you ignore the assignment part and care only about the expression part. And I disagree that assignment statements are used primarily the way you say. 
Frequently I skim down a column of assignments, caring primarily about the functions being called, and looking at the part before the equals sign only when I come across a parameter in another call; the important part of the line is what it's doing, not where it's stashing its result. > [...] >> >>> items = [None] * 10 >> >>> i = -1 >> >>> items[i := i + 1] = input("> ") >> > asdf >> >>> items[i := i + 1] = input("> ") >> > qwer >> >>> items[i := i + 1] = input("> ") >> > zxcv >> >>> >> >>> items >> ['asdf', 'qwer', 'zxcv', None, None, None, None, None, None, None] > > > I don't know why you would write that instead of: > > items = [None]*10 > for i in range(3): > items[i] = input("> ") > > > or even for that matter: > > items = [input("> ") for i in range(3)] + [None]*7 > > > but whatever floats your boat. (Python isn't just not Java. It's also > not C *wink*) You and Kirill have both fallen into the trap of taking the example too far. By completely rewriting it, you destroy its value as an example. Write me a better example of a complex target if you like, but the question is about how you feel about complex assignment targets, NOT how you go about creating a particular list in memory. That part is utterly irrelevant. >> Are you as happy with that sort of complex >> expression coming after 'as' or '->'? > > Sure. Ignoring the output of the calls to input(): The calls to input were in a while loop's header for a reason. Ignoring them is ignoring the point of assignment expressions. ChrisA From rosuav at gmail.com Sun Apr 15 16:22:54 2018 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 16 Apr 2018 06:22:54 +1000 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) In-Reply-To: References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info> Message-ID: On Mon, Apr 16, 2018 at 12:17 AM, Kirill Balunov wrote: > > > 2018-04-15 15:21 GMT+03:00 Chris Angelico : >> I don't think we're ever going to unify everyone on an arbitrary >> question of "expression first" or "name first". But to all the >> "expression first" people, a question: what if the target is not just >> a simple name? >> >> while (read_next_item() -> items[i + 1 -> i]) is not None: >> print("%d/%d..." % (i, len(items)), end="\r") >> > > I completely agree with you that it is impossible to unify everyone opinion > - we all have different background. But this example is more likely to play > against this PEP. This is an extra complexity within one line and it can > fail hard in at least three obvious places :) And I am against this usage no > matter `name first` or `expression first`. But i will reask this with > following snippets. What do you choose from this examples: > > 0. > > while (items[i := i+1] := read_next_item()) is not None: > print(r'%d/%d' % (i, len(items)), end='\r') > > 1. > > while (read_next_item() -> items[(i+1) -> i]) is not None: > print(r'%d/%d' % (i, len(items)), end='\r') These two are matching what I wrote, and are thus the two forms under consideration. I notice that you added parentheses to the second one; is there a clarity problem here and you're unsure whether "i + 1 -> i" would capture "i + 1" or "1"? If so, that's a downside to the proposal. > 2. > > while (item := read_next_item()) is not None: > items[i := (i+1)] = item > print(r'%d/%d' % (i, len(items)), end='\r') > > 3. > > while (read_next_item() -> item) is not None: > items[(i+1) -> i] = item > print(r'%d/%d' % (i, len(items)), end='\r') > > 4. 
> > while (item := read_next_item()) is not None: > i = i+1 > items[i] = item > print(r'%d/%d' % (i, len(items)), end='\r') > > 5. > > while (read_next_item() -> item) is not None: > i = i+1 > items[i] = item > print(r'%d/%d' % (i, len(items)), end='\r') All of these are fundamentally different from what I'm asking: they are NOT all expressions that can be used in the while header. So it doesn't answer the question of "expression first" or "target first". Once the expression gets broken out like this, you're right back to using "expression -> NAME" or "NAME := expression", and it's the same sort of simple example that people have been discussing all along. > I am definitely Ok with both 2 and 3 here. But as it was noted `:=` produces > additional noise in other places and I am also an `expression first` guy :) > So I still prefer variant 3 to 2. But to be completely honest, I would write > it in the following way: > > for item in iter(read_next_item, None): > items.append(item) > print(r'%d/%d' % (i, len(items)), end='\r') And that's semantically different in several ways. Not exactly a fair comparison. I invite you to write up a better example with a complex target. ChrisA From rosuav at gmail.com Sun Apr 15 16:28:15 2018 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 16 Apr 2018 06:28:15 +1000 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) In-Reply-To: References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info> Message-ID: On Mon, Apr 16, 2018 at 3:19 AM, Guido van Rossum wrote: > On Sun, Apr 15, 2018 at 4:05 AM, Kirill Balunov > wrote: >> But somehow this PEP does not mean it! And with the current rationale of >> this PEP it's a huge CON for me that `=` and `:=` feel and look the same. > > Then maybe the PEP needs to be updated. I can never be sure what people are reading when they say "current" with PEPs like this. The text gets updated fairly frequently. As of time of posting, here's the rationale: ----- Naming the result of an expression is an important part of programming, allowing a descriptive name to be used in place of a longer expression, and permitting reuse. Currently, this feature is available only in statement form, making it unavailable in list comprehensions and other expression contexts. Merely introducing a way to assign as an expression would create bizarre edge cases around comprehensions, though, and to avoid the worst of the confusions, we change the definition of comprehensions, causing some edge cases to be interpreted differently, but maintaining the existing behaviour in the majority of situations. ----- Kirill, is this what you read, and if so, how does that make ':=' a negative? The rationale says "hey, see this really good thing you can do as a statement? Let's do it as an expression too", so the parallel should be a good thing. ChrisA From mikhailwas at gmail.com Sun Apr 15 16:28:28 2018 From: mikhailwas at gmail.com (Mikhail V) Date: Sun, 15 Apr 2018 23:28:28 +0300 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) In-Reply-To: References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info> Message-ID: On Sun, Apr 15, 2018 at 2:01 PM, Nick Coghlan wrote: > On 15 April 2018 at 19:41, Mikhail V wrote: >> So IIUC, the *only* reason is to avoid '==' ad '=' similarity? >> If so, then it does not sound convincing at all. 
>> Of course Python does me a favor showing an error, >> when I make a typo like this: >> if (x = y) >> >> But still, if this is the only real reason, it is not convincing. > > It's thoroughly convincing, because we're already familiar with the > consequences of folks confusing "=" and "==" when writing C & C++ > code. It's an eternal bug magnet, so it's not a design we're ever > going to port over to Python. [...] > The examples in the PEP have been updated to better reflect some of > the key motivating use cases (embedded assignments in if and while > statement conditions, generator expressions, and container > comprehensions) Im personally "0" on the whole proposal. Just was curious about that "demonisation" of "=" and "==" visual similarity. Granted, writing ":=" instead of "=" helps a little bit. But if the ":=" will be accepted, then we end up with two spellings :-) > >> And as a side note: I personally find the look of ":=" a bit 'noisy'. > > You're not alone in that, which is one of the reasons finding a > keyword based option that's less syntactically ambiguous than "as" > could be an attractive alternative. > Keyword variants look less appealing than ":=". but if it had to be a keyword, then I'd definitely stay by "TARGET keyword EXPR" just not to swap the traditional order. Mikhail From rosuav at gmail.com Sun Apr 15 16:36:32 2018 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 16 Apr 2018 06:36:32 +1000 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) In-Reply-To: References: Message-ID: On Mon, Apr 16, 2018 at 4:58 AM, Thautwarm Zhao wrote: > Dear Steve, I'm sorry to annoy you by my proposal, but I do think using > unicode might be wise in current stage. > > \triangleq could be print with unicode number \u225c, and adding plugins to > support typing this in editors could be easy, just simply map \xxx to the > specific unicode char when we press the tab after typing it. > > People using Julia language are proud of it but I think it's just something > convenient could be used in any other language. > > There are other reasons to support unicode but it's out of this topic. > > Although ':=' and '->' are not perfect, in the range of ASCII it seems to be > impossible to find a better one. > If you want to introduce non-ASCII tokens to Python, start by adding them as _alternatives_ to the current syntax. See whether people adopt them. I've seen one or two people using editors that redisplay ASCII-only source code using other symbols (eg ? for JavaScript's ===), and you could make it so the source code can actually be saved in that form. But making it so that the ONLY way to use a feature is to use a non-ASCII character? That's going to break a lot of people's workflows. ChrisA From k7hoven at gmail.com Sun Apr 15 16:46:06 2018 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Sun, 15 Apr 2018 23:46:06 +0300 Subject: [Python-ideas] A cute Python implementation of itertools.tee In-Reply-To: References: Message-ID: On Sun, Apr 15, 2018 at 8:05 AM, Tim Peters wrote: ?[...]? > Then I thought "this is stupid! Python already does reference > counting." Voila! 
Vast swaths of tedious code vanished, giving this > remarkably simple implementation: > > def mytee(xs, n): > last = [None, None] > > def gen(it, mylast): > nonlocal last > while True: > mylast = mylast[1] > if not mylast: > mylast = last[1] = last = [next(it), None] > yield mylast[0] > > it = iter(xs) > return tuple(gen(it, last) for _ in range(n)) > > There's no need to keep a pointer to the start of the shared list at > all - we only need a pointer to the end of the list ("last"), and each > derived generator only needs a pointer to its own current position in > the list ("mylast"). > > Things here remind me of my implementation design for PEP 555: the "contexts" present in the process are represented by a singly-linked tree of assignment objects. It's definitely possible to write the above in a more readable way, and FWIW I don't think it involves "assignments as expressions". > What I find kind of hilarious is that it's no help at all as a > prototype for a C implementation: Python recycles stale `[next(it), > None]` pairs all by itself, when their internal refcounts fall to 0. > That's the hardest part. > > ?Why can't the C implementation use Python refcounts? Are you talking about standalone C code? Or perhaps you are thinking about overhead? (In PEP 555 that was not a concern, though). Surely it would make sense to reuse the refcounting code that's already there. There are no cycles here, so it's not particulaly complicated -- just duplication. Anyway, the whole linked list is unnecessary if the iterable can be iterated over multiple times. But "tee" won't know when to do that. *That* is what I call overhead (unless of course all the tee branches are consumed in an interleaved manner). > BTW, I certainly don't suggest adding this to the itertools docs > either. While it's short and elegant, it's too subtle to grasp easily > - if you think "it's obvious", you haven't yet thought hard enough > about the problem ;-) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Sun Apr 15 16:55:48 2018 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 16 Apr 2018 06:55:48 +1000 Subject: [Python-ideas] A cute Python implementation of itertools.tee In-Reply-To: References: Message-ID: On Mon, Apr 16, 2018 at 6:46 AM, Koos Zevenhoven wrote: > Anyway, the whole linked list is unnecessary if the iterable can be iterated > over multiple times. But "tee" won't know when to do that. *That* is what I > call overhead (unless of course all the tee branches are consumed in an > interleaved manner). But if you have something you can iterate over multiple times, why bother with tee at all? Just take N iterators from the underlying iterable. The overhead is intrinsic to the value of the function. ChrisA From tim.peters at gmail.com Sun Apr 15 17:06:49 2018 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 15 Apr 2018 16:06:49 -0500 Subject: [Python-ideas] A cute Python implementation of itertools.tee In-Reply-To: References: Message-ID: [Koos Zevenhoven ] >.... It's definitely possible to write the above in a more > readable way, and FWIW I don't think it involves "assignments as > expressions". Of course it is. The point was brevity and speed, not readability. 
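For anyone who wants to actually run the recipe on Python 3.7 or later, where PEP 479 turns a leaked StopIteration into RuntimeError, here is the same code with the fix folded in the way Tim says he would prefer (one try block around the whole loop, set up once per derived generator). A sketch only, not itertools.tee's actual C implementation:

    def mytee(xs, n):
        last = [None, None]

        def gen(it, mylast):
            nonlocal last
            try:
                while True:
                    mylast = mylast[1]
                    if not mylast:
                        # Chained assignment binds left to right:
                        # mylast, then last[1], then last.
                        mylast = last[1] = last = [next(it), None]
                    yield mylast[0]
            except StopIteration:
                # PEP 479: return instead of letting it escape.
                return

        it = iter(xs)
        return tuple(gen(it, last) for _ in range(n))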
It was presented partly as a puzzle :-) >> What I find kind of hilarious is that it's no help at all as a >> prototype for a C implementation: Python recycles stale `[next(it), >> None]` pairs all by itself, when their internal refcounts fall to 0. >> That's the hardest part. > Why can't the C implementation use Python refcounts? Are you talking about > standalone C code? Yes, expressing the algorithm in plain old C, not building on top of (say) the Python C API. > Or perhaps you are thinking about overhead? Nope. > (In PEP 555 that was not a concern, though). Surely it would make sense > to reuse the refcounting code that's already there. There are no cycles > here, so it's not particulaly complicated -- just duplication. > > Anyway, the whole linked list is unnecessary if the iterable can be iterated > over multiple times. If the latter were how iterables always worked, there would be no need for tee() at all. It's tee's _purpose_ to make it possible for multiple consumers to traverse an iterable's can't-restart-or-even -go-back result sequence each at their own pace. From python-ideas at mgmiller.net Sun Apr 15 17:13:59 2018 From: python-ideas at mgmiller.net (Mike Miller) Date: Sun, 15 Apr 2018 14:13:59 -0700 Subject: [Python-ideas] Accepting multiple mappings as positional arguments to create dicts In-Reply-To: References: Message-ID: <6a8e3c7f-d551-83cd-46cd-75006468b6a5@mgmiller.net> On 2018-04-12 18:03, Guido van Rossum wrote: > It's a slippery slope indeed. While having to change update() alone wouldn't > worry me, the subclass constructors do seem like they are going to want changing > too, and that's indeed a bit much. So let's back off a bit. Not every three > lines of code need a built-in shorthand. This is disappointing since the dictionary is one of the most used but simultaneously limited of the builtin types. It doesn't support a lot of operations that strings, lists, tuples, sets, etc do. These are the little niceties that make Python fun to program in. But, for some reason we're stingy when it comes to dictionaries, the foundation of the language. Has anyone disagreed the dict constructor shouldn't take multiple arguments? Also, it isn't always three lines of code, but expands with the number that need to be merged. My guess is that the dict is used an order of magnitude more than specialized subclasses, even more so now that the Ordered variant is unnecessary in newer versions. It wouldn't bother me at all if it took a few years for the improvement to get rolled out to subclasses or never, it's quite a minor disappointment compared to getting the functionality ~90% of the time. Also wouldn't mind helping out with the subclasses if there is some lifting that needed to be done. -Mike From brenbarn at brenbarn.net Sun Apr 15 17:18:33 2018 From: brenbarn at brenbarn.net (Brendan Barnwell) Date: Sun, 15 Apr 2018 14:18:33 -0700 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) In-Reply-To: <20180415155805.GI11616@ando.pearwood.info> References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info> <20180415155805.GI11616@ando.pearwood.info> Message-ID: <5AD3C1A9.8080908@brenbarn.net> On 2018-04-15 08:58, Steven D'Aprano wrote: > I've been mulling over the question why I think the expression needs to > come first here, whereas I'm satisfied with the target coming first for > assignment statements, and I think I've finally got the words to explain > it. 
It is not just long familiarity with maths and languages that put > the variable first (although that's also part of it). It has to do with > what we're looking for when we read code, specifically what is the > primary piece of information we're initially looking for. Interesting. I think your arguments are pretty reasonable overall. But, for me, they just don't outweigh the fact that "->" is an ugly assignment operator that looks nothing like the existing one, whereas ":=" is a less-ugly one that has the additional benefit of looking like the existing one. From your arguments I am convinced that putting the expression first has some advantages, but they just don't seem as important to me as they apparently do to you. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown From peter at norvig.com Sun Apr 15 17:05:04 2018 From: peter at norvig.com (Peter Norvig) Date: Sun, 15 Apr 2018 21:05:04 +0000 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ Message-ID: For most types that implement __add__, `x + x` is equal to `2 * x`. That is true for all numbers, list, tuple, str, timedelta, etc. -- but not for collections.Counter. I can add two Counters, but I can't multiply one by a scalar. That seems like an oversight. It would be worthwhile to implement multiplication because, among other reasons, Counters are a nice representation for discrete probability distributions, for which multiplication is an even more fundamental operation than addition. Here's an implementation: def __mul__(self, scalar): "Multiply each entry by a scalar." result = Counter() for key in self: result[key] = self[key] * scalar return result def __rmul__(self, scalar): "Multiply each entry by a scalar." result = Counter() for key in self: result[key] = scalar * self[key] return result -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Sun Apr 15 18:45:56 2018 From: wes.turner at gmail.com (Wes Turner) Date: Sun, 15 Apr 2018 18:45:56 -0400 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ In-Reply-To: References: Message-ID: Good call. Is it any faster to initialize Counter with a dict comprehension? return Counter({k: v*scalar for (k, v) in self.items()) On Sun, Apr 15, 2018 at 5:05 PM, Peter Norvig wrote: > For most types that implement __add__, `x + x` is equal to `2 * x`. > > That is true for all numbers, list, tuple, str, timedelta, etc. -- but not > for collections.Counter. I can add two Counters, but I can't multiply one > by a scalar. That seems like an oversight. > > It would be worthwhile to implement multiplication because, among other > reasons, Counters are a nice representation for discrete probability > distributions, for which multiplication is an even more fundamental > operation than addition. > > Here's an implementation: > > def __mul__(self, scalar): > "Multiply each entry by a scalar." > result = Counter() > for key in self: > result[key] = self[key] * scalar > return result > > def __rmul__(self, scalar): > "Multiply each entry by a scalar." 
> result = Counter() > for key in self: > result[key] = scalar * self[key] > return result > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter at norvig.com Sun Apr 15 19:02:50 2018 From: peter at norvig.com (Peter Norvig) Date: Sun, 15 Apr 2018 23:02:50 +0000 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ In-Reply-To: References: Message-ID: That's actually how I coded it myself the first time. But I worried it would be wasteful to create an intermediate dict and discard it. `timeit` results: 3.79 ?s for the for-loop, 5.08 ?s for the dict-comprehension with a 10-key Counter 257 ?s for the for-loop, 169 ?s for the dict-comprehension with a 1000-key Counter So results are mixed, but you are probably right. On Sun, Apr 15, 2018 at 3:46 PM Wes Turner wrote: > Good call. Is it any faster to initialize Counter with a dict > comprehension? > > return Counter({k: v*scalar for (k, v) in self.items()) > > On Sun, Apr 15, 2018 at 5:05 PM, Peter Norvig wrote: > >> For most types that implement __add__, `x + x` is equal to `2 * x`. >> >> That is true for all numbers, list, tuple, str, timedelta, etc. -- but >> not for collections.Counter. I can add two Counters, but I can't multiply >> one by a scalar. That seems like an oversight. >> >> It would be worthwhile to implement multiplication because, among other >> reasons, Counters are a nice representation for discrete probability >> distributions, for which multiplication is an even more fundamental >> operation than addition. >> >> Here's an implementation: >> >> def __mul__(self, scalar): >> "Multiply each entry by a scalar." >> result = Counter() >> for key in self: >> result[key] = self[key] * scalar >> return result >> >> def __rmul__(self, scalar): >> "Multiply each entry by a scalar." >> result = Counter() >> for key in self: >> result[key] = scalar * self[key] >> return result >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From raymond.hettinger at gmail.com Sun Apr 15 20:05:55 2018 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 15 Apr 2018 17:05:55 -0700 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ In-Reply-To: References: Message-ID: > On Apr 15, 2018, at 2:05 PM, Peter Norvig wrote: > > For most types that implement __add__, `x + x` is equal to `2 * x`. > > ... > > > That is true for all numbers, list, tuple, str, timedelta, etc. -- but not for collections.Counter. I can add two Counters, but I can't multiply one by a scalar. That seems like an oversight. If you view the Counter as a sparse associative array of numeric values, it does seem like an oversight. If you view the Counter as a Multiset or Bag, it doesn't make sense at all ;-) From an implementation point of view, Counter is just a kind of dict that has a __missing__() method that returns zero. That makes it trivially easy to subclass Counter to add new functionality or just use dictionary comprehensions for bulk updates. 
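As a concrete illustration of that point, here is a minimal sketch of both spellings; the subclass name is invented for the example, not anything from the tracker:

    from collections import Counter

    class MulCounter(Counter):
        # Illustrative subclass only.
        def __mul__(self, scalar):
            return MulCounter({k: v * scalar for k, v in self.items()})
        __rmul__ = __mul__

    # Or, with no subclass at all, a dict comprehension does the
    # bulk update directly:
    c = Counter(a=3, b=1)
    doubled = Counter({k: 2 * v for k, v in c.items()})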
> > > It would be worthwhile to implement multiplication because, among other reasons, Counters are a nice representation for discrete probability distributions, for which multiplication is an even more fundamental operation than addition. There is an open issue on this topic. See: https://bugs.python.org/issue25478 One stumbling point is that a number of commenters are fiercely opposed to non-integer uses of Counter. Also, some of the use cases (such as those found in Allen Downey's "Think Stats" and "Think Bayes" books) also need division and rescaling to a total (i.e. normalizing the total to 1.0) for a probability mass function. If the idea were to go forward, it still isn't clear whether the correct API should be low level (__mul__ and __div__ and a "total" property) or higher level (such as a normalize() or rescale() method that produces a new Counter instance). The low level approach has the advantage that it is simple to understand and that it feels like a logical extension of the __add__ and __sub__ methods. The downside is that doesn't really add any new capabilities (being just short-cuts for a simple dict comprehension or call to c.values()). And, it starts to feature creep the Counter class further away from its core mission of counting and ventures into the realm of generic sparse arrays with numeric values. There is also a learnability/intelligibility issue in __add__ and __sub__ correspond to "elementwise" operations while __mul__ and __div__ would be "scalar broadcast" operations. Peter, I'm really glad you chimed in. My advocacy lacked sufficient weight to move this idea forward. Raymond From peter at norvig.com Sun Apr 15 20:44:19 2018 From: peter at norvig.com (Peter Norvig) Date: Mon, 16 Apr 2018 00:44:19 +0000 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ In-Reply-To: References: Message-ID: If you think of a Counter as a multiset, then it should support __or__, not __add__, right? I do think it would have been fine if Counter did not support "+" at all (and/or if Counter was limited to integer values). But given where we are now, it feels like we should preserve `c + c == 2 * c`. As to the "doesn't really add any new capabilities" argument, that's true, but it is also true for Counter as a whole: it doesn't add much over defaultdict(int), but it is certainly convenient to have a standard way to do what it does. I agree with your intuition that low level is better. `total` would be useful. If you have total and mul, then as you and others have pointed out, normalize is just c *= 1/c.total. I can also see the argument for a new FrequencyTable class in the statistics module. (By the way, I refactored my https://github.com/norvig/pytudes/blob/master/ipynb/Probability.ipynb a bit, and now I no longer need a `normalize` function.) On Sun, Apr 15, 2018 at 5:06 PM Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > > > > On Apr 15, 2018, at 2:05 PM, Peter Norvig wrote: > > > > For most types that implement __add__, `x + x` is equal to `2 * x`. > > > > ... > > > > > > That is true for all numbers, list, tuple, str, timedelta, etc. -- but > not for collections.Counter. I can add two Counters, but I can't multiply > one by a scalar. That seems like an oversight. > > If you view the Counter as a sparse associative array of numeric values, > it does seem like an oversight. 
If you view the Counter as a Multiset or > Bag, it doesn't make sense at all ;-) > > From an implementation point of view, Counter is just a kind of dict that > has a __missing__() method that returns zero. That makes it trivially easy > to subclass Counter to add new functionality or just use dictionary > comprehensions for bulk updates. > > > > > > > It would be worthwhile to implement multiplication because, among other > reasons, Counters are a nice representation for discrete probability > distributions, for which multiplication is an even more fundamental > operation than addition. > > There is an open issue on this topic. See: > https://bugs.python.org/issue25478 > > One stumbling point is that a number of commenters are fiercely opposed to > non-integer uses of Counter. Also, some of the use cases (such as those > found in Allen Downey's "Think Stats" and "Think Bayes" books) also need > division and rescaling to a total (i.e. normalizing the total to 1.0) for a > probability mass function. > > If the idea were to go forward, it still isn't clear whether the correct > API should be low level (__mul__ and __div__ and a "total" property) or > higher level (such as a normalize() or rescale() method that produces a new > Counter instance). The low level approach has the advantage that it is > simple to understand and that it feels like a logical extension of the > __add__ and __sub__ methods. The downside is that doesn't really add any > new capabilities (being just short-cuts for a simple dict comprehension or > call to c.values()). And, it starts to feature creep the Counter class > further away from its core mission of counting and ventures into the realm > of generic sparse arrays with numeric values. There is also a > learnability/intelligibility issue in __add__ and __sub__ correspond to > "elementwise" operations while __mul__ and __div__ would be "scalar > broadcast" operations. > > Peter, I'm really glad you chimed in. My advocacy lacked sufficient > weight to move this idea forward. > > > Raymond > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Sun Apr 15 21:34:55 2018 From: wes.turner at gmail.com (Wes Turner) Date: Sun, 15 Apr 2018 21:34:55 -0400 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ In-Reply-To: References: Message-ID: On Sunday, April 15, 2018, Peter Norvig wrote: > If you think of a Counter as a multiset, then it should support __or__, > not __add__, right? > > I do think it would have been fine if Counter did not support "+" at all > (and/or if Counter was limited to integer values). But given where we are > now, it feels like we should preserve `c + c == 2 * c`. > > As to the "doesn't really add any new capabilities" argument, that's > true, but it is also true for Counter as a whole: it doesn't add much over > defaultdict(int), but it is certainly convenient to have a standard way to > do what it does. > > I agree with your intuition that low level is better. `total` would be > useful. If you have total and mul, then as you and others have pointed out, > normalize is just c *= 1/c.total. > > I can also see the argument for a new FrequencyTable class in the > statistics module. (By the way, I refactored my https://github.com/norvig/ > pytudes/blob/master/ipynb/Probability.ipynb a bit, and now I no longer > need a `normalize` function.) 
> nltk.probability.FreqDist(collections.Counter) doesn't have a __mul__ either http://www.nltk.org/api/nltk.html#nltk.probability.FreqDist numpy.unique(, return_counts=True).unique_counts returns an array sorted by value with a __mul__. https://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html scipy.stats.itemfreq returns an array sorted by value with a __mul__ and the items in the first column. https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.itemfreq.html pandas.Series.value_counts(, normalize=False) returns a Series sorted by descending frequency. https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.value_counts.html > On Sun, Apr 15, 2018 at 5:06 PM Raymond Hettinger < > raymond.hettinger at gmail.com> wrote: > >> >> >> > On Apr 15, 2018, at 2:05 PM, Peter Norvig wrote: >> > >> > For most types that implement __add__, `x + x` is equal to `2 * x`. >> > >> > ... >> > >> > >> > That is true for all numbers, list, tuple, str, timedelta, etc. -- but >> not for collections.Counter. I can add two Counters, but I can't multiply >> one by a scalar. That seems like an oversight. >> >> If you view the Counter as a sparse associative array of numeric values, >> it does seem like an oversight. If you view the Counter as a Multiset or >> Bag, it doesn't make sense at all ;-) >> >> From an implementation point of view, Counter is just a kind of dict that >> has a __missing__() method that returns zero. That makes it trivially easy >> to subclass Counter to add new functionality or just use dictionary >> comprehensions for bulk updates. >> >> > >> > >> > It would be worthwhile to implement multiplication because, among other >> reasons, Counters are a nice representation for discrete probability >> distributions, for which multiplication is an even more fundamental >> operation than addition. >> >> There is an open issue on this topic. See: https://bugs.python.org/ >> issue25478 >> >> One stumbling point is that a number of commenters are fiercely opposed >> to non-integer uses of Counter. Also, some of the use cases (such as those >> found in Allen Downey's "Think Stats" and "Think Bayes" books) also need >> division and rescaling to a total (i.e. normalizing the total to 1.0) for a >> probability mass function. >> >> If the idea were to go forward, it still isn't clear whether the correct >> API should be low level (__mul__ and __div__ and a "total" property) or >> higher level (such as a normalize() or rescale() method that produces a new >> Counter instance). The low level approach has the advantage that it is >> simple to understand and that it feels like a logical extension of the >> __add__ and __sub__ methods. The downside is that doesn't really add any >> new capabilities (being just short-cuts for a simple dict comprehension or >> call to c.values()). And, it starts to feature creep the Counter class >> further away from its core mission of counting and ventures into the realm >> of generic sparse arrays with numeric values. There is also a >> learnability/intelligibility issue in __add__ and __sub__ correspond to >> "elementwise" operations while __mul__ and __div__ would be "scalar >> broadcast" operations. >> >> Peter, I'm really glad you chimed in. My advocacy lacked sufficient >> weight to move this idea forward. >> >> >> Raymond >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From wes.turner at gmail.com Sun Apr 15 22:18:36 2018 From: wes.turner at gmail.com (Wes Turner) Date: Sun, 15 Apr 2018 22:18:36 -0400 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ In-Reply-To: References: Message-ID: tf.bincount() returns a vector with integer counts. https://www.tensorflow.org/api_docs/python/tf/bincount Keras calls np.bincount in an mnist example. np.bincount returns an array with a __mul__ https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.bincount.html - sklearn.preprocessing.normalize http://scikit-learn.org/stable/modules/preprocessing.html#preprocessing-normalization http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.normalize.html featuretools.primitives.NUnique has a normalize method. https://docs.featuretools.com/generated/featuretools.primitives.NUnique.html#featuretools.primitives.NUnique And I'm done sharing non-pure-python solutions for this problem, I promise On Sunday, April 15, 2018, Wes Turner wrote: > > > On Sunday, April 15, 2018, Peter Norvig wrote: > >> If you think of a Counter as a multiset, then it should support __or__, >> not __add__, right? >> >> I do think it would have been fine if Counter did not support "+" at all >> (and/or if Counter was limited to integer values). But given where we are >> now, it feels like we should preserve `c + c == 2 * c`. >> >> As to the "doesn't really add any new capabilities" argument, that's >> true, but it is also true for Counter as a whole: it doesn't add much over >> defaultdict(int), but it is certainly convenient to have a standard way to >> do what it does. >> >> I agree with your intuition that low level is better. `total` would be >> useful. If you have total and mul, then as you and others have pointed out, >> normalize is just c *= 1/c.total. >> >> I can also see the argument for a new FrequencyTable class in the >> statistics module. (By the way, I refactored my >> https://github.com/norvig/pytudes/blob/master/ipynb/Probability.ipynb a >> bit, and now I no longer need a `normalize` function.) >> > > nltk.probability.FreqDist(collections.Counter) doesn't have a __mul__ > either > http://www.nltk.org/api/nltk.html#nltk.probability.FreqDist > > numpy.unique(, return_counts=True).unique_counts returns an array sorted > by value with a __mul__. > https://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html > > scipy.stats.itemfreq returns an array sorted by value with a __mul__ and > the items in the first column. > https://docs.scipy.org/doc/scipy/reference/generated/ > scipy.stats.itemfreq.html > > pandas.Series.value_counts(, normalize=False) returns a Series sorted by > descending frequency. > https://pandas.pydata.org/pandas-docs/stable/generated/ > pandas.Series.value_counts.html > > >> On Sun, Apr 15, 2018 at 5:06 PM Raymond Hettinger < >> raymond.hettinger at gmail.com> wrote: >> >>> >>> >>> > On Apr 15, 2018, at 2:05 PM, Peter Norvig wrote: >>> > >>> > For most types that implement __add__, `x + x` is equal to `2 * x`. >>> > >>> > ... >>> > >>> > >>> > That is true for all numbers, list, tuple, str, timedelta, etc. -- but >>> not for collections.Counter. I can add two Counters, but I can't multiply >>> one by a scalar. That seems like an oversight. >>> >>> If you view the Counter as a sparse associative array of numeric values, >>> it does seem like an oversight. 
If you view the Counter as a Multiset or >>> Bag, it doesn't make sense at all ;-) >>> >>> From an implementation point of view, Counter is just a kind of dict >>> that has a __missing__() method that returns zero. That makes it trivially >>> easy to subclass Counter to add new functionality or just use dictionary >>> comprehensions for bulk updates. >>> >>> > >>> > >>> > It would be worthwhile to implement multiplication because, among >>> other reasons, Counters are a nice representation for discrete probability >>> distributions, for which multiplication is an even more fundamental >>> operation than addition. >>> >>> There is an open issue on this topic. See: >>> https://bugs.python.org/issue25478 >>> >>> One stumbling point is that a number of commenters are fiercely opposed >>> to non-integer uses of Counter. Also, some of the use cases (such as those >>> found in Allen Downey's "Think Stats" and "Think Bayes" books) also need >>> division and rescaling to a total (i.e. normalizing the total to 1.0) for a >>> probability mass function. >>> >>> If the idea were to go forward, it still isn't clear whether the correct >>> API should be low level (__mul__ and __div__ and a "total" property) or >>> higher level (such as a normalize() or rescale() method that produces a new >>> Counter instance). The low level approach has the advantage that it is >>> simple to understand and that it feels like a logical extension of the >>> __add__ and __sub__ methods. The downside is that doesn't really add any >>> new capabilities (being just short-cuts for a simple dict comprehension or >>> call to c.values()). And, it starts to feature creep the Counter class >>> further away from its core mission of counting and ventures into the realm >>> of generic sparse arrays with numeric values. There is also a >>> learnability/intelligibility issue in __add__ and __sub__ correspond to >>> "elementwise" operations while __mul__ and __div__ would be "scalar >>> broadcast" operations. >>> >>> Peter, I'm really glad you chimed in. My advocacy lacked sufficient >>> weight to move this idea forward. >>> >>> >>> Raymond >>> >>> >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From raymond.hettinger at gmail.com Sun Apr 15 23:39:45 2018 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 15 Apr 2018 20:39:45 -0700 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ In-Reply-To: References: Message-ID: > On Apr 15, 2018, at 5:44 PM, Peter Norvig wrote: > > If you think of a Counter as a multiset, then it should support __or__, not __add__, right? FWIW, Counter is explicitly documented to support the four multiset-style mathematical operations discussed in Knuth TAOCP Volume II section 4.6.3 exercise 19: >>> c = Counter(a=3, b=1) >>> d = Counter(a=1, b=2) >>> c + d # add two counters together: c[x] + d[x] Counter({'a': 4, 'b': 3}) >>> c - d # saturating subtraction (keeping only positive counts) Counter({'a': 2}) >>> c & d # intersection: min(c[x], d[x]) Counter({'a': 1, 'b': 1}) >>> c | d # union: max(c[x], d[x]) Counter({'a': 3, 'b': 2}) The wikipedia article on Multisets lists a further operation, inclusion, that is not currently supported: https://en.wikipedia.org/wiki/Multiset#Basic_properties_and_operations > I do think it would have been fine if Counter did not support "+" at all (and/or if Counter was limited to integer values). But given where we are now, it feels like we should preserve `c + c == 2 * c`. 
The + operation has legitimate use cases (it is perfectly reasonable to want to combine the results two separate counts). And, as you pointed out, it is what we already have and cannot change :-) So, the API design issue that confronts us is that it would be a bit weird and disorienting for the arithmetic operators to have two different signatures: += -= *= /= Also, we should respect the comments given by others on the tracker issue. In particular, there is a preference to not have an in-place operation and only allow a new counter instance to be created. That will help people avoid data structure modality problems: . c[category] += 1 # Makes sense during the frequency counting or accumulation phase c /= c.total # Covert to a probability mass function c[category] += 1 # This code looks correct but no longer makes any sense > As to the "doesn't really add any new capabilities" argument, that's true, but it is also true for Counter as a whole: it doesn't add much over defaultdict(int), but it is certainly convenient to have a standard way to do what it does. IIRC, the defaultdict(int) in your first version triggered a bug because the model inadvertently changed during the analysis phase rather than being frozen after the training phase. The Counter doesn't suffer from the same issue (modifying the dict on a failed lookup). Also, the Counter class does have a few value added features: Counter(iterable), c.most_common(), c.elements(), etc. But yes, at its heart the counter is mostly just a specialized dictionary. The thought I was trying to express is that suggestions to build out Counter API are a little less compelling when we already have a way to do it that is flexible, fast, clear, and standard (i.e. dict comprehensions). > I agree with your intuition that low level is better. `total` would be useful. If you have total and mul, then as you and others have pointed out, normalize is just c *= 1/c.total. I fully support adding some functionality for scaling to support probability distributions, bayesian update steps, chi-square tests, and whatnot. The people who need convincing are the other respondents on the tracker. They had a strong mental model for the Counter class that is somewhat at odds with this proposal. Raymond From raymond.hettinger at gmail.com Sun Apr 15 23:41:41 2018 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 15 Apr 2018 20:41:41 -0700 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ In-Reply-To: References: Message-ID: <0AB0C29A-6852-475E-A9F8-B2CCEA503A1B@gmail.com> > On Apr 15, 2018, at 7:18 PM, Wes Turner wrote: > > And I'm done sharing non-pure-python solutions for this problem, I promise Keep them coming :-) Thanks for the research. It helps to remind ourselves that almost none of our problems are new :-) Raymond From rosuav at gmail.com Sun Apr 15 23:42:47 2018 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 16 Apr 2018 13:42:47 +1000 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ In-Reply-To: References: Message-ID: On Mon, Apr 16, 2018 at 1:39 PM, Raymond Hettinger wrote: > > So, the API design issue that confronts us is that it would be a bit weird and disorienting for the arithmetic operators to have two different signatures: > > += > -= > *= > /= > This needn't be a blocker. Strings can be added to strings, and strings can be multiplied by integers. If it's of practical value to multiply a Counter by a number, by all means do it. 
ChrisA From peter at norvig.com Mon Apr 16 00:04:00 2018 From: peter at norvig.com (Peter Norvig) Date: Mon, 16 Apr 2018 04:04:00 +0000 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ In-Reply-To: References: Message-ID: On Sun, Apr 15, 2018 at 8:39 PM Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > FWIW, Counter is explicitly documented to support the four multiset-style > mathematical operations discussed in Knuth TAOCP Volume II section 4.6.3 > exercise 19: > Wow, I never noticed "&" and "|" -- I guess when I got to "Common patterns for working with" in the documentation, I figured that there wouldn't be any new methods introduced after that and I stopped reading. > > it would be a bit weird and disorienting for the arithmetic operators to > have two different signatures: > > += > -= > *= > /= > Is it weird and disorienting to have: += *= -------------- next part -------------- An HTML attachment was scrubbed... URL: From raymond.hettinger at gmail.com Mon Apr 16 00:39:32 2018 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 15 Apr 2018 21:39:32 -0700 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ In-Reply-To: References: Message-ID: <1CB3843C-5E12-479D-86E2-F6C024F4875C@gmail.com> > On Apr 15, 2018, at 9:04 PM, Peter Norvig wrote: > > it would be a bit weird and disorienting for the arithmetic operators to have two different signatures: > > += > -= > *= > /= > > Is it weird and disorienting to have: > > += > *= Yes, there is a precedent that does seem to have worked out well in practice :-) It isn't exactly parallel because strings aren't containers of numbers, they don't have & and |, and there isn't a reason to want a / operation, but it does suggest that signature variation might not be problematic. BTW, do you just want __mul__ and __rmul__? If those went in, presumably there will be a request to support __imul__ because otherwise c*=3 would still work but would be inefficient (that was the rationale for adding inplace variants for all the current arithmetic operators). Likewise, presumably someone would legitimately want __div__ to support the normalization use case. Perhaps less likely, there would be also be a request for __floordiv__ to allow exactly scaled results to stay in the domain of integers. Which if any of these makes sense to you? Also, any thoughts on the cleanest way to express the computation of a chi-squared statistic (for example, to compare observed first digit frequencies to the frequencies predicted by Benford's Law)? This isn't an arbitrary question (it came up when a professor first proposed a variant of this idea a few years ago). Raymond From peter at norvig.com Mon Apr 16 00:55:16 2018 From: peter at norvig.com (Peter Norvig) Date: Mon, 16 Apr 2018 04:55:16 +0000 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ In-Reply-To: <1CB3843C-5E12-479D-86E2-F6C024F4875C@gmail.com> References: <1CB3843C-5E12-479D-86E2-F6C024F4875C@gmail.com> Message-ID: I don't have strong feelings, but I would say yes to __imul__, no to __div__ and __floordiv__ (with str/list/tuple as the precedent). For chisquare, I would be perfectly happy with: digit_counts = Counter(...) 
scipy.stats.chisquare(list(digit_counts.values())) On Sun, Apr 15, 2018 at 9:39 PM Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > > > > On Apr 15, 2018, at 9:04 PM, Peter Norvig wrote: > > > > it would be a bit weird and disorienting for the arithmetic operators to > have two different signatures: > > > > += > > -= > > *= > > /= > > > > Is it weird and disorienting to have: > > > > += > > *= > > Yes, there is a precedent that does seem to have worked out well in > practice :-) It isn't exactly parallel because strings aren't containers > of numbers, they don't have & and |, and there isn't a reason to want a / > operation, but it does suggest that signature variation might not be > problematic. > > BTW, do you just want __mul__ and __rmul__? If those went in, presumably > there will be a request to support __imul__ because otherwise c*=3 would > still work but would be inefficient (that was the rationale for adding > inplace variants for all the current arithmetic operators). Likewise, > presumably someone would legitimately want __div__ to support the > normalization use case. Perhaps less likely, there would be also be a > request for __floordiv__ to allow exactly scaled results to stay in the > domain of integers. Which if any of these makes sense to you? > > Also, any thoughts on the cleanest way to express the computation of a > chi-squared statistic (for example, to compare observed first digit > frequencies to the frequencies predicted by Benford's Law)? This isn't an > arbitrary question (it came up when a professor first proposed a variant of > this idea a few years ago). > > > Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Mon Apr 16 01:07:24 2018 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 16 Apr 2018 00:07:24 -0500 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ In-Reply-To: References: Message-ID: [Peter Norvig] > For most types that implement __add__, `x + x` is equal to `2 * x`. > > That is true for all numbers, list, tuple, str, timedelta, etc. -- but not > for collections.Counter. I can add two Counters, but I can't multiply one > by a scalar. That seems like an oversight. > > ... > Here's an implementation: > > def __mul__(self, scalar): > "Multiply each entry by a scalar." > result = Counter() > for key in self: > result[key] = self[key] * scalar > return result > > def __rmul__(self, scalar): > "Multiply each entry by a scalar." > result = Counter() > for key in self: > result[key] = scalar * self[key] > return result Adding Counter * integer doesn't bother me a bit, but the definition of what that should compute isn't obvious. In particular, that implementation doesn't preserve that `x+x == 2*x` if x has any negative values: >>> x = Counter(a=-1) >>> x Counter({'a': -1}) >>> x+x Counter() It would be strange if x+x != 2*x, and if x*-1 != -x: >>> y = Counter(a=1) >>> y Counter({'a': 1}) >>> -y Counter() Etc. Then again, it's already the case that, e.g., x-y isn't always the same as x + -y: >>> x = Counter(a=1) >>> y = Counter(a=2) >>> x - y Counter() >>> x + -y Counter({'a': 1}) So screw obvious formal identities ;-) I'm not clear on why "+" and "-" discard keys with values <= 0 to begin with. For "-" it's natural enough viewing "-" as being multiset difference, but for "+"? That's just made up ;-) In any case, despite the oddities, I think your implementation would be least surprising overall (ignore the sign of the resulting values). 
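Spelled out in a session, the identity failure Tim describes looks like this, using a dict comprehension to stand in for the proposed __mul__:

    >>> from collections import Counter
    >>> x = Counter(a=-1)
    >>> x + x                # "+" drops counts <= 0
    Counter()
    >>> Counter({k: 2 * v for k, v in x.items()})   # what 2 * x would keep
    Counter({'a': -2})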
At least for Counters that actually make sense as multisets (have no values <= 0), and for a positive integer multiplier `n > 0`, it does preserve that `x*n` = `x + x + ... + x` (with `n` instances of `x`). From wes.turner at gmail.com Mon Apr 16 01:24:54 2018 From: wes.turner at gmail.com (Wes Turner) Date: Mon, 16 Apr 2018 01:24:54 -0400 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ In-Reply-To: <1CB3843C-5E12-479D-86E2-F6C024F4875C@gmail.com> References: <1CB3843C-5E12-479D-86E2-F6C024F4875C@gmail.com> Message-ID: On Monday, April 16, 2018, Raymond Hettinger wrote: > > > > On Apr 15, 2018, at 9:04 PM, Peter Norvig wrote: > > > > it would be a bit weird and disorienting for the arithmetic operators to > have two different signatures: > > > > += > > -= > > *= > > /= > > > > Is it weird and disorienting to have: > > > > += > > *= > > Yes, there is a precedent that does seem to have worked out well in > practice :-) It isn't exactly parallel because strings aren't containers > of numbers, they don't have & and |, and there isn't a reason to want a / > operation, but it does suggest that signature variation might not be > problematic. > > BTW, do you just want __mul__ and __rmul__? If those went in, presumably > there will be a request to support __imul__ because otherwise c*=3 would > still work but would be inefficient (that was the rationale for adding > inplace variants for all the current arithmetic operators). Likewise, > presumably someone would legitimately want __div__ to support the > normalization use case. Perhaps less likely, there would be also be a > request for __floordiv__ to allow exactly scaled results to stay in the > domain of integers. Which if any of these makes sense to you? > > Also, any thoughts on the cleanest way to express the computation of a > chi-squared statistic (for example, to compare observed first digit > frequencies to the frequencies predicted by Benford's Law)? This isn't an > arbitrary question (it came up when a professor first proposed a variant of > this idea a few years ago). https://en.wikipedia.org/wiki/Chi-squared_distribution https://en.wikipedia.org/wiki/Chi-squared_test https://en.wikipedia.org/wiki/Benford%27s_law (How might one test this with e.g. *double* SHA256?) proportions_chisquare(count, nobs, value=None) https://www.statsmodels.org/dev/generated/statsmodels.stats.proportion.proportions_chisquare.html https://www.statsmodels.org/dev/genindex.html?highlight=chi scipy.stats.chisquare(f_obs, f_exp=None, ddof=0, axis=0) https://docs.scipy.org/doc/scipy-0.18.1/reference/generated/scipy.stats.chisquare.html sklearn.feature_selection.chi2(X, y) http://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.chi2.html#sklearn.feature_selection.chi2 kernel_approximation.AdditiveChi2Sampler kernel_approximation.SkewedChi2Sampler http://scikit-learn.org/stable/modules/classes.html#module-sklearn.kernel_approximation has sklearn.metrics.pairwise.chi2_kernel(X, Y=None, gamma=1.0) http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.chi2_kernel.html#sklearn.metrics.pairwise.chi2_kernel sklearn.metrics.pairwise.additive_chi2_kernel(X, Y=None) http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.additive_chi2_kernel.html#sklearn.metrics.pairwise.additive_chi2_kernel ... FreqDist(collections.Counter(odict)) ... sparse-coding ... 
One-Hot / Binarization http://contrib.scikit-learn.org/categorical-encoding/ StandardScalar (for standardization) refuses to work with sparse matrices: http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html#sklearn.preprocessing.StandardScaler > > Raymond > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From raymond.hettinger at gmail.com Mon Apr 16 01:37:23 2018 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 15 Apr 2018 22:37:23 -0700 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ In-Reply-To: References: Message-ID: <3E8290D6-6244-4F2F-8149-C63C42BC9387@gmail.com> > On Apr 15, 2018, at 10:07 PM, Tim Peters wrote: > > Adding Counter * integer doesn't bother me a bit, but the definition > of what that should compute isn't obvious. Any thoughts on Counter * float? A key use case for what is being proposed is: c *= 1 / c.total Raymond From tim.peters at gmail.com Mon Apr 16 01:51:20 2018 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 16 Apr 2018 00:51:20 -0500 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ In-Reply-To: <3E8290D6-6244-4F2F-8149-C63C42BC9387@gmail.com> References: <3E8290D6-6244-4F2F-8149-C63C42BC9387@gmail.com> Message-ID: [Tim] >> Adding Counter * integer doesn't bother me a bit, but the definition >> of what that should compute isn't obvious. [Raymond] > Any thoughts on Counter * float? A key use case for what is being proposed is: > > c *= 1 / c.total Ah, I thought I had already addressed that, but looks like my fingers forgot to type it ;-) By all mean, yes! Indeed, that strengthens the "argument" for why `Counter * int` should ignore the signs of the values - if we allow multiplying by anything supporting __mul__, that clearly says we view multiplication as being outside the "multiset" view, and so there's no reason at all to suppress values <= 0. I also have no problem with inplace operators. Or with adding `Counter /= scalar", for that matter. Perhaps whining could be reduced by rearranging the docs some: clearly separate operations designed to support the multiset view from the others. Then "but that operation makes no sense for multisets!" can be answered with "so don't use it on multisets - like the docs told you" ;-) From gadgetsteve at live.co.uk Mon Apr 16 01:22:54 2018 From: gadgetsteve at live.co.uk (Steve Barnes) Date: Mon, 16 Apr 2018 05:22:54 +0000 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ In-Reply-To: References: Message-ID: On 16/04/2018 06:07, Tim Peters wrote: > [Peter Norvig] >> For most types that implement __add__, `x + x` is equal to `2 * x`. >> >> That is true for all numbers, list, tuple, str, timedelta, etc. -- but not >> for collections.Counter. I can add two Counters, but I can't multiply one >> by a scalar. That seems like an oversight. >> >> ... >> Here's an implementation: >> >> def __mul__(self, scalar): >> "Multiply each entry by a scalar." >> result = Counter() >> for key in self: >> result[key] = self[key] * scalar >> return result >> >> def __rmul__(self, scalar): >> "Multiply each entry by a scalar." 
>> result = Counter() >> for key in self: >> result[key] = scalar * self[key] >> return result > > Adding Counter * integer doesn't bother me a bit, but the definition > of what that should compute isn't obvious. In particular, that > implementation doesn't preserve that `x+x == 2*x` if x has any > negative values: > >>>> x = Counter(a=-1) >>>> x > Counter({'a': -1}) >>>> x+x > Counter() > > It would be strange if x+x != 2*x, and if x*-1 != -x: > >>>> y = Counter(a=1) >>>> y > Counter({'a': 1}) >>>> -y > Counter() > > Etc. > > Then again, it's already the case that, e.g., x-y isn't always the > same as x + -y: > >>>> x = Counter(a=1) >>>> y = Counter(a=2) >>>> x - y > Counter() >>>> x + -y > Counter({'a': 1}) > > So screw obvious formal identities ;-) > > I'm not clear on why "+" and "-" discard keys with values <= 0 to > begin with. For "-" it's natural enough viewing "-" as being multiset > difference, but for "+"? That's just made up ;-) > > In any case, despite the oddities, I think your implementation would > be least surprising overall (ignore the sign of the resulting values). > At least for Counters that actually make sense as multisets (have no > values <= 0), and for a positive integer multiplier `n > 0`, it does > preserve that `x*n` = `x + x + ... + x` (with `n` instances of `x`). > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > Wouldn't it make sense to have the current counter behaviour, (negative counts not allowed), and also a counter that did allow negative values (my bank doesn't seem to have a problem with my balance being able to go below negative), and possibly at the same time a counter class that allowed fractional counts? Then: >>>> x = Counter(a=1) >>>> y = Counter(a=2) >>>> x - y > Counter() >>>> x + -y > Counter({'a': 1}) BUT: >>>> x = Counter(a=1, allow_negative=True) >>>> y = Counter(a=2, allow_negative=True) >>>> x - y > Counter({'a': 1}) >>>> x + -y > Counter({'a': 1}) Likewise for a Counter that was allowed to be fractional the result of some_counter / scalar would have (potentially) fractional results and one that did not would give floor results. -- Steve (Gadget) Barnes Any opinions in this message are my personal opinions and do not reflect those of my employer. --- This email has been checked for viruses by AVG. http://www.avg.com From raymond.hettinger at gmail.com Mon Apr 16 02:58:05 2018 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 15 Apr 2018 23:58:05 -0700 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ In-Reply-To: References: <3E8290D6-6244-4F2F-8149-C63C42BC9387@gmail.com> Message-ID: <50FEA75C-5F3D-4B99-8B11-E34D8608C33F@gmail.com> > On Apr 15, 2018, at 10:51 PM, Tim Peters wrote: > > I also have no problem with inplace operators. Or with adding > `Counter /= scalar", for that matter. But surely __rdiv__() would be over the top, harmonic means be damned ;-) Raymond From k7hoven at gmail.com Mon Apr 16 04:55:56 2018 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Mon, 16 Apr 2018 11:55:56 +0300 Subject: [Python-ideas] A cute Python implementation of itertools.tee In-Reply-To: References: Message-ID: On Mon, Apr 16, 2018 at 12:06 AM, Tim Peters wrote: > [Koos Zevenhoven ] > >.... It's definitely possible to write the above in a more > > readable way, and FWIW I don't think it involves "assignments as > > expressions". 
>
> Of course it is. The point was brevity and speed, not readability.
> It was presented partly as a puzzle :-)
>
> >> What I find kind of hilarious is that it's no help at all as a
> >> prototype for a C implementation: Python recycles stale `[next(it),
> >> None]` pairs all by itself, when their internal refcounts fall to 0.
> >> That's the hardest part.
> >
> > Why can't the C implementation use Python refcounts? Are you talking about
> > standalone C code?
>
> Yes, expressing the algorithm in plain old C, not building on top of
> (say) the Python C API.

There must have been a reason why pseudo code was "invented".

> > Or perhaps you are thinking about overhead?
>
> Nope.
>
> > (In PEP 555 that was not a concern, though). Surely it would make sense
> > to reuse the refcounting code that's already there. There are no cycles
> > here, so it's not particularly complicated -- just duplication.
> >
> > Anyway, the whole linked list is unnecessary if the iterable can be
> > iterated over multiple times.
>
> If the latter were how iterables always worked, there would be no need
> for tee() at all. It's tee's _purpose_ to make it possible for
> multiple consumers to traverse an iterable's can't-restart-or-even
> -go-back result sequence each at their own pace.

Yes. (I'm not sure which is easier, going back or starting from the beginning)

-- Koos

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From k7hoven at gmail.com Mon Apr 16 05:03:44 2018
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Mon, 16 Apr 2018 12:03:44 +0300
Subject: [Python-ideas] A cute Python implementation of itertools.tee
In-Reply-To: References: Message-ID:

On Sun, Apr 15, 2018 at 11:55 PM, Chris Angelico wrote:
> On Mon, Apr 16, 2018 at 6:46 AM, Koos Zevenhoven wrote:
> > Anyway, the whole linked list is unnecessary if the iterable can be
> > iterated over multiple times. But "tee" won't know when to do that.
> > *That* is what I call overhead (unless of course all the tee branches
> > are consumed in an interleaved manner).
>
> But if you have something you can iterate over multiple times, why
> bother with tee at all? Just take N iterators from the underlying
> iterable. The overhead is intrinsic to the value of the function.

Indeed. But if you have, say, an Iterable[int], you don't know if you need the additional buffer or not. It could be a range object or a set or a generator (or iterator), who knows. Even your type checker doesn't know what you need.

-- Koos

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From steve at pearwood.info Mon Apr 16 05:41:14 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 16 Apr 2018 19:41:14 +1000
Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__
In-Reply-To: References: Message-ID: <20180416094114.GK11616@ando.pearwood.info>

On Mon, Apr 16, 2018 at 05:22:54AM +0000, Steve Barnes wrote:
> Wouldn't it make sense to have the current counter behaviour, (negative
> counts not allowed), and also a counter that did allow negative values
> (my bank doesn't seem to have a problem with my balance being able to go
> below negative)

I wish my bank was as understanding, they keep telling me I'm overdrawn... *wink*

> and possibly at the same time a counter class that
> allowed fractional counts?

I understand the idea of counting in fractions (1/3, 2/3, 1, 1+1/3, ...)
but I don't understand what fractional frequencies would mean. What's your use-case for fractional frequencies?

-- Steve

From steve at pearwood.info Mon Apr 16 06:08:34 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 16 Apr 2018 20:08:34 +1000
Subject: [Python-ideas] A cute Python implementation of itertools.tee
In-Reply-To: References: <20180415115546.33d66718@fsol>
Message-ID: <20180416100834.GL11616@ando.pearwood.info>

On Sun, Apr 15, 2018 at 08:35:51PM +0300, Serhiy Storchaka wrote:
> I have ideas about implementing zero-overhead try/except, but I have
> doubts that it is worth. The benefit seems too small.

It is conventional wisdom that catching exceptions is expensive, and
that in performance critical code it is better to "look before you
leap" if possible, and avoid try...except.

Are you saying this advice is obsolete? If not, then perhaps reducing
the overhead of catching exceptions may be worthwhile.

-- Steve

From kirillbalunov at gmail.com Mon Apr 16 06:18:19 2018
From: kirillbalunov at gmail.com (Kirill Balunov)
Date: Mon, 16 Apr 2018 13:18:19 +0300
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info>
Message-ID:

2018-04-15 23:22 GMT+03:00 Chris Angelico :
> > > 0.
> > >
> > > while (items[i := i+1] := read_next_item()) is not None:
> > >     print(r'%d/%d' % (i, len(items)), end='\r')
> > >
> > > 1.
> > >
> > > while (read_next_item() -> items[(i+1) -> i]) is not None:
> > >     print(r'%d/%d' % (i, len(items)), end='\r')
>
> These two are matching what I wrote, and are thus the two forms under
> consideration. I notice that you added parentheses to the second one;
> is there a clarity problem here and you're unsure whether "i + 1 -> i"
> would capture "i + 1" or "1"? If so, that's a downside to the
> proposal.

Yes, parentheses were used only for clarity. I agree that I misunderstood
the purpose of your question. I have no problem if the right part is a
complex target, but maybe my perception is biased.

With kind regards,
-gdg
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ashrub at yandex.ru Mon Apr 16 07:27:30 2018
From: ashrub at yandex.ru (Alexey Shrub)
Date: Mon, 16 Apr 2018 14:27:30 +0300
Subject: [Python-ideas] Rewriting file - pythonic way
In-Reply-To: <20180415151949.r42vffyunhix36na@phdru.name>
References: <1523782620.2055.0@smtp.yandex.ru> <1523785794.2055.2@smtp.yandex.ru> <1523801755.2055.4@smtp.yandex.ru> <20180415151949.r42vffyunhix36na@phdru.name>
Message-ID: <1523878050.2020.2@smtp.yandex.ru>

On Sunday, 15 Apr 2018 at 6:19, Oleg Broytman wrote:
> Can I recommend to catch exceptions in `backuper.backup()`,
> cleanup backuper and unlock locker?

Yes, thanks, I moved .backup() into the try block. As for other
exceptions, I think they must be caught outside, because this module
doesn't know what to do with such problems.

From ashrub at yandex.ru Mon Apr 16 07:37:39 2018
From: ashrub at yandex.ru (Alexey Shrub)
Date: Mon, 16 Apr 2018 14:37:39 +0300
Subject: [Python-ideas] Rewriting file - pythonic way
In-Reply-To: References: <1523782620.2055.0@smtp.yandex.ru>
Message-ID: <1523878659.2020.3@smtp.yandex.ru>

On Sunday, 15 Apr 2018 at 10:47, George Fischhof wrote:
> https://docs.python.org/3/library/fileinput.html

Thanks, it works:
https://github.com/worldmind/scripts/blob/master/filerewrite/fileinputtest.py
but it looks like that approach only works for line-by-line processing.

From ashrub at yandex.ru Mon Apr 16 07:48:47 2018
From: ashrub at yandex.ru (Alexey Shrub)
Date: Mon, 16 Apr 2018 14:48:47 +0300
Subject: [Python-ideas] Rewriting file - pythonic way
In-Reply-To: References: <1523782620.2055.0@smtp.yandex.ru>
Message-ID: <1523879327.2020.5@smtp.yandex.ru>

On Sunday, 15 Apr 2018 at 10:47, George Fischhof wrote:
> https://docs.python.org/3/library/fileinput.html

https://pypi.python.org/pypi/in-place looks not bad too.

From ashrub at yandex.ru Mon Apr 16 07:47:23 2018
From: ashrub at yandex.ru (Alexey Shrub)
Date: Mon, 16 Apr 2018 14:47:23 +0300
Subject: [Python-ideas] Rewriting file - pythonic way
In-Reply-To: References: <1523782620.2055.0@smtp.yandex.ru> <1523785794.2055.2@smtp.yandex.ru>
Message-ID: <1523879243.2020.4@smtp.yandex.ru>

On Sunday, 15 Apr 2018 at 1:12, Serhiy Storchaka wrote:
> Actually the reliable code should write into a separate file and replace
> the original file by the new file only if writing is successful. Or
> backup the old file and restore it if writing failed. Or do both. And
> handle hard and soft links if necessary. And use file locks if needed to
> prevent race condition when read/write by different processes. Depending
> on the specific of the application you may need different code. Your
> three lines are enough for a one-time script if the risk of a power
> blackout or disk space exhaustion is insignificant or if the data is not
> critical.

I'm not sure that solving the problems described is a task for this
level; maybe it is a problem for a higher level.

From ashrub at yandex.ru Mon Apr 16 07:58:07 2018
From: ashrub at yandex.ru (Alexey Shrub)
Date: Mon, 16 Apr 2018 14:58:07 +0300
Subject: [Python-ideas] Rewriting file - pythonic way
In-Reply-To: <1523879327.2020.5@smtp.yandex.ru>
References: <1523782620.2055.0@smtp.yandex.ru> <1523879327.2020.5@smtp.yandex.ru>
Message-ID: <1523879887.2020.6@smtp.yandex.ru>

On Monday, 16 Apr 2018 at 2:48, Alexey Shrub wrote:
> https://pypi.python.org/pypi/in-place

I like the in_place module:
https://github.com/worldmind/scripts/blob/master/filerewrite/inplacetest.py
It fixes some strange features of the fileinput module. Maybe in_place
should be in the standard library instead of fileinput?

From kirillbalunov at gmail.com Mon Apr 16 08:35:33 2018
From: kirillbalunov at gmail.com (Kirill Balunov)
Date: Mon, 16 Apr 2018 15:35:33 +0300
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info>
Message-ID:

[Guido]
2018-04-15 20:19 GMT+03:00 Guido van Rossum :
> On Sun, Apr 15, 2018 at 4:05 AM, Kirill Balunov wrote:
>
>> [...] For me personally, `:=` looks and feels just like a normal
>> assignment statement which can be used interchangeably but in many more
>> places in the code. And if the main goal of the PEP was to offer this
>> `assignment expression` as a future replacement for `assignment statement`
>> the `:=` syntax form would be the very reasonable proposal (of course in
>> this case there will be a lot more other questions).
>
> I haven't kept up with what's in the PEP (or much of this thread), but
> this is the key reason I strongly prefer := as inline assignment operator.
> > >> But somehow this PEP does not mean it! And with the current rationale of >> this PEP it's a huge CON for me that `=` and `:=` feel and look the same. >> > > Then maybe the PEP needs to be updated. > [Chris] 2018-04-15 23:28 GMT+03:00 Chris Angelico : > On Mon, Apr 16, 2018 at 3:19 AM, Guido van Rossum > wrote: > > On Sun, Apr 15, 2018 at 4:05 AM, Kirill Balunov > > > wrote: > >> But somehow this PEP does not mean it! And with the current rationale of > >> this PEP it's a huge CON for me that `=` and `:=` feel and look the > same. > > > > Then maybe the PEP needs to be updated. > > I can never be sure what people are reading when they say "current" > with PEPs like this. The text gets updated fairly frequently. As of > time of posting, here's the rationale: > > ----- > Naming the result of an expression is an important part of programming, > allowing a descriptive name to be used in place of a longer expression, > and permitting reuse. Currently, this feature is available only in > statement form, making it unavailable in list comprehensions and other > expression contexts. Merely introducing a way to assign as an expression > would create bizarre edge cases around comprehensions, though, and to avoid > the worst of the confusions, we change the definition of comprehensions, > causing some edge cases to be interpreted differently, but maintaining the > existing behaviour in the majority of situations. > ----- > > Kirill, is this what you read, and if so, how does that make ':=' a > negative? The rationale says "hey, see this really good thing you can > do as a statement? Let's do it as an expression too", so the parallel > should be a good thing. > > Yes, this is what I read. I understand why you have such a question so I'll try to explain my position in more detail. Also I want to add that I did not fully understand about which part Guido said - "Then maybe the PEP needs to be updated." Therefore, I allow myself to assume that he had in mind the following - "The assignment expression should be semantically equivalent to assignment statement and perceived as a theoretically possible future replacement (usage) of assignment statement." If this is really the case and I understood correctly - I will repeat that for me the current state of the PEP does not fully imply this. 1. Part - "Then maybe the PEP needs to be updated." If you really see it as a theoretical substitute for assignment statement in future Python. I will update the rationale with maybe the following (I immediately warn you that I do not pretend to have a good English style): ----- Naming the result of an expression is an important part of programming, allowing a descriptive name to be used in place of a longer expression, and permitting reuse. Currently, in Python this feature is available only in statement form, making it unavailable in list comprehensions and other expression contexts. This restriction, of making it as a statement, was done primarily to avoid the usual trap of `=` vs `==` in C/C++ language. Despite this, it is evident that the ability to assign a name within an expression is convenient, allows to avoid redundant recalculations of the same expression and is a familiar feature from other programming languages. Thus the main aim of this PEP is to provide a syntax which will allow to assign as an expression and be semantically and visually interchangeable with the assignment statement. ... ----- In this case, I do not see any reason to discuss the alternative syntax - there is really only one choice `:=`. 
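To make this concrete: under such a rationale the following two fragments would use the same spelling (purely illustrative, since `:=` is not valid in any released Python, and f() / process() are made-up names):

    y := f(x)                       # as a statement, where today we write y = f(x)
    while (y := f(x)) is not None:  # and, unchanged, as an expression
        process(y)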
And then for me the list of open questions would look like (for example): 1. ...Is it worth accepting? 2. ...Should the other forms (+=, *=, basically all) that can not be confused with `==` be changed to expressions? ... 2. Part - How do I understand the current state of the PEP I perceive the current rationale as "hey, see this really good thing you can do as a statement? Let's do it as an expression too". Which for me means opportunities to discuss the following questions: 1. Should assignment expression be viewed as a replacement of an assignment statement or as a complement to it? 2. Which spelling should have an assignment expression? ( depends on the previous ) 3. Does it make sense to make it valid only in certain context or in any? ... and many others ... You are the author of this PEP. Therefore, you choose how you take it, how you feel it and what kind of feedback you are willing to accept, and in any case I will respect your choice :) with kind regards, -gdg -------------- next part -------------- An HTML attachment was scrubbed... URL: From ashrub at yandex.ru Mon Apr 16 08:49:02 2018 From: ashrub at yandex.ru (Alexey Shrub) Date: Mon, 16 Apr 2018 15:49:02 +0300 Subject: [Python-ideas] Rewriting file - pythonic way In-Reply-To: <1523879887.2020.6@smtp.yandex.ru> References: <1523782620.2055.0@smtp.yandex.ru> <1523879327.2020.5@smtp.yandex.ru> <1523879887.2020.6@smtp.yandex.ru> Message-ID: <1523882942.17444.0@smtp.yandex.ru> https://pypi.python.org/pypi/in-place > * Instead of hijacking sys.stdout, a new filehandle is returned for writing. > * The filehandle supports all of the standard I/O methods, not just readline(). why fileinput did not support this things? From mikhailwas at gmail.com Mon Apr 16 09:05:26 2018 From: mikhailwas at gmail.com (Mikhail V) Date: Mon, 16 Apr 2018 16:05:26 +0300 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) In-Reply-To: <20180415155805.GI11616@ando.pearwood.info> References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info> <20180415155805.GI11616@ando.pearwood.info> Message-ID: On Sun, Apr 15, 2018 at 6:58 PM, Steven D'Aprano wrote: > On Sun, Apr 15, 2018 at 10:21:02PM +1000, Chris Angelico wrote: > >> I don't think we're ever going to unify everyone on an arbitrary >> question of "expression first" or "name first". But to all the >> "expression first" people, a question: what if the target is not just >> a simple name? >> >> while (read_next_item() -> items[i + 1 -> i]) is not None: >> print("%d/%d..." % (i, len(items)), end="\r") > > I don't see why it would make a difference. It doesn't to me. > > >> Does this make sense? With the target coming first, it perfectly >> parallels the existing form of assignment: > > Yes, except this isn't ordinary assignment-as-a-statement. > > I've been mulling over the question why I think the expression needs to > come first here, whereas I'm satisfied with the target coming first for > assignment statements, and I think I've finally got the words to explain > it. It is not just long familiarity with maths and languages that put > the variable first (although that's also part of it). It has to do with > what we're looking for when we read code, specifically what is the > primary piece of information we're initially looking for. > > In assignment STATEMENTS the primary piece of information is the target. > Yes, of course the value assigned to the target is important, but often > we don't care what the value is, at least not at first. 
We're hunting > for a known target, and only when we find it do we care about the value > it gets. > ... [SNIP] .... > > It is appropriate for assignment statements and expressions to be > written differently because they are used differently. > Wow. That feeling when you see someone giving reasonable arguments but in the end comes up with such doubtful conclusions. So you agree that in general you may need to spot values and in other case function calls or expressions. And that's it, its just depends. So if you swap the order and in some _single_ particular case you may notice tiny advantage, you conclude that the whole case with expression assignment needs this order. Lets just return to some of proposed examples (I use "=" in both examples to be less biased here): 1. if ( match = re.match("foo", S) ) == True: print("match:", match) 2. if ( re.match("foo", S) = match ) == True: print("match:", match) Now seriously, you may argue around those "pronounce" theoretical bla bla, like "take the result and save it in a token". But the variant 1. is just better, because _it is what it is in Python_. So it is better not because it is better looking or whatever, it is same sh** turned around. So just don't turn it around! Here the 1st variant can be unwrapped to: match = re.match("foo", S) if match == True: print("match:", match) Do you see what I mean? When I read code I don't have all those things you describe in a millisecond : - look at the pointy end of operator - think, oh this shows to the right - seems like I save the value there - yep, that's the way I imply things to work - stroking the belly .... Instead I just parse visually some smaller parts of code and it's just better if the assignment is in the same order as everywhere. Yes, in some single case one order can look better, but in this case it's just not good to mix those. Mikhail From peter.ed.oconnor at gmail.com Mon Apr 16 09:49:54 2018 From: peter.ed.oconnor at gmail.com (Peter O'Connor) Date: Mon, 16 Apr 2018 09:49:54 -0400 Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin In-Reply-To: References: Message-ID: Hi Danilo, The idea of decorating a function to show that the return variables could be fed back in in a scan form is interesting and could solve my problem in a nice way without new syntax. I looked at your code but got a bit confused as to how it works (there seems to be some magic where the decorator injects the scanned variable into the namespace). Are you able to show how you'd implement the moving average example with your package? I tried: @enable_scan("average") def exponential_moving_average_pyscan(signal, decay, initial=0): yield from ((1-decay)*(average or initial) + decay*x for x in signal) smooth_signal_9 = list(exponential_moving_average_pyscan(signal, decay=decay))[1:] Which almost gave the right result, but seemed to get the initial conditions wrong. - Peter On Sat, Apr 14, 2018 at 3:57 PM, Danilo J. S. Bellini < danilo.bellini at gmail.com> wrote: > On 5 April 2018 at 13:52, Peter O'Connor > wrote: > >> I was thinking it would be nice to be able to encapsulate this common >> type of operation into a more compact comprehension. >> >> I propose a new "Reduce-Map" comprehension that allows us to write: >> >> signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)] >> smooth_signal = [average = (1-decay)*average + decay*x for x in signal from average=0.] 
>> >> Instead of: >> >> def exponential_moving_average(signal: Iterable[float], decay: float, initial_value: float=0.): >> average = initial_value >> for xt in signal: >> average = (1-decay)*average + decay*xt >> yield average >> >> signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)] >> smooth_signal = list(exponential_moving_average(signal, decay=0.05)) >> >> I wrote in this mail list the very same proposal some time ago. I was > trying to let the scan higher order function (itertools.accumulate with a > lambda, or what was done in the example above) fit into a simpler list > comprehension. > > As a result, I wrote this project, that adds the "scan" feature to Python > comprehensions using a decorator that performs bytecode manipulation (and > it had to fit in with a valid Python syntax): > https://github.com/danilobellini/pyscanprev > > In that GitHub page I've wrote several examples and a rationale on why > this would be useful. > > -- > Danilo J. S. Bellini > --------------- > "*It is not our business to set up prohibitions, but to arrive at > conventions.*" (R. Carnap) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.ed.oconnor at gmail.com Mon Apr 16 10:08:50 2018 From: peter.ed.oconnor at gmail.com (Peter O'Connor) Date: Mon, 16 Apr 2018 10:08:50 -0400 Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin In-Reply-To: References: Message-ID: In any case, although I find the magic variable-injection stuff quite strange, I like the decorator. Something like @scannable(average=0) # Wrap function so that it has a "scan" method which can be used to generate a stateful scan object def exponential_moving_average(average, x, decay): return (1-decay)*average + decay*x stateful_func = exponential_moving_average.scan(average=initial) smooth_signal = [stateful_func(x) for x in signal] Seems appealing because it allows you to define the basic function without, for instance, assuming that decay will be constant. If you wanted dynamic decay, you could easily have it without changing the function: stateful_func = exponential_moving_average.scan(average=initial) smooth_signal = [stateful_func(x, decay=decay) for x, decay in zip(signal, decay_schedule)] And you pass around state explicitly. On Mon, Apr 16, 2018 at 9:49 AM, Peter O'Connor wrote: > Hi Danilo, > > The idea of decorating a function to show that the return variables could > be fed back in in a scan form is interesting and could solve my problem in > a nice way without new syntax. > > I looked at your code but got a bit confused as to how it works (there > seems to be some magic where the decorator injects the scanned variable > into the namespace). Are you able to show how you'd implement the moving > average example with your package? > > I tried: > > @enable_scan("average") > def exponential_moving_average_pyscan(signal, decay, initial=0): > yield from ((1-decay)*(average or initial) + decay*x for x in > signal) > > > smooth_signal_9 = list(exponential_moving_average_pyscan(signal, > decay=decay))[1:] > > Which almost gave the right result, but seemed to get the initial > conditions wrong. > > - Peter > > > > On Sat, Apr 14, 2018 at 3:57 PM, Danilo J. S. Bellini < > danilo.bellini at gmail.com> wrote: > >> On 5 April 2018 at 13:52, Peter O'Connor >> wrote: >> >>> I was thinking it would be nice to be able to encapsulate this common >>> type of operation into a more compact comprehension. 
>>> >>> I propose a new "Reduce-Map" comprehension that allows us to write: >>> >>> signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)] >>> smooth_signal = [average = (1-decay)*average + decay*x for x in signal from average=0.] >>> >>> Instead of: >>> >>> def exponential_moving_average(signal: Iterable[float], decay: float, initial_value: float=0.): >>> average = initial_value >>> for xt in signal: >>> average = (1-decay)*average + decay*xt >>> yield average >>> >>> signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)] >>> smooth_signal = list(exponential_moving_average(signal, decay=0.05)) >>> >>> I wrote in this mail list the very same proposal some time ago. I was >> trying to let the scan higher order function (itertools.accumulate with a >> lambda, or what was done in the example above) fit into a simpler list >> comprehension. >> >> As a result, I wrote this project, that adds the "scan" feature to Python >> comprehensions using a decorator that performs bytecode manipulation (and >> it had to fit in with a valid Python syntax): >> https://github.com/danilobellini/pyscanprev >> >> In that GitHub page I've wrote several examples and a rationale on why >> this would be useful. >> >> -- >> Danilo J. S. Bellini >> --------------- >> "*It is not our business to set up prohibitions, but to arrive at >> conventions.*" (R. Carnap) >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Mon Apr 16 11:54:34 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 17 Apr 2018 01:54:34 +1000 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) In-Reply-To: References: <20180413110432.GB11616@ando.pearwood.info> <20180413131859.GE11616@ando.pearwood.info> <20180415155805.GI11616@ando.pearwood.info> Message-ID: <20180416155433.GN11616@ando.pearwood.info> On Mon, Apr 16, 2018 at 06:16:46AM +1000, Chris Angelico wrote: [...] > >> >>> items = [None] * 10 > >> >>> i = -1 > >> >>> items[i := i + 1] = input("> ") > >> > asdf > >> >>> items[i := i + 1] = input("> ") > >> > qwer > >> >>> items[i := i + 1] = input("> ") > >> > zxcv > >> >>> > >> >>> items > >> ['asdf', 'qwer', 'zxcv', None, None, None, None, None, None, None] > > > > > > I don't know why you would write that instead of: > > > > items = [None]*10 > > for i in range(3): > > items[i] = input("> ") > > > > > > or even for that matter: > > > > items = [input("> ") for i in range(3)] + [None]*7 > > > > > > but whatever floats your boat. (Python isn't just not Java. It's also > > not C *wink*) > > You and Kirill have both fallen into the trap of taking the example > too far. By completely rewriting it, you destroy its value as an > example. Write me a better example of a complex target if you like, > but the question is about how you feel about complex assignment > targets, NOT how you go about creating a particular list in memory. > That part is utterly irrelevant. Chris, I must admit that I'm utterly perplexed at this. Your example is as far as from a complex assignment target as you can possibly get. It's a simple name! i := i + 1 The target is just "i", a name. The point I was making is that your example is not a good showcase for this suggested functionality. Your code violates DRY, repeating the exact same line three times. It ought to be put in a loop, and once put in a loop, the justification for needing assignment-expression disappears. 
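To spell that out (a sketch only, since `:=` is not implemented in any Python release):

    # unrolled: the identical line repeated three times
    items[i := i + 1] = input("> ")
    items[i := i + 1] = input("> ")
    items[i := i + 1] = input("> ")

    # looped: the assignment expression is no longer needed
    for i in range(3):
        items[i] = input("> ")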
But having said that, I did respond to your question and swapping the order around: items[i + 1 -> i] = input("> ") It's still not a "complex target", the target is still just a plain ol' name, but it is precisely equivalent to your example. And then I went further and re-wrote your example to use a genuinely complex target, which I won't repeat here. > >> Are you as happy with that sort of complex > >> expression coming after 'as' or '->'? > > > > Sure. Ignoring the output of the calls to input(): > > The calls to input were in a while loop's header for a reason. > Ignoring them is ignoring the point of assignment expressions. What while loop? Your example has no while loop. But regardless, we don't need to care about the *output* (i.e. your keypresses echoed to stdout) when looking at the code sample. -- Steve From tim.peters at gmail.com Mon Apr 16 12:21:51 2018 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 16 Apr 2018 11:21:51 -0500 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ In-Reply-To: References: Message-ID: [Steve Barnes ] > Wouldn't it make sense to have the current counter behaviour, (negative > counts not allowed), and also a counter that did allow negative values > (my bank doesn't seem to have a problem with my balance being able to go > below negative), and possibly at the same time a counter class that > allowed fractional counts? We already have all that: a counter's values can be anything at all: >>> a = Counter(a=Fraction(2, 3)) >>> a + a Counter({'a': Fraction(4, 3)}) It's four specific binary "multiset" _operations_ that purge values <= 0 from their results: >>> a = Counter(a=Fraction(-2, 3)) >>> a Counter({'a': Fraction(-2, 3)}) >>> a['a'] < 0 True >>> a + a Counter() >>> a - a Counter() >>> a | a Counter() >>> a & a Counter() OK, also unary prefix "+" and "-". Other methods do not: >>> a = Counter(a=Fraction(-2, 3)) >>> a.subtract(Counter(a=1)) # like inplace "-" but <= 0 not special >>> a Counter({'a': Fraction(-5, 3)}) >>> a.update(Counter(a=1)) # like inplace "+" but <= 0 not special >>> a Counter({'a': Fraction(-2, 3)}) >>> a['a'] -= 100 >>> a Counter({'a': Fraction(-302, 3)}) > ... > Likewise for a Counter that was allowed to be fractional the result of > some_counter / scalar would have (potentially) fractional results and > one that did not would give floor results. As above, results are a property not of the counter objects, but of the specific operations performed. some_counter / scalar would most obviously work like:the current: >>> c = Counter(a=1, b=-2) >>> c Counter({'a': 1, 'b': -2}) >>> scalar = 5 >>> Counter({key: value / scalar for key, value in c.items()}) Counter({'a': 0.2, 'b': -0.4}) From ncoghlan at gmail.com Mon Apr 16 12:22:36 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 17 Apr 2018 02:22:36 +1000 Subject: [Python-ideas] Idea: Importing from arbitrary filenames In-Reply-To: References: Message-ID: On 16 April 2018 at 03:45, Steve Barnes wrote: > On 15/04/2018 08:12, Nick Coghlan wrote: >> The discoverability of these kinds of techniques could definitely >> stand to be improved, but the benefit of adopting them is that they >> work on all currently supported versions of Python (even >> importlib.import_module exists in Python 2.7 as a convenience wrapper >> around __import__), rather than needing to wait for new language level >> syntax for them. 
> As you say not too discoverable at the moment - I have just reread
> PEP328 & https://docs.python.org/3/library/importlib.html but did not
> find any mention of these mechanisms or even that setting an external
> __path__ variable existed as a possibility.

Yeah, the fact that "packages are ultimately just modules with a
__path__ attribute that works like sys.path" tends to get obscured by
the close association between package hierarchies and file system
layouts in the default filesystem importer.

The docs for that are all the way back in PEP 302:
https://www.python.org/dev/peps/pep-0302/#packages-and-the-role-of-path

> Maybe a documentation enhancement proposal would be in order?

If we're not covering explicit __path__ manipulation anywhere, we
should definitely mention that possibility.
https://docs.python.org/3/library/pkgutil.html#pkgutil.extend_path
does talk about it, but only in the context of scanning sys.path for
matching names, not in the context of building a package from an
arbitrary set of directory names.

I'm not sure where we could put an explanation of some of the broader
implications of that fact, though - while __path__ manipulation is
usually fairly safe, we're always a little hesitant about encouraging
too many dynamic modifications to the import system state, since it
can sometimes have odd side effects based on whether imports happen
before or after that state is adjusted.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From p.f.moore at gmail.com Mon Apr 16 12:36:46 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 16 Apr 2018 17:36:46 +0100
Subject: [Python-ideas] Idea: Importing from arbitrary filenames
In-Reply-To: References: Message-ID:

On 16 April 2018 at 17:22, Nick Coghlan wrote:
> If we're not covering explicit __path__ manipulation anywhere, we
> should definitely mention that possibility.
> https://docs.python.org/3/library/pkgutil.html#pkgutil.extend_path
> does talk about it, but only in the context of scanning sys.path for
> matching names, not in the context of building a package from an
> arbitrary set of directory names.

It's quite possible that we're not.
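For anyone following along, a minimal sketch of the kind of explicit __path__ manipulation being discussed (the package and directory names here are hypothetical):

    import plugins  # an ordinary package already importable from sys.path

    # __path__ is just a list of directory names, searched like sys.path,
    # so grafting extra directories onto it extends the package:
    plugins.__path__.append('/opt/site/plugins')
    plugins.__path__.append('/home/me/dev/plugins')

    # submodules found in either directory are now importable as usual
    from plugins import extra_widget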
> I'm not sure where we could put an explanation of some of the broader > implications of that fact, though - while __path__ manipulation is > usually fairly safe, we're always a little hesitant about encouraging > too many dynamic modifications to the import system state, since it > can sometimes have odd side effects based on whether imports happen > before or after that state is adjusted.. One of the problems with PEP 302 was that there was no really good place in the documentation to put all the information that was present (certainly not in the version of the docs that was around when we wrote it). So a lot of the important details remained buried in PEP 302. Since then, a lot of the details ended up in the docs, mostly in the importlib sections, but I don't recall ever seeing anything about __path__ (and particularly not the nice summary you gave, "packages are ultimately just modules with a __path__ attribute that works like sys.path". Paul From ericfahlgren at gmail.com Mon Apr 16 12:57:53 2018 From: ericfahlgren at gmail.com (Eric Fahlgren) Date: Mon, 16 Apr 2018 09:57:53 -0700 Subject: [Python-ideas] Idea: Importing from arbitrary filenames In-Reply-To: References: Message-ID: The documentation is pretty opaque or non-existent on other aspects of importlib use, too. If I enable warnings, I see this (and many more like it). I've read PEP 302 a couple times, read the code in importlib that detects the warning and searched down several rabbit holes, only to come up empty... T:\Python36\lib\importlib\_bootstrap.py:219: ImportWarning: can't resolve package from __spec__ or __package__, falling back on __name__ and __path__ My thoughts when I see it: "Ok. So what does that mean? Is it bad? It must be bad, otherwise I wouldn't get a warning. How do I reconcile __spec__ and __package__? Which one is missing and/or incorrect?" On Mon, Apr 16, 2018 at 9:36 AM, Paul Moore wrote: > On 16 April 2018 at 17:22, Nick Coghlan wrote: > > If we're not covering explicit __path__ manipulation anywhere, we > > should definitely mention that possibility. > > https://docs.python.org/3/library/pkgutil.html#pkgutil.extend_path > > does talk about it, but only in the context of scanning sys.path for > > matching names, not in the context of building a package from an > > arbitrary set of directory names. > > It's quite possible that we're not. > > > I'm not sure where we could put an explanation of some of the broader > > implications of that fact, though - while __path__ manipulation is > > usually fairly safe, we're always a little hesitant about encouraging > > too many dynamic modifications to the import system state, since it > > can sometimes have odd side effects based on whether imports happen > > before or after that state is adjusted.. > > One of the problems with PEP 302 was that there was no really good > place in the documentation to put all the information that was present > (certainly not in the version of the docs that was around when we > wrote it). So a lot of the important details remained buried in PEP > 302. Since then, a lot of the details ended up in the docs, mostly in > the importlib sections, but I don't recall ever seeing anything about > __path__ (and particularly not the nice summary you gave, "packages > are ultimately just modules with a > __path__ attribute that works like sys.path". 
> > Paul > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Mon Apr 16 13:23:05 2018 From: brett at python.org (Brett Cannon) Date: Mon, 16 Apr 2018 17:23:05 +0000 Subject: [Python-ideas] Idea: Importing from arbitrary filenames In-Reply-To: References: Message-ID: On Mon, 16 Apr 2018 at 09:58 Eric Fahlgren wrote: > The documentation is pretty opaque or non-existent on other aspects of > importlib use, too. > Well, we are diving into the dark corners of import here. (Details can be found in the language reference: https://docs.python.org/3/reference/import.html). > If I enable warnings, I see this (and many more like it). I've read PEP > 302 a couple times, read the code in importlib that detects the warning and > searched down several rabbit holes, only to come up empty... > > T:\Python36\lib\importlib\_bootstrap.py:219: ImportWarning: can't resolve > package from __spec__ or __package__, falling back on __name__ and __path__ > > My thoughts when I see it: "Ok. So what does that mean? > It means that the mechanisms import typically uses to calculate the importing module's name in order to resolve relative imports wasn't where it should be, and so we fell back to the Python 2 way of doing it. > Is it bad? > Eh, it isn't ideal. ;) > It must be bad, otherwise I wouldn't get a warning. How do I reconcile > __spec__ and __package__? > You should be setting __spec__.parent, but we will fall back to __package__ if that doesn't exist (and raise a different warning). :) > Which one is missing and/or incorrect?" > Both are missing. :) -Brett > > > On Mon, Apr 16, 2018 at 9:36 AM, Paul Moore wrote: > >> On 16 April 2018 at 17:22, Nick Coghlan wrote: >> > If we're not covering explicit __path__ manipulation anywhere, we >> > should definitely mention that possibility. >> > https://docs.python.org/3/library/pkgutil.html#pkgutil.extend_path >> > does talk about it, but only in the context of scanning sys.path for >> > matching names, not in the context of building a package from an >> > arbitrary set of directory names. >> >> It's quite possible that we're not. >> >> > I'm not sure where we could put an explanation of some of the broader >> > implications of that fact, though - while __path__ manipulation is >> > usually fairly safe, we're always a little hesitant about encouraging >> > too many dynamic modifications to the import system state, since it >> > can sometimes have odd side effects based on whether imports happen >> > before or after that state is adjusted.. >> >> One of the problems with PEP 302 was that there was no really good >> place in the documentation to put all the information that was present >> (certainly not in the version of the docs that was around when we >> wrote it). So a lot of the important details remained buried in PEP >> 302. Since then, a lot of the details ended up in the docs, mostly in >> the importlib sections, but I don't recall ever seeing anything about >> __path__ (and particularly not the nice summary you gave, "packages >> are ultimately just modules with a >> __path__ attribute that works like sys.path". 
>> >> Paul
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From brett at python.org  Mon Apr 16 13:35:30 2018
From: brett at python.org (Brett Cannon)
Date: Mon, 16 Apr 2018 17:35:30 +0000
Subject: [Python-ideas] Move optional data out of pyc files
In-Reply-To: <20180414235405.wft73xwtbwwcwwme@python.ca>
References: <15602430-b133-239a-1aa4-b3bc9973f44a@egenix.com>
 <20180414235405.wft73xwtbwwcwwme@python.ca>
Message-ID: 

On Sat, 14 Apr 2018 at 17:01 Neil Schemenauer wrote:

> On 2018-04-12, M.-A. Lemburg wrote:
> > This leaves the proposal to restructure pyc files into a sectioned
> > file and possibly indexed file to make access to (lazily) loaded
> > parts faster.
>
> I would like to see a format that can hold one or more modules in a
> single file. Something like the zip format but optimized for fast
> interpreter startup time. It should support lazy loading of module
> parts (e.g. maybe my lazy bytecode execution idea[1]). Obviously a
> lot of details to work out.
>

Eric Snow, Barry Warsaw, and I chatted about a custom file format for
holding Python source (and data files). My notes on the chat can be found
at
https://notebooks.azure.com/Brett/libraries/design-ideas/html/Python%20source%20archive%20file%20format.ipynb
(and since we aren't trying to rewrite bytecode, we figured it wouldn't
break your proposal, Neil ;-).

-Brett

>
> The design should also take into account the widespread use of
> virtual environments. So, it should be easy and space efficient to
> build virtual environments using this format (e.g. maybe allow
> overlays so that the stdlib package is not copied into the virtual
> environment; virtual packages would be overlaid on the stdlib file).
> Also, it should be easy to bundle all modules into an "uber" package
> and append it to the Python executable. CPython should provide
> out-of-box support for single-file executables.
>
>
> 1. https://github.com/python/cpython/pull/6194
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rosuav at gmail.com  Mon Apr 16 13:36:34 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 17 Apr 2018 03:36:34 +1000
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: <20180416155433.GN11616@ando.pearwood.info>
References: <20180413110432.GB11616@ando.pearwood.info>
 <20180413131859.GE11616@ando.pearwood.info>
 <20180415155805.GI11616@ando.pearwood.info>
 <20180416155433.GN11616@ando.pearwood.info>
Message-ID: 

On Tue, Apr 17, 2018 at 1:54 AM, Steven D'Aprano wrote:
> On Mon, Apr 16, 2018 at 06:16:46AM +1000, Chris Angelico wrote:
> [...]
>> >> >>> items = [None] * 10
>> >> >>> i = -1
>> >> >>> items[i := i + 1] = input("> ")
>> >> > asdf
>> >> >>> items[i := i + 1] = input("> ")
>> >> > qwer
>> >> >>> items[i := i + 1] = input("> ")
>> >> > zxcv
>> >> >>>
>> >> >>> items
>> >> ['asdf', 'qwer', 'zxcv', None, None, None, None, None, None, None]
>> >
>> >
>> > I don't know why you would write that instead of:
>> >
>> > items = [None]*10
>> > for i in range(3):
>> >     items[i] = input("> ")
>> >
>> >
>> > or even for that matter:
>> >
>> > items = [input("> ") for i in range(3)] + [None]*7
>> >
>> >
>> > but whatever floats your boat. (Python isn't just not Java. It's also
>> > not C *wink*)
>>
>> You and Kirill have both fallen into the trap of taking the example
>> too far. By completely rewriting it, you destroy its value as an
>> example. Write me a better example of a complex target if you like,
>> but the question is about how you feel about complex assignment
>> targets, NOT how you go about creating a particular list in memory.
>> That part is utterly irrelevant.
>
> Chris, I must admit that I'm utterly perplexed at this. Your example is
> as far as from a complex assignment target as you can possibly get. It's
> a simple name!
>
>     i := i + 1
>
> The target is just "i", a name.

Thanks so much for the aggressively-trimmed quote. For once, though,
TOO aggressive. What you're focusing on is the *unrolled* version of a
two-line loop. Look at the actual loop, please, and respond to the
actual question. :|

>> The calls to input were in a while loop's header for a reason.
>> Ignoring them is ignoring the point of assignment expressions.
>
> What while loop? Your example has no while loop.

Not after it got trimmed, no. Here's what I actually said in my original post:

    while (read_next_item() -> items[i + 1 -> i]) is not None:
        print("%d/%d..." % (i, len(items)), end="\r")

Now, if THAT is your assignment target, are you still as happy as you
had been, or are you assuming that the target is a simple name?

ChrisA

From rosuav at gmail.com  Mon Apr 16 13:42:32 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 17 Apr 2018 03:42:32 +1000
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: 
References: <20180413110432.GB11616@ando.pearwood.info>
 <20180413131859.GE11616@ando.pearwood.info>
 <20180415155805.GI11616@ando.pearwood.info>
Message-ID: 

On Mon, Apr 16, 2018 at 11:05 PM, Mikhail V wrote:
> Let's just return to some of the proposed examples
> (I use "=" in both examples to be less biased here):
>
> 1.
>     if ( match = re.match("foo", S) ) == True:
>         print("match:", match)
>
> 2.
>     if ( re.match("foo", S) = match ) == True:
>         print("match:", match)
>
>
> Now seriously, you may argue around those "pronounce"
> theoretical bla bla, like "take the result and save it in a token".
> But variant 1 is just better, because _it is what it is in Python_.

You start by attempting to be less biased, but you're using existing
Python syntax and then justifying one of the two options because it's
existing Python syntax. I'm not sure that that's a strong argument :)

> So it is better not because it is better looking or whatever,
> it is the same sh** turned around. So just don't turn it around!

Obviously if the chosen token is ":=", it's going to be target first.

> When I read code I don't have all those things
> you describe in a millisecond:
> - look at the pointy end of the operator
> - think, oh this shows to the right
> - seems like I save the value there
> - yep, that's the way I assume things to work
> - stroking the belly
> ....
>
> Instead I just parse visually some smaller parts
> of code, and it's just better if the assignment is in the same
> order as everywhere else.
> Yes, in some single case one order can look better,
> but in this case it's just not good to mix those.

Here are the three most popular syntax options, and how each would be explained:

1) "target := expr" ==> It's exactly the same as other forms of
assignment, only now it's an expression.
2) "expr as name" ==> It's exactly the same as other uses of "as",
only now it's just grabbing the preceding expression, not actually
doing anything with it.
3) "expr -> name" ==> The information went data way.

So either you take a parallel from elsewhere in Python syntax, or you
take a hopefully-intuitive dataflow mnemonic symbol. Take your pick.

ChrisA

From ethan at stoneleaf.us  Mon Apr 16 13:48:11 2018
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 16 Apr 2018 10:48:11 -0700
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: 
References: <20180413110432.GB11616@ando.pearwood.info>
 <20180413131859.GE11616@ando.pearwood.info>
 <20180415155805.GI11616@ando.pearwood.info>
 <20180416155433.GN11616@ando.pearwood.info>
Message-ID: <5AD4E1DB.2020801@stoneleaf.us>

On 04/16/2018 10:36 AM, Chris Angelico wrote:

> Not after it got trimmed, no. Here's what I actually said in my original post:
>
>     while (read_next_item() -> items[i + 1 -> i]) is not None:
>         print("%d/%d..." % (i, len(items)), end="\r")
>
> Now, if THAT is your assignment target, are you still as happy as you
> had been, or are you assuming that the target is a simple name?

I'm okay with it, although I'd still prefer "as". But, really, as long
as we get it* I'll be happy.

--
~Ethan~

* "it" being, of course, assignment-expressions. :)

From ned at nedbatchelder.com  Mon Apr 16 14:09:56 2018
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Mon, 16 Apr 2018 14:09:56 -0400
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: 
References: <20180413110432.GB11616@ando.pearwood.info>
 <20180413131859.GE11616@ando.pearwood.info>
 <20180415155805.GI11616@ando.pearwood.info>
Message-ID: <63119d38-4890-44ad-28ad-15973db81167@nedbatchelder.com>

On 4/16/18 1:42 PM, Chris Angelico wrote:
> 3) "expr -> name" ==> The information went data way.
>
> So either you take a parallel from elsewhere in Python syntax, or you
> take a hopefully-intuitive dataflow mnemonic symbol. Take your pick.

My problem with the "->" option is that function annotations already
use "->" to indicate the return type of a function. This is an
unfortunate parallel from elsewhere in Python syntax, since the meaning
is completely different.

":=" is at least new syntax.

"as" is nice in that it's already used for assignment, but seems to be
causing too much difficulty in parsing, whether by compilers or people.

--Ned.
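
[A short, runnable aside on the clash Ned describes: "->" and "as"
already have settled meanings in today's Python, so any new spelling
would overload one of them. The proposed assignment-expression forms
appear only in comments below, since none of them parse in current
Python; all names are made up for illustration.]

    import re as regex              # "as" already binds names in imports,
                                    # "with ... as ..." and "except ... as ..."

    def double(x: float) -> float:  # "->" already marks a return annotation
        return x * 2.0

    # The three proposed spellings (not valid syntax today):
    #   if (match := regex.match("foo", s)) is not None: ...   # new token
    #   if (regex.match("foo", s) as match) is not None: ...   # reuses "as"
    #   if (regex.match("foo", s) -> match) is not None: ...   # reuses "->"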
From ericfahlgren at gmail.com  Mon Apr 16 14:23:26 2018
From: ericfahlgren at gmail.com (Eric Fahlgren)
Date: Mon, 16 Apr 2018 11:23:26 -0700
Subject: [Python-ideas] Idea: Importing from arbitrary filenames
In-Reply-To: 
References: 
Message-ID: 

On Mon, Apr 16, 2018 at 10:23 AM, Brett Cannon wrote:

> On Mon, 16 Apr 2018 at 09:58 Eric Fahlgren wrote:
>
>> The documentation is pretty opaque or non-existent on other aspects of
>> importlib use, too.
>
> Well, we are diving into the dark corners of import here. (Details can be
> found in the language reference:
> https://docs.python.org/3/reference/import.html)

Thanks, Brett, I'll read through that and see where I get. Those corners
/are/ pretty dark.

The backstory is that I'm doing the final port from Py2 to Py3 (it's been
a long time coming, mostly years of waiting for extension modules to get
ported, notably wxPython and VTK). In Py2, all warnings were enabled and
disallowed, so big surprise on first run: hundreds of lines of the
aforementioned one and "ImportWarning: __package__ != __spec__.parent".
We have manually defined "__package__" all over the place, for reasons
lost in the fog of time, which I believe to be the culprit for the latter
warning.

Eric
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From gadgetsteve at live.co.uk  Mon Apr 16 15:11:55 2018
From: gadgetsteve at live.co.uk (Steve Barnes)
Date: Mon, 16 Apr 2018 19:11:55 +0000
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: 
References: <20180413110432.GB11616@ando.pearwood.info>
 <20180413131859.GE11616@ando.pearwood.info>
 <20180415155805.GI11616@ando.pearwood.info>
Message-ID: 

> Here are the three most popular syntax options, and how each would be explained:
>
> 1) "target := expr" ==> It's exactly the same as other forms of
> assignment, only now it's an expression.
> 2) "expr as name" ==> It's exactly the same as other uses of "as",
> only now it's just grabbing the preceding expression, not actually
> doing anything with it.
> 3) "expr -> name" ==> The information went data way.
>
> So either you take a parallel from elsewhere in Python syntax, or you
> take a hopefully-intuitive dataflow mnemonic symbol. Take your pick.
>
> ChrisA
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

How about "name being expression" - this avoids the already used "as"
while being searchable and reasonably short, and gives a reasonably
clear indication (at least to English speakers) of what is going on. It
can also be typed on an ASCII keyboard without having to have a helper
program or memorise Unicode codes, and can be displayed or printed
without having to install specialised fonts.

If a postfix notation is considered desirable, either instead of or as
well as "being", then possibly another synonym would suit, such as
"expression stored_as name" or "expression storedas name" (no apologies
for the awkward name, as I personally find it an awkward construction,
just like Reverse Polish).

--
Steve (Gadget) Barnes
Any opinions in this message are my personal opinions and do not reflect
those of my employer.

---
This email has been checked for viruses by AVG.
http://www.avg.com

From rosuav at gmail.com  Mon Apr 16 15:20:34 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 17 Apr 2018 05:20:34 +1000
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: 
References: <20180413110432.GB11616@ando.pearwood.info>
 <20180413131859.GE11616@ando.pearwood.info>
 <20180415155805.GI11616@ando.pearwood.info>
Message-ID: 

On Tue, Apr 17, 2018 at 5:11 AM, Steve Barnes wrote:
>
>> Here are the three most popular syntax options, and how each would be explained:
>>
>> 1) "target := expr" ==> It's exactly the same as other forms of
>> assignment, only now it's an expression.
>> 2) "expr as name" ==> It's exactly the same as other uses of "as",
>> only now it's just grabbing the preceding expression, not actually
>> doing anything with it.
>> 3) "expr -> name" ==> The information went data way.
>>
>> So either you take a parallel from elsewhere in Python syntax, or you
>> take a hopefully-intuitive dataflow mnemonic symbol. Take your pick.
>
> How about "name being expression" - this avoids the already used "as"
> while being searchable and reasonably short, and gives a reasonably
> clear indication (at least to English speakers) of what is going on. It
> can also be typed on an ASCII keyboard without having to have a helper
> program or memorise Unicode codes, and can be displayed or printed
> without having to install specialised fonts.
>
> If a postfix notation is considered desirable, either instead of or as
> well as "being", then possibly another synonym would suit, such as
> "expression stored_as name" or "expression storedas name" (no apologies
> for the awkward name, as I personally find it an awkward construction,
> just like Reverse Polish).

IMO searchability isn't enough of an advantage to justify creating a
new keyword, which could potentially break people's code. (I don't
think it'll break the stdlib, but it'll almost certainly break at
least some code out there.) New keywords have an extremely high bar to
reach.

ChrisA

From mikhailwas at gmail.com  Mon Apr 16 15:42:22 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Mon, 16 Apr 2018 22:42:22 +0300
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: 
References: <20180413110432.GB11616@ando.pearwood.info>
 <20180413131859.GE11616@ando.pearwood.info>
 <20180415155805.GI11616@ando.pearwood.info>
Message-ID: 

On Mon, Apr 16, 2018 at 8:42 PM, Chris Angelico wrote:
> On Mon, Apr 16, 2018 at 11:05 PM, Mikhail V wrote:
>
> Here are the three most popular syntax options, and how each would be explained:
>
> 1) "target := expr" ==> It's exactly the same as other forms of
> assignment, only now it's an expression.
> 2) "expr as name" ==> It's exactly the same as other uses of "as",
> only now it's just grabbing the preceding expression, not actually
> doing anything with it.
> 3) "expr -> name" ==> The information went data way.

As I initially said, I just don't find this choice list fair.
It should be:

1) "target = expr" (which would be the first intuitive idea from the
user's PoV; if it can't be made to work, then it should be explained
to Python people why not)
2) "target := expr" (as probably the best close alternative)
3) "target ? expr" (where ? is some other word/character - IIRC
"target from expr" was proposed once)

That's it.
But well, if I compare to your choice list - there is the ":=" option
you have as well :)

From rosuav at gmail.com  Mon Apr 16 15:52:38 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 17 Apr 2018 05:52:38 +1000
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: 
References: <20180413110432.GB11616@ando.pearwood.info>
 <20180413131859.GE11616@ando.pearwood.info>
 <20180415155805.GI11616@ando.pearwood.info>
Message-ID: 

On Tue, Apr 17, 2018 at 5:42 AM, Mikhail V wrote:
> On Mon, Apr 16, 2018 at 8:42 PM, Chris Angelico wrote:
>> On Mon, Apr 16, 2018 at 11:05 PM, Mikhail V wrote:
>>
>> Here are the three most popular syntax options, and how each would be explained:
>>
>> 1) "target := expr" ==> It's exactly the same as other forms of
>> assignment, only now it's an expression.
>> 2) "expr as name" ==> It's exactly the same as other uses of "as",
>> only now it's just grabbing the preceding expression, not actually
>> doing anything with it.
>> 3) "expr -> name" ==> The information went data way.
>
>
> As I initially said, I just don't find this choice list fair.
> It should be:
>
> 1) "target = expr" (which would be the first intuitive idea from the
> user's PoV; if it can't be made to work, then it should be explained
> to Python people why not)

That one is dealt with in the PEP. It is not an option.

> 2) "target := expr" (as probably the best close alternative)

Yep, got that one.

> 3) "target ? expr" (where ? is some other word/character - IIRC
> "target from expr" was proposed once)

... which isn't specific enough to be a front-runner option.

> That's it.
> But well, if I compare to your choice list - there is the ":=" option
> you have as well :)

Yes. I'm not sure what's unfair about the options I've given there.

ChrisA

From danilo.bellini at gmail.com  Mon Apr 16 18:02:59 2018
From: danilo.bellini at gmail.com (Danilo J. S. Bellini)
Date: Mon, 16 Apr 2018 19:02:59 -0300
Subject: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin
In-Reply-To: 
References: 
Message-ID: 

On 16 April 2018 at 10:49, Peter O'Connor wrote:

> Are you able to show how you'd implement the moving average example with
> your package?

Sure! The single pole IIR filter you've shown is implemented here:
https://github.com/danilobellini/pyscanprev/blob/master/examples/iir-filter.rst

> I tried:
>
>     @enable_scan("average")
>     def exponential_moving_average_pyscan(signal, decay, initial=0):
>         yield from ((1-decay)*(average or initial) + decay*x for x in
>                     signal)
>
>     smooth_signal_9 = list(exponential_moving_average_pyscan(signal,
>                                                              decay=decay))[1:]
>
> Which almost gave the right result, but seemed to get the initial
> conditions wrong.

I'm not sure what you were expecting. A sentinel as the first "average"
value? Before the loop begins, this scan-generator just echoes the first
input, like itertools.accumulate. That is, the first value this generator
yields is the first "signal" value, which is then the first "average"
value. To put an initial memory state, you should do something like this
(I've removed the floating point trailing noise):

>>> from pyscanprev import enable_scan, prepend
>>>
>>> @enable_scan("y")
... def iir_filter(signal, decay, memory=0):
...     return ((1 - decay) * y + decay * x for x in prepend(memory, signal))
...
>>> list(iir_filter([1, 2, 3, 2, 1, -1, -2], decay=.1, memory=5))
[5, 4.6, 4.34, 4.206, 3.9854, 3.68686, 3.218174, 2.6963566]

In that example, "y" is the "previous result" (a.k.a. accumulator, or what
had been called "average" here).

--
Danilo J. S. Bellini
---------------
"*It is not our business to set up prohibitions, but to arrive at
conventions.*" (R. Carnap)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tim.peters at gmail.com  Mon Apr 16 20:43:55 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 16 Apr 2018 19:43:55 -0500
Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__
In-Reply-To: <50FEA75C-5F3D-4B99-8B11-E34D8608C33F@gmail.com>
References: <3E8290D6-6244-4F2F-8149-C63C42BC9387@gmail.com>
 <50FEA75C-5F3D-4B99-8B11-E34D8608C33F@gmail.com>
Message-ID: 

[Tim]
>> I also have no problem with inplace operators. Or with adding
>> `Counter /= scalar`, for that matter.

[Raymond]
> But surely __rdiv__() would be over the top, harmonic means be damned ;-)

Agreed - collections.Counter is the poster child for "practicality
beats purity" :-)

In context, the c *= 1 / c.total example would clearly be useful at
times. But it's a strained way to spell

    c /= c.total

and, for float values, the former also introduces a needless rounding
error (to compute the reciprocal).

BTW, if `Counter * scalar` is added, we should think more about
oddball cases. While everyone knows what _they_ mean by "scalar",
Python doesn't. The obvious implementation (Peter already gave it)
would lead to things like `Counter * Counter`, where both
multiplicands have integer values, yielding a Counter whose values
are also Counters. That is, if

    c = Counter(a=1, b=2)
    d = Counter(a=3, b=4)

then c*d would yield a Counter mapping 'a' to 1 * d == d, and 'b' to
2 * d == Counter(a=6, b=8).

That's "bad", because the next suggestion will be that c*d return
Counter(a=3, b=8) instead. That is, map a shared key to the product
of the values associated with that key. For example, `c` is a Counter
tabulating category counts, and `d` a Counter giving category weights.

I don't suggest doing that now, but it would be nice to dream up a way
to stop "things like" Counter * Counter at the start so that backward
compatibility doesn't preclude adding sensible meanings later.

From greg at krypto.org  Mon Apr 16 21:27:30 2018
From: greg at krypto.org (Gregory P. Smith)
Date: Tue, 17 Apr 2018 01:27:30 +0000
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: <63119d38-4890-44ad-28ad-15973db81167@nedbatchelder.com>
References: <20180413110432.GB11616@ando.pearwood.info>
 <20180413131859.GE11616@ando.pearwood.info>
 <20180415155805.GI11616@ando.pearwood.info>
 <63119d38-4890-44ad-28ad-15973db81167@nedbatchelder.com>
Message-ID: 

On Mon, Apr 16, 2018 at 11:11 AM Ned Batchelder wrote:

> On 4/16/18 1:42 PM, Chris Angelico wrote:
> > 3) "expr -> name" ==> The information went data way.
> >
> > So either you take a parallel from elsewhere in Python syntax, or you
> > take a hopefully-intuitive dataflow mnemonic symbol. Take your pick.
>
> My problem with the "->" option is that function annotations already use
> "->" to indicate the return type of a function. This is an unfortunate
> parallel from elsewhere in Python syntax, since the meaning is
> completely different.
>
> ":=" is at least new syntax.
>
> "as" is nice in that it's already used for assignment, but seems to be
> causing too much difficulty in parsing, whether by compilers or people.

FWIW - We used "as" in our Python C++ binding interface description
language in CLIF to denote renaming from the original C++ name to a new
name in Python - effectively an assignment syntax.
https://github.com/google/clif/blob/master/clif/python/primer.md

I currently have a "-0" opinion on the entire PEP 572, as I don't buy
that assignments within expressions are even a good thing to have in the
language. #complexity - Think of people learning the language.

-gps
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From yaoxiansamma at gmail.com  Mon Apr 16 23:09:07 2018
From: yaoxiansamma at gmail.com (Thautwarm Zhao)
Date: Tue, 17 Apr 2018 11:09:07 +0800
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
Message-ID: 

> We have ways of cheating a bit if we want to reinterpret the semantics
> of something that nevertheless parses cleanly - while the parser is
> limited to single token lookahead, it's straightforward for the
> subsequent code generation stage to look a single level down in the
> parse tree and see that the code that parsed as "with expr" is
> actually "with subexpr as target".

It does work; however, I think it sounds like a patch, and it will
definitely block us from making other extensions in the future.

> 3) "target ? expr" (where ? is some other word/character - IIRC
> "target from expr" was proposed once)

A more popular convention is to mark `?` as handling boolean variables,
so `target ? expr` could mean `expr if target else target`. Other
proposals for null/boolean checking might need `?`; let's preserve the
`?` character for further development.

> How about "name being expression" - this avoids the already used "as"
> while being searchable and reasonably short, and gives a reasonably
> clear indication (at least to English speakers) of what is going on. It
> can also be typed on an ASCII keyboard without having to have a helper
> program or memorise Unicode codes, and can be displayed or printed
> without having to install specialised fonts.

It would make sense if we didn't have a long history of Python
programming... A new keyword would be something very dangerous, because
it would just break any existing library using the keyword as an
identifier.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ethan at stoneleaf.us  Mon Apr 16 23:54:50 2018
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 16 Apr 2018 20:54:50 -0700
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To: 
References: 
Message-ID: <5AD5700A.1030003@stoneleaf.us>

On 04/10/2018 10:32 PM, Chris Angelico wrote:

> PEP: 572
> Title: Assignment Expressions
> Author: Chris Angelico

Chris, I think you've gotten all the feedback you need. Pick a symbol
(I'd say either ":=" or "as"), and slap this puppy over onto python-dev.

--
~Ethan~

From guido at python.org  Tue Apr 17 00:59:31 2018
From: guido at python.org (Guido van Rossum)
Date: Mon, 16 Apr 2018 21:59:31 -0700
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: 
References: 
Message-ID: 

On Mon, Apr 16, 2018 at 8:09 PM, Thautwarm Zhao wrote:

> > 3) "target ? expr" (where ? is some other word/character - IIRC
> > "target from expr" was proposed once)
>
> A more popular convention is to mark `?` as handling boolean variables,
> so `target ? expr` could mean `expr if target else target`. Other
> proposals for null/boolean checking might need `?`; let's preserve the
> `?` character for further development.

The only acceptable use of ? is formulated in PEP 505.

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rosuav at gmail.com  Tue Apr 17 01:30:06 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 17 Apr 2018 15:30:06 +1000
Subject: [Python-ideas] PEP 572: Assignment Expressions (post #4)
In-Reply-To: <5AD5700A.1030003@stoneleaf.us>
References: <5AD5700A.1030003@stoneleaf.us>
Message-ID: 

On Tue, Apr 17, 2018 at 1:54 PM, Ethan Furman wrote:
> On 04/10/2018 10:32 PM, Chris Angelico wrote:
>
>> PEP: 572
>> Title: Assignment Expressions
>> Author: Chris Angelico
>
>
> Chris, I think you've gotten all the feedback you need. Pick a symbol (I'd
> say either ":=" or "as"), and slap this puppy over onto python-dev.

Yep, sounds about right. I'm actually still toying with a couple of
points of grammar to see if I can smooth out some of the annoying bits,
and then I think the proposal is done with -ideas.

ChrisA

From derekamaciel at gmail.com  Tue Apr 17 02:21:08 2018
From: derekamaciel at gmail.com (Derek Maciel)
Date: Tue, 17 Apr 2018 02:21:08 -0400
Subject: [Python-ideas] Providing a public API for creating and parsing HTTP messages
Message-ID: 

Hello all,

If this is not the appropriate place for this type of proposal please
let me know.

The modules http.client and http.server both do a wonderful job when
implementing HTTP clients and servers, respectively. However,
occasionally there may be a need to create and parse HTTP messages
themselves without needing to implement a client or server.

While http.client and http.server clearly have solved this problem for
their own uses, there is little to no public API exposed to a user
wishing to do the same. For instance, the documentation for
http.client.HTTPMessage [1] simply states that it extends from
email.message.Message in order to parse the messages.

[1]: https://docs.python.org/3/library/http.client.html#httpmessage-objects

While this need may admittedly be a niche one, it is a shame that one
cannot simply use the functionality already implemented for http.client
and http.server for this purpose. Therefore, I am proposing this module
be improved such that the creation and parsing of HTTP messages be
"pulled out of" the existing code into HTTPRequest and HTTPResponse
classes with a public API for this purpose.

The exact API is subject to change of course, but it could work like:

>>> r = http.client.HTTPRequest("POST", "/foo", "Hello world!")
>>> r.method
'POST'
>>> r.body
'Hello world!'
>>> print(r)
POST /foo HTTP/1.1
Content-Length: 12

Hello world!

If this idea is not a completely bad one, I am very interested in
discussing in further detail how I (or perhaps someone else) could
begin to implement these changes.

Thank you for reading.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mikhailwas at gmail.com  Tue Apr 17 03:11:04 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Tue, 17 Apr 2018 10:11:04 +0300
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: 
References: 
Message-ID: 

On Tue, Apr 17, 2018 at 6:09 AM, Thautwarm Zhao wrote:
>
> > 3) "target ? expr" (where ? is some other word/character - IIRC
> > "target from expr" was proposed once)
>
> A more popular convention is to mark `?` as handling boolean variables,
> so `target ? expr` could mean `expr if target else target`. Other
> proposals for null/boolean checking might need `?`; let's preserve the
> `?` character for further development.

Hey! I did not propose "?". Read the explanation in parentheses.
My whole idea was that any option could be viable, as long as
it does not propose reversed order notation.

But anyway ":=" is better than any keyword imo.


Mikhail

From njs at pobox.com  Tue Apr 17 04:19:16 2018
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 17 Apr 2018 01:19:16 -0700
Subject: [Python-ideas] Providing a public API for creating and parsing HTTP messages
In-Reply-To: 
References: 
Message-ID: 

On Mon, Apr 16, 2018 at 11:21 PM, Derek Maciel wrote:
> The modules http.client and http.server both do a wonderful job when
> implementing HTTP clients and servers, respectively. However,
> occasionally there may be a need to create and parse HTTP messages
> themselves without needing to implement a client or server.

The way http.client/http.server are written, the code for creating and
parsing messages is very tangled up with the code for sending and
receiving data, so this wouldn't be easy to do without rewriting them
from scratch. But would you accept a third-party package?

https://h11.readthedocs.io

-n

--
Nathaniel J. Smith -- https://vorpus.org

From yaoxiansamma at gmail.com  Tue Apr 17 08:59:27 2018
From: yaoxiansamma at gmail.com (Thautwarm Zhao)
Date: Tue, 17 Apr 2018 20:59:27 +0800
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
Message-ID: 

> Hey! I did not propose "?". Read the explanation in parentheses.
> My whole idea was that any option could be viable, as long as
> it does not propose reversed order notation.

Dear Mikhail, so sorry about my misunderstanding, I should have read the
post with fewer presuppositions...

Back to the topic: this thread seems to be closed now, and in my opinion
`:=` could be, all things considered, the best.

2018-04-17 15:11 GMT+08:00 Mikhail V :

> On Tue, Apr 17, 2018 at 6:09 AM, Thautwarm Zhao
> wrote:
> >
> > > 3) "target ? expr" (where ? is some other word/character - IIRC
> > > "target from expr" was proposed once)
> >
> > A more popular convention is to mark `?` as handling boolean variables,
> > so `target ? expr` could mean `expr if target else target`. Other
> > proposals for null/boolean checking might need `?`; let's preserve the
> > `?` character for further development.
>
> Hey! I did not propose "?". Read the explanation in parentheses.
> My whole idea was that any option could be viable, as long as
> it does not propose reversed order notation.
>
> But anyway ":=" is better than any keyword imo.
>
>
> Mikhail
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info  Tue Apr 17 09:13:19 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 17 Apr 2018 23:13:19 +1000
Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
In-Reply-To: 
References: <20180413110432.GB11616@ando.pearwood.info>
 <20180413131859.GE11616@ando.pearwood.info>
 <20180415155805.GI11616@ando.pearwood.info>
 <20180416155433.GN11616@ando.pearwood.info>
Message-ID: <20180417131318.GO11616@ando.pearwood.info>

On Tue, Apr 17, 2018 at 03:36:34AM +1000, Chris Angelico wrote:

[further aggressive snippage]  *wink*

> > Chris, I must admit that I'm utterly perplexed at this. Your example is
> > as far as from a complex assignment target as you can possibly get. It's
> > a simple name!
> >
> >     i := i + 1
> >
> > The target is just "i", a name.
>
> Thanks so much for the aggressively-trimmed quote. For once, though,
> TOO aggressive. What you're focusing on is the *unrolled* version of a
> two-line loop. Look at the actual loop, please, and respond to the
> actual question. :|
Ah, I never even picked up on the idea that the previous while loop was
connected to the following bunch of calls to input(). The code didn't
seem to be related: the first was already using -> syntax and did not
use input(), the second used := syntax and did.

> >> The calls to input were in a while loop's header for a reason.
> >> Ignoring them is ignoring the point of assignment expressions.
> >
> > What while loop? Your example has no while loop.
>
> Not after it got trimmed, no. Here's what I actually said in my original post:
>
>     while (read_next_item() -> items[i + 1 -> i]) is not None:
>         print("%d/%d..." % (i, len(items)), end="\r")
>
> Now, if THAT is your assignment target, are you still as happy as you
> had been, or are you assuming that the target is a simple name?

I'll give the answer I would have given before I read Nick's comments
over on Python-Dev: sure, I'm happy, and no, I'm not assuming the
target is a simple name.

But having seen Nick's response on Python-Dev, I'm now wondering
whether this should be limited to simple names. Further discussion in
reply to Nick's post over on Python-Dev please.

--
Steve

From raymond.hettinger at gmail.com  Wed Apr 18 11:33:03 2018
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Wed, 18 Apr 2018 08:33:03 -0700
Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__
In-Reply-To: 
References: <3E8290D6-6244-4F2F-8149-C63C42BC9387@gmail.com>
 <50FEA75C-5F3D-4B99-8B11-E34D8608C33F@gmail.com>
Message-ID: <14A96923-B0C2-46FB-99A2-87C1AEB735B1@gmail.com>

> On Apr 16, 2018, at 5:43 PM, Tim Peters wrote:
>
> BTW, if `Counter * scalar` is added, we should think more about
> oddball cases. While everyone knows what _they_ mean by "scalar",
> Python doesn't.

I've started working on an implementation and several choices arise:

1) Reject scalar with a TypeError if scalar is a Counter
2) Reject scalar with a TypeError if scalar is a Mapping
3) Reject scalar with a TypeError if scalar is a Collection
4) Reject scalar with a TypeError if scalar is Sized (has a __len__ method).

I lean toward rejecting all things Sized because _everyone_ knows that
scalars aren't sized ;-)


Raymond

From tim.peters at gmail.com  Wed Apr 18 12:05:43 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 18 Apr 2018 11:05:43 -0500
Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__
In-Reply-To: <14A96923-B0C2-46FB-99A2-87C1AEB735B1@gmail.com>
References: <3E8290D6-6244-4F2F-8149-C63C42BC9387@gmail.com>
 <50FEA75C-5F3D-4B99-8B11-E34D8608C33F@gmail.com>
 <14A96923-B0C2-46FB-99A2-87C1AEB735B1@gmail.com>
Message-ID: 

[Raymond]
> I've started working on an implementation and several choices arise:
>
> 1) Reject scalar with a TypeError if scalar is a Counter
> 2) Reject scalar with a TypeError if scalar is a Mapping
> 3) Reject scalar with a TypeError if scalar is a Collection
> 4) Reject scalar with a TypeError if scalar is Sized (has a __len__ method).
>
> I lean toward rejecting all things Sized because _everyone_ knows that
> scalars aren't sized ;-)

Hard to know how gonzo to get :-(

    _Scalar = (Sized, Container, Iterable)  # has __len__, __getitem__, or __iter__
    ...
    if isinstance(arg, _Scalar):
        raise TypeError
    ...

would also reject things like generator expressions. But ... those
would blow up anyway, when multiplication was attempted. So, ya!
Sticking to Sized sounds good :-)

From encukou at gmail.com  Wed Apr 18 12:13:21 2018
From: encukou at gmail.com (Petr Viktorin)
Date: Wed, 18 Apr 2018 18:13:21 +0200
Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__
In-Reply-To: <14A96923-B0C2-46FB-99A2-87C1AEB735B1@gmail.com>
References: <3E8290D6-6244-4F2F-8149-C63C42BC9387@gmail.com>
 <50FEA75C-5F3D-4B99-8B11-E34D8608C33F@gmail.com>
 <14A96923-B0C2-46FB-99A2-87C1AEB735B1@gmail.com>
Message-ID: 

On 04/18/18 17:33, Raymond Hettinger wrote:
>
>
>> On Apr 16, 2018, at 5:43 PM, Tim Peters wrote:
>>
>> BTW, if `Counter * scalar` is added, we should think more about
>> oddball cases. While everyone knows what _they_ mean by "scalar",
>> Python doesn't.
>
> I've started working on an implementation and several choices arise:
>
> 1) Reject scalar with a TypeError if scalar is a Counter
> 2) Reject scalar with a TypeError if scalar is a Mapping
> 3) Reject scalar with a TypeError if scalar is a Collection
> 4) Reject scalar with a TypeError if scalar is Sized (has a __len__ method).

Why is Iterable (__iter__) not on the list?

(Apologies if I missed this somewhere in the conversation.)

>
> I lean toward rejecting all things Sized because _everyone_ knows that
> scalars aren't sized ;-)
>
>
> Raymond
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>

From storchaka at gmail.com  Wed Apr 18 12:26:44 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 18 Apr 2018 19:26:44 +0300
Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__
In-Reply-To: 
References: 
Message-ID: 

16.04.18 08:07, Tim Peters wrote:
> Adding Counter * integer doesn't bother me a bit, but the definition
> of what that should compute isn't obvious. In particular, that
> implementation doesn't preserve that `x+x == 2*x` if x has any
> negative values:
>
>>>> x = Counter(a=-1)
>>>> x
> Counter({'a': -1})
>>>> x+x
> Counter()
>
> It would be strange if x+x != 2*x, and if x*-1 != -x:
>
>>>> y = Counter(a=1)
>>>> y
> Counter({'a': 1})
>>>> -y
> Counter()
>
> Etc.
>
> Then again, it's already the case that, e.g., x-y isn't always the
> same as x + -y:
>
>>>> x = Counter(a=1)
>>>> y = Counter(a=2)
>>>> x - y
> Counter()
>>>> x + -y
> Counter({'a': 1})
>
> So screw obvious formal identities ;-)
>
> I'm not clear on why "+" and "-" discard keys with values <= 0 to
> begin with. For "-" it's natural enough viewing "-" as being multiset
> difference, but for "+"? That's just made up ;-)
>
> In any case, despite the oddities, I think your implementation would
> be least surprising overall (ignore the sign of the resulting values).
> At least for Counters that actually make sense as multisets (have no
> values <= 0), and for a positive integer multiplier `n > 0`, it does
> preserve that `x*n` = `x + x + ... + x` (with `n` instances of `x`).

There are methods update() and subtract() which are similar to operators
"+" and "-", but don't discard non-positive values.

I expect that "*" and "/" would discard non-positive values for
consistency with "+" and "-". And a new method should be added which
does multiplication without discarding non-positive values.
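
[A minimal sketch of the two behaviors Serhiy contrasts, on a
hypothetical Counter subclass; the method name "scaled" is made up for
illustration and nothing like it exists in collections:]

    from collections import Counter

    class ScalableCounter(Counter):
        def __mul__(self, scalar):
            # like binary "+" and "-": drop results that are <= 0
            result = ScalableCounter()
            for k, v in self.items():
                v = v * scalar
                if v > 0:
                    result[k] = v
            return result

        def scaled(self, scalar):
            # like update()/subtract(): keep every result
            result = ScalableCounter()
            for k, v in self.items():
                result[k] = v * scalar
            return result

    c = ScalableCounter(a=2, b=-1)
    assert c * 3 == ScalableCounter(a=6)              # b's -3 is dropped
    assert c.scaled(3) == ScalableCounter(a=6, b=-3)  # b's -3 is kept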
From tim.peters at gmail.com  Wed Apr 18 14:45:59 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 18 Apr 2018 13:45:59 -0500
Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__
In-Reply-To: 
References: <3E8290D6-6244-4F2F-8149-C63C42BC9387@gmail.com>
 <50FEA75C-5F3D-4B99-8B11-E34D8608C33F@gmail.com>
 <14A96923-B0C2-46FB-99A2-87C1AEB735B1@gmail.com>
Message-ID: 

[Raymond]
>> I've started working on an implementation and several choices arise:
>>
>> 1) Reject scalar with a TypeError if scalar is a Counter
>> 2) Reject scalar with a TypeError if scalar is a Mapping
>> 3) Reject scalar with a TypeError if scalar is a Collection
>> 4) Reject scalar with a TypeError if scalar is Sized (has a __len__
>> method).

[Petr Viktorin ]
> Why is Iterable (__iter__) not on the list?
>
> (Apologies if I missed this somewhere in the conversation.)

I believe Raymond implicitly meant "test one of the above", not "test
all of the above", and he's leaning toward Sized alone.

What we're trying to stop is things like "Counter * Counter" for now,
because the obvious implementation(*) of Counter.__mul__ would do a
strange thing with that, where a quite different thing is plausibly
wanted (and may - or may not - be added later - but, due to backward
compatibility, cannot be added later if the initial implementation
does the strange thing).

Rejecting a Sized argument for now would stop that. Piling on
additional tests could stop other things "early", but every test added
slows the normal case (the argument is fine).

In the case of an Iterable `arg` that's not Sized, it seems highly
unlikely that arg.__mul__ or arg.__rmul__ exist, so the obvious
implementation would blow up later without bothering to check in
advance:

>>> x = (i for i in range(10))
>>> 3 * x
Traceback (most recent call last):
  ...
TypeError: unsupported operand type(s) for *: 'int' and 'generator'


(*) The obvious implementation:

    def __mul__(self, other):
        if isinstance(other, Sized):
            raise TypeError("cannot multiply Counter by Sized type %s"
                            % type(other))
        result = Counter()
        for k, v in self.items():
            result[k] = v * other
        return result

From wes.turner at gmail.com  Wed Apr 18 15:33:57 2018
From: wes.turner at gmail.com (Wes Turner)
Date: Wed, 18 Apr 2018 15:33:57 -0400
Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__
In-Reply-To: 
References: <3E8290D6-6244-4F2F-8149-C63C42BC9387@gmail.com>
 <50FEA75C-5F3D-4B99-8B11-E34D8608C33F@gmail.com>
 <14A96923-B0C2-46FB-99A2-87C1AEB735B1@gmail.com>
Message-ID: 

The use cases these TypeErrors would exclude include weightings (whether
from a generator without an intermediate tuple/list or from a dict)
where the need is to do elementwise multiplication:

    if len(self) != len(other):
        raise ValueError("tensors are not multiplicable")
    if self.keys() != other.keys():  # ordered comparison only for odict
        # (is it this function's responsibility to check this contract?)
        if set(self.keys()).difference(other.keys()):
            raise ValueError("tensors do not share the same keys")
    for (k1, v1), (k2, v2) in zip(self.items(), other.items()):
        self[k1] = v1*v2

At which point we might as well just introduce a sparse labeled array
type that's interface-compatible with np.array and/or pd.Series(index=)
with a @classmethod Counter initializer that works like
collections.Counter.

(see links above)

On Wednesday, April 18, 2018, Tim Peters wrote:

> [Raymond]
> >> I've started working on an implementation and several choices arise:
> >>
> >> 1) Reject scalar with a TypeError if scalar is a Counter
> >> 2) Reject scalar with a TypeError if scalar is a Mapping
> >> 3) Reject scalar with a TypeError if scalar is a Collection
> >> 4) Reject scalar with a TypeError if scalar is Sized (has a __len__
> >> method).
>
> [Petr Viktorin ]
> > Why is Iterable (__iter__) not on the list?
> >
> > (Apologies if I missed this somewhere in the conversation.)
>
> I believe Raymond implicitly meant "test one of the above", not "test
> all of the above", and he's leaning toward Sized alone.
>
> What we're trying to stop is things like "Counter * Counter" for now,
> because the obvious implementation(*) of Counter.__mul__ would do a
> strange thing with that, where a quite different thing is plausibly
> wanted (and may - or may not - be added later - but, due to backward
> compatibility, cannot be added later if the initial implementation
> does the strange thing).
>
> Rejecting a Sized argument for now would stop that. Piling on
> additional tests could stop other things "early", but every test added
> slows the normal case (the argument is fine).
>
> In the case of an Iterable `arg` that's not Sized, it seems highly
> unlikely that arg.__mul__ or arg.__rmul__ exist, so the obvious
> implementation would blow up later without bothering to check in
> advance:
>
> >>> x = (i for i in range(10))
> >>> 3 * x
> Traceback (most recent call last):
> ...
> TypeError: unsupported operand type(s) for *: 'int' and 'generator'
>
>
> (*) The obvious implementation:
>
>     def __mul__(self, other):
>         if isinstance(other, Sized):
>             raise TypeError("cannot multiply Counter by Sized type %s"
>                             % type(other))
>         result = Counter()
>         for k, v in self.items():
>             result[k] = v * other
>         return result
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tim.peters at gmail.com  Wed Apr 18 15:34:15 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 18 Apr 2018 14:34:15 -0500
Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__
In-Reply-To: 
References: 
Message-ID: 

[Serhiy Storchaka ]
> There are methods update() and subtract() which are similar to operators
> "+" and "-", but don't discard non-positive values.

Yup.

> I expect that "*" and "/" would discard non-positive values for
> consistency with "+" and "-". And a new method should be added which
> does multiplication without discarding non-positive values.

Counter supports a wonderfully weird mix of methods driven by use
cases, not by ideology.

    + (binary)
    - (binary)
    |
    &

have semantics driven by viewing a Counter as a multiset
implementation. That's why they discard values <= 0. They
correspond, respectively, to "the standard" multiset operations of sum
(disjoint union), difference, union, and intersection.

That the unary versions of '+' and '-' also discard values <= 0 is
justified by saying "because they're shorthand for what the binary
operator does when given an empty Counter as the left argument", but
they're not standard multiset operations on their own.

Nothing else in Counter is trying to cater to the multiset view, but
to other use cases. And that's why "*" and "/" should do what
everyone _expects_ them to do ;-) There are no analogous multiset
operations to justify them caring at all what the values are.

If Raymond had it to do over again, I'd suggest that only "-" discard
values <= 0. The other operators deliver legit(*) multisets _given_
that their arguments are legit multisets - only "-" has to care about
creating an illegitimate (for a multiset) value from legit multiset
arguments.

But there's no good reason for "*" or "/" to care at all. They
don't make sense for multisets. After, e.g.,

    c /= sum(c.values())

it's sane to expect that the new sum(c.values()) is close to 1
regardless of the numeric types or signs of the original values.
Indeed, normalizing values so that their sum _is_ close to 1 is a
primary use case motivating the current change.

Note I suggested before rearranging the docs to make clear that the
multiset view is just a part of what Counter is intended to be used
for, and that only a handful of specific operations are intended to
support it.

(*) "legit" meaning that all values are integers > 0

From tim.peters at gmail.com  Wed Apr 18 15:42:25 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 18 Apr 2018 14:42:25 -0500
Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__
In-Reply-To: 
References: <3E8290D6-6244-4F2F-8149-C63C42BC9387@gmail.com>
 <50FEA75C-5F3D-4B99-8B11-E34D8608C33F@gmail.com>
 <14A96923-B0C2-46FB-99A2-87C1AEB735B1@gmail.com>
Message-ID: 

[Wes Turner ]
> The use cases these TypeErrors would exclude include weightings (whether
> from a generator without an intermediate tuple/list or from a dict) where
> the need is to do elementwise multiplication:

Yes. That's exactly what I had in mind when I wrote:

>> What we're trying to stop is things like "Counter * Counter" for now,
>> because the obvious implementation(*) of Counter.__mul__ would do a
>> strange thing with that, where a quite different thing is plausibly
>> wanted (and may - or may not - be added later - but, due to backward
>> compatibility, cannot be added later if the initial implementation
>> does the strange thing).

The obvious implementation (already given (*)) of Counter.__mul__ would
NOT AT ALL do elementwise multiplication. Nobody has asked for that
yet, so nobody is proposing to add it now either. But it's predictable
that someone _will_ ask for it when __mul__ is defined for "scalars".

It may or may not be added at that time. But it will be flat-out
impossible to add it later (because of backward compatibility) _if_
the initial implementation does something entirely different. So, at
first, we want to raise an exception for a non-"scalar" argument, so
that it remains _possible_ to do something sane with it later.

...

>> (*) The obvious implementation:
>>
>>     def __mul__(self, other):
>>         if isinstance(other, Sized):
>>             raise TypeError("cannot multiply Counter by Sized type %s"
>>                             % type(other))
>>         result = Counter()
>>         for k, v in self.items():
>>             result[k] = v * other
>>         return result

From storchaka at gmail.com  Wed Apr 18 16:24:13 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 18 Apr 2018 23:24:13 +0300
Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__
In-Reply-To: 
References: 
Message-ID: 

18.04.18 22:34, Tim Peters wrote:
> Counter supports a wonderfully weird mix of methods driven by use
> cases, not by ideology.
>
>     + (binary)
>     - (binary)
>     |
>     &
>
> have semantics driven by viewing a Counter as a multiset
> implementation. That's why they discard values <= 0. They
> correspond, respectively, to "the standard" multiset operations of sum
> (disjoint union), difference, union, and intersection.

This explains only why binary "-" discards non-positive values and "&"
discards keys that are only in one Counter. Multisets contain only
positive counts.

> Nothing else in Counter is trying to cater to the multiset view, but
> to other use cases. And that's why "*" and "/" should do what
> everyone _expects_ them to do ;-) There are no analogous multiset
> operations to justify them caring at all what the values are.

Doesn't everyone expect that x*2 == x + x? Isn't this the definition of
multiplication? And when we have a multiplication, it can be
generalized to division.

> But there's no good reason for "*" or "/" to care at all. They
> don't make sense for multisets.

I disagree. "+" and "*" are defined for sequences, and these operations
can be defined for multisets in terms of sequences of their elements.

    x + y = multiset(x.elements() + y.elements())
    x * n = multiset(x.elements() * n)

> After, e.g.,
>
>     c /= sum(c.values())
>
> it's sane to expect that the new sum(c.values()) is close to 1
> regardless of the numeric types or signs of the original values.
> Indeed, normalizing values so that their sum _is_ close to 1 is a
> primary use case motivating the current change.

If there are negative values, then their sum can be very small, and the
relative error of the sum can be large. Dividing by it can result in
values with large magnitude, significantly larger than 1, and large
errors.

What is the use case for dividing a Counter with negative values by the
sum of its values?

From storchaka at gmail.com  Wed Apr 18 16:29:04 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 18 Apr 2018 23:29:04 +0300
Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__
In-Reply-To: 
References: <3E8290D6-6244-4F2F-8149-C63C42BC9387@gmail.com>
 <50FEA75C-5F3D-4B99-8B11-E34D8608C33F@gmail.com>
 <14A96923-B0C2-46FB-99A2-87C1AEB735B1@gmail.com>
Message-ID: 

18.04.18 21:45, Tim Peters wrote:
> (*) The obvious implementation:
>
>     def __mul__(self, other):
>         if isinstance(other, Sized):
>             raise TypeError("cannot multiply Counter by Sized type %s"
>                             % type(other))

Wouldn't it be better to return NotImplemented here?

>         result = Counter()
>         for k, v in self.items():
>             result[k] = v * other
>         return result

If we discard non-positive values, this will automatically make
multiplying Counter by Counter (or by sequence) invalid, because they
are not comparable with 0.

    for k, v in self.items():
        v = v * other
        if v > 0:
            result[k] = v

From tim.peters at gmail.com  Wed Apr 18 16:55:01 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 18 Apr 2018 15:55:01 -0500
Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__
In-Reply-To: 
References: 
Message-ID: 

[Tim]
>> Counter supports a wonderfully weird mix of methods driven by use
>> cases, not by ideology.
>>
>>     + (binary)
>>     - (binary)
>>     |
>>     &
>>
>> have semantics driven by viewing a Counter as a multiset
>> implementation. That's why they discard values <= 0. They
>> correspond, respectively, to "the standard" multiset operations of sum
>> (disjoint union), difference, union, and intersection.

[Serhiy Storchaka ]
> This explains only why binary "-" discards non-positive values and "&"
> discards keys that are only in one Counter. Multisets contain only positive
> counts.

As I said later, if Raymond had it to do over again, I'd suggest that
only "-" special-case values <= 0. We have what we have now. Perhaps
he had other use cases in mind too - I don't know about that.

>> Nothing else in Counter is trying to cater to the multiset view, but
>> to other use cases. And that's why "*" and "/" should do what
>> everyone _expects_ them to do ;-) There are no analogous multiset
>> operations to justify them caring at all what the values are.

> Doesn't everyone expect that x*2 == x + x?

As shown in earlier messages, it's already the case that, e.g., "x - y"
isn't always the same as "x + -y" for multisets now. It's already too
late to stress about satisfying "obvious" formal identities ;-)

Again, Counter isn't driven by ideology, but by use cases, and it
caters to all kinds of use cases now.

> Isn't this the definition of multiplication?

In some algebraic structures, yes. Same as, e.g., "x - y" can be
"defined by" "x + -y".

> And when we have a multiplication, it can be generalized to division.

In some algebraic structures, yes.

>> But there's no good reason for "*" or "/" to care at all. They
>> don't make sense for multisets.

> I disagree. "+" and "*" are defined for sequences, and these operations can
> be defined for multisets in terms of sequences of their elements.

Ya, but you're just making that up because it suits your current
argument. The mathematical definition of multisets says nothing at
all about "sequences". I used "the standard" earlier as shorthand for
"use Google to find a standard account"; e.g., here:

    http://planetmath.org/operationsonmultisets

Counter implements all and only the multiset operations spelled out
there (or in any number of other standard accounts).

>> After, e.g.,
>>
>>     c /= sum(c.values())
>>
>> it's sane to expect that the new sum(c.values()) is close to 1
>> regardless of the numeric types or signs of the original values.
>> Indeed, normalizing values so that their sum _is_ close to 1 is a
>> primary use case motivating the current change.

> If there are negative values, then their sum can be very small, and the
> relative error of the sum can be large.

So?

> Dividing by it can result in values with large magnitude, significantly
> larger than 1, and large errors.

Likewise: so what? There's no reason to assume that the values aren't,
e.g., fractions.Fractions, where arithmetic is exact. If they're
floats, then _of course_ all kinds of numeric surprises are possible.
But unless you want to claim that float surprises go away if values
<= 0 are thrown away, it's just irrelevant to the case you _were_
arguing. Do you seriously want to argue that

    c /= sum(c.values())

should include negative values in the sum, but then throw away keys
with quotients <= 0 when the division is performed? That's pretty much
incomprehensible.

> What is the use case for dividing a Counter with negative values by the
> sum of its values?

Avoiding the incomprehensible behavior noted just above - for which I'd
like to see a use case too ;-)

But, seriously, no, I don't have a good use for that. It _follows_
from the obvious implementation Peter gave in the thread's first
message, which is in fact obvious to just about everyone else so far.
I can't count _against_ it that

    c /= sum(c.values())
    assert sum(c.values()) == 1

would succeed if the values support exact arithmetic and the original
sum isn't 0.
Has values will always be non-negative, but almost never integers, so almost never _sanely_ viewed as being multisets. Counter doesn't care. And I don't want to get in his way by insisting on some notion of formal consistency in new operations that have nothing to do with multisets except by accident, or by artificially forced contrivance. I've used Counters to represent multisets a fair bit, but have never had the slightest desire to use "times an integer" as a shorthand for repeated addition, and can't even dream up a meaning for what dividing a multiset by an integer (let alone a float!) could return that would make a lick of sense. "Scalar broadcast" makes instant sense in both cases, though, just by dropping the illusion that people using Counters _as multisets_ are interested in these operations at all. From steve at pearwood.info Wed Apr 18 22:32:25 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 19 Apr 2018 12:32:25 +1000 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ In-Reply-To: References: Message-ID: <20180419023224.GR11616@ando.pearwood.info> On Wed, Apr 18, 2018 at 11:24:13PM +0300, Serhiy Storchaka wrote: > Isn't everyone expect that x*2 == x + x? Isn't this the definition of > multiplication? I can't think of any counter-examples, but it wouldn't surprise me even a tiny bit to learn of some. > And when we have a multiplication, it can be generalized > to division. Not always. A counter-example is matrix multiplication, where multiplication is non-commutative: A*B ? B*A and division is not defined at all. Instead, we multiply by the inverse matrix, in the appropriate order if A*B = C then A = C*inv(B) and B = inv(A)*C And of course, when you talk about data types that aren't numeric, division makes even less sense: string multiplication means repetition, but string division has no obvious meaning. -- Steve From tim.peters at gmail.com Thu Apr 19 00:10:01 2018 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 18 Apr 2018 23:10:01 -0500 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ In-Reply-To: <20180419023224.GR11616@ando.pearwood.info> References: <20180419023224.GR11616@ando.pearwood.info> Message-ID: [Serhiy] >> Isn't everyone expect that x*2 == x + x? Isn't this the definition of >> multiplication? [Steven D'Aprano ] > I can't think of any counter-examples, but it wouldn't surprise me even > a tiny bit to learn of some. Sure you can: explain -3.1 * -2.7 = 8.37 in terms of repeated addition ;-) It's not even necessarily true in finite fields. For example, look at the "+" and "*" tables for the 4-element finite field GF(4) here: https://en.wikipedia.org/wiki/Finite_field#Field_with_four_elements For every element x of that field, x+x = 0, so a*(1+a) = 1 in that field can't be obtained by adding any number of a's (or any number of 1+a's).. Adding x to itself just gives x again if an odd number of x's are added together, or 0 if an even number. The sense in which it's always true in fields is technical: first, x*y is guaranteed to be equivalent to repeated addition of x only if y is "an integer"; and, more subtly, "integer" is defined as meaning the result of adding the multiplicative identity any number of times to the additive identity. In GF(4), under that definition, only 0 (the additive identity) and 1 (the multiplicative identity (1) added to the additive identity (0)) are "integers". Attempting to add 1 more times just continues alternating between 0 and 1. 
`a` and `1+a` can't be reached that way, so are not integers, and none of a*(1+a), a*a, (1+a)*a, or (1+a)*(1+a) is the same as adding either operand any number of times. You could nevertheless define x*n as _meaning_ x+x+...+x (n times) for cardinal numbers n, but then n is outside the field.

What's the application to Counter.__mul__? Not much ;-) The current '+' and '-' map _pairs_ of Counters to a Counter. The proposed '*' instead maps a Counter and "a scalar" to a Counter. It's always nice if x*n is the same as repeated addition of x when n is a cardinal integer, but because Counter.__add__ does special stuff specific to the multiset interpretation, the proposed '*' doesn't always do the same _unless_ the Counter is a legitimate multiset. So it still works fine _in_ the multiset view, provided that you're actually working with multiset Counters. But applications seeking to mix Counters and scalars with "*" and "/" don't have multisets in mind at all, so I'd be fine with `Counter * n` not being equivalent to repeated multiset addition even when Counter is a legitimate multiset and `n` is a cardinal. It's just gravy that it _is_ equivalent in that case.

Where it breaks down is that, e.g.,

    >>> a = Counter(b=-100)
    >>> a
    Counter({'b': -100})
    >>> a + a
    Counter()

but the proposed a*2 would return Counter({'b': -200}). But, in that case, `a` wasn't a legit multiset to begin with.

From turnbull.stephen.fw at u.tsukuba.ac.jp Thu Apr 19 02:43:33 2018 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Thu, 19 Apr 2018 15:43:33 +0900 Subject: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4) In-Reply-To: <20180415161559.GJ11616@ando.pearwood.info> References: <20180415161559.GJ11616@ando.pearwood.info> Message-ID: <23256.14997.452822.473881@turnbull.sk.tsukuba.ac.jp>

Steven D'Aprano writes:

> What key combination do I need to type to get ≜ in the following editors
> please? I tried typing \triangleq but all I got was \triangleq.

Your implied point is correct IMO, but all of the editors and applications mentioned that I've used are perfectly happy with any characters delivered by the OS. So your question should be "what key combination do I use ... in the following OSes and/or GUIs:" (I don't have a list). Ditto fonts (MS fonts tend to work crappily on Mac in my experience, at least in Word and Excel). I note that currently there is heated debate on the Fedora lists about distributing the base OS with decent East Asian font support, and I think the majority are *opposed* (on the grounds that the size increase is noticeable). I've also noticed that the recommended fonts on Linux seem to have been in flux for decades, and even differ across distros much of the time.

Either way, teaching how to augment your OS is not the business of Python, so we should stick to a reasonable approximation to the least advanced environment, which is (TA-DA!) that of the U.S.

From mayinbing12 at gmail.com Sat Apr 21 06:25:38 2018 From: mayinbing12 at gmail.com (Yinbin Ma) Date: Sat, 21 Apr 2018 18:25:38 +0800 Subject: [Python-ideas] Checking interned string after stringobjects concat? Message-ID:

Hi all:

I notice that if concatenating two string objects, the PVM will not check the dictionary of interned strings.
For example:

    >>> a = "qwerty"
    >>> b = "qwe"
    >>> c = "rty"
    >>> d = b+c
    >>> id(a)
    4572089736
    >>> id(d)
    4572111176
    >>> e = "".join(["qwe","rty"])
    >>> id(e)
    4546460280

But if concatenating two strings directly, the PVM would check the dictionary:

    >>> a = "qwerty"
    >>> b = "qwe"+"rty"
    >>> id(a)
    4546460112
    >>> id(b)
    4546460112

It happens in Py2 and Py3 both. Is it necessary to fix this bug or not?

Cheers! --- Yinbin -------------- next part -------------- An HTML attachment was scrubbed... URL:

From rosuav at gmail.com Sat Apr 21 06:42:41 2018 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 21 Apr 2018 20:42:41 +1000 Subject: [Python-ideas] Checking interned string after stringobjects concat? In-Reply-To: References: Message-ID:

On Sat, Apr 21, 2018 at 8:25 PM, Yinbin Ma wrote:
> Hi all:
>
> I notice that if concatenating two string objects, the PVM will not check the
> dictionary of interned strings. For example:
>
>>>> a = "qwerty"
>>>> b = "qwe"
>>>> c = "rty"
>>>> d = b+c
>>>> id(a)
> 4572089736
>>>> id(d)
> 4572111176
>>>> e = "".join(["qwe","rty"])
>>>> id(e)
> 4546460280
>
> But if concatenating two strings directly, the PVM would check the dictionary:
>
>>>> a = "qwerty"
>>>> b = "qwe"+"rty"
>>>> id(a)
> 4546460112
>>>> id(b)
> 4546460112
>
> It happens in Py2 and Py3 both.
> Is it necessary to fix this bug or not?

What you're seeing there is actually the peephole optimizer at work. Your assignment to 'b' here is actually the exact same thing as 'a', by the time you get to execution. If you're curious about what's happening, check out the dis.dis() function and have fun! :)

ChrisA

From p.f.moore at gmail.com Sat Apr 21 06:49:16 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 21 Apr 2018 11:49:16 +0100 Subject: [Python-ideas] Checking interned string after stringobjects concat? In-Reply-To: References: Message-ID:

On 21 April 2018 at 11:42, Chris Angelico wrote:
> On Sat, Apr 21, 2018 at 8:25 PM, Yinbin Ma wrote:
>> Hi all:
>>
>> I notice that if concatenating two string objects, the PVM will not check the
>> dictionary of interned strings. For example:
>>
>>>>> a = "qwerty"
>>>>> b = "qwe"
>>>>> c = "rty"
>>>>> d = b+c
>>>>> id(a)
>> 4572089736
>>>>> id(d)
>> 4572111176
>>>>> e = "".join(["qwe","rty"])
>>>>> id(e)
>> 4546460280
>>
>> But if concatenating two strings directly, the PVM would check the dictionary:
>>
>>>>> a = "qwerty"
>>>>> b = "qwe"+"rty"
>>>>> id(a)
>> 4546460112
>>>>> id(b)
>> 4546460112
>>
>> It happens in Py2 and Py3 both.
>> Is it necessary to fix this bug or not?
>
> What you're seeing there is actually the peephole optimizer at work.
> Your assignment to 'b' here is actually the exact same thing as 'a',
> by the time you get to execution. If you're curious about what's
> happening, check out the dis.dis() function and have fun! :)

To clarify, though, this is not a bug. The language doesn't guarantee that the two strings will have the same id, just that they will be equal (in the sense of ==).

Paul

From storchaka at gmail.com Sat Apr 21 08:29:47 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 21 Apr 2018 15:29:47 +0300 Subject: [Python-ideas] Checking interned string after stringobjects concat? In-Reply-To: References: Message-ID:

21.04.18 13:42, Chris Angelico wrote:
> What you're seeing there is actually the peephole optimizer at work.

Since 3.7, constant folding is the AST optimizer's work. The end result is the same in most cases though.

Other optimizations take place here too. Constant strings that look like identifiers (short strings consisting of ASCII alphanumeric characters) are interned in the code object constructor.
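As a sketch of what Chris and Serhiy are describing, one can disassemble the statement directly -- this output is from a recent CPython 3.x; exact offsets and constant indices vary by version:

    >>> import dis
    >>> dis.dis('b = "qwe"+"rty"')
      1           0 LOAD_CONST               0 ('qwerty')
                  2 STORE_NAME               0 (b)
                  4 LOAD_CONST               1 (None)
                  6 RETURN_VALUE

The concatenation is folded into the single constant 'qwerty' before execution, which is why it can be interned just like a plain literal.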
From rosuav at gmail.com Sat Apr 21 10:47:05 2018 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 22 Apr 2018 00:47:05 +1000 Subject: [Python-ideas] Checking interned string after stringobjects concat? In-Reply-To: References: Message-ID:

On Sat, Apr 21, 2018 at 10:29 PM, Serhiy Storchaka wrote:
> 21.04.18 13:42, Chris Angelico wrote:
>>
>> What you're seeing there is actually the peephole optimizer at work.
>
> Since 3.7, constant folding is the AST optimizer's work. The end result is the
> same in most cases though.
>
> Other optimizations take place here too. Constant strings that look like
> identifiers (short strings consisting of ASCII alphanumeric characters) are
> interned in the code object constructor.

Ah, sorry, my bad. Anyhow, it's part of compile-time optimization, which means that it runs the exact same code for both assignments.

ChrisA

From storchaka at gmail.com Sat Apr 21 10:58:24 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 21 Apr 2018 17:58:24 +0300 Subject: [Python-ideas] Checking interned string after stringobjects concat? In-Reply-To: References: Message-ID:

21.04.18 17:47, Chris Angelico wrote:
> On Sat, Apr 21, 2018 at 10:29 PM, Serhiy Storchaka wrote:
>> 21.04.18 13:42, Chris Angelico wrote:
>>>
>>> What you're seeing there is actually the peephole optimizer at work.
>>
>> Since 3.7, constant folding is the AST optimizer's work. The end result is the
>> same in most cases though.
>>
>> Other optimizations take place here too. Constant strings that look like
>> identifiers (short strings consisting of ASCII alphanumeric characters) are
>> interned in the code object constructor.
>
> Ah, sorry, my bad. Anyhow, it's part of compile-time optimization,
> which means that it runs the exact same code for both assignments.

Don't blame yourself for missing details of the implementation of the version that is not released yet. ;-)

From gvanrossum at gmail.com Sat Apr 21 15:12:44 2018 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sat, 21 Apr 2018 19:12:44 +0000 Subject: [Python-ideas] Checking interned string after stringobjects concat? In-Reply-To: References: Message-ID:

But to the OP, this is not considered a bug.

On Sat, Apr 21, 2018, 07:59 Serhiy Storchaka wrote:

> 21.04.18 17:47, Chris Angelico wrote:
> > On Sat, Apr 21, 2018 at 10:29 PM, Serhiy Storchaka wrote:
> >> 21.04.18 13:42, Chris Angelico wrote:
> >>>
> >>> What you're seeing there is actually the peephole optimizer at work.
> >>
> >> Since 3.7, constant folding is the AST optimizer's work. The end result is
> the
> >> same in most cases though.
> >>
> >> Other optimizations take place here too. Constant strings that look
> like
> >> identifiers (short strings consisting of ASCII alphanumeric
> characters) are
> >> interned in the code object constructor.
> >
> > Ah, sorry, my bad. Anyhow, it's part of compile-time optimization,
> > which means that it runs the exact same code for both assignments.
>
> Don't blame yourself for missing details of the implementation of the
> version that is not released yet. ;-)
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
> -------------- next part -------------- An HTML attachment was scrubbed...
URL:

From leewangzhong+python at gmail.com Sun Apr 22 18:53:59 2018 From: leewangzhong+python at gmail.com (Franklin? Lee) Date: Sun, 22 Apr 2018 18:53:59 -0400 Subject: [Python-ideas] collections.Counter should implement __mul__, __rmul__ In-Reply-To: References: Message-ID:

Instead of extending Counter to fit fancier use cases, why not have a new class that is designed for arithmetic? I, for one, would love Numpy-style list and dict classes in the standard library. And they wouldn't be confusingly called Counter, and have strange behaviors with negative values. I only saw discussion about whether or not we want Counter to support Peter's use, but there isn't much talk of supporting it with a new class. If we can get behind a new class, there wouldn't be as much conflict about what to do with the old one.

On Sun, Apr 15, 2018 at 5:05 PM, Peter Norvig wrote:
> For most types that implement __add__, `x + x` is equal to `2 * x`.
>
> That is true for all numbers, list, tuple, str, timedelta, etc. -- but not
> for collections.Counter. I can add two Counters, but I can't multiply one by
> a scalar. That seems like an oversight.
>
> It would be worthwhile to implement multiplication because, among other
> reasons, Counters are a nice representation for discrete probability
> distributions, for which multiplication is an even more fundamental
> operation than addition.
>
> Here's an implementation:
>
>     def __mul__(self, scalar):
>         "Multiply each entry by a scalar."
>         result = Counter()
>         for key in self:
>             result[key] = self[key] * scalar
>         return result
>
>     def __rmul__(self, scalar):
>         "Multiply each entry by a scalar."
>         result = Counter()
>         for key in self:
>             result[key] = scalar * self[key]
>         return result
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From j.van.dorp at deonet.nl Tue Apr 24 08:52:23 2018 From: j.van.dorp at deonet.nl (Jacco van Dorp) Date: Tue, 24 Apr 2018 14:52:23 +0200 Subject: [Python-ideas] Change magic strings to enums Message-ID:

A bit ago I was reading some of the python docs ( https://docs.python.org/3.6/library/warnings.html ), the warning module, and I noticed a table of magic strings.

I can think of a few other places where magic strings are used - for example, string encoding/decoding locales and strictness, and probably a number of other places.

Since Python 3.4, we've had Enums.

Wouldn't it be cleaner to use enums by default instead of those magic strings? For example, for warnings filter actions (section 29.5.2), quite near the top of the page. You could declare in the warnings module:

    class FilterAction(Enum):
        Error = 'error'
        Ignore = 'ignore'
        Always = 'always'
        Default = 'default'
        Module = 'module'
        Once = 'once'

And put in the docs that people should use enums. For as long as a transition period would last, any entrypoint into the module where currently the magic string is used, you could transform it with the single line:

    action = FilterAction(action)  # transforms from old magic string to shiny new enum member

Then whenever enough people have shifted/4.0 comes around, the string argument version can be deprecated.

Pros:
- no magic strings
- more clarity
- easier to get all possible values with for example type checking editors (because of the type annotation)

Cons:
- implementation effort (as long as the modules are in pure Python, I could perhaps help; I'm not too confident about my C skills, though)
Backwards compatibility wouldn't be an issue because of the easy transformation from the old string, as long as we use those strings as the enum values.

ofc, precise names of enums/members is up for debate. I personally prefer the above version to ALLCAPS, but that might be my comparative lack of C experience. These named constants are generally done in all caps, but that's also often because of the lack of a simple enum class.

I tried to search for this, but couldn't find any discussion about it. Apologies if this has been rejected before.

Jacco

From ncoghlan at gmail.com Tue Apr 24 09:58:19 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 24 Apr 2018 23:58:19 +1000 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: References: Message-ID:

On 24 April 2018 at 22:52, Jacco van Dorp wrote:
> A bit ago I was reading some of the python docs (
> https://docs.python.org/3.6/library/warnings.html ), the warning
> module, and I noticed a table of magic strings.
>
> I can think of a few other places where magic strings are used - for
> example, string encoding/decoding locales and strictness, and probably
> a number of other places.
>
> Since Python 3.4, we've had Enums.
>
> Wouldn't it be cleaner to use enums by default instead of those magic
> strings? For example, for warnings filter actions (section 29.5.2),
> quite near the top of the page.

"It's cleaner" isn't a user problem though. The main justification for using enums is that they're easier to interpret in log messages and exception tracebacks than opaque constants, and that argument is much weaker for well-chosen string constants than it is for other constants (like the numeric constants in the socket and errno modules).

For backwards compatibility reasons, we'd want to keep accepting the plain string versions anyway (implicitly converting them to their enum counterparts).

At a less philosophical level, many of the cases where we use magic strings are in code that has to work even when the import system isn't working yet - that's relatively straightforward to achieve when the code is only relying on strings with particular contents, but *much* harder if they're relying on a higher level type like enum objects.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From j.van.dorp at deonet.nl Tue Apr 24 11:06:49 2018 From: j.van.dorp at deonet.nl (Jacco van Dorp) Date: Tue, 24 Apr 2018 17:06:49 +0200 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: References: Message-ID:

I

2018-04-24 15:58 GMT+02:00 Nick Coghlan :
> On 24 April 2018 at 22:52, Jacco van Dorp wrote:
>> Wouldn't it be cleaner to use enums by default instead of those magic
>> strings? For example, for warnings filter actions (section 29.5.2),
>> quite near the top of the page.
>
> "It's cleaner" isn't a user problem though. The main justification for
> using enums is that they're easier to interpret in log messages and
> exception tracebacks than opaque constants, and that argument is much
> weaker for well-chosen string constants than it is for other constants
> (like the numeric constants in the socket and errno modules).
>
> For backwards compatibility reasons, we'd want to keep accepting the
> plain string versions anyway (implicitly converting them to their enum
> counterparts).
> At a less philosophical level, many of the cases where we use magic
> strings are in code that has to work even when the import system isn't
> working yet - that's relatively straightforward to achieve when the
> code is only relying on strings with particular contents, but *much*
> harder if they're relying on a higher level type like enum objects.
>
> Cheers,
> Nick.

I guess we could add inconsistency as a con, then, since the import system might not be working in places where you'd like to use the Enums (or where Python code isn't even executing yet). This would mean that to the casual observer, it'd be arbitrary where they could be used instead.

I wonder how many of these would be in places used by most people, though. I don't mind putting in some time to figure it out, but I have no idea where to start. Is there any easily searchable place where I could scan the standard library for occurrences of magic strings?

From rosuav at gmail.com Tue Apr 24 11:18:10 2018 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 25 Apr 2018 01:18:10 +1000 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: References: Message-ID:

On Wed, Apr 25, 2018 at 1:06 AM, Jacco van Dorp wrote:
> I
>
> 2018-04-24 15:58 GMT+02:00 Nick Coghlan :
>> On 24 April 2018 at 22:52, Jacco van Dorp wrote:
>>> Wouldn't it be cleaner to use enums by default instead of those magic
>>> strings? For example, for warnings filter actions (section 29.5.2),
>>> quite near the top of the page.
>>
>> "It's cleaner" isn't a user problem though. The main justification for
>> using enums is that they're easier to interpret in log messages and
>> exception tracebacks than opaque constants, and that argument is much
>> weaker for well-chosen string constants than it is for other constants
>> (like the numeric constants in the socket and errno modules).
>>
>> For backwards compatibility reasons, we'd want to keep accepting the
>> plain string versions anyway (implicitly converting them to their enum
>> counterparts).
>>
>> At a less philosophical level, many of the cases where we use magic
>> strings are in code that has to work even when the import system isn't
>> working yet - that's relatively straightforward to achieve when the
>> code is only relying on strings with particular contents, but *much*
>> harder if they're relying on a higher level type like enum objects.
>>
>> Cheers,
>> Nick.
>
> I guess we could add inconsistency as a con, then, since the import
> system might not be working in places where you'd like to use the Enums (or
> where Python code isn't even executing yet). This would mean that to the casual
> observer, it'd be arbitrary where they could be used instead.
>
> I wonder how many of these would be in places used by most people,
> though. I don't mind putting in some time to figure it out, but I have
> no idea where to start. Is there any easily searchable place where I
> could scan the standard library for occurrences of magic strings?

First, though, can you enumerate (pun intended) the problems with magic strings? You list "no magic strings" as a benefit, as if it's self-evident; I'm not sure that it is.
ChrisA

From ncoghlan at gmail.com Tue Apr 24 11:49:12 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 25 Apr 2018 01:49:12 +1000 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: References: Message-ID:

On 25 April 2018 at 01:06, Jacco van Dorp wrote:
> I guess we could add inconsistency as a con, then, since the import
> system might not be working in places where you'd like to use the Enums (or
> where Python code isn't even executing yet). This would mean that to the casual
> observer, it'd be arbitrary where they could be used instead.

Running './python -X importtime -Wd -c "pass"' with Python 3.7 gives a pretty decent list of the parts of the standard library that constitute the low level core that we try to keep independent of everything else (there's a slightly smaller core that omits the warning module and its dependencies - leaving "-Wd" off the command line will give that list).

> I wonder how many of these would be in places used by most people,
> though. I don't mind putting in some time to figure it out, but I have
> no idea where to start. Is there any easily searchable place where I
> could scan the standard library for occurrences of magic strings?

Searching the documentation for :data: fields, and then checking those to see which ones had already been converted to enums would likely be your best bet. You wouldn't be able to get a blanket approval for "Let's convert all the magic strings to Enums" though - you'd need to make the case that each addition of a new Enum provided a genuine API improvement for the affected module (e.g. I suspect a plausible case could potentially be made for converting some of the inspect module state introspection APIs over to StringEnum, so it was easier to iterate over the valid states in a consistent way, but even there I'd need to see a concrete proposal before I made up my mind).

Making the case for IntEnum usage tends to be much easier, simply due to the runtime introspection benefits that it brings.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From steve at pearwood.info Tue Apr 24 13:19:02 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 25 Apr 2018 03:19:02 +1000 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: References: Message-ID: <20180424171901.GM11616@ando.pearwood.info>

On Wed, Apr 25, 2018 at 01:18:10AM +1000, Chris Angelico wrote:

> First, though, can you enumerate (pun intended) the problems with
> magic strings? You list "no magic strings" as a benefit, as if it's
> self-evident; I'm not sure that it is.

It shouldn't be self-evident, because the use of strings in the warnings module doesn't match the most common accepted meaning of magic strings.

https://en.wikipedia.org/wiki/Magic_string

Possibly Jacco was thinking of "magic constants":

https://en.wikipedia.org/wiki/Magic_number_%28programming%29#Unnamed_numerical_constants

(although in this case, they're text constants, not numerical). But this seems like a fairly benign example to my eyes: the strings aren't likely to change their values. As discussed here:

https://softwareengineering.stackexchange.com/questions/221034/usage-of-magic-strings-numbers

not all uses of literals are harmful.

A fairly lightweight change would be to add named constants to the warning module:

    ERROR = 'error'

etc, and refactor the module to use the named constants instead of hard-coded strings. I'm surprised that it doesn't already do that.
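As a sketch of that refactoring -- these module-level names are illustrative only and do not exist in the warnings module today:

    # hypothetical additions to Lib/warnings.py
    ERROR = 'error'
    IGNORE = 'ignore'
    ALWAYS = 'always'
    DEFAULT = 'default'
    MODULE = 'module'
    ONCE = 'once'

    def simplefilter(action, category=Warning, lineno=0, append=False):
        # body unchanged; callers may now write simplefilter(ERROR)
        # instead of simplefilter('error'), and old literals keep working
        ...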
That would be 100% backwards compatible, without the cost of importing and creating enums, but allow consumers of the warnings module to inspect it and import symbolic names if they so choose: from warnings import ERROR By the way, I notice that the warnings module makes heavy use of assert to check user-supplied input. That's dangerous since asserts can be disabled, and also poor API design: it means that even if the asserts trigger on error (which isn't guaranteed), they raise the wrong kind of exception: AssertionError instead of TypeError for bad types or ValueError for bad values. -- Steve From solipsis at pitrou.net Tue Apr 24 13:32:22 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 24 Apr 2018 19:32:22 +0200 Subject: [Python-ideas] Change magic strings to enums References: Message-ID: <20180424193222.19f2a2ea@fsol> On Tue, 24 Apr 2018 23:58:19 +1000 Nick Coghlan wrote: > On 24 April 2018 at 22:52, Jacco van Dorp wrote: > > A bit ago I was reading some of the python docs ( > > https://docs.python.org/3.6/library/warnings.html ), the warning > > module, and I noticed a table of magic strings. > > > > I can think of a few other places where magic strings are used - for > > example, string encoding/decoding locales and strictness, and probably > > a number of other places. > > > > Since Python 3.4, We've been having Enums. > > > > Wouldn't it be cleaner to use enums by default instead of those magic > > strings ? for example, for warnings filter actions, (section 29.5.2), > > quite near the top of the page. > > "It's cleaner" isn't a user problem though. The main justification for > using enums is that they're easier to interpret in log messages and > expection tracebacks than opaque constants, and that argument is much > weaker for well-chosen string constants than it is for other constants > (like the numeric constants in the socket and errno modules). Also beware the import time cost of having a widely-used module like "warnings" depend on the "enum" module and its own dependencies. Regards Antoine. From rosuav at gmail.com Tue Apr 24 13:42:27 2018 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 25 Apr 2018 03:42:27 +1000 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: <20180424171901.GM11616@ando.pearwood.info> References: <20180424171901.GM11616@ando.pearwood.info> Message-ID: On Wed, Apr 25, 2018 at 3:19 AM, Steven D'Aprano wrote: > On Wed, Apr 25, 2018 at 01:18:10AM +1000, Chris Angelico wrote: > >> First, though, can you enumerate (pun intended) the problems with >> magic strings? You list "no magic strings" as a benefit, as if it's >> self-evident; I'm not sure that it is. > > It shouldn't be self-evident, because the use of strings in the warnings > module doesn't match the most common accepted meaning of magic strings. > > https://en.wikipedia.org/wiki/Magic_string > > Possibly Jacco was thinking of "magic constants": > > https://en.wikipedia.org/wiki/Magic_number_%28programming%29#Unnamed_numerical_constants > > (although in this case, they're text constants, not numerical). I assumed this to be the case, yes. Many people decry "magic numbers" where the number 7 means one thing, and the number 83 means something else; but I'm less convinced that the textual equivalent is as problematic. (If not "magic strings", what would you call them?) > A fairly lightweight change would be to add named constants to the > warning module: > > ERROR = 'error' > > etc, and refactor the module to use the named constants instead of > hard-coded strings. 
I'm surprised that it doesn't already do that. > > That would be 100% backwards compatible, without the cost of importing > and creating enums, but allow consumers of the warnings module to > inspect it and import symbolic names if they so choose: > > from warnings import ERROR Yeah, that would be one way to do it. But I'd still like to know what problems are being solved by this, as a means of determining whether they're being solved adequately. Is it the risk of misspellings? Because that can happen just as easily with the imported name as with a string literal. (The from-import pollutes your namespace, and "warnings.EROR" is a run-time failure just as much as a misspelled string literal would be.) Is it the need to list all the possible strings? That could be done with something like __future__.all_feature_names or the way the logging module translates level names into numbers. Something else? ChrisA From ethan at stoneleaf.us Tue Apr 24 14:56:55 2018 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 24 Apr 2018 11:56:55 -0700 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: <20180424193222.19f2a2ea@fsol> References: <20180424193222.19f2a2ea@fsol> Message-ID: <5ADF7DF7.3020004@stoneleaf.us> On 04/24/2018 10:32 AM, Antoine Pitrou wrote: > Also beware the import time cost of having a widely-used module like > "warnings" depend on the "enum" module and its own dependencies. With all the recent changes to Python, I should go through and see which dependencies are no longer needed. -- ~Ethan~ From ncoghlan at gmail.com Tue Apr 24 23:04:40 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 25 Apr 2018 13:04:40 +1000 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: <5ADF7DF7.3020004@stoneleaf.us> References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us> Message-ID: On 25 April 2018 at 04:56, Ethan Furman wrote: > On 04/24/2018 10:32 AM, Antoine Pitrou wrote: > >> Also beware the import time cost of having a widely-used module like >> "warnings" depend on the "enum" module and its own dependencies. > > > With all the recent changes to Python, I should go through and see which > dependencies are no longer needed. I was checking this with "./python -X importtime -c 'import enum'", and the overall import time was around 9 ms with a cold disk cache, and 2 ms with a warm one. In both cases, importing "types" and "_collections" accounted for around a 3rd of the time, with the bulk of the execution time being enum's own module level code. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From songofacandy at gmail.com Wed Apr 25 01:03:30 2018 From: songofacandy at gmail.com (INADA Naoki) Date: Wed, 25 Apr 2018 14:03:30 +0900 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us> Message-ID: On Wed, Apr 25, 2018 at 12:04 PM, Nick Coghlan wrote: > On 25 April 2018 at 04:56, Ethan Furman wrote: >> On 04/24/2018 10:32 AM, Antoine Pitrou wrote: >> >>> Also beware the import time cost of having a widely-used module like >>> "warnings" depend on the "enum" module and its own dependencies. >> >> >> With all the recent changes to Python, I should go through and see which >> dependencies are no longer needed. > > I was checking this with "./python -X importtime -c 'import enum'", > and the overall import time was around 9 ms with a cold disk cache, > and 2 ms with a warm one. 
In both cases, importing "types" and > "_collections" accounted for around a 3rd of the time, with the bulk > of the execution time being enum's own module level code. > enum class&member creation cost is much heavier than "import enum" cost. Especially, "import socket, ssl" is much slower than before... -- INADA Naoki From levkivskyi at gmail.com Wed Apr 25 03:57:24 2018 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Wed, 25 Apr 2018 08:57:24 +0100 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us> Message-ID: On 25 April 2018 at 06:03, INADA Naoki wrote: > On Wed, Apr 25, 2018 at 12:04 PM, Nick Coghlan wrote: > > On 25 April 2018 at 04:56, Ethan Furman wrote: > >> On 04/24/2018 10:32 AM, Antoine Pitrou wrote: > >> > >>> Also beware the import time cost of having a widely-used module like > >>> "warnings" depend on the "enum" module and its own dependencies. > >> > >> > >> With all the recent changes to Python, I should go through and see which > >> dependencies are no longer needed. > > > > I was checking this with "./python -X importtime -c 'import enum'", > > and the overall import time was around 9 ms with a cold disk cache, > > and 2 ms with a warm one. In both cases, importing "types" and > > "_collections" accounted for around a 3rd of the time, with the bulk > > of the execution time being enum's own module level code. > > > > enum class&member creation cost is much heavier than "import enum" cost. > Especially, "import socket, ssl" is much slower than before... > > Is it slow simply because we are creating new class objects or EnumMeta.__new__ does some extensive calculations? In the latter case rewriting EnumMeta in C might help. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From J.Demeyer at UGent.be Wed Apr 25 04:02:42 2018 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Wed, 25 Apr 2018 10:02:42 +0200 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: <9c82e8a3496143c087149976fa4cf90d@xmail101.UGent.be> References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us> <9c82e8a3496143c087149976fa4cf90d@xmail101.UGent.be> Message-ID: <5AE03622.7070600@UGent.be> On 2018-04-25 09:57, Ivan Levkivskyi wrote: > In the latter case rewriting EnumMeta in C ... or Cython. It's a great language and I'm sure that the Python standard library could benefit a lot from it. From j.van.dorp at deonet.nl Wed Apr 25 04:06:56 2018 From: j.van.dorp at deonet.nl (Jacco van Dorp) Date: Wed, 25 Apr 2018 10:06:56 +0200 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us> Message-ID: > First, though, can you enumerate (pun intended) the problems with > magic strings? You list "no magic strings" as a benefit, as if it's > self-evident; I'm not sure that it is. > > ChrisA One of my main reasons would be the type-checking from tools like Pycharm, which is the one I use. If I don't remember the exact strings, I won't have to switch to my browser to look up the documentation, but instead I type the enum name, and the typechecker will give me the members with correct spelling - all I need to remember is a vague idea of what option did what. The option names will be reminders instead of the thing to remember. Perhaps the string encode/decode would be a better case, tho. Is it latin 1 or latin-1 ? utf-8 or UTF-8 ? 
They might be fast to look up if you know where to look (probably the top result of googling "python string encoding utf 8", and it's the second and first option respectively IIRC. But I shouldn't -have- to recall correctly), but it's still a lot faster if you can type "Encoding.U" and it gives you the option. I'll go and see if I can make a small list of modules using these kind of strings that aren't of the essential core when I get home this evening. My apologies if magic strings isn't the correct word. Despite that, I believe everyone knows what I intend to say. From rosuav at gmail.com Wed Apr 25 04:30:32 2018 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 25 Apr 2018 18:30:32 +1000 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us> Message-ID: On Wed, Apr 25, 2018 at 6:06 PM, Jacco van Dorp wrote: >> First, though, can you enumerate (pun intended) the problems with >> magic strings? You list "no magic strings" as a benefit, as if it's >> self-evident; I'm not sure that it is. >> >> ChrisA > > One of my main reasons would be the type-checking from tools like > Pycharm, which is the one I use. If I don't remember the exact > strings, I won't have to switch to my browser to look up the > documentation, but instead I type the enum name, and the typechecker > will give me the members with correct spelling - all I need to > remember is a vague idea of what option did what. The option names > will be reminders instead of the thing to remember. > > Perhaps the string encode/decode would be a better case, tho. Is it > latin 1 or latin-1 ? utf-8 or UTF-8 ? They might be fast to look up if > you know where to look (probably the top result of googling "python > string encoding utf 8", and it's the second and first option > respectively IIRC. But I shouldn't -have- to recall correctly), but > it's still a lot faster if you can type "Encoding.U" and it gives you > the option. There are so many encodings that I don't think an enum would be practical. Also, their canonical names are not valid identifiers, so you would have to futz around just as much - is it Encoding.ISO_8859_1 or Encoding.ISO88591 or something else? Perhaps an alternative tool in PyCharm is the solution. There's no reason that you can't have tab completion inside string literals; imagine, for instance, if >> open("/usr/lo << could tab-complete "/usr/local" and let you fill in a valid path name from your file system. Tab-completing a set of common encodings would be reasonably easy. Tab-completing a set of constant strings for the warnings module, even easier. Maybe there could be a way to store this info on a function object, and then PyCharm just has to follow instructions ("this arg of this function uses these strings")? Possibly as a type annotation, even - instead of saying "this takes a string", it can say "this takes a string drawn from these options"? The strings themselves don't have to be any different. ChrisA From j.van.dorp at deonet.nl Wed Apr 25 05:12:42 2018 From: j.van.dorp at deonet.nl (Jacco van Dorp) Date: Wed, 25 Apr 2018 11:12:42 +0200 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us> Message-ID: 2018-04-25 10:30 GMT+02:00 Chris Angelico : > On Wed, Apr 25, 2018 at 6:06 PM, Jacco van Dorp wrote: >>> First, though, can you enumerate (pun intended) the problems with >>> magic strings? 
You list "no magic strings" as a benefit, as if it's
>>> self-evident; I'm not sure that it is.
>>>
>>> ChrisA
>>
>> One of my main reasons would be the type-checking from tools like
>> Pycharm, which is the one I use. If I don't remember the exact
>> strings, I won't have to switch to my browser to look up the
>> documentation, but instead I type the enum name, and the typechecker
>> will give me the members with correct spelling - all I need to
>> remember is a vague idea of what option did what. The option names
>> will be reminders instead of the thing to remember.
>>
>> Perhaps the string encode/decode would be a better case, tho. Is it
>> latin 1 or latin-1 ? utf-8 or UTF-8 ? They might be fast to look up if
>> you know where to look (probably the top result of googling "python
>> string encoding utf 8", and it's the second and first option
>> respectively IIRC. But I shouldn't -have- to recall correctly), but
>> it's still a lot faster if you can type "Encoding.U" and it gives you
>> the option.
>
> There are so many encodings that I don't think an enum would be
> practical. Also, their canonical names are not valid identifiers, so
> you would have to futz around just as much - is it Encoding.ISO_8859_1
> or Encoding.ISO88591 or something else?

Which is where the auto-completion comes in. Type Encoding.IS - and you'd have a list of options. If the function is annotated with the type of the enum, it'll even suggest that to you. I'll freely admit that the number of encodings might make this a bad idea. On the other hand... it might make it a good idea as well. See a list of possibilities, and the IDE can filter it as you type.

IIRC tho, you can add encodings at runtime, while you can't add Enum members. If you actually can, then this might be unsuitable for an Enum solution. (looking at the docs 7.2:codecs though, the error handling strings look...tasty ;p)

> Perhaps an alternative tool in PyCharm is the solution. There's no
> reason that you can't have tab completion inside string literals;
> imagine, for instance, if >> open("/usr/lo << could tab-complete
> "/usr/local" and let you fill in a valid path name from your file
> system. Tab-completing a set of common encodings would be reasonably
> easy. Tab-completing a set of constant strings for the warnings
> module, even easier.

But where would you get a list of these strings, and how'd you define that list of strings? It's currently commonly accepted that annotations are for types, and the later PEPs about the subject seem to assume this without question. Another annotation-like functionality? Hardcode the list of possibilities inside pycharm? Also, perhaps I want to select the string first, store it into a variable, then pass that variable into a function. How would any checker ever know what I'm going to feed that string into? But an Enum, I can use that as a type annotation myself, and then it'll know without question which are legal arguments.

> Maybe there could be a way to store this info on a function object,
> and then PyCharm just has to follow instructions ("this arg of this
> function uses these strings")? Possibly as a type annotation, even -
> instead of saying "this takes a string", it can say "this takes a
> string drawn from these options"? The strings themselves don't have to
> be any different.
>
> ChrisA

Pycharm doesn't execute your code - it scans it. It won't know what you store on a function object.
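For concreteness, a sketch of the Enum-annotated style Jacco is describing -- the names are illustrative, not the warnings API:

    from enum import Enum

    class FilterAction(Enum):
        ERROR = 'error'
        IGNORE = 'ignore'

    def simplefilter(action: FilterAction) -> None:
        ...

    simplefilter(FilterAction.ERROR)  # a static checker can verify this
    simplefilter('eror')              # and flag this as a type error

(The "string drawn from these options" annotation Chris describes is essentially what later became typing.Literal in PEP 586, but that did not exist at the time of this thread.)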
And whether type annotations can mean "a string from either of these options" would require an interpretation of type annotations currently unknown to me - and probably incompatible with what most people currently use them for.

Forgive me if I misunderstand you, but aren't you really just trying to use those strings as enum members when you define a function like "takes one of these strings as argument"? Because as far as I know, apart from some fluff, that's exactly what enums are and are intended for - a unique set of keys that all have special meaning.

From rosuav at gmail.com Wed Apr 25 05:46:20 2018 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 25 Apr 2018 19:46:20 +1000 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us> Message-ID:

On Wed, Apr 25, 2018 at 7:12 PM, Jacco van Dorp wrote:
> Pycharm doesn't execute your code - it scans it. It won't know what you
> store on a function object.

How does it currently know what type something is? If you define something using an enum, how is PyCharm going to know what the members of that enum are?

> Forgive me if I misunderstand you, but aren't you really just trying
> to use those strings as enum members when you define a function like
> "takes one of these strings as argument"? Because as far as I know,
> apart from some fluff, that's exactly what enums are and are intended
> for - a unique set of keys that all have special meaning.

That's *one of* the things you can do with an enum. There are many other features of enumerations, and I'm trying to boil your idea down to its most compact form. You don't need all the power of Enum - you just want to be able to list off the valid options.

ChrisA

From storchaka at gmail.com Wed Apr 25 06:03:31 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 25 Apr 2018 13:03:31 +0300 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us> Message-ID:

25.04.18 10:57, Ivan Levkivskyi wrote:
> On 25 April 2018 at 06:03, INADA Naoki
> > wrote:
> enum class&member creation cost is much heavier than "import enum" cost.
> Especially, "import socket, ssl" is much slower than before...
>
> Is it slow simply because we are creating new class objects or
> EnumMeta.__new__ does
> some extensive calculations? In the latter case rewriting EnumMeta in C
> might help.

Creating a new function is very cheap -- just around 50 ns on my computer.

Creating a new class is over two orders of magnitude more costly -- around 7 us for an empty class on my computer.

Creating a new Enum class is much more costly -- around 40 us for an empty class (or 50 us for IntEnum) plus 7 us per member.

Creating a new namedtuple type has the same cost as creating an Enum class. It was much more costly before 3.7.

Thus creating a typical Enum class with 3-5 members is like creating 10 normal classes. Not many modules have 10 classes.
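A sketch of how to reproduce Serhiy's comparison -- the figures above are his; these will vary by machine and Python version:

    import timeit

    per_class = timeit.timeit('type("C", (), {})', number=10000)
    per_enum = timeit.timeit(
        'Enum("E", ["A", "B", "C"])',
        setup='from enum import Enum',
        number=10000)
    print("plain empty class: %.1f us" % (per_class / 10000 * 1e6))
    print("3-member Enum:     %.1f us" % (per_enum / 10000 * 1e6))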
From levkivskyi at gmail.com Wed Apr 25 06:15:42 2018 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Wed, 25 Apr 2018 11:15:42 +0100 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us> Message-ID:

On 25 April 2018 at 11:03, Serhiy Storchaka wrote:
> 25.04.18 10:57, Ivan Levkivskyi wrote:
>> On 25 April 2018 at 06:03, INADA Naoki <songofacandy at gmail.com <mailto:songofacandy at gmail.com>> wrote:
>> enum class&member creation cost is much heavier than "import enum" cost.
>> Especially, "import socket, ssl" is much slower than before...
>>
>> Is it slow simply because we are creating new class objects or
>> EnumMeta.__new__ does
>> some extensive calculations? In the latter case rewriting EnumMeta in C
>> might help.
>
> Creating a new function is very cheap -- just around 50 ns on my computer.
>
> Creating a new class is over two orders of magnitude more costly -- around 7 us for an empty
> class on my computer.
>
> Creating a new Enum class is much more costly -- around 40 us for an empty
> class (or 50 us for IntEnum) plus 7 us per member.

Hm, this is what I wanted to know. I think by rewriting EnumMeta in C we can reduce the creation time of an Enum class (almost) down to the creation time of a normal class, which may be a 4-5x speed-up. What do you think?

-- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL:

From steve at pearwood.info Wed Apr 25 06:21:48 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 25 Apr 2018 20:21:48 +1000 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us> Message-ID: <20180425102147.GR11616@ando.pearwood.info>

On Wed, Apr 25, 2018 at 10:06:56AM +0200, Jacco van Dorp wrote:

> Perhaps the string encode/decode would be a better case, tho. Is it
> latin 1 or latin-1 ? utf-8 or UTF-8 ?

    py> 'abc'.encode('latin 1') == 'abc'.encode('LATIN-1')
    True
    py> 'abc'.encode('utf8') == 'abc'.encode('UTF 8') == 'abc'.encode('UtF_8')
    True

Encoding names are normalised before being used.

> They might be fast to look up if
> you know where to look (probably the top result of googling "python
> string encoding utf 8", and it's the second and first option
> respectively IIRC. But I shouldn't -have- to recall correctly), but
> it's still a lot faster if you can type "Encoding.U" and it gives you
> the option.

If you did this with Encodings.ISO you would get a couple of dozen possibilities: ISO-8859-1, ISO-8859-7, ISO-8859-14, ISO-8859-15, etc, just to pick a few at random. How do you know which one you want?

In general, there's not really much *practical* use-case for code completion on encodings, aside from just exploratory mucking about in the interactive interpreter. There are too many codecs (multiple dozen), the names are too similar and not self-explanatory, and they can have aliases. It would be like doing code-completion on an object and getting a couple of dozen methods looking like method1245, method1246, method1247, method2390, method2395.

Besides, aside from UTF-16, UTF-8 and ASCII, we shouldn't encourage the use of most codecs except for legacy data. And when working with legacy data, we really need to know ahead of time what the encoding is, and declare it as a constant or application option. (Or, worst case, we've used chardet or another encoding guesser, and stored the name of the encoding in a variable.)

I don't really see a big advantage aside from laziness for completing on encodings. And while laziness is a virtue in programmers, that only goes so far before it becomes silly. Having to type

    import encodings
    enc
    .Enc
    .u
    arrow arrow arrow arrow arrow arrow
    enter

(19 key presses, plus the import) to save from having to type 'utf8' (six keypresses) is not what I would call efficient use of programmer time and effort. (Why so many arrows? Since you'll have to tab past at least utf16 utf16be utf16le utf32 utf32be utf32le utf7
But the biggest problem is that they aren't currently available for introspection anywhere. You can register new codecs, but there's no API for querying the list of currently registered codecs or their aliases. I think that problem would need to be solved first, in which case code completion will then be either easy, or irrelevant. (I'd be perfectly satisfied with an API I could call from the interactive interpreter.) -- Steve From storchaka at gmail.com Wed Apr 25 07:01:43 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 25 Apr 2018 14:01:43 +0300 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us> Message-ID: 25.04.18 13:15, Ivan Levkivskyi ????: > Hm, this is what I wanted to know. I think by rewriting EnumMeta in C we > can reduce the creation time of an Enum class > (almost) down to the creation time of a normal class, which may be a > 4-5x speed-up. What do you think? It could be great. But I afraid this may add too much complexity in C code. Maybe try to implement a simple and fast Enum for using it in the stdlib and extend it with a richer interface in the enum module? From gvanrossum at gmail.com Wed Apr 25 11:01:13 2018 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed, 25 Apr 2018 15:01:13 +0000 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: <5AE03622.7070600@UGent.be> References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us> <9c82e8a3496143c087149976fa4cf90d@xmail101.UGent.be> <5AE03622.7070600@UGent.be> Message-ID: On Wed, Apr 25, 2018, 01:03 Jeroen Demeyer wrote: > On 2018-04-25 09:57, Ivan Levkivskyi wrote: > > In the latter case rewriting EnumMeta in C > > ... or Cython. It's a great language and I'm sure that the Python > standard library could benefit a lot from it. > No, the stdlib should not depend on Cython, no matter how great. That would be a terrible dependency cycle. --Guido > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gvanrossum at gmail.com Wed Apr 25 11:05:09 2018 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed, 25 Apr 2018 15:05:09 +0000 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us> Message-ID: On Wed, Apr 25, 2018, 02:13 Jacco van Dorp wrote: > ... Which is where the auto-completion comes in. ... > Designing the language with auto-complete in mind feels wrong to me. It assumes a very sophisticated IDE and may lead to lazy design compromises. --Guido > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Wed Apr 25 12:20:51 2018 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 25 Apr 2018 09:20:51 -0700 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us> Message-ID: <5AE0AAE3.9060902@stoneleaf.us> On 04/25/2018 03:15 AM, Ivan Levkivskyi wrote: > On 25 April 2018 at 11:03, Serhiy Storchaka wrote: >> Creating a new function is very cheap -- just around 50 ns on my computer. >> >> Creating a new class is over two orders costly -- around 7 us for an empty class on my computer. >> >> Creating a new Enum class is much more costly -- around 40 us for an empty class (or 50 us for IntEnum) plus 7 us >> per member. > > Hm, this is what I wanted to know. 
I think by rewriting EnumMeta in C we can reduce the creation time of an Enum class > (almost) down to the creation time of a normal class, which may be a 4-5x speed-up. What do you think? Someone else would have to take that on (a C version of EnumMeta) -- my C skills are not up to that task and I do not currently possess the time to get them there. -- ~Ethan~ From julia.hiyeon.kim at gmail.com Wed Apr 25 14:22:24 2018 From: julia.hiyeon.kim at gmail.com (Julia Kim) Date: Wed, 25 Apr 2018 11:22:24 -0700 Subject: [Python-ideas] string method count() Message-ID: Hi, There?s an error with the string method count(). x = ?AAA? y = ?AA? print(x.count(y)) The output is 1, instead of 2. I write programs on SoloLearn mobile app. Warm regards, Julia Kim From jmcs at jsantos.eu Wed Apr 25 14:31:49 2018 From: jmcs at jsantos.eu (=?UTF-8?B?Sm/Do28gU2FudG9z?=) Date: Wed, 25 Apr 2018 18:31:49 +0000 Subject: [Python-ideas] string method count() In-Reply-To: References: Message-ID: Hi, >From https://docs.python.org/3/library/stdtypes.html#str.count: str.count(*sub*[, *start*[, *end*]]) Return the number of *non-overlapping* occurrences of substring *sub* in the range [*start*, *end*]. Optional arguments *start* and *end* are interpreted as in slice notation. Best regards, Jo?o Santos On Wed, 25 Apr 2018 at 20:22 Julia Kim wrote: > Hi, > > There?s an error with the string method count(). > > x = ?AAA? > y = ?AA? > print(x.count(y)) > > The output is 1, instead of 2. > > > I write programs on SoloLearn mobile app. > > > > Warm regards, > Julia Kim > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abrault at mapgears.com Wed Apr 25 14:27:59 2018 From: abrault at mapgears.com (Alexandre Brault) Date: Wed, 25 Apr 2018 14:27:59 -0400 Subject: [Python-ideas] string method count() In-Reply-To: References: Message-ID: <157c574a-c522-e690-449b-5b1a57a72cdd@mapgears.com> str.count counts non-overlapping instances of the substring. After counting the first 'AA', there is only one A left, so that isn't a second instance of 'AA' On 2018-04-25 02:22 PM, Julia Kim wrote: > Hi, > > There?s an error with the string method count(). > > x = ?AAA? > y = ?AA? > print(x.count(y)) > > The output is 1, instead of 2. > > > I write programs on SoloLearn mobile app. > > > > Warm regards, > Julia Kim > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From steve at pearwood.info Wed Apr 25 17:33:53 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 26 Apr 2018 07:33:53 +1000 Subject: [Python-ideas] string method count() In-Reply-To: References: Message-ID: <20180425213353.GA7400@ando.pearwood.info> On Wed, Apr 25, 2018 at 11:22:24AM -0700, Julia Kim wrote: > Hi, > > There?s an error with the string method count(). > > x = ?AAA? > y = ?AA? > print(x.count(y)) > > The output is 1, instead of 2. Are you proposing that there ought to be a version of count that looks for *overlapping* substrings? When will this be useful? 
-- Steve From solipsis at pitrou.net Wed Apr 25 18:03:12 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 26 Apr 2018 00:03:12 +0200 Subject: [Python-ideas] Change magic strings to enums References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us> Message-ID: <20180426000312.1d489948@fsol> On Wed, 25 Apr 2018 14:01:43 +0300 Serhiy Storchaka wrote: > 25.04.18 13:15, Ivan Levkivskyi ????: > > Hm, this is what I wanted to know. I think by rewriting EnumMeta in C we > > can reduce the creation time of an Enum class > > (almost) down to the creation time of a normal class, which may be a > > 4-5x speed-up. What do you think? > > It could be great. But I afraid this may add too much complexity in C > code. Maybe try to implement a simple and fast Enum for using it in the > stdlib and extend it with a richer interface in the enum module? Or perhaps we want a way to eschew the current complexity of the EnumMeta constructor, for example by passing some pre-computed values. Regards Antoine. From fakedme+py at gmail.com Wed Apr 25 19:11:49 2018 From: fakedme+py at gmail.com (Soni L.) Date: Wed, 25 Apr 2018 20:11:49 -0300 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us> Message-ID: <1ea6fe6e-2cac-c81b-6273-541c6e200eb5@gmail.com> On 2018-04-25 12:05 PM, Guido van Rossum wrote: > On Wed, Apr 25, 2018, 02:13 Jacco van Dorp > wrote: > > ... Which is where the auto-completion comes in. ... > > > Designing the language with auto-complete in mind feels wrong to me. > It assumes a very sophisticated IDE and may lead to lazy design > compromises. You can tab-complete enums (in the REPL), but not strings. Tab-complete is not an IDE thing, it's a CPython REPL thing. It seems reasonable to design for it. Any IDE worth my time would support string autocompletion, anyway. ;) > > --Guido > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From j.van.dorp at deonet.nl Thu Apr 26 02:55:01 2018 From: j.van.dorp at deonet.nl (Jacco van Dorp) Date: Thu, 26 Apr 2018 08:55:01 +0200 Subject: [Python-ideas] Change magic strings to enums In-Reply-To: <1ea6fe6e-2cac-c81b-6273-541c6e200eb5@gmail.com> References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us> <1ea6fe6e-2cac-c81b-6273-541c6e200eb5@gmail.com> Message-ID: Even if not just for the autocompletion, it would be more explicit that it's not just a random string like you'd pass to print(), but it has a specific meaning. Something in PEP 20 about explicit and implicit ? Autocompletion might be a good advantage, but 1) the IDE would need to know what to autocomplete it to, and you probably shouldn't special-case the stdlib like you'd need to with strings, and 2) enums -are- more explicit. When there's a distinct and limited set of options, they're just the tool for the job. (or at least, a much better tool for this job than to remember colors, which is used all over their documentation). Im naming auto-completion here as a solid reason, but when clean code itself can be considered a solid reason, I think that's probably the real reason. 2018-04-26 1:11 GMT+02:00 Soni L. : > > > On 2018-04-25 12:05 PM, Guido van Rossum wrote: > > On Wed, Apr 25, 2018, 02:13 Jacco van Dorp wrote: >> >> ... 
Which is where the auto-completion comes in. ...
>>
>> Designing the language with auto-complete in mind feels wrong to me. It
>> assumes a very sophisticated IDE and may lead to lazy design compromises.
>
> You can tab-complete enums (in the REPL), but not strings.
> Tab-complete is not an IDE thing, it's a CPython REPL thing. It seems
> reasonable to design for it.
>
> Any IDE worth my time would support string autocompletion, anyway. ;)
>
>> --Guido
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>

From wes.turner at gmail.com  Thu Apr 26 02:57:04 2018
From: wes.turner at gmail.com (Wes Turner)
Date: Thu, 26 Apr 2018 02:57:04 -0400
Subject: [Python-ideas] string method count()
In-Reply-To: <20180425213353.GA7400@ando.pearwood.info>
References: <20180425213353.GA7400@ando.pearwood.info>
Message-ID: 

On Wednesday, April 25, 2018, Steven D'Aprano wrote:

> On Wed, Apr 25, 2018 at 11:22:24AM -0700, Julia Kim wrote:
> > Hi,
> >
> > There's an error with the string method count().
> >
> > x = "AAA"
> > y = "AA"
> > print(x.count(y))
> >
> > The output is 1, instead of 2.
>
> Are you proposing that there ought to be a version of count that looks
> for *overlapping* substrings?
>
> When will this be useful?

"Finding a motif in DNA"
http://rosalind.info/problems/subs/

This is possible with re.finditer, re.findall, regex.findall(...,
overlapped=True), or a sliding window
https://stackoverflow.com/questions/2970520/string-count-with-overlapping-occurrences

n-grams can be by indices or by value.
count = len(indices)
https://en.wikipedia.org/wiki/N-gram#Examples

https://en.wikipedia.org/wiki/String_(computer_science)#String_processing_algorithms

https://en.wikipedia.org/wiki/Sequential_pattern_mining

>
> --
> Steve
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From levkivskyi at gmail.com  Thu Apr 26 03:38:43 2018
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Thu, 26 Apr 2018 08:38:43 +0100
Subject: [Python-ideas] Change magic strings to enums
In-Reply-To: 
References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us>
Message-ID: 

On 25 April 2018 at 12:01, Serhiy Storchaka wrote:

> 25.04.18 13:15, Ivan Levkivskyi wrote:
>
>> Hm, this is what I wanted to know. I think by rewriting EnumMeta in C we
>> can reduce the creation time of an Enum class
>> (almost) down to the creation time of a normal class, which may be a 4-5x
>> speed-up. What do you think?
>>
>
> It could be great. But I'm afraid this may add too much complexity in C
> code. Maybe try to implement a simple and fast Enum for using it in the
> stdlib and extend it with a richer interface in the enum module?
>
>
I think we can do something similar to ABCMeta, i.e. the metaclass itself
will stay defined in Python, but the "hottest" parts of its methods will be
replaced with helper functions written in C.
This way we can limit complexity of the C code while still getting almost
all the performance benefits.

--
Ivan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From solipsis at pitrou.net  Thu Apr 26 04:42:06 2018
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 26 Apr 2018 10:42:06 +0200
Subject: [Python-ideas] Change magic strings to enums
References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us>
Message-ID: <20180426104206.02c2caba@fsol>

On Thu, 26 Apr 2018 08:38:43 +0100
Ivan Levkivskyi wrote:
> On 25 April 2018 at 12:01, Serhiy Storchaka wrote:
>
> > 25.04.18 13:15, Ivan Levkivskyi wrote:
> >
> >> Hm, this is what I wanted to know. I think by rewriting EnumMeta in C we
> >> can reduce the creation time of an Enum class
> >> (almost) down to the creation time of a normal class, which may be a 4-5x
> >> speed-up. What do you think?
> >>
> >
> > It could be great. But I'm afraid this may add too much complexity in C
> > code. Maybe try to implement a simple and fast Enum for using it in the
> > stdlib and extend it with a richer interface in the enum module?
> >
>
> I think we can do something similar to ABCMeta, i.e. the metaclass itself
> will stay defined in Python, but the "hottest" parts of its methods will be
> replaced with helper functions written in C.

It's still a PITA to maintain, and it's not nice to Ethan (the main
enum maintainer) if suddenly he has to act on C code when he wants to
fix a bug or add a feature.

Regards

Antoine.

From songofacandy at gmail.com  Thu Apr 26 04:52:44 2018
From: songofacandy at gmail.com (INADA Naoki)
Date: Thu, 26 Apr 2018 08:52:44 +0000
Subject: [Python-ideas] Change magic strings to enums
In-Reply-To: 
References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us>
Message-ID: 

>> It could be great. But I'm afraid this may add too much complexity in C
code. Maybe try to implement a simple and fast Enum for using it in the
stdlib and extend it with a richer interface in the enum module?

> I think we can do something similar to ABCMeta, i.e. the metaclass itself
will stay defined in Python, but the "hottest" parts of its methods will be
replaced with helper functions written in C.
> This way we can limit complexity of the C code while still getting almost
all the performance benefits.

> --
> Ivan

Adding a C speedup module has a high bar, especially when the performance
gain is only in startup time.

Personally speaking, I'd rather have a speedup module for `re.compile`
than for enum.
In case of enum, I feel CONSTANT = "magic" is enough for most cases.
re doesn't have such a workaround, and using a third-party regular
expression library has a very high bar, especially when using it in the
stdlib.

But there were some -1s on adding a speedup module for re.
I think the same applies to enum.

So I think we should:

* Don't use enum blindly; use it only when it's very far better than CONST
= "magic".
* Add a faster API which bypasses some slow parts, especially for
IntEnum.convert() or IntFlag.convert() in the socket module.

Regards,

--
INADA Naoki

From j.van.dorp at deonet.nl  Thu Apr 26 05:37:39 2018
From: j.van.dorp at deonet.nl (Jacco van Dorp)
Date: Thu, 26 Apr 2018 11:37:39 +0200
Subject: [Python-ideas] Change magic strings to enums
In-Reply-To: 
References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us>
Message-ID: 

I'm kind of curious why everyone here seems to want to use IntFlags
and other mixins. The docs themselves say that their use should be
minimized, and tbh I agree with them.
Backwards compatibility can be maintained by allowing the old value and
internally converting it to the enum. Combinability is inherent to
enum.Flag. There'd be no real reason to use mixins as far as I can see?

2018-04-26 10:52 GMT+02:00 INADA Naoki:
>>> It could be great. But I'm afraid this may add too much complexity in C
> code. Maybe try to implement a simple and fast Enum for using it in the
> stdlib and extend it with a richer interface in the enum module?
>
>
>> I think we can do something similar to ABCMeta, i.e. the metaclass itself
> will stay defined in Python, but the "hottest" parts of its methods will be
> replaced with helper functions written in C.
>> This way we can limit complexity of the C code while still getting almost
> all the performance benefits.
>
>> --
>> Ivan
>
>
> Adding a C speedup module has a high bar, especially when the performance
> gain is only in startup time.
>
> Personally speaking, I'd rather have a speedup module for `re.compile`
> than for enum.
> In case of enum, I feel CONSTANT = "magic" is enough for most cases.
> re doesn't have such a workaround, and using a third-party regular
> expression library has a very high bar, especially when using it in the
> stdlib.
>
> But there were some -1s on adding a speedup module for re.
> I think the same applies to enum.
>
> So I think we should:
>
> * Don't use enum blindly; use it only when it's very far better than CONST
> = "magic".
> * Add a faster API which bypasses some slow parts, especially for
> IntEnum.convert() or IntFlag.convert() in the socket module.
>
> Regards,
>
> --
> INADA Naoki
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From j.van.dorp at deonet.nl  Thu Apr 26 05:44:21 2018
From: j.van.dorp at deonet.nl (Jacco van Dorp)
Date: Thu, 26 Apr 2018 11:44:21 +0200
Subject: [Python-ideas] string method count()
In-Reply-To: 
References: <20180425213353.GA7400@ando.pearwood.info>
Message-ID: 

or build it yourself...

def str_count(string, sub):
    count = 0
    for i in range(len(string) - len(sub) + 1):
        if string[i:].startswith(sub):
            count += 1
    return count

(probably some optimizations possible...)

Or in one line with a generator expression:

def str_count(string, sub):
    return sum(string[i:].startswith(sub)
               for i in range(len(string) - len(sub) + 1))

regular expressions would probably be at least an order of magnitude
faster, if this is a bottleneck for you. But a pure Python implementation
of this is a lot easier than it would be for the current string.count().

2018-04-26 8:57 GMT+02:00 Wes Turner:
>
>
> On Wednesday, April 25, 2018, Steven D'Aprano wrote:
>> On Wed, Apr 25, 2018 at 11:22:24AM -0700, Julia Kim wrote:
>> > Hi,
>> >
>> > There's an error with the string method count().
>> >
>> > x = "AAA"
>> > y = "AA"
>> > print(x.count(y))
>> >
>> > The output is 1, instead of 2.
>>
>> Are you proposing that there ought to be a version of count that looks
>> for *overlapping* substrings?
>>
>> When will this be useful?
>
> "Finding a motif in DNA"
> http://rosalind.info/problems/subs/
>
> This is possible with re.finditer, re.findall, regex.findall(...,
> overlapped=True), or a sliding window
> https://stackoverflow.com/questions/2970520/string-count-with-overlapping-occurrences
>
> n-grams can be by indices or by value.
> count = len(indices)
> https://en.wikipedia.org/wiki/N-gram#Examples
>
> https://en.wikipedia.org/wiki/String_(computer_science)#String_processing_algorithms
>
> https://en.wikipedia.org/wiki/Sequential_pattern_mining
>
>>
>> --
>> Steve
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>

From ncoghlan at gmail.com  Thu Apr 26 09:26:52 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 26 Apr 2018 23:26:52 +1000
Subject: [Python-ideas] Change magic strings to enums
In-Reply-To: 
References: <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us>
Message-ID: 

On 26 April 2018 at 19:37, Jacco van Dorp wrote:
> I'm kind of curious why everyone here seems to want to use IntFlags
> and other mixins. The docs themselves say that their use should be
> minimized, and tbh I agree with them. Backwards compatibility can be
> maintained by allowing the old value and internally converting it to
> the enum. Combinability is inherent to enum.Flag. There'd be no real
> reason to use mixins as far as I can see?

Serialisation formats are a good concrete example of how problems can
arise by switching out concrete types on people:

>>> import enum, json
>>> a = "A"
>>> class Enum(enum.Enum):
...     a = "A"
...
>>> class StrEnum(str, enum.Enum):
...     a = "A"
...
>>> json.dumps(a)
'"A"'
>>> json.dumps(StrEnum.a)
'"A"'
>>> json.dumps(Enum.a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.6/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib64/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib64/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib64/python3.6/json/encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'Enum' is not JSON serializable

The mixin variants basically say "If you run into code that doesn't
natively understand enums, act like an instance of this type".

Since most of the standard library has been around for years, and
sometimes even decades, we tend to face a *lot* of backwards
compatibility requirements along those lines.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From julian.demille at demilletech.net  Thu Apr 26 09:29:22 2018
From: julian.demille at demilletech.net (Julian DeMille)
Date: Thu, 26 Apr 2018 13:29:22 +0000
Subject: [Python-ideas] Allow multiple imports from a package while preserving its namespace
Message-ID: 

I personally would like a feature where instead of doing `from ... import
...` (which imports the specified items into the current namespace), one
could use something along the lines of `import <pkg>.{ <mod1>, <mod2>, ...
}` such that the imported modules/attributes could be accessed as
`<pkg>.<mod1>`, etc.

--
Thanks,
Julian DeMille
CEO, demilleTech, LLC
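To make the intent concrete: the proposed `import <pkg>.{ <mod1>, <mod2> }`
(not valid syntax today; `pkg`, `mod1` and `mod2` are placeholder names)
would behave roughly like today's

import pkg.mod1   # binds only "pkg" locally; pkg.mod1 is reachable
import pkg.mod2   # likewise for pkg.mod2

pkg.mod1.something()   # works: submodules stay inside the package namespace
# mod1.something()     # NameError: the bare name was never bound

but without repeating the package name once per submodule.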
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From p.f.moore at gmail.com  Thu Apr 26 09:37:28 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 26 Apr 2018 14:37:28 +0100
Subject: [Python-ideas] Allow multiple imports from a package while preserving its namespace
In-Reply-To: 
References: 
Message-ID: 

On 26 April 2018 at 14:29, Julian DeMille via Python-ideas wrote:
> I personally would like a feature where instead of doing `from ... import
> ...` (which imports the specified items into the current namespace), one
> could use something along the lines of `import <pkg>.{ <mod1>, <mod2>, ...
> }` such that the imported modules/attributes could be accessed as
> `<pkg>.<mod1>`, etc.

What are the benefits of this over a simple "import <pkg>"? I get that
it will mean that *only* the names listed will be accessible as
<pkg>.<name>, but I don't see why that's important (and specifically
why it's important enough to warrant dedicated syntax). Hiding names
in a namespace isn't typically something that Python provides language
support for.

Paul

From julian.demille at demilletech.net  Thu Apr 26 09:39:58 2018
From: julian.demille at demilletech.net (Julian DeMille)
Date: Thu, 26 Apr 2018 13:39:58 +0000
Subject: [Python-ideas] Allow multiple imports from a package while preserving its namespace
In-Reply-To: 
References: 
Message-ID: 

Some library authors get pretty pissy about implicit imports at the root.

On Thu, Apr 26, 2018, 09:37 Paul Moore wrote:

> On 26 April 2018 at 14:29, Julian DeMille via Python-ideas
> wrote:
> > I personally would like a feature where instead of doing `from ... import
> > ...` (which imports the specified items into the current namespace), one
> > could use something along the lines of `import <pkg>.{ <mod1>, <mod2>,
> ...
> > }` such that the imported modules/attributes could be accessed as
> > `<pkg>.<mod1>`, etc.
>
> What are the benefits of this over a simple "import <pkg>"? I get that
> it will mean that *only* the names listed will be accessible as
> <pkg>.<name>, but I don't see why that's important (and specifically
> why it's important enough to warrant dedicated syntax). Hiding names
> in a namespace isn't typically something that Python provides language
> support for.
>
> Paul
>
--
Thanks,
Julian DeMille
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pradyunsg at gmail.com  Thu Apr 26 09:50:17 2018
From: pradyunsg at gmail.com (Pradyun Gedam)
Date: Thu, 26 Apr 2018 13:50:17 +0000
Subject: [Python-ideas] Allow multiple imports from a package while preserving its namespace
In-Reply-To: 
References: 
Message-ID: 

On Thu, 26 Apr 2018 at 19:10 Julian DeMille via Python-ideas <
python-ideas at python.org> wrote:

> Some library authors get pretty pissy about implicit imports at the root.
>
> On Thu, Apr 26, 2018, 09:37 Paul Moore wrote:
>
>> On 26 April 2018 at 14:29, Julian DeMille via Python-ideas
>> wrote:
>> > I personally would like a feature where instead of doing `from ...
>> import
>> > ...` (which imports the specified items into the current namespace), one
>> > could use something along the lines of `import <pkg>.{ <mod1>, <mod2>,
>> ...
>> > }` such that the imported modules/attributes could be accessed as
>> > `<pkg>.<mod1>`, etc.
>>
>> What are the benefits of this over a simple "import <pkg>"? I get that
>> it will mean that *only* the names listed will be accessible as
>> <pkg>.<name>, but I don't see why that's important (and specifically
>> why it's important enough to warrant dedicated syntax). Hiding names
>> in a namespace isn't typically something that Python provides language
>> support for.
>>
>> Paul
>>
> --
> Thanks,
> Julian DeMille
>

The following works today:

Python 3.6.3 (default, Oct  4 2017, 06:09:15)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.37)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os.path
>>> os
<module 'os' from '/Users/pradyunsg/.venvwrap/venvs/pip/bin/../lib/python3.6/os.py'>
>>> os.path
<module 'posixpath' from '/Users/pradyunsg/.venvwrap/venvs/pip/bin/../lib/python3.6/posixpath.py'>

I am not sure what you're asking for here.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ncoghlan at gmail.com  Thu Apr 26 09:51:08 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 26 Apr 2018 23:51:08 +1000
Subject: [Python-ideas] Allow multiple imports from a package while preserving its namespace
In-Reply-To: 
References: 
Message-ID: 

On 26 April 2018 at 23:37, Paul Moore wrote:
> On 26 April 2018 at 14:29, Julian DeMille via Python-ideas
> wrote:
>> I personally would like a feature where instead of doing `from ... import
>> ...` (which imports the specified items into the current namespace), one
>> could use something along the lines of `import <pkg>.{ <mod1>, <mod2>, ...
>> }` such that the imported modules/attributes could be accessed as
>> `<pkg>.<mod1>`, etc.
>
> What are the benefits of this over a simple "import <pkg>"?
Forcing submodule imports would be the main thing, as at the moment,
you have to choose between repeating the base name multiple times
(once per submodule) or losing the hierarchical namespace.

So where:

from pkg import mod1, mod2, mod3

binds "mod1", "mod2", and "mod3" in the current namespace, you might
instead write:

from pkg import .mod1, .mod2, .mod3

to only bind "pkg" locally, but still make sure "pkg.mod1", "pkg.mod2"
and "pkg.mod3" all resolve at import time.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From julian.demille at demilletech.net  Thu Apr 26 09:53:58 2018
From: julian.demille at demilletech.net (Julian DeMille)
Date: Thu, 26 Apr 2018 13:53:58 +0000
Subject: [Python-ideas] Allow multiple imports from a package while preserving its namespace
In-Reply-To: 
References: 
Message-ID: 

That's the kind of thing I'm looking for. I've dealt with some library
authors who were highly against importing the root allowing me to access
submodules with hierarchy.

On Thu, Apr 26, 2018 at 9:51 AM Nick Coghlan wrote:

> On 26 April 2018 at 23:37, Paul Moore wrote:
> > On 26 April 2018 at 14:29, Julian DeMille via Python-ideas
> > wrote:
> >> I personally would like a feature where instead of doing `from ...
> import
> >> ...` (which imports the specified items into the current namespace), one
> >> could use something along the lines of `import <pkg>.{ <mod1>, <mod2>,
> ...
> >> }` such that the imported modules/attributes could be accessed as
> >> `<pkg>.<mod1>`, etc.
> >
> > What are the benefits of this over a simple "import <pkg>"?
>
> Forcing submodule imports would be the main thing, as at the moment,
> you have to choose between repeating the base name multiple times
> (once per submodule) or losing the hierarchical namespace.
>
> So where:
>
>     from pkg import mod1, mod2, mod3
>
> binds "mod1", "mod2", and "mod3" in the current namespace, you might
> instead write:
>
>     from pkg import .mod1, .mod2, .mod3
>
> to only bind "pkg" locally, but still make sure "pkg.mod1", "pkg.mod2"
> and "pkg.mod3" all resolve at import time.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
>
--
Thanks,
Julian DeMille
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From julia.hiyeon.kim at gmail.com  Thu Apr 26 10:19:30 2018
From: julia.hiyeon.kim at gmail.com (Julia Kim)
Date: Thu, 26 Apr 2018 07:19:30 -0700
Subject: [Python-ideas] string method count()
In-Reply-To: 
References: <20180425213353.GA7400@ando.pearwood.info>
Message-ID: 

There are two "AA" in "AAA", one starting from 0 and the other starting
from 1.

If the "AA" starting from 0 is deleted and replaced with "BANAN", "AAA"
becomes "BANANA".

If the "AA" starting from 1 is deleted and replaced with "PPLE", "AAA"
becomes "APPLE".
Depending on which one is chosen, "AAA" can be edited to "BANANA" or
"APPLE", two different results.


I wrote a program which edits a part of a text. If the part to be edited
occurs more than once, it presents the positions and asks the user to
choose which one to edit.

I tried with different algorithms. The best one so far would be using just
find() and collecting the results in a list.


> On Apr 25, 2018, at 11:57 PM, Wes Turner wrote:
>
>
>> On Wednesday, April 25, 2018, Steven D'Aprano wrote:
>> On Wed, Apr 25, 2018 at 11:22:24AM -0700, Julia Kim wrote:
>> > Hi,
>> >
>> > There's an error with the string method count().
>> >
>> > x = "AAA"
>> > y = "AA"
>> > print(x.count(y))
>> >
>> > The output is 1, instead of 2.
>>
>> Are you proposing that there ought to be a version of count that looks
>> for *overlapping* substrings?
>>
>> When will this be useful?
>
> "Finding a motif in DNA"
> http://rosalind.info/problems/subs/
>
> This is possible with re.finditer, re.findall, regex.findall(...,
> overlapped=True), or a sliding window
> https://stackoverflow.com/questions/2970520/string-count-with-overlapping-occurrences
>
> n-grams can be by indices or by value.
> count = len(indices)
> https://en.wikipedia.org/wiki/N-gram#Examples
>
> https://en.wikipedia.org/wiki/String_(computer_science)#String_processing_algorithms
>
> https://en.wikipedia.org/wiki/Sequential_pattern_mining
>
>>
>> --
>> Steve
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From storchaka at gmail.com  Thu Apr 26 11:11:40 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 26 Apr 2018 18:11:40 +0300
Subject: [Python-ideas] Allow multiple imports from a package while preserving its namespace
In-Reply-To: 
References: 
Message-ID: 

> The following works today:
>
> Python 3.6.3 (default, Oct  4 2017, 06:09:15)
> [GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.37)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import os.path
> >>> os
> <module 'os' from '/Users/pradyunsg/.venvwrap/venvs/pip/bin/../lib/python3.6/os.py'>
> >>> os.path
> <module 'posixpath' from '/Users/pradyunsg/.venvwrap/venvs/pip/bin/../lib/python3.6/posixpath.py'>
>
> I am not sure what you're asking for here.

os.path is a very special case. First of all, there is no os/path.py file...

From storchaka at gmail.com  Thu Apr 26 11:22:04 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 26 Apr 2018 18:22:04 +0300
Subject: [Python-ideas] Allow multiple imports from a package while preserving its namespace
In-Reply-To: 
References: 
Message-ID: 

26.04.18 16:51, Nick Coghlan wrote:
> Forcing submodule imports would be the main thing, as at the moment,
> you have to choose between repeating the base name multiple times
> (once per submodule) or losing the hierarchical namespace.

If the base name is short, there are no problems with repeating it. If it
is long, it is better to get rid of this prefix than repeat it every time
when you use submodules.
In any case, it takes just a few keystrokes in a modern editor to
duplicate an import line and edit the last component.

> So where:
>
> from pkg import mod1, mod2, mod3
>
> binds "mod1", "mod2", and "mod3" in the current namespace, you might
> instead write:
>
> from pkg import .mod1, .mod2, .mod3
>
> to only bind "pkg" locally, but still make sure "pkg.mod1", "pkg.mod2"
> and "pkg.mod3" all resolve at import time.

I think this special case isn't special enough to introduce a special
syntax.

From rosuav at gmail.com  Thu Apr 26 11:24:23 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 27 Apr 2018 01:24:23 +1000
Subject: [Python-ideas] Allow multiple imports from a package while preserving its namespace
In-Reply-To: 
References: 
Message-ID: 

On Thu, Apr 26, 2018 at 11:53 PM, Julian DeMille via Python-ideas wrote:
> That's the kind of thing I'm looking for. I've dealt with some library
> authors who were highly against importing the root allowing me to access
> submodules with hierarchy.

With a package, having automatic imports forces those submodules to be
loaded eagerly (as soon as you import the package, you load up those
modules). Lazily-loaded submodules can improve performance if you
don't always need them.

+0 for an easier way to import multiple submodules at once. It's not
something I've personally had a need for, but it's a sane and logical
thing to do.

ChrisA

From robertve92 at gmail.com  Thu Apr 26 12:54:42 2018
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Thu, 26 Apr 2018 16:54:42 +0000
Subject: [Python-ideas] Allow multiple imports from a package while preserving its namespace
In-Reply-To: 
References: 
Message-ID: 

I just ran into a similar problem, how to relatively import without
binding the submodule.

Let's say you have this:

myapp/
    urls.py
    views/
        base.py

When you're in urls.py and you want to relatively access functions from
base.py, you must use the from syntax.

from .views import base
base.func()

But what if I just want "views" in my namespace?

from . import views
from .views import base
views.base.func()
base.func()

import myapp.views.base
myapp.views.base.func()

from . import views
import myapp.views.base
views.base.func()
myapp.views.base.func()

On Thu, 26 Apr 2018 at 17:24, Chris Angelico wrote:

> On Thu, Apr 26, 2018 at 11:53 PM, Julian DeMille via Python-ideas
> wrote:
> > That's the kind of thing I'm looking for. I've dealt with some library
> > authors who were highly against importing the root allowing me to access
> > submodules with hierarchy.
>
> With a package, having automatic imports forces those submodules to be
> loaded eagerly (as soon as you import the package, you load up those
> modules). Lazily-loaded submodules can improve performance if you
> don't always need them.
>
> +0 for an easier way to import multiple submodules at once. It's not
> something I've personally had a need for, but it's a sane and logical
> thing to do.
>
> ChrisA
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From wes.turner at gmail.com  Thu Apr 26 15:13:10 2018
From: wes.turner at gmail.com (Wes Turner)
Date: Thu, 26 Apr 2018 15:13:10 -0400
Subject: [Python-ideas] string method count()
In-Reply-To: 
References: <20180425213353.GA7400@ando.pearwood.info>
Message-ID: 

If this was for a school assignment, I'd probably go to edit distance and
fuzzy string match next:

https://en.wikipedia.org/wiki/Edit_distance
https://en.wikipedia.org/wiki/String-to-string_correction_problem

- https://pypi.org/search/?q=Levenshtein
- https://pypi.org/project/textdistance/

As a bioinformatics program, this is a bit like CRISPR:
https://en.wikipedia.org/wiki/CRISPR

BioPython Seq has a count_overlap method with a BSD 3-Clause LICENSE:
https://github.com/biopython/biopython/blob/master/LICENSE.rst

Can it be made faster with e.g. itertools.count and a generator
comprehension?

- Bio.Seq.Seq.count_overlap()
http://biopython.org/DIST/docs/api/Bio.Seq.Seq-class.html#count_overlap

Are there any changes or features necessary in core Python in order to
finish this application? If not, the python-tutor mailing list or
r/learnpython are set up to handle this sort of thing.

It may or may not be appropriate for core Python to support all of these
string algorithms:
http://rosalind.info/problems/topics/string-algorithms/

On Thursday, April 26, 2018, Julia Kim wrote:

> There are two "AA" in "AAA", one starting from 0 and the other starting
> from 1.
>
> If the "AA" starting from 0 is deleted and replaced with "BANAN", "AAA"
> becomes "BANANA".
>
> If the "AA" starting from 1 is deleted and replaced with "PPLE", "AAA"
> becomes "APPLE".
>
> Depending on which one is chosen, "AAA" can be edited to "BANANA" or
> "APPLE", two different results.
>
>
> I wrote a program which edits a part of a text. If the part to be edited
> occurs more than once, it presents the positions and asks the user to
> choose which one to edit.
>
> I tried with different algorithms. The best one so far would be using just
> find() and collecting the results in a list.
>
>
> > On Apr 25, 2018, at 11:57 PM, Wes Turner wrote:
> >
> > On Wednesday, April 25, 2018, Steven D'Aprano wrote:
> >> On Wed, Apr 25, 2018 at 11:22:24AM -0700, Julia Kim wrote:
> >> > Hi,
> >> >
> >> > There's an error with the string method count().
> >> >
> >> > x = "AAA"
> >> > y = "AA"
> >> > print(x.count(y))
> >> >
> >> > The output is 1, instead of 2.
> >>
> >> Are you proposing that there ought to be a version of count that looks
> >> for *overlapping* substrings?
> >>
> >> When will this be useful?
> >
> > "Finding a motif in DNA"
> > http://rosalind.info/problems/subs/
> >
> > This is possible with re.finditer, re.findall, regex.findall(...,
> > overlapped=True), or a sliding window
> > https://stackoverflow.com/questions/2970520/string-count-with-overlapping-occurrences
> >
> > n-grams can be by indices or by value.
> count = len(indices)
> https://en.wikipedia.org/wiki/N-gram#Examples
>
> https://en.wikipedia.org/wiki/String_(computer_science)#String_processing_algorithms
>
> https://en.wikipedia.org/wiki/Sequential_pattern_mining
>
>>
>> --
>> Steve
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From greg.ewing at canterbury.ac.nz  Thu Apr 26 19:12:38 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 27 Apr 2018 11:12:38 +1200
Subject: [Python-ideas] Allow multiple imports from a package while preserving its namespace
In-Reply-To: 
References: 
Message-ID: <5AE25CE6.5090107@canterbury.ac.nz>

Chris Angelico wrote:
> +0 for an easier way to import multiple submodules at once. It's not
> something I've personally had a need for, but it's a sane and logical
> thing to do.

Maybe:

    import display, event, mixer in pygame

or

    in pygame import display, event, mixer

--
Greg

From storchaka at gmail.com  Fri Apr 27 02:58:30 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Fri, 27 Apr 2018 09:58:30 +0300
Subject: [Python-ideas] Allow multiple imports from a package while preserving its namespace
In-Reply-To: <5AE25CE6.5090107@canterbury.ac.nz>
References: <5AE25CE6.5090107@canterbury.ac.nz>
Message-ID: 

27.04.18 02:12, Greg Ewing wrote:
> Chris Angelico wrote:
>> +0 for an easier way to import multiple submodules at once. It's not
>> something I've personally had a need for, but it's a sane and logical
>> thing to do.
>
> Maybe:
>
>     import display, event, mixer in pygame

I read this as

import display, event, mixer in pygame
pygame.display = display
pygame.event = event
pygame.mixer = mixer
del display, event, mixer in pygame

From greg.ewing at canterbury.ac.nz  Fri Apr 27 04:08:03 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 27 Apr 2018 20:08:03 +1200
Subject: [Python-ideas] Allow multiple imports from a package while preserving its namespace
In-Reply-To: 
References: <5AE25CE6.5090107@canterbury.ac.nz>
Message-ID: <5AE2DA63.4000202@canterbury.ac.nz>

Serhiy Storchaka wrote:
> 27.04.18 02:12, Greg Ewing wrote:
>
>> import display, event, mixer in pygame
>
> I read this as
>
> import display, event, mixer in pygame
> pygame.display = display
> pygame.event = event
> pygame.mixer = mixer
> del display, event, mixer in pygame

It's meant to be shorthand for

import pygame.display
import pygame.event
import pygame.mixer

--
Greg

From njs at pobox.com  Fri Apr 27 04:29:42 2018
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 27 Apr 2018 01:29:42 -0700
Subject: [Python-ideas] Should __builtins__ have some kind of pass-through print function, for debugging?
Message-ID: 

Hi all,

This came up in passing in one of the PEP 572 threads, and I'm curious if
folks think it's a good idea or not.

When debugging, sometimes you have a somewhat complicated expression
that's not working:

# Hmm, is func2() returning the right thing?
while (func1() + 2 * func2()) < func3():
    ...

It'd be nice to print out what func2() returns, but to do that we have to
refactor this code, which might be rather tricky in a case like this.
I think if you want to use print() directly here, the simplest way to do
that is:

while True:
    tmp = func2()
    print(tmp)
    if not (func1() + 2 * tmp) < func3():
        break
    ...

Obviously this is annoying and error prone -- especially for beginners,
who are the ones most likely to need to print out lots of stuff to figure
out why their code isn't working. (Chris Angelico mentioned that he finds
this to be a common problem when teaching beginners.)

There is a better way: if you define a trivial helper like:

# "debug print": prints and then returns its argument
def dp(obj):
    print(repr(obj))
    return obj

then the rewritten code becomes:

while (func1() + 2 * dp(func2())) < func3():
    ...

Of course, this is trivial -- for me or you. But the leap to first
realize that this is a useful thing, and then implement it correctly, is
really asking a lot of beginners, who by assumption are struggling to do
*anything* with Python syntax. And similarly, putting a package on PyPI
is useful (cf. the venerable 'q' package), but still adds a significant
barrier to entry: you need to be able to install packages, and you need
to add an import. In fact, I can imagine that you might want to teach
this trick even before you teach what imports are.

So, would it make sense to include a utility like this in __builtins__?

PEP 553, the breakpoint() builtin, provides some relevant precedent.
Looking at it, I see it also emphasized the value of letting IDEs
override the debugger, and I can see some similar value here: e.g. fancy
REPLs like Spyder or Jupyter could potentially capture the objects passed
to dp() and make them available for interactive viewing (imagine if
they're like a large dataframe or something).

Points to argue over if people like the general idea:

- The name: p(), dp(), debug(), debugprint(), ...?
- __str__ or __repr__? Presumably __repr__ since it's a debugging tool.
- Exact semantics: there should probably be some way to add a bit of
metadata that gets printed out, for cases like:

while (dp(func1(), "func1") + 2 * dp(func2(), "func2")) < dp(func3(), "func3"):
    ...

Maybe other tweaks would be useful as well.

-n

--
Nathaniel J. Smith -- https://vorpus.org

From ncoghlan at gmail.com  Fri Apr 27 07:14:52 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 27 Apr 2018 21:14:52 +1000
Subject: [Python-ideas] Allow multiple imports from a package while preserving its namespace
In-Reply-To: 
References: 
Message-ID: 

On 27 April 2018 at 01:22, Serhiy Storchaka wrote:
> I think this special case isn't special enough to introduce a special
> syntax.
(Essentially making all submodule imports implicitly lazy if you don't explicitly import them - you'd only be *required* to explicitly import top level modules, with everything under that being an implementation detail of that top level package) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Fri Apr 27 07:27:34 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 27 Apr 2018 21:27:34 +1000 Subject: [Python-ideas] Should __builtins__ have some kind of pass-through print function, for debugging? In-Reply-To: References: Message-ID: <20180427112733.GP7400@ando.pearwood.info> I think that this is either a great idea or pointless, depending on what the built-in actually does. If all it does is literally the debug print function you give: > # "debug print": prints and then returns its argument > def dp(obj): > print(repr(obj)) > return obj then it is just a trivial helper as you say, and in my opinion too trivial to bother making a builtin. As far as discoverability by beginners, I think that having their instructor teach them to write such a simple helper would be a good lesson. But suppose we were willing to add a bit of compiler magic to the language, something that would be hard to do in pure Python: give dp() access to the source code of the argument it is called with, and then print out that source as well as the value's repr, plus the line number and name of the module it is called from. An example: # module.py x = 1 y = dp(x + 99)+2 print("y is", y) Then running that module would output: Line 2 of module.py, "x + 99", result 100 y is 102 Compare that to the pretty anaemic output of the dp() helper you give: 100 y is 102 I know which I would rather see when debugging. Obviously dp() would have to be magic. There's no way that I know of for a Python function to see the source code of its own arguments. I have no idea what sort of deep voodoo would be required to make this work. But if it could work, wow, that would really be useful. And not just for beginners. Some objections... Objection 1: dp() looks like an ordinary function call. Magic in Python is usually a statement, like assert. Answer: Very true. But on the other hand, there's super() inside classes. Objection 2: Yes, but even super() isn't this magical. Answer: Details, details. I'm just the ideas man, somebody else can work out the implementation... *wink* Objection 3: What if the caller has shadowed or replaced dp()? Answer: Don't do that. Let's make dp() a reserved name. Objection 4: You're kidding, right? That needs a full deprecation cycle, it will break code, etc. Answer: Okay, okay. Maybe the compiler could be smart enough to only pass the extra information on (line number, module, source code of argument) when dp() is the actual genuine builtin dp() function, and not if it has been shadowed. Objection 5: Even if there is a way to do that, it would require an expensive runtime check that will slow down calls to anything called dp(). Answer: Yes, but that's only one name out of millions. All other function calls will be unaffected. And besides, performance regressions don't count as breakage. Much. Yeah, I don't think this is going to fly either. But boy would it be useful if it could... -- Steve From ncoghlan at gmail.com Fri Apr 27 08:53:39 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 27 Apr 2018 22:53:39 +1000 Subject: [Python-ideas] Should __builtins__ have some kind of pass-through print function, for debugging? 
In-Reply-To: <20180427112733.GP7400@ando.pearwood.info>
References: <20180427112733.GP7400@ando.pearwood.info>
Message-ID: 

On 27 April 2018 at 21:27, Steven D'Aprano wrote:

> Obviously dp() would have to be magic. There's no way that I know of for
> a Python function to see the source code of its own arguments. I have no
> idea what sort of deep voodoo would be required to make this work. But
> if it could work, wow, that would really be useful. And not just for
> beginners.

If you relax the enhancement to just noting the line where the debug
print came from, it doesn't need to be deep compiler magic - the same
kind of stack introspection that warnings and tracebacks use would
suffice. (Stack introspection to find the caller's module, filename and
line number, linecache to actually retrieve the line if we want to print
that).

Cheers,
Nick.

P.S. While super() is a *little* magic, it isn't *that* magic - it gets
converted from "super()" to "super(__class__, <first argument>)". And
even that limited bit of magic has proven quirky enough to be a recurring
source of irritation when it comes to interpreter maintenance.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rosuav at gmail.com  Fri Apr 27 08:58:59 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 27 Apr 2018 22:58:59 +1000
Subject: [Python-ideas] Should __builtins__ have some kind of pass-through print function, for debugging?
In-Reply-To: <20180427112733.GP7400@ando.pearwood.info>
References: <20180427112733.GP7400@ando.pearwood.info>
Message-ID: 

On Fri, Apr 27, 2018 at 9:27 PM, Steven D'Aprano wrote:
> Obviously dp() would have to be magic. There's no way that I know of for
> a Python function to see the source code of its own arguments. I have no
> idea what sort of deep voodoo would be required to make this work. But
> if it could work, wow, that would really be useful. And not just for
> beginners.

It's a debugging function. It's okay if the magic has some restrictions
on it. How about:

1) The dp() function is CPython-specific. Other Pythons may or may not
include it, and may or may not have this magic.

2) For the magic to work, the calling module must have source code
available. Otherwise, dp() will do as much as it can, but it might not
be able to do everything.

3) The magic may not work if you use any name other than "dp" to call
the function.

Then, the function can be written much more plausibly. It can use
sys._getframe to find the calling function, fetch up the source code
from disk, and look at the corresponding line of code. The hardest part
will be figuring out code like this:

x = dp(spam) if qq else dp(ham)

In theory, frm.f_lasti (the last bytecode instruction executed) should
be able to help with this, but I'm not sure how well you could parse
through that to figure out which of multiple dp() calls we're in.

At this point, it's DEFINITELY too large for an instructor to dictate to
a beginner as part of a lesson on debugging, but it could be a great
addition to the 'inspect' module. You could teach students to add "from
inspect import dp" to their imports, and the rest would 'just work'.

I don't think this needs any specific compiler magic or making 'dp' a
reserved name, but it might well be a lot easier to write if there were
some compiler features provided to _all_ functions.
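For the simple case -- at most one dp() per source line -- a rough sketch
of that introspection approach (this is an illustration of the idea, not
a proposed implementation):

import linecache
import sys

def dp(obj):
    # Report the caller's file, line number and source line (when the
    # source is available on disk), then pass the value through.
    frame = sys._getframe(1)
    filename = frame.f_code.co_filename
    lineno = frame.f_lineno
    line = linecache.getline(filename, lineno).strip()
    print(f"{filename}:{lineno}: {line} -> {obj!r}", file=sys.stderr)
    return obj

Everything beyond that simple case is where compiler support would start
to matter.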
For instance, column positions are currently available in SyntaxErrors,
but not other exceptions:

>>> x = 1
>>> print("spam" + x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate str (not "int") to str
>>> print("spam" : x)
  File "<stdin>", line 1
    print("spam" : x)
                 ^
SyntaxError: invalid syntax

Imagine if the TypeError could show a caret, pointing to the plus sign.
That would require that a function store column positions, not just line
numbers. I'm not sure how much overhead it would add, nor how much
benefit you'd really get from those markers, but it would then be the
same mechanic for exception tracebacks and for semi-magical functions
like this.

ChrisA

From steve at pearwood.info  Fri Apr 27 09:05:02 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 27 Apr 2018 23:05:02 +1000
Subject: [Python-ideas] Should __builtins__ have some kind of pass-through print function, for debugging?
In-Reply-To: 
References: 
Message-ID: <20180427130500.GR7400@ando.pearwood.info>

Actually, I think I can think of a way to make this work, if we're
willing to resurrect some old syntax.

On Fri, Apr 27, 2018 at 09:27:34PM +1000, Steven D'Aprano wrote:
> I think that this is either a great idea or pointless, depending on what
> the built-in actually does.
>
> If all it does is literally the debug print function you give:
>
> > # "debug print": prints and then returns its argument
> > def dp(obj):
> >     print(repr(obj))
> >     return obj
>
> then it is just a trivial helper as you say, and in my opinion too
> trivial to bother making a builtin.

I changed my mind... let's add this as a builtin, under the name
debugprint. It is a completely normal, non-magical function, which takes
four (not one) arguments:

def debugprint(obj, lineno=None, module=None, source=None):
    out = []
    if module is not None:
        if lineno is None:
            lineno = "?"
        out.append(f"Line {lineno} of {module}")
    if source is not None:
        out.append(ascii(source))
    out.append(f"result {repr(obj)}")
    print(', '.join(out))
    return obj

Now let's put all the magic into some syntax. I'm going to suggest
resurrecting the `` backtick syntax from Python 2. If that's not
visually distinct enough, we could double them: ``expression``.

When the compiler sees an expression inside backticks, it grabs the name
of the module, the line number, and the expression source, and compiles
a call to:

debugprint(expression, lineno, module, source)

in its place. That's the only magic needed, and since it is entirely at
compile-time, all that information should be easily available. (I hope.)
In-Reply-To: <20180427130500.GR7400@ando.pearwood.info> References: <20180427112733.GP7400@ando.pearwood.info> <20180427130500.GR7400@ando.pearwood.info> Message-ID: I've had a 'dprint' in sitecustomize for years. It clones 'print' and adds a couple of keyword parameters, 'show_stack' and 'depth', which give control over traceback output (walks back up sys._getframe for 'depth' entries). It returns the final argument if there is one, otherwise None. It can be used anywhere and everywhere that builtin print is used, plus anywhere in any expression just passing a single argument. I thought about replacing standard print with it, but I like the greppability of 'dprint' when it comes time to clean things. On Fri, Apr 27, 2018 at 6:05 AM, Steven D'Aprano wrote: > Actually, I think I can think of a way to make this work, if we're > willing to resurrect some old syntax. > > On Fri, Apr 27, 2018 at 09:27:34PM +1000, Steven D'Aprano wrote: > > I think that this is either a great idea or pointless, depending on what > > the built-in actually does. > > > > If all it does is literally the debug print function you give: > > > > > # "debug print": prints and then returns its argument > > > def dp(obj): > > > print(repr(obj)) > > > return obj > > > > then it is just a trivial helper as you say, and in my opinion too > > trivial to bother making a builtin. > > I changed my mind... let's add this as a builtin, under the name > debugprint. It is a completely normal, non-magical function, which takes > four (not one) arguments: > > > def debugprint(obj, lineno=None, module=None, source=None): > out = [] > if module is not None: > if lineno is None: > lineno = "?" > out.append(f"Line {lineno} of {module}") > if source is not None: > out.append(ascii(source)) > out.append(f"result {repr(obj)}") > print(', '.join(out)) > return obj > > > Now let's put all the magic into some syntax. I'm going to suggest > resurrecting the `` backtick syntax from Python 2. If that's not > visually distinct enough, we could double them: ``expression``. > > When the compiler sees an expression inside backticks, it grabs the name > of the module, the line number, and the expression source, and compiles > a call to: > > debugprint(expression, lineno, module, source) > > in its place. That's the only magic needed, and since it is entirely at > compile-time, all that information should be easily available. (I hope.) > If not, then simply replace the missing values with None. > > If the caller shadows debugprint, it is their responsibility to either > give it the correct signature, or not to use the backticks. Since it's > just a normal function call, the worst that happens is that a mismatch > in arguments gives you a TypeError. > > Shadowing debugprint would be an easy way to disable backticks on a > per-module basis, at runtime. Simply define: > > def debugprint(obj, *args): > return obj > > and Bob's yer uncle. > > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gadgetsteve at live.co.uk Fri Apr 27 10:22:30 2018 From: gadgetsteve at live.co.uk (Steve Barnes) Date: Fri, 27 Apr 2018 14:22:30 +0000 Subject: [Python-ideas] Allow multiple imports from a package while preserving its namespace In-Reply-To: References: Message-ID: On 27/04/2018 12:14, Nick Coghlan wrote: > On 27 April 2018 at 01:22, Serhiy Storchaka wrote: >> I think this special cases isn't special enough to introduce a special >> syntax. > > While I'm mostly inclined to agree, I do think it would be nice to > have a clean spelling for "ensure this module is fully imported, but > don't bind it locally". > How about a modifier for import such as: from module non_local import (a, b, c, d) or: from module import_named (a, b, c, d) I think that either gets the point across. -- Steve (Gadget) Barnes Any opinions in this message are my personal opinions and do not reflect those of my employer. --- This email has been checked for viruses by AVG. http://www.avg.com From ericsnowcurrently at gmail.com Fri Apr 27 12:10:15 2018 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 27 Apr 2018 10:10:15 -0600 Subject: [Python-ideas] Allow multiple imports from a package while preserving its namespace In-Reply-To: References: Message-ID: On Thu, Apr 26, 2018 at 7:51 AM, Nick Coghlan wrote: > On 26 April 2018 at 23:37, Paul Moore wrote: >> What are the benefits of this over a simple "import "? > > Forcing submodule imports would be the main thing, as at the moment, > you have to choose between repeating the base name multiple times > (once per submodule) or losing the hierarchical namespace. > > So where: > > from pkg import mod1, mod2, mod3 > > bind "mod1", "mod2", and "mod3" in the current namespace, you might > instead write: > > from pkg import .mod1, .mod2, .mod3 > > to only bind "pkg" locally, but still make sure "pkg.mod1", "pkg.mod2" > and "pkg.mod3" all resolve at import time. I'm not exactly sure what this thread is about. :) I'm pretty sure it's one of the following (in order of confidence): * explicit relative imports that bind to a fully qualified module name * combining absolute import statements, for submodules of the same package, into a single import statement Either way I don't think there's much point. For the relative import case it kind of defeats the point of relative imports. For the absolute import case there isn't much value added. I'll elaborate in the rest of this message. Combining Absolute Import Statements ============================== First, I'll get combining-absolute-import-statements case out of the way. Given the following example: # in cafe.spam from . import eggs, ham, bacon # or # from . import eggs # from . import ham # from . import bacon ... # use eggs # use ham # use bacon The equivalent absolute imports would be: import cafe.eggs import cafe.ham import cafe.bacon ... # use cafe.eggs # use cafe.ham # use cafe.bacon Under the proposed syntax it would be: # OP # This matches the layout of the existing absolute import # statement syntax. import cafe.{eggs, ham, bacon} # Nick # This matches the layout of the existing relative import # statement syntax (albeit without the dot). from cafe import .eggs, .ham, .bacon ... # use cafe.eggs # use cafe.ham # use cafe.bacon For one thing, having one import per line improves the programming experience, IMHO. (Then again, I'm a fan of "do at most one thing per line of code".) 
For readers one-import-per-line makes it easier to quickly identify the imported modules, and especially to match a module name in the code to its corresponding import statement. I do that often enough that I worry a combined syntax would obscure the names and lead to more effort, as a reader, to match them. Also, it's not like a module is going to import so many other modules (in the normal case) that one module per line is going to result in too many import lines at the top of the file. [1] Qualifying Relative Imports ===================== Second, I'll address the case of possibly fully qualifying relative imports. That seems reasonable enough at first glance. However, the point of relative imports is that you don't have to tie a module's code to the overall package hierarchy in your project (app/library). This helps you: * avoid mass refactoring when you choose to move a directory to somewhere else in the tree (or even to a different/new library) * make it clear to readers that the imported modules are part of the same library/app * make it clear to readers how the current module relates to the imported ones * (sometimes) keep your import statements shorter, benefiting you and readers alike; this is particularly relevant if you happen to have a deep directory structure (not a great situation, but sometimes appropriate) [2] Again, relative imports mean that your code does not become tied to your project's directory layout; and you only have to know the base name of the imported module and how it relates (in the directory structure) to the current module. Suppose we add a syntax like the following (equivalent to the above examples): # This matches the relative import syntax. from '.' import eggs, ham, bacon # Note the quotation marks around the dot. # This matches the absolute import syntax. import .eggs, .ham, .bacon ... # use cafe.eggs # use cafe.ham # use cafe.bacon In that case the code is tied to the directory layout, so why did we use relative imports in the first place? -eric [1] Just in case, I'll preemptively response to one possible concern: that each of the three import statements above is going to import the "pkg" module. However, at most the first one will actually import it. For the others the import machinery will short-circuit when it finds the module in sys.modules. [Nick is well aware of this, but I'd be surprised if the majority of Python programmers know this.] So the cost is not high enough to worry about it. [2] "Consenting adults" aside, explicit relative imports will typically be to sibling modules in a package. The more indirect the import, the more dependent the module is on a larger tree of packages which leads to more complexity for readers and more likelihood of breakage when packages move around. From ericsnowcurrently at gmail.com Fri Apr 27 12:18:35 2018 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 27 Apr 2018 10:18:35 -0600 Subject: [Python-ideas] Allow multiple imports from a package while preserving its namespace In-Reply-To: References: Message-ID: On Fri, Apr 27, 2018 at 5:14 AM, Nick Coghlan wrote: > Taking this idea in a completely different direction: what if we were > to take advantage of PEP 451 __spec__ attributes to enhance modules to > natively support implicit on-demand imports before they give up and > raise AttributeError? 
(Essentially making all submodule imports > implicitly lazy if you don't explicitly import them - you'd only be > *required* to explicitly import top level modules, with everything > under that being an implementation detail of that top level package) That might be interesting to explore. It smells like something related to the lazy import magic that Neil Shemenauer is working on. On the plus side, it means one less thing for programmers to do. On the minus side, I find the imports at the top of the file to be a nice catalog of external dependencies. Implicitly importing submodules would break that. The idea might be not as useful since the programmer would have to use the fully qualified name (relative to the "top-level" package). So why not just put that in the import statement? -eric From clint.hepner at gmail.com Fri Apr 27 12:35:22 2018 From: clint.hepner at gmail.com (Clint Hepner) Date: Fri, 27 Apr 2018 12:35:22 -0400 Subject: [Python-ideas] Should __builtins__ have some kind of pass-through print function, for debugging? In-Reply-To: <20180427130500.GR7400@ando.pearwood.info> References: <20180427112733.GP7400@ando.pearwood.info> <20180427130500.GR7400@ando.pearwood.info> Message-ID: <3C84CEEE-6996-4300-AB55-82C4767067A4@gmail.com> > On 2018 Apr 27 , at 9:05 a, Steven D'Aprano wrote: > > Actually, I think I can think of a way to make this work, if we're > willing to resurrect some old syntax. > > On Fri, Apr 27, 2018 at 09:27:34PM +1000, Steven D'Aprano wrote: >> I think that this is either a great idea or pointless, depending on what >> the built-in actually does. >> >> If all it does is literally the debug print function you give: >> >>> # "debug print": prints and then returns its argument >>> def dp(obj): >>> print(repr(obj)) >>> return obj >> >> then it is just a trivial helper as you say, and in my opinion too >> trivial to bother making a builtin. > > I changed my mind... let's add this as a builtin, under the name > debugprint. It is a completely normal, non-magical function, which takes > four (not one) arguments: > > > def debugprint(obj, lineno=None, module=None, source=None): > [magic elided] > > Now let's put all the magic into some syntax. I'm going to suggest > resurrecting the `` backtick syntax from Python 2. If that's not > visually distinct enough, we could double them: ``expression``. I don't want to hijack the thread on a digression, but instead of bringing `` back for just this one purpose, it could be used as a prefix to define a candidate pool of new keywords. ``debugprint(obj) # instead of ``obj`` meaning debugprint(obj) Any ``-prefixed word would either be a defined keyword or a syntax error. -- Clint From chris.barker at noaa.gov Fri Apr 27 12:38:34 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 27 Apr 2018 09:38:34 -0700 Subject: [Python-ideas] Should __builtins__ have some kind of pass-through print function, for debugging? 
In-Reply-To: <20180427112733.GP7400@ando.pearwood.info>
 <20180427130500.GR7400@ando.pearwood.info>
Message-ID: 

When I teach decorators, I start with a "logged" decorator example:

https://uwpce-pythoncert.github.io/PythonCertDevel/modules/Decorators.html#an-example

    def logged_func(func):
        def logged(*args, **kwargs):
            print("Function {} called".format(func.__name__))
            if args:
                print("\twith args: {}".format(args))
            if kwargs:
                print("\twith kwargs: {}".format(kwargs))
            result = func(*args, **kwargs)
            print("\t Result --> {}".format(result))
            return result
        return logged

Interestingly, I don't actually use such a thing in any kind of
production, but it could be a way to accomplish what's been proposed
here.

As a decorator, we usually would expect to use it with the decoration
syntax:

    @logged_func
    def a_new_func():
        ...

But it could also be used to re-bind an already defined function for
testing.  Using the original example:

    while (func1() + 2 * func2()) < func3():

could become "logged" by adding:

    func2 = logged_func(func2)
    while (func1() + 2 * func2()) < func3():

I actually like that better than inserting extra code into the line you
want to test.  And I'm pretty clueless about what you can do with
inspect -- but maybe some more magic could be added in the decorator if
you wanted that.

-CHB

On Fri, Apr 27, 2018 at 6:27 AM, Eric Fahlgren wrote:

> I've had a 'dprint' in sitecustomize for years.  It clones 'print' and
> adds a couple of keyword parameters, 'show_stack' and 'depth', which give
> control over traceback output (walks back up sys._getframe for 'depth'
> entries).  It returns the final argument if there is one, otherwise None.
> It can be used anywhere and everywhere that builtin print is used, plus
> anywhere in any expression just passing a single argument.
>
> I thought about replacing standard print with it, but I like the
> greppability of 'dprint' when it comes time to clean things.
>
>
> On Fri, Apr 27, 2018 at 6:05 AM, Steven D'Aprano
> wrote:
>
>> Actually, I think I can think of a way to make this work, if we're
>> willing to resurrect some old syntax.
>>
>> On Fri, Apr 27, 2018 at 09:27:34PM +1000, Steven D'Aprano wrote:
>> > I think that this is either a great idea or pointless, depending on
>> what
>> > the built-in actually does.
>> >
>> > If all it does is literally the debug print function you give:
>> >
>> > > # "debug print": prints and then returns its argument
>> > > def dp(obj):
>> > >     print(repr(obj))
>> > >     return obj
>> >
>> > then it is just a trivial helper as you say, and in my opinion too
>> > trivial to bother making a builtin.
>>
>> I changed my mind... let's add this as a builtin, under the name
>> debugprint.  It is a completely normal, non-magical function, which takes
>> four (not one) arguments:
>>
>>     def debugprint(obj, lineno=None, module=None, source=None):
>>         out = []
>>         if module is not None:
>>             if lineno is None:
>>                 lineno = "?"
>>             out.append(f"Line {lineno} of {module}")
>>         if source is not None:
>>             out.append(ascii(source))
>>         out.append(f"result {repr(obj)}")
>>         print(', '.join(out))
>>         return obj
>>
>> Now let's put all the magic into some syntax.  I'm going to suggest
>> resurrecting the `` backtick syntax from Python 2.  If that's not
>> visually distinct enough, we could double them: ``expression``.
>>
>> When the compiler sees an expression inside backticks, it grabs the name
>> of the module, the line number, and the expression source, and compiles
>> a call to:
>>
>>     debugprint(expression, lineno, module, source)
>>
>> in its place.
That's the only magic needed, and since it is entirely at
>> compile-time, all that information should be easily available. (I hope.)
>> If not, then simply replace the missing values with None.
>>
>> If the caller shadows debugprint, it is their responsibility to either
>> give it the correct signature, or not to use the backticks.  Since it's
>> just a normal function call, the worst that happens is that a mismatch
>> in arguments gives you a TypeError.
>>
>> Shadowing debugprint would be an easy way to disable backticks on a
>> per-module basis, at runtime.  Simply define:
>>
>>     def debugprint(obj, *args):
>>         return obj
>>
>> and Bob's yer uncle.
>>
>>
>> -- 
>> Steve
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info  Fri Apr 27 20:37:33 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 28 Apr 2018 10:37:33 +1000
Subject: [Python-ideas] [Python-Dev] (name := expression) doesn't fit the
 narrative of PEP 20
In-Reply-To: References: <4DA349EE-7C0C-4BD2-B2C7-A7D984869248@langa.pl>
 <20180426061000.GF7400@ando.pearwood.info>
 <04057199-F38D-47C8-A5D6-39F618824988@langa.pl>
 <20180427053837.GN7400@ando.pearwood.info>
Message-ID: <20180428003733.GS7400@ando.pearwood.info>

On Fri, Apr 27, 2018 at 04:24:35PM -0400, Wes Turner wrote:

> if ((1) or (x := 3)):
> if ((y := func(x)) if x else (x := 3))

Wes, there is *absolutely nothing new* here.  This sort of error is
already possible in Python.  Do you see a lot of code like:

    if (1 or sequence.append(3) or sequence[-1]):

in real life?  If you do, then I'm really, really sorry that you are
forced to deal with such rubbish code, but honestly, the vast bulk of
Python programmers do not write like that, and they won't write this
either:

    if (1 or (x := 3)):

[...]
> Assignment expressions, though they are noticeable :=, may not actually
> define the variable in cases where that part of the line doesn't run but
> reads as covered.

The same applies to any operation at all.

/sarcasm  I guess adding print() to the language was a mistake, because
we can write rubbish code like this:

    if 1 or print(x):

and then be confused by the fact that x doesn't get printed.
/end sarcasm

In another post, you insisted that we need to warn in the PEP and the
docs not to do this sort of thing.  Should we also go through and add
these warnings to list.append, dict.update, set.add, etc?

I trust that the answer to that is obviously no.  And neither do we
have to assume that people who use binding-expressions will lose their
minds and start writing the sort of awful code that they don't write
with anything else.
-- 
Steve

From tim.peters at gmail.com  Fri Apr 27 22:37:53 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 27 Apr 2018 21:37:53 -0500
Subject: [Python-ideas] A "local" pseudo-function
Message-ID: 

A brain dump, inspired by various use cases that came up during the
binding expression discussions.

Idea: introduce a "local" pseudo-function to capture the idea of
initialized names with limited scope.

As an expression, it's

    "local" "(" arguments ")"

- Because it "looks like" a function call, nobody will expect the targets
  of named arguments to be fancier than plain names.

- `a=12` in the "argument" list will (& helpfully so) mean pretty much the
  same as "a=12" in a "def" statement.

- In a "local call" on its own, the scope of a named argument begins at the
  start of the next (if any) argument, and ends at the closing ")".  For the
  duration, any variable of the same name in an enclosing scope is shadowed.

- The parentheses allow for extending over multiple lines without needing
  to teach editors (etc) any new tricks (they already know how to format
  function calls with arglists spilling over multiple lines).

- The _value_ of a local "call" is the value of its last "argument".  In
  part, this is a way to sneak in C's comma operator without adding cryptic
  new line noise syntax.

Time for an example.  First a useless one:

    a = 1
    b = 2
    c = local(a=3) * local(b=4)

Then `c` is 12, but `a` is still 1 and `b` is still 2.  Same thing in the end:

    c = local(a=3, b=4, a*b)

And just to be obscure, also the same:

    c = local(a=3, b=local(a=2, a*a), a*b)

There the inner `a=2` temporarily shadows the outer `a=3` just long
enough to compute `a*a` (4).

This is one that little else really handled nicely:

    r1, r2 = local(D = b**2 - 4*a*c,
                   sqrtD = math.sqrt(D),
                   twoa = 2*a,
                   ((-b + sqrtD)/twoa, (-b - sqrtD)/twoa))

Everyone's favorite:

    if local(m = re.match(regexp, line)):
        print(m.group(0))

Here's where it's truly essential that the compiler know everything
about "local", because in _that_ context it's required that the new
scope extend through the end of the entire block construct (exactly
what that means TBD - certainly through the end of the `if` block, but
possibly also through the end of its associated (if any) `elif` and
`else` blocks - and similarly for while/else constructs).

Of course that example could also be written as:

    if local(m = re.match(regexp, line), m):
        print(m.group(0))

or more specifically:

    if local(m = re.match(regexp, line), m is not None):
        print(m.group(0))

or even:

    if local(m = re.match(regexp, line)) is not None:
        print(m.group(0))

A listcomp example, building the squares of integers from an iterable
but only when the square is a multiple of 18:

    squares18 = [i2 for i in iterable if local(i2=i*i) % 18 == 0]

That's a bit mind-bending, but becomes clear if you picture the
kinda-equivalent nest:

    for i in iterable:
        if local(i2=i*i) % 18 == 0:
            append i2 to the output list

That should also make clear that if `iterable` or `i` had been named
`i2` instead, no problem.  The `i2` created by `local()` is in a
wholly enclosed scope.

Drawbacks: since this is just a brain dump, absolutely none ;-)

Q: Some of those would be clearer if it were the more Haskell-like

    local(...) "in" expression

A: Yup, but for some of the others needing to add "in m" would be
annoyingly redundant noise.  Making an "in" clause optional doesn't
really fly either, because then

    local(a='z') in 'xyz'

would be ambiguous.  Is it meant to return `'xyz'`, or evaluate `'z'
in 'xyz'`?
And any connector other than "in" would make the loose resemblance to Haskell purely imaginary ;-) Q: Didn't you drone on about how assignment expressions with complex targets seemed essentially useless without also introducing a "comma operator" - and now you're sneaking the latter in but _still_ avoiding complex targets?! A. Yes, and yes :-) The syntactic complexity of the fully general assignment statement is just too crushing to _sanely_ shoehorn into any "expression-like" context. Q: What's the value of this? local(a=7, local(a=a+1, a*2)) A: 16. Obviously. Q: Wow - that _is_ obvious! OK, what about this, where there is no `a` in any enclosing scope: local(a) A: I think it should raise NameError, just like a function call would. There is no _intent_ here to allow merely declaring a local variable without supplying an initial value. Q: What about local(2, 4, 5)? A: It should return 5, and introduce no names. I don't see a point to trying to outlaw stupidity ;-) Then again, it would be consistent with the _intent_ to require that all but the last "argument" be of the `name=expression` form. Q: Isn't changing the meaning of scope depending on context waaaay magical? A: Yup! But in a language with such a strong distinction between statements and expressions, without a bit of deep magic there's no single syntax I can dream up that could work well for both that didn't require _some_ deep magic. The gimmick here is something I expect will be surprising the first time it's seen, less so the second, and then you're never confused about it again. Q: Are you trying to kill PEP 572? A: Nope! But since this largely subsumes the functionality of binding expressions, I did want to put this out there before 572's fate is history. Binding expressions are certainly easier to implement, and I like them just fine :-) Note: the thing I'm most interested in isn't debates, but in whether this would be of real use in real code. From yselivanov.ml at gmail.com Fri Apr 27 23:34:44 2018 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Sat, 28 Apr 2018 03:34:44 +0000 Subject: [Python-ideas] A "local" pseudo-function In-Reply-To: References: Message-ID: Hi Tim, This is interesting. Even "as is" I prefer this to PEP 572. Below are some comments and a slightly different idea inspired by yours (sorry!) On Fri, Apr 27, 2018 at 10:41 PM Tim Peters wrote: [..] > As an expression, it's > "local" "(" arguments ")" > - Because it "looks like" a function call, nobody will expect the targets > of named arguments to be fancier than plain names. [..] > Everyone's favorite: > if local(m = re.match(regexp, line)): > print(m.group(0)) > Here's where it's truly essential that the compiler know everything > about "local", because in _that_ context it's required that the new > scope extend through the end of the entire block construct (exactly It does look like a function call, although it has a slightly different syntax. In regular calls we don't allow positional arguments to go after keyword arguments. Hence the compiler/parser will have to know what 'local(..)' is *regardless* of where it appears. If you don't want to make 'local' a new keyword, we would need to make the compiler/parser to trace the "local()" name to check if it was imported or is otherwise "local". This would add some extra complexity to already complex code. Another problematic case is when one has a big file and someone adds their own "def local()" function to it at some point, which would break things. Therefore, "local" should probably be a keyword. 
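That parsing point is easy to verify against today's grammar: the bare
call form is not even parseable, so no ordinary function named local()
could ever receive those "arguments".  A quick check (the error message
text is as printed by recent CPython versions):

    import ast

    # A positional expression after keyword arguments is a syntax error
    # in a regular call, so "local(...)" could not be compiled as one.
    try:
        ast.parse("c = local(a=3, b=4, a*b)")
    except SyntaxError as err:
        print(err.msg)  # positional argument follows keyword argument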
Perhaps added to Python with a corresponding "from __future__" import. The other way would be to depart from the function call syntax by dropping the parens. (And maybe rename "local" to "let" ;)) In this case, the syntax will become less like a function call but still distinct enough. We will be able to unambiguously parse & compile it. The cherry on top is that we can make it work even without a "__future__" import! When we implemented PEP 492 in Python 3.5 we did a little trick in tokenizer to treat "async def" in a special way. Tokenizer would switch to an "async" mode and yield ASYNC and AWAIT tokens instead of NAME tokens. This resulted in async/await syntax available without a __future__ import, while having full backwards compatibility. We can do a similar trick for "local" / "let" syntax, allowing the following: "let" NAME "=" expr ("," NAME = expr)* ["," expr] * "if local(m = re.match(...), m):" becomes "if let m = re.match(...), m:" * "c = local(a=3) * local(b=4)" becomes "c = let a=3, b=4, a*b" or "c = (let a=3, b=4, a*b)" * for i in iterable: if let i2=i*i, i2 % 18 == 0: append i2 to the output list etc. Note that I don't propose this new "let" or "local" to return their last assignment. That should be done explicitly (as in your "local(..)" idea): `let a = 'spam', a`. Potentially we could reuse our function return annotation syntax, changing the last example to `let a = "spam" -> a` but I think it makes the whole thing to look unnecessarily complex. One obvious downside is that "=" would have a different precedence compared to a regular assignment statement. But it already has a different precedent in function calls, so maybe this isn't a big deal, considered that we'll have a keyword before it. I think that "let" was discussed a couple of times recently, but it's really hard to find a definitive reason of why it was rejected (or was it?) in the ocean of emails about assignment expressions. Yury From ncoghlan at gmail.com Sat Apr 28 00:42:27 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 28 Apr 2018 14:42:27 +1000 Subject: [Python-ideas] Allow multiple imports from a package while preserving its namespace In-Reply-To: References: Message-ID: On 28 April 2018 at 02:18, Eric Snow wrote: > On the plus side, it means one less thing for programmers to do. On > the minus side, I find the imports at the top of the file to be a nice > catalog of external dependencies. Implicitly importing submodules > would break that. > > The idea might be not as useful since the programmer would have to use > the fully qualified name (relative to the "top-level" package). So > why not just put that in the import statement? > I'm mainly thinking of it in terms of inadvertently breaking abstraction layers. 
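The failure mode in question is easy to reproduce with the stdlib
(email happens to be a package that does not eagerly import its mime
subpackage; it is used here purely as an illustration):

    # Whether attribute access on the parent package works depends on
    # what some *other* module happens to have imported already.
    import email
    try:
        email.mime
    except AttributeError as err:
        print(err)       # module 'email' has no attribute 'mime'

    import email.mime    # after this, email.mime works for everyone
    print(email.mime)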
Right now, implementation decisions of third-party package authors get exposed to end users, as the question of whether or not a submodule gets eagerly imported by the parent module, or explicitly configured as a lazy import, is exposed to end users: Parent package eagerly imports submodules: no explicit submodule import needed, just import the parent package Parent package explicitly sets up a lazy submodule import: no explicit submodule import needed, just import the parent package Parent package doesn't do either: possible AttributeError at time of use depending on whether or not you or someone else has previously run "import package.submodule" (or an equivalent) The current attribute error is cryptic (and not always raised if some other module has already done the import!), so trying the submodule import implicitly would also provide an opportunity to customise the way the failure is reported. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Sat Apr 28 02:07:29 2018 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 28 Apr 2018 01:07:29 -0500 Subject: [Python-ideas] A "local" pseudo-function In-Reply-To: References: Message-ID: [Yury Selivanov ] > This is interesting. Even "as is" I prefer this to PEP 572. Below are some > comments and a slightly different idea inspired by yours (sorry!) That's fine :-) > It does look like a function call, although it has a slightly different > syntax. In regular calls we don't allow positional arguments to go after > keyword arguments. Hence the compiler/parser will have to know what > 'local(..)' is *regardless* of where it appears. Not only for that reason, but because the semantics have almost nothing in common with function calls. For example, in local(a=1, b=a+1) the new binding of `a` needs to be used to establish the binding of `b`. Not to mention that a new scope may need to be established, and torn down later. To the compiler, it's approximately nothing like "a function call". "Looking like" a function call nevertheless has compelling benefits: - People already know the syntax for specifying keyword arguments. - The precedence of "=" in a function call is already exactly right for this purpose. So nothing new to learn there either. - The explicit parentheses make it impossible to misunderstand where the expression begins or ends. - Even if someone knows nothing about "local", they _will_ correctly assume that, at runtime, it will evaluate to an object. In that key respect it is exactly like the function call it "looks like". I do want to leverage what people "already know". > If you don't want to make 'local' a new keyword, we would need to make the > compiler/parser to trace the "local()" name to check if it was imported or > is otherwise "local". This would add some extra complexity to already > complex code. Another problematic case is when one has a big file and > someone adds their own "def local()" function to it at some point, which > would break things. I believe it absolutely needs to become a reserved word. 
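For context, "local" is an ordinary identifier in current Python, so
any existing code is free to use the name; that is the compatibility
cost being weighed here:

    import keyword

    print(keyword.iskeyword('local'))  # False - not reserved today
    local = 42                         # perfectly legal current Python
    print(local)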
While binding expressions suffice to capture values in conditionals,
they're not all that pleasant for use in expressions (that really wants
a comma operator too), and it's a fool's errand to imagine that _any_
currently-unused sequence of gibberish symbols and/or abused keywords
can plausibly suggest "and here we're also creating a new scope, and
here I'm declaring some names to live in that scope".  If all that is
actually wanted, a new reserved word seems pragmatically necessary.
That's a high bar.

Speaking of which, "sublocal" would be a more accurate name, and less
likely to conflict with existing code names, but after people got over
the shock to their sense of purity, they'd appreciate typing the
shorter "local" instead ;-)

For that matter, I'd be fine too with shortening it to "let".  In
fact, I prefer that!  Thanks :-)

> Therefore, "local" should probably be a keyword.  Perhaps added to Python
> with a corresponding "from __future__" import.

Yes.

> The other way would be to depart from the function call syntax by dropping
> the parens. (And maybe rename "local" to "let" ;))  In this case, the
> syntax will become less like a function call but still distinct enough.  We
> will be able to unambiguously parse & compile it.  The cherry on top is
> that we can make it work even without a "__future__" import!

I'm far less concerned about pain for the compiler than pain for the
human reader.  There's almost no precedent in Python's expression
grammar that allows two names separated only by whitespace.  That
screams "statement" instead, whether "class NAME" or "async for" or
"import modulename" or "global foo" or "for i in" or ...  I'm afraid it
would prove just too jarring to see that inside what's supposed to be
"an expression".  That's why I settled on the (admittedly unexpected)
"pseudo-function" syntax.

The exceptions involve expressions that are really test-&-branch
structures in disguise ("and", "or", ternary "if").  In those we can
find NAME NAME snippets, but the keyword is never at the start of
those.  So, curiously enough, I'd be fonder of

    result_expression "where" name=expression, ...

than of

    "let" name=expression, ...

if I hadn't already resigned myself to believing function-like syntax
is overwhelmingly less jarring regardless.

> When we implemented PEP 492 in Python 3.5 we did a little trick in
> tokenizer to treat "async def" in a special way.  Tokenizer would switch to
> an "async" mode and yield ASYNC and AWAIT tokens instead of NAME tokens.
> This resulted in async/await syntax available without a __future__ import,
> while having full backwards compatibility.

Which was clever, and effective, but - as far as I know - limited to
_statements_, where KEYWORD NAME thingies were already common as mud.

> We can do a similar trick for "local" / "let" syntax, allowing the
> following:
>
>     "let" NAME "=" expr ("," NAME = expr)* ["," expr]

See the bullet list near the top for all the questions that _raises_
that don't even need to be asked (by users) when using a function-like
syntax instead.

> * "if local(m = re.match(...), m):" becomes
>   "if let m = re.match(...), m:"
>
> * "c = local(a=3) * local(b=4)" becomes
>   "c = let a=3, b=4, a*b" or "c = (let a=3, b=4, a*b)"

I assume that was meant to be

    c = (let a=3, a) * (let b=4, b)

instead.  In _an expression_, I naturally group the

    a = 3, a

part as the unintended

    a = (3, a)

When I'm thinking of function calls, though, I naturally use the intended

    (a=3), a

grouping.
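Both groupings can be checked against the real grammar with the ast
module: in statement context the comma wins, while in a call the "="
binds first.  A small sketch:

    import ast

    # "a = 3, a" parses as a = (3, a): the assigned value is a Tuple node.
    assign = ast.parse("a = 3, a").body[0]
    print(type(assign.value).__name__)   # Tuple

    # In a call, "a=3" is a single keyword argument instead.
    call = ast.parse("f(a=3)").body[0].value
    print(call.keywords[0].arg)          # a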
I really don't want to fight with what "everyone already knows", but build on that to the extent possible. > * for i in iterable: > if let i2=i*i, i2 % 18 == 0: > append i2 to the output list > > etc. In the second line there too, I did like that in if local(i2=i*i) % 18 == 0: the mandatory parentheses made it wholly explicit where "the binding part" ended. > Note that I don't propose this new "let" or "local" to return their last > assignment. That should be done explicitly (as in your "local(..)" idea): > `let a = 'spam', a`. Why not? A great many objects in Python are _designed_ so that their __bool__ method does a useful thing in a plain if object: test. In these common cases, needing to type if let name=object, object: instead of if let name=object: is an instance of what my pseudo-FAQ called "annoyingly redundant noise". Deciding to return the value of the last "argument" expression was a "practicality beats purity" thing. > Potentially we could reuse our function return annotation syntax, changing the > last example to `let a = "spam" -> a` but I think it makes the whole thing to > look unnecessarily complex. Me too - but then I thought it was _already_ too wordy to require `let a="spam", a` ;-) > One obvious downside is that "=" would have a different precedence compared > to a regular assignment statement. But it already has a different precedent > in function calls, so maybe this isn't a big deal, considered that we'll > have a keyword before it. I think my response to that is too predictable by now to annoy you by giving it ;-) > I think that "let" was discussed a couple of times recently, but it's > really hard to find a definitive reason of why it was rejected (or was it?) > in the ocean of emails about assignment expressions. I don't know either. There were a number of halfheartedly presented ideas, though, that floundered (to my eyes) on the rocks of trying to ignore that syntax ideas borrowed from "everything's an expression" functional languages don't carry over well to a language where many constructs aren't expressions at all. Or, conversely, that syntax unique to the opening line of a statement-oriented language's block doesn't carry over at all to expressions. I'm trying to do both at once, because I'm not a wimp - and neither are you ;-) From yselivanov.ml at gmail.com Sat Apr 28 03:16:40 2018 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Sat, 28 Apr 2018 07:16:40 +0000 Subject: [Python-ideas] A "local" pseudo-function In-Reply-To: References: Message-ID: On Sat, Apr 28, 2018 at 2:07 AM Tim Peters wrote: [...] > Speaking of which, "sublocal" would be a more accurate name, and less > likely to conflict with existing code names, but after people got over > the shock to their sense of purity, they'd appreciate typing the > shorter "local" instead ;-) > For that matter, I'd be fine too with shortening it to "let". In > fact, I prefer that! Thanks :-) Great! :) [...] > So, curiously enough, I'd be fonder of > result_expression "where" name=expression, ... > than of > "let" name=expression, ... My gripe about the "where"-like syntax family (including "as", proposed by many) is that the whole expression needs to be read backwards to make sense of `name*. It's one of the things that annoy me in languages like SQL, where the relevant "AS" keyword can be buried a couple of screens below/above the area of interest. That's why I find the "let" form superior. [..] > instead. 
In _an expression_, I naturally group the
>     a = 3, a
> part as the unintended
>     a = (3, a)
> When I'm thinking of function calls, though, I naturally use the intended
>     (a=3), a
> grouping.  I really don't want to fight with what "everyone already
> knows", but build on that to the extent possible.

Looking at all of these and other examples in your email I have to
agree that the version with parentheses reads way more clearly.  A
different (and correct in this case) precedence of "," between parens
is pretty much hardwired in our brains.

[..]
> Why not?  A great many objects in Python are _designed_ so that their
> __bool__ method does a useful thing in a plain
>     if object:
> test.  In these common cases, needing to type
>     if let name=object, object:

Alright.  And yes, I agree, surrounding parens are badly needed.  Maybe
we can replace parens with {}?  Although my immediate reaction to that
is "it's too ugly to be taken seriously".

In any case, the similarity of this new syntax to a function call still
bothers me.  If I just looked at some Python 3.xx code and saw the new
"let(..)" syntax, I would assume that the names it declares are only
visible *within* the parens.  Parens imply some locality of whatever is
happening between them.  Even if I googled the docs first to learn some
basics about "let", when I saw "if let(name...)" it wouldn't be
immediately apparent to me that `name` is set only for its "if" block
(contrary to "if let a = 1: print(a)").

Yury

From steve at pearwood.info  Sat Apr 28 04:30:44 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 28 Apr 2018 18:30:44 +1000
Subject: [Python-ideas] [Python-Dev] (name := expression) doesn't fit the
 narrative of PEP 20
In-Reply-To: <20180428003733.GS7400@ando.pearwood.info>
References: <4DA349EE-7C0C-4BD2-B2C7-A7D984869248@langa.pl>
 <20180426061000.GF7400@ando.pearwood.info>
 <04057199-F38D-47C8-A5D6-39F618824988@langa.pl>
 <20180427053837.GN7400@ando.pearwood.info>
 <20180428003733.GS7400@ando.pearwood.info>
Message-ID: <20180428083043.GT7400@ando.pearwood.info>

On Sat, Apr 28, 2018 at 10:37:33AM +1000, Steven D'Aprano wrote:
> On Fri, Apr 27, 2018 at 04:24:35PM -0400, Wes Turner wrote:
[...]

Oops, sent to the wrong list.  Sorry for the noise.

-- 
Steve

From kenlhilton at gmail.com  Sat Apr 28 04:33:32 2018
From: kenlhilton at gmail.com (Ken Hilton)
Date: Sat, 28 Apr 2018 16:33:32 +0800
Subject: [Python-ideas] Should __builtins__ have some kind of pass-through
 print function, for debugging?
Message-ID: 

Perhaps there could be a special magic method that classes could
implement for such a function?  I.e. something like this:

    class Pizza(object):
        diameter = 1
        toppings = []

        def __init__(self, diameter=1, toppings=[]):
            self.diameter = diameter
            self.toppings = toppings

        def __str__(self):
            return '{}-inch pizza, toppings: {}'.format(
                self.diameter, ', '.join(self.toppings))

        def __repr__(self):
            return '<{}-inch Pizza>'.format(self.diameter)

        def __dprint__(self):
            return '<Pizza {}>'.format(self.__dict__)

This is just an idea on the side - basically, have a new magic method
for "dprint" (or whatever it should be called) that returns a separate
string form of an object to be debug-printed.
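A minimal sketch of the dispatcher this description implies (the dprint
name and the fallback-to-__str__ rule come from the surrounding
messages; no such builtin exists anywhere yet):

    def dprint(obj, **print_kwargs):
        # Print the object's debug form, falling back to str(), and
        # pass the object through so dprint() can nest in expressions.
        special = getattr(type(obj), '__dprint__', None)
        text = special(obj) if special is not None else str(obj)
        print(text, **print_kwargs)
        return obj

With the Pizza class above, dprint(Pizza()) would print the __dprint__
form and hand the pizza back unchanged.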
So the following example (following up from the Pizza example):

    def cheesepizza(size):
        return Pizza(size, ['cheese'])

    print(cheesepizza(1))
    print(repr(cheesepizza(1)))
    print(dprint(cheesepizza(1)))

Should produce the following output:

    1-inch pizza, toppings: cheese
    <1-inch Pizza>
    <Pizza {'diameter': 1, 'toppings': ['cheese']}>
    1-inch pizza, toppings: cheese

Note the last line - remember, dprint passes through the first
argument, printing it in the process.  So ``print(dprint(x))`` should
first output ``x.__dprint__() + '\n'`` then output ``x.__str__() +
'\n'``.

Behavior note: if __dprint__ is not available, fall back to __str__.

Such an extra magic method would be useful if the class should have a
different representation when debugging (i.e. maybe something makes use
of repr(x), but dprint(x) should only be used for debugging so it shows
more information).

Also remember that it's totally plausible to simply state

    dprint(cheesepizza(1))

and thereby throw away the return value; such use would be like "print"
but more verbose (again assuming __dprint__ is implemented).

I would also like to ask whether "dprint" should support the same
keyword arguments as "print" - i.e. "end", "file", and so on.  IMO it
should, as the only difference is that "dprint" returns its first
argument while "print" returns None.

But what are your thoughts?

Sincerely,
Ken Hilton

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From p.f.moore at gmail.com  Sat Apr 28 05:09:09 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Sat, 28 Apr 2018 10:09:09 +0100
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: References:
Message-ID: 

On 28 April 2018 at 03:37, Tim Peters wrote:
> Idea: introduce a "local" pseudo-function to capture the idea of
> initialized names with limited scope.

This looks disturbingly good to me.  I say "disturbingly" because the
amount of magic involved in that "function call" is pretty high, and
the more you look at it, the more weird its behaviour seems.  I have
essentially no real world use cases for this, though (other than ones
that have been brought up already in the binding expression debate),
so my comments are purely subjective opinion.

[...]
> Everyone's favorite:
>
>     if local(m = re.match(regexp, line)):
>         print(m.group(0))
>
> Here's where it's truly essential that the compiler know everything
> about "local", because in _that_ context it's required that the new
> scope extend through the end of the entire block construct (exactly
> what that means TBD - certainly through the end of the `if` block, but
> possibly also through the end of its associated (if any) `elif` and
> `else` blocks - and similarly for while/else constructs).

This is where, in my mind, the magic behaviour goes too far.  I can see
why it's essential that this happens, but I can't come up with a
justification for it other than pure expediency.  And while I know
"practicality beats purity" (look at me, daring to quote the Zen at
Tim!!! :-)) this just feels like a step too far to me.

Paul

From steve at pearwood.info  Sat Apr 28 05:33:34 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 28 Apr 2018 19:33:34 +1000
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: References:
Message-ID: <20180428093334.GU7400@ando.pearwood.info>

On Fri, Apr 27, 2018 at 09:37:53PM -0500, Tim Peters wrote:
> A brain dump, inspired by various use cases that came up during the
> binding expression discussions.
>
> Idea: introduce a "local" pseudo-function to capture the idea of
> initialized names with limited scope.
[...]
Chris' PEP 572 started off with the concept that binding expressions
would create a "sub-local" scope, below function locals.  After some
debate on Python-Ideas, Chris, Nick and Guido took the discussion off
list and decided to drop the sub-local scope idea as confusing and hard
to implement.

But the biggest problem is that this re-introduces exactly the same
awful C mistake that := was chosen to avoid.  Which of the following
two contains the typo?

    local(spam=expression, eggs=expression, cheese = spam+eggs)

    local(spam=expression, eggs=expression, cheese == spam+eggs)

I have other objections, but I'll leave them for now, since I think
these two alone are fatal.

Once you drop those two flaws, you're basically left with PEP 572 :-)

-- 
Steve

From steve at pearwood.info  Sat Apr 28 05:36:37 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 28 Apr 2018 19:36:37 +1000
Subject: [Python-ideas] Should __builtins__ have some kind of pass-through
 print function, for debugging?
In-Reply-To: References:
Message-ID: <20180428093637.GV7400@ando.pearwood.info>

On Sat, Apr 28, 2018 at 04:33:32PM +0800, Ken Hilton wrote:
> Perhaps there could be a special magic method that classes could implement
> for such a function? I.e. something like this:

That doesn't help for the scenario in the original motivating example,
where there are no classes involved.

-- 
Steve

From schesis at gmail.com  Sat Apr 28 06:22:04 2018
From: schesis at gmail.com (Zero Piraeus)
Date: Sat, 28 Apr 2018 11:22:04 +0100
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: References:
Message-ID: 

:

On 28 April 2018 at 07:07, Tim Peters wrote:
> [...] For that matter, I'd be fine too with shortening it to "let".  In
> fact, I prefer that!  Thanks :-)

If you really wanted to overcome objections that this looks too much
like a function, you could spell it "$".  I'm not sure what I think
about

    $(a=7, $(a=a+1, a*2))

yet, but it doesn't make me want to run screaming in the way that :=
does.

I think I finally worked out why I have such a violent reaction to :=
in the end, by the way: it's because it reminds me of Javascript's
"===" (not the meaning, but the fact that it exists).

 -[]z.

From njs at pobox.com  Sat Apr 28 06:35:23 2018
From: njs at pobox.com (Nathaniel Smith)
Date: Sat, 28 Apr 2018 03:35:23 -0700
Subject: [Python-ideas] Should __builtins__ have some kind of pass-through
 print function, for debugging?
In-Reply-To: References: <20180427112733.GP7400@ando.pearwood.info>
Message-ID: 

On Fri, Apr 27, 2018 at 5:58 AM, Chris Angelico wrote:
> On Fri, Apr 27, 2018 at 9:27 PM, Steven D'Aprano wrote:
> I don't think this needs any specific compiler magic or making 'dp' a
> reserved name, but it might well be a lot easier to write if there
> were some compiler features provided to _all_ functions.  For instance,
> column positions are currently available in SyntaxErrors, but not
> other exceptions:
>
>>>> x = 1
>>>> print("spam" + x)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: can only concatenate str (not "int") to str
>>>> print("spam" : x)
>   File "<stdin>", line 1
>     print("spam" : x)
>                  ^
> SyntaxError: invalid syntax
>
> Imagine if the TypeError could show a caret, pointing to the plus
> sign.  That would require that a function store column positions, not
> just line numbers.  I'm not sure how much overhead it would add, nor
> how much benefit you'd really get from those markers, but it would
> then be the same mechanic for exception tracebacks and for
> semi-magical functions like this.
Being able to add carets to tracebacks in general would be quite nice
actually.  Imagine:

    Traceback (most recent call last):
      File "/tmp/blah.py", line 16, in <module>
        print(foo())
              ^^^^^
      File "/tmp/blah.py", line 6, in foo
        return bar(1) + bar(2)
               ^^^^^^
      File "/tmp/blah.py", line 10, in bar
        return baz(2 * x) / baz(2 * x + 1)
               ^^^^^^^^^^
      File "/tmp/blah.py", line 13, in baz
        return 1 + 1 / (x - 4)
                   ^^^^^^^^^^^
    ZeroDivisionError: division by zero

This is how I report error messages in patsy[1], and people seem to
appreciate it... it would also help Python catch back up with other
languages whose error reporting has gotten much friendlier in recent
years (e.g., rust, clang).

Threading column numbers through the compiler might be tedious but
AFAICT should be straightforward in principle.  (Peephole optimizations
and similar might be a bit of a puzzle, but you can do pretty crude
things like saying new_span_start = min(*old_span_starts);
new_span_end = max(*old_span_ends) and still get something that's at
least useful, even if not necessarily 100% theoretically accurate.)

The runtime overhead would be essentially zero, since this would be a
static table that only gets consulted when printing tracebacks, similar
to the lineno table.  (Tracebacks already preserve f_lasti.)  So I
think the main issue would be the extra memory in each code object to
hold the bytecode offset -> column numbers table.  We'd need some
actual numbers to judge this for real, but my guess is that the gain in
usability+friendliness would be easily worth it for 99% of users, and
the other 1% are already plotting how to add options to strip out
unnecessary things like type annotations, so if it's a problem then
this could be another thing for them to add to their list: leave out
these tables at -OOO or whatever.

-n

[1] https://patsy.readthedocs.io/en/latest/overview.html

-- 
Nathaniel J. Smith -- https://vorpus.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info  Sat Apr 28 07:17:47 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 28 Apr 2018 21:17:47 +1000
Subject: [Python-ideas] Should __builtins__ have some kind of pass-through
 print function, for debugging?
In-Reply-To: <3C84CEEE-6996-4300-AB55-82C4767067A4@gmail.com>
References: <20180427112733.GP7400@ando.pearwood.info>
 <20180427130500.GR7400@ando.pearwood.info>
 <3C84CEEE-6996-4300-AB55-82C4767067A4@gmail.com>
Message-ID: <20180428111747.GX7400@ando.pearwood.info>

On Fri, Apr 27, 2018 at 12:35:22PM -0400, Clint Hepner wrote:

> I don't want to hijack the thread on a digression, but instead of bringing `` back for
> just this one purpose, it could be used as a prefix to define a candidate
> pool of new keywords.
>
>     ``debugprint(obj)  # instead of ``obj`` meaning debugprint(obj)

Part of the reason for me picking `obj` or ``obj`` is that I want it to
be short and easy to type, and not very distracting.  Neither of those
apply to debugprint.

-- 
Steve

From steve at pearwood.info  Sat Apr 28 07:25:23 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 28 Apr 2018 21:25:23 +1000
Subject: [Python-ideas] Should __builtins__ have some kind of pass-through
 print function, for debugging?
There's no way that I know of for > > a Python function to see the source code of its own arguments. I have no > > idea what sort of deep voodoo would be required to make this work. But > > if it could work, wow, that would really be useful. And not just for > > beginners. > > > > If you relax the enhancement to just noting the line where the debug print > came from, it doesn't need to be deep compiler magic - the same kind of > stack introspection that warnings and tracebacks use would suffice. I don't think this is worth bothering with if we relax the enhancement to just the line. As such, there are already ways to get the desired result, and people can just add it to the startup file or personal toolbox module. It doesn't need to be a builtin. Maybe in the std lib? from pdb import debugprint as dp perhaps, but not a builtin. Although it will require compiler support, I think that being able to drill down to the level of individual expressions would be a fantastic aid to debugging. -- Steve From steve at pearwood.info Sat Apr 28 07:27:36 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 28 Apr 2018 21:27:36 +1000 Subject: [Python-ideas] Should __builtins__ have some kind of pass-through print function, for debugging? In-Reply-To: References: <20180427112733.GP7400@ando.pearwood.info> <20180427130500.GR7400@ando.pearwood.info> Message-ID: <20180428112735.GZ7400@ando.pearwood.info> On Fri, Apr 27, 2018 at 06:27:44AM -0700, Eric Fahlgren wrote: > I've had a 'dprint' in sitecustomize for years. It clones 'print' and adds > a couple of keyword parameters, 'show_stack' and 'depth', which give > control over traceback output (walks back up sys._getframe for 'depth' > entries). It returns the final argument if there is one, otherwise None. > It can be used anywhere and everywhere that builtin print is used, plus > anywhere in any expression just passing a single argument. Is this published anywhere? -- Steve From rosuav at gmail.com Sat Apr 28 07:32:16 2018 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 28 Apr 2018 21:32:16 +1000 Subject: [Python-ideas] A "local" pseudo-function In-Reply-To: References: Message-ID: On Sat, Apr 28, 2018 at 8:22 PM, Zero Piraeus wrote: > I think I finally worked out why I have such a violent reaction to := > in the end, by the way: it's because it reminds me of Javascript's > "===" (not the meaning, but the fact that it exists). Out of morbid curiosity, why does it remind you of that? ChrisA From rymg19 at gmail.com Sat Apr 28 09:52:51 2018 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Sat, 28 Apr 2018 08:52:51 -0500 Subject: [Python-ideas] A "local" pseudo-function In-Reply-To: References: Message-ID: <1630c871b38.2837.db5b03704c129196a4e9415e55413ce6@gmail.com> I have to say, this idea feels really nice to me. It's far easier to read than := and separates the assignments and the result expression nicely. Others have brought up the same problem of = vs ==. IMO a solution could be to make a requirement that the last argument is NOT an assignment. In other words, this would be illegal: local(a=1) and you would have to do this: local(a=1, a) Now if the user mixes up = and ==, it'd be a "compile-time error". -- Ryan (????) Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else https://refi64.com/ On April 27, 2018 9:41:57 PM Tim Peters wrote: > A brain dump, inspired by various use cases that came up during the > binding expression discussions. 
> > Idea: introduce a "local" pseudo-function to capture the idea of > initialized names with limited scope. > > As an expression, it's > > "local" "(" arguments ")" > > - Because it "looks like" a function call, nobody will expect the targets > of named arguments to be fancier than plain names. > > - `a=12` in the "argument" list will (& helpfully so) mean pretty much the > same as "a=12" in a "def" statement. > > - In a "local call" on its own, the scope of a named argument begins at the > start of the next (if any) argument, and ends at the closing ")". For the > duration, any variable of the same name in an enclosing scope is shadowed. > > - The parentheses allow for extending over multiple lines without needing > to teach editors (etc) any new tricks (they already know how to format > function calls with arglists spilling over multiple lines). > > - The _value_ of a local "call" is the value of its last "argument". In > part, this is a way to sneak in C's comma operator without adding cryptic > new line noise syntax. > > Time for an example. First a useless one: > > a = 1 > b = 2 > c = local(a=3) * local(b=4) > > Then `c` is 12, but `a` is still 1 and `b` is still 2. Same thing in the end: > > c = local(a=3, b=4, a*b) > > And just to be obscure, also the same: > > c = local(a=3, b=local(a=2, a*a), a*b) > > There the inner `a=2` temporarily shadows the outer `a=3` just long > enough to compute `a*a` (4). > > This is one that little else really handled nicely: > > r1, r2 = local(D = b**2 - 4*a*c, > sqrtD = math.sqrt(D), > twoa = 2*a, > ((-b + sqrtD)/twoa, (-b - sqrtD)/twoa)) > > Everyone's favorite: > > if local(m = re.match(regexp, line)): > print(m.group(0)) > > Here's where it's truly essential that the compiler know everything > about "local", because in _that_ context it's required that the new > scope extend through the end of the entire block construct (exactly > what that means TBD - certainly through the end of the `if` block, but > possibly also through the end of its associated (if any) `elif` and > `else` blocks - and similarly for while/else constructs). > > Of course that example could also be written as: > > if local(m = re.match(regexp, line), m): > print(m.group(0)) > > or more specifically: > > if local(m = re.match(regexp, line), m is not None): > print(m.group(0)) > > or even: > > if local(m = re.match(regexp, line)) is not None: > print(m.group(0)) > > A listcomp example, building the squares of integers from an iterable > but only when the square is a multiple of 18: > > squares18 = [i2 for i in iterable if local(i2=i*i) % 18 == 0] > > That's a bit mind-bending, but becomes clear if you picture the > kinda-equivalent nest: > > for i in iterable: > if local(i2=i*i) % 18 == 0: > append i2 to the output list > > That should also make clear that if `iterable` or `i` had been named > `i2` instead, no problem. The `i2` created by `local()` is in a > wholly enclosed scope. > > Drawbacks: since this is just a brain dump, absolutely none ;-) > > Q: Some of those would be clearer if it were the more Haskell-like > > local(...) "in" expression > > A: Yup, but for some of the others needing to add "in m" would be > annoyingly redundant noise. Making an "in" clause optional doesn't > really fly either, because then > > local(a='z') in 'xyz' > > would be ambiguous. Is it meant to return `'xyz'`, or evaluate `'z' > in 'xyz'`? 
And any connector other than "in" would make the loose > resemblance to Haskell purely imaginary ;-) > > Q: Didn't you drone on about how assignment expressions with complex > targets seemed essentially useless without also introducing a "comma > operator" - and now you're sneaking the latter in but _still_ avoiding > complex targets?! > > A. Yes, and yes :-) The syntactic complexity of the fully general > assignment statement is just too crushing to _sanely_ shoehorn into > any "expression-like" context. > > Q: What's the value of this? local(a=7, local(a=a+1, a*2)) > > A: 16. Obviously. > > Q: Wow - that _is_ obvious! OK, what about this, where there is no > `a` in any enclosing scope: local(a) > > A: I think it should raise NameError, just like a function call would. > There is no _intent_ here to allow merely declaring a local variable > without supplying an initial value. > > Q: What about local(2, 4, 5)? > > A: It should return 5, and introduce no names. I don't see a point to > trying to outlaw stupidity ;-) Then again, it would be consistent > with the _intent_ to require that all but the last "argument" be of > the `name=expression` form. > > Q: Isn't changing the meaning of scope depending on context waaaay magical? > > A: Yup! But in a language with such a strong distinction between > statements and expressions, without a bit of deep magic there's no > single syntax I can dream up that could work well for both that didn't > require _some_ deep magic. The gimmick here is something I expect > will be surprising the first time it's seen, less so the second, and > then you're never confused about it again. > > Q: Are you trying to kill PEP 572? > > A: Nope! But since this largely subsumes the functionality of binding > expressions, I did want to put this out there before 572's fate is > history. Binding expressions are certainly easier to implement, and I > like them just fine :-) > > > Note: the thing I'm most interested in isn't debates, but in whether > this would be of real use in real code. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From fakedme+py at gmail.com Sat Apr 28 10:21:09 2018 From: fakedme+py at gmail.com (Soni L.) Date: Sat, 28 Apr 2018 11:21:09 -0300 Subject: [Python-ideas] A "local" pseudo-function In-Reply-To: References: Message-ID: <08296785-eff2-89ae-1b9e-0cd7c0111fa4@gmail.com> On 2018-04-27 11:37 PM, Tim Peters wrote: > A brain dump, inspired by various use cases that came up during the > binding expression discussions. > > Idea: introduce a "local" pseudo-function to capture the idea of > initialized names with limited scope. > > As an expression, it's > > "local" "(" arguments ")" > > - Because it "looks like" a function call, nobody will expect the targets > of named arguments to be fancier than plain names. > > - `a=12` in the "argument" list will (& helpfully so) mean pretty much the > same as "a=12" in a "def" statement. > > - In a "local call" on its own, the scope of a named argument begins at the > start of the next (if any) argument, and ends at the closing ")". For the > duration, any variable of the same name in an enclosing scope is shadowed. > > - The parentheses allow for extending over multiple lines without needing > to teach editors (etc) any new tricks (they already know how to format > function calls with arglists spilling over multiple lines). 
Why not non-lexical variables? Basically, make this work (and print 3):

    def test():
        i = 3
        def test_inner():
            print(i)
        hide i
        i = 4
        test_inner()  # 3
        print(i)  # 4

`hide`, unlike `del`, only applies to the current scope, and only
forward. It does what it says on the tin: makes the variable
disappear/be hidden.

Of course, Python doesn't support shadowing of variables, so this is
kinda useless, but I thought it was a cool idea anyway :/

(See also this thread on the Lua mailing list:
https://marc.info/?l=lua-l&m=152149915527486&w=2 )
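P.S. For contrast, here's what today's Python actually does with that
shape of code - a runnable sketch (without the proposed `hide`) showing
why the inner function prints 4, not 3: closures see the *current*
value of a name, not the value it had when the inner function was
defined.

    def test():
        i = 3
        def test_inner():
            print(i)
        i = 4
        test_inner()  # 4 - late binding; `hide` is what would make it 3
        print(i)      # 4
    test()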
From ericfahlgren at gmail.com  Sat Apr 28 11:15:47 2018
From: ericfahlgren at gmail.com (Eric Fahlgren)
Date: Sat, 28 Apr 2018 08:15:47 -0700
Subject: [Python-ideas] Should __builtins__ have some kind of pass-through print function, for debugging?
In-Reply-To: <20180428112735.GZ7400@ando.pearwood.info>
References: <20180427112733.GP7400@ando.pearwood.info>
 <20180427130500.GR7400@ando.pearwood.info>
 <20180428112735.GZ7400@ando.pearwood.info>
Message-ID: 

On Sat, Apr 28, 2018 at 4:27 AM, Steven D'Aprano wrote:
> On Fri, Apr 27, 2018 at 06:27:44AM -0700, Eric Fahlgren wrote:
>
> > I've had a 'dprint' in sitecustomize for years. It clones 'print' and adds
> > a couple of keyword parameters, 'show_stack' and 'depth', which give
> > control over traceback output (walks back up sys._getframe for 'depth'
> > entries). It returns the final argument if there is one, otherwise None.
> > It can be used anywhere and everywhere that builtin print is used, plus
> > anywhere in any expression just passing a single argument.
>
> Is this published anywhere?

Sorry, no, it's part of work code, but it's pretty simple stuff.

'get_stack' is a debug-quality stack dumper (my memory failed, now uses
inspect instead of the more primitive sys._getframe), used whenever anyone
wants to see where they are.  The 'all' parameter lets you filter out some
stdlib entries.
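The dprint wrapper itself is only a few lines - this is reconstructed
from memory and simplified, so treat it as a sketch rather than the
real work code:

import sys
import traceback

def dprint(*args, show_stack=False, depth=1, **kwargs):
    """Pass-through print: optionally dumps a bit of the call stack,
    and returns the final argument so it can be dropped into the
    middle of any expression."""
    if show_stack:
        traceback.print_stack(sys._getframe(1), limit=depth)
    print(*args, **kwargs)
    return args[-1] if args else None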
def get_stack(all=False, skip=0, depth=0):
    stack = inspect.stack()[1:]  # Implicitly ignore the frame containing get_stack.
    stack.reverse()  # And make it oldest first.
    length = len(stack)
    upper = length - skip
    lower = (upper - depth) if depth else 0
    max_len = 0
    for i_frame in range(lower, upper):
        # Preprocessing loop to format the output frames and
        # calculate the length of the fields in the output.
        stack[i_frame] = list(stack[i_frame])
        path = _cleaner().clean(stack[i_frame][1])  # Canonical form for paths...
        if all or '/SITE-P' not in path.upper() and '

From mertz at gnosis.cx  Sat Apr 28 11:36:38 2018
From: mertz at gnosis.cx (David Mertz)
Date: Sat, 28 Apr 2018 11:36:38 -0400
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
Message-ID: 

> $(a=7, $(a=a+1, a*2))

I suspect you ALREADY have bash installed on your computer, you don't
need Python to emulate it.

On Sat, Apr 28, 2018 at 6:22 AM, Zero Piraeus wrote:

> :
>
> On 28 April 2018 at 07:07, Tim Peters wrote:
> > [...] For that matter, I'd be fine too with shortening it to "let".  In
> > fact, I prefer that!  Thanks :-)
>
> If you really wanted to overcome objections that this looks too much
> like a function, you could spell it "$".  I'm not sure what I think
> about
>
>     $(a=7, $(a=a+1, a*2))
>
> yet, but it doesn't make me want to run screaming in the way that :=
> does.
>
> I think I finally worked out why I have such a violent reaction to :=
> in the end, by the way: it's because it reminds me of Javascript's
> "===" (not the meaning, but the fact that it exists).
>
>  -[]z.

-- 
Keeping medicines from the bloodstreams of the sick; food from the
bellies of the hungry; books from the hands of the uneducated;
technology from the underdeveloped; and putting advocates of freedom
in prisons.  Intellectual property is to the 21st century what the
slave trade was to the 16th.

From J.Demeyer at UGent.be  Sat Apr 28 12:09:10 2018
From: J.Demeyer at UGent.be (Jeroen Demeyer)
Date: Sat, 28 Apr 2018 18:09:10 +0200
Subject: [Python-ideas] __loader__.get_source(): semantics of returning None
Message-ID: <5AE49CA6.7080209@UGent.be>

The documentation of __loader__.get_source() says

"Returns None if no source is available (e.g. a built-in module)."

But what does "no source is available" mean precisely? It could mean
either of two things:

(A) I am absolutely certain that there is no source anywhere to be found.

(B) I don't know where to find the source, but if you look hard enough,
you may find it.

Currently, linecache interprets it as (A). Is there any chance that we
can either change the interpretation for returning None to (B) or to
provide an officially documented way to answer (B)? This could be using
a new return value (say, NotImplemented) or by not implementing
get_source at all (such that __loader__.get_source raises
AttributeError). The latter is probably how things already work in
practice, but it isn't really documented that way.

The context for this question/proposal is
https://bugs.python.org/issue32797

When the linecache module is asked for the source code of a certain
file, it queries the __loader__.get_source() for the source code. If
this returns None, that's the end: no source is returned.

However, linecache is also passed the filename! So even if it could
find the source by filename, it won't even try that if get_source()
returns None.

Jeroen.
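P.S. For anyone who wants to poke at the loader hook involved here, a
tiny runnable sketch (ToyLoader and "toy_module.py" are made-up names,
and it assumes no real file by that name exists on disk):

import linecache

class ToyLoader:
    # linecache only needs a get_source(name) method on the loader.
    def get_source(self, name):
        return "first line\nsecond line\n"

globs = {"__name__": "toy", "__loader__": ToyLoader()}
# The file doesn't exist, so linecache falls back to the loader:
print(linecache.getline("toy_module.py", 2, globs))  # "second line\n"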
From p.f.moore at gmail.com Sat Apr 28 12:26:00 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 28 Apr 2018 17:26:00 +0100 Subject: [Python-ideas] __loader__.get_source(): semantics of returning None In-Reply-To: <5AE49CA6.7080209@UGent.be> References: <5AE49CA6.7080209@UGent.be> Message-ID: On 28 April 2018 at 17:09, Jeroen Demeyer wrote: > The documentation of __loader__.get_source() says > > "Returns None if no source is available (e.g. a built-in module)." > > But what does "no source is available" mean precisely? It could mean either > of two things: > > (A) I am absolutely certain that there is no source anywhere to be found. > > (B) I don't know where to find the source, but if you look hard enough, you > may find it. The intention was (B). But to an extent, that misses the point. If you only have bytecode on your system, then the loader can't find the source - but *of course* there is source "somewhere", probably not on your system, though. "Looking hard enough" isn't really well defined. If you consider this in terms of "what the loader knows" then there's no difference between (A) and (B). > Currently, linecache interprets it as (A). Is there any chance that we can > either change the interpretation for returning None to (B) or to provide an > officially documented way to answer (B)? This could be using a new return > value (say, NotImplemented) or by not implementing get_source at all (such > that __loader__.get_source raises AttributeError). The latter is probably > how things already work in practice, but it isn't really documented that > way. > > The context for this question/proposal is https://bugs.python.org/issue32797 I've never used linecache, but that sounds a bit weird. I've just looked at the docs: """ If a file named filename is not found, the function will look for it in the module search path, sys.path, after first checking for a PEP 302 __loader__ in module_globals, in case the module was imported from a zipfile or other non-filesystem import source. """ That sounds like exactly what is being requested in the issue - so I'd say this is simply a bug in linecache, for not continuing with the path search after the loader check fails. It's not a problem with "interpreting" the loader's result, but simply a case of not doing what the docs say it does... > When the linecache module is asked for the source code of a certain file, it > queries the __loader__.get_source() for the source code. If this returns > None, that's the end: no source is returned. But that's not what the linecache docs (quoted above) say that it does - so this is a straight bug in linecache. > However, linecache is also passed the filename! So even if it could find the > source by filename, it won't even try that if get_source() returns None. And that's also a bug - the docs explicitly say that the filename is tried first. Paul From tim.peters at gmail.com Sat Apr 28 13:16:16 2018 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 28 Apr 2018 12:16:16 -0500 Subject: [Python-ideas] A "local" pseudo-function In-Reply-To: <20180428093334.GU7400@ando.pearwood.info> References: <20180428093334.GU7400@ando.pearwood.info> Message-ID: [Steven D'Aprano ] > Chris' PEP 572 started off with the concept that binding expressions > would create a "sub-local" scope, below function locals. After some > debate on Python-Ideas, Chris, Nick and Guido took the discussion off > list and decided to drop the sub-local scope idea as confusing and hard > to implement. 
Enormously harder to implement than binding expressions, and the
latter (to my eyes) capture many high-value use cases "good enough".

I'm not concerned about "confusing".  "Sub-local" scopes are
ubiquitous in modern languages, and they're aimed at experienced
programmers.

It was also the case that nesting scopes _at all_ was very
controversial in Python's earliest years, and Guido resisted it
mightily (with my full support).  The only scopes at first were
function-local, module-global, and builtin, and while functions could
_textually_ nest, they had no access to enclosing local scopes.  Cute:
to write a recursive nested function, you needed to pass it its own
name (because its own name isn't in its _own_ local scope, but in the
local scope of the function that contains it).

Adding nested local scopes was also "confusing" at the time, and
indeed made the scoping rules far harder to explain to newbies, and
complicated the implementation.  Then again, experienced programmers
overwhelmingly (unanimously?) welcomed the change after it was done.

Since then, Python has gone down a pretty bizarre path, inventing
sublocal scopes on an ad hoc basis by "pure magic" when their absence
in some specific context seemed just too unbearable to live with
(e.g., in comprehensions).  So we already have sublocal scopes, but in
no explicit form that can be either exploited or explained.

> But the biggest problem is that this re-introduces exactly the same
> awful C mistake that := was chosen to avoid. Which of the following two
> contains the typo?
>
>     local(spam=expression, eggs=expression, cheese = spam+eggs)
>
>     local(spam=expression, eggs=expression, cheese == spam+eggs)

Neither :-)  I don't expect that to be a real problem.  In C I'm
_thinking_ "if a equals b" and type "if (a=b)" by mistake in haste.
In a "local" I'm _thinking_ "I want to create these names with these
values" in the former case, and in the latter case also "and I want
to test whether cheese equals spam + eggs".  But having already typed
"=" to mean "binding" twice in the same line, "but the third time I
type it it will mean equality instead" just doesn't seem likely.

The original C mistake is exceedingly unlikely on the face of it: if
what I'm thinking is "if a equals b", or "while a equals b", I'm not
going to use "local()" _at all_.  Forcing the programmer to be
explicit that they're trying to create a new scope limits the only
possible confusions to cases where they _are_ going out of their way
to use a "local()" construct, in which case binding behavior is very
much at the top of their mind.  Plain old "if a=b" remains a
SyntaxError regardless.

Still, if people are scared of that, a variation of Yury's alternative
avoids it: the last "argument" must be an expression (not a binding).
In that case your first line above is a compile-time error.

I didn't like that because I really dislike the textual redundancy in
the common

    if local(matchobject=re.match(regexp, line), matchobject):

compared to

    if local(matchobject=re.match(regexp, line)):

But I could compromise ;-)

- There must be at least one argument.
- The first argument must be a binding.
- All but the last argument must also be bindings.
- If there's more than one argument, the last argument must be an expression.

Then your first line above is a compile-time error, the common "just
name a result and test its truthiness" doesn't require repetition, and
"local(a==b)" is also a compile-time error.
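For concreteness: the sublocal behavior I'm after is already
expressible today, just painfully, by nesting immediately-invoked
lambdas, one layer per binding, so each later binding can see the
earlier ones.  A runnable sketch of the semantics (not a suggestion
that anyone actually write this!):

    import math

    a, b, c = 1, -3, 2
    r1, r2 = (lambda D=b**2 - 4*a*c:
                  (lambda sqrtD=math.sqrt(D), twoa=2*a:
                      ((-b + sqrtD)/twoa, (-b - sqrtD)/twoa))())()
    print(r1, r2)  # 2.0 1.0 - and D, sqrtD, twoa are gone afterward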
> I have other objections, but I'll leave them for now, since I think > these two alone are fatal. I don't. > Once you drop those two flaws, you're basically left with PEP 572 :-) Which is fine by me, but do realize that since PEP 572 dropped any notion of sublocal scopes, that recurring issue remains wholly unaddressed regardless. From J.Demeyer at UGent.be Sat Apr 28 13:20:01 2018 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Sat, 28 Apr 2018 19:20:01 +0200 Subject: [Python-ideas] __loader__.get_source(): semantics of returning None In-Reply-To: <22d14b464c16482f9456e81432f459c9@xmail101.UGent.be> References: <5AE49CA6.7080209@UGent.be> <22d14b464c16482f9456e81432f459c9@xmail101.UGent.be> Message-ID: <5AE4AD41.1050608@UGent.be> On 2018-04-28 18:26, Paul Moore wrote: > """ > If a file named filename is not found, the function will look for it > in the module search path, sys.path, after first checking for a PEP > 302 __loader__ in module_globals, in case the module was imported from > a zipfile or other non-filesystem import source. > """ Sorry, I should have been more precise. linecache does the following (which is what is in the doc you quoted above): (1) Look for the filename exactly as given. (2) Look for a loader and call its get_source() method. (3) Look for the filename under all sys.path entries. The difference between (1) and (3) is when a relative filename is given, typically relative to site-packages. And I'm actually interested in that third case. So the question is really: should (3) still be tried if __loader__.get_source() returns None? Jeroen. From rosuav at gmail.com Sat Apr 28 13:36:56 2018 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 29 Apr 2018 03:36:56 +1000 Subject: [Python-ideas] A "local" pseudo-function In-Reply-To: References: Message-ID: On Sat, Apr 28, 2018 at 12:37 PM, Tim Peters wrote: > This is one that little else really handled nicely: > > r1, r2 = local(D = b**2 - 4*a*c, > sqrtD = math.sqrt(D), > twoa = 2*a, > ((-b + sqrtD)/twoa, (-b - sqrtD)/twoa)) > > Everyone's favorite: > > if local(m = re.match(regexp, line)): > print(m.group(0)) > > Here's where it's truly essential that the compiler know everything > about "local", because in _that_ context it's required that the new > scope extend through the end of the entire block construct (exactly > what that means TBD - certainly through the end of the `if` block, but > possibly also through the end of its associated (if any) `elif` and > `else` blocks - and similarly for while/else constructs). I'm concerned that there are, in effect, two quite different uses of the exact same syntax. 1) In an arbitrary expression, local() creates a scope that is defined entirely by the parentheses. 2) In an 'if' header, the exact same local() call creates a scope that extends to the corresponding suite. For instance: a = 1; b = 2 x = a + local(a = 3, b = 4, a + b) + b if x == 10: # Prints "x is 10: 1 2" print("x is 10: ", a, b) This makes reasonable sense. The parentheses completely enclose the local scope. It's compiler magic, and you cannot explain it as a function call, but it makes intuitive sense. But the same thing inside the if header itself would be much weirder. I'm actually not even sure what it would do. 
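(As a sanity check on case 1, you can get surprisingly close today by
abusing keyword defaults on an immediately-invoked lambda - a rough
emulation only, with the caveat that one default can't refer to an
earlier one the way local() bindings can:

    a = 1; b = 2
    x = a + (lambda a=3, b=4: a + b)() + b
    print(x, a, b)  # 10 1 2 - the inner a and b never leak

No such trick exists for case 2, which is rather the point.)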
And you've clearly shown that the local() call can be anywhere inside the condition, based on these examples: > Of course that example could also be written as: > > if local(m = re.match(regexp, line), m): > print(m.group(0)) > > or more specifically: > > if local(m = re.match(regexp, line), m is not None): > print(m.group(0)) > > or even: > > if local(m = re.match(regexp, line)) is not None: > print(m.group(0)) At what point does the name 'm' stop referring to the local? More generally: if local(m = ...) is not m: print("Will I ever happen?") Perhaps it would be better to make this special case *extremely* special. For instance: if_local: 'if' 'local' '(' local_item (',' local_item)* ')' ':' suite as the ONLY way to have the local names persist. In other words, if you tack "is not None" onto the outside of the local() call, it becomes a regular expression-local, and its names die at the close parentheses. It'd still be a special case, but it'd be a bit saner to try to think about. ChrisA From p.f.moore at gmail.com Sat Apr 28 14:29:44 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 28 Apr 2018 19:29:44 +0100 Subject: [Python-ideas] __loader__.get_source(): semantics of returning None In-Reply-To: <5AE4AD41.1050608@UGent.be> References: <5AE49CA6.7080209@UGent.be> <22d14b464c16482f9456e81432f459c9@xmail101.UGent.be> <5AE4AD41.1050608@UGent.be> Message-ID: On 28 April 2018 at 18:20, Jeroen Demeyer wrote: > On 2018-04-28 18:26, Paul Moore wrote: >> >> """ >> If a file named filename is not found, the function will look for it >> in the module search path, sys.path, after first checking for a PEP >> 302 __loader__ in module_globals, in case the module was imported from >> a zipfile or other non-filesystem import source. >> """ > > > Sorry, I should have been more precise. linecache does the following (which > is what is in the doc you quoted above): > > (1) Look for the filename exactly as given. > (2) Look for a loader and call its get_source() method. > (3) Look for the filename under all sys.path entries. > > The difference between (1) and (3) is when a relative filename is given, > typically relative to site-packages. And I'm actually interested in that > third case. > > So the question is really: should (3) still be tried if > __loader__.get_source() returns None? Well, the docs say it is, so I'd say yes. I guess the docs could be interpreted as saying "if there isn't a loader, go on to (3) otherwise call the loader and stop". But I'd say that if __loader__.get_source() returns None, that should be treated the same as no loader being found. Paul From e+python-ideas at kellett.im Sat Apr 28 16:04:00 2018 From: e+python-ideas at kellett.im (Ed Kellett) Date: Sat, 28 Apr 2018 21:04:00 +0100 Subject: [Python-ideas] A "local" pseudo-function In-Reply-To: References: Message-ID: <76d70196-3fec-38bc-6d3f-860a44daedc0@kellett.im> On 2018-04-28 18:36, Chris Angelico wrote: > This makes reasonable sense. The parentheses completely enclose the > local scope. It's compiler magic, and you cannot explain it as a > function call, but it makes intuitive sense. But the same thing inside > the if header itself would be much weirder. I'm actually not even sure > what it would do. And you've clearly shown that the local() call can > be anywhere inside the condition, based on these examples: What if the names persist until the end of the statement? 
That covers if (where the statement lasts until the end of the if...elif...else block) and regular expressions, though it does introduce a potentially annoying shadowing thing: >>> x = 2 >>> (local(x=4, x + 1), x) (5,4) -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From e+python-ideas at kellett.im Sat Apr 28 16:12:25 2018 From: e+python-ideas at kellett.im (Ed Kellett) Date: Sat, 28 Apr 2018 21:12:25 +0100 Subject: [Python-ideas] A "local" pseudo-function In-Reply-To: References: <20180428093334.GU7400@ando.pearwood.info> Message-ID: <9d20660f-43d7-d224-b342-c6c5919874de@kellett.im> On 2018-04-28 18:16, Tim Peters wrote: > [Steven D'Aprano ] >> But the biggest problem is that this re-introduces exactly the same >> awful C mistake that := was chosen to avoid. Which of the following two >> contains the typo? >> >> local(spam=expression, eggs=expression, cheese = spam+eggs) >> >> local(spam=expression, eggs=expression, cheese == spam+eggs) > > [snip] > > Still, if people are scared of that, a variation of Yury's alternative > avoids it: the last "argument" must be an expression (not a binding). > In that case your first line above is a compile-time error. > > I didn't like that because I really dislike the textual redundancy in the common > > if local(matchobject=re.match(regexp, line), matchobject): > > compared to > > if local(matchobject=re.match(regexp, line)): > > But I could compromise ;-) > > - There must be at least one argument. > - The first argument must be a binding. > - All but the last argument must also be bindings. > - If there's more than one argument, the last argument must be an expression. How about if you just can't have an expression in a local()? There are a few obvious alternative possibilities: 1. No special case at all: local(x=1, y=2, _=x+y) 2. Co-opt a keyword after local(): local(x=1, y=2) in x+y 3. Co-opt a keyword inside local(): local(x=1, y=2, return x+y) I hate the first and wish the _ pattern would die in all its forms, but it's worth mentioning. I don't think there's much to choose between the other two, but 2 uses syntax that might have been valid and meant something else, so 3 is probably less confusing. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From rosuav at gmail.com Sat Apr 28 16:20:14 2018 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 29 Apr 2018 06:20:14 +1000 Subject: [Python-ideas] A "local" pseudo-function In-Reply-To: <76d70196-3fec-38bc-6d3f-860a44daedc0@kellett.im> References: <76d70196-3fec-38bc-6d3f-860a44daedc0@kellett.im> Message-ID: On Sun, Apr 29, 2018 at 6:04 AM, Ed Kellett wrote: > On 2018-04-28 18:36, Chris Angelico wrote: >> This makes reasonable sense. The parentheses completely enclose the >> local scope. It's compiler magic, and you cannot explain it as a >> function call, but it makes intuitive sense. But the same thing inside >> the if header itself would be much weirder. I'm actually not even sure >> what it would do. And you've clearly shown that the local() call can >> be anywhere inside the condition, based on these examples: > > What if the names persist until the end of the statement? 
That covers if > (where the statement lasts until the end of the if...elif...else block) > and regular expressions, though it does introduce a potentially annoying > shadowing thing: > >>>> x = 2 >>>> (local(x=4, x + 1), x) > (5,4) > Oh, you mean exactly like PEP 572 used to advocate for? :) Can't say I'd be against that, although I'm not enamoured of the function-like syntax. But if you remove the function-like syntax and change the semantics, it isn't exactly Tim's proposal any more. :) ChrisA From tim.peters at gmail.com Sat Apr 28 16:40:35 2018 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 28 Apr 2018 15:40:35 -0500 Subject: [Python-ideas] A "local" pseudo-function In-Reply-To: <9d20660f-43d7-d224-b342-c6c5919874de@kellett.im> References: <20180428093334.GU7400@ando.pearwood.info> <9d20660f-43d7-d224-b342-c6c5919874de@kellett.im> Message-ID: [Ed Kellett ] > How about if you just can't have an expression in a local()? See the quadratic equation example in the original post. When working with expressions, the entire point of the construct is to define (sub)local names for use in a result expression. > There are a few obvious alternative possibilities: > > 1. No special case at all: > > local(x=1, y=2, _=x+y) As above. > 2. Co-opt a keyword after local(): > > local(x=1, y=2) in x+y Requiring "in" requires annoying syntactic repetition in the common if local(match=re.match(regexp, line)) in match: kinds of cases. Making "in" optional instead was discussed near the end of the original post. I agree that your spelling just above is more obvious than `local(x=1, y=2, x+y)` which is why the original post discussed making an "in clause" optional. But, overwhelmingly, it appears that people are more interested in establishing sublocal scopes in `if` and `while` constructs than in standalone expressions, so I picked a spelling that's _most_ convenient for the latter's common "just name a result and test its truthiness" uses. > 3. Co-opt a keyword inside local(): > > local(x=1, y=2, return x+y) Why would that be better than local(x=1, y=2, x+y)? That no binding is intended for `x+y` is already obvious in the latter. > I hate the first and wish the _ pattern would die in all its forms, but > it's worth mentioning. I don't think there's much to choose between the > other two, but 2 uses syntax that might have been valid and meant > something else, so 3 is probably less confusing. Indeed, 3 is the only one I'd consider, but I don't see that it's a real improvement. It seems to require extra typing every time just to avoid learning "and it returns the value of the last expression" once. From e+python-ideas at kellett.im Sat Apr 28 17:04:50 2018 From: e+python-ideas at kellett.im (Ed Kellett) Date: Sat, 28 Apr 2018 22:04:50 +0100 Subject: [Python-ideas] A "local" pseudo-function In-Reply-To: References: <20180428093334.GU7400@ando.pearwood.info> <9d20660f-43d7-d224-b342-c6c5919874de@kellett.im> Message-ID: <08928038-0131-5139-dc57-964e5bc8954d@kellett.im> On 2018-04-28 21:40, Tim Peters wrote: > [Ed Kellett ] >> How about if you just can't have an expression in a local()? > > See the quadratic equation example in the original post. When > working with expressions, the entire point of the construct is to > define (sub)local names for use in a result expression. > > [snip] > > Requiring "in" requires annoying syntactic repetition in the common > > if local(match=re.match(regexp, line)) in match: > > kinds of cases. 
I don't mean "in" should be required, but rather that the last thing is always an assignment, and the local() yields the assigned value. So that'd remain: if local(match=re.match(regexp, line)): whereas your quadratic equation might do the "in" thing: r1, r2 = local(D = b**2 - 4*a*c, sqrtD = math.sqrt(D), twoa = 2*a) in ( (-b + sqrtD)/twoa, (-b - sqrtD)/twoa) ... which doesn't look particularly wonderful (I think it's nicer with my option 3), but then I've never thought the quadratic equation was a good motivating case for any version of an assignment expression. In most cases I can imagine, a plain local() would be sufficient, e.g.: if local(match=re.match(regexp, line)) and match[1] == 'frog': > Making "in" optional instead was discussed near the end of the > original post. I agree that your spelling just above is more obvious > than `local(x=1, y=2, x+y)` which is why the original post discussed > making an "in clause" optional. But, overwhelmingly, it appears that > people are more interested in establishing sublocal scopes in `if` > and `while` constructs than in standalone expressions, so I picked a > spelling that's _most_ convenient for the latter's common "just name > a result and test its truthiness" uses. > > [snip] > > Indeed, 3 is the only one I'd consider, but I don't see that it's a > real improvement. It seems to require extra typing every time just > to avoid learning "and it returns the value of the last expression" > once. I agree with pretty much all of this--I was trying to attack the "but you could make the terrible C mistake" problem from another angle. I don't think I mind the last "argument" being *allowed* to be an expression, but if the consensus is that it makes '='/'==' confusion too easy, I think it'd be more straightforward to say they're all assignments (with optional, explicit syntax to yield an expression) than to say "they're all assignments except the last thing" (argh) "except in the special case of one argument" (double argh). -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From tim.peters at gmail.com Sat Apr 28 18:31:55 2018 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 28 Apr 2018 17:31:55 -0500 Subject: [Python-ideas] A "local" pseudo-function In-Reply-To: References: Message-ID: [Chris Angelico ] > I'm concerned that there are, in effect, two quite different uses of > the exact same syntax. Yes, the construct implements a profoundly different meaning of "scope" depending on the context it appears in. > 1) In an arbitrary expression, local() creates a scope that is defined > entirely by the parentheses. Yes. > 2) In an 'if' header, the exact same local() call creates a scope that > extends to the corresponding suite. And in a 'while' header, and also possibly (likely) including associated suites (elif/else). So it goes ;-) There is nothing "obvious" you can say inside an "if" or "while" expression that says "and, oh ya, this name also shadows anything of the same name from here on, except when it stops doing so". Even in C, e.g., it's not *obvious* what the scope of `i` is in: for (int i = 0; ...) { } It needs to be learned. Indeed, it's so non-obvious that C and C++ give different answers. The {...} part by itself introduces a new scope in both languages. In C++ the `int i` is viewed as being _part_ of that scope, despite that it's outside the braces. 
But in C the `int i` is really viewed as being part of a Yet Another
new scope _enclosing_ the scope introduced by {...}, but nevertheless
ending when the {...} scope ends.

Not that it matters much.  The practical effect is that, e.g.,

    double i = 3.0;

is legal as the first line of the block in C (shadows the `int i`),
but illegal in C++ (a conflicting declaration for `i` in a single
scope).  In either case, it's only "obvious" if you learned it and
then stopped thinking too much about it ;-)

> For instance:
>
>     a = 1; b = 2
>     x = a + local(a = 3, b = 4, a + b) + b
>     if x == 10:
>         # Prints "x is 10: 1 2"
>         print("x is 10: ", a, b)
>
> This makes reasonable sense. The parentheses completely enclose the
> local scope. It's compiler magic, and you cannot explain it as a
> function call, but it makes intuitive sense.

Yup, it's effectively a function-like spelling of any number of
binding constructs widely used in functional languages.  I had mostly
in mind Haskell's

    "let" pile-of-bindings "in" expression

spelled as

    "local(" pile-of-bindings "," expression ")"

The points to using function-call-like syntax were already covered
("nothing syntactically new to learn there", since the syntax for
specifying keyword arguments is already understood, and already groups
as intended).

> But the same thing inside the if header itself would be much weirder.
> I'm actually not even sure what it would do.

You think I am? ;-)

I don't know that it matters, because intended use cases are far
simpler than all the goofy things people _can_ dream up just for the
hell of it.  They need to be defined, but exactly how isn't of much
interest to me.  For example, let's put your example in an `if`:

    a = 1; b = 2
    if a + local(a = 3, b = 4, a + b) + b:

The rules I sketched pretty clearly imply that would be evaluated as:

    if 1 + (3+4) + 4:

It's the final "4" that's of interest.  In your original example the
original `b` was restored because ")" ended the new scope, leaving the
final "+b" to resolve to "+2".  But because it's in an "if" expression
here, the new scope doesn't end at ")" anymore.

> And you've clearly shown that the local() call can
> be anywhere inside the condition, based on these examples:

And/or used multiple times, and/or used in nested ways.  None of which
anyone will actually do ;-)

>> ...
>> if local(m = re.match(regexp, line)) is not None:
>>     print(m.group(0))
>
> At what point does the name 'm' stop referring to the local?  More generally:

Probably at the end of the final (if any) `elif` or `else` suite
associated with the `if`/`while`, but possibly at the end of the suite
associated with the `if`/`while`.

Time to note another subtlety:  people don't _really_ want "a new
scope" in Python.  If they did, then _every_ name appearing in a
binding context (assignment statement target, `for` target, ...) for
the duration would vanish when the new scope ended.  What they really
want is a new scope with an implied "nonlocal" declaration for every
name appearing in a binding context _except_ for the specific names
they're effectively trying to declare as being "sublocal" instead.

So it's somewhat of a conceptual mess no matter how it's spelled ;-)
In most other languages this doesn't come up because the existence of
a variable in a scope is established by an explicit declaration rather
than inferred from examining binding sites.

>
>     if local(m = ...) is not m:
>         print("Will I ever happen?")

No, that `print` can't be reached.

> Perhaps it would be better to make this special case *extremely*
> special.  For instance:
>
>     if_local: 'if' 'local' '(' local_item (',' local_item)* ')' ':' suite
>
> as the ONLY way to have the local names persist. In other words, if
> you tack "is not None" onto the outside of the local() call, it
> becomes a regular expression-local, and its names die at the close
> parentheses. It'd still be a special case, but it'd be a bit saner to
> try to think about.

That's an interesting twist ... but, to me, if "local()" inside an
if/while expression _can_ be deeply magical, then I'd be less
surprised over time if it were _always_ deeply magical in those
contexts.

I could change my mind if use cases derived from real code suggest it
would be a real problem.  Or if there's no real-life interest in
catering to sublocal scopes in expressions anyway ... then there's no
reason to even try to use something that makes clean sense for an
expression.

From timothy.c.delaney at gmail.com  Sat Apr 28 19:00:28 2018
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Sat, 28 Apr 2018 23:00:28 +0000
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
Message-ID: 

On Sat, 28 Apr 2018 at 12:41, Tim Peters wrote:

My big concern here involves the:

    if local(m = re.match(regexp, line)):
        print(m.group(0))

example. The entire block needs to be implicitly local for that to
work - what happens if I assign a new name in that block? Also, what
happens with:

    if local(m = re.match(regexp1, line)) or local(m = re.match(regexp2, line)):
        print(m.group(0))

Would the special-casing of local still apply to the block? Or would
you need to do:

    if local(m = re.match(regexp1, line) or re.match(regexp2, line)):
        print(m.group(0))

This might just be lack of coffee and sleep talking, but maybe new
"scoping delimiters" could be introduced. Yes - I'm suggesting
introducing curly braces for blocks, but with a limited scope (pun
intended). Within a local {} block statements and expressions are
evaluated exactly like they currently are, including branching
statements, optional semi-colons, etc. The value returned from the
block is from an explicit return, or the last evaluated expression.

> a = 1
> b = 2
> c = local(a=3) * local(b=4)

c = local { a=3 } * local { b=4 }

> c = local(a=3, b=4, a*b)

c = local { a=3; b=4; a*b }

c = local {
    a = 3
    b = 4
    a * b
}

> c = local(a=3, b=local(a=2, a*a), a*b)

c = local {
    a = 3
    b = local(a=2, a*a)
    return a * b
}

> r1, r2 = local(D = b**2 - 4*a*c,
>                sqrtD = math.sqrt(D),
>                twoa = 2*a,
>                ((-b + sqrtD)/twoa, (-b - sqrtD)/twoa))

r1, r2 = local {
    D = b**2 - 4*a*c
    sqrtD = math.sqrt(D)
    twoa = 2*a
    return ((-b + sqrtD)/twoa, (-b - sqrtD)/twoa)
}

> if local(m = re.match(regexp, line)):
>     print(m.group(0))

if local { m = re.match(regexp, line) }:
    print(m.group(0))

And a further implication:

    a = lambda a, b: local(c=4, a*b*c)

    a = lambda a, b: local {
        c = 4
        return a * b * c
    }

Tim Delaney

From steve at pearwood.info  Sat Apr 28 20:26:22 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 29 Apr 2018 10:26:22 +1000
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: <76d70196-3fec-38bc-6d3f-860a44daedc0@kellett.im>
References: <76d70196-3fec-38bc-6d3f-860a44daedc0@kellett.im>
Message-ID: <20180429002622.GA7400@ando.pearwood.info>

On Sun, Apr 29, 2018 at 06:20:14AM +1000, Chris Angelico wrote:
> On Sun, Apr 29, 2018 at 6:04 AM, Ed Kellett wrote:
> > What if the names persist until the end of the statement?
[...]
> Oh, you mean exactly like PEP 572 used to advocate for? :)

Indeed. It was a bad idea in revision 1 and it remains a bad idea now :-)

-- 
Steve

From tim.peters at gmail.com  Sat Apr 28 20:30:10 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 28 Apr 2018 19:30:10 -0500
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
Message-ID: 

[Tim Delaney ]
> My big concern here involves the:
>
>     if local(m = re.match(regexp, line)):
>         print(m.group(0))
>
> example. The entire block needs to be implicitly local for that to
> work - what happens if I assign a new name in that block?

I really don't know what you're asking there.  Can you make it
concrete?  If, e.g., you're asking what happens if this appeared after
the `print`:

    x = 3.14

then the answer is "the same as what would happen if `local` had not
been used".  We can't know what that is without context, though.
Maybe x is global.  Maybe x was declared nonlocal earlier.  Maybe it's
function-local.

While it may be irrelevant to what you're asking, I noted just before:

"""
Time to note another subtlety:  people don't _really_ want "a new
scope" in Python.  If they did, then _every_ name appearing in a
binding context (assignment statement target, `for` target, ...) for
the duration would vanish when the new scope ended.  What they really
want is a new scope with an implied "nonlocal" declaration for every
name appearing in a binding context _except_ for the specific names
they're effectively trying to declare as being "sublocal" instead.
"""

If by "new name" you mean that `x` didn't appear in any earlier line,
then Python's current analysis would classify `x` as local to the
current function (or as global if this is module-level code ...).
That wouldn't change.

> Also, what happens with:
>
>     if local(m = re.match(regexp1, line)) or local(m = re.match(regexp2, line)):
>         print(m.group(0))

As is, if `local()` appears in an `if` or `while` expression, the
scope extends to the end of the block construct.  In that specific
case, I'd expect to get a compile-time error, for attempting to
initialize the same name more than once in the new scope.

If the scope also encompasses associated `elif` statements, I'd
instead expect local(m=...) in one of those to be treated as a
re-initialization ;-) instead.

> Would the special-casing of local still apply to the block?

As is, `local()` in an `if` or `while` expression triggers deeply
magical behavior, period.

> Or would you need to do:
>
>     if local(m = re.match(regexp1, line) or re.match(regexp2, line)):
>         print(m.group(0))

Yes, not to trigger magical behavior, but to avoid the compile-time
error.

> This might just be lack of coffee and sleep talking, but maybe new
> "scoping delimiters" could be introduced. Yes - I'm suggesting
> introducing curly braces for blocks, but with a limited scope (pun
> intended). Within a local {} block statements and expressions are
> evaluated exactly like they currently are, including branching
> statements, optional semi-colons, etc. The value returned from the
> block is from an explicit return, or the last evaluated expression.

>> a = 1
>> b = 2
>> c = local(a=3) * local(b=4)

> c = local { a=3 } * local { b=4 }

>> c = local(a=3, b=4, a*b)

> c = local {
>     a = 3
>     b = 4
>     a * b
> }

>> c = local(a=3, b=local(a=2, a*a), a*b)

> c = local {
>     a = 3
>     b = local(a=2, a*a)

I expect you wanted b = local{a=2; a*a} there instead (braces instead
of parens, and semicolon instead of comma).
>     return a * b
> }

>> r1, r2 = local(D = b**2 - 4*a*c,
>>                sqrtD = math.sqrt(D),
>>                twoa = 2*a,
>>                ((-b + sqrtD)/twoa, (-b - sqrtD)/twoa))

> r1, r2 = local {
>     D = b**2 - 4*a*c
>     sqrtD = math.sqrt(D)
>     twoa = 2*a
>     return ((-b + sqrtD)/twoa, (-b - sqrtD)/twoa)
> }

>> if local(m = re.match(regexp, line)):
>>     print(m.group(0))

> if local { m = re.match(regexp, line) }:
>     print(m.group(0))

OK, this is the only case in which you used it in an `if` or `while`
expression.  All the questions you asked of me at the start can be
asked of this spelling too.  You seemed to imply at the start that the
right curly brace would always mark the end of the new scope.  But if
that's so, the `m` in `m.group()` has nothing to do with the `m`
assigned to in the `local` block - _that_ scope ended before `print`
was reached.

So if you're not just trying to increase the level of complexity of
what can appear in a local block, a fundamental problem still needs
solving ;-)

I suppose you could solve it like so:

    local {
        m = re.match(regexp, line)
        if m:
            print(m.group(0))
    }

but, besides losing the "shortcut", it would also mean something
radically different if

    x = 3.14

appeared after the "print".  Right?  If a "local block" is taken
seriously, then _all_ names bound inside it vanish when the block
ends.

> And a further implication:
>
>     a = lambda a, b: local(c=4, a*b*c)
>
>     a = lambda a, b: local {
>         c = 4
>         return a * b * c
>     }

If people do want a for-real "new scope" in Python, I certainly agree
`local {...}` is far better suited for that purpose.
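FWIW, the closest thing Python already has to a `local {...}` block is
a throwaway function defined and called on the spot - a sketch (the
name `_block` is just for illustration):

    def _block():
        a = 3
        b = 4
        return a * b

    c = _block()  # 12 - and this `a` and `b` never escape

which also shows why "everything vanishes at the closing brace" is the
natural story for a block form.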
From c at anthonyrisinger.com  Sat Apr 28 20:34:37 2018
From: c at anthonyrisinger.com (C Anthony Risinger)
Date: Sun, 29 Apr 2018 00:34:37 +0000
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
Message-ID: 

let[x](EXPR)
x == EXPR

let[x](a=1)
x == 1

let[x](a=1, EXPR)
x == EXPR

let[x, y](a=1, EXPR)
x == 1
y == EXPR

let[x, y](a=1, b=2, EXPR)
x == 2
y == EXPR

z = let[x, y](a=1, EXPR)
x == 1
y == EXPR
z == (1, EXPR)

Anybody seeing how the above might be useful, and address some of the
concerns I've read? I don't recall seeing this suggested prior.

I like the idea behind pseudo-function let/local, especially when
paired with the explanation of equal sign precedence changes within
paren, but I'm having a really hard time getting over the name binding
leaking out of the paren. I like this item-ish style because it puts
the name itself outside the parentheses while still retaining the
benefits in readability. It also allows capturing the entire
resultset, or individual parts.

On Sat, Apr 28, 2018, 6:00 PM Tim Delaney wrote:

> On Sat, 28 Apr 2018 at 12:41, Tim Peters wrote:
>
> My big concern here involves the:
>
>     if local(m = re.match(regexp, line)):
>         print(m.group(0))
>
> example. The entire block needs to be implicitly local for that to
> work - what happens if I assign a new name in that block?
> [snip]
From greg.ewing at canterbury.ac.nz  Sat Apr 28 21:28:01 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 29 Apr 2018 13:28:01 +1200
Subject: [Python-ideas] Allow multiple imports from a package while preserving its namespace
In-Reply-To: 
References: 
Message-ID: <5AE51FA1.1010907@canterbury.ac.nz>

Nick Coghlan wrote:
> I find the imports at the top of the file to be a nice
> catalog of external dependencies.

Not only is it useful for human readers, it's also useful for
packaging tools such as py2exe that need to know which modules are
being used.

I experimented once with auto-importing in PyGUI, but in the end I
dropped it, partly because of this consideration.

There were other problems with it as well. I don't recall all the
details, but I think one issue is that any errors resulting from an
import triggered by an attribute access get masked and turned into an
AttributeError, making them very confusing to diagnose.

Also, importing requires acquisition of the import lock, which could
cause problems in a multithreaded environment if it happens at
unpredictable times.

For these reasons I'm inclined to regard auto-importing as an
anti-pattern -- it seems like it should be a good idea, but it leads
to more problems than it solves.

-- 
Greg

From greg.ewing at canterbury.ac.nz  Sat Apr 28 22:03:11 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 29 Apr 2018 14:03:11 +1200
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
Message-ID: <5AE527DF.2070309@canterbury.ac.nz>

Tim Peters wrote:
> To the compiler, it's approximately nothing like "a
> function call".

It's nothing like a function call to the user, either,
except in the most superficial of ways.

> - The explicit parentheses make it impossible to misunderstand where
> the expression begins or ends.

Except that you then go and break that rule by saying
that it doesn't apply when you're in the condition of
an "if" statement.

> I do want to leverage what people "already know".

Seems to me this proposal does that in the worst
possible way, by deceiving the user into thinking
it's something familiar when it's not.

-- 
Greg

From tim.peters at gmail.com  Sat Apr 28 22:11:54 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 28 Apr 2018 21:11:54 -0500
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: <5AE527DF.2070309@canterbury.ac.nz>
References: <5AE527DF.2070309@canterbury.ac.nz>
Message-ID: 

To all of the following, I was talking about the syntax.  Which
includes that all existing Python-aware editors and IDEs already know
how to format it intelligibly.  It would be nice if they also colored
"local" (or "let") as a keyword, though.

For the rest, of course I'm already aware of the _semantic_ trickery.
Indeed, that may be too much to bear.  But I'm already sympathetic to
that too :-)

On Sat, Apr 28, 2018 at 9:03 PM, Greg Ewing wrote:
> Tim Peters wrote:
>> To the compiler, it's approximately nothing like "a
>> function call".
>
> It's nothing like a function call to the user, either,
> except in the most superficial of ways.
>
>> - The explicit parentheses make it impossible to misunderstand where
>> the expression begins or ends.
>
> Except that you then go and break that rule by saying
> that it doesn't apply when you're in the condition of
> an "if" statement.
>
>> I do want to leverage what people "already know".
>
> Seems to me this proposal does that in the worst
> possible way, by deceiving the user into thinking
> it's something familiar when it's not.
>
> --
> Greg

From kenlhilton at gmail.com  Sat Apr 28 22:11:38 2018
From: kenlhilton at gmail.com (Ken Hilton)
Date: Sun, 29 Apr 2018 10:11:38 +0800
Subject: [Python-ideas] A "local" pseudo-function
Message-ID: 

> local { m = re.match(regexp, line)
>     if m:
>         print(m.group(0))
> }

Or how about making "local" a pseudo-statement of sorts?

    local (m=re.match(exp, string)) {
        if m:
            print(m.group(0))
    }

The grammar would be as follows:

    local_stmt = "local" "(" local_assignments [ "," local_assignments ... ] ")" "{" BLOCK "}"
    local_assignments = NAME "=" EXPR

There would be no question about the scope of things in BLOCK - the
variables would disappear after the closing "}".

I say "pseudo"-statement because I'm wondering if something like this
would be legal:

    things = list(map(lambda m: local (gp1=m.group(1)) {
        result = gp1 + ''.join(reversed(gp1))
        result += gp1.replace('some', 'thing')
        return result
    }, re.finditer(exp, string)))

I'm thinking specifically about the "lambda m: local (...) {...}". If
that was made legal, it would finally allow for full-fledged anonymous
functions. Indeed, the "local" (statement?) itself is actually almost
equivalent to defining an anonymous function and executing it
immediately, i.e. this:

    (lambda x=5: x*x)()

would be equivalent to this:

    local (x=5) {
        return x * x
    }

both evaluating to 25.

Just some random thoughts!

Sincerely,
Ken Hilton

From steve at pearwood.info  Sat Apr 28 22:50:05 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 29 Apr 2018 12:50:05 +1000
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: <20180428093334.GU7400@ando.pearwood.info>
Message-ID: <20180429025005.GB7400@ando.pearwood.info>

On Sat, Apr 28, 2018 at 12:16:16PM -0500, Tim Peters wrote:
> [Steven D'Aprano ]
> > Chris' PEP 572 started off with the concept that binding expressions
> > would create a "sub-local" scope, below function locals. After some
> > debate on Python-Ideas, Chris, Nick and Guido took the discussion off
> > list and decided to drop the sub-local scope idea as confusing and hard
> > to implement.
>
> Enormously harder to implement than binding expressions, and the
> latter (to my eyes) capture many high-value use cases "good enough".
And yet you're suggesting an alternative which is harder and more confusing. What's the motivation here for re-introducing sublocal scopes, if they're hard to do and locals are "good enough"? That's not a rhetorical question: why have you suggested this sublocal scoping idea? PEP 572 stopped talking about sublocals back in revision 2 or so, and as far as I can see, *not a single objection* since has been that the variables weren't sublocal. For what it is worth, if we ever did introduce a sublocal scope, I don't hate Nick's "given" block statement: https://www.python.org/dev/peps/pep-3150/ [...] > It was also the case that nesting scopes _at all_ was very > controversial in Python's earliest years, and Guido resisted it > mightily (with my full support). The only scopes at first were > function-local, module-global, and builtin, and while functions could > _textually_ nest, they had no access to enclosing local scopes. While I started off with Python 1.5, I wasn't part of the discussions about nested scopes. But I'm astonished that you say that nested scopes were controversial. *Closures* I would completely believe, but mere lexical scoping? Astonishing. Even when I started, as a novice programmer who wouldn't have recognised the term "lexical scoping" if it fell on my head from a great height, I thought it was strange that inner functions couldn't see their surrounding function's variables. Nested scopes just seemed intuitively obvious: if a function sees the variables in the module surrounding it, then it should also see the variables in any function surrounding it. This behaviour in Python 1.5 made functions MUCH less useful: >>> def outer(): ... x = 1 ... def inner(): ... return x ... return inner() ... >>> outer() Traceback (innermost last): File "", line 1, in ? File "", line 5, in outer File "", line 4, in inner NameError: x I think it is fair to say that inner functions in Python 1.5 were crippled to the point of uselessness. > Adding nested local scopes was also "confusing" at the time, and > indeed made the scoping rules far harder to explain to newbies, and > complicated the implementation. Then again, experienced programmers > overwhelmingly (unanimously?) welcomed the change after it was done. I agree with the above regarding closures, which are harder to explain, welcomed by experienced programmers, and often a source of confusion for newbies and experts alike: https://stackoverflow.com/questions/7546285/creating-lambda-inside-a-loop http://math.andrej.com/2009/04/09/pythons-lambda-is-broken/comment-page-1/ but I disagree that lexical scoping alone is or ever was confusing. Neither did Niklaus Wirth, who included it in Pascal, a language intended to be friendly for beginners *wink* > Since then, Python has gone down a pretty bizarre path, inventing > sublocal scopes on an ad hoc basis by "pure magic" when their absence > in some specific context seemed just too unbearable to live with > (e.g., in comprehensions). So we already have sublocal scopes, but in > no explicit form that can be either exploited or explained. I'm not entirely sure that comprehensions (including generator expressions) alone counts as "a path" :-) but I agree with this. I'm not a fan of comprehensions being their own scope. As far as I am concerned, leakage of comprehension variables was never a problem that needed to be solved, and was (very occasionally) a useful feature. Mostly for introspection and debugging. 
But the decision was made for generator comprehensions to be in their own scope, and from there I guess it was inevitable that list comprehensions would have to match. > > But the biggest problem is that this re-introduces exactly the same > > awful C mistake that := was chosen to avoid. Which of the following two > > contains the typo? > > > > local(spam=expression, eggs=expression, cheese = spam+eggs) > > > > local(spam=expression, eggs=expression, cheese == spam+eggs) > > Neither :-) I don't expect that to be a real problem. I'm sure the C designers didn't either. You miss the point that looking at the above, it is impossible to tell whether I meant assignment or an equality test. Typos of = for == do happen, even in Python, for whatever reason typos occur. Regardless of whether this makes them more likely or not (I didn't make that claim) once made, it is a bug that can fail silently in a way that is hard to see and debug. Most = for == typos in Python give an instant SyntaxError, but there are two places where they don't: - a statement like "spam == eggs" called for its side-effects only; - in a function call, func(spam==eggs, spam=eggs) are both legal. The first is so vanishingly rare that we can forget it. If you see spam = eggs as a statement, we can safely assume it means exactly what it says. Inside function calls, it's a bit less cut and dried: func(foo=bar) *could* be a typoed positional argument (foo == bar) but in practice a couple of factors mitigate that risk: - PEP 8 style conventions: we expect to see func(foo=bar) for the keyword argument case and func(foo == bar) for the positional argument case; - if we mess it up, unless there happens to be a parameter called foo we'll get a TypeError, not a silent bug. But with your suggested local() pseudo-function, neither mitigating factor applies and we can't tell or even guess which meaning is intended just by sight. > In C I'm > _thinking_ "if a equals b" and type "if (a=b)" by mistake in haste. > In a "local" I'm _ thinking_ "I want to create these names with these > values" in the former case, and in the latter case also "and I want to > to test whether cheese equals spam + eggs". But having already typed > "=" to mean "binding" twice in the same line, "but the third time I > type it it will mean equality instead" just doesn't seem likely. Of course people won't *consciously* think that the operator for equality testing is = but they'll be primed to hit the key once, not twice, and they'll be less likely to notice their mistake. I never make more = instead of == typos than after I've just spent a lot of time working on maths problems, even though I am still consciously aware that I should be using == I simply don't notice the error. > The original C mistake is exceedingly unlikely on the face of it: if > what I'm thinking is "if a equals b", or "while a equals b", I'm not > going to use "local()" _at all_. Given that while ... is one of the major motivating use-cases for binding expressions, I think you are mistaken to say that people won't use this local() pseudo-function in while statements. [...] > Still, if people are scared of that, a variation of Yury's alternative > avoids it: the last "argument" must be an expression (not a binding). > In that case your first line above is a compile-time error. > > I didn't like that because I really dislike the textual redundancy in the common > > if local(matchobject=re.match(regexp, line), matchobject): > > compared to > > if local(matchobject=re.match(regexp, line)): Indeed. 
That sort of "Repeat Yourself To Satisfy The Compiler" will be an ugly anti-pattern. > But I could compromise ;-) > > - There must be at least one argument. > - The first argument must be a binding. > - All but the last argument must also be bindings. > - If there's more than one argument, the last argument must be an expression. That's not really a complete specification of the pseudo-function though, since sometimes the sublocals it introduces extend past the final parenthesis and into the subsequent block. What will, for example, this function return? spam = eggs = "global" def func(arg=local(spam="sublocal", eggs="sublocal", 1)): eggs = "local" return (spam, eggs) Even if you think nobody will be tempted to write such "clever" (dumb?) code, the behaviour still has to be specified. [...] > > Once you drop those two flaws, you're basically left with PEP 572 :-) > > Which is fine by me, but do realize that since PEP 572 dropped any > notion of sublocal scopes, that recurring issue remains wholly > unaddressed regardless. I don't think that sublocal scopes is a recurring issue, nor that we need address it now. -- Steve From rymg19 at gmail.com Sat Apr 28 23:09:46 2018 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Sat, 28 Apr 2018 22:09:46 -0500 Subject: [Python-ideas] A "local" pseudo-function In-Reply-To: References: Message-ID: <1630f60b490.2837.db5b03704c129196a4e9415e55413ce6@gmail.com> I'm pretty sure the debate about braces defining scope in Python has long-since ended... -- Ryan (????) Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else https://refi64.com/ On April 28, 2018 9:37:57 PM Ken Hilton wrote: > > local { m = re.match(regexp, line) >> if m: >> print(m.group(0)) >> } > > Or how about making "local" a pseudo-statement of sorts? > > local (m=re.match(exp, string)) { > if m: > print(m.group(0)) > } > > The grammar would be as follows: > > local_stmt = "local" "(" local_assignments [ "," local_assignments ... > ] ")" "{" BLOCK "}" > local_assignments = NAME "=" EXPR > > There would be no question about the scope of things in BLOCK - the > variables would disappear after the closing "}". > I say "pseudo"-statement because I'm wondering if something like this would > be legal: > > things = list(map(lambda m: local (gp1=m.group(1)) { > result = gp1 + ''.join(reversed(gp1)) > result += gp1.replace('some', 'thing') > return result > }, re.finditer(exp, string))) > > I'm thinking specifically about the "lambda m: local (...) {...}". If that > was made legal, it would finally allow for full-fledged anonymous > functions. Indeed, the "local" (statement?) itself is actually almost > equivalent to defining an anonymous function and executing it immediately, > i.e. this: > > (lambda x=5: x*x)() > > would be equivalent to this: > > local (x=5) { > return x * x > } > > both evaluating to 25. > > Just some random thoughts! > > Sincerely, > Ken > ? Hilton? > ; > > > > ---------- > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From njs at pobox.com Sat Apr 28 23:12:03 2018 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 28 Apr 2018 20:12:03 -0700 Subject: [Python-ideas] Should __builtins__ have some kind of pass-through print function, for debugging? 
In-Reply-To: <5AE52E01.8020806@canterbury.ac.nz> References: <20180427112733.GP7400@ando.pearwood.info> <5AE52E01.8020806@canterbury.ac.nz> Message-ID: On Sat, Apr 28, 2018 at 7:29 PM, Greg Ewing wrote: > but he sent it in HTML using a proportional font, which spoils the effect! Uh...? https://vorpus.org/~njs/tmp/monospace.png It looks like my client used "font-family: monospace", maybe yours only understands
<pre> or something? Anyway, if anyone else is having
trouble viewing it, it seems to have come through correctly in the
archives:

https://mail.python.org/pipermail/python-ideas/2018-April/050137.html

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From rosuav at gmail.com  Sat Apr 28 23:14:54 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 29 Apr 2018 13:14:54 +1000
Subject: [Python-ideas] Sublocal scoping at its simplest
Message-ID: 

There's been a lot of talk about sublocal scopes, within and without
the context of PEP 572. I'd like to propose what I believe is the
simplest form of sublocal scopes, and use it to simplify one specific
special case in Python.

There are no syntactic changes, and only a very slight semantic change.

def f():
    e = 2.71828
    try:
        1/0
    except Exception as e:
        print(e)
    print(e)
f()

The current behaviour of the 'except... as' statement is as follows:

1) Bind the caught exception to the name 'e', replacing 2.71828
2) Execute the suite (printing "Division by zero")
3) Set e to None
4) Unbind e

Consequently, the final print call raises UnboundLocalError. I propose
to change the semantics as follows:

1) Bind the caught exception to a sublocal 'e'
2) Execute the suite, with the reference to 'e' seeing the sublocal
3) Set the sublocal e to None
4) Unbind the sublocal e

At the unindent, the sublocal name will vanish, and the original 'e'
will reappear. Thus the final print will display 2.71828, just as it
would if no exception had been raised.
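
For concreteness, the proposed behaviour can be emulated by hand today
by renaming the handler's variable (the name '_sub_e' below is purely
illustrative - under this proposal you would simply keep writing 'e'):

def f():
    e = 2.71828
    try:
        1/0
    except Exception as _sub_e:  # stands in for the sublocal 'e'
        print(_sub_e)            # division by zero
    print(e)                     # 2.71828 - the outer 'e' survives
f()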

The above definitions would become language-level specifications. For
CPython specifically, my proposed implementation would be for the name
'e' to be renamed inside the block, creating a separate slot with the
same name.

With no debates about whether "expr as name" or "name := expr" or
"local(name=expr)" is better, hopefully we can figure out whether
sublocal scopes are themselves a useful feature :)

ChrisA

From rosuav at gmail.com  Sat Apr 28 23:36:31 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 29 Apr 2018 13:36:31 +1000
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: <20180429025005.GB7400@ando.pearwood.info>
References: 
 <20180428093334.GU7400@ando.pearwood.info>
 
 <20180429025005.GB7400@ando.pearwood.info>
Message-ID: 

On Sun, Apr 29, 2018 at 12:50 PM, Steven D'Aprano  wrote:
> On Sat, Apr 28, 2018 at 12:16:16PM -0500, Tim Peters wrote:
>> [Steven D'Aprano ]
>> > Chris' PEP 572 started off with the concept that binding expressions
>> > would create a "sub-local" scope, below function locals. After some
>> > debate on Python-Ideas, Chris, Nick and Guido took the discussion off
>> > list and decided to drop the sub-local scope idea as confusing and hard
>> > to implement.
>>
>> Enormously harder to implement than binding expressions, and the
>> latter (to my eyes) capture many high-value use cases "good enough".
>
> And yet you're suggesting an alternative which is harder and more
> confusing. What's the motivation here for re-introducing sublocal
> scopes, if they're hard to do and locals are "good enough"?
>
> That's not a rhetorical question: why have you suggested this sublocal
> scoping idea? PEP 572 stopped talking about sublocals back in revision 2
> or so, and as far as I can see, *not a single objection* since has
> been that the variables weren't sublocal.

No objections, per se, but I do know a number of people were saddened
at their loss. And it's not like eliminating sublocals solved problems
without creating more; it's just a different set of trade-offs.
Sublocals have their value.

>> It was also the case that nesting scopes _at all_ was very
>> controversial in Python's earliest years, and Guido resisted it
>> mightily (with my full support).  The only scopes at first were
>> function-local, module-global, and builtin, and while functions could
>> _textually_ nest, they had no access to enclosing local scopes.
>
> While I started off with Python 1.5, I wasn't part of the discussions
> about nested scopes. But I'm astonished that you say that nested scopes
> were controversial. *Closures* I would completely believe, but mere
> lexical scoping? Astonishing.

I'm not sure how you can distinguish them:

> This behaviour in Python 1.5 made functions MUCH less useful:
>
>
>>>> def outer():
> ...     x = 1
> ...     def inner():
> ...             return x
> ...     return inner()
> ...
>>>> outer()
> Traceback (innermost last):
>   File "<stdin>", line 1, in ?
>   File "<stdin>", line 5, in outer
>   File "<stdin>", line 4, in inner
> NameError: x
>
>
> I think it is fair to say that inner functions in Python 1.5 were
> crippled to the point of uselessness.

What you expect here is lexical scope, yes. But if you have lexical
scope with no closures, the inner function can ONLY be used while its
calling function is still running. What would happen if you returned
'inner' uncalled, and then called the result? How would it resolve the
name 'x'? I can't even begin to imagine what lexical scope would do in
the absence of closures. At least, not with first-class functions. If
functions aren't first-class objects, it's much easier, and a nested
function serves as a refactored block of code. But then you can't even
pass a nested function to a higher order function.
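
Current Python, with closures, is what makes the "return it uncalled"
case meaningful at all; a minimal demonstration:

def outer():
    x = 1
    def inner():
        return x
    return inner   # returned *uncalled*

f = outer()
print(f())         # 1 - works only because inner closed over x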

ChrisA

From tim.peters at gmail.com  Sun Apr 29 00:20:52 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 28 Apr 2018 23:20:52 -0500
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: <20180429025005.GB7400@ando.pearwood.info>
References: 
 <20180428093334.GU7400@ando.pearwood.info>
 
 <20180429025005.GB7400@ando.pearwood.info>
Message-ID: 

[Tim]
>> Enormously harder to implement than binding expressions, and the
>> latter (to my eyes) capture many high-value use cases "good enough".

[Steven D'Aprano ]
> And yet you're suggesting an alternative which is harder and more
> confusing.

I am?  I said at the start that it was a "brain dump".  It was meant
to be a point of discussion for anyone interested.  I also said I was
more interested in real use cases from real code than in debating, and
I wasn't lying about that ;-)

Since no real new use cases (let alone compelling ones) have turned up
yet, I'm ready to drop it for now.


> What's the motivation here for re-introducing sublocal
> scopes, if they're hard to do and locals are "good enough"?

That's why I wanted to see if there were significant unaddressed use
cases.  That _my_ particular itches would be scratched "good enough"
if the PEP is accepted doesn't imply everyone's will be.  And my
particular itches will continue to annoy if the PEP is rejected.


> That's not a rhetorical question: why have you suggested this sublocal
> scoping idea?

Putting an idea out for discussion isn't suggesting it be adopted.
The list is named "python-ideas", not "python-advocacy-death-match"
;-)


> PEP 572 stopped talking about sublocals back in revision 2
> or so, and as far as I can see, *not a single objection* since has
> been that the variables weren't sublocal.

Meh.  Chris didn't seem all that thrilled about dropping them, and I
saw a number of messages more-than-less supporting the idea _before_
they were dropped.  When it became clear that the PEP didn't stand a
chance _unless_ they were dropped, nobody was willing to die for it,
because they weren't the PEP's _primary_ point.


> For what it is worth, if we ever did introduce a sublocal scope, I
> don't hate Nick's "given" block statement:
>
> https://www.python.org/dev/peps/pep-3150/

And Chris just tried introducing it again.   That's in reference to
the last sentence of your reply:

    I don't think that sublocal scopes is a recurring issue

How many times does it have to come up before "recurring" applies? ;-)
 I've seen it come up many times over ... well, literally decades by
now.


> ...
> While I started off with Python 1.5, I wasn't part of the discussions
> about nested scopes. But I'm astonished that you say that nested scopes
> were controversial. *Closures* I would completely believe, but mere
> lexical scoping? Astonishing.

But true.  Guido agonized over it for a long time.  Limiting to 3
scopes was a wholly deliberate design decision at the start, not just,
e.g., due to lack of time to implement lexical scoping at the start.
And that shouldn't be surprising given Python's start as "somewhere
between a scripting language and C", and the many influences carried
over from Guido's time working on ABC's implementation team (ABC had
no lexical scoping either - nor, if I recall correctly, even textual
nesting of its flavor of functions).

I'm glad he tried it!  Everyone learned something from it.


> Even when I started, as a novice programmer who wouldn't have recognised
> the term "lexical scoping" if it fell on my head from a great height, I
> thought it was strange that inner functions couldn't see their
> surrounding function's variables. Nested scopes just seemed intuitively
> obvious: if a function sees the variables in the module surrounding it,
> then it should also see the variables in any function surrounding it.
>
> This behaviour in Python 1.5 made functions MUCH less useful:
>
>
> >>> def outer():
> ...     x = 1
> ...     def inner():
> ...             return x
> ...     return inner()
> ...
> >>> outer()
> Traceback (innermost last):
>   File "<stdin>", line 1, in ?
>   File "<stdin>", line 5, in outer
>   File "<stdin>", line 4, in inner
> NameError: x
>
>
> I think it is fair to say that inner functions in Python 1.5 were
> crippled to the point of uselessness.

I don't think that's fair to say.  A great many functions are in fact
... functions ;-)  That is, they compute a result from the arguments
passed to them.  They don't need more than that, although being able
to access globals and builtins and import whatever they want from the
standard library made them perfectly capable of doing a whole lot more
than just staring at their arguments.

To this day, _most_ of the nested functions I write would have worked
fine under the original scoping rules, because that's all they need.
Many of the rest are recursive, but would also work fine if I passed
their names into them and rewrote the bits of code to do recursive
calls via the passed-in name.  But, yes, I am relieved I don't need to
do the latter anymore ;-)

... [snip similar things about closures] ...


>> Since then, Python has gone down a pretty bizarre path, inventing
>> sublocal scopes on an ad hoc basis by "pure magic" when their absence
>> in some specific context seemed just too unbearable to live with
>> (e.g., in comprehensions).  So we already have sublocal scopes, but in
>> no explicit form that can be either exploited or explained.
>
> I'm not entirely sure that comprehensions (including generator
> expressions) alone counts as "a path" :-) but I agree with this. I'm not
> a fan of comprehensions being their own scope. As far as I am concerned,
> leakage of comprehension variables was never a problem that needed to be
> solved, and was (very occasionally) a useful feature. Mostly for
> introspection and debugging.
>
> But the decision was made for generator comprehensions to be in their
> own scope, and from there I guess it was inevitable that list
> comprehensions would have to match.

I didn't mind comprehensions "leaking" either.  But I expect the need
became acute when  generator expressions were introduced, because the
body of those can execute in an environment entirely unrelated to the
definition site:

    def process(g):
        i = 12
        for x in g:
            pass
        print(i) # was i clobbered?  nope!

    process(i for i in range(3))

... [snip more "head arguments" about "=" vs "=="] ...

>> But I could compromise ;-)
>>
>> - There must be at least one argument.
>> - The first argument must be a binding.
>> - All but the last argument must also be bindings.
>> - If there's more than one argument, the last argument must be an expression.

> That's not really a complete specification of the pseudo-function
> though, since sometimes the sublocals it introduces extend past the
> final parenthesis and into the subsequent block.

It was talking about compile-time-checkable syntactic requirements,
nothing about semantics.


> What will, for example, this function return?
>
> spam = eggs = "global"
> def func(arg=local(spam="sublocal", eggs="sublocal", 1)):
>     eggs = "local"
>     return (spam, eggs)

It would return (spam, "local"), for whatever value `spam` happened to
be bound to at the time of the call.  `local()` had magically extended
scope only in `if`/`elif` and `while` openers.  In this example, it
would be the same as if the `def` had been written

def func(arg=1):

Remember that default argument values are computed at the time `def`
is executed, and never again.  At that time, local(spam="sublocal",
eggs="sublocal", 1) created two local names that are never referenced,
threw the new scope away, and returned 1, which was saved away as the
default value for `arg`.
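
A quick illustration of that rule in plain Python:

    x = 1
    def g(arg=x):   # the default is computed here, exactly once
        return arg
    x = 2
    print(g())      # prints 1 - rebinding x later changes nothing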

> Even if you think nobody will be tempted to write such "clever" (dumb?)
> code, the behaviour still has to be specified.

Of course, but that specific case wasn't even slightly subtle ;-)

From gadgetsteve at live.co.uk  Sat Apr 28 22:51:05 2018
From: gadgetsteve at live.co.uk (Steve Barnes)
Date: Sun, 29 Apr 2018 02:51:05 +0000
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 
Message-ID: 



On 28/04/2018 04:34, Yury Selivanov wrote:
> Hi Tim,
> 
> This is interesting. Even "as is" I prefer this to PEP 572. Below are some
> comments and a slightly different idea inspired by yours (sorry!)
> 
> On Fri, Apr 27, 2018 at 10:41 PM Tim Peters  wrote:
> [..]
>> As an expression, it's
> 
>>       "local" "(" arguments ")"
> 
>> - Because it "looks like" a function call, nobody will expect the targets
>>     of named arguments to be fancier than plain names.
> [..]
>> Everyone's favorite:
> 
>> if local(m = re.match(regexp, line)):
>>       print(m.group(0))
> 
>> Here's where it's truly essential that the compiler know everything
>> about "local", because in _that_ context it's required that the new
>> scope extend through the end of the entire block construct (exactly
> 
> It does look like a function call, although it has a slightly different
> syntax. In regular calls we don't allow positional arguments to go after
> keyword arguments.  Hence the compiler/parser will have to know what
> 'local(..)' is *regardless* of where it appears.
> 
> If you don't want to make 'local' a new keyword, we would need to make the
> compiler/parser to trace the "local()" name to check if it was imported or
> is otherwise "local". This would add some extra complexity to already
> complex code.  Another problematic case is when one has a big file and
> someone adds their own "def local()" function to it at some point, which
> would break things.
> 
> Therefore, "local" should probably be a keyword. Perhaps added to Python
> with a corresponding "from __future__" import.
> 
> The other way would be to depart from the function call syntax by dropping
> the parens.  (And maybe rename "local" to "let" ;))  In this case, the
> syntax will become less like a function call but still distinct enough.  We
> will be able to unambiguously parse & compile it.  The cherry on top is
> that we can make it work even without a "__future__" import!
> 
> When we implemented PEP 492 in Python 3.5 we did a little trick in
> tokenizer to treat "async def" in a special way. Tokenizer would switch to
> an "async" mode and yield ASYNC and AWAIT tokens instead of NAME tokens.
> This resulted in async/await syntax available without a __future__ import,
> while having full backwards compatibility.
> 
> We can do a similar trick for "local" / "let" syntax, allowing the
> following:
> 
>     "let" NAME "=" expr ("," NAME = expr)* ["," expr]
> 
> * "if local(m = re.match(...), m):" becomes
>      "if let m = re.match(...), m:"
> 
> * "c = local(a=3) * local(b=4)" becomes
>     "c = let a=3, b=4, a*b" or "c = (let a=3, b=4, a*b)"
> 
> *      for i in iterable:
>             if let i2=i*i, i2 % 18 == 0:
>                append i2 to the output list
> 
> etc.
> 
> Note that I don't propose this new "let" or "local" to return their last
> assignment. That should be done explicitly (as in your "local(..)" idea):
>    `let a = 'spam', a`.  Potentially we could reuse our function return
> annotation syntax, changing the last example to `let a = "spam" -> a` but I
> think it makes the whole thing look unnecessarily complex.
> 
> One obvious downside is that "=" would have a different precedence compared
> to a regular assignment statement. But it already has a different precedence
> in function calls, so maybe this isn't a big deal, considering that we'll
> have a keyword before it.
> 
> I think that "let" was discussed a couple of times recently, but it's
> really hard to find a definitive reason of why it was rejected (or was it?)
> in the ocean of emails about assignment expressions.
> 
> Yury
> 
If things were to go the keyword direction I personally think that a 
great deal of clarity could be achieved by borrowing somewhat from 
Pascal with either

   use <expression> as <name> [, <expression> as <name> ...]:
       # Local names only exist in this scope,
       # existing names are in scope unless overridden
       # new names introduced here go out to nesting scope

OR (for those who prefer name first, a function-like look, and slightly
less typing):

   using(<name>=<expression>[, <name>=<expression> ...]):
       # Local names only exist in this scope,
       # existing names are in scope unless overridden
       # new names introduced here go out to nesting scope

Lastly how about:

     <name> = using(<name>=<expression>[, <name>=<expression> ...])
     # in this case 'where' might be better than 'using'

Presumably, in the latter cases, the using function would return a 
sub_local dictionary that would only exist to the end of the scope, be 
that the current line or an indented block, which should minimise the 
required magic.
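
As a very rough sketch of the function-like flavour (the names "using"
and "sub_local" here are placeholders of mine, and a plain function
like this can only build the mapping - it cannot actually inject or
retract a scope):

    def using(**bindings):
        # collect the would-be sub-local names into a dictionary
        return bindings

    sub_local = using(m1=1, m2=2)
    print(sub_local)   # {'m1': 1, 'm2': 2}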

-- 
Steve (Gadget) Barnes
Any opinions in this message are my personal opinions and do not reflect 
those of my employer.



From greg.ewing at canterbury.ac.nz  Sun Apr 29 01:02:18 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 29 Apr 2018 17:02:18 +1200
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 
 
Message-ID: <5AE551DA.1000303@canterbury.ac.nz>

Tim Peters wrote:
> The points to using function-call-like syntax were already covered
> ("nothing syntactically new to learn there",

The trouble is that one usually expects "nothing syntactically
new" to imply "nothing semantically new" as well, which is very
far from the case here. So I think that not using *any* new
syntax would actually be hurting users rather than helping them.

If you really want to leverage existing knowledge, I'd suggest
something based on lambda:

   let a = 3, b = 4: a + b

This can be easily explained as a shorthand for

   (lambda a = 3, b = 4: a + b)()

except, of course, for the magic needed to make it DWIM in an if
or while statement. I'm still pretty uncomfortable about that.

-- 
Greg

From timothy.c.delaney at gmail.com  Sun Apr 29 01:15:23 2018
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Sun, 29 Apr 2018 05:15:23 +0000
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 
 
Message-ID: 

On Sun, 29 Apr 2018 at 10:30, Tim Peters  wrote:

> [Tim Delaney ]
> > My big concern here involves the:
> >
> >
> > if local(m = re.match(regexp, line)):
> >     print(m.group(0))
> >
> > example. The entire block needs to be implicitly local for that to work -
> > what happens if I assign a new name in that block?
>
> I really don't know what you're asking there.  Can you make it
> concrete?  If, e.g., you're asking what happens if this appeared after
> the `print`:
>
>         x = 3.14
>
> then the answer is "the same as what would happen if `local` had not
> been used".  We can't know what that is without context, though.
> Maybe x is global.  Maybe x was declared nonlocal earlier.  Maybe it's
> function-local.  While it may be irrelevant to what you're asking, I
> noted just before:
>

That's exactly what I was asking, and as I understand what you're saying,
we would have a local name m available in the indented block which went
away when the block ended, but any names modified in the block are not
local to the block. That seems likely to be a source of errors.

To clarify my understanding, if the names 'x' and 'm' did not exist prior
to the following code, what would x and m refer to after the block
completed?

if local(m = re.match(regexp, line)):
    x = 1
    m = 2


> >> if local(m = re.match(regexp, line)):
> >>     print(m.group(0))
>
> > if local { m = re.match(regexp, line) }:
> >     print(m.group(0))
>
> OK, this is the only case in which you used it in an `if` or `while`
> expression.  All the questions you asked of me at the start can be
> asked of this spelling too.
>
> You seemed to imply at the start that the
> right curly brace would always mark the end of the new scope.  But if
> that's so, the `m` in `m.group()` has nothing to do with the `m`
> assigned to in the `local` block - _that_ scope ended before `print`
> was reached.
>

Yes - I think this is exactly the same issue as with your proposed syntax.


> So if you're not just trying to increase the level of complexity of
> what can appear in a local block, a fundamental problem still needs
> solving ;-)  I suppose you could solve it like so:
>
> local { m = re.match(regexp, line)
>            if m:
>                print(m.group(0))
>          }
>
> but, besides losing the "shortcut", it would also mean something
> radically different if
>
>                x = 3.14
>
> appeared after the "print".  Right?  If a "local block" is taken
> seriously, then _all_ names bound inside it vanish when the block
> ends.


Indeed, and I don't have a proposal - just concerns it would be very
difficult to explain and understand exactly what would happen in the case
of something like:

if local(m = re.match(regexp, line)):
    x = 1
    m = 2

Regarding the syntax, I didn't want to really change your proposal, but
just thought the functionality was different enough from the function call
it appears to be that it probably merits different syntax.

Tim Delaney

From greg.ewing at canterbury.ac.nz  Sun Apr 29 01:19:37 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 29 Apr 2018 17:19:37 +1200
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 <20180428093334.GU7400@ando.pearwood.info>
 
 <20180429025005.GB7400@ando.pearwood.info>
 
Message-ID: <5AE555E9.5020102@canterbury.ac.nz>

Tim Peters wrote:
> (ABC had
> no lexical scoping either - nor, if I recall correctly, even textual
> nesting of its flavor of functions).

If Python hadn't allowed textual nesting either, folks
might have been content to leave it that way. But having
textual nesting without lexical scoping was just weird
and confusing!

 > A great many functions are in fact
> ... functions ;-)  That is, they compute a result from the arguments
> passed to them.  They don't need more than that,

Yes, but they often make use of other functions to do
that, and not being able to call other local functions
in the same scope seemed like a perverse restriction.

-- 
Greg


From steve at pearwood.info  Sun Apr 29 01:53:17 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 29 Apr 2018 15:53:17 +1000
Subject: [Python-ideas] Sublocal scoping at its simplest
In-Reply-To: 
References: 
Message-ID: <20180429055317.GC7400@ando.pearwood.info>

On Sun, Apr 29, 2018 at 01:14:54PM +1000, Chris Angelico wrote:

[...]
> def f():
>     e = 2.71828
>     try:
>         1/0
>     except Exception as e:
>         print(e)
>     print(e)
> f()


> I propose to change the semantics as follows:
> 
> 1) Bind the caught exception to a sublocal 'e'
> 2) Execute the suite, with the reference to 'e' seeing the sublocal
> 3) Set the sublocal e to None
> 4) Unbind the sublocal e
> 
> At the unindent, the sublocal name will vanish, and the original 'e'
> will reappear. Thus the final print will display 2.71828, just as it
> would if no exception had been raised.

What problem does this solve?

The current behaviour where 'e' is unbound when the except clause 
finishes is a necessary but ugly hack that forces you to bind 'e' to 
another variable if you want to inspect it after the exception:

    try:
        something()
    except Exception as e:
        err = e  # defeat the automatic deletion of e
    print(e)

For example, in the interactive interpreter, where I do this very 
frequently.

I understand and accept the reasons for deleting e in Python 3, and 
don't wish to re-debate those. But regardless of whether we have 
Python 3 behaviour or Python 2 behaviour, binding to e has always 
replaced the value of e. Just as if I had written:

    except Exception as some_other_name:
        e = some_other_name

It has never been the case that binding to e in the except clause won't 
replace any existing binding to e, and I see no reason why anyone would 
desire that. If you don't want to replace e, then don't use e as the 
name for the exception.

Your proposal doesn't solve any known problem that I can see. For people 
like me who want to inspect the error object outside the except clause, 
we still have to defeat the compiler, so you're not solving anything for 
me. You're not solving any problems for those people who desire (for 
some reason) that except clauses are their own scope. It is only the 
exception variable itself which is treated as a special case.

The one use-case you give is awfully dubious: if I wanted 'e' to keep 
its value from before the exception, why on earth would I rebind 'e' 
when there are approximately a zillion alternative names I could use?

Opportunities for confusion should be obvious:

    e = 2.71
    x = 1
    try:
        ...
    except Exception as e:
        assert isinstance(e, Exception)
        x = 2
    assert isinstance(e, Exception)  # why does this fail?
    assert x != 2  # why does this fail?

Conceptually, this is even more complex than the idea of giving the 
except block its own scope. The block shares the local scope, it's just 
the block header (the except ... as ... line itself) which introduces a 
new scope.



-- 
Steve

From tim.peters at gmail.com  Sun Apr 29 02:18:40 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 29 Apr 2018 01:18:40 -0500
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: <5AE551DA.1000303@canterbury.ac.nz>
References: 
 
 
 <5AE551DA.1000303@canterbury.ac.nz>
Message-ID: 

[Tim Peters]
>> The points to using function-call-like syntax were already covered
>> ("nothing syntactically new to learn there",

[Greg Ewing]
> The trouble is that one usually expects "nothing syntactically
> new" to imply "nothing semantically new" as well, which is very
> far from the case here. So I think that not using *any* new
> syntax would actually be hurting users rather than helping them.

I expect anyone who has programmed for over a year has proved beyond
doubt that they can run a marathon even if forced to wear a backpack
containing a ton of lead, with barbed wire wrapped around their feet
;-)

They're indestructible.  If they came from Perl, they'd even have a
hearty laugh when they learned the syntax had utterly fooled them :-)


> If you really want to leverage existing knowledge, I'd suggest
> something based on lambda:
>
>   let a = 3, b = 4: a + b
>
> This can be easily explained as a shorthand for
>
>   (lambda a = 3, b = 4: a + b)()

In that case, yes, but not in all.  There's more than one kind of
magic here, even outside of block constructs.  See, e.g., the
quadratic equation example in the original post

    local(D = b**2 - 4*a*c,
          sqrtD = math.sqrt(D),
          ...

Try that with a lambda, and you get a NameError when computing sqrt(D)
- or, worse, pick up an irrelevant value of D left over from earlier
code.

In Scheme terminology, what Python does with default args (keyword
args too) is "let" (as if you evaluate all the expressions before
doing any of the bindings), but what's wanted is "let*" (bindings are
established left-to-right, one at a time, and each binding already
established is visible in the expression part of each later binding).
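
You can see the "let" (not "let*") behaviour directly with lambda
defaults today: each default expression is evaluated in the *enclosing*
scope, so a later default can't see an earlier one. Assuming no stale
global D is lying around, this fails at definition time:

    import math
    a, b, c = 1.0, 5.0, 6.0
    f = lambda D=b**2 - 4*a*c, sqrtD=math.sqrt(D): sqrtD
    # NameError: name 'D' is not defined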

But even with `let*`, I'm not sure whether this would (or should, or
shouldn't) work:

local(even = (lambda n: n == 0 or odd(n-1)),
      odd = (lambda n: False if n == 0 else even(n-1)),
      odd(7))

Well, OK, I'm pretty sure it would work.  But by design or by accident? ;-)
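
The plain-Python analogue of that pair already works, because the
lambda bodies look the names up lazily, at call time:

    even = lambda n: n == 0 or odd(n - 1)
    odd = lambda n: False if n == 0 else even(n - 1)
    print(odd(7))   # True - 'even' is resolved only when first called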


> except, of course, for the magic needed to make it DWIM in an if
> or while statement. I'm still pretty uncomfortable about that.

That's because it's horrid.  Honest, I'm not even convinced it's
_worth_ "solving" , even if it didn't seem to require deep magic.

From tim.peters at gmail.com  Sun Apr 29 02:57:21 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 29 Apr 2018 01:57:21 -0500
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 
 
 
Message-ID: 

[Tim Delaney ]
>>> My big concern here involves the:
>>>
>>> if local(m = re.match(regexp, line)):
>>>     print(m.group(0))
>>>
>>> example. The entire block needs to be implicitly local for that to work
>>> -
>>> what happens if I assign a new name in that block?

[Tim Peters]
>> I really don't know what you're asking there.  Can you make it
>> concrete?  If, e.g., you're asking what happens if this appeared after
>> the `print`:
>>
>>         x = 3.14
>>
>> then the answer is "the same as what would happen if `local` had not
>> been used".  We can't know what that is without context, though.
>> Maybe x is global.  Maybe x was declared nonlocal earlier.  Maybe it's
>> function-local. ...

[Tim D]
> That's exactly what I was asking, and as I understand what you're saying, we
> would have a local name m available in the indented block which went away
> when the block ended, but any names modified in the block are not local to
> the block. That seems likely to be a source of errors.

If what you _want_ is a genuinely new scope, yes.  But no actual
use cases so far wanted that at all.

This is the kind of code about which there have been background
complaints "forever":

    m1 = regexp1.match(line)
    m2 = regexp2.match(line)
    if m1 and m2:
        do all sorts of stuff with m1 and/or m2,
        including perhaps modifying local variables
        and/or global variables
        and/or nonlocal variables

The complaints are of two distinct kinds:

1. "I want to compute m1 and m2 _in_ the `if` test".

2. "I don't want these temp names (m1 and m2) accidentally
   conflicting with local names already in scope - if these names
   already exist, I want the temp names to shadow their
   current bindings until the `if` structure is done".

So,

    if local(m1=regexp1.match(line),
              m2 = regexp2.match(line),
              m1 and m2):

intends to address both complaints via means embarrassingly obvious to
the most casual observer ;-)

This is, e.g., the same kind of name-specific "shadowing" magically
done by list and dict comprehensions now, and by generator
expressions.  For example,

    [i**2 for i in range(10)]

has no effect on whatever `i` meant before the listcomp was executed.
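
A quick check in current Python 3:

    i = "before"
    squares = [i**2 for i in range(10)]
    print(i)   # still "before" - the comprehension's i never leaked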


> To clarify my understanding, if the names 'x' and 'm' did not exist prior to
> the following code, what would x and m refer to after the block completed?
>
> if local(m = re.match(regexp, line)):
>     x = 1
>     m = 2

I hope the explanation above made that clear.  What's wanted is
exactly what the current

    m = re.match(regexp, line)
    if m:
        x = 1
        m = 2

_would_ do if only there were a sane way to spell "save m's current
status before that all started and restore it after that all ends".

So they want `x == 1` after it's over, and `m` to raise NameError.
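
Spelled out by hand at module level, that save/restore dance looks
something like this sketch (the sentinel and the stand-in regexp/line
values are mine):

    import re
    regexp, line = r"\w+", "hello world"   # stand-ins for the example
    _missing = object()
    _saved = globals().get('m', _missing)  # capture m's current status
    m = re.match(regexp, line)
    if m:
        x = 1
        m = 2
    if _saved is _missing:
        del m            # restore "m never existed"
    else:
        m = _saved       # restore m's previous binding

    print(x)             # 1
    # print(m) would now raise NameError, as desired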


>>> if local { m = re.match(regexp, line) }:
>>>     print(m.group(0))

>> OK, this is the only case in which you used it in an `if` or `while`
>> expression.  All the questions you asked of me at the start can be
>> asked of this spelling too.
>> You seemed to imply at the start that the
>> right curly brace would always mark the end of the new scope.  But if
>> that's so, the `m` in `m.group()` has nothing to do with the `m`
>> assigned to in the `local` block - _that_ scope ended before `print`
>> was reached.

> Yes - I think this is exactly the same issue as with your proposed syntax.

Wholly agreed :-)


>> So if you're not just trying to increase the level of complexity of
>> what can appear in a local block, a fundamental problem still needs
>> solving ;-)  I suppose you could solve it like so:
>>
>> local { m = re.match(regexp, line)
>>            if m:
>>                print(m.group(0))
>>          }
>>
>> but, besides losing the "shortcut", it would also mean something
>> radically different if
>>
>>                x = 3.14
>>
>> appeared after the "print".  Right?  If a "local block" is taken
>> seriously, then _all_ names bound inside it vanish when the block
>> ends.

> Indeed, and I don't have a proposal - just concerns it would be very
> difficult to explain and understand exactly what would happen in the case of
> something like:
>
> if local(m = re.match(regexp, line)):
>     x = 1
>     m = 2

Only names appearing as targets _in_ the `local(...)` are affected in
any way.  The states of those names are captured, then those names are
bound to the values of the associated expressions in the `local(...)`,
and when the scope of the `local` construct ends (which _is_ hard to
explain!) those names' original states are restored.

So the effects on names are actually pretty easy to explain:  all and
only the names appearing inside the `local(...)` are affected.


> Regarding the syntax, I didn't want to really change your proposal, but just
> thought the functionality was different enough from the function call it
> appears to be that it probably merits different syntax.

Probably so!

From ncoghlan at gmail.com  Sun Apr 29 04:03:19 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 29 Apr 2018 18:03:19 +1000
Subject: [Python-ideas] Sublocal scoping at its simplest
In-Reply-To: 
References: 
Message-ID: 

On 29 April 2018 at 13:14, Chris Angelico  wrote:

> There's been a lot of talk about sublocal scopes, within and without
> the context of PEP 572. I'd like to propose what I believe is the
> simplest form of sublocal scopes, and use it to simplify one specific
> special case in Python.
>
> There are no syntactic changes, and only a very slight semantic change.
>
> def f():
>     e = 2.71828
>     try:
>         1/0
>     except Exception as e:
>         print(e)
>     print(e)
> f()
>
> The current behaviour of the 'except... as' statement is as follows:
>
> 1) Bind the caught exception to the name 'e', replacing 2.71828
> 2) Execute the suite (printing "Division by zero")
> 3) Set e to None
> 4) Unbind e
>
> Consequently, the final print call raises UnboundLocalError. I propose
> to change the semantics as follows:
>
> 1) Bind the caught exception to a sublocal 'e'
> 2) Execute the suite, with the reference to 'e' seeing the sublocal
> 3) Set the sublocal e to None
> 4) Unbind the sublocal e
>
> At the unindent, the sublocal name will vanish, and the original 'e'
> will reappear. Thus the final print will display 2.71828, just as it
> would if no exception had been raised.
>

The challenge with doing this implicitly is that there's no indication
whatsoever that the two "e"'s are different, especially given the
longstanding precedent that the try/except level one will overwrite any
existing reference in the local namespace.

By contrast, if the sublocal marker could be put on the *name itself*, then:

1. Sublocal names are kept clearly distinct from ordinary names
2. Appropriate sublocal semantics can be defined for any name binding
operation, not just exception handlers
3. When looking up a sublocal for code compiled in exec or eval mode,
missing names can be identified and reported at compile time (just as they
can be for nonlocal declarations) (Such a check likely wouldn't be possible
for code compiled in "single" mode, although working out a suitable
relationship between sublocal scoping and the interactive prompt is likely
to prove tricky no matter what)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From rosuav at gmail.com  Sun Apr 29 07:24:40 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 29 Apr 2018 21:24:40 +1000
Subject: [Python-ideas] Sublocal scoping at its simplest
In-Reply-To: 
References: 
 
Message-ID: 

On Sun, Apr 29, 2018 at 6:03 PM, Nick Coghlan  wrote:
> The challenge with doing this implicitly is that there's no indication
> whatsoever that the two "e"'s are different, especially given the
> longstanding precedent that the try/except level one will overwrite any
> existing reference in the local namespace.

My intention is that the "except" statement IS the indication that
they're different. Now that the name gets unbound at the exit of the
clause, the only indication that it overwrites is that, after "except
Exception as e:", any previous e has been disposed of. I'd hardly call
that a feature. Can you show me code that actually DEPENDS on this
behaviour?

> By contrast, if the sublocal marker could be put on the *name itself*, then:
>
> 1. Sublocal names are kept clearly distinct from ordinary names
> 2. Appropriate sublocal semantics can be defined for any name binding
> operation, not just exception handlers
> 3. When looking up a sublocal for code compiled in exec or eval mode,
> missing names can be identified and reported at compile time (just as they
> can be for nonlocal declarations) (Such a check likely wouldn't be possible
> for code compiled in "single" mode, although working out a suitable
> relationship between sublocal scoping and the interactive prompt is likely
> to prove tricky no matter what)

I'm aware of this, but that gets us right back to debating syntax, and
I'm pretty sure death and syntaxes are the two things that we can
truly debate forever. :)

ChrisA

From ncoghlan at gmail.com  Sun Apr 29 07:39:45 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 29 Apr 2018 21:39:45 +1000
Subject: [Python-ideas] Sublocal scoping at its simplest
In-Reply-To: 
References: 
 
 
Message-ID: 

On 29 April 2018 at 21:24, Chris Angelico  wrote:

> On Sun, Apr 29, 2018 at 6:03 PM, Nick Coghlan  wrote:
> > The challenge with doing this implicitly is that there's no indication
> > whatsoever that the two "e"'s are different, especially given the
> > longstanding precedent that the try/except level one will overwrite any
> > existing reference in the local namespace.
>
> My intention is that the "except" statement IS the indication that
> they're different. Now that the name gets unbound at the exit of the
> clause, the only indication that it overwrites is that, after "except
> Exception as e:", any previous e has been disposed of. I'd hardly call
> that a feature. Can you show me code that actually DEPENDS on this
> behaviour?
>

That's not the bar the proposal needs to meet, though: it needs to meet the
bar of being *better* than the status quo of injecting an implicit "del e"
at the end of the suite.

While the status quo isn't always convenient, it has two main virtues:

1. It's easily explained in terms of the equivalent "del" statement
2. Given that equivalence, it's straightforward to avoid the unwanted side
effects by either adjusting your exact choices of names (if you want to
avoid overwriting an existing name), or else by rebinding the caught
exception to a different name (if you want to avoid the exception reference
getting dropped).

I do agree that *if* sublocal scopes existed, *then* they would offer a
reasonable implementation mechanism for block-scoped name binding in
exception handlers. However, exception handlers don't offer a good
motivation for *adding* sublocal scopes, simply because the simpler
"implicitly unbind the name at the end of the block" approach works well
enough in practice.
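
Concretely, that status quo expansion (steps 3 and 4 in Chris's list)
amounts to the compiler acting as if two lines were appended to every
handler suite:

    try:
        1/0
    except Exception as e:
        print(e)
        # the handler then behaves as if it ended with:
        e = None   # break the traceback reference cycle
        del e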

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From python at mrabarnett.plus.com  Sun Apr 29 10:52:59 2018
From: python at mrabarnett.plus.com (MRAB)
Date: Sun, 29 Apr 2018 15:52:59 +0100
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 
 
 
 
Message-ID: <015e4716-5cca-877f-ef95-2df6ade13004@mrabarnett.plus.com>

On 2018-04-29 07:57, Tim Peters wrote:
> [Tim Delaney ]
>>>> My big concern here involves the:
>>>>
>>>> if local(m = re.match(regexp, line)):
>>>>     print(m.group(0))
>>>>
>>>> example. The entire block needs to be implicitly local for that to work
>>>> -
>>>> what happens if I assign a new name in that block?
> 
> [Tim Peters]
>>> I really don't know what you're asking there.  Can you make it
>>> concrete?  If, e.g., you're asking what happens if this appeared after
>>> the `print`:
>>>
>>>         x = 3.14
>>>
>>> then the answer is "the same as what would happen if `local` had not
>>> been used".  We can't know what that is without context, though.
>>> Maybe x is global.  Maybe x was declared nonlocal earlier.  Maybe it's
>>> function-local. ...
> 
> [Tim D]
>> That's exactly what I was asking, and as I understand what you're saying, we
>> would have a local name m available in the indented block which went away
>> when the block ended, but any names modified in the block are not local to
>> the block. That seems likely to be a source of errors.
> 
> If what you _want_ is a genuinely new scope, yes.  But no actual
> use cases so far wanted that at all.
> 
> This is the kind of code about which there have been background
> complaints "forever":
> 
>      m1 = regexp1.match(line)
>      m2 = regexp2.match(line)
>      if m1 and m2:
>          do all sorts of stuff with m1 and/or m2,
>          including perhaps modifying local variables
>          and/or global variables
>          and/or nonlocal variables
> 
> The complaints are of two distinct kinds:
> 
> 1. "I want to compute m1 and m2 _in_ the `if` test".
> 
> 2. "I don't want these temp names (m1 and m2) accidentally
>     conflicting with local names already in scope - if these names
>     already exist, I want the temp names to shadow their
>     current bindings until the `if` structure is done".
> 
> So,
> 
>      if local(m1=regexp1.match(line),
>                m2 = regexp2.match(iine),
>                m1 and m2):
> 
> intends to address both complaints via means embarrassingly obvious to
> the most casual observer ;-)
> 
How about these:

     local m1, m2:
         m1 = regexp1.match(line)
         m2 = regexp2.match(line)
         if m1 and m2:
             ...


     local m1, m2:

         if (m1 := regexp1.match(line)) and (m2 := regexp2.match(line)):

             ...

     local m1=regexp1.match(line), m2=regexp2.match(line):
         if m1 and m2:
             ...


[snip]

From mikhailwas at gmail.com  Sun Apr 29 12:22:52 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Sun, 29 Apr 2018 19:22:52 +0300
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 
 
Message-ID: 

On Sun, Apr 29, 2018 at 3:30 AM, Tim Peters  wrote:

>
> """
> Time to note another subtlety:  people don't _really_ want "a new
> scope" in Python.  If they did, then _every_ name appearing in a
> binding context (assignment statement target, `for` target, ...) for
> the duration would vanish when the new scope ended.  What they really
> want is a new scope with an implied "nonlocal" declaration for every
> name appearing in a binding context _except_ for the specific names
> they're effectively trying to declare as being "sublocal" instead.
> """
>
> If by "new name" you mean that `x` didn't appear in any earlier line,
> then Python's current analysis would classify `x` as local to the
> current function (or as global if this is module-level code ...).
> That wouldn't change.


I have a hard time understanding what
the demand here actually is. (It's been too many
posts and ideas to absorb.)

From your description of "what people need" -
how is this different from current "def"?
Now I can use nested "def"s and call it right away:

def d():
    global a
    x = 1; y = 2
    a = x + y
d()
print (a)

And this will do the thing: here if the "a" variable is "new" then it
will be initialized and pushed to the outer scope, right?
If there is demand for this, how about just introducing a
derived syntax for the "auto-called" def block, say, just "def" without a name:

def :
    global a
    x = 1; y = 2
    a = x + y
print (a)

Which would act just like a scope block without any new rules introduced.
And for inline usage directly inside single-line expression - I don't
think it is
plausible to come up with very nice syntax anyway, and I bet at best you'll
end up with something looking C-ish, e.g.:

if (def {x=1; y=2; &a = x + y } ) :
    ...

As a short-cut for the above multi-line scope block.



Mikhail

From tim.peters at gmail.com  Sun Apr 29 13:01:49 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 29 Apr 2018 12:01:49 -0500
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: <015e4716-5cca-877f-ef95-2df6ade13004@mrabarnett.plus.com>
References: 
 
 
 
 
 <015e4716-5cca-877f-ef95-2df6ade13004@mrabarnett.plus.com>
Message-ID: 

[Tim]
>> ...
>> This is the kind of code about which there have been background
>> complaints "forever":
>>
>>      m1 = regexp1.match(line)
>>      m2 = regexp2.match(iine)
>>      if m1 and m2:
>>          do all sorts of stuff with m1 and/or m2,
>>          including perhaps modifying local variables
>>          and/or global variables
>>          and/or nonlocal variables
>>
>> The complaints are of two distinct kinds:
>>
>> 1. "I want to compute m1 and m2 _in_ the `if` test".
>>
>> 2. "I don't want these temp names (m1 and m2) accidentally
>>     conflicting with local names already in scope - if these names
>>     already exist, I want the temp names to shadow their
>>     current bindings until the `if` structure is done".
>>
>> So,
>>
>>      if local(m1=regexp1.match(line),
>>                m2 = regexp2.match(iine),
>>                m1 and m2):
>>
>> intends to address both complaints via means embarrassingly obvious to
>> the most casual observer ;-)


[MRAB ]
> How about these:
>
>     local m1, m2:
>         m1 = regexp1.match(line)
>         m2 = regexp2.match(line):
>         if m1 and m2:
>             ...
>
>
>     local m1, m2:
>
>         if (m1 := regexp1.match(line)) and (m2 := regexp2.match(line)):
>
>             ...
>
>     local m1=regexp1.match(line), m2=regexp2.match(line):
>         if m1 and m2:


They address complaint #2 in what seems to me a thoroughly Pythonic
(direct, transparent, no more magical than necessary, easy to read)
way.  They don't address complaint #1 at all, but as you've shown (in
the 2nd spelling)  that isn't _inherently_ tied to complaint #2
(complaint #1 is what PEP 572 addresses).

So _if_ PEP 572 is accepted, adding this form of a compound `local`
statement too would address both of the listed complaints, at the
"cost" of a bit more typing and adding a level of indentation.
Neither of which bother me ;-)

`local()` itself was also intended to address the
even-more-in-the-background recurring desires for an expression (as
opposed to statement) oriented way to use throwaway bindings; e.g.,
instead of

    temp = x + y - z + 1
    r = temp**2 - 1/temp

this instead:

    r = local(t=x + y - z + 1, t**2 - 1/t)

It's not an accident that the shorter `t` is used in the latter than
the former's `temp`:  when people are wary of clobbering names by
accident, they tend to use longer names that say "I'm just a temp -
please don't _expect_ my binding to persist beyond the immediate uses
on the next few lines".

Anyway, that kind of thing is common in functional languages, where

    "let" pile-of-bindings "in" expression

kinds of constructs are widely used _as_ (sub)expressions themselves.

    local t = x + y - z + 1:
        r = t**2 - 1/t

would be the same semantically, but they'd still complain about the
"extra" typing and the visual "heaviness" of introducing a block for
what they _think_ of as being "just another kind of expression".

The `local()` I brought up was, I think, far too biased _toward_ that
use.  It didn't "play nice" with block-oriented uses short of
excruciatingly deep magic.  Your `local` statement is biased in the
other direction, but that's a Good Thing :-)

From ethan at stoneleaf.us  Sun Apr 29 13:19:33 2018
From: ethan at stoneleaf.us (Ethan Furman)
Date: Sun, 29 Apr 2018 10:19:33 -0700
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
Message-ID: <5AE5FEA5.1010603@stoneleaf.us>

On 04/27/2018 07:37 PM, Tim Peters wrote:

> Idea:  introduce a "local" pseudo-function to capture the idea of
> initialized names with limited scope.

> Note:  the thing I'm most interested in isn't debates, but in whether
> this would be of real use in real code.

I keep going back and forth on the ":=" syntax: on the one hand I find the functionality very useful, but on the other
hand it's ugly and doesn't really read well.

However, I can say I am solidly

-1

on local, let, etc.:

- local() vs locals(): very similar words with disparate meanings
- more parens -- ugh
- sublocal: one more scope for extra complexity

--
~Ethan~

From kirillbalunov at gmail.com  Sun Apr 29 13:23:38 2018
From: kirillbalunov at gmail.com (Kirill Balunov)
Date: Sun, 29 Apr 2018 20:23:38 +0300
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: <015e4716-5cca-877f-ef95-2df6ade13004@mrabarnett.plus.com>
References: 
 
 
 
 
 <015e4716-5cca-877f-ef95-2df6ade13004@mrabarnett.plus.com>
Message-ID: 

2018-04-29 17:52 GMT+03:00 MRAB :
>
>
>> How about these:
>
>     local m1, m2:
>         m1 = regexp1.match(line)
>         m2 = regexp2.match(line):
>         if m1 and m2:
>             ...
>

Is it possible to do the same thing, but with the help of `with` statement:

with local('m1', 'm2'):
    m1 = regex1.match(line)
    m2 = regex2.match(line)
    if m1 and m2:
        ...


With kind regards,
-gdg

From python at mrabarnett.plus.com  Sun Apr 29 14:18:17 2018
From: python at mrabarnett.plus.com (MRAB)
Date: Sun, 29 Apr 2018 19:18:17 +0100
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 
 
 
 
 <015e4716-5cca-877f-ef95-2df6ade13004@mrabarnett.plus.com>
 
Message-ID: <22ff2bc9-9e31-89ff-d2f7-6a8a8526ad7d@mrabarnett.plus.com>

On 2018-04-29 18:01, Tim Peters wrote:
> [Tim]
> >> ...
> >> This is the kind of code about which there have been background
> >> complaints "forever":
> >>
> >>      m1 = regexp1.match(line)
> >>      m2 = regexp2.match(line)
> >>      if m1 and m2:
> >>          do all sorts of stuff with m1 and/or m2,
> >>          including perhaps modifying local variables
> >>          and/or global variables
> >>          and/or nonlocal variables
> >>
> >> The complaints are of two distinct kinds:
> >>
> >> 1. "I want to compute m1 and m2 _in_ the `if` test".
> >>
> >> 2. "I don't want these temp names (m1 and m2) accidentally
> >>     conflicting with local names already in scope - if these names
> >>     already exist, I want the temp names to shadow their
> >>     current bindings until the `if` structure is done".
> >>
> >> So,
> >>
> >>      if local(m1=regexp1.match(line),
> >>                m2 = regexp2.match(line),
> >>                m1 and m2):
> >>
> >> intends to address both complaints via means embarrassingly obvious to
> >> the most casual observer ;-)
>
>
> [MRAB ]
> > How about these:
> >
> >     local m1, m2:
> >         m1 = regexp1.match(line)
> >         m2 = regexp2.match(line)
> >         if m1 and m2:
> >             ...
> >
> >
> >     local m1, m2:
> >
> >         if (m1 := regexp1.match(line)) and (m2 := regexp2.match(line)):
> >
> >             ...
> >
> >     local m1=regexp1.match(line), m2=regexp2.match(line):
> >         if m1 and m2:
>
>
> They address complaint #2 in what seems to me a thoroughly Pythonic
> (direct, transparent, no more magical than necessary, easy to read)
> way.  They don't address complaint #1 at all, but as you've shown (in
> the 2nd spelling)  that isn't _inherently_ tied to complaint #2
> (complaint #1 is what PEP 572 addresses).
>
> So _if_ PEP 572 is accepted, adding this form of a compound `local`
> statement too would address both of the listed complaints, at the
> "cost" of a bit more typing and adding a level of indentation.
> Neither of which bother me ;-)
>
> `local()` itself was also intended to address the
> even-more-in-the-background recurring desires for an expression (as
> opposed to statement) oriented way to use throwaway bindings; e.g.,
> instead of
>
>      temp = x + y - z + 1
>      r = temp**2 - 1/temp
>
> this instead:
>
>      r = local(t=x + y - z + 1, t**2 - 1/t)
>
> It's not an accident that the shorter `t` is used in the latter than
> the former's `temp`:  when people are wary of clobbering names by
> accident, they tend to use longer names that say "I'm just a temp -
> please don't _expect_ my binding to persist beyond the immediate uses
> on the next few lines".
>
> Anyway, that kind of thing is common in functional languages, where
>
>      "let" pile-of-bindings "in" expression
>
> kinds of constructs are widely used _as_ (sub)expressions themselves.
>
>      local t = x + y - z + 1:
>          r = t**2 - 1/t
>
> would be the same semantically, but they'd still complain about the
> "extra" typing and the visual "heaviness" of introducing a block for
> what they _think_ of as being "just another kind of expression".
>
> The `local()` I brought up was, I think, far too biased _toward_ that
> use.  It didn't "play nice" with block-oriented uses short of
> excruciatingly deep magic.  Your `local` statement is biased in the
> other direction, but that's a Good Thing :-)
>
As well as:

    local t = x + y - z + 1: r = t**2 - 1/t

I wonder if it could be rewritten as:

    r = local t = x + y - z + 1: t**2 - 1/t

Would parentheses be needed?

    r = (local t = x + y - z + 1: t**2 - 1/t)

It kind of resembles the use of default parameters with lambda!

The names would be local to the suite if used as a statement, or to the
following expression if used in an expression - either way, the bit after
the colon.


From Nikolaus at rath.org  Sun Apr 29 15:17:23 2018
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Sun, 29 Apr 2018 20:17:23 +0100
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
 (Tim Peters's message of "Fri, 27 Apr 2018 21:37:53 -0500")
References: 
Message-ID: <87po2hrfek.fsf@vostro.rath.org>

On Apr 27 2018, Tim Peters  wrote:
> Then `c` is 12, but `a` is still 1 and `b` is still 2.  Same thing in the end:
>
> c = local(a=3, b=4, a*b)

I think this can be done already with slightly different syntax:

c = (lambda a=3, b=4: a*b)()


The trailing () is a little ugly, but the semantics are much more
obvious. So maybe go with a variation that makes function evaluation
implicit?

c = lambda! a=3, b=4: a*b

(reads terribly, but maybe someone has a better idea).


> if local(m = re.match(regexp, line)):
>     print(m.group(0))

Of course, that wouldn't (and shouldn't) work anymore. But that's a good
thing, IMO :-).


Best,
-Nikolaus

-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             "Time flies like an arrow, fruit flies like a Banana."

From fakedme+py at gmail.com  Sun Apr 29 15:23:02 2018
From: fakedme+py at gmail.com (Soni L.)
Date: Sun, 29 Apr 2018 16:23:02 -0300
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: <87po2hrfek.fsf@vostro.rath.org>
References: 
 <87po2hrfek.fsf@vostro.rath.org>
Message-ID: <52917d24-21ef-8964-fa20-5bcb5f060ed5@gmail.com>



On 2018-04-29 04:17 PM, Nikolaus Rath wrote:
> On Apr 27 2018, Tim Peters  wrote:
>> Then `c` is 12, but `a` is still 1 and `b` is still 2.  Same thing in the end:
>>
>> c = local(a=3, b=4, a*b)
> I think this can be done already with slightly different syntax:
>
> c = (lambda a=3, b=4: a*b)()
>
>
> The trailing () is a little ugly, but the semantics are much more
> obvious. So maybe go with a variation that makes function evaluation
> implicit?
>
> c = lambda! a=3, b=4: a*b
>
> (reads terribly, but maybe someone has a better idea).
>
>
>> if local(m = re.match(regexp, line)):
>>      print(m.group(0))
> Of course, that wouldn't (and shouldn't) work anymore. But that's a good
> thing, IMO :-).
>
>
> Best,
> -Nikolaus
>

Has anyone heard of Lua? Lexical scoping? Block scope? (Python doesn't 
have blocks and that sucks?) Etc?

From ethan at stoneleaf.us  Sun Apr 29 15:39:34 2018
From: ethan at stoneleaf.us (Ethan Furman)
Date: Sun, 29 Apr 2018 12:39:34 -0700
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 <20180428093334.GU7400@ando.pearwood.info>
 
Message-ID: <5AE61F76.60900@stoneleaf.us>

On 04/28/2018 10:16 AM, Tim Peters wrote:

> ... but do realize that since PEP 572 dropped any
> notion of sublocal scopes, that recurring issue remains wholly
> unaddressed regardless.

If we need a sublocal scope, I think the most Pythonic* route to have it would be:

     with sublocal():
         blah blah

which would act just like local/global does now:

   - any assignment creates a new variable
     - unless that variable has been declared global/nonlocal
   - plain reads (no assignment ever happens) refer to nonlocal/global/built-in
     names

This has the advantages of:

  - no confusion about which variables are sublocal (acts like a new function scope)
  - no extra parens, assignments, expressions on "with sublocal():" line

Possible enhancements:

  - give sublocal block a name "with sublocal() as blahblah:" and then reuse that block
    again later ("with blahblah:") or maybe pass it to other functions...

Of course, as previously stated, this is orthogonal to PEP 572.

--
~Ethan~

From tim.peters at gmail.com  Sun Apr 29 15:55:33 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 29 Apr 2018 14:55:33 -0500
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: <87po2hrfek.fsf@vostro.rath.org>
References: 
 <87po2hrfek.fsf@vostro.rath.org>
Message-ID: 

[Tim]
>> Then `c` is 12, but `a` is still 1 and `b` is still 2.  Same thing in the end:
>>
>> c = local(a=3, b=4, a*b)

[Nikolaus Rath ]
> I think this can be done already with slighly different syntax:
>
> c = (lambda a=3, b=4: a*b)()
>
> The trailing () is a little ugly, but the semantics are much more
> obvious.

But also broken, in a way that can't be sanely fixed.  Covered before
in other messages.  Short course:

>>> a = 10
>>> b = 20
>>> (lambda a=3, b=a+1: (a, b))()
(3, 11)

This context really demands (3, 4) instead.  In Scheme terms, Python's
lambda default arguments do "let" binding ("all at once"), but "let*"
binding is what's needed ("one at a time, left to right, with bindings
already done visible to later bindings").

Of course in Scheme you explicitly type either "let" or "let*" (or
"letrec", or ...) depending on what you want at the time, but "let*"
is overwhelmingly what's wanted when it makes a difference (in the
example at the top, it makes no difference at all).  Otherwise you
can't build up a complex result from little named pieces that may
depend on pieces already defined.  See, e.g., the quadratic equation
example in the original post, where this was implicit.
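(The let/let* distinction mapped onto plain Python assignments - a sketch:)

    # "let": all right-hand sides are evaluated before any rebinding,
    # so they see only the old bindings.
    a = 10
    a, b = 3, a + 1   # b == 11, computed from the old a

    # "let*": bindings happen one at a time, left to right, so later
    # right-hand sides see the earlier results.
    a = 3
    b = a + 1         # b == 4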

> ...

From rosuav at gmail.com  Sun Apr 29 16:15:50 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 30 Apr 2018 06:15:50 +1000
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 <87po2hrfek.fsf@vostro.rath.org>
 
Message-ID: 

On Mon, Apr 30, 2018 at 5:55 AM, Tim Peters  wrote:
> [Tim]
>>> Then `c` is 12, but `a` is still 1 and `b` is still 2.  Same thing in the end:
>>>
>>> c = local(a=3, b=4, a*b)
>
> [Nikolaus Rath ]
>> I think this can be done already with slighly different syntax:
>>
>> c = (lambda a=3, b=4: a*b)()
>>
>> The trailing () is a little ugly, but the semantics are much more
>> obvious.
>
> But also broken, in a way that can't be sanely fixed.  Covered before
> in other messages.  Short course:
>
>>>> a = 10
>>>> b = 20
>>>> (lambda a=3, b=a+1: (a, b))()
> (3, 11)
>
> This context really demands (3, 4) instead.  In Scheme terms, Python's
> lambda default arguments do "let" binding ("all at once"), but "let*"
> binding is what's needed ("one at a time, left to right, with bindings
> already done visible to later bindings").

So maybe the effective semantics should be:

>>> (lambda a=3: (lambda b=a+1: (a, b))())()
(3, 4)

ChrisA

From tim.peters at gmail.com  Sun Apr 29 16:20:14 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 29 Apr 2018 15:20:14 -0500
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: <5AE61F76.60900@stoneleaf.us>
References: 
 <20180428093334.GU7400@ando.pearwood.info>
 
 <5AE61F76.60900@stoneleaf.us>
Message-ID: 

[Ethan Furman ]
> If we need a sublocal scope, I think the most Pythonic* route to have it
> would be:
>
>     with sublocal():
>         blah blah
>
> which would act just like local/global does now:
>
>   - any assignment creates a new variable
>     - unless that variable has been declared global/nonlocal
>   - plain reads (no assignment ever happens) refer to
> nonlocal/global/built-in names
> ...

As covered most recently in an exchange with Tim Delaney, best I can
tell absolutely nobody has wanted that.  By "sublocal scope" they
don't mean a full-fledged new scope at all, but a kind of limited
"shadowing" of a handful of specific, explicitly given names.  It acts
like a context manager, if there were a way to clearly spell

    save the current state of these specific identifiers at the start (& I
        couldn't care less whether they're local, nonlocal, or global - I
        don't know & don't care)

    then execute the code exactly as if this gimmick had never been used

    then, at the end, restore the specific identifier states we saved
at the start

It's the same kind of shadowing Python already does by magic for, e.g., `i`, in

    [i for i in range(3)]

So, e.g.,

"""
a = 42

def showa():
    print(a)

def run():
    global a

    local a: # assuming this existed
        a = 43
        showa()
    showa()
"""

would print 43 and then 42.  Which makes "local a:" sound senseless on
the face of it ;-)  "shadow" would be a more descriptive name for what
it actually does.
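(A rough runnable approximation of that save/restore spell, for
module-level names only; a sketch that assumes the manager is defined in
the same module whose globals it patches - function locals can't be
handled this way at all, as discussed later in the thread:)

    from contextlib import contextmanager

    _MISSING = object()

    @contextmanager
    def shadow_globals(**bindings):
        # Save the current global bindings (noting absent names),
        # install the new ones, and restore everything on exit.
        g = globals()
        saved = {name: g.get(name, _MISSING) for name in bindings}
        g.update(bindings)
        try:
            yield
        finally:
            for name, old in saved.items():
                if old is _MISSING:
                    g.pop(name, None)
                else:
                    g[name] = old

    a = 42

    def showa():
        print(a)

    with shadow_globals(a=43):
        showa()   # prints 43
    showa()       # prints 42 again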

> ...

From mikhailwas at gmail.com  Sun Apr 29 16:23:40 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Sun, 29 Apr 2018 23:23:40 +0300
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 
 
 
Message-ID: 

On Sun, Apr 29, 2018 at 7:22 PM, Mikhail V  wrote:
> On Sun, Apr 29, 2018 at 3:30 AM, Tim Peters  wrote:
>
>> Time to note another subtlety:  people don't _really_ want "a new
>> scope" in Python.  If they did, then _every_ name appearing in a

> If there is demand for this, how about just introducing a
> derived syntax for the "auto-called" def block, say, just "def" without a name:
>
> def :
>     global a
>     x = 1; y = 2
>     a = x + y
> print (a)
>

Or even better: avoid overloading "global" or "return", and use a
dedicated prefix for variables that are pushed to the outer scope.
I think it would look way better for cases with multiple variables:

def func():
    state = 0
    def:
        localstate1 = state + 1
        localstate2 = state + 2
        & localstate1
        & localstate2
    print (localstate1)
    print (localstate2)

Prefix in assignment would allow more expressive dispatching:

def func():
    state = 0
    def:
        localstate1 = state + 1
        localstate2 = state + 2
        & M = state + 3
        & L1,  & L2 = localstate1, localstate2
    print (L1, L2, M)


Imo such syntax is closest to the "def" block, and should be so,
because functions have a very strong association with new scope
definition, and the same rules should work here.
Not only is it more readable than "with local()", but it also
makes it easier to edit and comment/uncomment lines.

From mertz at gnosis.cx  Sun Apr 29 16:44:18 2018
From: mertz at gnosis.cx (David Mertz)
Date: Sun, 29 Apr 2018 16:44:18 -0400
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 <20180428093334.GU7400@ando.pearwood.info>
 
 <5AE61F76.60900@stoneleaf.us>
 
Message-ID: 

This doesn't address the fact no one actually needs it.  But if we WANTED a
sublocal() context manager, we could spell it something like this:

In [42]: @contextmanager
    ...: def sublocal(**kws):
    ...:     _locals = locals().copy()
    ...:     _globals = globals().copy()
    ...:     for k, v in kws.items():
    ...:         if k in locals():
    ...:             exec(f"locals()['{k}'] = {v}")
    ...:         elif k in globals():
    ...:             exec(f"globals()['{k}'] = {v}")
    ...:     yield
    ...:     locals().update(_locals)
    ...:     globals().update(_globals)
    ...:

In [43]: a = 42

In [44]: with sublocal(a=43):
    ...:     showa()
    ...:
43

In [45]: showa()
42

In [46]: with sublocal():
    ...:     a = 41
    ...:     showa()
    ...:
41

In [47]: showa()
42

On Sun, Apr 29, 2018 at 4:20 PM, Tim Peters  wrote:

> [Ethan Furman ]
> > If we need a sublocal scope, I think the most Pythonic* route to have it
> > would be:
> >
> >     with sublocal():
> >         blah blah
> >
>
> As covered most recently in an exchange with Tim Delaney, best I can
> tell absolutely nobody has wanted that.  By "sublocal scope" they
> don't mean a full-fledged new scope at all, but a kind of limited
> "shadowing" of a handful of specific, explicitly given names.  It acts
> like a context manager, if there were a way to clearly spell
>


-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

From tim.peters at gmail.com  Sun Apr 29 16:45:09 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 29 Apr 2018 15:45:09 -0500
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 <87po2hrfek.fsf@vostro.rath.org>
 
 
Message-ID: 

[Tim]
>>>> Then `c` is 12, but `a` is still 1 and `b` is still 2.  Same thing in the end:
>>>>
>>>> c = local(a=3, b=4, a*b)

[Nikolaus Rath ]
>>> I think this can be done already with slighly different syntax:
>>>
>>> c = (lambda a=3, b=4: a*b)()
>>>
>>> The trailing () is a little ugly, but the semantics are much more
>>> obvious.

[Tim]
>> But also broken, in a way that can't be sanely fixed.  Covered before
>> in other messages.  Short course:
>>
>> >>> a = 10
>> >>> b = 20
>> >>> (lambda a=3, b=a+1: (a, b))()
>> (3, 11)
>>
>> This context really demands (3, 4) instead.  In Scheme terms, Python's
>> lambda default arguments do "let" binding ("all at once"), but "let*"
>> binding is what's needed ("one at a time, left to right, with bindings
>> already done visible to later bindings").

[Chris Angelico ]
> So maybe the effective semantics should be:
>
> >>> (lambda a=3: (lambda b=a+1: (a, b))())()
> (3, 4)

Almost, but by that point the idea that this is already "easily
spelled" via lambdas has become ludicrously difficult to argue with a
straight face ;-)

By "almost", I mean there are other cases where even nesting Python
lambdas doesn't capture the intent.  In these cases, not only does the
expression defining b refer to a, but _also_ the expression defining a
refers to b.

You can play, if you like, with trying to define the `iseven` lambda
here in one line by nesting lambdas to define `even` and `odd` as
default arguments:

    even = (lambda n: n == 0 or odd(n-1))
    odd = (lambda n: False if n == 0 else even(n-1))
    iseven = lambda n: even(n)

Scheme supplies `letrec` for when "mutually recursive" bindings are
needed.  In Python that distinction isn't nearly as evidently needed,
because Python's idea of closures doesn't capture all the bindings
currently in effect.  For example, when `odd` above is defined,
Python has no idea at all what the then-current binding for `even` is
- it doesn't even look for "even" until the lambda is _executed_.

But, to be fair, I'm not sure:

    iseven = local(
        even = (lambda n: n == 0 or odd(n-1)),
        odd = (lambda n: False if n == 0 else even(n-1)),
        lambda n: even(n))

would have worked either.  At the moment I'm certain it wouldn't.
Last night I was pretty sure it would ;-)

From rosuav at gmail.com  Sun Apr 29 16:47:03 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 30 Apr 2018 06:47:03 +1000
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 <87po2hrfek.fsf@vostro.rath.org>
 
 
 
Message-ID: 

On Mon, Apr 30, 2018 at 6:45 AM, Tim Peters  wrote:
> [Chris Angelico ]
>> So maybe the effective semantics should be:
>>
>> >>> (lambda a=3: (lambda b=a+1: (a, b))())()
>> (3, 4)
>
> Almost, but by that point the idea that this is already "easily
> spelled" via lambdas has become ludicrously difficult to argue with a
> straight face ;-)

Oh, I dunno, I've seen people argue that PHP is the right choice of language :-)

ChrisA

From p.f.moore at gmail.com  Sun Apr 29 17:24:17 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Sun, 29 Apr 2018 22:24:17 +0100
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 <20180428093334.GU7400@ando.pearwood.info>
 
 <5AE61F76.60900@stoneleaf.us>
 
Message-ID: 

On 29 April 2018 at 21:20, Tim Peters  wrote:
> As covered most recently in an exchange with Tim Delaney, best I can
> tell absolutely nobody has wanted that.  By "sublocal scope" they
> don't mean a full-fledged new scope at all, but a kind of limited
> "shadowing" of a handful of specific, explicitly given names.  It acts
> like a context manager, if there were a way to clearly spell
>
>     save the current state of these specific identifiers at the start (& I
>         couldn't care less whether they're local, nonlocal, or global - I
>         don't know & don't care)
>
>     then execute the code exactly as if this gimmick had never been used
>
>     then, at the end, restore the specific identifier states we saved
> at the start

So maybe adding such a primitive (maybe something like states =
sys.get_variable_state('a', 'b', 'c') and
sys.set_variable_state(states)) would be useful? Of course, we've
moved away from real use cases and back to theoretical arguments now,
so it's entirely possible that doing so would only solve problems that
no-one actually has... David Mertz' sublocal context manager would be
a good prototype of such a thing - at least good enough to demonstrate
that it's of no benefit in practice.

Paul

From mertz at gnosis.cx  Sun Apr 29 17:53:03 2018
From: mertz at gnosis.cx (David Mertz)
Date: Sun, 29 Apr 2018 17:53:03 -0400
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 <20180428093334.GU7400@ando.pearwood.info>
 
 <5AE61F76.60900@stoneleaf.us>
 
 
Message-ID: 

Oops. My proof of anti-concept has a flaw.  It only "shadows" names that
already exist.  Presumably that's the wrong idea, but it's easy enough to
change if desired.

On Sun, Apr 29, 2018 at 5:24 PM, Paul Moore  wrote:

> On 29 April 2018 at 21:20, Tim Peters  wrote:
> > As covered most recently in an exchange with Tim Delaney, best I can
> > tell absolutely nobody has wanted that.  By "sublocal scope" they
> > don't mean a full-fledged new scope at all, but a kind of limited
> > "shadowing" of a handful of specific, explicitly given names.  It acts
> > like a context manager, if there were a way to clearly spell
> >
> >     save the current state of these specific identifiers at the start (& I
> >         couldn't care less whether they're local, nonlocal, or global - I
> >         don't know & don't care)
> >
> >     then execute the code exactly as if this gimmick had never been used
> >
> >     then, at the end, restore the specific identifier states we saved
> > at the start
>
> So maybe adding such a primitive (maybe something live states =
> sys.get_variable_state('a', 'b', 'c') and
> sys.set_variable_state(states)) would be useful? Of course, we've
> moved away from real use cases and back to theoretical arguments now,
> so it's entirely possible that doing so would only solve problems that
> no-one actually has... David Mertz' sublocal context manager would be
> a good prototype of such a thing - at least good enough to demonstrate
> that it's of no benefit in practice 
>
> Paul
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>



-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

From ethan at stoneleaf.us  Sun Apr 29 21:20:33 2018
From: ethan at stoneleaf.us (Ethan Furman)
Date: Sun, 29 Apr 2018 18:20:33 -0700
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 <20180428093334.GU7400@ando.pearwood.info>
 
 <5AE61F76.60900@stoneleaf.us>
 
Message-ID: <5AE66F61.9000900@stoneleaf.us>

On 04/29/2018 01:20 PM, Tim Peters wrote:

> So, e.g.,
>
> """
> a = 42
>
> def showa():
>      print(a)
>
> def run():
>      global a
>
>      local a: # assuming this existed
>          a = 43
>          showa()
>      showa()
> """
>
> would print 43 and then 42.  Which makes "local a:" sound senseless on
> the face of it ;-)  "shadow" would be a more descriptive name for what
> it actually does.

Yeah, "shadow" would be a better name than "local", considering that it effectively temporarily changes what other 
functions see as global.  Talk about a debugging nightmare! ;)

--
~Ethan~

From tim.peters at gmail.com  Sun Apr 29 21:28:37 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 29 Apr 2018 20:28:37 -0500
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 <20180428093334.GU7400@ando.pearwood.info>
 
 <5AE61F76.60900@stoneleaf.us>
 
 
 
Message-ID: 

[David Mertz ]
> Oops. My proof of anti-concept has a flaw.  It only "shadows" names that
> already exist.  Presumably that's the wrong idea, but it's easy enough to
> change if desired.

Even in the very early days when Python's runtime was more
relentlessly simple-minded than it is now, these kinds of things were
always hard to get right in all cases.  But it's heartening to see
_someone_ still has courage enough to leap into the void ;-)

I see that the docs for `locals()` now say:

    The contents of this dictionary should not be modified; changes
    may not affect the values of local and free variables used by
    the interpreter.

I thought that explained something I was seeing, but the problem
turned out to be more obvious:  the "locals()" inside your
"sublocal()" context manager refer to _sublocal_'s own locals, not to
the locals of the code invoking `sublocal()`.  So, e.g., run this:

    def run():
        a = 1
        b = 2

        def g(tag):
            print(f"{tag}: a {a} b {b}")

        with sublocal(a=6):
            g("first in block")
            a = 5
            g("set a to 5")
            b = 19
            g("set b to 19")
        g("after")

Here's output:

first in block: a 1 b 2  # the `a=6` had no visible effect
set a to 5: a 5 b 2  # golden
set b to 19: a 5 b 19  # also golden
after: a 5 b 19  # `a` wasn't restored to 1

To be very clear, the output is the same as if the `with` statement
were replaced with `if True:`.

But even if we crawled up the call stack to get the right locals()
dict, looks like `exec` is no longer enough (in Python 3) to badger
the runtime into making it work anyway:

    https://bugs.python.org/issue4831

"""
> Specifically, what is the approved way to have exec() modify the local
> environment of a function?

There is none.  To modify the locals of a function on the fly is not
possible without several consequences: normally, function locals are not
stored in a dictionary, but an array, whose indices are determined at
compile time from the known locals.  This collides at least with new
locals added by exec.  The old exec statement circumvented this, because
the compiler knew that if an exec without globals/locals args occurred
in a function, that namespace would be "unoptimized", i.e. not using the
locals array.  Since exec() is now a normal function, the compiler does
not know what "exec" may be bound to, and therefore cannot treat it
specially.
"""

Worm around that (offhand I don't know how), and there are nonlocal
names too.  I don't know whether Python's current introspection
features are enough to even find out which nonlocals have been
declared, let alone to _which_ scope each nonlocal belongs.

Worm around that too, then going back to the example at the top, if
the manager's

        locals().update(_locals)

had the intended effect, it would end up restoring `b` to 2 too, yes?
The only names that "should be" restored are the names in the `kws`
dict.

So, in all, this may be a case where it's easier to implement in the
compiler than to get working at runtime via ever-more-tortured Python
code.

And when that's all fixed, "a" can appear in both locals() and
globals() (not to mention also in enclosing scopes), in which case
what to do is unclear regardless how this is implemented.  Which
"a"(s) did the user _intend_ to shadow?

The fun never ends ;-)
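(A two-line demonstration of the core point in the quoted issue - in
CPython, writing to locals() inside a function simply doesn't stick; a
sketch:)

    def demo():
        x = 1
        locals()['x'] = 99   # locals() is a snapshot; the write is lost
        print(x)             # prints 1, not 99

    demo()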

From fakedme+py at gmail.com  Sun Apr 29 21:35:04 2018
From: fakedme+py at gmail.com (Soni L.)
Date: Sun, 29 Apr 2018 22:35:04 -0300
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: <5AE66F61.9000900@stoneleaf.us>
References: 
 <20180428093334.GU7400@ando.pearwood.info>
 
 <5AE61F76.60900@stoneleaf.us>
 
 <5AE66F61.9000900@stoneleaf.us>
Message-ID: <4a5496f7-c180-4a01-0de8-dcd9b07681ad@gmail.com>



On 2018-04-29 10:20 PM, Ethan Furman wrote:
> On 04/29/2018 01:20 PM, Tim Peters wrote:
>
>> So, e.g.,
>>
>> """
>> a = 42
>>
>> def showa():
>>     print(a)
>>
>> def run():
>>     global a
>>
>>     local a: # assuming this existed
>>         a = 43
>>         showa()
>>     showa()
>> """
>>
>> would print 43 and then 42.  Which makes "local a:" sound senseless on
>> the face of it ;-)  "shadow" would be a more descriptive name for what
>> it actually does.
>
> Yeah, "shadow" would be a better name than "local", considering that 
> it effectively temporarily changes what other functions see as 
> global.  Talk about a debugging nightmare! ;)

That ain't shadow. That is dynamic scoping.

Shadowing is something different:

def f():
    a = 42
    def g():
        print(a)
    local a:
        a = 43
        g()
    g()

should print "42" both times, *if it's lexically scoped*.

If it's lexically scoped, this is just adding another scope: blocks. 
(instead of the smallest possible scope being function scope)

>
> -- 
> ~Ethan~
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/


From mertz at gnosis.cx  Sun Apr 29 21:52:30 2018
From: mertz at gnosis.cx (David Mertz)
Date: Sun, 29 Apr 2018 21:52:30 -0400
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 <20180428093334.GU7400@ando.pearwood.info>
 
 <5AE61F76.60900@stoneleaf.us>
 
 
 
 
Message-ID: 

On Sun, Apr 29, 2018 at 9:28 PM, Tim Peters  wrote:

> [David Mertz ]
> > Oops. My proof of anti-concept has a flaw.  It only "shadows" names that
> > already exist.  Presumably that's the wrong idea, but it's easy enough to
> > change if desired.
>


>         with sublocal(a=6):
>             g("first in block")
>             a = 5
>             g("set a to 5")
>             b = 19
>             g("set b to 19")
>         g("after")
>
> Worm around that too, then going back to the example at the top, if
> the manager's
>
>         locals().update(_locals)
>
> had the intended effect, it would end up restoring `b` to 2 too, yes?
> The only names that "should be" restored are the names in the `kws`
> dict.
>

Actually, that wasn't my intention.  As I imagined the semantics, I wanted
a context manager that restored the "outside" context for anything defined
"inside" the context.  Allowing keyword arguments was just an extra
"convenience" that was meant to be equivalent to defining/overwriting
variables inside the body.  So these would be equivalent:

## 1
with sublocal():
    a = 1
    b = 2
    x = a + b
# a, b now have their old values again

## 2
with sublocal(a=1, b=2):
    x = a + b
# a, b now have their old values again

## 3
with sublocal(a=1):
    b = 2
    x = a + b
# a, b now have their old values again

I knew I was ignoring nonlocals and nested function scopes.  But just
trying something really simple that looks like the un-proposal in some
cases.  Maybe there's no way in pure-Python to deal with the edge cases
though.

From tim.peters at gmail.com  Sun Apr 29 22:49:36 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 29 Apr 2018 21:49:36 -0500
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: <4a5496f7-c180-4a01-0de8-dcd9b07681ad@gmail.com>
References: 
 <20180428093334.GU7400@ando.pearwood.info>
 
 <5AE61F76.60900@stoneleaf.us>
 
 <5AE66F61.9000900@stoneleaf.us>
 <4a5496f7-c180-4a01-0de8-dcd9b07681ad@gmail.com>
Message-ID: 

[Soni L. ]
> That ain't shadow. That is dynamic scoping.

I don't believe either term is technically accurate, but don't really care.


> Shadowing is something different:
>
> def f():
>     a = 42
>     def g():
>         print(a)
>     local a:
>         a = 43
>         g()
>     g()
>
> should print "42" both times, *if it's lexically scoped*.

Why?  The `local` statement, despite its name, and casual talk about
it, isn't _intended_ to create a new scope in any technically accurate
sense.  I think what it is intended to do has been adequately
explained several times already.  The `a` in `a = 42` is intended to
be exactly the same as the `a` in `a = 43`, changing nothing at all
about Python's lexical scoping rules.  It is _the_ `a` local to `f`.
If lexical scoping hasn't changed one whit (and it hasn't), the code
_must_ print 43 first.  Same as if `local a:` were replaced by `if
True:`.  `local` has no effect on a's value until the new "scope"
_ends_, and never any effect at all on a's visibility (a's "scope").
"local" or not, after `a = 43` there is no scope anywhere, neither
lexical nor dynamic, in which `a` is still bound to 42.  _The_ value
of `a` is 43 then, `local` or not.

The _only_ twist is that `local` wants to save/restore `a`'s binding
before/after the suite of code it controls.  That's not really about
"scope" at all with its technical meaning.  I don't much care if
people casually _think_ about it as being "about scope", though.

Yes, the effect is similar to what you might see in a language with
dynamic scoping _if_ it pushed a _copy_ of a's current 
binding on an evaluation stack at the start, and popped that (possibly
mutated) copy later, but actual dynamic binding doesn't push copies,
and Python isn't using any sort of dynamic evaluation stack
regardless.  That both restore previous bindings at times is a shallow
coincidence.  Neither is it really shadowing, which is lexical hiding
of names.  Neither is an accurate model for what it actually does,
but, yes, it bears more _resemblance_ to dynamic scoping if you ignore
that it's not dynamic scoping ;-)


> If it's lexically scoped, this is just adding another scope: blocks.
> (instead of the smallest possible scope being function scope)

I expect that thinking about "scope" at all just confuses people here,
unless they don't think too much about it ;-)  Nothing about Python's
scoping rules changes one whit.  Exactly the same in the above could
be achieved by replacing the `local a:` construct above by, e.g.,

    __save_a_with_a_unique_name = a
    a = 43
    g()
    a = __save_a_with_a_unique_name

Indeed, that's one way the compiler could _implement_ it.  Nothing
about a's scope is altered; it's merely a block-structured
save-value/restore-value gimmick.

From tim.peters at gmail.com  Sun Apr 29 23:13:52 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 29 Apr 2018 22:13:52 -0500
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 <20180428093334.GU7400@ando.pearwood.info>
 
 <5AE61F76.60900@stoneleaf.us>
 
 
 
 
 
Message-ID: 

[Tim]
>> ...
>> Worm around that too, then going back to the example at the top, if
>> the manager's
>>
>>         locals().update(_locals)
>>
>> had the intended effect, it would end up restoring `b` to 2 too, yes?
>> The only names that "should be" restored are the names in the `kws`
>> dict.

[David]
> Actually, that wasn't my intention.  As I imagined the semantics, I wanted a
> context manager that restored the "outside" context for anything defined
> "inside" the context.  Allowing keyword arguments was just an extra
> "convenience" that was meant to be equivalent to defining/overwriting
> variables inside the body.  So these would be equivalent:
>
> ## 1
> with sublocal():
>     a = 1
>     b = 2
>     x = a + b
> # a, b now have their old values again

What about x?  I assume that's also restored.

> ## 2
> with sublocal(a=1, b=2):
>     x = a + b
> # a, b now have their old values again
>
> ## 3
> with sublocal(a=1):
>     b = 2
>     x = a + b
> # a, b now have their old values again
>
> I knew I was ignoring nonlocals and nested function scopes.  But just trying
> something really simple that looks like the un-proposal in some cases.
> Maybe there's no way in pure-Python to deal with the edge cases though.

Or even just locals - there doesn't appear to be any supported way in
Python 3 for even `exec` to reliably change locals anymore.

If the intent is to create an "honest-to-Guido new Python scope", then
I expect changing

    BLOCK_OF_CODE

to

    def unique_name():
        BLOCK_OF_CODE
    unique_name()
    del unique_name

gets pretty close?  Any name "created" inside BLOCK_OF_CODE would be
tagged by the compiler as function-local, and vanish when the function
returned.  Anything merely referenced would inherit the enclosing
function's meaning.
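(A concrete instance of that wrapping, as a sketch:)

    def outer():
        x = 10
        # BLOCK_OF_CODE wrapped in a throwaway function:
        def _block():
            y = x + 1    # merely referenced: x is inherited from outer
            print(y)     # prints 11
        _block()
        del _block
        # y was "created" inside the block, so it doesn't exist here;
        # print(y) would raise NameError

    outer()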

I can think of some warts.  E.g., after

    global g

if BLOCK_OF_CODE were

    g += 1

then wrapping that line alone in a function would lead to an
UnboundLocalError when the function was called (because the `global g`
isn't "inherited").

Anyway, since I've never done that in my life, never saw anyone else
do it, and never even thought of it before today, I'm pretty confident
there isn't a hitherto undetected groundswell of demand for a quick
way to create an honest-to-Guido new Python scope ;-)

From greg.ewing at canterbury.ac.nz  Mon Apr 30 01:42:14 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 30 Apr 2018 17:42:14 +1200
Subject: [Python-ideas] Should __builtins__ have some kind of
 pass-through print function, for debugging?
In-Reply-To: 
References: 
 <20180427112733.GP7400@ando.pearwood.info>
 
 
 <5AE52E01.8020806@canterbury.ac.nz>
 
Message-ID: <5AE6ACB6.5030909@canterbury.ac.nz>

Nathaniel Smith wrote:
> It looks like my client used "font-family: monospace", maybe yours
> only understands 
 or something?

Hmmm, looking at the message source, it does indeed specify
monospace. It seems the version of Thunderbird I'm using
does a spectacularly bad job of interpreting HTML. Sorry
for the false alarm, Nathaniel!

-- 
Greg

From greg.ewing at canterbury.ac.nz  Mon Apr 30 01:54:33 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 30 Apr 2018 17:54:33 +1200
Subject: [Python-ideas] Sublocal scoping at its simplest
In-Reply-To: 
References: 
Message-ID: <5AE6AF99.20404@canterbury.ac.nz>

Chris Angelico wrote:
> 1) Bind the caught exception to a sublocal 'e'
> 2) Execute the suite, with the reference to 'e' seeing the sublocal
> 3) Set the sublocal e to None
> 4) Unbind the sublocal e
> 
> At the unindent, the sublocal name will vanish, and the original 'e'
> will reappear.

That's a reasonable way to define how a sublocal scope
might work. But as far as I can see, the debate is about
whether sublocal scopes are a good idea in the first
place.

-- 
Greg


From robertvandeneynde at hotmail.com  Sun Apr 29 22:14:17 2018
From: robertvandeneynde at hotmail.com (Robert Vanden Eynde)
Date: Mon, 30 Apr 2018 02:14:17 +0000
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: <4a5496f7-c180-4a01-0de8-dcd9b07681ad@gmail.com>
References: 
 <20180428093334.GU7400@ando.pearwood.info>
 
 <5AE61F76.60900@stoneleaf.us>
 
 <5AE66F61.9000900@stoneleaf.us>
 <4a5496f7-c180-4a01-0de8-dcd9b07681ad@gmail.com>
Message-ID: 

I really liked the syntax that mimicked lambda even if I find it verbose :

a = local x=1, y=2: x + y + 3

Even if I still prefer the postfix syntax :

a = x + 3 where x = 2

About scheme "let" vs "let*", the paralel in Python is :

a, b, c = 5, a+1, 2 # let syntax
a = 5; b = a+1; c = 2 # let* syntax

Which makes me wonder, could we use the semicolon in the syntax?

a = local x = 1; y = x+1: x + y + 3

Or with the postfix syntax :

a = x + y + 3 where x = 1; y = x+1

Chaining where would be a syntax error:

a = x + y + 3 where x = 1 where y = x+1

Parenthesis could be mandatory if one wants to use tuple assignment :

a = local (x, y) = 1, 2: x + y + 3

When I see that, I really want to call it "def"

a = def (x, y) = 1, 2: x + y + 3
a = def x = 1; y = 2: x + y + 3

Which is read: define x = 1, then y = 2, in x + y + 3

Using def would make it obvious this is not a function call.


From j.van.dorp at deonet.nl  Mon Apr 30 03:04:08 2018
From: j.van.dorp at deonet.nl (Jacco van Dorp)
Date: Mon, 30 Apr 2018 09:04:08 +0200
Subject: [Python-ideas] Change magic strings to enums
In-Reply-To: 
References: 
 
 <20180424193222.19f2a2ea@fsol> <5ADF7DF7.3020004@stoneleaf.us>
 
 
 
 
 
 
 
 
 
 
Message-ID: 

2018-04-26 15:26 GMT+02:00 Nick Coghlan :
> On 26 April 2018 at 19:37, Jacco van Dorp  wrote:
>> I'm kind of curious why everyone here seems to want to use IntFlags
>> and other mixins. The docs themselves say that their use should be
>> minimized, and tbh I agree with them. Backwards compatiblity can be
>> maintained by allowing the old value and internally converting it to
>> the enum. Combinability is inherent to enum.Flags. There'd be no real
>> reason to use mixins as far as I can see ?
>
> Serialisation formats are a good concrete example of how problems can
> arise by switching out concrete types on people:
>
>>>> import enum, json
>>>> a = "A"
>>>> class Enum(enum.Enum):
> ...     a = "A"
> ...
>>>> class StrEnum(str, enum.Enum):
> ...     a = "A"
> ...
>>>> json.dumps(a)
> '"A"'
>>>> json.dumps(StrEnum.a)
> '"A"'
>>>> json.dumps(Enum.a)
> Traceback (most recent call last):
>   File "", line 1, in 
>   File "/usr/lib64/python3.6/json/__init__.py", line 231, in dumps
>     return _default_encoder.encode(obj)
>   File "/usr/lib64/python3.6/json/encoder.py", line 199, in encode
>     chunks = self.iterencode(o, _one_shot=True)
>   File "/usr/lib64/python3.6/json/encoder.py", line 257, in iterencode
>     return _iterencode(o, 0)
>   File "/usr/lib64/python3.6/json/encoder.py", line 180, in default
>     o.__class__.__name__)
> TypeError: Object of type 'Enum' is not JSON serializable
>
> The mixin variants basically say "If you run into code that doesn't
> natively understand enums, act like an instance of this type".
>
> Since most of the standard library has been around for years, and
> sometimes even decades, we tend to face a *lot* of backwards
> compatibility requirements along those lines.
>
> Cheers,
> Nick.

However, as the docs note, they will be comparable with each other,
which should throw an error. Since this is only a problem for this
case when serializing (since the functions would allow the old str
arguments probably forever), shouldn't this be something caught when
you upgrade the version you run your script under?

It's also a rather simple fix: json.dumps(Enum.a.value) would work just fine.
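(Spelled out with a throwaway enum - a minimal sketch:)

    import enum, json

    class Colour(enum.Enum):  # hypothetical stand-in for the Enum above
        a = "A"

    # json.dumps(Colour.a) raises TypeError, as in the quoted example;
    # dumping the underlying value instead works fine:
    print(json.dumps(Colour.a.value))  # -> "A"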

Now I'm aware that most people don't have 100% test coverage and such.
I also rather lack the amount of experience you guys have.

Guess I'm just a bit behind on the "practicality beats purity" here :)

Jacco

From marcidy at gmail.com  Mon Apr 30 04:20:15 2018
From: marcidy at gmail.com (Matt Arcidy)
Date: Mon, 30 Apr 2018 08:20:15 +0000
Subject: [Python-ideas] Sublocal scoping at its simplest
In-Reply-To: 
References: 
Message-ID: 

On Sat, Apr 28, 2018, 20:16 Chris Angelico  wrote:

> There's been a lot of talk about sublocal scopes, within and without
> the context of PEP 572. I'd like to propose what I believe is the
> simplest form of sublocal scopes, and use it to simplify one specific
> special case in Python.
>
> There are no syntactic changes, and only a very slight semantic change.
>
> def f():
>     e = 2.71828
>     try:
>         1/0
>     except Exception as e:
>         print(e)
>     print(e)
> f()
>
> The current behaviour of the 'except... as' statement is as follows:
>
> 1) Bind the caught exception to the name 'e', replacing 2.71828
> 2) Execute the suite (printing "Division by zero")
> 3) Set e to None
> 4) Unbind e
>
> Consequently, the final print call raises UnboundLocalError. I propose
> to change the semantics as follows:
>
> 1) Bind the caught exception to a sublocal 'e'
> 2) Execute the suite, with the reference to 'e' seeing the sublocal
> 3) Set the sublocal e to None
> 4) Unbind the sublocal e
>
> At the unindent, the sublocal name will vanish, and the original 'e'
> will reappear. Thus the final print will display 2.71828, just as it
> would if no exception had been raised.
>
>
Does this mean indentation is now a scope, or colons are a scope, or is
that oversimplifying?

Either seems to be more consistent with the patterns set by class and
function defs, barring keywords.

Not sure if relevant, but curious.

I think with sublocal scope, reuse of a name makes more sense.  Currently,
if using sensible, descriptive names, it really doesn't make sense to go
from food = apple to food = car as the value between scopes, but it
happens.  And if going from fruit = apple to fruit = orange (e.g. appending
a msg to a base string) it _could_ be nice to restore to apple once finished.

Obviously that's simple enough to do now, I am only illustrating my point.
I know bad code can be written with anything, that is not my point.  It can
be seen as enforcing that, whatever nonsense someone writes like
fruit=car, there is at least some continuity of information represented by
the name... till they do it again once out of the sublocal scope, of course.

As for the value of this use case, I do not know.


> The above definitions would become language-level specifications. For
> CPython specifically, my proposed implementation would be for the name
> 'e' to be renamed inside the block, creating a separate slot with the
> same name.
>
> With no debates about whether "expr as name" or "name := expr" or
> "local(name=expr)" is better, hopefully we can figure out whether
> sublocal scopes are themselves a useful feature :)
>
> ChrisA
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>

From rosuav at gmail.com  Mon Apr 30 06:28:47 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 30 Apr 2018 20:28:47 +1000
Subject: [Python-ideas] Sublocal scoping at its simplest
In-Reply-To: 
References: 
 
Message-ID: 

On Mon, Apr 30, 2018 at 6:20 PM, Matt Arcidy  wrote:
> Does this mean indentation is now a scope, or colons are a scope, or is that
> over simplifying?

No, no, and yes. This is JUST about the 'except' statement, which
currently has the weird effect of unbinding the name it just bound.

ChrisA

From steve at pearwood.info  Mon Apr 30 11:45:51 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 1 May 2018 01:45:51 +1000
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 <20180428093334.GU7400@ando.pearwood.info>
 
 <20180429025005.GB7400@ando.pearwood.info>
 
Message-ID: <20180430154550.GD7400@ando.pearwood.info>

On Sat, Apr 28, 2018 at 11:20:52PM -0500, Tim Peters wrote:
> [Tim]
> >> Enormously harder to implement than binding expressions, and the
> >> latter (to my eyes) capture many high-value use cases "good enough".
> 
> [Steven D'Aprano ]
> > And yet you're suggesting an alternative which is harder and more
> > confusing.
> 
> I am?  I said at the start that it was a "brain dump".  It was meant
> to be a point of discussion for anyone interested.  I also said I was
> more interested in real use cases from real code than in debating, and
> I wasn't lying about that ;-)

Ah, my mistake... I thought you were advocating sublocal scopes as well 
as just brain dumping the idea.


[...]
> > Even when I started, as a novice programmer who wouldn't have recognised
> > the term "lexical scoping" if it fell on my head from a great height, I
> > thought it was strange that inner functions couldn't see their
> > surrounding function's variables. Nested scopes just seemed intuitively
> > obvious: if a function sees the variables in the module surrounding it,
> > then it should also see the variables in any function surrounding it.
> >
> > This behaviour in Python 1.5 made functions MUCH less useful:
[...]
> > I think it is fair to say that inner functions in Python 1.5 were
> > crippled to the point of uselessness.
> 
> I don't think that's fair to say.  A great many functions are in fact
> ... functions ;-)  That is, they compute a result from the arguments
> passed to them.

Sure, but not having access to the surrounding function scope means that 
inner functions couldn't call other inner functions. Given:

    def outer():
        def f(): ...
        def g(): ...

f cannot call g, or vice versa.

I think it was a noble experiment in how minimal you could make scoping 
rules and still be usable, but I don't think that particular aspect was 
a success.



-- 
Steve

From tim.peters at gmail.com  Mon Apr 30 12:18:17 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 30 Apr 2018 11:18:17 -0500
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 <87po2hrfek.fsf@vostro.rath.org>
 
 
 
Message-ID: 

[Tim, on differences among Scheme-ish `let`, `let*`, `letrec` binding]
> ...
>
> You can play, if you like, with trying to define the `iseven` lambda
> here in one line by nesting lambdas to define `even` and `odd` as
> default arguments:
>
>     even = (lambda n: n == 0 or odd(n-1))
>     odd = (lambda n: False if n == 0 else even(n-1))
>     iseven = lambda n: even(n)
>
> Scheme supplies `letrec` for when "mutually recursive" bindings are
> needed.  In Python that distinction isn't nearly as evidently needed,
> because Python's idea of closures doesn't capture all the bindings
> currently in effect,.  For example, when `odd` above is defined,
> Python has no idea at all what the then-current binding for `even` is
> - it doesn't even look for "even" until the lambda is _executed_.

Just FYI, I still haven't managed to do it as a 1-liner (well, one
statement).  I expected the following would work, but it doesn't :-)

    iseven = lambda n: (
               lambda n=n, \
                      even = (lambda n: n == 0 or odd(n-1)), \
                      odd = (lambda n: False if n == 0 else even(n-1)):
                   even(n))()

Ugly and obscure, but why not?  In the inner lambda, `n`, `even`, and
`odd` are all defined in its namespace, so why does it fail anyway?

>>> iseven(6)
Traceback (most recent call last):
...
    iseven(6)
...
    even(n))()
...
    even = (lambda n: n == 0 or odd(n-1)), \
NameError: name 'odd' is not defined

Because while Python indeed doesn't capture the current binding for
`odd` when the `even` lambda is compiled, it _does_ recognize that the
name `odd` is not local to the lambda at compile-time, so generates a
LOAD_GLOBAL opcode to retrieve `odd`'s binding at runtime.  But there
is no global `odd` (well, unless there is - and then there's no
guessing what the code would do).
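(dis makes that visible - a small sketch:)

    import dis

    even = lambda n: n == 0 or odd(n - 1)
    dis.dis(even)   # the reference to odd compiles to LOAD_GLOBAL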

For `even` to know at compile-time that `odd` will show up later in
its enclosing lambda's arglist requires that Python do `letrec`-style
binding instead.  For a start ;-)

From steve at pearwood.info  Mon Apr 30 12:20:28 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 1 May 2018 02:20:28 +1000
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 <20180428093334.GU7400@ando.pearwood.info>
 
 <20180429025005.GB7400@ando.pearwood.info>
 
Message-ID: <20180430162027.GE7400@ando.pearwood.info>

On Sun, Apr 29, 2018 at 01:36:31PM +1000, Chris Angelico wrote:

[...]
> > While I started off with Python 1.5, I wasn't part of the discussions
> > about nested scopes. But I'm astonished that you say that nested scopes
> > were controversial. *Closures* I would completely believe, but mere
> > lexical scoping? Astonishing.
> 
> I'm not sure how you can distinguish them:

Easily. Pascal, for example, had lexical scoping back in the 1970s, but 
no closures. I expect Algol probably did also, even earlier. So they are 
certainly distinct concepts.

(And yes, Pascal functions were *not* first class values.)


> What you expect here is lexical scope, yes. But if you have lexical 
> scope with no closures, the inner function can ONLY be used while its 
> calling function is still running. What would happen if you returned 
> 'inner' uncalled, and then called the result? How would it resolve the 
> name 'x'?

Failing to resolve 'x' is an option. It would simply raise NameError, 
the same as any other name lookup that doesn't find the name.

Without closures, we could say that names are looked up in the following 
scopes:

# inner function, called from inside the creating function
(1) Local to inner.
(2) Local to outer (nonlocal).
(3) Global (module).
(4) Builtins.


If you returned the inner function and called it from the outside of the 
factory function which created it, we could use the exact same name 
resolution order, except that (2) the nonlocals would be either absent 
or empty.

Obviously that would limit the usefulness of factory functions, but 
since Python 1.5 didn't have closures anyway, that would have been no 
worse than what we had.
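
To make that concrete, here is the usual factory pattern (a sketch in
current Python, where closures make it work; under the hypothetical
"lexical scope, no closures" rules, the call to `add5` below would
raise NameError on `n` instead):

    def make_adder(n):
        def adder(x):
            return x + n   # `n` is a nonlocal name
        return adder

    add5 = make_adder(5)
    print(add5(1))  # 6 today; NameError without closures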

Whether you have a strict Local-Global-Builtins scoping, or lexical 
scoping without closures, the effect *outside* of the factory function 
is the same. But at least with the lexical scoping option, inner 
functions can call each other while still *inside* the factory.

(Another alternative would be dynamic scoping, where the nonlocal scope 
becomes the environment of the caller.)


> I can't even begin to imagine what lexical scope would do in
> the absence of closures. At least, not with first-class functions.

What they would likely do is raise NameError, of course :-)

An inner function that didn't rely on its surrounding nonlocal scope 
wouldn't be affected. Or if you had globals that happened to match the 
names it was relying on, the function could still work. (Whether it 
would work as you expected is another question.)

I expect that given the lack of closures, the best approach is simply 
to make sure that any attempt to refer to a nonlocal of the surrounding 
function, from outside that function, would raise NameError.

All in all, closures are much better :-)



-- 
Steve

From python at mrabarnett.plus.com  Mon Apr 30 13:18:42 2018
From: python at mrabarnett.plus.com (MRAB)
Date: Mon, 30 Apr 2018 18:18:42 +0100
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 <20180428093334.GU7400@ando.pearwood.info>
 
 <5AE61F76.60900@stoneleaf.us>
 
 <5AE66F61.9000900@stoneleaf.us>
 <4a5496f7-c180-4a01-0de8-dcd9b07681ad@gmail.com>
 
Message-ID: 

On 2018-04-30 03:49, Tim Peters wrote:
> [Soni L. ]
>> That ain't shadow. That is dynamic scoping.
> 
> I don't believe either term is technically accurate, but don't really care.
> 
> 
>> Shadowing is something different:
>>
>> def f():
>>     a = 42
>>     def g():
>>         print(a)
>>     local a:
>>         a = 43
>>         g()
>>     g()
>>
>> should print "42" both times, *if it's lexically scoped*.
> 
> Why?  The `local` statement, despite its name, and casual talk about
> it, isn't _intended_ to create a new scope in any technically accurate
> sense.  I think what it is intended to do has been adequately
> explained several times already.  The `a` in `a = 42` is intended to
> be exactly the same as the `a` in `a = 43`, changing nothing at all
> about Python's lexical scoping rules.  It is _the_ `a` local to `f`.
> If lexical scoping hasn't changed one whit (and it hasn't), the code
> _must_ print 43 first.  Same as if `local a:` were replaced by `if
> True:`.  `local` has no effect on a's value until the new "scope"
> _ends_, and never any effect at all on a's visibility (a's "scope").
> "local" or not, after `a = 43` there is no scope anywhere, neither
> lexical nor dynamic, in which `a` is still bound to 42.  _The_ value
> of `a` is 43 then, `local` or not.
> 
[snip]
I think it should be lexically scoped.

The purpose of 'local' would be to allow you to use a name that _might_ 
be used elsewhere.

The problem with a dynamic scope is that you might call some global 
function from within the local scope, but find that it's "not working 
correctly" because you've inadvertently shadowed a name that the 
function refers to.

Imagine, in a local scope, that you call a global function that calls 
'len', but you've shadowed 'len'...
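
One way to see the hazard is to simulate dynamic scoping with an
explicit environment (a toy sketch; the dict `env` stands in for real
name lookup, which of course doesn't work this way today):

    env = {"len": len}

    def global_function(items):
        # Under dynamic scoping, this would see the *caller's* binding.
        return env["len"](items)

    def f():
        saved = env["len"]
        env["len"] = 12   # inadvertently "shadow" len
        try:
            # TypeError: 'int' object is not callable
            return global_function([1, 2, 3])
        finally:
            env["len"] = saved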

From marcidy at gmail.com  Mon Apr 30 14:28:17 2018
From: marcidy at gmail.com (Matt Arcidy)
Date: Mon, 30 Apr 2018 11:28:17 -0700
Subject: [Python-ideas] Objectively Quantifying Readability
Message-ID: 

The number and type of arguments about readability, whether as a
justification, an opinion, or an opinion about an opinion, seem
counter-productive to reaching conclusions efficiently.  I think such
arguments are very important either way, but the justifications used
are not rich enough in information to be very useful.

A study has been done regarding readability in code which may serve as
insight into this issue. Please see page 8, fig 9 for a nice chart of
the results; note the negative/positive coloring of the correlations,
grey/black respectively.

https://web.eecs.umich.edu/~weimerw/p/weimer-tse2010-readability-preprint.pdf

The criteria in the paper can be applied to assess an increase or
decrease in readability between current and proposed changes.  Perhaps
even an automated tool could be implemented based on agreed upon
criteria.
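
As a strawman, such a tool could score a handful of the surface
features the paper weighs (the criteria and weights below are invented
for illustration, not taken from the paper):

    import io
    import statistics
    import tokenize

    def readability_score(source):
        lines = source.splitlines()
        name_lengths, comments = [], 0
        for tok in tokenize.generate_tokens(io.StringIO(source).readline):
            if tok.type == tokenize.NAME:
                name_lengths.append(len(tok.string))
            elif tok.type == tokenize.COMMENT:
                comments += 1
        avg_line = statistics.mean(len(line) for line in lines) if lines else 0
        avg_name = statistics.mean(name_lengths) if name_lengths else 0
        # Per the study: comments help; long lines and long names hurt.
        return comments - 0.1 * avg_line - 0.5 * avg_name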

Opinions about readability can be shifted from:
 - "Is it more or less readable?"
to
 - "This change exceeds a tolerance for levels of readability given
the scope of the change."

Still need to argue "exceeds ...given" and "tolerance", but at least
the readability score exists, and perhaps over time there will be
consensus.

Note this is an attempt to impact rhetoric in PEP (or other)
discussions, not about  supporting a particular PEP.  Please consider
this food for thought to increase the efficacy and efficiency of PEP
discussions, not as commenting on any specific current discussion,
which of course is the motivating factor of sending this email today.

I think using Python implicitly accepts that readability is partially
measurable, even if the resolution of the current measure is too low to
capture the changes currently being discussed.  Perhaps using these
criteria can increase that resolution.

Thank you,
- Matt

From tim.peters at gmail.com  Mon Apr 30 14:58:17 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 30 Apr 2018 13:58:17 -0500
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 <87po2hrfek.fsf@vostro.rath.org>
 
 
 
 
Message-ID: 

[Tim, still trying to define `iseven` in one statement]

>     even = (lambda n: n == 0 or odd(n-1))
>     odd = (lambda n: False if n == 0 else even(n-1))
>     iseven = lambda n: even(n)
...

> [and the last attempt failed because a LOAD_GLOBAL was generated
>    instead of a more-general runtime lookup]

So if I want LOAD_FAST instead, that has to be forced, leading to a
one-statement definition that's wonderfully clear:

    iseven = lambda n: (
        lambda
            n=n,
            even = (lambda n, e, o: n == 0 or o(n-1, e, o)),
            odd = (lambda n, e, o: False if n == 0 else e(n-1, e, o)):
                   even(n, even, odd)
        )()

Meaning "wonderfully clear" to the compiler, not necessarily to you ;-)
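
It does work, though - a quick check:

    >>> iseven(6)
    True
    >>> iseven(7)
    False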

Amusingly enough, that's a lot like the tricks we sometimes did in
Python's early days, before nested scopes were added, to get recursive
- or mutually referring - non-global functions to work at all.  That
is, since their names weren't in any scope they could access, their
names had to be passed as arguments (or, sure, stuffed in globals -
but that would have been ugly ;-) ).

From tim.peters at gmail.com  Mon Apr 30 16:41:17 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 30 Apr 2018 15:41:17 -0500
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 <20180428093334.GU7400@ando.pearwood.info>
 
 <5AE61F76.60900@stoneleaf.us>
 
 <5AE66F61.9000900@stoneleaf.us>
 <4a5496f7-c180-4a01-0de8-dcd9b07681ad@gmail.com>
 
 
Message-ID: 

[MRAB ]
> I think it should be lexically scoped.

That's certainly arguable, but that's why I like real-code driven
design:  abstract arguments never end, and often yield a dubious
in-real-life outcome after one side is worn out and the other side
"wins" by attrition ;-)


> The purpose of 'local' would be to allow you to use a name that _might_ be
> used elsewhere.
>
> The problem with a dynamic scope is that you might call some global function
> from within the local scope, but find that it's "not working correctly"
> because you've inadvertently shadowed a name that the function refers to.

Already explained at excessive length that there's nothing akin to
"dynamic scopes" here, except that both happen to restore a previous
binding at times.  That's a shallow coincidence.  It's no more
"dynamic scope" than that

    savea = a
    try:
        a += 1
        f(a)
    finally:
        a = savea

is "dynamic scoping".  It's merely saving/restoring a binding across a
block of code.
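
If you want that save/restore spelled as a reusable gadget in today's
Python, a context manager over an explicit namespace does it (a sketch;
it only works on real mappings like globals(), since function locals
can't be rebound through locals()):

    from contextlib import contextmanager

    _MISSING = object()

    @contextmanager
    def local_names(ns, *names):
        # Save the current bindings on entry; restore (or delete) on exit.
        saved = {name: ns.get(name, _MISSING) for name in names}
        try:
            yield
        finally:
            for name, value in saved.items():
                if value is _MISSING:
                    ns.pop(name, None)
                else:
                    ns[name] = value

    a = 1
    with local_names(globals(), "a"):
        a = 99        # temporary rebinding
        assert a == 99
    assert a == 1     # restored when the block ends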


> Imagine, in a local scope, that you call a global function that calls 'len',
> but you've shadowed 'len'...

I'm not clear on whether you picked the name of a builtin to make a
subtle point not spelled out, but I don't think it matters.
Regardless of whether `len` refers to a builtin or a module global
inside your global function now, the _current_

def f():
     len = 12
     global_function()

has no effect at all on the binding of `len` seen inside
`global_function`.  Because my understanding of "local:" changes
absolutely nothing about Python's current scope rules, it's
necessarily the case that the same would be true in:

def f():
    local len:
        len = 12
        call_something()

The only difference from current semantics is that if

    print(len)

were added after the `local:` block, UnboundLocalError would be raised
(restoring the state of the function-local-with-or-without-'local:'
`len` to what it was before the block).

To have "local:" mean "new nested lexical scope" instead requires
specifying a world of semantics that haven't even been mentioned yet.

In Python today, in the absence of `global` and `nonlocal`
declarations, the names local to a given lexical scope are determined
entirely by analyzing binding sites.  If you intend something other
than that, then it needs to be spelled out.  But if you intend to keep
"and names appearing in binding sites are also local to the new
lexical scope", I expect that's pretty much useless.   For example,

    def f():
        ...
        local x, y:
            x = a*b
            y = a/b
            r1, r2 = x+y, x-y

That is, the programmer surely doesn't _intend_ to throw away r1 and
r2 when the block ends.  If they have to add a

        nonlocal r1, r2

declaration at the top of the block, maybe it would work as intended.
But it still wouldn't work unless `r1` and `r2` _also_ appeared in
binding sites in an enclosing lexical scope.  If they don't, you'd get
a compile-time error like

SyntaxError: no binding for nonlocal 'r1' found

To be more accurate, the message should really say "sorry, but I have
no idea in which scope you _intend_ 'r1' to live, because the only way
I could know that is to find a binding site for 'r1', and I can't find
any except inside _this_ scope containing the 'nonlocal'".  But that's
kind of wordy ;-)

If you agree that makes the feature probably unusable, you don't get
off the hook by saying "no, unlike current Python scopes, binding
sites have nothing to do with what's local to a new lexical scope
introduced by 'local:'".  The same question raised in the example
above doesn't go away:  in which scope(s) are 'r1' and 'r2' to be
bound?

There's more than one plausible answer to that, but in the absence of
real use cases how can they be judged?

Under "'local:' changes nothing at all about Python's scopes", the
answer is obvious:  `r1` and `r2` are function locals (exactly the
same as if "local:" hadn't been used).  There's nothing new about
scope to learn, and the code works as intended on the first try ;-)
Of course "local:" would be a misleading name for the construct,
though.

Going back to your original example, where a global (not builtin)
"len" was intended:

    def f():
        global len  # LINE ADDED HERE
        local len:
            len = 12
            global_function()

yes, in _that_ case the global-or-builtin "len" seen inside
`global_function` would change under my "nothing about scoping
changes" reading, but would not under your reading.

That's worth _something_ ;-)  But without fleshing out the rules for
all the other stuff (like which scope(s) own r1 and r2 in the example
above) I can't judge whether it's worth enough to care.  All the
plausibly realistic use cases I've considered don't _really_ want a
full-blown new scope (just robust save/restore for a handful of
explicitly given names), and the example just above is contrived in
comparison.  Nobody types "global len" unless they _intend_ to rebind
the global `len`, in which case I'm happy to let them shoot both feet
off ;-)

In any case, nothing can change the binding of the builtin "len" short
of mucking directly with the mapping object implementing builtin
lookups.

Note:  most of this doesn't come up in most other languages because
they require explicitly declaring in which scope a name lives.
Python's "infer that in almost all cases instead from examining
binding sites" has consequences.

From python at mrabarnett.plus.com  Mon Apr 30 19:19:36 2018
From: python at mrabarnett.plus.com (MRAB)
Date: Tue, 1 May 2018 00:19:36 +0100
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 <20180428093334.GU7400@ando.pearwood.info>
 
 <5AE61F76.60900@stoneleaf.us>
 
 <5AE66F61.9000900@stoneleaf.us>
 <4a5496f7-c180-4a01-0de8-dcd9b07681ad@gmail.com>
 
 
 
Message-ID: <78764cbe-ec31-626f-35e8-6d0f5dd6fb06@mrabarnett.plus.com>

On 2018-04-30 21:41, Tim Peters wrote:
> [MRAB ]
> > I think it should be lexically scoped.
>
> That's certainly arguable, but that's why I like real-code driven
> design:  abstract arguments never end, and often yield a dubious
> in-real-life outcome after one side is worn out and the other side
> "wins" by attrition ;-)
>
>
> > The purpose of 'local' would be to allow you to use a name that _might_ be
> > used elsewhere.
> >
> > The problem with a dynamic scope is that you might call some global function
> > from within the local scope, but find that it's "not working correctly"
> > because you've inadvertently shadowed a name that the function refers to.
>
> Already explained at excessive length that there's nothing akin to
> "dynamic scopes" here, except that both happen to restore a previous
> binding at times.  That's a shallow coincidence.  It's no more
> "dynamic scope" than that
>
>      savea = a
>      try:
>          a += 1
>          f(a)
>      finally:
>          a = savea
>
> is "dynamic scoping".  It's merely saving/restoring a binding across a
> block of code.
>
>
> > Imagine, in a local scope, that you call a global function that calls 'len',
> > but you've shadowed 'len'...
>
> I'm not clear on whether you picked the name of a builtin to make a
> subtle point not spelled out, but I don't think it matters.
> Regardless of whether `len` refers to a builtin or a module global
> inside your global function now, the _current_
>
> def f():
>       len = 12
>       global_function()
>
> has no effect at all on the binding of `len` seen inside
> `global_function`.  Because my understanding of "local:" changes
> absolutely nothing about Python's current scope rules, it's
> necessarily the case that the same would be true in:
>
> def f():
>      local len:
>          len = 12
>          call_something()
>
> The only difference from current semantics is that if
>
>      print(len)
>
> were added after the `local:` block, UnboundLocalError would be raised
> (restoring the state of the function-local-with-or-without-'local:'
> `len` to what it was before the block).
>
> To have "local:" mean "new nested lexical scope" instead requires
> specifying a world of semantics that haven't even been mentioned yet.
>
> In Python today, in the absence of `global` and `nonlocal`
> declarations, the names local to a given lexical scope are determined
> entirely by analyzing binding sites.  If you intend something other
> than that, then it needs to be spelled out.  But if you intend to keep
> "and names appearing in binding sites are also local to the new
> lexical scope", I expect that's pretty much useless.   For example,
>
>      def f():
>          ...
>          local x, y:
>              x = a*b
>              y = a/b
>              r1, r2 = x+y, x-y
>
> That is, the programmer surely doesn't _intend_ to throw away r1 and
> r2 when the block ends.  If they have to add a
>
>          nonlocal r1, r2
>
> declaration at the top of the block, maybe it would work as intended.
> But it still wouldn't work unless `r1` and `r2` _also_ appeared in
> binding sites in an enclosing lexical scope.  If they don't, you'd get
> a compile-time error like
The intention is that only the specified names are local.

After all, what's the point of specifying names after the 'local' if 
_any_ binding in the local scope was local?

> SyntaxError: no binding for nonlocal 'r1' found
>
> To be more accurate, the message should really say "sorry, but I have
> no idea in which scope you _intend_ 'r1' to live, because the only way
> I could know that is to find a binding site for 'r1', and I can't find
> any except inside _this_ scope containing the 'nonlocal'".  But that's
> kind of wordy ;-)
>
> If you agree that makes the feature probably unusable, you don't get
> off the hook by saying "no, unlike current Python scopes, binding
> sites have nothing to do with what's local to a new lexical scope
> introduced by 'local:'".  The same question raised in the example
> above doesn't go away:  in which scope(s) are 'r1' and 'r2' to be
> bound?
Any binding that's not specified as local is bound in the parent scope:

local b:
    local c:
        c = 0 # Bound in the "local c" scope.
        b = 0 # Bound in the "local b" scope.
        a = 0 # Bound in the main scope (function, global, whatever)
> There's more than one plausible answer to that, but in the absence of
> real use cases how can they be judged?
>
> Under "'local:' changes nothing at all about Python's scopes", the
> answer is obvious:  `r1` and `r2` are function locals (exactly the
> same as if "local:" hadn't been used).  There's nothing new about
> scope to learn, and the code works as intended on the first try ;-)
> Of course "local:" would be a misleading name for the construct,
> though.
>
> Going back to your original example, where a global (not builtin)
> "len" was intended:
>
>      def f():
>          global len  # LINE ADDED HERE
>          local len:
>              len = 12
>              global_function()
>
> yes, in _that_ case the global-or-builtin "len" seen inside
> `global_function` would change under my "nothing about scoping
> changes" reading, but would not under your reading.
>
> That's worth _something_ ;-)  But without fleshing out the rules for
> all the other stuff (like which scope(s) own r1 and r2 in the example
> above) I can't judge whether it's worth enough to care.  All the
> plausibly realistic use cases I've considered don't _really_ want a
> full-blown new scope (just robust save/restore for a handful of
> explicitly given names), and the example just above is contrived in
> comparison.  Nobody types "global len" unless they _intend_ to rebind
> the global `len`, in which case i'm happy to let them shoot both feet
> off ;-)
>
> In any case, nothing can change the binding of the builtin "len" short
> of mucking directly with the mapping object implementing builtin
> lookups.
>
> Note:  most of this doesn't come up in most other languages because
> they require explicitly declaring in which scope a name lives.
> Python's "infer that in almost all cases instead from examining
> binding sites" has consequences.
>
Would/should it be possible to inject a name into a local scope? You 
can't inject into a function scope, and names in a function scope can be 
determined statically (they are allocated slots), so could the same kind 
of thing be done for names in a local scope?

From steve at pearwood.info  Mon Apr 30 20:42:53 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 1 May 2018 10:42:53 +1000
Subject: [Python-ideas] Objectively Quantifying Readability
In-Reply-To: 
References: 
Message-ID: <20180501004252.GG7400@ando.pearwood.info>

On Mon, Apr 30, 2018 at 11:28:17AM -0700, Matt Arcidy wrote:

> A study has been done regarding readability in code which may serve as
> insight into this issue. Please see page 8, fig 9 for a nice chart of
> the results, note the negative/positive coloring of the correlations,
> grey/black respectively.

Indeed. It seems that nearly nothing is positively correlated to 
increased readability, aside from comments, blank lines, and (very 
weakly) arithmetic operators. Everything else hurts readability.

The conclusion here is that if you want readable source code, you should 
remove the source code. *wink*

 
> https://web.eecs.umich.edu/~weimerw/p/weimer-tse2010-readability-preprint.pdf
> 
> The criteria in the paper can be applied to assess an increase or
> decrease in readability between current and proposed changes.  Perhaps
> even an automated tool could be implemented based on agreed upon
> criteria.


That's a really nice study, and thank you for posting it. There are some 
interested observations here, e.g.:

- line length is negatively correlated with readability;

  (a point against those who insist that 79 character line 
  limits are irrelevant since we have wide screens now)

- conventional measures of complexity do not correlate well
  with readability;

- length of identifiers was strongly negatively correlated
  with readability: long, descriptive identifier names hurt
  readability while short variable names appeared to make
  no difference;

  (going against the common wisdom that one character names
  hurt readability -- maybe mathematicians got it right 
  after all?)

- people are not good judges of readability;

but I think the practical relevance here is very slim. Aside from 
questions about the validity of the study (it is only one study, can the 
results be replicated, do they generalise beyond the narrowly 
self-selected set of university students they tested?) I don't think that it 
gives us much guidance here. For example:

1. The study is based on Java, not Python.

2. It looks at a set of pre-existing source code features.

3. It gives us little or no help in deciding whether new syntax will or 
won't affect readability: the problem of *extrapolation* remains.

(If we know that, let's say, really_long_descriptive_identifier_names 
hurt readability, how does that help us judge whether adding a new kind 
of expression will hurt or help readability?)

4. The authors themselves warn that it is descriptive, not prescriptive, 
for example replacing long identifier names with randomly selected two 
character names is unlikely to be helpful.

5. The unfamiliarity affect: any unfamiliar syntax is going to be less 
readable than a corresponding familiar syntax.


It's a great start to the scientific study of readability, but I don't 
think it gives us any guidance with respect to adding new features.


> Opinions about readability can be shifted from:
>  - "Is it more or less readable?"
> to
>  - "This change exceeds a tolerance for levels of readability given
> the scope of the change."

One unreplicated(?) study for readability of Java snippets does not give 
us a metric for predicting the readability of new Python syntax. While 
it would certainly be useful to study the possibly impact of adding new 
features to a language, the authors themselves state that this study is 
just "a framework for conducting such experiments".

Despite the limitations of the study, it was an interesting read, thank 
you for posting it.



-- 
Steve

From rosuav at gmail.com  Mon Apr 30 20:56:05 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 1 May 2018 10:56:05 +1000
Subject: [Python-ideas] Objectively Quantifying Readability
In-Reply-To: <20180501004252.GG7400@ando.pearwood.info>
References: 
 <20180501004252.GG7400@ando.pearwood.info>
Message-ID: 

On Tue, May 1, 2018 at 10:42 AM, Steven D'Aprano  wrote:
> The conclusion here is that if you want readable source code, you should
> remove the source code. *wink*

That's more true than your winky implies. Which is more readable: a
Python function, or the disassembly of its corresponding byte-code?
Which is more readable: a "for item in items:" loop, or one that
iterates up to the length of the list and subscripts it each time? The
less code it takes to express the same concept, the easier it is to
read - and to debug.
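
For instance:

    items = ["a", "b", "c"]

    for i in range(len(items)):   # index-and-subscript version
        print(items[i])

    for item in items:            # same output, less to read and less to get wrong
        print(item)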

So yes, if you want readable source code, you should have less source code.

ChrisA

From turnbull.stephen.fw at u.tsukuba.ac.jp  Mon Apr 30 21:51:50 2018
From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Tue, 1 May 2018 10:51:50 +0900
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 <87po2hrfek.fsf@vostro.rath.org>
 
 
 
 
 
Message-ID: <23271.51254.539986.380171@turnbull.sk.tsukuba.ac.jp>

Tim Peters writes:

 > Meaning "wonderfully clear" to the compiler, not necessarily to you
 > ;-)

Is the compiler African or European (perhaps even Dutch)?



From tim.peters at gmail.com  Mon Apr 30 21:52:13 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 30 Apr 2018 20:52:13 -0500
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: <78764cbe-ec31-626f-35e8-6d0f5dd6fb06@mrabarnett.plus.com>
References: 
 <20180428093334.GU7400@ando.pearwood.info>
 
 <5AE61F76.60900@stoneleaf.us>
 
 <5AE66F61.9000900@stoneleaf.us>
 <4a5496f7-c180-4a01-0de8-dcd9b07681ad@gmail.com>
 
 
 
 <78764cbe-ec31-626f-35e8-6d0f5dd6fb06@mrabarnett.plus.com>
Message-ID: 

[MRAB ]
> ...
> The intention is that only the specified names are local.
>
> After all, what's the point of specifying names after the 'local' if _any_
> binding in the local scope was local?

Don't look at me ;-)  In the absence of use cases, I don't know which
problem(s) you're trying to solve.  All the use cases I've looked at
are adequately addressed by having some spelling of "local:" change
nothing at all about Python's current scope rules.  If you have uses
in mind that require more than just that, I'd need to see them.

>> ...
>> If you agree that makes the feature probably unusable, you don't get
>> off the hook by saying "no, unlike current Python scopes, binding
>> sites have nothing to do with what's local to a new lexical scope
>> introduced by 'local:'".  The same question raised in the example
>> above doesn't go away:  in which scope(s) are 'r1' and 'r2' to be
>> bound?

> Any binding that's not specified as local is bound in the parent scope:

Reverse-engineering the example following, is this a fair way of
making that more precise?

Given a binding-target name N in scope S, N is bound in scope T, where
T is the closest-containing scope (which may be S itself) for which T
is either

1. established by a "local:" block that declares name N

or

2. not established by a "local:" block


> local b:
>     local c:
>         c = 0 # Bound in the "local c" scope.

By clause #1 above, "c" is declared in the starting "local:" scope.

>         b = 0 # Bound in the "local b" scope.

By clause #1 above, after searching one scope up to find `b` declared
in a "local:" scope

>         a = 0 # Bound in the main scope (function, global, whatever)

By clause #2 above, after searching two scopes up and not finding any
"local:" scope declaring name "a".  By your original "the parent
scope", I would have expected this be bound in the "local b:" scope
(which is the parent scope of the "local c:" scope).

So that's _a_ possible answer.  It's not like the scoping rules in any
other language I'm aware of, but then Python's current scoping rules
are unique too.

Are those useful rules?  Optimal?  The first thing that popped into
your head?  The third?  Again I'd need to see use cases to even begin
to guess.

I agree it's well defined, though, and so miles ahead of most ideas ;-)

...

>> Note:  most of this doesn't come up in most other languages because
>> they require explicitly declaring in which scope a name lives.
>> Python's "infer that in almost all cases instead from examining
>> binding sites" has consequences.

> Would/should it be possible to inject a name into a local scope? You can't
> inject into a function scope, and names in a function scope can be
> determined statically (they are allocated slots), so could the same kind of
> thing be done for names in a local scope?

Sorry, I'm unclear on what "inject a name into a local scope" means.
Do you mean at runtime?

In Python's very early days, all scope namespaces were implemented as
Python dicts, and you could mutate those at runtime any way you liked.
Name lookup first tried the "local" dict ("the" because local scopes
didn't nest), then the "global" dict, then the "builtin" dict.  Names
could be added or removed from any of those at will.
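
In effect, lookup was just this (a retro sketch of the old LGB rule,
with plain dicts standing in for the real namespaces):

    import builtins

    def lgb_lookup(name, local_ns, global_ns):
        for ns in (local_ns, global_ns, vars(builtins)):
            if name in ns:
                return ns[name]
        raise NameError(name)

    print(lgb_lookup("len", {}, {}))  # falls through to the builtin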

People had a lot of fun playing with that, but nobody seriously
complained as that extreme flexibility was incrementally traded away
for faster runtime.

So now take it as given that the full set of names in a local scope
must be determinable at compile-time (modulo whatever hacks may still
exist to keep old "from module import *" code working - if any still
do exist).  I don't believe CPython has grown any optimizations
preventing free runtime mutation of global (module-level) or builtin
namespace mappings, but I may be wrong about that.

From tim.peters at gmail.com  Mon Apr 30 23:40:49 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 30 Apr 2018 22:40:49 -0500
Subject: [Python-ideas] A "local" pseudo-function
In-Reply-To: 
References: 
 <20180428093334.GU7400@ando.pearwood.info>
 
 <5AE61F76.60900@stoneleaf.us>
 
 <5AE66F61.9000900@stoneleaf.us>
 <4a5496f7-c180-4a01-0de8-dcd9b07681ad@gmail.com>
 
 
 
 <78764cbe-ec31-626f-35e8-6d0f5dd6fb06@mrabarnett.plus.com>
 
Message-ID: 

[MRAB]
>> Any binding that's not specified as local is bound in the parent scope:

[Tim]
> Reverse-engineering the example following, is this a fair way of
> making that more precise?
>
> Given a binding-target name N in scope S, N is bound in scope T, where
> T is the closest-containing scope (which may be S itself) for which T
> is either
>
> 1. established by a "local:" block that declares name N
>
> or
>
> 2. not established by a "local:" block

Here's an example where I don't know what the consequences of "the
rules" should be:

def f():
    a = 10
    local a:
        def showa():
            print("a is", a)
        showa() # 10
        a = 20
        showa() # 20
        a = 30
    showa() # 10

The comments show what the output would be under the "nothing about
scope rules change" meaning.  They're all obvious (since there is
no new scope then - it's all function-local).
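
For reference, current Python with no "local" construct at all tracks
every rebinding, since the closure looks `a` up afresh on each call:

    def f():
        a = 10
        def showa():
            print("a is", a)
        showa()  # a is 10
        a = 20
        showa()  # a is 20
        a = 30
        showa()  # a is 30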

But under the other meaning ...?

The twist here is that `def` is an executable statement in Python, and
is a "binding site" for the name of the function being defined.  So
despite that `showa` appears to be defined in a new nested lexical
scope, it's _actually_ bound as a function-local name.  That's bound
to be surprising to people from other languages:  "I defined it in a
nested lexical scope, but the name is still visible after that scope
ends?".

I don't know what the first `showa()` is intended to do.  Presumably
`a` is unbound at the start of the new nested scope?  So raises
NameError?  If so, comment that line out so we can make progress ;-)

It seems clear that the second `showa()` will display 20 under any reading.

But the third?  Now we're out of the `local a:` scope, but call a
function whose textual definition was inside that scope.  What does
`showa()` do now to find a's value?  f's local `a` had nothing to do
with the `a` in the nested scope, so presumably it shouldn't display
10 now.  What should it do?
Does the final state of the nested scope's locals need to be preserved so
that showa() can display 30 instead?  Or ...?

Not necessarily complaining  - just fleshing out a bit my earlier
claim that a world of semantics need to be defined if anything akin to
a "real scope" is desired.

From marcidy at gmail.com  Mon Apr 30 23:46:42 2018
From: marcidy at gmail.com (Matt Arcidy)
Date: Mon, 30 Apr 2018 20:46:42 -0700
Subject: [Python-ideas] Objectively Quantifying Readability
In-Reply-To: <20180501004252.GG7400@ando.pearwood.info>
References: 
 <20180501004252.GG7400@ando.pearwood.info>
Message-ID: 

On Mon, Apr 30, 2018 at 5:42 PM, Steven D'Aprano  wrote:
> On Mon, Apr 30, 2018 at 11:28:17AM -0700, Matt Arcidy wrote:
>
>> A study has been done regarding readability in code which may serve as
>> insight into this issue. Please see page 8, fig 9 for a nice chart of
>> the results, note the negative/positive coloring of the correlations,
>> grey/black respectively.
>
> Indeed. It seems that nearly nothing is positively correlated to
> increased readability, aside from comments, blank lines, and (very
> weakly) arithmetic operators. Everything else hurts readability.
>
> The conclusion here is that if you want readable source code, you should
> remove the source code. *wink*
>
>
>> https://web.eecs.umich.edu/~weimerw/p/weimer-tse2010-readability-preprint.pdf
>>
>> The criteria in the paper can be applied to assess an increase or
>> decrease in readability between current and proposed changes.  Perhaps
>> even an automated tool could be implemented based on agreed upon
>> criteria.
>
>
> That's a really nice study, and thank you for posting it. There are some
> interested observations here, e.g.:
>
> - line length is negatively correlated with readability;
>
>   (a point against those who insist that 79 character line
>   limits are irrelevant since we have wide screens now)
>
> - conventional measures of complexity do not correlate well
>   with readability;
>
> - length of identifiers was strongly negatively correlated
>   with readability: long, descriptive identifier names hurt
>   readability while short variable names appeared to make
>   no difference;
>
>   (going against the common wisdom that one character names
>   hurt readability -- maybe mathematicians got it right
>   after all?)
>
> - people are not good judges of readability;
>
> but I think the practical relevance here is very slim. Aside from
> questions about the validity of the study (it is only one study, can the
> results be replicated, do they generalise beyond the narrowly self-
> selected set of university students they tested?) I don't think that it
> gives us much guidance here. For example:

I don't propose to replicate correlations.  I don't see these
"standard" terminal conclusions as forgone when looking at the idea as
a whole, as opposed to the paper itself, which they may be.  The
authors crafted a method and used that method to do a study, I like
the method.  I think I can agree with your point about the study
without validating or invalidating the method.

>
> 1. The study is based on Java, not Python.

An objective measure can be created, based or not on the paper's
parameters, but it clearly would need to be adjusted to a specific
language, good point.

Here "objective" does not mean "with absolute correctness" but
"applied the same way such that a 5 is always a 5, and a 5 is always
greater than 4."  I think I unfortunately presented the paper as "The
Answer" in my initial email, but I didn't intend to say "each detail
must be implemented as is" but more like "this is a thing which can be
done."  Poor job on my part.

>
> 2. It looks at a set of pre-existing source code features.
>
> 3. It gives us little or no help in deciding whether new syntax will or
> won't affect readability: the problem of *extrapolation* remains.
>
> (If we know that, let's say, really_long_descriptive_identifier_names
> hurt readability, how does that help us judge whether adding a new kind
> of expression will hurt or help readability?)

A new feature can remove symbols or add them.  It can increase density
on a line, or reduce it.  It can be a policy of variable naming, or it
can specifically note that variable naming has no bearing on a new
feature.  This is not limited in application.  It's just scoring.
When anyone complains about readability, break out the scoring
criteria and assess how good the _comparative_ readability claim is:
2 vs 10?  4 vs 5?  The arguments will no longer be singularly about
"readability," nor will the be about the question of single score for
a specific statement.  The comparative scores of applying the same
function over two inputs gives a relative difference.  This is what
measures do in the mathematical sense.

Maybe the "readability" debate then shifts to arguing criteria: "79?
Too long in your opinion!"  A measure will at least break
"readability" up and give some structure to that argument.  Right now
"readability" comes up and starts a semi-polite flame war.  Creating
_any_ criteria will help narrow the scope of the argument.

Even when someone writes perfectly logical statements about it, the
statements can always be dismantled because it's based in opinion.
By creating a measure, objectivity is forced.  While each criterion is
less or more subjective, the measure will be applied objectively to
each instance, the same way, to get a score.

>
> 4. The authors themselves warn that it is descriptive, not prescriptive,
> for example replacing long identifier names with randomly selected two
> character names is unlikely to be helpful.

Of course, which is why it's a score, not a single criterion.   For
example, if you hit the Shannon limit, no one will be able to read it
anyways.  "shorter is better" doesn't mean "shortest is best".

>
> 5. The unfamiliarity affect: any unfamiliar syntax is going to be less
> readable than a corresponding familiar syntax.

Definitely, let me respond specifically, but as an example of how to
apply a measure flexibly.
A criterion can be turned on/off based on the target of the new
feature.  Do you want beginners to understand this?  Is this for core
developers?
If there exists one measure, another can be created by
adding/subtracting criteria. I'm not saying do it, I'm saying it can
be done.  It's a matter of conditioning, like a marginal distribution.
Core developers seem fairly indifferent to symbolic density on a line,
but many are concerned about beginners.  Heck, run both measures and
see how dramatically the numbers change.

>
>
> It's a great start to the scientific study of readability, but I don't
> think it gives us any guidance with respect to adding new features.
>
>
>> Opinions about readability can be shifted from:
>>  - "Is it more or less readable?"
>> to
>>  - "This change exceeds a tolerance for levels of readability given
>> the scope of the change."
>
> One unreplicated(?) study for readability of Java snippets does not give
> us a metric for predicting the readability of new Python syntax. While
> it would certainly be useful to study the possibly impact of adding new
> features to a language, the authors themselves state that this study is
> just "a framework for conducting such experiments".

It's example of a measure.  I presented it poorly, but even poor
presentation should not prevent acknowledging that objective measures
exist today.  "In English" is a good one for me for sure, I'm barely
monolingual.

Perhaps agreement on the criteria will be a war of attrition, perhaps
impossible, but the "pure opinion" argument is definitely not true.
This should be clearly noted, specifically because there is _so much
that is already agreed upon_, which is a real tragedy here.  Even
though information theory is useful in this pursuit, it cannot be taken
to the limit, or we'd be trying to read/write bzip hex files.

I think what you have mentioned strengthens the point that rules exist,
and your points can be formalized into rules and then incorporated into
a model.

Using your names example again, single letter names are not very
meaningful, and 5 random alphanumerics are no better, perhaps even
less so if 'e' is Exception and now it's 'et82c'.  However, 5 letters
that trigger a spell checker to propose the correct concept pointed to
by the name clearly have _value_ over the random 5 alphanumerics, i.e.
a Hamming distance type measure would capture this improvement
perfectly.

As for predictability, every possible statement has a score that just
needs to be computed by the measure; measures are not predictive.
Running a string of symbols through it will result in a number based on
patterns, as it's not a semantic analysis.  If the scoring is garbage,
the measure is garbage and needs to be redone, but not because it fails
at predicting.  Each symbol string has a score, precisely as two points
have a distance in the Euclidean measure.

In lieu of a wide statistical study, assumptions will be made, argued,
set, used, argued again, etc.  This is life.  But the rhetoric of
"readability; there for my opinion is right or your statement is an
opinion" will be tempered.  A criteria can be theorized or accepted as
a de-facto tool.  Or not.  But it can exist.

>
> Despite the limitations of the study, it was an interesting read, thank
> you for posting it.

I think I presented the paper as "The answer" as opposed to "this is
an approach."  I  agree that some, perhaps many, of the paper
specifics are wholly irrelevant.   Crafting a meaningful measure of
readability is do-able, however.   Obtaining agreement is still hard,
and perhaps unfortunately impossible, but a measure exists.

I suppose I'll build something and see where it goes; oddly enough, I
am very enamored with my own idea!  I appreciate your feedback and
will incorporate it, and if you have any more, I am interested to hear
it.


>
>
> --
> Steve
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/