From mertz at gnosis.cx Thu Feb 1 00:05:22 2018 From: mertz at gnosis.cx (David Mertz) Date: Wed, 31 Jan 2018 21:05:22 -0800 Subject: [Python-ideas] Format mini-language for lakh and crore In-Reply-To: References: <1f1f8906-8cf7-ff3c-087e-8f75b6399df5@trueblade.com> <7679410d-8261-33f4-54ca-a2e4d8edb09a@trueblade.com> Message-ID: On Jan 31, 2018 8:12 PM, "Nick Coghlan" wrote: On 1 February 2018 at 08:14, Eric V. Smith wrote: >>> print(f"In European format x is {x:,.2f}, in Indian format it is {x:,2,3.2f}") > This just seems too complicated to me, and is overgeneralizing. How many of > these different formats would ever really be used? Can you really expect > someone to remember what that means by looking at it? That's even more arbitrary and hard to interpret than listing out the grouping spec, though. I suggested a single character, although my thought of backtick was different from Eric's of semicolon. Neither of them would be obvious, but rather "something to look up the first few times." There is a lot in the format mini-language that is "have to look up" though. A single character South Asian number delimiter style wouldn't be different from a lot of features of that DSL. Albeit, most of it seems intuitive after you've used it a while... The symbols are somewhat iconic. I think if we only cared about decimal digit groups (which is all I initially thought of), Nick's would be excessive generalization. However, when you think of also grouping hex, octal, and binary, there genuinely are several conventions and different useful presentations. So overall I do like Nick's approach better than my initial suggestion or Eric's one that is similar to mine. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Feb 1 00:11:42 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 1 Feb 2018 15:11:42 +1000 Subject: [Python-ideas] Format mini-language for lakh and crore In-Reply-To: References: <1f1f8906-8cf7-ff3c-087e-8f75b6399df5@trueblade.com> <7679410d-8261-33f4-54ca-a2e4d8edb09a@trueblade.com> Message-ID: On 1 February 2018 at 14:11, Nick Coghlan wrote: > On 1 February 2018 at 08:14, Eric V. Smith wrote: >> On 1/29/2018 2:13 AM, Nick Coghlan wrote: >>> Given the example, I think a more useful approach would be to allow an >>> optional digit grouping specifier after the comma separator, and allow >>> the separator to be repeated to indicate non-uniform groupings in the >>> lower order digits. >>> >>> If we did that, then David's example could become: >>> >>> >>> print(f"In European format x is {x:,.2f}, in Indian format it >>> is {x:,2,3.2f}") >> >> >> This just seems too complicated to me, and is overgeneralizing. How many of >> these different formats would ever really be used? Can you really expect >> someone to remember what that means by looking at it? > > Sure - "," and "_" both mean "digit grouping", the numbers tell you > how large the groups are from left to right (with the leftmost group > size repeated as needed), and a single "," means the same thing as > ",3," for decimal digits, and the same thing as ",4," for binary, > octal, and hexadecimal digits. Slight correction here, since the comma-separator is decimal only: - "," would be short for ",3," with decimal digits - "_" would be short for "_3_" with decimal digits - "_" would be short for "_4_" with binary/octal/hexadecimal digits Cheers, Nick. 
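[Aside, for illustration: none of the grouping specs discussed above exist in the format mini-language today. A small pure-Python helper can produce the lakh/crore output now; the function name and signature below are made up for this sketch, with group sizes given from the right and the last size repeating, which matches the ",2,3" reading above.]

    def group_digits(value, groups=(3, 2), sep=",", decimals=2):
        # Group sizes are taken from the right and the last size repeats,
        # so (3, 2) gives the lakh/crore convention: 12,34,56,789.00
        sign = "-" if value < 0 else ""
        whole, _, frac = f"{abs(value):.{decimals}f}".partition(".")
        sizes = list(groups)
        parts = []
        while whole:
            size = sizes[0] if len(sizes) == 1 else sizes.pop(0)
            parts.append(whole[-size:])
            whole = whole[:-size]
        out = sign + sep.join(reversed(parts))
        return out + ("." + frac if frac else "")

    print(group_digits(123456.789))                            # 1,23,456.79
    print(group_digits(1234567890, groups=(3,), decimals=0))   # 1,234,567,890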
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stephanh42 at gmail.com Thu Feb 1 03:16:22 2018 From: stephanh42 at gmail.com (Stephan Houben) Date: Thu, 1 Feb 2018 09:16:22 +0100 Subject: [Python-ideas] Format mini-language for lakh and crore In-Reply-To: References: <1f1f8906-8cf7-ff3c-087e-8f75b6399df5@trueblade.com> <7679410d-8261-33f4-54ca-a2e4d8edb09a@trueblade.com> Message-ID: What about something like: f"{x:?d}" ? = Indian Rupees symbol I realize it is not ASCII but ? would be, for the target audience, both easy to type (Ctrl-Alt-4 on Windows Indian English keyboard layout) and be mnemonic ("format number like you would format an amount in rupees"). Stephan 2018-02-01 6:11 GMT+01:00 Nick Coghlan : > On 1 February 2018 at 14:11, Nick Coghlan wrote: > > On 1 February 2018 at 08:14, Eric V. Smith wrote: > >> On 1/29/2018 2:13 AM, Nick Coghlan wrote: > >>> Given the example, I think a more useful approach would be to allow an > >>> optional digit grouping specifier after the comma separator, and allow > >>> the separator to be repeated to indicate non-uniform groupings in the > >>> lower order digits. > >>> > >>> If we did that, then David's example could become: > >>> > >>> >>> print(f"In European format x is {x:,.2f}, in Indian format it > >>> is {x:,2,3.2f}") > >> > >> > >> This just seems too complicated to me, and is overgeneralizing. How > many of > >> these different formats would ever really be used? Can you really expect > >> someone to remember what that means by looking at it? > > > > Sure - "," and "_" both mean "digit grouping", the numbers tell you > > how large the groups are from left to right (with the leftmost group > > size repeated as needed), and a single "," means the same thing as > > ",3," for decimal digits, and the same thing as ",4," for binary, > > octal, and hexadecimal digits. > > Slight correction here, since the comma-separator is decimal only: > > - "," would be short for ",3," with decimal digits > - "_" would be short for "_3_" with decimal digits > - "_" would be short for "_4_" with binary/octal/hexadecimal digits > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Thu Feb 1 03:45:14 2018 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 1 Feb 2018 03:45:14 -0500 Subject: [Python-ideas] Format mini-language for lakh and crore In-Reply-To: References: <1f1f8906-8cf7-ff3c-087e-8f75b6399df5@trueblade.com> <7679410d-8261-33f4-54ca-a2e4d8edb09a@trueblade.com> Message-ID: On 2/1/2018 12:05 AM, David Mertz wrote: > So overall I do like Nick's approach better than my initial suggestion > or Eric's one that is similar to mine. Oops, I'd forgotten that you (David) had proposed a single character in your original email. I'm not trying to claim the idea! The important part is that it's a single character, not what that character is, so I'll refer to it as "David's suggestion"! FWIW, PEP 378 also summarizes some of the discussion we're rehashing, except Nick's proposal (the chosen one) was simpler, and mine slightly more complex, but still not generalized to solve the problem being discussed here. 
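[Aside on what already exists: the stdlib locale module can produce lakh/crore grouping today when the platform ships an Indian locale. A minimal sketch, assuming a glibc-style "en_IN.UTF-8" locale is installed; the locale name and exact output are platform-dependent.]

    import locale

    # Raises locale.Error if this locale is not installed on the system.
    locale.setlocale(locale.LC_NUMERIC, "en_IN.UTF-8")

    print(locale.format_string("%.2f", 123456.789, grouping=True))   # 1,23,456.79
    print(locale.format_string("%d", 10000000, grouping=True))       # 1,00,00,000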
Nine years on it might be worth doing some research to see if other languages have done anything since the PEP was written. For the languages that support picture-style formatting, I suspect not, but maybe there's something to learn from newer languages? Eric From mal at egenix.com Thu Feb 1 04:20:00 2018 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 1 Feb 2018 10:20:00 +0100 Subject: [Python-ideas] Support WHATWG versions of legacy encodings In-Reply-To: References: <20180119033907.GH22500@ando.pearwood.info> Message-ID: On 01.02.2018 00:40, Chris Angelico wrote: > On Thu, Feb 1, 2018 at 10:15 AM, Chris Barker wrote: >> I still have no ide4a why there is such resistance to this -- yes, it's a >> fairly small benefit over a package no PyPi, but there is also virtually no >> downside. > > I don't understand it either. Aside from maybe bikeshedding the *name* > of the encoding, this seems like a pretty straight-forward addition. I guess many of you are not aware of how we have treated such encoding additions in the past 1.5 decades. In general, we have only added new encodings when there was an encoding missing which a lot of people were actively using. We asked for official documentation defining the mappings, references showing usage and IANA or similar standard names to use for the encoding itself and its aliases. In recent years, we had only very few such requests, mainly because the set we have in Python is already fairly complete. Now the OP comes proposing to add a whole set of encodings which only differ slightly from our existing ones. Backing is their use and definition by WHATWG, a consortium of browser vendors who are interested in showing web pages to users in a consistent way. WHATWG decided to simply override the standard names for encodings with new mappings under their control. Again, their motivation is clear: browsers get documents with advertised encoding which don't always match the standard ones, so they have to make some choices on how to display those documents. The easiest way for them is to define all special cases in a set of new mappings for each standard encoding name. This is all fine, but it's also a very limited use case: that of wanting to display web pages in a browser. It's certainly needed for applications implementing browser interfaces and probably also for ones which do web scraping, but otherwise, the need should rarely arise. What WHATWG uses as workarounds may also not necessarily be what actual users would like to have. Such workarounds are always trade-offs and they can change over time - which WHATWG addresses by making the encodings "living standards". They are a solution, but not a one fits all way of dealing with broken data. We also have the naming issue, since WHATWG chose to use the same names as the standard mappings. Anything we'd define will neither match WHATWG nor any other encoding standard name, so we'd be creating a new set of encoding names - which is really not what the world is after, including WHATWG itself. People would start creating encoded text using these new encoding names, resulting in even more mojibake out there instead of fixing the errors in the data and using Unicode or UTF-8 for interchange. As I mentioned before, we could disable encoding in the new mappings to resolve this concern, but the OP wasn't interested in such an approach. As alternative approach we proposed error handlers, which are the normal technology to use when dealing with encoding errors. Again, the OP wasn't interested. 
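[Aside, to make the error-handler alternative mentioned above concrete, a rough sketch. The handler name is invented, and the sketch assumes the WHATWG difference is limited to mapping otherwise-undefined bytes to the C1 controls with the same value -- true for windows-1252, but not a general rule for every legacy encoding.]

    import codecs

    def c1_fallback(exc):
        # cp1252 leaves 0x81, 0x8D, 0x8F, 0x90 and 0x9D undefined; the WHATWG
        # windows-1252 table maps them to the C1 controls U+0081, U+008D, ...
        if isinstance(exc, UnicodeDecodeError):
            return chr(exc.object[exc.start]), exc.start + 1
        raise exc

    codecs.register_error("c1fallback", c1_fallback)

    data = b"\x93quoted\x94 text with a stray \x81 byte"
    print(ascii(data.decode("cp1252", errors="c1fallback")))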
Please also note that once we start adding, say "whatwg-" encodings (or rather decodings :-), going for the simple charmap encodings first, someone will eventually also request addition of the more complex Asian encodings which WHATWG defines. Maintaining these is hard, since they require writing C code for performance reasons and to keep the mapping tables small.

I probably forgot a few aspects, but the above is how I would summarize the discussion from the perspective of the people who have dealt with such discussions in the past.

There are quite a few downsides to consider and since the OP is not interested in going for a compromise as described above, I don't see a way forward.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Feb 01 2018)
>>> Python Projects, Coaching and Consulting ... http://www.egenix.com/
>>> Python Database Interfaces ... http://products.egenix.com/
>>> Plone/Zope Database Interfaces ... http://zope.egenix.com/
________________________________________________________________________

::: We implement business ideas - efficiently in both time and costs :::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/

From pfreixes at gmail.com Thu Feb 1 10:34:24 2018
From: pfreixes at gmail.com (Pau Freixes)
Date: Thu, 1 Feb 2018 16:34:24 +0100
Subject: [Python-ideas] Why CPython is still behind in performance for some widely used patterns ?
In-Reply-To: References: Message-ID:

Maybe it is happening, but not in the way that you would expect

https://mail.python.org/pipermail/python-dev/2018-January/152029.html

Anyway, do we conclude, or at least a significant part of us, that this is something desirable but some constraints do not allow work on it?

Also, more technically, I would like to have your point of view on two questions; sorry if these sound kind of stupid.

1) Is CPython 4 a good place to start thinking about making the default execution mode less debuggable? Wouldn't having an explicit -g flag that is disabled by default be an open window for changing many things behind the scenes?

2) Regarding Yury's proposal to cache builtin functions, why can't this strategy be used for objects and their attributes within the function scope? Technically, what is the red flag?

Cheers,

El 29/01/2018 18:10, "Brett Cannon" escribió: > > > On Sat, Jan 27, 2018, 23:36 Pau Freixes, wrote: > >> > At a technical level, the biggest problems relate to the way we >> > manipulate frame objects at runtime, including the fact that we expose >> > those frames programmatically for the benefit of debuggers and other >> > tools. >> >> Shoudnt be something that could be tackled with the introduction of a >> kind of "-g" flag ? Asking the user to make explicit that is willing >> on having all of this extra information that in normal situations >> won't be there. >> >> > >> > More broadly, the current lack of perceived commercial incentives for >> > large corporations to invest millions in offering a faster default >> > Python runtime, the way they have for the other languages you >> > mentioned in your initial post :) >> >> Agree, at least from my understanding, Google has had a lot of >> initiatives to improve the JS runtime.
But at the same moment, these >> last years and with the irruption of Asyncio many companies such as >> Facebook are implementing their systems on top of CPython meaning that >> they are indirectly inverting on it. >> > > I find that's a red herring. There are plenty of massive companies that > have relied on Python for performance-critical workloads in timespans > measuring in decades and they have not funded core Python development or > the PSF in a way even approaching the other languages Python was compared > against in the original email. It might be the feeling of community > ownership that keeps companies from making major investments in Python, but > regardless it's important to simply remember the core devs are volunteers > so the question of "why hasn't this been solved" usually comes down to > "lack of volunteer time". > > -Brett > > >> -- >> --pau >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Thu Feb 1 11:05:15 2018 From: mertz at gnosis.cx (David Mertz) Date: Thu, 1 Feb 2018 08:05:15 -0800 Subject: [Python-ideas] Format mini-language for lakh and crore In-Reply-To: References: <1f1f8906-8cf7-ff3c-087e-8f75b6399df5@trueblade.com> <7679410d-8261-33f4-54ca-a2e4d8edb09a@trueblade.com> Message-ID: On Feb 1, 2018 12:17 AM, "Stephan Houben" wrote: What about something like: f"{x:?d}" ? = Indian Rupees symbol I realize it is not ASCII but ? would be, for the target audience, both easy to type and be mnemonic ("format number like you would format an amount in rupees"). I like how iconic it is very much. However... There are two obstacles to this. The main one is the BDFLs often declared opposition to using non-ASCII in Python itself. The format mini-language is borderline between being part of the Python language and merely being strings you can quote (which strongly need to allow non-ASCII literals). It's a lot like regex or glob this way (but for historic reasons at least, both those are also pure ASCII in their syntax elements, but can obviously match non-ASCII literals or classes). The other element is that not all of South Asia is India. U+09F3 BENGALI RUPEE SIGN ? U+0AF1 GUJARATI RUPEE SIGN ? U+0BF9 TAMIL RUPEE SIGN ?I believe U+02A8 ? is deprecated in India but still used in Pakistan. However, the discussion also let me to find your on Wikipedia, which also urged towards Nick's more general pattern specifier: Outside of Taiwan, digits are sometimes grouped by myriads instead of thousands. Hence it is more convenient to think of numbers here as in groups of four, thus 1,234,567,890 is regrouped here as 12,3456,7890. Larger than a myriad, each number is therefore four zeroes longer than the one before it, -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Thu Feb 1 12:55:52 2018 From: brett at python.org (Brett Cannon) Date: Thu, 01 Feb 2018 17:55:52 +0000 Subject: [Python-ideas] Why CPython is still behind in performance for some widely used patterns ? 
In-Reply-To: References: Message-ID: On Thu, 1 Feb 2018 at 07:34 Pau Freixes wrote: > Maybe it is happening but not in the way that you would expect > > https://mail.python.org/pipermail/python-dev/2018-January/152029.html > > As one of the people who works at Microsoft and has Steve as a teammate I'm well aware of what MS contributes. :) My point is even with the time Steve, me, and our fellow core devs at MS get to spend on Python, it still pales in comparison to what some other languages get with dedicated staffing. > Anyway, do we conclude, or at least a significant part, that is something > desiderable but some constraints do not allow to work on that? > I'm not sure what you're referencing as "something desirable", but I think we all want to see Python improve if possible. > > Also, more technically Iwouls like to have your point of view of two > questions, sorry if these sound kind stupid. > > 1) Is CPython 4 a good place to start to think on make the default > execution mone debuggale. Having an explicit -g operand that by default is > disabled, shouldnt be an open window for changing many thinks behind the > scenes? > Don't view Python 4 as a magical chance to do a ton of breaking changes like Python 3. > > 2) Regarding the Yuris proposal to cache bultin functions, why this > strategy cant be used for objects and their attributes within the function > scope? Technically which is the red flag? > Descriptors are the issue for attributes. After that it's a question of whether it's worth the overhead of other scope levels (built-ins are somewhat unique in that they are very rarely changed). The key point is that all of this requires people's time and we just don't have tons of that available at the moment. -Brett > > Cheers, > > El 29/01/2018 18:10, "Brett Cannon" escribi?: > >> >> >> On Sat, Jan 27, 2018, 23:36 Pau Freixes, wrote: >> >>> > At a technical level, the biggest problems relate to the way we >>> > manipulate frame objects at runtime, including the fact that we expose >>> > those frames programmatically for the benefit of debuggers and other >>> > tools. >>> >>> Shoudnt be something that could be tackled with the introduction of a >>> kind of "-g" flag ? Asking the user to make explicit that is willing >>> on having all of this extra information that in normal situations >>> won't be there. >>> >>> > >>> > More broadly, the current lack of perceived commercial incentives for >>> > large corporations to invest millions in offering a faster default >>> > Python runtime, the way they have for the other languages you >>> > mentioned in your initial post :) >>> >>> Agree, at least from my understanding, Google has had a lot of >>> initiatives to improve the JS runtime. But at the same moment, these >>> last years and with the irruption of Asyncio many companies such as >>> Facebook are implementing their systems on top of CPython meaning that >>> they are indirectly inverting on it. >>> >> >> I find that's a red herring. There are plenty of massive companies that >> have relied on Python for performance-critical workloads in timespans >> measuring in decades and they have not funded core Python development or >> the PSF in a way even approaching the other languages Python was compared >> against in the original email. 
It might be the feeling of community >> ownership that keeps companies from making major investments in Python, but >> regardless it's important to simply remember the core devs are volunteers >> so the question of "why hasn't this been solved" usually comes down to >> "lack of volunteer time". >> >> -Brett >> >> >>> -- >>> --pau >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov.ml at gmail.com Thu Feb 1 13:04:34 2018 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Thu, 1 Feb 2018 13:04:34 -0500 Subject: [Python-ideas] Why CPython is still behind in performance for some widely used patterns ? In-Reply-To: References: Message-ID: On Thu, Feb 1, 2018 at 12:55 PM, Brett Cannon wrote: > > > On Thu, 1 Feb 2018 at 07:34 Pau Freixes wrote: [..] >> 2) Regarding the Yuris proposal to cache bultin functions, why this >> strategy cant be used for objects and their attributes within the function >> scope? Technically which is the red flag? > > > Descriptors are the issue for attributes. After that it's a question of > whether it's worth the overhead of other scope levels (built-ins are > somewhat unique in that they are very rarely changed). I'm not sure I understand Pau's question but I can assure that my optimizations were fully backwards compatible and preserved all of Python attribute lookup semantics. And they made some macrobenchmarks up to 10% faster. Unfortunately I failed at merging them in 3.7. Will do that for 3.8. Yury From pfreixes at gmail.com Thu Feb 1 14:30:25 2018 From: pfreixes at gmail.com (Pau Freixes) Date: Thu, 1 Feb 2018 20:30:25 +0100 Subject: [Python-ideas] Why CPython is still behind in performance for some widely used patterns ? In-Reply-To: References: Message-ID: I'm not sure I understand Pau's question but I can assure that my optimizations were fully backwards compatible and preserved all of Python attribute lookup semantics. And they made some macrobenchmarks up to 10% faster. Unfortunately I failed at merging them in 3.7. Will do that for 3.8. I was referring to that https://bugs.python.org/issue28158 -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at barrys-emacs.org Thu Feb 1 15:15:57 2018 From: barry at barrys-emacs.org (Barry Scott) Date: Thu, 1 Feb 2018 20:15:57 +0000 Subject: [Python-ideas] Why CPython is still behind in performance for some widely used patterns ? In-Reply-To: References: Message-ID: > On 30 Jan 2018, at 05:45, Nick Coghlan wrote: > > I'll also note that one of the things we (and others) *have* been > putting quite a bit of time into is the question of "Why do people > avoid using extension modules for code acceleration?". I think that is simple. Those that try give up because its a difficult API to call correctly. At PYCON UK on speaker explain how she, PhD level researcher, had failed to get the a C extension working. I was contacted to improve PyCXX by a contractor for the US Army that stated that he was called in to help the internal developers get a big library wrapped for use from python. After 6 months they where no where near working code. He did what was needed with PyCXX in 3 weeks + the time getting me to make some nice additions to help him. 
It seems that if people find a C++ library that will do the heavy lifting they end up with extensions. But those that attempt the C API as is seem to fail and give up. It also seems that people do not go looking for the helper libraries. Next year at PYCON I hope to give a talk on PyCXX and encourage people to write extensions. Barry PyCXX maintainer. -------------- next part -------------- An HTML attachment was scrubbed... URL: From kissgyorgy at me.com Thu Feb 1 15:45:13 2018 From: kissgyorgy at me.com (=?UTF-8?Q?Kiss=2C_Gy=C3=B6rgy?=) Date: Thu, 01 Feb 2018 20:45:13 +0000 Subject: [Python-ideas] JSON encoding protocol with __json__ dunder method Message-ID: Hi! Most of the classes (even if very simple like datetime.datetime) cannot be serialized to JSON by default. Would it be a good idea for the default json.JSONEncoder to call the __json__ dunder method automatically if the object has one? I can't find anything about why this protocol or PEP doesn't exists yet. Currently almost everyone implements it like this, there are thousands of results on GitHub to this: https://github.com/search?l=Python&q=def+__json__&type=Code&utf8=%E2%9C%93 but there is no canonical way (Python/standard way) to do this. It would be very nice, because a custom JSONEncoder would not be needed and everyone could implement JSON serialization in One True Way (yay for code reuse!) The implementation could be very simple, would look something like this: class JSONEncoder: def default(self, obj): if hasattr(obj, '__json__'): return obj.__json__() return current_implementation Or to make it easier even for decoding, it could be __to_json__ and __from_json__ or __json_encode__ and __json_decode__ or something like these. Was there already a PEP/discussion about this? Gy?rgy -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Thu Feb 1 16:34:50 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 1 Feb 2018 16:34:50 -0500 Subject: [Python-ideas] Support WHATWG versions of legacy encodings In-Reply-To: References: <20180119033907.GH22500@ando.pearwood.info> Message-ID: On 1/31/2018 6:15 PM, Chris Barker wrote: > I still have no idea why there is such resistance to this [spelling corrected] Every proposal should be resisted to the extent of requiring clarity, consideration of alternatives, and sufficient justification. > yes, it's a fairly small benefit over a package on PyPi, [spelling corrected] So why move *this* code? The clash with flake8 is an issue between the package and flake8 and is irrelevant to adding it to the stdlib. Every feature on PyPi would be more convenient for at least a few people if moved. Why specifically this package, more than a couple hundred others? Our current position is that most anything on PyPI should stay there. > but there is also virtually no downside. All changes, and especially feature additions, have a downside, as has been explained by Steven D'Aprano more than once. M.-A. Lemburg already summarized his view of the specifics for this issue. And see below. > (I'm assuming the OP (or someone) will do all the actual work of coding > and updating docs....) At least one core developer has to *volunteer* to review, likely edit or request edits, merge, and *take responsibility* for the consequences of the PR. At minimum, there is the opportunity cost of the core developer not making some other improvement, which some might see as more valuable. > Practicality Beats Purity -- and this is a practical solution. 
It is an ugly hack, which also has practical problems. Here is the full applicable quote from Tim's Zen: Special cases aren't special enough to break the rules. Although practicality beats purity. I take this to mean that normal special cases are not special enough but some special special cases are. The meta meaning is that decisions are not mechanical and require tradeoffs, and that people will honestly disagree in close cases. -- Terry Jan Reedy From breamoreboy at gmail.com Thu Feb 1 16:54:59 2018 From: breamoreboy at gmail.com (Mark Lawrence) Date: Thu, 1 Feb 2018 21:54:59 +0000 Subject: [Python-ideas] Support WHATWG versions of legacy encodings In-Reply-To: References: <20180119033907.GH22500@ando.pearwood.info> Message-ID: On 01/02/18 21:34, Terry Reedy wrote: > On 1/31/2018 6:15 PM, Chris Barker wrote: > >> I still have no idea why there is such resistance to this [spelling >> corrected] > > Every proposal should be resisted to the extent of requiring clarity, > consideration of alternatives, and sufficient justification. > >> yes, it's a fairly small benefit over a package on PyPi, [spelling >> corrected] > > So why move *this* code?? The clash with flake8 is an issue between the > package and flake8 and is irrelevant to adding it to the stdlib.? Every > feature on PyPi would be more convenient for at least a few people if > moved.? Why specifically this package, more than a couple hundred > others?? Our current position is that most anything on PyPI should stay > there. > >> but there is also virtually no downside. > > All changes, and especially feature additions, have a downside, as has > been explained by Steven D'Aprano more than once.? M.-A. Lemburg already > summarized his view of the specifics for this issue.? And see below. > >> (I'm assuming the OP (or someone) will do all the actual work of >> coding and updating docs....) > > At least one core developer has to *volunteer* to review, likely edit or > request edits, merge, and *take responsibility* for the consequences of > the PR.? At minimum, there is the opportunity cost of the core developer > not making some other improvement, which some might see as more valuable. > >> Practicality Beats Purity -- and this is a practical solution. > > It is an ugly hack, which also has practical problems. > > Here is the full applicable quote from Tim's Zen: > > Special cases aren't special enough to break the rules. > Although practicality beats purity. > > I take this to mean that normal special cases are not special enough but > some special special cases are.? The meta meaning is that decisions are > not mechanical and require tradeoffs, and that people will honestly > disagree in close cases. > I now see this entire thread as Status Quo 1, Proposal -1, so can we please move on? -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence From greg.ewing at canterbury.ac.nz Thu Feb 1 17:23:35 2018 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 02 Feb 2018 11:23:35 +1300 Subject: [Python-ideas] Format mini-language for lakh and crore In-Reply-To: References: <1f1f8906-8cf7-ff3c-087e-8f75b6399df5@trueblade.com> <7679410d-8261-33f4-54ca-a2e4d8edb09a@trueblade.com> Message-ID: <5A739367.3000708@canterbury.ac.nz> Nick Coghlan wrote: > - "," would be short for ",3," with decimal digits > - "_" would be short for "_3_" with decimal digits > - "_" would be short for "_4_" with binary/octal/hexadecimal digits Is it necessary to restrict "," to decimal? 
Why not make "," and "_" orthogonal? While "," with non-decimal might be rarely used, there would be less special cases to remember. -- Greg From steve at pearwood.info Thu Feb 1 18:26:36 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 2 Feb 2018 10:26:36 +1100 Subject: [Python-ideas] JSON encoding protocol with __json__ dunder method In-Reply-To: References: Message-ID: <20180201232636.GM26553@ando.pearwood.info> On Thu, Feb 01, 2018 at 08:45:13PM +0000, Kiss, Gy?rgy wrote: > Hi! > > > Most of the classes (even if very simple like datetime.datetime) cannot be > serialized to JSON by default. > > Would it be a good idea for the default json.JSONEncoder to call the > __json__ dunder method automatically if the object has one? > I can't find anything about why this protocol or PEP doesn't exists yet. See: http://bugs.python.org/issue27362 This has been discussed before: https://mail.python.org/pipermail/python-ideas/2010-July/007811.html https://mail.python.org/pipermail/python-ideas/2011-March/009644.html Please familiarise yourself with the objections and counter-arguments already discussed before trying to debate this again. -- Steve From ncoghlan at gmail.com Fri Feb 2 00:47:21 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 2 Feb 2018 15:47:21 +1000 Subject: [Python-ideas] Why CPython is still behind in performance for some widely used patterns ? In-Reply-To: References: Message-ID: On 2 February 2018 at 06:15, Barry Scott wrote: > > > On 30 Jan 2018, at 05:45, Nick Coghlan wrote: > > I'll also note that one of the things we (and others) *have* been > putting quite a bit of time into is the question of "Why do people > avoid using extension modules for code acceleration?". > > > I think that is simple. Those that try give up because its a difficult API > to > call correctly. > > At PYCON UK on speaker explain how she, PhD level researcher, > had failed to get the a C extension working. Aye, indeed. That's a big part of why we've never had much motivation to fill in the "How to do this by hand" instructions on https://packaging.python.org/guides/packaging-binary-extensions/: it's so hard to get the refcounting and GIL manager right by hand that it's almost never the right answer vs either using a dedicated extension-module-writing language like Cython, or writing a normal shared external library and then using a wrapper generator like cffi/SWIG/milksnake, or using a helper library like PySIP/PyCXX/Boost to do the heavy lifting for you. So while wheels and conda have helped considerably with the cross-platform end user UX of extension modules, there's still a lot of work to be done around the publisher UX, both for existing publishers (to get more tools to work the way PySIP does and allow a single wheel build to target multiple Python versions), and for new publishers (to make the various extension module authoring tools easier to discover, rather than having folks assuming that handcrafted calls directly into the CPython C API is their only option). Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Fri Feb 2 00:53:16 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 2 Feb 2018 15:53:16 +1000 Subject: [Python-ideas] Format mini-language for lakh and crore In-Reply-To: <5A739367.3000708@canterbury.ac.nz> References: <1f1f8906-8cf7-ff3c-087e-8f75b6399df5@trueblade.com> <7679410d-8261-33f4-54ca-a2e4d8edb09a@trueblade.com> <5A739367.3000708@canterbury.ac.nz> Message-ID: On 2 February 2018 at 08:23, Greg Ewing wrote: > Nick Coghlan wrote: >> >> - "," would be short for ",3," with decimal digits >> - "_" would be short for "_3_" with decimal digits >> - "_" would be short for "_4_" with binary/octal/hexadecimal digits > > > Is it necessary to restrict "," to decimal? Why not make > "," and "_" orthogonal? > > While "," with non-decimal might be rarely used, there would > be less special cases to remember. I wasn't even aware the restriction existed until this thread (it's one of those "I'd never tried it, so I didn't know it was prohibited" cases). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Fri Feb 2 01:52:39 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 2 Feb 2018 17:52:39 +1100 Subject: [Python-ideas] Support WHATWG versions of legacy encodings In-Reply-To: References: <20180119033907.GH22500@ando.pearwood.info> Message-ID: <20180202065239.GN26553@ando.pearwood.info> On Thu, Feb 01, 2018 at 10:20:00AM +0100, M.-A. Lemburg wrote: > In general, we have only added new encodings when there was an encoding > missing which a lot of people were actively using. We asked for > official documentation defining the mappings, references showing > usage and IANA or similar standard names to use for the encoding > itself and its aliases. [...] > Now the OP comes proposing to add a whole set of encodings which > only differ slightly from our existing ones. Backing is their > use and definition by WHATWG, a consortium of browser vendors > who are interested in showing web pages to users in a consistent > way. That gives us a defined mapping, references showing usage, but (alas) not standard names, due to the WHATWG's (foolish and arrogantly obnoxious, in my opinion) decision to re-use the standard names for the non-standard usages. Two out of three seems like a reasonable start to me. But one thing we haven't really discussed is, why is this an issue for Python? Everything I've seen so far suggests that these standards are only for browsers and/or web scrapers. That seems fairly niche to me. If you're writing a browser in Python, surely it isn't too much to ask that you import a set of codecs from a third party library? If I've missed something, please say so. > We also have the naming issue, since WHATWG chose to use > the same names as the standard mappings. Anything we'd > define will neither match WHATWG nor any other encoding > standard name, so we'd be creating a new set of encoding > names - which is really not what the world is after, > including WHATWG itself. I hear you, but I think this is a comparatively minor objection. I don't think it is a major problem for usability if we were to call these encodings "spam-whatwg" instead of "spam". It isn't difficult for browser authors to write: encoding = get_document_encoding() if config.USE_WHATWG_ENCODINGS: encoding += '-whatwg' or otherwise look the encodings up in a mapping. 
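[Aside: nothing stops a third-party package from registering such a name already. A rough sketch via codecs.register, which patches cp1252's decoding table at its five unassigned bytes; the assumption that this reproduces the WHATWG table holds for windows-1252 but not necessarily for other legacy encodings.]

    import codecs
    import encodings.cp1252 as cp1252

    # Replace each undefined slot (U+FFFE) with the C1 control of the same value.
    _table = "".join(
        chr(i) if ch == "\ufffe" else ch
        for i, ch in enumerate(cp1252.decoding_table)
    )
    _enc_map = codecs.charmap_build(_table)

    def _search(name):
        # codecs normalizes lookup names to lowercase with underscores.
        if name == "windows_1252_whatwg":
            return codecs.CodecInfo(
                name="windows-1252-whatwg",
                encode=lambda s, errors="strict": codecs.charmap_encode(s, errors, _enc_map),
                decode=lambda b, errors="strict": codecs.charmap_decode(b, errors, _table),
            )
        return None

    codecs.register(_search)

    print(ascii(b"\x93hi\x94 \x81".decode("windows-1252-whatwg")))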
We could even provide that mapping in the codecs module: encoding = codecs.whatwg_mapping.get(encoding, encoding) So the naming issue shouldn't be more than a minor nuisance, and one we can entirely place in the lap of the WHATWG for misusing standard names. Documentation-wise, I'd argue for placing these in a seperate sub-section of the codecs docs, with a strong notice that they should only be used for decoding web documents and not for creating new documents (except for testing purposes). > People would start creating encoded text using these new > encoding names, resulting in even more mojibake out there > instead of fixing the errors in the data and using Unicode > or UTF-8 for interchange. We can't stop people from doing that: so long as the encodings exist as a third-party package, people who really insist on creating such abominable documents can do so. Just as they currently can accidentally create mojibake in their own documents by misunderstanding encodings, or as they can create new documents using legacy encodings like MacRoman instead of UTF-8 like they should. (And very occasionally, they might even have a good reason for doing so -- while we can and should *discourage* such uses, we cannot and should not expect to prohibit them.) If it were my decision, I'd have these codecs raise a warning (not an error) when used for encoding. But I guess some people will consider that either going too far or not far enough :-) > As I mentioned before, we could disable encoding in the new > mappings to resolve this concern, but the OP wasn't interested > in such an approach. As alternative approach we proposed error > handlers, which are the normal technology to use when dealing > with encoding errors. Again, the OP wasn't interested. Be fair: it isn't that the OP (Rob Speer) merely isn't interested, he does make some reasonable arguments that error handlers are the wrong solution. He's convinced me that an error handler isn't the right way to do this. He *hasn't* convinced me that the stdlib needs to solve this problem, but if it does, I think some new encodings are the right way to do it. > Please also note that once we start adding, say > "whatwg-" encodings (or rather decodings :-), > going for the simple charmap encodings first, someone > will eventually also request addition of the more complex > Asian encodings which WHATWG defines. Maintaining these > is hard, since they require writing C code for performance > reasons and to keep the mapping tables small. YAGNI -- we can deal with that when and if it gets requested. This is not the camel's nose: adding a handful of 8-bit WHATWG encodings does not oblige us to add more. [...] > There are quite a few downsides to consider Indeed -- this isn't a "no-brainer". That's why I'm still hoping to see a fair and balanced PEP. > and since the OP > is not interested in going for a compromise as described above, > I don't see a way forward. Status quo wins a stalemate. Sometimes that's better than a broken solution that won't satisfy anyone. -- Steve From sylvain.marie at schneider-electric.com Fri Feb 2 05:26:19 2018 From: sylvain.marie at schneider-electric.com (Sylvain MARIE) Date: Fri, 2 Feb 2018 10:26:19 +0000 Subject: [Python-ideas] Dataclasses, keyword args, and inheritance In-Reply-To: References: <2a660b18-3977-2393-ef3c-02e368934c8e@trueblade.com> Message-ID: George For what?s worth if it can help for your need ? I know that this is not part of the dataclasses PEP ? 
inheritance now works with @autoclass (https://smarie.github.io/python-autoclass/ ), as demonstrated below. Note that unfortunately for mutable default field values you still have to perform self-assignment. But that?s the same problem that you have with ?standard? python. # with an unmutable default value from autoclass import autoclass @autoclass class Foo: def __init__(self, some_default: str = 'unmutable_default'): pass @autoclass class Bar(Foo): def __init__(self, other_field: int, some_default: str = 'unmutable_default'): super(Bar, self).__init__(some_default=some_default) a = Bar(2) assert a.other_field == 2 assert a.some_default == 'unmutable_default' # with a mutable default value @autoclass class Foo: def __init__(self, some_default: str = None): # you have to assign, you can not delegate to @autoclass for this :( self.some_default = some_default or ['mutable_default'] @autoclass class Bar(Foo): def __init__(self, other_field: int, some_default: str = None): super(Bar, self).__init__(some_default=some_default) a = Bar(2) assert a.other_field == 2 assert a.some_default == ['mutable_default'] By the way is there any plan or idea to allow users to provide ?default values factories (generators)? directly in the function signature, when their default value is mutable ? That could be a great improvement of the python language. I could not find anything googling around? let me know if I should create a dedicated thread to discuss this. Kind regards Sylvain De : Python-ideas [mailto:python-ideas-bounces+sylvain.marie=schneider-electric.com at python.org] De la part de Guido van Rossum Envoy? : lundi 29 janvier 2018 20:44 ? : George Leslie-Waksman Cc : Eric V. Smith ; python-ideas Objet : Re: [Python-ideas] Dataclasses, keyword args, and inheritance That's fair. Let me then qualify my statement with "in the initial release". The initial release has enough functionality to deal with without considering your rather esoteric use case. (And I consider it esoteric because attrs has apparently never seen the need to solve it either.) We can reconsider for 3.8. On Mon, Jan 29, 2018 at 11:38 AM, George Leslie-Waksman > wrote: Given I started this thread from a perspective of this is a feature that I would like because I need it, it feels a little dismissive to take attrs not having the feature to mean "there's no reason to try to implement this." On Mon, Jan 29, 2018 at 11:05 AM Guido van Rossum > wrote: I think that settles it -- there's no reason to try to implement this. 
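[Aside, for readers who want the effect today: a minimal sketch of the workaround discussed elsewhere in this thread (init=False on the subclass plus a hand-written keyword-only __init__). It assumes Python 3.7's dataclasses module or the PyPI backport, and reproduces the default factory by hand.]

    from dataclasses import dataclass, field

    @dataclass
    class Foo:
        some_default: dict = field(default_factory=dict)

    @dataclass(init=False)          # suppress the generated __init__ on the subclass
    class Bar(Foo):
        other_field: int

        def __init__(self, *, other_field, some_default=None):
            # Keyword-only parameters sidestep the "non-default argument
            # follows default argument" error; the factory default is redone here.
            self.some_default = {} if some_default is None else some_default
            self.other_field = other_field

    print(Bar(other_field=2))       # Bar(some_default={}, other_field=2)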
On Mon, Jan 29, 2018 at 10:51 AM, George Leslie-Waksman > wrote: attrs' seems to also not allow mandatory attributes to follow optional one: In [14]: @attr.s ...: class Baz: ...: a = attr.ib(default=attr.Factory(list)) ...: b = attr.ib() ...: --------------------------------------------------------------------------- ValueError Traceback (most recent call last) in () ----> 1 @attr.s 2 class Baz: 3 a = attr.ib(default=attr.Factory(list)) 4 b = attr.ib() 5 /Users/waksman/.pyenv/versions/3.6.1/envs/temp/lib/python3.6/site-packages/attr/_make.py in attrs(maybe_cls, these, repr_ns, repr, cmp, hash, init, slots, frozen, str, auto_attribs) 700 return wrap 701 else: --> 702 return wrap(maybe_cls) 703 704 /Users/waksman/.pyenv/versions/3.6.1/envs/temp/lib/python3.6/site-packages/attr/_make.py in wrap(cls) 669 raise TypeError("attrs only works with new-style classes.") 670 --> 671 builder = _ClassBuilder(cls, these, slots, frozen, auto_attribs) 672 673 if repr is True: /Users/waksman/.pyenv/versions/3.6.1/envs/temp/lib/python3.6/site-packages/attr/_make.py in __init__(self, cls, these, slots, frozen, auto_attribs) 369 370 def __init__(self, cls, these, slots, frozen, auto_attribs): --> 371 attrs, super_attrs = _transform_attrs(cls, these, auto_attribs) 372 373 self._cls = cls /Users/waksman/.pyenv/versions/3.6.1/envs/temp/lib/python3.6/site-packages/attr/_make.py in _transform_attrs(cls, these, auto_attribs) 335 "No mandatory attributes allowed after an attribute with a " 336 "default value or factory. Attribute in question: {a!r}" --> 337 .format(a=a) 338 ) 339 elif had_default is False and \ ValueError: No mandatory attributes allowed after an attribute with a default value or factory. Attribute in question: Attribute(name='b', default=NOTHING, validator=None, repr=True, cmp=True, hash=None, init=True, metadata=mappingproxy({}), type=None, converter=None) On Fri, Jan 26, 2018 at 1:44 PM Guido van Rossum > wrote: What does attrs' solution for this problem look like? On Fri, Jan 26, 2018 at 11:11 AM, George Leslie-Waksman > wrote: Even if we could inherit the setting, I would think that we would still want to require the code be explicit. It seems worse to implicitly require keyword only arguments for a class without giving any indication in the code. As it stands, the current implementation does not allow a later subclass to be declared without `keyword_only=True` so we could handle this case by adding a note to the `TypeError` message about considering the keyword_only flag. How do I got about putting together a proposal to get this into 3.8? --George On Thu, Jan 25, 2018 at 5:12 AM Eric V. Smith > wrote: I'm not completely opposed to this feature. But there are some cases to consider. Here's the first one that occurs to me: note that due to the way dataclasses work, it would need to be used everywhere down an inheritance hierarchy. That is, if an intermediate base class required it, all class derived from that intermediate base would need to specify it, too. That's because each class just makes decisions based on its fields and its base classes' fields, and not on any flags attached to the base class. As it's currently implemented, a class doesn't remember any of the decorator's arguments, so there's no way to look for this information, anyway. I think there are enough issues here that it's not going to make it in to 3.7. It would require getting a firm proposal together, selling the idea on python-dev, and completing the implementation before Monday. 
But if you want to try, I'd participate in the discussion. Taking Ivan's suggestion one step further, a way to do this currently is to pass init=False and then write another decorator that adds the kw-only __init__. So the usage would be: @dataclass class Foo: some_default: dict = field(default_factory=dict) @kw_only_init @dataclass(init=False) class Bar(Foo): other_field: int kw_only_init(cls) would look at fields(cls) and construct the __init__. It would be a hassle to re-implement dataclasses's _init_fn function, but it could be made to work (in reality, of course, you'd just copy it and hack it up to do what you want). You'd also need to use some private knowledge of InitVars if you wanted to support them (the stock fields(cls) doesn't return them). For 3.8 we can consider changing dataclasses's APIs if we want to add this. Eric. On 1/25/2018 1:38 AM, George Leslie-Waksman wrote: > It may be possible but it makes for pretty leaky abstractions and it's > unclear what that custom __init__ should look like. How am I supposed to > know what the replacement for default_factory is? > > Moreover, suppose I want one base class with an optional argument and a > half dozen subclasses each with their own required argument. At that > point, I have to write the same __init__ function a half dozen times. > > It feels rather burdensome for the user when an additional flag (say > "kw_only=True") and a modification to: > https://github.com/python/cpython/blob/master/Lib/dataclasses.py#L294 that > inserted `['*']` after `[self_name]` if the flag is specified could > ameliorate this entire issue. > > On Wed, Jan 24, 2018 at 3:22 PM Ivan Levkivskyi > >> wrote: > > It is possible to pass init=False to the decorator on the subclass > (and supply your own custom __init__, if necessary): > > @dataclass > class Foo: > some_default: dict = field(default_factory=dict) > > @dataclass(init=False) # This works > class Bar(Foo): > other_field: int > > -- > Ivan > > > > On 23 January 2018 at 03:33, George Leslie-Waksman > >> wrote: > > The proposed implementation of dataclasses prevents defining > fields with defaults before fields without defaults. This can > create limitations on logical grouping of fields and on inheritance. > > Take, for example, the case: > > @dataclass > class Foo: > some_default: dict = field(default_factory=dict) > > @dataclass > class Bar(Foo): > other_field: int > > this results in the error: > > 5 @dataclass > ----> 6 class Bar(Foo): > 7 other_field: int > 8 > > ~/.pyenv/versions/3.6.2/envs/clover_pipeline/lib/python3.6/site-packages/dataclasses.py > in dataclass(_cls, init, repr, eq, order, hash, frozen) > 751 > 752 # We're called as @dataclass, with a class. > --> 753 return wrap(_cls) > 754 > 755 > > ~/.pyenv/versions/3.6.2/envs/clover_pipeline/lib/python3.6/site-packages/dataclasses.py > in wrap(cls) > 743 > 744 def wrap(cls): > --> 745 return _process_class(cls, repr, eq, order, > hash, init, frozen) > 746 > 747 # See if we're being called as @dataclass or > @dataclass(). > > ~/.pyenv/versions/3.6.2/envs/clover_pipeline/lib/python3.6/site-packages/dataclasses.py > in _process_class(cls, repr, eq, order, hash, init, frozen) > 675 # in __init__. Use > "self" if possible. 
> 676 '__dataclass_self__' if > 'self' in fields > --> 677 else 'self', > 678 )) > 679 if repr: > > ~/.pyenv/versions/3.6.2/envs/clover_pipeline/lib/python3.6/site-packages/dataclasses.py > in _init_fn(fields, frozen, has_post_init, self_name) > 422 seen_default = True > 423 elif seen_default: > --> 424 raise TypeError(f'non-default argument > {f.name !r} ' > 425 'follows default argument') > 426 > > TypeError: non-default argument 'other_field' follows default > argument > > I understand that this is a limitation of positional arguments > because the effective __init__ signature is: > > def __init__(self, some_default: dict = , > other_field: int): > > However, keyword only arguments allow an entirely reasonable > solution to this problem: > > def __init__(self, *, some_default: dict = , > other_field: int): > > And have the added benefit of making the fields in the __init__ > call entirely explicit. > > So, I propose the addition of a keyword_only flag to the > @dataclass decorator that renders the __init__ method using > keyword only arguments: > > @dataclass(keyword_only=True) > class Bar(Foo): > other_field: int > > --George Leslie-Waksman > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -- --Guido van Rossum (python.org/~guido) -- --Guido van Rossum (python.org/~guido) -- --Guido van Rossum (python.org/~guido) ______________________________________________________________________ This email has been scanned by the Symantec Email Security.cloud service. ______________________________________________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Feb 2 15:12:46 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 2 Feb 2018 12:12:46 -0800 Subject: [Python-ideas] Support WHATWG versions of legacy encodings In-Reply-To: References: <20180119033907.GH22500@ando.pearwood.info> Message-ID: On Thu, Feb 1, 2018 at 1:34 PM, Terry Reedy wrote: > On 1/31/2018 6:15 PM, Chris Barker wrote: > > I still have no idea why there is such resistance to this [spelling >> corrected] >> > > M.-A. Lemburg already summarized his view of the specifics for this > issue. And see below. Thanks for that, I know I phrased it in no-very-open-for-discussion way, but that was what I was looking for. Frankly, I disagree with much of that, but it's been clearly layed out, which is what is needed to make a decision. > (I'm assuming the OP (or someone) will do all the actual work of coding >> and updating docs....) >> > > At least one core developer has to *volunteer* to review, likely edit or > request edits, merge, and *take responsibility* for the consequences of the > PR. Fair enough -- it would be quite reasonable to say that this (or anything) wont get included u less a core dev decides it worth his/her time to bring it in -- but that is different than saying it won't be brought in regardless. 
> I take this to mean that normal special cases are not special enough but > some special special cases are. The meta meaning is that decisions are not > mechanical and require tradeoffs, and that people will honestly disagree in > close cases. yup -- there seems to be much resistance, and not much support -- so I guess we're done. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From leewangzhong+python at gmail.com Sat Feb 3 17:04:48 2018 From: leewangzhong+python at gmail.com (Franklin? Lee) Date: Sat, 3 Feb 2018 17:04:48 -0500 Subject: [Python-ideas] Complicate str methods In-Reply-To: References: Message-ID: Let s be a str. I propose to allow these existing str methods to take params in new forms. s.replace(old, new): Allow passing in a collection of olds. Allow passing in a single argument, a mapping of olds to news. Allow the olds in the mapping to be tuples of strings. s.split(sep), s.rsplit, s.partition: Allow sep to be a collection of separators. s.startswith, s.endswith: Allow argument to be a collection of strings. s.find, s.index, s.count, x in s: Similar. These methods are also in `list`, which can't distinguish between items, subsequences, and subsets. However, `str` is already inconsistent with `list` here: list.M looks for an item, while str.M looks for a subsequence. s.[r|l]strip: Sadly, these functions already interpret their str arguments as collections of characters. These new forms can be optimized internally, as a search for multiple candidate substrings can be more efficient than searching for one at a time. See https://stackoverflow.com/questions/3260962/algorithm-to-find-multiple-string-matches The most significant change is on .replace. The others are simple enough to simulate with a loop or something. It is harder to make multiple simultaneous replacements using one .replace at a time, because previous replacements can form new things that look like replaceables. The easiest Python solution is to use regex or install some package, which uses (if you're lucky) regex or (if unlucky) doesn't simulate simultaneous replacements. (If possible, just use str.translate.) I suppose .split on multiple separators is also annoying to simulate. The two-argument form of .split may be even more of a burden, though I don't know when a limited multiple-separator split is useful. The current best solution is, like before, to use regex, or install a package and hope for the best. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Sat Feb 3 18:43:19 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 3 Feb 2018 18:43:19 -0500 Subject: [Python-ideas] Complicate str methods In-Reply-To: References: Message-ID: On 2/3/2018 5:04 PM, Franklin? Lee wrote: > Let s be a str. I propose to allow these existing str methods to take > params in new forms. Thanks for the honest title. As you sort of indicate, these can all be done with re module. However, you imply loops are needed besides, which is mostly not true. Your complications mostly translate to existing calls and hence are not needed. Perhaps 'Regular Expression HOWTO' could use more examples, or even a section on generalizing string methinds. 
Perhaps the string method doc needs a suggestion to use re for multiple
string args and references to the re howto. Please consider taking a
look at both.

>>> import re

> s.replace(old, new):
>     Allow passing in a collection of olds.

>>> re.sub('Franklin|Lee', 'user', 'Franklin? Lee')
'user? user'

Remembering the name change is a nuisance.

>     Allow passing in a single argument, a mapping of olds to news.

This needs to be a separate function, say 'dictsub', that joins the keys
with '|' and calls re.sub with a function that does the lookup as the 2nd
parameter. This would be a nice example for the howto.

As you noted, this is a generalization of str.translate, and might be
proposed as a new re module function.

>     Allow the olds in the mapping to be tuples of strings.

A minor addition to dictsub.

> s.split(sep), s.rsplit, s.partition:
>     Allow sep to be a collection of separators.

re.split is already more flexible than non-whitespace str.split and
str.partition combined.

>>> re.split('a|e|i|o|u', 'Franklin? Lee')
['Fr', 'nkl', 'n? L', '', '']
>>> re.split('(a|e|i|o|u)', 'Franklin? Lee')  # multiple partition
['Fr', 'a', 'nkl', 'i', 'n? L', 'e', '', 'e', '']
>>> re.split('(a|e|i|o|u)', 'Franklin? Lee', 1)  # single partition
['Fr', 'a', 'nklin? Lee']

re.split, and hence str.rsplit(collection), are very sensible.

> s.startswith, s.endswith:
>     Allow argument to be a collection of strings.

bool(re.match('|'.join(strings), s)) does exactly the proposed
s.startswith, with the advantage that the actual match is available, and
I think that one would nearly always want to know that match.

>>> re.match('a|e|i|o|u', 'Franklin? Lee')
>>> re.match('f|F', 'Franklin? Lee')

re.search with '^' at the beginning or '$' at the end covers both
proposals, with the added flexibility of using MULTILINE mode to match at
the beginning or end of lines within the string.

> s.find, s.index, s.count, x in s:
>     Similar.
>     These methods are also in `list`, which can't distinguish between
> items, subsequences, and subsets. However, `str` is already inconsistent
> with `list` here: list.M looks for an item, while str.M looks for a
> subsequence.

Comments above apply. re.search tells you which string matched as well
as where. bool(re.search) is 'x in s'. re.findall and re.finditer give
much more info than merely a count ('sum(bool(re.finditer))').

> s.[r|l]strip:
>     Sadly, these functions already interpret their str arguments as
> collections of characters.

To avoid this, use re.sub with ^ or $ anchor and '' replacement.

>>> re.sub('(Frank|Lee)$', '', 'Franklin? Lee')
'Franklin? '

> These new forms can be optimized internally, as a search for multiple
> candidate substrings can be more efficient than searching for one at a
> time.

This is what re does with 's1|s2|...|sn' patterns.

> https://stackoverflow.com/questions/3260962/algorithm-to-find-multiple-string-matches
>
> The most significant change is on .replace. The others are simple enough
> to simulate with a loop or something.

No loops needed.

> It is harder to make multiple
> simultaneous replacements using one .replace at a time, because previous
> replacements can form new things that look like replaceables.

This problem exists for single string replacement also. The standard
solution is to not backtrack and not do overlapping replacements.

> The easiest Python solution is to use regex

My claim above is that this is sufficient for all but one case, which
should be a new function anyway.
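For reference, a minimal sketch of the 'dictsub' helper described above
(the name and exact behaviour are illustrative, not an existing re API),
assuming plain string keys:

import re

def dictsub(mapping, text):
    # Replace every key found in `text` with its value, in a single
    # left-to-right pass, so replacements are never re-scanned.
    # Longer keys first, so 'foobar' outranks 'foo'; re.escape guards
    # against regex metacharacters in the keys.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = '|'.join(map(re.escape, keys))
    return re.sub(pattern, lambda m: mapping[m.group(0)], text)

Example:

>>> dictsub({'cat': 'dog', 'dog': 'cat'}, 'cat chases dog')
'dog chases cat'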
> or install some package, which > uses (if you're lucky) regex or (if unlucky) doesn't simulate > simultaneous replacements. (If possible, just use str.translate.) > > I suppose .split on multiple separators is also annoying to simulate. > The two-argument form of .split may be even more of a burden, though I > don't know when a limited multiple-separator split is useful. The > current best solution is, like before, to use regex, or install a > package and hope for the best. -- Terry Jan Reedy From rosuav at gmail.com Sat Feb 3 18:54:53 2018 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 4 Feb 2018 10:54:53 +1100 Subject: [Python-ideas] Complicate str methods In-Reply-To: References: Message-ID: On Sun, Feb 4, 2018 at 10:43 AM, Terry Reedy wrote: > On 2/3/2018 5:04 PM, Franklin? Lee wrote: >> s.startswith, s.endswith: >> Allow argument to be a collection of strings. > > > bool(re.match('|'.join(strings)) does exactly the proposed s.startswith, > with the advantage that the actual match is available, and I think that one > would nearly always want to know that match. > >>>> re.match('a|e|i|o|u', 'Franklin? Lee') >>>> re.match('f|F', 'Franklin? Lee') > Picking up this one as an example, but this applies to all of them: the transformation you're giving here is dangerously flawed. If there are any regex special characters in the strings, this will either bomb with an exception, or silently do the wrong thing. The correct way to do it is (at least, I think it is): re.match("|".join(map(re.escape, strings)), testme) With that gotcha lurking in the wings, I think this should not be cavalierly dismissed with "just 'import re' and be done with it". Maybe put some recipes in the docs showing how to do this safely? ChrisA From steve at pearwood.info Sat Feb 3 20:09:19 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 4 Feb 2018 12:09:19 +1100 Subject: [Python-ideas] Complicate str methods In-Reply-To: References: Message-ID: <20180204010919.GS26553@ando.pearwood.info> On Sun, Feb 04, 2018 at 10:54:53AM +1100, Chris Angelico wrote: > Picking up this one as an example, but this applies to all of them: > the transformation you're giving here is dangerously flawed. If there > are any regex special characters in the strings, this will either bomb > with an exception, or silently do the wrong thing. The correct way to > do it is (at least, I think it is): > > re.match("|".join(map(re.escape, strings)), testme) > > With that gotcha lurking in the wings, I think this should not be > cavalierly dismissed with "just 'import re' and be done with it". Indeed. This is not Perl and "just use a regex" is not a close fit to the culture of Python. Regexes are a completely separate mini-language, and one which is the opposite of Pythonic. Instead of "executable pseudo-code", regexes are excessively terse and cryptic once you get past the simple examples. Doing anything complicated using regexes is painful. Even Larry Wall has criticised regex syntax for choosing poor defaults and information density. (Rarely used symbols get a single character, while frequently needed symbols are coded as multiple characters, so Perlish syntax has the worst of both worlds: too terse for casual users, too verbose for experts, hard to maintain for everyone.) Any serious programmer should have at least a passing familiarity with regexes. They are ubiquitous, and useful, especially as a common mini-language for user-specified searching. 
But I consider regexes to be the fall-back for when Python doesn't support the kind of string matching operation I need, not the primary solution. I would never write: re.match('start', text) re.search('spam', text) when text.startswith('start') text.find('spam') will do. I think this proposal to add more power to the string methods is worth some serious consideration. -- Steve From leewangzhong+python at gmail.com Sat Feb 3 22:36:44 2018 From: leewangzhong+python at gmail.com (Franklin? Lee) Date: Sat, 3 Feb 2018 22:36:44 -0500 Subject: [Python-ideas] Complicate str methods In-Reply-To: References: Message-ID: On Sat, Feb 3, 2018 at 6:43 PM, Terry Reedy wrote: > On 2/3/2018 5:04 PM, Franklin? Lee wrote: >> >> Let s be a str. I propose to allow these existing str methods to take >> params in new forms. > > > Thanks for the honest title. As you sort of indicate, these can all be done > with re module. However, you imply loops are needed besides, which is > mostly not true. Your complications mostly translate to existing calls and > hence are not needed. My point wasn't that they _needed_ loops, but that they were simple enough to write (correctly) using loops, while simultaneous multiple .replace isn't. In any case, due to the backtracking algorithm used by re, a real specialized implementation can be much more efficient for the functions I listed, and will add value. > Perhaps 'Regular Expression HOWTO' could use more examples, or even a > section on generalizing string methinds. Perhaps the string method doc > needs suggestion to use re for multiple string args and references to the re > howto. Please consider taking a look at both. Nah, I'm fine. I can take the time to write inefficient simulations, or find them online. I proposed this not because I myself need it in the language, but because I think it's useful for many, yet easy to get wrong. It's the same with `sort`. In my design notes for my own toolkit, I also allow regular expressions as "olds. I didn't add that feature, since it'd cause a dependency. >>>> import re > >> s.replace(old, new): >> Allow passing in a collection of olds. > > >>>> re.sub('Franklin|Lee', 'user', 'Franklin? Lee') > 'user? user' > > Remembering the name change is a nuisance Three issues: 1. As Chris Angelico pointed out, the input strings need to be escaped. 2. The input strings should also be reverse sorted by length. Longer strings should be higher priority than shorter strings, or else 'foobar' will never outbid 'foo'. 3. The substitution needs to be escaped (which is probably why it has a different name), using repl.replace('\\', r'\\'). As the docs note, using repl.escape here is wrong (at least before 3.7). Like I said, easy to get wrong. >> Allow passing in a single argument, a mapping of olds to news. > > > This needs to be a separate function, say 'dictsub', that joins the keys > with '|' and calls re.sub with a function that does the lookup as the 2nd > parameter. This would be a nice example for the howto. > > As you noted, this is generalization of str.translate, and might be proposed > as a new re module function. If this function is added to re, it should also allow patterns as keys, and we may also want to add re.compile(collection_of_strings_and_patterns). >> These new forms can be optimized internally, as a search for multiple >> candidate substrings can be more efficient than searching for one at a time. > > > This is what re does with 's1|s2|...|sn' patterns. Really? re generally uses backtracking, not a DFA. 
Timing indicates that it is NOT using an efficient algorithm.

import re

def findsub(haystack, needles, map=map, any=any, plusone=1 .__add__):
    return any(map(plusone, map(haystack.find, needles)))

def findsub_re(haystack, needles, bool=bool, map=map, find=re.search, esc=re.escape):
    pattern = '|'.join(map(esc, needles))
    return bool(find(pattern, haystack))

def findsub_re_cached(haystack, regex, bool=bool):
    return bool(regex.search(haystack))

n = 1000
haystack = 'a'*n + 'b'
needles = ['a'*i + 'bb' for i in range(1, n)]
needles = sorted(needles, key=len, reverse=True)
regex = re.compile('|'.join(map(re.escape, needles)))

%timeit findsub(haystack, needles)
#=> 1.81 ms ± 25.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit findsub_re(haystack, needles)
#=> 738 ms ± 39.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit findsub_re_cached(haystack, regex)
#=> 680 ms ± 14.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Here, the naive approach using basic string operations is over 300x
faster than the re approach. Even a simple pure Python tree approach is
faster.

END = ''

# Tree-based multiple string searches.
def make_tree(needles):
    """Creates a tree such that each node maps characters to values.

    The tree is not optimized.
    """
    root = {}
    for needle in needles:
        cur = root
        for c in needle:
            try:
                cur = cur[c]
            except KeyError:
                cur[c] = cur = {}  #curious
        cur[END] = needle
    return root

def findsub_tree(haystack, needles):
    tree = make_tree(needles)
    nodes = [tree]
    for c in haystack:
        nodes = [n[c] for n in nodes if c in n]  #NOT OPTIMAL
        if any(END in n for n in nodes):
            return True
        nodes.append(tree)
    return False

%timeit findsub_tree(haystack, needles)
#=> 95.1 ms ± 2.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Precomputing the tree saves about 80 ms of that time.

>> It is harder to make multiple simultaneous replacements using one .replace
>> at a time, because previous replacements can form new things that look like
>> replaceables.
>
> This problem exists for single string replacement also. The standard
> solution is to not backtrack and not do overlapping replacements.

I don't know what you mean by the problem existing for single-string
replacement, nor how your solution solves the problem.

I'm talking about simulating multiple simultaneous replacements with the
.replace method. .replace doesn't see previous replacements as candidates
for replacement. However, using it to simulate multiple simultaneous
replacements will fail if the new values form something that looks like
the old values.

From ncoghlan at gmail.com  Sun Feb 4 22:01:15 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 5 Feb 2018 13:01:15 +1000
Subject: [Python-ideas] Support WHATWG versions of legacy encodings
In-Reply-To: <20180202065239.GN26553@ando.pearwood.info>
References: 
	<20180119033907.GH22500@ando.pearwood.info>
	<20180202065239.GN26553@ando.pearwood.info>
Message-ID: 

On 2 February 2018 at 16:52, Steven D'Aprano  wrote:
> If it were my decision, I'd have these codecs raise a warning (not an
> error) when used for encoding. But I guess some people will consider
> that either going too far or not far enough :-)

Rob pointed out that one of the main use cases for these codecs is
when going "Oh, this was decoded with a WHATWG encoding, which isn't
right, so I need to re-encode it with that encoding, and then decode
it with the right encoding".
So encoding is very much part of the usage model: it's needed when you've received the data over a Unicode based interface rather than a binary one. So I think the *use case* for the WHATWG encodings has been pretty well established. What hasn't been established is whether our answer to "How do I handle the WHATWG encodings?" is going to be: * "Here they are in the standard library (for 3.8+)!"; or * "These are available as part of the 'ftfy' library on PyPI, which also helps fixes various other problems in decoded text" Personally, I think a See Also note pointing to ftfy in the "codecs" module documentation would be quite a reasonable outcome of the thread - when it comes to consuming arbitrary data from the internet and cleaning up decoding issues, ftfy's data introspection based approach is likely to be far easier to start with than characterising the common errors for specific data sources and applying them individually, and if you're already using ftfy to figure out which fixes are needed, then it shouldn't be a big deal to keep it around for the more relaxed codecs that it provides. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From storchaka at gmail.com Mon Feb 5 01:40:40 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Mon, 5 Feb 2018 08:40:40 +0200 Subject: [Python-ideas] Support WHATWG versions of legacy encodings In-Reply-To: References: <20180119033907.GH22500@ando.pearwood.info> <20180202065239.GN26553@ando.pearwood.info> Message-ID: 05.02.18 05:01, Nick Coghlan ????: > On 2 February 2018 at 16:52, Steven D'Aprano wrote: >> If it were my decision, I'd have these codecs raise a warning (not an >> error) when used for encoding. But I guess some people will consider >> that either going too far or not far enough :-) > > Rob pointed out that one of the main use cases for these codecs is > when going "Oh, this was decoded with a WHATWG encoding, which isn't > right, so I need to re-encode it with that encoding, and then decode > it with the right encoding". So encoding is very much part of the > usage model: it's needed when you've received the data over a Unicode > based interface rather than a binary one. Wasn't the "surrogateescape" error handler designed for this purpose? WHATWG encodings solve the same problem that "surrogateescape", but 1) They use different range for representing unmapped characters. 2) Not all unmapped characters can be decoded, thus a decoding is lossy, and a round-trip not always works. From p.f.moore at gmail.com Mon Feb 5 05:07:03 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 5 Feb 2018 10:07:03 +0000 Subject: [Python-ideas] Support WHATWG versions of legacy encodings In-Reply-To: References: <20180119033907.GH22500@ando.pearwood.info> <20180202065239.GN26553@ando.pearwood.info> Message-ID: On 5 February 2018 at 06:40, Serhiy Storchaka wrote: > 05.02.18 05:01, Nick Coghlan ????: >> >> On 2 February 2018 at 16:52, Steven D'Aprano wrote: >>> >>> If it were my decision, I'd have these codecs raise a warning (not an >>> error) when used for encoding. But I guess some people will consider >>> that either going too far or not far enough :-) >> >> >> Rob pointed out that one of the main use cases for these codecs is >> when going "Oh, this was decoded with a WHATWG encoding, which isn't >> right, so I need to re-encode it with that encoding, and then decode >> it with the right encoding". 
So encoding is very much part of the >> usage model: it's needed when you've received the data over a Unicode >> based interface rather than a binary one. > > > Wasn't the "surrogateescape" error handler designed for this purpose? > > WHATWG encodings solve the same problem that "surrogateescape", but > > 1) They use different range for representing unmapped characters. > 2) Not all unmapped characters can be decoded, thus a decoding is lossy, and > a round-trip not always works. Surrogateescape is for when the source of the Unicode data is also Python. The WHATWG encodings (AIUI) can be used by any tool to attempt to decode data. If that "I think this is what it is" data is passed as Unicode to Python, and the Python code determines that the guess was wrong, then re-encoding it using the WHATWG encoding lets you try again to decode it properly. The result would be lossy, yes. Whether this is a problem, I can't say, as I've never encountered the sorts of use cases being discussed here. I assume that the people advocating for this have, and consider this option, even if it's lossy, to be the best approach. For a non-stdlib based solution, I see no problem with this. If the codecs are to go into the stdlib, then I do think we should be able to document clearly what the use case is for these encodings, and why a user reading the codecs docs should pick these encodings over another one. That's where I think the proposal currently falls down - not in the usefulness of the codecs, nor in the naming (both of which seem to me to have been covered) but in providing a good enough explanation *to non-specialists* of why these codecs exist, how they should be used, and what the caveats are. Something that we'd be comfortable including in the docs. Paul From gadgetsteve at live.co.uk Mon Feb 5 03:10:55 2018 From: gadgetsteve at live.co.uk (Steve Barnes) Date: Mon, 5 Feb 2018 08:10:55 +0000 Subject: [Python-ideas] Possible Enhancement to py Launcher - set default Message-ID: When a new version of python is in alpha/beta it is often desirable to have it installed for tests but remain on a previous version for day to day use. However, currently the Windows py launcher defaults to the highest version that it finds, which means that unless you are very careful you will end up having to explicitly specify your older version every time that you start python with it once you have installed the newer version. I an thinking that it would be relatively simple to expand the current launcher functionality to allow the user to set the default version to be used. One possible syntax, echoing the way that versions are displayed with the -0 option would be to allow py -n.m* to set and store, either in the registry, environment variable or a configuration file, the desired default to be invoked by py or pyw. Personally I thing that this would encourage more people to undertake testing of new candidate releases of python. I would be interested in any feedback on the value that this might add. -- Steve (Gadget) Barnes Any opinions in this message are my personal opinions and do not reflect those of my employer. 
From p.f.moore at gmail.com Mon Feb 5 06:04:08 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 5 Feb 2018 11:04:08 +0000 Subject: [Python-ideas] Possible Enhancement to py Launcher - set default In-Reply-To: References: Message-ID: On 5 February 2018 at 08:10, Steve Barnes wrote: > When a new version of python is in alpha/beta it is often desirable to > have it installed for tests but remain on a previous version for day to > day use. > > However, currently the Windows py launcher defaults to the highest > version that it finds, which means that unless you are very careful you > will end up having to explicitly specify your older version every time > that you start python with it once you have installed the newer version. > > I an thinking that it would be relatively simple to expand the current > launcher functionality to allow the user to set the default version to > be used. > > One possible syntax, echoing the way that versions are displayed with > the -0 option would be to allow py -n.m* to set and store, either in the > registry, environment variable or a configuration file, the desired > default to be invoked by py or pyw. > > Personally I thing that this would encourage more people to undertake > testing of new candidate releases of python. > > I would be interested in any feedback on the value that this might add. There's a `py.ini` file that lets you set the default version. See https://docs.python.org/3.6/using/windows.html#customization for details. Is that just something you weren't aware of, or does it not address the issue you're having? Paul From mal at egenix.com Mon Feb 5 05:52:41 2018 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 5 Feb 2018 11:52:41 +0100 Subject: [Python-ideas] Support WHATWG versions of legacy encodings In-Reply-To: References: <20180119033907.GH22500@ando.pearwood.info> <20180202065239.GN26553@ando.pearwood.info> Message-ID: On 05.02.2018 04:01, Nick Coghlan wrote: > On 2 February 2018 at 16:52, Steven D'Aprano wrote: >> If it were my decision, I'd have these codecs raise a warning (not an >> error) when used for encoding. But I guess some people will consider >> that either going too far or not far enough :-) > > Rob pointed out that one of the main use cases for these codecs is > when going "Oh, this was decoded with a WHATWG encoding, which isn't > right, so I need to re-encode it with that encoding, and then decode > it with the right encoding". So encoding is very much part of the > usage model: it's needed when you've received the data over a Unicode > based interface rather than a binary one. So the use case for encoding into WHATWG is to undo the WHATWG mappings by then decoding using the standard mappings and using an error handler to deal with decoding issues ? This strikes me as a rather unrealistic use case, esp. since it's likely that the original decoding was also done in Python, so the much more intuitive approach to fix this problem would be to not use WHATWG encodings for the initial decoding in the first place. > So I think the *use case* for the WHATWG encodings has been pretty > well established. What hasn't been established is whether our answer > to "How do I handle the WHATWG encodings?" 
is going to be: > > * "Here they are in the standard library (for 3.8+)!"; or > * "These are available as part of the 'ftfy' library on PyPI, which > also helps fixes various other problems in decoded text" > > Personally, I think a See Also note pointing to ftfy in the "codecs" > module documentation would be quite a reasonable outcome of the thread > - when it comes to consuming arbitrary data from the internet and > cleaning up decoding issues, ftfy's data introspection based approach > is likely to be far easier to start with than characterising the > common errors for specific data sources and applying them > individually, and if you're already using ftfy to figure out which > fixes are needed, then it shouldn't be a big deal to keep it around > for the more relaxed codecs that it provides. I think we've been going around in circles long enough. Let's leave things as they are and perhaps a section to the codecs documentation, as you suggest, where to find other encodings which a user might want to use and tools to help with fixing encoding or decoding errors. Here's a random list from PyPI with some packages: https://pypi.python.org/pypi/ebcdic/ https://pypi.python.org/pypi/latexcodec/ https://pypi.python.org/pypi/mysql-latin1-codec/ https://pypi.python.org/pypi/cbmcodecs/ Perhaps fun variants such as: https://pypi.python.org/pypi/emoji-encoding/ -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Feb 05 2018) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ ________________________________________________________________________ ::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/ From storchaka at gmail.com Mon Feb 5 06:39:31 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Mon, 5 Feb 2018 13:39:31 +0200 Subject: [Python-ideas] Support WHATWG versions of legacy encodings In-Reply-To: References: <20180119033907.GH22500@ando.pearwood.info> <20180202065239.GN26553@ando.pearwood.info> Message-ID: 05.02.18 12:52, M.-A. Lemburg ????: > Let's leave things as they are and perhaps a section to the codecs > documentation, as you suggest, where to find other encodings which > a user might want to use and tools to help with fixing encoding or > decoding errors. > > Here's a random list from PyPI with some packages: > https://pypi.python.org/pypi/ebcdic/ > https://pypi.python.org/pypi/latexcodec/ > https://pypi.python.org/pypi/mysql-latin1-codec/ > https://pypi.python.org/pypi/cbmcodecs/ > > Perhaps fun variants such as: > https://pypi.python.org/pypi/emoji-encoding/ But first than add references to third-party packages we should examine them. Check that they are compatible with recent versions of Python, do what they are stated, and don't contain malicious code. From mal at egenix.com Mon Feb 5 06:44:38 2018 From: mal at egenix.com (M.-A. 
Lemburg) Date: Mon, 5 Feb 2018 12:44:38 +0100 Subject: [Python-ideas] Support WHATWG versions of legacy encodings In-Reply-To: References: <20180119033907.GH22500@ando.pearwood.info> <20180202065239.GN26553@ando.pearwood.info> Message-ID: <5cecde65-8161-83c5-3236-b0ac1bfa3f62@egenix.com> On 05.02.2018 12:39, Serhiy Storchaka wrote: > 05.02.18 12:52, M.-A. Lemburg ????: >> Let's leave things as they are and perhaps a section to the codecs >> documentation, as you suggest, where to find other encodings which >> a user might want to use and tools to help with fixing encoding or >> decoding errors. >> >> Here's a random list from PyPI with some packages: >> https://pypi.python.org/pypi/ebcdic/ >> https://pypi.python.org/pypi/latexcodec/ >> https://pypi.python.org/pypi/mysql-latin1-codec/ >> https://pypi.python.org/pypi/cbmcodecs/ >> >> Perhaps fun variants such as: >> https://pypi.python.org/pypi/emoji-encoding/ > > But first than add references to third-party packages we should examine > them. Check that they are compatible with recent versions of Python, do > what they are stated, and don't contain malicious code. Sure. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Feb 05 2018) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ ________________________________________________________________________ ::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/ From gadgetsteve at live.co.uk Mon Feb 5 06:10:33 2018 From: gadgetsteve at live.co.uk (Steve Barnes) Date: Mon, 5 Feb 2018 11:10:33 +0000 Subject: [Python-ideas] Possible Enhancement to py Launcher - set default In-Reply-To: References: Message-ID: On 05/02/2018 11:04, Paul Moore wrote: > On 5 February 2018 at 08:10, Steve Barnes wrote: >> When a new version of python is in alpha/beta it is often desirable to >> have it installed for tests but remain on a previous version for day to >> day use. >> >> However, currently the Windows py launcher defaults to the highest >> version that it finds, which means that unless you are very careful you >> will end up having to explicitly specify your older version every time >> that you start python with it once you have installed the newer version. >> >> I an thinking that it would be relatively simple to expand the current >> launcher functionality to allow the user to set the default version to >> be used. >> >> One possible syntax, echoing the way that versions are displayed with >> the -0 option would be to allow py -n.m* to set and store, either in the >> registry, environment variable or a configuration file, the desired >> default to be invoked by py or pyw. >> >> Personally I thing that this would encourage more people to undertake >> testing of new candidate releases of python. >> >> I would be interested in any feedback on the value that this might add. > > There's a `py.ini` file that lets you set the default version. 
See
> https://docs.python.org/3.6/using/windows.html#customization for
> details. Is that just something you weren't aware of, or does it not
> address the issue you're having?
>
> Paul
>

Paul,

That was something that I was not aware of & covers my use case nicely.

Steve
-- 
Steve (Gadget) Barnes
Any opinions in this message are my personal opinions and do not reflect
those of my employer.

From lists at janc.be  Mon Feb 5 10:47:16 2018
From: lists at janc.be (Jan Claeys)
Date: Mon, 05 Feb 2018 16:47:16 +0100
Subject: [Python-ideas] Possible Enhancement to py Launcher - set default
In-Reply-To: 
References: 
Message-ID: <1517845636.5933.5.camel@janc.be>

On Mon, 2018-02-05 at 11:04 +0000, Paul Moore wrote:
> On 5 February 2018 at 08:10, Steve Barnes 
> wrote:
> > When a new version of python is in alpha/beta it is often desirable
> > to have it installed for tests but remain on a previous version for
> > day to day use.
> >
> > However, currently the Windows py launcher defaults to the highest
> > version that it finds, which means that unless you are very careful
> > you will end up having to explicitly specify your older version
> > every time that you start python with it once you have installed
> > the newer version.
> >
> [...]
>
> There's a `py.ini` file that lets you set the default version. See
> https://docs.python.org/3.6/using/windows.html#customization for
> details. Is that just something you weren't aware of, or does it not
> address the issue you're having?

Maybe the Windows installer should offer to set/change that, especially
when installing a non-release version?


-- 
Jan Claeys

From leewangzhong+python at gmail.com  Mon Feb 5 20:17:53 2018
From: leewangzhong+python at gmail.com (Franklin? Lee)
Date: Mon, 5 Feb 2018 20:17:53 -0500
Subject: [Python-ideas] Possible Enhancement to py Launcher - set default
In-Reply-To: 
References: 
Message-ID: 

On Mon, Feb 5, 2018 at 6:04 AM, Paul Moore  wrote:
> On 5 February 2018 at 08:10, Steve Barnes  wrote:
>> When a new version of python is in alpha/beta it is often desirable to
>> have it installed for tests but remain on a previous version for day to
>> day use.
>>
>> However, currently the Windows py launcher defaults to the highest
>> version that it finds, which means that unless you are very careful you
>> will end up having to explicitly specify your older version every time
>> that you start python with it once you have installed the newer version.
>>
>> I an thinking that it would be relatively simple to expand the current
>> launcher functionality to allow the user to set the default version to
>> be used.
>>
>> One possible syntax, echoing the way that versions are displayed with
>> the -0 option would be to allow py -n.m* to set and store, either in the
>> registry, environment variable or a configuration file, the desired
>> default to be invoked by py or pyw.
>>
>> Personally I thing that this would encourage more people to undertake
>> testing of new candidate releases of python.
>>
>> I would be interested in any feedback on the value that this might add.
>
> There's a `py.ini` file that lets you set the default version. See
> https://docs.python.org/3.6/using/windows.html#customization for
> details. Is that just something you weren't aware of, or does it not
> address the issue you're having?
I think the feature is still worth considering. Playing with .ini files should be considered a hack, while a way to change the default may be useful for many, at all skill levels. In your link, though, the recommended way to change the default is to use the environment variable %PY_PYTHON%, rather than the .ini file. But environment variables are not as key to Windows use as it is to Linux use, so the feature is still worth considering. Does the installer offer to change the default to its version, even when running it after installation? If it does, it still might be good to add an option to py.exe, because it's closest to where you'd think to want to change the setting. From turnbull.stephen.fw at u.tsukuba.ac.jp Tue Feb 6 05:10:46 2018 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Tue, 6 Feb 2018 19:10:46 +0900 Subject: [Python-ideas] Support WHATWG versions of legacy encodings In-Reply-To: References: <20180119033907.GH22500@ando.pearwood.info> <20180202065239.GN26553@ando.pearwood.info> Message-ID: <23161.32550.235210.105075@turnbull.sk.tsukuba.ac.jp> Nick Coghlan writes: > Personally, I think a See Also note pointing to ftfy in the "codecs" > module documentation would be quite a reasonable outcome of the thread Yes please. The more I hear about purported use cases (with the exception of Nathaniel's "don't crash when I manipulate the DOM" case, which is best handled by errors='surrogateescape'), the less I see anything "standard" about them. From tritium-list at sdamon.com Tue Feb 6 05:10:43 2018 From: tritium-list at sdamon.com (Alex Walters) Date: Tue, 6 Feb 2018 05:10:43 -0500 Subject: [Python-ideas] Possible Enhancement to py Launcher - set default In-Reply-To: <1517845636.5933.5.camel@janc.be> References: <1517845636.5933.5.camel@janc.be> Message-ID: <0dbc01d39f32$c04820d0$40d86270$@sdamon.com> I actually like the idea of being able to modify the py.ini file to set the default from py.exe. That seams like the most intuitive thing to me. > -----Original Message----- > From: Python-ideas [mailto:python-ideas-bounces+tritium- > list=sdamon.com at python.org] On Behalf Of Jan Claeys > Sent: Monday, February 5, 2018 10:47 AM > To: python-ideas at python.org > Subject: Re: [Python-ideas] Possible Enhancement to py Launcher - set > default > > On Mon, 2018-02-05 at 11:04 +0000, Paul Moore wrote: > > On 5 February 2018 at 08:10, Steve Barnes > > wrote: > > > When a new version of python is in alpha/beta it is often desirable > > > to have it installed for tests but remain on a previous version for > > > day to day use. > > > > > > However, currently the Windows py launcher defaults to the highest > > > version that it finds, which means that unless you are very careful > > > you will end up having to explicitly specify your older version > > > every time that you start python with it once you have installed > > > the newer version. > > > [...] > > > > There's a `py.ini` file that lets you set the default version. See > > https://docs.python.org/3.6/using/windows.html#customization for > > details. Is that just something you weren't aware of, or does it not > > address the issue you're having? > > Maybe the Windows installer should offer to set/change that, especially > when installing a non-release version? 
> > > -- > Jan Claeys > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From p.f.moore at gmail.com Tue Feb 6 06:30:35 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 6 Feb 2018 11:30:35 +0000 Subject: [Python-ideas] Possible Enhancement to py Launcher - set default In-Reply-To: <0dbc01d39f32$c04820d0$40d86270$@sdamon.com> References: <1517845636.5933.5.camel@janc.be> <0dbc01d39f32$c04820d0$40d86270$@sdamon.com> Message-ID: I'm reluctant to expand the feature set of the launcher in this direction. It's written in C, and tightly focused on being a lightweight launcher. Adding code to manage user options and persist them to the py.ini file would be a non-trivial overhead, as well as being hard to maintain (because C code and text handling :-)) It's not that hard to manage an ini file, and if anyone wants a friendlier interface, writing such a thing in Python as a standalone utility would be easy, and far more robust, flexible and maintainable than adding it to the launcher directly (you could even add a GUI if you like ;-)). Conceded, I'm saying this from the perspective of writing and maintaining the code, and not from the UX/UI perspective. If someone wants to add this feature to the launcher, I don't mind, but *personally* I don't think it's worth it. Paul On 6 February 2018 at 10:10, Alex Walters wrote: > I actually like the idea of being able to modify the py.ini file to set the > default from py.exe. That seams like the most intuitive thing to me. >> From: Python-ideas [mailto:python-ideas-bounces+tritium- >> >> Maybe the Windows installer should offer to set/change that, especially >> when installing a non-release version? From ericfahlgren at gmail.com Tue Feb 6 09:22:04 2018 From: ericfahlgren at gmail.com (Eric Fahlgren) Date: Tue, 6 Feb 2018 06:22:04 -0800 Subject: [Python-ideas] Possible Enhancement to py Launcher - set default In-Reply-To: References: <1517845636.5933.5.camel@janc.be> <0dbc01d39f32$c04820d0$40d86270$@sdamon.com> Message-ID: My only request for change would be to consolidate the various tools' behavior wrt their .ini file locations. Pip, for example, wants the file in ~/pip/pip.ini, while py.exe (on Windows) wants its py.ini in $LOCALAPPDATA. If they were all in a common location (or the same file with separate sections), that would make life a tiny bit easier. Eric On Tue, Feb 6, 2018 at 3:30 AM, Paul Moore wrote: > I'm reluctant to expand the feature set of the launcher in this > direction. It's written in C, and tightly focused on being a > lightweight launcher. Adding code to manage user options and persist > them to the py.ini file would be a non-trivial overhead, as well as > being hard to maintain (because C code and text handling :-)) It's not > that hard to manage an ini file, and if anyone wants a friendlier > interface, writing such a thing in Python as a standalone utility > would be easy, and far more robust, flexible and maintainable than > adding it to the launcher directly (you could even add a GUI if you > like ;-)). > > Conceded, I'm saying this from the perspective of writing and > maintaining the code, and not from the UX/UI perspective. If someone > wants to add this feature to the launcher, I don't mind, but > *personally* I don't think it's worth it. 
> > Paul > > On 6 February 2018 at 10:10, Alex Walters wrote: > > I actually like the idea of being able to modify the py.ini file to set > the > > default from py.exe. That seams like the most intuitive thing to me. > > >> From: Python-ideas [mailto:python-ideas-bounces+tritium- > >> > >> Maybe the Windows installer should offer to set/change that, especially > >> when installing a non-release version? > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Tue Feb 6 09:47:57 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 6 Feb 2018 14:47:57 +0000 Subject: [Python-ideas] Possible Enhancement to py Launcher - set default In-Reply-To: References: <1517845636.5933.5.camel@janc.be> <0dbc01d39f32$c04820d0$40d86270$@sdamon.com> Message-ID: There are a few different points here: 1. There's no relationship between pip and the py launcher - they are separate tools/projects. Any co-operation in terms of file locations would have to be a result of common standards. Those would normally be platform standards, not Python ones. 2. On Windows, pip.ini is in $env:APPDATA\pip, not ~/pip. Are you confusing Windows and Unix conventions? 3. $env:APPDATA and $env:LOCALAPPDATA have different functions, and the choice between the two needs to be made on a case by case basis. However, the difference between the two is subtle, and frankly is probably lost on Unix developers. So which gets used is somewhat random, in practice. But it does matter, in certain environments. I *think* the different usages here are correct (although on the systems I work on, the distinction doesn't matter in practice so I can't confirm that). 4. Python projects tend to actually be *better* at following Windows platform conventions than other applications (which often use the Unix convention of putting stuff under ~) in my experience. What looks like inconsistency is sometimes (not in the case of py vs pip, admittedly) just people transferring expectations from one platform to another (or worse, transferring expectations from Unix programs naively ported to Windows over to other Windows programs). 5. Windows history for "where you should store your application config" is a mess - inconsistencies, changes in recommendations, and use cases not catered for, abound. So even in the ideal situation, what is right now was probably wrong 5 years ago. And will likely be wrong 5 years from now (although we can hope...) But +1 on a world where config data all gets stored consistently. Oh, and can I have a pony? :-) Paul On 6 February 2018 at 14:22, Eric Fahlgren wrote: > My only request for change would be to consolidate the various tools' > behavior wrt their .ini file locations. Pip, for example, wants the file in > ~/pip/pip.ini, while py.exe (on Windows) wants its py.ini in $LOCALAPPDATA. > If they were all in a common location (or the same file with separate > sections), that would make life a tiny bit easier. > > Eric > > On Tue, Feb 6, 2018 at 3:30 AM, Paul Moore wrote: >> >> I'm reluctant to expand the feature set of the launcher in this >> direction. It's written in C, and tightly focused on being a >> lightweight launcher. 
Adding code to manage user options and persist >> them to the py.ini file would be a non-trivial overhead, as well as >> being hard to maintain (because C code and text handling :-)) It's not >> that hard to manage an ini file, and if anyone wants a friendlier >> interface, writing such a thing in Python as a standalone utility >> would be easy, and far more robust, flexible and maintainable than >> adding it to the launcher directly (you could even add a GUI if you >> like ;-)). >> >> Conceded, I'm saying this from the perspective of writing and >> maintaining the code, and not from the UX/UI perspective. If someone >> wants to add this feature to the launcher, I don't mind, but >> *personally* I don't think it's worth it. >> >> Paul >> >> On 6 February 2018 at 10:10, Alex Walters wrote: >> > I actually like the idea of being able to modify the py.ini file to set >> > the >> > default from py.exe. That seams like the most intuitive thing to me. >> >> >> From: Python-ideas [mailto:python-ideas-bounces+tritium- >> >> >> >> Maybe the Windows installer should offer to set/change that, especially >> >> when installing a non-release version? >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > From ericfahlgren at gmail.com Tue Feb 6 10:23:28 2018 From: ericfahlgren at gmail.com (Eric Fahlgren) Date: Tue, 6 Feb 2018 07:23:28 -0800 Subject: [Python-ideas] Possible Enhancement to py Launcher - set default In-Reply-To: References: <1517845636.5933.5.camel@janc.be> <0dbc01d39f32$c04820d0$40d86270$@sdamon.com> Message-ID: On Tue, Feb 6, 2018 at 6:47 AM, Paul Moore wrote: > There are a few different points here: > > 1. There's no relationship between pip and the py launcher - they are > separate tools/projects. Any co-operation in terms of file locations > would have to be a result of common standards. Those would normally be > platform standards, not Python ones. > ?Right, different planets, but orbiting the same star. I was thinking about the consolidation of the Windows registry layout a year or two ago, don't recall who spearheaded that (Steve Dower?). In any case, if the various tools either followed that convention, or we came up with an ini-based one that was consistent with it and usable on Unix (.pyconf or something)... 2. On Windows, pip.ini is in $env:APPDATA\pip, not ~/pip. Are you > confusing Windows and Unix conventions? > ?Yeah, our Windows dev environment uses Cygwin, so I'm constantly confused. :) Here's where I see py.exe looking for its ini file (first $LOCALAPPDATA then in $SystemRoot): > setenv PYLAUNCH_DEBUG 1 > py.exe launcher build: 32bit launcher executable: Console Using local configuration file 'C:\Users\efahlgren\AppData\Local\py.ini' File 'C:\Windows\py.ini' non-existent Not sure how to make pip cough up similar verbose output, but when it started complaining about legacy formats, I just followed its directions and this works: > ll $USERPROFILE/pip/pip.ini -rw-r--r-- efahlgren 2017-04-30 15:51 'C:/Users/efahlgren/pip/pip.ini' Eric -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From p.f.moore at gmail.com Tue Feb 6 10:44:45 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 6 Feb 2018 15:44:45 +0000 Subject: [Python-ideas] Possible Enhancement to py Launcher - set default In-Reply-To: References: <1517845636.5933.5.camel@janc.be> <0dbc01d39f32$c04820d0$40d86270$@sdamon.com> Message-ID: On 6 February 2018 at 15:23, Eric Fahlgren wrote: > Right, different planets, but orbiting the same star. I was thinking about > the consolidation of the Windows registry layout a year or two ago, don't > recall who spearheaded that (Steve Dower?). In any case, if the various > tools either followed that convention, or we came up with an ini-based one > that was consistent with it and usable on Unix (.pyconf or something)... Yep, that would be an informational PEP, defining standards we expect Python applications to follow. There's a lot more Python *applications* than there are Python *distributions*, and I'm not convinced a standard for applications would get much traction (even ignoring the need they'd have for backward compatibility) but if someone wants to try to get consensus on something, then have fun! Actually, the `appdirs` project (https://pypi.python.org/pypi/appdirs) does exactly this - provides a portable interface for applications to store config data in platform-specific locations. The correct answer is probably to persuade application developers to use appdirs rather than their own schemes. Pip and py both use appdirs-compatible schemes (py doesn't use appdirs itself, as it's not written in Python, but pip does). pip: appdirs.user_config_dir('pip', appauthor=False, roaming=True) py: appdirs.user_config_dir() You could argue that appdirs offers too many options - but if all applications used appdirs, you could have that debate once with the appdirs authors, rather than having to persuade every application in turn. > Yeah, our Windows dev environment uses Cygwin, so I'm constantly confused. > :) Yuk, Cygwin. I'll refrain from commenting further :-) > Not sure how to make pip cough up similar verbose output, but when it > started complaining about legacy formats, I just followed its directions and > this works: > >> ll $USERPROFILE/pip/pip.ini > -rw-r--r-- efahlgren 2017-04-30 15:51 'C:/Users/efahlgren/pip/pip.ini' Backward compatibility. When we moved to the Windows-standard location, we left in fallbacks to the old locations. I've no idea whether pip sees Cygwin as Windows-like or Unix-like, so anything could be going on beyond that. Paul From stephanh42 at gmail.com Tue Feb 6 14:07:21 2018 From: stephanh42 at gmail.com (Stephan Houben) Date: Tue, 6 Feb 2018 20:07:21 +0100 Subject: [Python-ideas] Possible Enhancement to py Launcher - set default In-Reply-To: References: <1517845636.5933.5.camel@janc.be> <0dbc01d39f32$c04820d0$40d86270$@sdamon.com> Message-ID: Hi all, Just want to point out that if tools can accept a config file in a cross-platform standard location (perhaps in addition to a platform-specific one), then that is incredibly useful. Just search on Github for "dotfiles", and see how many people store their configuration in a git repo, so they can just go to a fresh machine, "git clone dotfiles" in their $HOME and be fully set up. A platform-independent location also simplifies life for us who use Linux, Windows and MacOs on a daily basis. (And then there is the fact that these platform "standards" are only followed very haphazardly anyway, e.g. Windows Vim uses ~\.vimrc on Windows and not somewhere in $LOCALAPPDATA .) 
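For reference, a quick illustration of the appdirs calls Paul mentioned
(this assumes the third-party appdirs package is installed; the paths in
the comments are typical examples, not guaranteed):

import appdirs

# Per-user config directory for an app named "pip", roaming profile on
# Windows -- this is how pip derives its location.
print(appdirs.user_config_dir('pip', appauthor=False, roaming=True))
# e.g. C:\Users\me\AppData\Roaming\pip on Windows,
#      /home/me/.config/pip under XDG on Linux

# With no app name, just the base per-user config directory, which is
# roughly where the py launcher looks for py.ini.
print(appdirs.user_config_dir())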
As an aside, I don't agree with the "appdirs" package on Linux: XDG != Linux. That may seem pedantry, but while ~/.local is perhaps not too bad for user-specific config, /etc/xdg is almost certainly the wrong location for global config any application that is not part of a desktop environment. (In fact, such a directory may not exist on a typical headless Linux install.) Stephan 2018-02-06 16:44 GMT+01:00 Paul Moore : > On 6 February 2018 at 15:23, Eric Fahlgren wrote: > > Right, different planets, but orbiting the same star. I was thinking > about > > the consolidation of the Windows registry layout a year or two ago, don't > > recall who spearheaded that (Steve Dower?). In any case, if the various > > tools either followed that convention, or we came up with an ini-based > one > > that was consistent with it and usable on Unix (.pyconf or something)... > > Yep, that would be an informational PEP, defining standards we expect > Python applications to follow. There's a lot more Python > *applications* than there are Python *distributions*, and I'm not > convinced a standard for applications would get much traction (even > ignoring the need they'd have for backward compatibility) but if > someone wants to try to get consensus on something, then have fun! > > Actually, the `appdirs` project (https://pypi.python.org/pypi/appdirs) > does exactly this - provides a portable interface for applications to > store config data in platform-specific locations. The correct answer > is probably to persuade application developers to use appdirs rather > than their own schemes. > > Pip and py both use appdirs-compatible schemes (py doesn't use appdirs > itself, as it's not written in Python, but pip does). > > pip: appdirs.user_config_dir('pip', appauthor=False, roaming=True) > py: appdirs.user_config_dir() > > You could argue that appdirs offers too many options - but if all > applications used appdirs, you could have that debate once with the > appdirs authors, rather than having to persuade every application in > turn. > > > Yeah, our Windows dev environment uses Cygwin, so I'm constantly > confused. > > :) > > Yuk, Cygwin. I'll refrain from commenting further :-) > > > Not sure how to make pip cough up similar verbose output, but when it > > started complaining about legacy formats, I just followed its directions > and > > this works: > > > >> ll $USERPROFILE/pip/pip.ini > > -rw-r--r-- efahlgren 2017-04-30 15:51 'C:/Users/efahlgren/pip/pip.ini' > > Backward compatibility. When we moved to the Windows-standard > location, we left in fallbacks to the old locations. I've no idea > whether pip sees Cygwin as Windows-like or Unix-like, so anything > could be going on beyond that. > > Paul > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gadgetsteve at live.co.uk Tue Feb 6 14:42:38 2018 From: gadgetsteve at live.co.uk (Steve Barnes) Date: Tue, 6 Feb 2018 19:42:38 +0000 Subject: [Python-ideas] Possible Enhancement to py Launcher - set default In-Reply-To: References: <1517845636.5933.5.camel@janc.be> <0dbc01d39f32$c04820d0$40d86270$@sdamon.com>, Message-ID: A thought for you might be to consider an option to just produce, (on request), a template config file with the default settings and commented out options then display the path to the user. 
This would fit in with other tools that I have come across while keeping the configuration options that the C code recognises in with the code rather than in a manual or web page that can get out of step without the complexity of being able to set and store options from within the tool itself. It is also somewhat pythonic in that the options and their documentation being in the code fits in well with pythons self documenting features. ________________________________ From: Python-ideas on behalf of Paul Moore Sent: 06 February 2018 11:30 To: Alex Walters Cc: Python-Ideas Subject: Re: [Python-ideas] Possible Enhancement to py Launcher - set default I'm reluctant to expand the feature set of the launcher in this direction. It's written in C, and tightly focused on being a lightweight launcher. Adding code to manage user options and persist them to the py.ini file would be a non-trivial overhead, as well as being hard to maintain (because C code and text handling :-)) It's not that hard to manage an ini file, and if anyone wants a friendlier interface, writing such a thing in Python as a standalone utility would be easy, and far more robust, flexible and maintainable than adding it to the launcher directly (you could even add a GUI if you like ;-)). Conceded, I'm saying this from the perspective of writing and maintaining the code, and not from the UX/UI perspective. If someone wants to add this feature to the launcher, I don't mind, but *personally* I don't think it's worth it. Paul On 6 February 2018 at 10:10, Alex Walters wrote: > I actually like the idea of being able to modify the py.ini file to set the > default from py.exe. That seams like the most intuitive thing to me. >> From: Python-ideas [mailto:python-ideas-bounces+tritium- >> >> Maybe the Windows installer should offer to set/change that, especially >> when installing a non-release version? _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.python.org%2Fmailman%2Flistinfo%2Fpython-ideas&data=02%7C01%7C%7C7699d0d7669c43d7c1a608d56d5515f7%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636535134571322756&sdata=KzRBDOor7TVYLAvvEza2kr%2BIKifdMOgEwATN%2BQngFyo%3D&reserved=0 Code of Conduct: https://eur02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fpython.org%2Fpsf%2Fcodeofconduct%2F&data=02%7C01%7C%7C7699d0d7669c43d7c1a608d56d5515f7%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636535134571322756&sdata=1ZrUqumcn4c69EGmEbQMOxL30AM%2BrYkSZSVrxBT5X7E%3D&reserved=0 -------------- next part -------------- An HTML attachment was scrubbed... URL: From rspeer at luminoso.com Tue Feb 6 18:26:38 2018 From: rspeer at luminoso.com (Rob Speer) Date: Tue, 06 Feb 2018 23:26:38 +0000 Subject: [Python-ideas] Support WHATWG versions of legacy encodings In-Reply-To: <23161.32550.235210.105075@turnbull.sk.tsukuba.ac.jp> References: <20180119033907.GH22500@ando.pearwood.info> <20180202065239.GN26553@ando.pearwood.info> <23161.32550.235210.105075@turnbull.sk.tsukuba.ac.jp> Message-ID: By now, it sounds right to me that I should implement these codecs in a package. I accept that I've established the use case, but not sufficiently established why it belongs in Python. The package can easily be ftfy -- although I should point out that what's in ftfy at the moment isn't quite right! 
"ftfy.bad_codecs" implements the "fall back on Latin-1" idea that many people here have intuitively suggested, because I was implementing it just based on the evidence of text I saw; I didn't know at the time that there was an actual standard involved. The result differs subtly from what Web browsers do in cases outside the C1 range. But of course I can work on re-implementing the encodings correctly based on what I've learned. I think it would be best if these encodings were actually implemented in the "webencodings" package, or in a package that both ftfy and webencodings could use. I have certainly encountered cases in web scraping where, because webencodings doesn't use the same Windows-1252 as the actual web does, I have had to decode the text even more incorrectly using Latin-1 and _then_ run it through ftfy -- in effect, adding a layer of mojibake so I can fix two layers of mojibake. That's kind of absurd and it's why I thought this belonged in Python itself. But I'll talk to the webencodings author instead. On Tue, 6 Feb 2018 at 05:12 Stephen J. Turnbull < turnbull.stephen.fw at u.tsukuba.ac.jp> wrote: > Nick Coghlan writes: > > > Personally, I think a See Also note pointing to ftfy in the "codecs" > > module documentation would be quite a reasonable outcome of the thread > > Yes please. The more I hear about purported use cases (with the > exception of Nathaniel's "don't crash when I manipulate the DOM" case, > which is best handled by errors='surrogateescape'), the less I see > anything "standard" about them. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tritium-list at sdamon.com Wed Feb 7 00:22:51 2018 From: tritium-list at sdamon.com (Alex Walters) Date: Wed, 7 Feb 2018 00:22:51 -0500 Subject: [Python-ideas] Possible Enhancement to py Launcher - set default In-Reply-To: References: <1517845636.5933.5.camel@janc.be> <0dbc01d39f32$c04820d0$40d86270$@sdamon.com> Message-ID: <0fcc01d39fd3$b3d3e2c0$1b7ba840$@sdamon.com> > -----Original Message----- > From: Paul Moore [mailto:p.f.moore at gmail.com] > Sent: Tuesday, February 6, 2018 6:31 AM > To: Alex Walters > Cc: Jan Claeys ; Python-Ideas > Subject: Re: [Python-ideas] Possible Enhancement to py Launcher - set > default > > I'm reluctant to expand the feature set of the launcher in this > direction. It's written in C, and tightly focused on being a > lightweight launcher. Adding code to manage user options and persist > them to the py.ini file would be a non-trivial overhead, as well as > being hard to maintain (because C code and text handling :-)) It's not > that hard to manage an ini file, and if anyone wants a friendlier > interface, writing such a thing in Python as a standalone utility > would be easy, and far more robust, flexible and maintainable than > adding it to the launcher directly (you could even add a GUI if you > like ;-)). > How would you feel about such a management script being shipped with python? > Conceded, I'm saying this from the perspective of writing and > maintaining the code, and not from the UX/UI perspective. If someone > wants to add this feature to the launcher, I don't mind, but > *personally* I don't think it's worth it. 
> > Paul > > On 6 February 2018 at 10:10, Alex Walters wrote: > > I actually like the idea of being able to modify the py.ini file to set the > > default from py.exe. That seams like the most intuitive thing to me. > > >> From: Python-ideas [mailto:python-ideas-bounces+tritium- > >> > >> Maybe the Windows installer should offer to set/change that, especially > >> when installing a non-release version? From tritium-list at sdamon.com Wed Feb 7 00:36:29 2018 From: tritium-list at sdamon.com (Alex Walters) Date: Wed, 7 Feb 2018 00:36:29 -0500 Subject: [Python-ideas] Possible Enhancement to py Launcher - set default In-Reply-To: References: Message-ID: <0fce01d39fd5$9b23c720$d16b5560$@sdamon.com> While this thread has focused on the location and means of managing py.ini, I think there is a more general solution that should be considered to the original problem, as described. The problem isn't that it's difficult or non-obvious how to set the default python for the py.exe launcher (that's a possible documentation issue, and I argue a minor tooling problem), the problem is that the launcher, by default, selects the latest version of python as the default, regardless of that python's release status. Without looking at the C code (I haven't but should), I don't think it would be too difficult to teach py.exe to not auto-select dev, alpha, or beta versions of python without being told explicitly to do so. For example (for the archives, this is written in February 2018, when 3.7 is in its beta), on a system with 3.6 and 3.7 installed... py.exe myfile.py REM should run 3.6, unless shebang overrides py.exe -3.7 myfile.py REM should run 3.7 beta And after 3.7.0 final is released and installed on said system, py.exe myfile.py should run 3.7. Is this difficult to implement? Is this a bad idea? > -----Original Message----- > From: Python-ideas [mailto:python-ideas-bounces+tritium- > list=sdamon.com at python.org] On Behalf Of Steve Barnes > Sent: Monday, February 5, 2018 3:11 AM > To: Python-Ideas > Subject: [Python-ideas] Possible Enhancement to py Launcher - set default > > When a new version of python is in alpha/beta it is often desirable to > have it installed for tests but remain on a previous version for day to > day use. > > However, currently the Windows py launcher defaults to the highest > version that it finds, which means that unless you are very careful you > will end up having to explicitly specify your older version every time > that you start python with it once you have installed the newer version. > > I an thinking that it would be relatively simple to expand the current > launcher functionality to allow the user to set the default version to > be used. > > One possible syntax, echoing the way that versions are displayed with > the -0 option would be to allow py -n.m* to set and store, either in the > registry, environment variable or a configuration file, the desired > default to be invoked by py or pyw. > > Personally I thing that this would encourage more people to undertake > testing of new candidate releases of python. > > I would be interested in any feedback on the value that this might add. > > -- > Steve (Gadget) Barnes > Any opinions in this message are my personal opinions and do not reflect > those of my employer. 
> _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From p.f.moore at gmail.com Wed Feb 7 04:14:37 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 7 Feb 2018 09:14:37 +0000 Subject: [Python-ideas] Possible Enhancement to py Launcher - set default In-Reply-To: <0fce01d39fd5$9b23c720$d16b5560$@sdamon.com> References: <0fce01d39fd5$9b23c720$d16b5560$@sdamon.com> Message-ID: On 7 February 2018 at 05:36, Alex Walters wrote: > While this thread has focused on the location and means of managing py.ini, > I think there is a more general solution that should be considered to the > original problem, as described. The problem isn't that it's difficult or > non-obvious how to set the default python for the py.exe launcher (that's a > possible documentation issue, and I argue a minor tooling problem), the > problem is that the launcher, by default, selects the latest version of > python as the default, regardless of that python's release status. Without > looking at the C code (I haven't but should), I don't think it would be too > difficult to teach py.exe to not auto-select dev, alpha, or beta versions of > python without being told explicitly to do so. > > For example (for the archives, this is written in February 2018, when 3.7 is > in its beta), on a system with 3.6 and 3.7 installed... > > py.exe myfile.py REM should run 3.6, unless shebang overrides > py.exe -3.7 myfile.py REM should run 3.7 beta > > And after 3.7.0 final is released and installed on said system, py.exe > myfile.py should run 3.7. > > Is this difficult to implement? Is this a bad idea? IMO the biggest technical issue with this is that as far as I can see PEP 514 doesn't specify a way to determine if a given Python is a pre-release version. If we do want to implement this (I'm +0 on it, personally) then I think the starting point would need to be an update to PEP 514 to include that data. Paul From tritium-list at sdamon.com Wed Feb 7 09:57:11 2018 From: tritium-list at sdamon.com (Alex Walters) Date: Wed, 7 Feb 2018 09:57:11 -0500 Subject: [Python-ideas] Possible Enhancement to py Launcher - set default In-Reply-To: References: <0fce01d39fd5$9b23c720$d16b5560$@sdamon.com> Message-ID: <104e01d3a023$efb3bdb0$cf1b3910$@sdamon.com> > -----Original Message----- > From: Paul Moore [mailto:p.f.moore at gmail.com] > Sent: Wednesday, February 7, 2018 4:15 AM > To: Alex Walters > Cc: Steve Barnes ; Python-Ideas ideas at python.org> > Subject: Re: [Python-ideas] Possible Enhancement to py Launcher - set > default > ... > > IMO the biggest technical issue with this is that as far as I can see > PEP 514 doesn't specify a way to determine if a given Python is a > pre-release version. If we do want to implement this (I'm +0 on it, > personally) then I think the starting point would need to be an update > to PEP 514 to include that data. > > Paul Looking at pep 514, it looks like sys.winver is what would have to change to support reporting the release status to the registry. I don't think 514 has to change at all if sys.winver changes. Is that a correct interpretation? 
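To make that gap concrete: a helper (or the launcher) that wants to answer
"is this registration a pre-release?" from the registry alone has to
improvise something like the sketch below. The PythonCore key layout and the
optional Version value are what PEP 514 describes, but the suffix test at the
end is pure guesswork on my part, which is exactly the missing piece Paul is
pointing at.

    # Rough sketch only: list the CPython registrations under HKEY_CURRENT_USER
    # and guess pre-release status from the Version string.  PEP 514 defines no
    # pre-release flag, so the suffix check is only a heuristic.
    import re
    import winreg

    def registered_pythons(root=winreg.HKEY_CURRENT_USER):
        # Per-machine installs live under HKEY_LOCAL_MACHINE instead; ignored here.
        with winreg.OpenKey(root, r"Software\Python\PythonCore") as core:
            index = 0
            while True:
                try:
                    tag = winreg.EnumKey(core, index)
                except OSError:
                    return  # no more subkeys
                index += 1
                with winreg.OpenKey(core, tag) as key:
                    try:
                        version, _ = winreg.QueryValueEx(key, "Version")
                    except OSError:
                        version = tag  # Version is optional; fall back to the tag
                yield tag, version

    for tag, version in registered_pythons():
        prerelease = bool(re.search(r"(?:a|b|rc)\d*$", version))
        print(tag, version, "pre-release:", prerelease)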
From erik.m.bray at gmail.com Wed Feb 7 10:03:38 2018 From: erik.m.bray at gmail.com (Erik Bray) Date: Wed, 7 Feb 2018 16:03:38 +0100 Subject: [Python-ideas] importlib: making FileFinder easier to extend Message-ID: Hello, Brief problem statement: Let's say I have a custom file type (say, with extension .foo) and these .foo files are included in a package (along with other Python modules with standard extensions like .py and .so), and I want to make these .foo files importable like any other module. On its face, importlib.machinery.FileFinder makes this easy. I make a loader for my custom file type (say, FooSourceLoader), and I can use the FileFinder.path_hook helper like: sys.path_hooks.insert(0, FileFinder.path_hook((FooSourceLoader, ['.foo']))) sys.path_importer_cache.clear() Great--now I can import my .foo modules like any other Python module. However, any standard Python modules now cannot be imported. The way PathFinder sys.meta_path hook works, sys.path_hooks entries are first-come-first-serve, and furthermore FileFinder.path_hook is very promiscuous--it will take over module loading for *any* directory on sys.path, regardless what the file extensions are in that directory. So although this mechanism is provided by the stdlib, it can't really be used for this purpose without breaking imports of normal modules (and maybe it's not intended for that purpose, but the documentation is unclear). There are a number of different ways one could get around this. One might be to pass FileFinder.path_hook loaders/extension pairs for all the basic file types known by the Python interpreter. Unfortunately there's no great way to get that information. *I* know that I want to support .py, .pyc, .so etc. files, and I know which loaders to use for them. But that's really information that should belong to the Python interpreter, and not something that should be reverse-engineered. In fact, there is such a mapping provided by importlib.machinery._get_supported_file_loaders(), but this is not a publicly documented function. One could probably think of other workarounds. For example you could implement a custom sys.meta_path hook. But I think it shouldn't be necessary to go to higher levels of abstraction in order to do this--the default sys.path handler should be able to handle this use case. In order to support adding support for new file types to sys.path_hooks, I ended up implementing the following hack: ############################################################# import os import sys from importlib.abc import PathEntryFinder @PathEntryFinder.register class MetaFileFinder: """ A 'middleware', if you will, between the PathFinder sys.meta_path hook, and sys.path_hooks hooks--particularly FileFinder. The hook returned by FileFinder.path_hook is rather 'promiscuous' in that it will handle *any* directory. So if one wants to insert another FileFinder.path_hook into sys.path_hooks, that will totally take over importing for any directory, and previous path hooks will be ignored. This class provides its own sys.path_hooks hook as follows: If inserted on sys.path_hooks (it should be inserted early so that it can supersede anything else). Its find_spec method then calls each hook on sys.path_hooks after itself and, for each hook that can handle the given sys.path entry, it calls the hook to create a finder, and calls that finder's find_spec. So each sys.path_hooks entry is tried until a spec is found or all finders are exhausted. 
""" def __init__(self, path): if not os.path.isdir(path): raise ImportError('only directories are supported', path=path) self.path = path self._finder_cache = {} def __repr__(self): return '{}({!r})'.format(self.__class__.__name__, self.path) def find_spec(self, fullname, target=None): if not sys.path_hooks: return None for hook in sys.path_hooks: if hook is self.__class__: continue finder = None try: if hook in self._finder_cache: finder = self._finder_cache[hook] if finder is None: # We've tried this finder before and got an ImportError continue except TypeError: # The hook is unhashable pass if finder is None: try: finder = hook(self.path) except ImportError: pass try: self._finder_cache[hook] = finder except TypeError: # The hook is unhashable for some reason so we don't bother # caching it pass if finder is not None: spec = finder.find_spec(fullname, target) if spec is not None: return spec # Module spec not found through any of the finders return None def invalidate_caches(self): for finder in self._finder_cache.values(): finder.invalidate_caches() @classmethod def install(cls): sys.path_hooks.insert(0, cls) sys.path_importer_cache.clear() ############################################################# This works, for example, like: >>> MetaFileFinder.install() >>> sys.path_hooks.append(FileFinder.path_hook((SourceFileLoader, ['.foo']))) And now, .foo modules are importable, without breaking support for the built-in module types. This is still overkill though. I feel like there should instead be a way to, say, extend a sys.path_hooks hook based on FileFinder so as to be able to support loading other file types, without having to go above the default sys.meta_path hooks. A small, but related problem I noticed in the way FileFinder.path_hook is implemented, is that for almost *every directory* that gets cached in sys.path_importer_cache, a new FileFinder instance is created with its own self._loaders attribute, each containing a copy of the same list of (loader, extensions) tuples. I calculated that on one large project this alone accounted for nearly 1 MB. Not a big deal in the grand scheme of things, but still a bit overkill. ISTM it would kill two birds with one stone if FileFinder were changed, or there were a subclass thereof, that had a class attribute containing the standard loader/extension mappings. This in turn could simply be appended to in order to support new extension types. Thanks, E From p.f.moore at gmail.com Wed Feb 7 10:27:29 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 7 Feb 2018 15:27:29 +0000 Subject: [Python-ideas] Possible Enhancement to py Launcher - set default In-Reply-To: <104e01d3a023$efb3bdb0$cf1b3910$@sdamon.com> References: <0fce01d39fd5$9b23c720$d16b5560$@sdamon.com> <104e01d3a023$efb3bdb0$cf1b3910$@sdamon.com> Message-ID: I don't think so. As an example, what registry keys would Anaconda write to say that Release 5.2.1.7 is a pre-release version? Or would the py launcher have to parse the version looking for rc/a/b/... tags? And distributions would have to agree on how they record pre-release version numbers? Paul On 7 February 2018 at 14:57, Alex Walters wrote: > > >> -----Original Message----- >> From: Paul Moore [mailto:p.f.moore at gmail.com] >> Sent: Wednesday, February 7, 2018 4:15 AM >> To: Alex Walters >> Cc: Steve Barnes ; Python-Ideas > ideas at python.org> >> Subject: Re: [Python-ideas] Possible Enhancement to py Launcher - set >> default >> > ... 
>> >> IMO the biggest technical issue with this is that as far as I can see >> PEP 514 doesn't specify a way to determine if a given Python is a >> pre-release version. If we do want to implement this (I'm +0 on it, >> personally) then I think the starting point would need to be an update >> to PEP 514 to include that data. >> >> Paul > > Looking at pep 514, it looks like sys.winver is what would have to change to support reporting the release status to the registry. I don't think 514 has to change at all if sys.winver changes. Is that a correct interpretation? > From steve.dower at python.org Wed Feb 7 14:35:29 2018 From: steve.dower at python.org (Steve Dower) Date: Wed, 7 Feb 2018 11:35:29 -0800 Subject: [Python-ideas] Possible Enhancement to py Launcher - set default In-Reply-To: References: <0fce01d39fd5$9b23c720$d16b5560$@sdamon.com> <104e01d3a023$efb3bdb0$cf1b3910$@sdamon.com> Message-ID: Checking the Version (!=SysVersion) property should be enough (and perhaps we need to set it properly on install). The launcher currently only works with PythonCore entries anyway, so no need to worry about other distros. PEP 514 allows for other keys to be added as well (it specifies a minimum set), so we could just set one for this. ?NoDefaultLaunch? or similar. Finally, if someone created a script for setting py.ini, it could probably be included in the Tools directory. Wouldn?t be run on install or get a start menu shortcut though, just to set expectations right. Top-posted from my Windows phone From: Paul Moore Sent: Wednesday, February 7, 2018 7:37 To: Alex Walters Cc: Python-Ideas Subject: Re: [Python-ideas] Possible Enhancement to py Launcher - set default I don't think so. As an example, what registry keys would Anaconda write to say that Release 5.2.1.7 is a pre-release version? Or would the py launcher have to parse the version looking for rc/a/b/... tags? And distributions would have to agree on how they record pre-release version numbers? Paul On 7 February 2018 at 14:57, Alex Walters wrote: > > >> -----Original Message----- >> From: Paul Moore [mailto:p.f.moore at gmail.com] >> Sent: Wednesday, February 7, 2018 4:15 AM >> To: Alex Walters >> Cc: Steve Barnes ; Python-Ideas > ideas at python.org> >> Subject: Re: [Python-ideas] Possible Enhancement to py Launcher - set >> default >> > ... >> >> IMO the biggest technical issue with this is that as far as I can see >> PEP 514 doesn't specify a way to determine if a given Python is a >> pre-release version. If we do want to implement this (I'm +0 on it, >> personally) then I think the starting point would need to be an update >> to PEP 514 to include that data. >> >> Paul > > Looking at pep 514, it looks like sys.winver is what would have to change to support reporting the release status to the registry. I don't think 514 has to change at all if sys.winver changes. Is that a correct interpretation? > _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mistersheik at gmail.com Wed Feb 7 16:49:23 2018 From: mistersheik at gmail.com (Neil Girdhar) Date: Wed, 7 Feb 2018 13:49:23 -0800 (PST) Subject: [Python-ideas] Consider generalizing Decimal to support arbitrary radix Message-ID: <262f05d6-8f5a-4b0f-8afd-c09375b20926@googlegroups.com> Arbitrary radix comes up every now and then and Decimal already has a radix() method. It would be nice when initializing a Decimal object to be able to specify an arbitrary radix>=2. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Wed Feb 7 16:53:25 2018 From: mistersheik at gmail.com (Neil Girdhar) Date: Wed, 7 Feb 2018 13:53:25 -0800 (PST) Subject: [Python-ideas] Complicate str methods In-Reply-To: <20180204010919.GS26553@ando.pearwood.info> References: <20180204010919.GS26553@ando.pearwood.info> Message-ID: On Saturday, February 3, 2018 at 8:10:38 PM UTC-5, Steven D'Aprano wrote: > > On Sun, Feb 04, 2018 at 10:54:53AM +1100, Chris Angelico wrote: > > > Picking up this one as an example, but this applies to all of them: > > the transformation you're giving here is dangerously flawed. If there > > are any regex special characters in the strings, this will either bomb > > with an exception, or silently do the wrong thing. The correct way to > > do it is (at least, I think it is): > > > > re.match("|".join(map(re.escape, strings)), testme) > > > > With that gotcha lurking in the wings, I think this should not be > > cavalierly dismissed with "just 'import re' and be done with it". > > Indeed. > > This is not Perl and "just use a regex" is not a close fit to the > culture of Python. > > Regexes are a completely separate mini-language, and one which is the > opposite of Pythonic. Instead of "executable pseudo-code", regexes are > excessively terse and cryptic once you get past the simple examples. > Doing anything complicated using regexes is painful. > > Even Larry Wall has criticised regex syntax for choosing poor defaults > and information density. (Rarely used symbols get a single character, > while frequently needed symbols are coded as multiple characters, so > Perlish syntax has the worst of both worlds: too terse for casual users, > too verbose for experts, hard to maintain for everyone.) > > Any serious programmer should have at least a passing familiarity with > regexes. They are ubiquitous, and useful, especially as a common > mini-language for user-specified searching. > > But I consider regexes to be the fall-back for when Python doesn't > support the kind of string matching operation I need, not the primary > solution. I would never write: > > re.match('start', text) > re.search('spam', text) > > when > > text.startswith('start') > text.find('spam') > > will do. I think this proposal to add more power to the string methods > is worth some serious consideration. > Completely agree with the sentiment. I don't know about this proposal, but complicated regular expressions are never a good solution even when they are the best solution. > > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python... at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mistersheik at gmail.com Wed Feb 7 16:47:45 2018 From: mistersheik at gmail.com (Neil Girdhar) Date: Wed, 7 Feb 2018 13:47:45 -0800 (PST) Subject: [Python-ideas] Consider making Decimal's context use PEP 567 Message-ID: Decimal could just pull its Context object from a context variable rather than having to pass it in to all functions. This would be similar to how numpy works. -------------- next part -------------- An HTML attachment was scrubbed... URL: From python-ideas at mgmiller.net Wed Feb 7 16:57:12 2018 From: python-ideas at mgmiller.net (Mike Miller) Date: Wed, 7 Feb 2018 13:57:12 -0800 Subject: [Python-ideas] Complicate str methods In-Reply-To: References: Message-ID: <2d12c7ac-d1f3-bcd9-c6c4-9c41b195459c@mgmiller.net> +1 I have the need for one or two of these in every project (of a certain size) and have to come up with solutions each time with the re module or a loop. Not a fan of regex's for trivial tasks, or those that require a real parser. On 2018-02-03 14:04, Franklin? Lee wrote: From rosuav at gmail.com Wed Feb 7 16:56:06 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 8 Feb 2018 08:56:06 +1100 Subject: [Python-ideas] Consider generalizing Decimal to support arbitrary radix In-Reply-To: <262f05d6-8f5a-4b0f-8afd-c09375b20926@googlegroups.com> References: <262f05d6-8f5a-4b0f-8afd-c09375b20926@googlegroups.com> Message-ID: On Thu, Feb 8, 2018 at 8:49 AM, Neil Girdhar wrote: > Arbitrary radix comes up every now and then and Decimal already has a > radix() method. It would be nice when initializing a Decimal object to be > able to specify an arbitrary radix>=2. > The radix method always returns 10, because decimal.Decimal always operates in base 10. Are you looking for a way to change the way arithmetic is done, or are you looking for a way to construct a Decimal from a string of digits and an arbitrary base (the way int("...", x) does)? ChrisA From mistersheik at gmail.com Wed Feb 7 17:07:41 2018 From: mistersheik at gmail.com (Neil Girdhar) Date: Wed, 07 Feb 2018 22:07:41 +0000 Subject: [Python-ideas] Consider generalizing Decimal to support arbitrary radix In-Reply-To: References: <262f05d6-8f5a-4b0f-8afd-c09375b20926@googlegroups.com> Message-ID: I wanted to have something like a binary floating point number like 0.11011. Ideally, it would be as simple as Decimal('0.11011', radix=2). Best, Neil On Wed, Feb 7, 2018 at 5:02 PM Chris Angelico wrote: > On Thu, Feb 8, 2018 at 8:49 AM, Neil Girdhar > wrote: > > Arbitrary radix comes up every now and then and Decimal already has a > > radix() method. It would be nice when initializing a Decimal object to > be > > able to specify an arbitrary radix>=2. > > > > The radix method always returns 10, because decimal.Decimal always > operates in base 10. Are you looking for a way to change the way > arithmetic is done, or are you looking for a way to construct a > Decimal from a string of digits and an arbitrary base (the way > int("...", x) does)? > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/twWEvFwahaQ/unsubscribe. 
> To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Wed Feb 7 17:08:50 2018 From: mistersheik at gmail.com (Neil Girdhar) Date: Wed, 07 Feb 2018 22:08:50 +0000 Subject: [Python-ideas] Consider generalizing Decimal to support arbitrary radix In-Reply-To: References: <262f05d6-8f5a-4b0f-8afd-c09375b20926@googlegroups.com> Message-ID: Oh, and to answer your specific question, I want to change the way arithmetic is done. I want it to be done in a different radix. On Wed, Feb 7, 2018 at 5:07 PM Neil Girdhar wrote: > I wanted to have something like a binary floating point number like > 0.11011. Ideally, it would be as simple as Decimal('0.11011', radix=2). > > Best, > > Neil > > On Wed, Feb 7, 2018 at 5:02 PM Chris Angelico wrote: > >> On Thu, Feb 8, 2018 at 8:49 AM, Neil Girdhar >> wrote: >> > Arbitrary radix comes up every now and then and Decimal already has a >> > radix() method. It would be nice when initializing a Decimal object to >> be >> > able to specify an arbitrary radix>=2. >> > >> >> The radix method always returns 10, because decimal.Decimal always >> operates in base 10. Are you looking for a way to change the way >> arithmetic is done, or are you looking for a way to construct a >> Decimal from a string of digits and an arbitrary base (the way >> int("...", x) does)? >> >> ChrisA >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> -- >> >> --- >> You received this message because you are subscribed to a topic in the >> Google Groups "python-ideas" group. >> To unsubscribe from this topic, visit >> https://groups.google.com/d/topic/python-ideas/twWEvFwahaQ/unsubscribe. >> To unsubscribe from this group and all its topics, send an email to >> python-ideas+unsubscribe at googlegroups.com. >> For more options, visit https://groups.google.com/d/optout. >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Wed Feb 7 17:26:49 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 8 Feb 2018 00:26:49 +0200 Subject: [Python-ideas] Complicate str methods In-Reply-To: References: Message-ID: 04.02.18 00:04, Franklin? Lee ????: > Let s be a str. I propose to allow these existing str methods to take > params in new forms. > > s.replace(old, new): > ? ? Allow passing in a collection of olds. > ? ? Allow passing in a single argument, a mapping of olds to news. > ? ? Allow the olds in the mapping to be tuples of strings. > > s.split(sep), s.rsplit, s.partition: > ? ? Allow sep to be a collection of separators. > > s.startswith, s.endswith: > ? ? Allow argument to be a collection of strings. > > s.find, s.index, s.count, x in s: > ? ? Similar. > ? ? These methods are also in `list`, which can't distinguish between > items, subsequences, and subsets. However, `str` is already inconsistent > with `list` here: list.M looks for an item, while str.M looks for a > subsequence. > > s.[r|l]strip: > ? ? Sadly, these functions already interpret their str arguments as > collections of characters. The name of complicated str methods is regular expressions. 
For doing these operations efficiently you need to convert arguments in special optimized form. This is what re.compile() does. If make a compilation on every invocation of a str method, this will add too large overhead and kill performance. Even for simple string search a regular expression can be more efficient than a str method. $ ./python -m timeit -s 'import re; p = re.compile("spam"); s = "spa"*100+"m"' -- 'p.search(s)' 500000 loops, best of 5: 680 nsec per loop $ ./python -m timeit -s 's = "spa"*100+"m"' -- 's.find("spam")' 200000 loops, best of 5: 1.09 usec per loop From njs at pobox.com Wed Feb 7 17:32:21 2018 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 7 Feb 2018 14:32:21 -0800 Subject: [Python-ideas] Consider making Decimal's context use PEP 567 In-Reply-To: References: Message-ID: On Feb 7, 2018 1:54 PM, "Neil Girdhar" wrote: Decimal could just pull its Context object from a context variable rather than having to pass it in to all functions. This would be similar to how numpy works. Decimal has always used a thread local context the same way numpy does, and in 3.7 it's switching to use a PEP 567 context: https://bugs.python.org/issue32630 -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Wed Feb 7 17:36:43 2018 From: mistersheik at gmail.com (Neil Girdhar) Date: Wed, 07 Feb 2018 22:36:43 +0000 Subject: [Python-ideas] Consider making Decimal's context use PEP 567 In-Reply-To: References: Message-ID: Wow, that's awesome! I didn't notice that when I checked. It seemed like context had to be passed in. If it were me, I would probably deprecate those context=None arguments now that we have such a clean solution. On Wed, Feb 7, 2018 at 5:32 PM Nathaniel Smith wrote: > On Feb 7, 2018 1:54 PM, "Neil Girdhar" wrote: > > Decimal could just pull its Context object from a context variable rather > than having to pass it in to all functions. This would be similar to how > numpy works. > > > Decimal has always used a thread local context the same way numpy does, > and in 3.7 it's switching to use a PEP 567 context: > > https://bugs.python.org/issue32630 > > -n > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Wed Feb 7 17:50:54 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 8 Feb 2018 09:50:54 +1100 Subject: [Python-ideas] Consider generalizing Decimal to support arbitrary radix In-Reply-To: References: <262f05d6-8f5a-4b0f-8afd-c09375b20926@googlegroups.com> Message-ID: <20180207225054.GB26553@ando.pearwood.info> On Wed, Feb 07, 2018 at 10:08:50PM +0000, Neil Girdhar wrote: > Oh, and to answer your specific question, I want to change the way > arithmetic is done. I want it to be done in a different radix. Why? There are clear advantages to floating point arithmetic done in base 2 (speed, minimum possible rounding error, least amount of wobble), and a different advantage to floating point done in base 10 (matches exactly the standard decimal notation used by humans with no conversion error), Outside of those two bases, arithmetic done in any other base is going to combine the worst of both: - slower; - larger errors when converting from decimal numbers (in general); - larger rounding errors; - larger wobble; with no corresponding advantages unless your data is coming to you in arbitrary bases. Doing floating point arithmetic in decimal is already slower and less accurate than doing it in binary. 
I'd like to hear more about your use-case for doing it in base 19 or base 7, say, but I would have to guess that it is likely to be such a niche use-case that this functionality doesn't belong in the standard library. -- Steve From mistersheik at gmail.com Wed Feb 7 18:08:52 2018 From: mistersheik at gmail.com (Neil Girdhar) Date: Wed, 07 Feb 2018 23:08:52 +0000 Subject: [Python-ideas] Consider generalizing Decimal to support arbitrary radix In-Reply-To: <20180207225054.GB26553@ando.pearwood.info> References: <262f05d6-8f5a-4b0f-8afd-c09375b20926@googlegroups.com> <20180207225054.GB26553@ando.pearwood.info> Message-ID: On Wed, Feb 7, 2018 at 5:52 PM Steven D'Aprano wrote: > On Wed, Feb 07, 2018 at 10:08:50PM +0000, Neil Girdhar wrote: > > > Oh, and to answer your specific question, I want to change the way > > arithmetic is done. I want it to be done in a different radix. > > Why? > > There are clear advantages to floating point arithmetic done in base 2 > (speed, minimum possible rounding error, least amount of wobble), and a > different advantage to floating point done in base 10 (matches exactly > the standard decimal notation used by humans with no conversion error), > Outside of those two bases, arithmetic done in any other base is going > to combine the worst of both: > > - slower; > - larger errors when converting from decimal numbers (in general); > - larger rounding errors; > - larger wobble; > I don't see why it would have any of those problems. Base 10 isn't special in any way. > > with no corresponding advantages unless your data is coming to you in > arbitrary bases. > Right, I was playing with this problem ( https://brilliant.org/weekly-problems/2017-10-02/advanced/?problem=no-computer-needed) and wanted to work in base 2. I realize it's niche, but it's not exactly a significant change to the interface even if it's a big change to the implementation. > > Doing floating point arithmetic in decimal is already slower and less > accurate than doing it in binary. I'd like to hear more about your > use-case for doing it in base 19 or base 7, say, but I would have to > guess that it is likely to be such a niche use-case that this > functionality doesn't belong in the standard library. > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/twWEvFwahaQ/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From p.f.moore at gmail.com Wed Feb 7 18:31:32 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 7 Feb 2018 23:31:32 +0000 Subject: [Python-ideas] Possible Enhancement to py Launcher - set default In-Reply-To: <5a7b550b.56b3500a.be314.1ceeSMTPIN_ADDED_MISSING@mx.google.com> References: <0fce01d39fd5$9b23c720$d16b5560$@sdamon.com> <104e01d3a023$efb3bdb0$cf1b3910$@sdamon.com> <5a7b550b.56b3500a.be314.1ceeSMTPIN_ADDED_MISSING@mx.google.com> Message-ID: On 7 February 2018 at 19:35, Steve Dower wrote: > Checking the Version (!=SysVersion) property should be enough (and perhaps > we need to set it properly on install). The launcher currently only works > with PythonCore entries anyway, so no need to worry about other distros. Fair enough. But there was a separate proposal to make the launcher handle non-PythonCore cases - there's a risk of conflicting feature requests here :-) > PEP 514 allows for other keys to be added as well (it specifies a minimum > set), so we could just set one for this. ?NoDefaultLaunch? or similar. Sure - it could be purely a launcher convention, rather than having to be specifically noted in PEP 514. Paul From rosuav at gmail.com Wed Feb 7 18:35:51 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 8 Feb 2018 10:35:51 +1100 Subject: [Python-ideas] Consider generalizing Decimal to support arbitrary radix In-Reply-To: References: <262f05d6-8f5a-4b0f-8afd-c09375b20926@googlegroups.com> <20180207225054.GB26553@ando.pearwood.info> Message-ID: On Thu, Feb 8, 2018 at 10:08 AM, Neil Girdhar wrote: > > On Wed, Feb 7, 2018 at 5:52 PM Steven D'Aprano wrote: >> >> - slower; >> - larger errors when converting from decimal numbers (in general); >> - larger rounding errors; >> - larger wobble; > > > I don't see why it would have any of those problems. Base 10 isn't special > in any way. Base 10 *is* special, because it corresponds to what humans use. In binary floating-point, you get weird results (by human standards) like 0.1+0.2 not being 0.3; that doesn't happen in decimal. There is no error when converting from a string of decimal digits to a decimal.Decimal, so presumably to avoid error, you'd have to work with digits in the same base. The rounding errors and wobble are by comparison with binary; you get the same problems in any other base, without the benefit of human-friendly behaviour. > Right, I was playing with this problem > (https://brilliant.org/weekly-problems/2017-10-02/advanced/?problem=no-computer-needed) > and wanted to work in base 2. I realize it's niche, but it's not exactly a > significant change to the interface even if it's a big change to the > implementation. You should be able to use the native float type for binary floating-point. But the whole point of that challenge is that you shouldn't need a computer. 
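For what it's worth, the 0.11011 mentioned earlier is 27/32, which a native
float stores exactly -- a quick illustration only, not part of anyone's
proposal:

    # 0.11011 (base 2) is 1/2 + 1/4 + 1/16 + 1/32 = 27/32, exactly
    # representable as a native binary float.
    x = int("11011", 2) / 2 ** 5
    print(x)                          # 0.84375
    print(x.hex())                    # 0x1.b000000000000p-1
    print(float.fromhex("0x1.bp-1"))  # 0.84375, the same value written in hex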
ChrisA From mistersheik at gmail.com Wed Feb 7 18:49:27 2018 From: mistersheik at gmail.com (Neil Girdhar) Date: Wed, 07 Feb 2018 23:49:27 +0000 Subject: [Python-ideas] Consider generalizing Decimal to support arbitrary radix In-Reply-To: References: <262f05d6-8f5a-4b0f-8afd-c09375b20926@googlegroups.com> <20180207225054.GB26553@ando.pearwood.info> Message-ID: On Wed, Feb 7, 2018 at 6:36 PM Chris Angelico wrote: > On Thu, Feb 8, 2018 at 10:08 AM, Neil Girdhar > wrote: > > > > On Wed, Feb 7, 2018 at 5:52 PM Steven D'Aprano > wrote: > >> > >> - slower; > >> - larger errors when converting from decimal numbers (in general); > >> - larger rounding errors; > >> - larger wobble; > > > > > > I don't see why it would have any of those problems. Base 10 isn't > special > > in any way. > > Base 10 *is* special, because it corresponds to what humans use. In > binary floating-point, you get weird results (by human standards) like > 0.1+0.2 not being 0.3; that doesn't happen in decimal. > > There is no error when converting from a string of decimal digits to a > decimal.Decimal, so presumably to avoid error, you'd have to work with > digits in the same base. The rounding errors and wobble are by > comparison with binary; you get the same problems in any other base, > without the benefit of human-friendly behaviour. > I see your list was about converting to and from base 10. That wasn't really intended in my proposal. I meant wholly working in another base. In that sense, 10 isn't particularly "fast, error-free, better at rounding, etc." > > > Right, I was playing with this problem > > ( > https://brilliant.org/weekly-problems/2017-10-02/advanced/?problem=no-computer-needed > ) > > and wanted to work in base 2. I realize it's niche, but it's not > exactly a > > significant change to the interface even if it's a big change to the > > implementation. > > You should be able to use the native float type for binary > floating-point. But the whole point of that challenge is that you > shouldn't need a computer. > Yeah, I know, but I wanted to play with it. Anyway, native floats don't help. > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/twWEvFwahaQ/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Wed Feb 7 18:53:30 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 8 Feb 2018 10:53:30 +1100 Subject: [Python-ideas] Consider generalizing Decimal to support arbitrary radix In-Reply-To: References: <262f05d6-8f5a-4b0f-8afd-c09375b20926@googlegroups.com> <20180207225054.GB26553@ando.pearwood.info> Message-ID: On Thu, Feb 8, 2018 at 10:49 AM, Neil Girdhar wrote: >> > Right, I was playing with this problem >> > >> > (https://brilliant.org/weekly-problems/2017-10-02/advanced/?problem=no-computer-needed) >> > and wanted to work in base 2. 
I realize it's niche, but it's not >> > exactly a >> > significant change to the interface even if it's a big change to the >> > implementation. >> >> You should be able to use the native float type for binary >> floating-point. But the whole point of that challenge is that you >> shouldn't need a computer. > > > Yeah, I know, but I wanted to play with it. Anyway, native floats don't > help. Sounds like performance isn't going to be a big problem, then. You can manage with a non-optimized and naive implementation. So here's a couple of things to try: 1) Check out PyPI and see if something like what you want exists. 2) Poke around in the source code for the Decimal class (ignore the C module and use the pure Python one) and see if you can hack on it. It'd then be off-topic for python-ideas, but it'd be an awesome topic to discuss on python-list. Exploration is great fun, and Python's a great language to explore with. ChrisA From ncoghlan at gmail.com Wed Feb 7 19:08:49 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 8 Feb 2018 10:08:49 +1000 Subject: [Python-ideas] Consider making Decimal's context use PEP 567 In-Reply-To: References: Message-ID: On 8 February 2018 at 08:36, Neil Girdhar wrote: > Wow, that's awesome! I didn't notice that when I checked. It seemed like > context had to be passed in. If it were me, I would probably deprecate > those context=None arguments now that we have such a clean solution. The context=None feature is there so that developers can write pure Decimal operations if they choose to do so, rather than always depending on implicit dynamic state. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From casevh at gmail.com Wed Feb 7 19:04:04 2018 From: casevh at gmail.com (Case Van Horsen) Date: Wed, 7 Feb 2018 16:04:04 -0800 Subject: [Python-ideas] Consider generalizing Decimal to support arbitrary radix In-Reply-To: References: <262f05d6-8f5a-4b0f-8afd-c09375b20926@googlegroups.com> <20180207225054.GB26553@ando.pearwood.info> Message-ID: On Wed, Feb 7, 2018 at 3:49 PM, Neil Girdhar wrote: > On Wed, Feb 7, 2018 at 6:36 PM Chris Angelico wrote: >> You should be able to use the native float type for binary >> floating-point. But the whole point of that challenge is that you >> shouldn't need a computer. > > > Yeah, I know, but I wanted to play with it. Anyway, native floats don't > help. >> >> >> ChrisA I maintain gmpy2 and it might do what you want (arbitrary precision radix-2 arithmetic and easy access to the bits). >>> gmpy2.get_context().precision=70 >>> gmpy2.mpfr(1)/7 mpfr('0.14285714285714285714283',70) >>> (gmpy2.mpfr(1)/7).digits(2) ('1001001001001001001001001001001001001001001001001001001001001001001001', -2, 70) Historical memory - I once wrote a radix-6 fixed point library to explore an extension of the 3n+1 problem to rational numbers. It was written in Turbo Pascal and ran for days on a 286/287 PC. casevh From steve at pearwood.info Wed Feb 7 19:10:20 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 8 Feb 2018 11:10:20 +1100 Subject: [Python-ideas] Consider generalizing Decimal to support arbitrary radix In-Reply-To: References: <262f05d6-8f5a-4b0f-8afd-c09375b20926@googlegroups.com> <20180207225054.GB26553@ando.pearwood.info> Message-ID: <20180208001019.GD26553@ando.pearwood.info> On Wed, Feb 07, 2018 at 11:49:27PM +0000, Neil Girdhar wrote: > I see your list was about converting to and from base 10. That wasn't > really intended in my proposal. I meant wholly working in another base. 
> In that sense, 10 isn't particularly "fast, error-free, better at rounding, > etc." I never said it was. Base 2 floats is the one that is faster and better at rounding than any other base. No finite precision floating point arithmetic can be error free, but all else being equal, base 2 minimises the errors you get. The advantage of base 10 is that it matches the standard base 10 numbers we write. Within the boundaries of the available precision, if you can write it in decimal, you can represent it in a decimal float. That isn't necessarily true of decimal -> binary floats. > > > Right, I was playing with this problem > > > ( > > https://brilliant.org/weekly-problems/2017-10-02/advanced/?problem=no-computer-needed > > ) > > > and wanted to work in base 2. I realize it's niche, but it's not > > exactly a > > > significant change to the interface even if it's a big change to the > > > implementation. > > > > You should be able to use the native float type for binary > > floating-point. But the whole point of that challenge is that you > > shouldn't need a computer. > > > > Yeah, I know, but I wanted to play with it. Anyway, native floats don't > help. Why not? If you can write the float in binary exactly, you can write it in hex, and use float.fromhex() to convert it exactly (provided it fits into the 64-bit floats Python uses). -- Steve From tjreedy at udel.edu Wed Feb 7 19:13:27 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 7 Feb 2018 19:13:27 -0500 Subject: [Python-ideas] Consider generalizing Decimal to support arbitrary radix In-Reply-To: References: <262f05d6-8f5a-4b0f-8afd-c09375b20926@googlegroups.com> <20180207225054.GB26553@ando.pearwood.info> Message-ID: On 2/7/2018 6:08 PM, Neil Girdhar wrote: > > On Wed, Feb 7, 2018 at 5:52 PM Steven D'Aprano > > wrote: > > On Wed, Feb 07, 2018 at 10:08:50PM +0000, Neil Girdhar wrote: > > > Oh, and to answer your specific question, I want to change the way > > arithmetic is done.? I want it to be done in a different radix. > > Why? > > There are clear advantages to floating point arithmetic done in base 2 > (speed, minimum possible rounding error, least amount of wobble), and a > different advantage to floating point done in base 10 (matches exactly > the standard decimal notation used by humans with no conversion error), This is the specialness of base 10 > Outside of those two bases, arithmetic done in any other base is going > to combine the worst of both: > > - slower; > - larger errors when converting from decimal numbers (in general); > - larger rounding errors; > - larger wobble; > I don't see why it would have any of those problems. Any base other than 2 has decreased speed (on a binary computer) and increased computational rounding errors and wobble. >? Base 10 isn't special in any way. Except as noted above and the fact that computation with binary coded decimal goes back to the early days of electronic computation. > with no corresponding advantages unless your data is coming to you in > arbitrary bases. > Right, I was playing with this problem > (https://brilliant.org/weekly-problems/2017-10-02/advanced/?problem=no-computer-needed) > and wanted to work in base 2.? I realize it's niche, but it's not > exactly a significant change to the interface even if it's a big change > to the implementation. In cpython, decimal uses _cdecimal for speed. I suspect that 10 is not only explicitly hard-coded as the base but implicitly hard-coded by using algorithm tricks that depend on and only work when the base is 10. 
-- Terry Jan Reedy From mistersheik at gmail.com Wed Feb 7 19:14:42 2018 From: mistersheik at gmail.com (Neil Girdhar) Date: Thu, 08 Feb 2018 00:14:42 +0000 Subject: [Python-ideas] Consider generalizing Decimal to support arbitrary radix In-Reply-To: References: <262f05d6-8f5a-4b0f-8afd-c09375b20926@googlegroups.com> <20180207225054.GB26553@ando.pearwood.info> Message-ID: That's really cool! I never knew about gmpy. On Wed, Feb 7, 2018 at 7:10 PM Case Van Horsen wrote: > On Wed, Feb 7, 2018 at 3:49 PM, Neil Girdhar > wrote: > > On Wed, Feb 7, 2018 at 6:36 PM Chris Angelico wrote: > >> You should be able to use the native float type for binary > >> floating-point. But the whole point of that challenge is that you > >> shouldn't need a computer. > > > > > > Yeah, I know, but I wanted to play with it. Anyway, native floats don't > > help. > >> > >> > >> ChrisA > > I maintain gmpy2 and it might do what you want (arbitrary precision > radix-2 arithmetic and easy access to the bits). > > >>> gmpy2.get_context().precision=70 > >>> gmpy2.mpfr(1)/7 > mpfr('0.14285714285714285714283',70) > >>> (gmpy2.mpfr(1)/7).digits(2) > ('1001001001001001001001001001001001001001001001001001001001001001001001', > -2, 70) > > Historical memory - I once wrote a radix-6 fixed point library to > explore an extension of the 3n+1 problem to rational numbers. It was > written in Turbo Pascal and ran for days on a 286/287 PC. > > casevh > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/twWEvFwahaQ/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From leewangzhong+python at gmail.com Thu Feb 8 05:45:01 2018 From: leewangzhong+python at gmail.com (Franklin? Lee) Date: Thu, 8 Feb 2018 05:45:01 -0500 Subject: [Python-ideas] Complicate str methods In-Reply-To: References: Message-ID: On Feb 7, 2018 17:28, "Serhiy Storchaka" wrote: 04.02.18 00:04, Franklin? Lee ????: Let s be a str. I propose to allow these existing str methods to take > params in new forms. > > s.replace(old, new): > Allow passing in a collection of olds. > Allow passing in a single argument, a mapping of olds to news. > Allow the olds in the mapping to be tuples of strings. > > s.split(sep), s.rsplit, s.partition: > Allow sep to be a collection of separators. > > s.startswith, s.endswith: > Allow argument to be a collection of strings. > > s.find, s.index, s.count, x in s: > Similar. > These methods are also in `list`, which can't distinguish between > items, subsequences, and subsets. However, `str` is already inconsistent > with `list` here: list.M looks for an item, while str.M looks for a > subsequence. > > s.[r|l]strip: > Sadly, these functions already interpret their str arguments as > collections of characters. > The name of complicated str methods is regular expressions. For doing these operations efficiently you need to convert arguments in special optimized form. This is what re.compile() does. 
If make a compilation on every invocation of a str method, this will add too large overhead and kill performance. Even for simple string search a regular expression can be more efficient than a str method. $ ./python -m timeit -s 'import re; p = re.compile("spam"); s = "spa"*100+"m"' -- 'p.search(s)' 500000 loops, best of 5: 680 nsec per loop $ ./python -m timeit -s 's = "spa"*100+"m"' -- 's.find("spam")' 200000 loops, best of 5: 1.09 usec per loop That's an odd result. Python regexes use backtracking, not a DFA. I gave a timing test earlier in the thread: https://mail.python.org/pipermail/python-ideas/2018-February/048879.html I compared using repeated .find()s against a precompiled regex, then against a pure Python and unoptimized tree-based algorithm. Could it be that re uses an optimization that can also be used in str? CPython uses a modified Boyer-Moore for str.find: https://github.com/python/cpython/blob/master/Objects/stringlib/fastsearch.h http://effbot.org/zone/stringlib.htm Maybe there's a minimum length after which it's better to precompute a table. In any case, once you have branches in the regex, which is necessary to emulate these features, it will start to slow down because it has to travel down both branches in the worst case. -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Thu Feb 8 10:31:16 2018 From: srkunze at mail.de (Sven R. Kunze) Date: Thu, 8 Feb 2018 16:31:16 +0100 Subject: [Python-ideas] Complicate str methods In-Reply-To: <2d12c7ac-d1f3-bcd9-c6c4-9c41b195459c@mgmiller.net> References: <2d12c7ac-d1f3-bcd9-c6c4-9c41b195459c@mgmiller.net> Message-ID: Same here. On 07.02.2018 22:57, Mike Miller wrote: > +1 > > I have the need for one or two of these in every project (of a certain > size) and have to come up with solutions each time with the re module > or a loop. > > Not a fan of regex's for trivial tasks, or those that require a real > parser. > > On 2018-02-03 14:04, Franklin? Lee wrote: > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve.dower at python.org Thu Feb 8 11:24:57 2018 From: steve.dower at python.org (Steve Dower) Date: Thu, 8 Feb 2018 08:24:57 -0800 Subject: [Python-ideas] Complicate str methods In-Reply-To: References: Message-ID: Easily fixed by installing one of the alternate regex libraries. re performance and its edge cases have been discussed endlessly. Please look it up before restarting that discussion. Top-posted from my Windows phone From: Franklin? Lee Sent: Thursday, February 8, 2018 2:46 To: Serhiy Storchaka Cc: Python-Ideas Subject: Re: [Python-ideas] Complicate str methods On Feb 7, 2018 17:28, "Serhiy Storchaka" wrote: 04.02.18 00:04, Franklin? Lee ????: Let s be a str. I propose to allow these existing str methods to take params in new forms. s.replace(old, new): ?? ? Allow passing in a collection of olds. ?? ? Allow passing in a single argument, a mapping of olds to news. ?? ? Allow the olds in the mapping to be tuples of strings. s.split(sep), s.rsplit, s.partition: ?? ? Allow sep to be a collection of separators. s.startswith, s.endswith: ?? ? Allow argument to be a collection of strings. s.find, s.index, s.count, x in s: ?? ? Similar. ?? ? 
These methods are also in `list`, which can't distinguish between items, subsequences, and subsets. However, `str` is already inconsistent with `list` here: list.M looks for an item, while str.M looks for a subsequence. s.[r|l]strip: ?? ? Sadly, these functions already interpret their str arguments as collections of characters. The name of complicated str methods is regular expressions. For doing these operations efficiently you need to convert arguments in special optimized form. This is what re.compile() does. If make a compilation on every invocation of a str method, this will add too large overhead and kill performance. Even for simple string search a regular expression can be more efficient than a str method. $ ./python -m timeit -s 'import re; p = re.compile("spam"); s = "spa"*100+"m"' -- 'p.search(s)' 500000 loops, best of 5: 680 nsec per loop $ ./python -m timeit -s 's = "spa"*100+"m"' -- 's.find("spam")' 200000 loops, best of 5: 1.09 usec per loop That's an odd result. Python regexes use backtracking, not a DFA. I gave a timing test earlier in the thread: https://mail.python.org/pipermail/python-ideas/2018-February/048879.html I compared using repeated .find()s against a precompiled regex, then against a pure Python and unoptimized tree-based algorithm. Could it be that re uses an optimization that can also be used in str? CPython uses a modified Boyer-Moore for str.find: https://github.com/python/cpython/blob/master/Objects/stringlib/fastsearch.h http://effbot.org/zone/stringlib.htm Maybe there's a minimum length after which it's better to precompute a table. In any case, once you have branches in the regex, which is necessary to emulate these features, it will start to slow down because it has to travel down both branches in the worst case. -------------- next part -------------- An HTML attachment was scrubbed... URL: From leewangzhong+python at gmail.com Thu Feb 8 12:13:26 2018 From: leewangzhong+python at gmail.com (Franklin? Lee) Date: Thu, 8 Feb 2018 12:13:26 -0500 Subject: [Python-ideas] Complicate str methods In-Reply-To: <5a7c79df.47de500a.61115.11ffSMTPIN_ADDED_MISSING@mx.google.com> References: <5a7c79df.47de500a.61115.11ffSMTPIN_ADDED_MISSING@mx.google.com> Message-ID: On Thu, Feb 8, 2018 at 11:24 AM, Steve Dower wrote: > Easily fixed by installing one of the alternate regex libraries. MRAB's regex library, the most prominent alternative, does not use the linear-time search algorithm. The only libraries I know that do are the ones with re2, though I haven't looked deeply. Let it be known that I tried to install pyre2 (re2 on PyPI) on Ubuntu For Windows for the tests, and after hours of no success, I decided that the alternative libraries were not the point. I eventually got it working (https://github.com/axiak/pyre2/issues/51), and here are the results: # Same setup as before, with findsub_re2 using `find=re2.search, esc=re2.escape`. pattern2 = re2.compile('|'.join(map(re.escape, needles))) %timeit findsub(haystack, needles) #=> 1.26 ms ? 15.8 ?s per loop (mean ? std. dev. of 7 runs, 1000 loops each) %timeit findsub_re(haystack, needles) #=> 745 ms ? 19.8 ms per loop (mean ? std. dev. of 7 runs, 1 loop each) %timeit findsub_re_cached(haystack, pattern) #=> 733 ms ? 16.5 ms per loop (mean ? std. dev. of 7 runs, 1 loop each) %timeit findsub_regex(haystack, needles) #=> 639 ms ? 12.9 ms per loop (mean ? std. dev. of 7 runs, 1 loop each) %timeit findsub_re2(haystack, needles) #=> 34.1 ms ? 1.13 ms per loop (mean ? std. dev. 
of 7 runs, 10 loops each) %timeit findsub_re_cached(haystack, pattern2) #=> 9.56 ?s ? 268 ns per loop (mean ? std. dev. of 7 runs, 100000 loops each) In any case, installing re2 is even more of an advanced solution for a basic problem than using re. > re performance and its edge cases have been discussed endlessly. Please look > it up before restarting that discussion. I'm not trying to restart the discussion. I'm trying to say that the assumptions being made about its superior performance are unfounded. Two members have suggested that re is the performant option, and that's just not true. From songofacandy at gmail.com Thu Feb 8 12:14:50 2018 From: songofacandy at gmail.com (INADA Naoki) Date: Fri, 9 Feb 2018 02:14:50 +0900 Subject: [Python-ideas] Complicate str methods In-Reply-To: References: Message-ID: I think it shouldn't be str's method. They should be separate class to reuse internal tree. There are some Aho Corasick implementation on PyPI. As far as I know, AC is longest match. On the other hand, Go's replacer (it's trie based too) is: > Replacements are performed in order, without overlapping matches. https://golang.org/pkg/strings/#NewReplacer On Sun, Feb 4, 2018 at 7:04 AM, Franklin? Lee wrote: > Let s be a str. I propose to allow these existing str methods to take params > in new forms. > > s.replace(old, new): > Allow passing in a collection of olds. > Allow passing in a single argument, a mapping of olds to news. > Allow the olds in the mapping to be tuples of strings. > > s.split(sep), s.rsplit, s.partition: > Allow sep to be a collection of separators. > > s.startswith, s.endswith: > Allow argument to be a collection of strings. > > s.find, s.index, s.count, x in s: > Similar. > These methods are also in `list`, which can't distinguish between items, > subsequences, and subsets. However, `str` is already inconsistent with > `list` here: list.M looks for an item, while str.M looks for a subsequence. > > s.[r|l]strip: > Sadly, these functions already interpret their str arguments as > collections of characters. > > These new forms can be optimized internally, as a search for multiple > candidate substrings can be more efficient than searching for one at a time. > See > https://stackoverflow.com/questions/3260962/algorithm-to-find-multiple-string-matches > > The most significant change is on .replace. The others are simple enough to > simulate with a loop or something. It is harder to make multiple > simultaneous replacements using one .replace at a time, because previous > replacements can form new things that look like replaceables. The easiest > Python solution is to use regex or install some package, which uses (if > you're lucky) regex or (if unlucky) doesn't simulate simultaneous > replacements. (If possible, just use str.translate.) > > I suppose .split on multiple separators is also annoying to simulate. The > two-argument form of .split may be even more of a burden, though I don't > know when a limited multiple-separator split is useful. The current best > solution is, like before, to use regex, or install a package and hope for > the best. 
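(A rough sketch of the "separate class that reuses an internal tree" idea
mentioned above -- the Replacer name and API are made up, loosely modelled
on Go's strings.NewReplacer; a real implementation would use Aho-Corasick
rather than this naive longest-match trie walk:)

class Replacer:
    def __init__(self, mapping):
        # Build the trie once so it can be reused for many inputs.
        self._trie = {}
        for old, new in mapping.items():
            node = self._trie
            for ch in old:
                node = node.setdefault(ch, {})
            node[""] = new                 # "" marks end-of-key, stores the replacement

    def replace(self, s):
        out = []
        i = 0
        while i < len(s):
            node, j, match = self._trie, i, None
            while j < len(s) and s[j] in node:
                node = node[s[j]]
                j += 1
                if "" in node:
                    match = (j, node[""])  # remember the longest match so far
            if match:
                end, new = match
                out.append(new)
                i = end                    # no overlapping matches
            else:
                out.append(s[i])
                i += 1
        return "".join(out)

r = Replacer({"cat": "dog", "ca": "CA"})
print(r.replace("cat ca c"))   # -> dog CA c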
> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- INADA Naoki From storchaka at gmail.com Thu Feb 8 13:04:48 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 8 Feb 2018 20:04:48 +0200 Subject: [Python-ideas] Complicate str methods In-Reply-To: References: Message-ID: 08.02.18 12:45, Franklin? Lee ????: > On Feb 7, 2018 17:28, "Serhiy Storchaka" > > wrote: > Even for simple string search a regular expression can be more > efficient than a str method. > > $ ./python -m timeit -s 'import re; p = re.compile("spam"); s = > "spa"*100+"m"' -- 'p.search(s)' > 500000 loops, best of 5: 680 nsec per loop > > $ ./python -m timeit -s 's = "spa"*100+"m"' -- 's.find("spam")' > 200000 loops, best of 5: 1.09 usec per loop > > > That's an odd result. Python regexes use backtracking, not a DFA. I gave > a timing test earlier in the thread: > https://mail.python.org/pipermail/python-ideas/2018-February/048879.html > I compared using repeated .find()s against a precompiled regex, then > against a pure Python and unoptimized tree-based algorithm. > > Could it be that re uses an optimization that can also be used in str? > CPython uses a modified Boyer-Moore for str.find: > https://github.com/python/cpython/blob/master/Objects/stringlib/fastsearch.h > http://effbot.org/zone/stringlib.htm > Maybe there's a minimum length after which it's better to precompute a > table. Yes, there is a special optimization in re here. It isn't free, you need to spend some time for preparing it. You need a special object that keeps an optimized representation for faster search. This makes it very unlikely be used in str, because you need either spend the time for compilation on every search, or use some kind of caching, which is not free too, adds complexity and increases memory consumption. Note also in case of re the compiler is implemented in Python. This reduces the complexity. Patches that add optimization for other common cases are welcomed. From leewangzhong+python at gmail.com Thu Feb 8 13:23:48 2018 From: leewangzhong+python at gmail.com (Franklin? Lee) Date: Thu, 8 Feb 2018 13:23:48 -0500 Subject: [Python-ideas] Complicate str methods In-Reply-To: References: Message-ID: On Thu, Feb 8, 2018 at 5:45 AM, Franklin? Lee wrote: > On Feb 7, 2018 17:28, "Serhiy Storchaka" wrote: > > The name of complicated str methods is regular expressions. For doing these > > operations efficiently you need to convert arguments in special optimized > > form. This is what re.compile() does. If make a compilation on every > > invocation of a str method, this will add too large overhead and kill > > performance. > > > > Even for simple string search a regular expression can be more efficient > > than a str method. > > > > $ ./python -m timeit -s 'import re; p = re.compile("spam"); s = > > "spa"*100+"m"' -- 'p.search(s)' > > 500000 loops, best of 5: 680 nsec per loop > > > > $ ./python -m timeit -s 's = "spa"*100+"m"' -- 's.find("spam")' > > 200000 loops, best of 5: 1.09 usec per loop I ran Serhiy's tests (3.5.2) and got different results. # Setup: __builtins__.basestring = str #hack for re2 import in py3 import re, re2, regex n = 10000 s = "spa"*n+"m" p = re.compile("spam") pgex = regex.compile("spam") p2 = re2.compile("spam") # Tests: %timeit s.find("spam") %timeit p.search(s) %timeit pgex.search(s) %timeit p2.search(s) n = 100 350 ns ? 
17.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
554 ns ± 16.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
633 ns ± 8.05 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.62 µs ± 68.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

n = 1000
2.17 µs ± 177 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
3.57 µs ± 27.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
3.46 µs ± 66.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
7.8 µs ± 72 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

n = 10000
17.3 µs ± 326 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
33.5 µs ± 138 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
31.7 µs ± 396 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
67.5 µs ± 400 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Conclusions:
- `.find` is fastest. On 3.6.1 (Windows), it's about the same speed as re:
  638 ns vs 662 ns; 41.3 µs vs 43.8 µs.
- re and regex have similar performance, probably due to a similar backend.
- re2 is slowest. I suspect it's due to the wrapper. It may be copying the
  strings to a format suitable for the backend.

P.S.: I also tested `"spam" in s`, which was linearly slower than `.find`.
However, `in` is consistently faster than `.find` in my 3.6, so the
discrepancy has likely been fixed. More curious is that, on `.find`, my
MSVC-compiled 3.6.1 and 3.5.2 are twice as slow as my 3.5.2 for Ubuntu For
Windows, but the re performance is similar. It's probably a compiler thing.

From leewangzhong+python at gmail.com Thu Feb 8 14:58:15 2018
From: leewangzhong+python at gmail.com (Franklin? Lee)
Date: Thu, 8 Feb 2018 14:58:15 -0500
Subject: [Python-ideas] Complicate str methods
In-Reply-To: 
References: 
Message-ID: 

On Feb 8, 2018 13:06, "Serhiy Storchaka" wrote:

08.02.18 12:45, Franklin? Lee пише:
> Could it be that re uses an optimization that can also be used in str?
> CPython uses a modified Boyer-Moore for str.find:
> https://github.com/python/cpython/blob/master/Objects/string
> lib/fastsearch.h
> http://effbot.org/zone/stringlib.htm
> Maybe there's a minimum length after which it's better to precompute a
> table.

> Yes, there is a special optimization in re here. It isn't free, you need
to spend some time for preparing it. You need a special object that keeps
an optimized representation for faster search. This makes it very unlikely
be used in str, because you need either spend the time for compilation on
every search, or use some kind of caching, which is not free too, adds
complexity and increases memory consumption. Note also in case of re the
compiler is implemented in Python. This reduces the complexity.

The performance of the one-needle case isn't really relevant, though, is
it? This idea is for the multi-needle case, and my tests showed that re
performs even worse than a loop of `.find`s. How do re and .find scale
with both number and lengths of needles on your machine?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From dickinsm at gmail.com Thu Feb 8 16:24:35 2018 From: dickinsm at gmail.com (Mark Dickinson) Date: Thu, 8 Feb 2018 21:24:35 +0000 Subject: [Python-ideas] Consider generalizing Decimal to support arbitrary radix In-Reply-To: <20180207225054.GB26553@ando.pearwood.info> References: <262f05d6-8f5a-4b0f-8afd-c09375b20926@googlegroups.com> <20180207225054.GB26553@ando.pearwood.info> Message-ID: On Wed, Feb 7, 2018 at 10:50 PM, Steven D'Aprano wrote: > Why? > > There are clear advantages to floating point arithmetic done in base 2 > (speed, minimum possible rounding error, least amount of wobble), and a > different advantage to floating point done in base 10 (matches exactly > the standard decimal notation used by humans with no conversion error), > Outside of those two bases, arithmetic done in any other base is going > to combine the worst of both: > Well, there are a couple of other bases that are potentially interesting. There are some compelling mathematical advantages to using ternary (or even better, balanced ternary). Knuth calls balanced ternary "Perhaps the prettiest number system of all" in TAOCP, and ternary is in some sense the most efficient base to use; the article "Third Base" by Brian Hayes (Sci. Am., 2001) gives a nice overview. It's completely inappropriate for binary hardware, of course. And base-16 floating-point is still used in current IBM hardware, but I don't know whether that's purely for historical/backwards-compatibility reasons, or because it's faster for the FPU. As to making decimal support arbitrary radix, though: I don't see that happening any time soon. The amount of work would be enormous, for questionable gain. -- Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Thu Feb 8 17:37:59 2018 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 09 Feb 2018 11:37:59 +1300 Subject: [Python-ideas] Consider generalizing Decimal to support arbitrary radix In-Reply-To: References: <262f05d6-8f5a-4b0f-8afd-c09375b20926@googlegroups.com> <20180207225054.GB26553@ando.pearwood.info> Message-ID: <5A7CD147.4090405@canterbury.ac.nz> Mark Dickinson wrote: > And base-16 floating-point is still used in current IBM hardware, but I > don't know whether that's purely for historical/backwards-compatibility > reasons, or because it's faster for the FPU. Historically, base 16 was used to get a bigger exponent range for a given number of exponent bits. That was a bigger deal back when memory was very expensive. I doubt there's any advantage in it now. -- Greg From mariocj89 at gmail.com Sat Feb 10 14:29:27 2018 From: mariocj89 at gmail.com (Mario Corchero) Date: Sat, 10 Feb 2018 19:29:27 +0000 Subject: [Python-ideas] Easing set-up of of console python applications Message-ID: Hello All! I got asked how to configure the logging stack to be able to output directly to console using both stdout and stderr and I could not really find a great answer as adding both as StreamHandlers will result in error and above messages going to stdout. The usecase is having a cli or app that wants to log to console as other tools. Errors and bove to stderr and normal information to stdout. I know the recommended way in Python is to just use print on simple scripts, but it can happen that you import a library you want to see the logs of, and therefore you need to set the logging stack. 
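(Roughly the kind of setup in question, for concreteness -- this is only a
sketch with a made-up MaxLevelFilter name, not either of the two drafts
described below:)

import logging
import sys

class MaxLevelFilter(logging.Filter):
    # Only lets through records at or below a given level.
    def __init__(self, level):
        super().__init__()
        self.level = level

    def filter(self, record):
        return record.levelno <= self.level

stdout_handler = logging.StreamHandler(sys.stdout)
stdout_handler.addFilter(MaxLevelFilter(logging.INFO))   # DEBUG/INFO -> stdout
stderr_handler = logging.StreamHandler(sys.stderr)
stderr_handler.setLevel(logging.WARNING)                 # WARNING and above -> stderr

logging.basicConfig(level=logging.DEBUG,
                    handlers=[stdout_handler, stderr_handler])
logging.getLogger(__name__).info("normal information")   # goes to stdout
logging.getLogger(__name__).error("an error")            # goes to stderr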
I have drafted two implementations but I am open to suggestions: 1) A "Console Handler" that uses multiple streams and chooses based on the level. 2) An inverted filter that can be used to filter everything above info for the stdout Stream handler. What do you people think? If people like it I'll send an issue + PR. Regards, Mario Corchero -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Sat Feb 10 15:20:50 2018 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 11 Feb 2018 07:20:50 +1100 Subject: [Python-ideas] Easing set-up of of console python applications In-Reply-To: References: Message-ID: On Sun, Feb 11, 2018 at 6:29 AM, Mario Corchero wrote: > Hello All! > > I got asked how to configure the logging stack to be able to output directly > to console using both stdout and stderr and I could not really find a great > answer as adding both as StreamHandlers will result in error and above > messages going to stdout. There's a recipe in the docs that shows how to "fork" to console and file: https://docs.python.org/3/howto/logging-cookbook.html#logging-to-multiple-destinations I presume that's what you were looking at? Because yes, that'll do exactly what you say. > The usecase is having a cli or app that wants to log to console as other > tools. Errors and bove to stderr and normal information to stdout. I know > the recommended way in Python is to just use print on simple scripts, but it > can happen that you import a library you want to see the logs of, and > therefore you need to set the logging stack. > > I have drafted two implementations but I am open to suggestions: > 1) A "Console Handler" that uses multiple streams and chooses based on the > level. > 2) An inverted filter that can be used to filter everything above info for > the stdout Stream handler. > > What do you people think? > If people like it I'll send an issue + PR. It would be an interesting variant on the recipe to say "debug and above, but NOT error and above, goes to this stream". How complex are the implementations? Would they fit nicely into the cookbook? ChrisA From mariocj89 at gmail.com Sat Feb 10 15:49:11 2018 From: mariocj89 at gmail.com (Mario Corchero) Date: Sat, 10 Feb 2018 20:49:11 +0000 Subject: [Python-ideas] Easing set-up of of console python applications In-Reply-To: References: Message-ID: The recipe as you pointed out works by logging to both (just using multiple handlers). The objective is to log *up to a level* to stdout and the rest to stderr. See the example console handler here and the filter here . Good point about just adding it to the how-to. On 10 February 2018 at 20:20, Chris Angelico wrote: > On Sun, Feb 11, 2018 at 6:29 AM, Mario Corchero > wrote: > > Hello All! > > > > I got asked how to configure the logging stack to be able to output > directly > > to console using both stdout and stderr and I could not really find a > great > > answer as adding both as StreamHandlers will result in error and above > > messages going to stdout. > > There's a recipe in the docs that shows how to "fork" to console and file: > > https://docs.python.org/3/howto/logging-cookbook.html#logging-to-multiple- > destinations > > I presume that's what you were looking at? Because yes, that'll do > exactly what you say. > > > The usecase is having a cli or app that wants to log to console as other > > tools. Errors and bove to stderr and normal information to stdout. 
I know > > the recommended way in Python is to just use print on simple scripts, > but it > > can happen that you import a library you want to see the logs of, and > > therefore you need to set the logging stack. > > > > I have drafted two implementations but I am open to suggestions: > > 1) A "Console Handler" that uses multiple streams and chooses based on > the > > level. > > 2) An inverted filter that can be used to filter everything above info > for > > the stdout Stream handler. > > > > What do you people think? > > If people like it I'll send an issue + PR. > > It would be an interesting variant on the recipe to say "debug and > above, but NOT error and above, goes to this stream". How complex are > the implementations? Would they fit nicely into the cookbook? > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Sat Feb 10 15:59:26 2018 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 11 Feb 2018 07:59:26 +1100 Subject: [Python-ideas] Easing set-up of of console python applications In-Reply-To: References: Message-ID: On Sun, Feb 11, 2018 at 7:49 AM, Mario Corchero wrote: > The recipe as you pointed out works by logging to both (just using multiple > handlers). Yep. It's a "forking" setup. What you're proposing is a "splitting" setup, which would be a great recipe to put immediately underneath that one. > The objective is to log *up to a level* to stdout and the rest to stderr. > > See the example console handler here and the filter here. > > Good point about just adding it to the how-to. You know, I think BOTH of those are worth adding. Especially the filter method, which is pretty concise and easy to build on. Want to write up a docs patch? ChrisA From mariocj89 at gmail.com Sat Feb 10 16:09:42 2018 From: mariocj89 at gmail.com (Mario Corchero) Date: Sat, 10 Feb 2018 21:09:42 +0000 Subject: [Python-ideas] Easing set-up of of console python applications In-Reply-To: References: Message-ID: Sure thing! Will prepare it tomorrow. On Sat, 10 Feb 2018 at 21:00, Chris Angelico wrote: > On Sun, Feb 11, 2018 at 7:49 AM, Mario Corchero > wrote: > > The recipe as you pointed out works by logging to both (just using > multiple > > handlers). > > Yep. It's a "forking" setup. What you're proposing is a "splitting" > setup, which would be a great recipe to put immediately underneath > that one. > > > The objective is to log *up to a level* to stdout and the rest to stderr. > > > > See the example console handler here and the filter here. > > > > Good point about just adding it to the how-to. > > You know, I think BOTH of those are worth adding. Especially the > filter method, which is pretty concise and easy to build on. Want to > write up a docs patch? > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From yahya-abou-imran at protonmail.com Sun Feb 11 16:55:34 2018
From: yahya-abou-imran at protonmail.com (Yahya Abou 'Imran)
Date: Sun, 11 Feb 2018 16:55:34 -0500
Subject: [Python-ideas] List the methods of a metaclass in the help of its instances
Message-ID: <9ThR6UtT7BKzPnOTbmMKqOEw4oeB5R4zbm-cm76cngjNm_TsuoFG5aQ7-GCVOmHvrqhktzu5eWyT2gF9yrAIFa2Ao-RTWCmR57ktPJTDZjE=@protonmail.com>

For example, in the help of ABCs, the register method which is defined in
ABCMeta is not listed. It's a little bit weird, since it's accessible from
the class just like any classmethod or staticmethod, and it's a real
feature of the class.

I think it would be great if there were sections in the help that look like:

| ----------------------------------------------------------------------
| Metaclass methods defined in :
| ----------------------------------------------------------------------
| Metaclass methods inherited from :

The same rules of visibility would apply of course.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sylvain.marie at schneider-electric.com Mon Feb 12 04:41:04 2018
From: sylvain.marie at schneider-electric.com (Sylvain MARIE)
Date: Mon, 12 Feb 2018 09:41:04 +0000
Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module
Message-ID: 

The numbers module provides very useful ABCs for the 'numeric tower', able
to abstract away the differences between python primitives and for example
numpy primitives.

I could not find any equivalent for Booleans.

However numpy defines np.bool too, so being able to have an abstract
Boolean class for both python bool and numpy bool would be great.

Here is a version that I included in valid8 in the meantime

-----------------------
from abc import ABCMeta, abstractmethod

class Boolean(metaclass=ABCMeta):
    """
    An abstract base class for booleans, similar to what is available in numbers
    see https://docs.python.org/3.5/library/numbers.html
    """
    __slots__ = ()

    @abstractmethod
    def __bool__(self):
        """Return a builtin bool instance. Called for bool(self)."""

    @abstractmethod
    def __and__(self, other):
        """self & other"""

    @abstractmethod
    def __rand__(self, other):
        """other & self"""

    @abstractmethod
    def __xor__(self, other):
        """self ^ other"""

    @abstractmethod
    def __rxor__(self, other):
        """other ^ self"""

    @abstractmethod
    def __or__(self, other):
        """self | other"""

    @abstractmethod
    def __ror__(self, other):
        """other | self"""

    @abstractmethod
    def __invert__(self):
        """~self"""


# register bool and numpy bool_ as virtual subclasses
# so that issubclass(bool, Boolean) = issubclass(np.bool_, Boolean) = True
Boolean.register(bool)

try:
    import numpy as np
    Boolean.register(np.bool_)
except ImportError:
    # silently escape
    pass
---------------------------

If that topic was already discussed and settled in the past, please ignore
this thread - apologies for not being able to find it.

Best regards

Sylvain
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info Mon Feb 12 08:45:03 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 13 Feb 2018 00:45:03 +1100
Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module
In-Reply-To: 
References: 
Message-ID: <20180212134502.GF26553@ando.pearwood.info>

On Mon, Feb 12, 2018 at 09:41:04AM +0000, Sylvain MARIE wrote:
> The numbers module provides very useful ABC for the 'numeric tower',
> able to abstract away the differences between python primitives and
> for example numpy primitives.
> I could not find any equivalent for Booleans.
> However numpy defines np.bool too, so being able to have an abstract > Boolean class for both python bool and numpy bool would be great. I don't know anything about numpy bools, but Python built-in bools are numbers, and as such already have an ABC: they are a subclass of int. -- Steve From guido at python.org Mon Feb 12 11:08:14 2018 From: guido at python.org (Guido van Rossum) Date: Mon, 12 Feb 2018 08:08:14 -0800 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: References: Message-ID: TBH I've found the numeric tower a questionable addition to Python's stdlib. For PEP 484 we decided not to use it (using the concrete types, int, float etc. instead). So I'm not excited about adding more like this, especially since essentially *everything* can be used in a Boolean context. If your specific project has a specific style requirement for Booleans that would be helped by a Boolean ABC, maybe you should add it to your project and see how it works out, and after a few months report back here. Finally. How were you planning to use this new ABC? On Mon, Feb 12, 2018 at 1:41 AM, Sylvain MARIE < sylvain.marie at schneider-electric.com> wrote: > The numbers module provides very useful ABC for the ?numeric tower?, able > to abstract away the differences between python primitives and for example > numpy primitives. > > I could not find any equivalent for Booleans. > > However numpy defines np.bool too, so being able to have an abstract > Boolean class for both python bool and numpy bool would be great. > > > > Here is a version that I included in valid8 in the meantime > > > > ----------------------- > > class Boolean(metaclass=ABCMeta): > > """ > > An abstract base class for booleans, similar to what is available in > numbers > > see https://docs.python.org/3.5/library/numbers.html > > """ > > __slots__ = () > > > > @abstractmethod > > def __bool__(self): > > """Return a builtin bool instance. Called for bool(self).""" > > > > @abstractmethod > > def __and__(self, other): > > """self & other""" > > > > @abstractmethod > > def __rand__(self, other): > > """other & self""" > > > > @abstractmethod > > def __xor__(self, other): > > """self ^ other""" > > > > @abstractmethod > > def __rxor__(self, other): > > """other ^ self""" > > > > @abstractmethod > > def __or__(self, other): > > """self | other""" > > > > @abstractmethod > > def __ror__(self, other): > > """other | self""" > > > > @abstractmethod > > def __invert__(self): > > """~self""" > > > > > > # register bool and numpy bool_ as virtual subclasses > > # so that issubclass(bool, Boolean) = issubclass(np.bool_, Boolean) = True > > Boolean.register(bool) > > > > try: > > import numpy as np > > Boolean.register(np.bool_) > > except ImportError: > > # silently escape > > pass > > > > --------------------------- > > > > If that topic was already discussed and settled in the past, please ignore > this thread ? apologies for not being able to find it. > > Best regards > > > > Sylvain > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mertz at gnosis.cx Mon Feb 12 11:14:21 2018 From: mertz at gnosis.cx (David Mertz) Date: Mon, 12 Feb 2018 08:14:21 -0800 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: <20180212134502.GF26553@ando.pearwood.info> References: <20180212134502.GF26553@ando.pearwood.info> Message-ID: NumPy np.bool_ is specifically not a subclass of any np.int_. If it we're, there would be an ambiguity between indexing with a Boolean array and an array of ints. Both are meaningful, but they mean different things (mask vs collection of indices). Do we have other examples a Python ABC that exists to accommodate something outside the standard library or builtins? Even if not, NumPy is special... the actual syntax for '@' exists primarily for that library! On Feb 12, 2018 5:51 AM, "Steven D'Aprano" wrote: On Mon, Feb 12, 2018 at 09:41:04AM +0000, Sylvain MARIE wrote: > The numbers module provides very useful ABC for the 'numeric tower', > able to abstract away the differences between python primitives and > for example numpy primitives. > I could not find any equivalent for Booleans. > However numpy defines np.bool too, so being able to have an abstract > Boolean class for both python bool and numpy bool would be great. I don't know anything about numpy bools, but Python built-in bools are numbers, and as such already have an ABC: they are a subclass of int. -- Steve _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Feb 12 22:50:47 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 13 Feb 2018 13:50:47 +1000 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: References: <20180212134502.GF26553@ando.pearwood.info> Message-ID: On 13 February 2018 at 02:14, David Mertz wrote: > NumPy np.bool_ is specifically not a subclass of any np.int_. If it we're, > there would be an ambiguity between indexing with a Boolean array and an > array of ints. Both are meaningful, but they mean different things (mask vs > collection of indices). > > Do we have other examples a Python ABC that exists to accommodate something > outside the standard library or builtins? Even if not, NumPy is special... > the actual syntax for '@' exists primarily for that library! collections.abc.Sequence and collections.abc.Mapping come to mind - the standard library doesn't tend to distinguish between different kinds of subscriptable objects, but it's a distinction some third party libraries and tools want to be able to make reliably. The other comparison that comes to mind would be the distinction between "__int__" ("can be coerced to an integer, but may lose information in the process") and "__index__" ("can be losslessly converted to and from a builtin integer"). Right now, we only define boolean coercion via "__bool__" - there's no mechanism to say "this *is* a boolean value that can be losslessly converted to and from the builtin boolean constants". That isn't a distinction the standard library makes, but it sounds like it's one that NumPy cares about (and NumPy was also the main driver for introducing __index__). Cheers, Nick. 
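(To make that existing split concrete -- float and bool here are just
convenient illustrations, nothing new:)

import operator

print(int(3.7))               # 3 -- lossy coercion via __int__
try:
    operator.index(3.7)       # refused: a float is not losslessly an integer
except TypeError as exc:
    print(exc)
print(operator.index(True))   # 1 -- bool already round-trips through int today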
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From mertz at gnosis.cx Tue Feb 13 01:07:59 2018 From: mertz at gnosis.cx (David Mertz) Date: Mon, 12 Feb 2018 22:07:59 -0800 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: References: <20180212134502.GF26553@ando.pearwood.info> Message-ID: I'm not sure I'm convinced by Sylvain that Boolean needs to be an ABC in the standard library; Guido expresses skepticism. Of course it is possible to define it in some other library that actually needs to use `isinstance(x, Boolean)` as Sylvain demonstraits in his post. I'm not sure I'm unconvinced either, I can see a certain value to saying a given value is "fully round-trippable to bool" (as is np.bool_). But just for anyone who doesn't know NumPy, here's a quick illustration of what I alluded to: In [1]: import numpy as np In [2]: arr = np.array([7,8,12,33]) In [3]: ndx1 = np.array([0,1,1,0], dtype=int) In [4]: ndx2 = np.array([0,1,1,0], dtype=bool) In [5]: arr[ndx1] Out[5]: array([7, 8, 8, 7]) In [6]: arr[ndx2] Out[6]: array([ 8, 12]) ndx1 and ndx2 are both nice things (and are both often programmatically constructed by operations in NumPy). But indexing using ndx1 gives us an array of the things in the listed *positions* in arr. In this case, we happen to choose two each of the things an index 0 and index 1 in the result. Indexing by ndx2 gives us a filter of only those positions in arr corresponding to 'True's. These are both nice things to be able to do, but if NumPy's True was a special kind of 1, it wouldn't work out unambiguously. However, recent versions of NumPy *have* gotten a bit smarter about recognizing the special type of Python bools, so it's less of a trap than it used to be. Still, contrast these (using actual Python lists for the indexes: In [10]: arr[[False, True, True, False]] Out[10]: array([ 8, 12]) In [11]: arr[[False, True, 1, 0]] Out[11]: array([7, 8, 8, 7]) On Mon, Feb 12, 2018 at 7:50 PM, Nick Coghlan wrote: > On 13 February 2018 at 02:14, David Mertz wrote: > > NumPy np.bool_ is specifically not a subclass of any np.int_. If it > we're, > > there would be an ambiguity between indexing with a Boolean array and an > > array of ints. Both are meaningful, but they mean different things (mask > vs > > collection of indices). > > > > Do we have other examples a Python ABC that exists to accommodate > something > > outside the standard library or builtins? Even if not, NumPy is > special... > > the actual syntax for '@' exists primarily for that library! > > collections.abc.Sequence and collections.abc.Mapping come to mind - > the standard library doesn't tend to distinguish between different > kinds of subscriptable objects, but it's a distinction some third > party libraries and tools want to be able to make reliably. > > The other comparison that comes to mind would be the distinction > between "__int__" ("can be coerced to an integer, but may lose > information in the process") and "__index__" ("can be losslessly > converted to and from a builtin integer"). > > Right now, we only define boolean coercion via "__bool__" - there's no > mechanism to say "this *is* a boolean value that can be losslessly > converted to and from the builtin boolean constants". That isn't a > distinction the standard library makes, but it sounds like it's one > that NumPy cares about (and NumPy was also the main driver for > introducing __index__). > > Cheers, > Nick. 
> > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From desmoulinmichel at gmail.com Tue Feb 13 07:49:43 2018 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Tue, 13 Feb 2018 13:49:43 +0100 Subject: [Python-ideas] Complicate str methods In-Reply-To: References: Message-ID: <947639da-a220-491c-637d-8b93437cd8a1@gmail.com> Le 03/02/2018 ? 23:04, Franklin? Lee a ?crit?: > Let s be a str. I propose to allow these existing str methods to take > params in new forms. > > s.replace(old, new): > ? ? Allow passing in a collection of olds. > ? ? Allow passing in a single argument, a mapping of olds to news. > ? ? Allow the olds in the mapping to be tuples of strings. > > s.split(sep), s.rsplit, s.partition: > ? ? Allow sep to be a collection of separators. > > s.startswith, s.endswith: > ? ? Allow argument to be a collection of strings. > > s.find, s.index, s.count, x in s: > ? ? Similar. > ? ? These methods are also in `list`, which can't distinguish between > items, subsequences, and subsets. However, `str` is already inconsistent > with `list` here: list.M looks for an item, while str.M looks for a > subsequence. > > s.[r|l]strip: > ? ? Sadly, these functions already interpret their str arguments as > collections of characters. > I second that proposal. I regularly need those and feel frustrated when it doesn't work. I even wrote a wrapper that does exactly this : https://github.com/Tygs/ww/blob/master/src/ww/wrappers/strings.py But because it's pure Python, it's guaranteed to be slow. Plus you need to install it every time you need it. From chris.barker at noaa.gov Tue Feb 13 15:12:29 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 13 Feb 2018 12:12:29 -0800 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: References: <20180212134502.GF26553@ando.pearwood.info> Message-ID: On Mon, Feb 12, 2018 at 10:07 PM, David Mertz wrote: > I'm not sure I'm convinced by Sylvain that Boolean needs to be an ABC in > the standard library; Guido expresses skepticism. Of course it is possible > to define it in some other library that actually needs to use > `isinstance(x, Boolean)` as Sylvain demonstraits in his post. I'm not sure > I'm unconvinced either, I can see a certain value to saying a given value > is "fully round-trippable to bool" (as is np.bool_). > But is an ABC the way to do it? Personally, I'm skeptical that ABCs are a solution to, well, anything (as apposed to duck typing and EAFTP). Take Nick's example: """ The other comparison that comes to mind would be the distinction between "__int__" ("can be coerced to an integer, but may lose information in the process") and "__index__" ("can be losslessly converted to and from a builtin integer"). """ I suppose we could have had an Index ABC -- but that seems painful to me. so maybe we could use a __true_bool__ special method? (and an operator.true_bool() function ???) (this all makes me wish that python bools were more pure -- but way to late for that!) I guess it comes down to whether you want to: - Ask the question: "is this object a boolean?" 
or - Make this object a boolean __index__ (and operator.index()) is essentially the later -- you want to make an index out of whatever object you have, if you can do so. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From sylvain.marie at schneider-electric.com Wed Feb 14 03:38:01 2018 From: sylvain.marie at schneider-electric.com (Sylvain MARIE) Date: Wed, 14 Feb 2018 08:38:01 +0000 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: References: <20180212134502.GF26553@ando.pearwood.info> Message-ID: My point is just that today, I use the ?numbers? package classes (Integral, Real, ?) for PEP484 type-hinting, and I find it quite useful in term of input type validation (in combination with PEP484-compliant type checkers, whether static or dynamic). Adding a Boolean ABC with a similar behavior would certainly add consistency to that ?numbers? package ? only for users who already find it useful, of course. Note that my use case is not about converting an object to a Boolean, I?m just speaking about type validation of a ?true? boolean object, for example to be received as a function argument for a flag option. This is for example for users who want to define strongly-typed APIs for interaction with the ?outside world?, and keep using duck-typing for internals. Sylvain De : Python-ideas [mailto:python-ideas-bounces+sylvain.marie=schneider-electric.com at python.org] De la part de Chris Barker Envoy? : mardi 13 f?vrier 2018 21:12 ? : David Mertz Cc : python-ideas Objet : Re: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module On Mon, Feb 12, 2018 at 10:07 PM, David Mertz > wrote: I'm not sure I'm convinced by Sylvain that Boolean needs to be an ABC in the standard library; Guido expresses skepticism. Of course it is possible to define it in some other library that actually needs to use `isinstance(x, Boolean)` as Sylvain demonstraits in his post. I'm not sure I'm unconvinced either, I can see a certain value to saying a given value is "fully round-trippable to bool" (as is np.bool_). But is an ABC the way to do it? Personally, I'm skeptical that ABCs are a solution to, well, anything (as apposed to duck typing and EAFTP). Take Nick's example: """ The other comparison that comes to mind would be the distinction between "__int__" ("can be coerced to an integer, but may lose information in the process") and "__index__" ("can be losslessly converted to and from a builtin integer"). """ I suppose we could have had an Index ABC -- but that seems painful to me. so maybe we could use a __true_bool__ special method? (and an operator.true_bool() function ???) (this all makes me wish that python bools were more pure -- but way to late for that!) I guess it comes down to whether you want to: - Ask the question: "is this object a boolean?" or - Make this object a boolean __index__ (and operator.index()) is essentially the later -- you want to make an index out of whatever object you have, if you can do so. -CHB -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov ______________________________________________________________________ This email has been scanned by the Symantec Email Security.cloud service. ______________________________________________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From gvanrossum at gmail.com Wed Feb 14 11:13:32 2018 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed, 14 Feb 2018 08:13:32 -0800 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: References: <20180212134502.GF26553@ando.pearwood.info> Message-ID: Can you show some sample code that you have written that shows where this would be useful? Note that using the numbers package actually makes static type checking through e.g. mypy difficult. So I presume you are talking about dynamic checking? --Guido On Feb 14, 2018 12:42 AM, "Sylvain MARIE" < sylvain.marie at schneider-electric.com> wrote: My point is just that today, I use the ?numbers? package classes (Integral, Real, ?) for PEP484 type-hinting, and I find it quite useful in term of input type validation (in combination with PEP484-compliant type checkers, whether static or dynamic). Adding a Boolean ABC with a similar behavior would certainly add consistency to that ?numbers? package ? only for users who already find it useful, of course. Note that my use case is not about converting an object to a Boolean, I?m just speaking about type validation of a ?true? boolean object, for example to be received as a function argument for a flag option. This is for example for users who want to define strongly-typed APIs for interaction with the ?outside world?, and keep using duck-typing for internals. Sylvain *De :* Python-ideas [mailto:python-ideas-bounces+sylvain.marie=schneider- electric.com at python.org] *De la part de* Chris Barker *Envoy? :* mardi 13 f?vrier 2018 21:12 *? :* David Mertz *Cc :* python-ideas *Objet :* Re: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module On Mon, Feb 12, 2018 at 10:07 PM, David Mertz wrote: I'm not sure I'm convinced by Sylvain that Boolean needs to be an ABC in the standard library; Guido expresses skepticism. Of course it is possible to define it in some other library that actually needs to use `isinstance(x, Boolean)` as Sylvain demonstraits in his post. I'm not sure I'm unconvinced either, I can see a certain value to saying a given value is "fully round-trippable to bool" (as is np.bool_). But is an ABC the way to do it? Personally, I'm skeptical that ABCs are a solution to, well, anything (as apposed to duck typing and EAFTP). Take Nick's example: """ The other comparison that comes to mind would be the distinction between "__int__" ("can be coerced to an integer, but may lose information in the process") and "__index__" ("can be losslessly converted to and from a builtin integer"). """ I suppose we could have had an Index ABC -- but that seems painful to me. so maybe we could use a __true_bool__ special method? (and an operator.true_bool() function ???) (this all makes me wish that python bools were more pure -- but way to late for that!) I guess it comes down to whether you want to: - Ask the question: "is this object a boolean?" 
or - Make this object a boolean __index__ (and operator.index()) is essentially the later -- you want to make an index out of whatever object you have, if you can do so. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov ______________________________________________________________________ This email has been scanned by the Symantec Email Security.cloud service. ______________________________________________________________________ _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sylvain.marie at schneider-electric.com Tue Feb 13 04:21:14 2018 From: sylvain.marie at schneider-electric.com (Sylvain MARIE) Date: Tue, 13 Feb 2018 09:21:14 +0000 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: References: <20180212134502.GF26553@ando.pearwood.info> Message-ID: The main use case I had in mind was PEP484-based type hinting/checking actually: def my_function(foo: Boolean): pass explicitly states that my_function accepts any Boolean value, whether it is a python bool or a np.bool that would come from a numpy array or pandas dataframe. Note that type hinting is also the use case for which I make extensive use of the types from the ?numbers? package, for the same reasons. Sylvain De : Python-ideas [mailto:python-ideas-bounces+sylvain.marie=schneider-electric.com at python.org] De la part de David Mertz Envoy? : mardi 13 f?vrier 2018 07:08 ? : Nick Coghlan Cc : python-ideas Objet : Re: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module I'm not sure I'm convinced by Sylvain that Boolean needs to be an ABC in the standard library; Guido expresses skepticism. Of course it is possible to define it in some other library that actually needs to use `isinstance(x, Boolean)` as Sylvain demonstraits in his post. I'm not sure I'm unconvinced either, I can see a certain value to saying a given value is "fully round-trippable to bool" (as is np.bool_). But just for anyone who doesn't know NumPy, here's a quick illustration of what I alluded to: In [1]: import numpy as np In [2]: arr = np.array([7,8,12,33]) In [3]: ndx1 = np.array([0,1,1,0], dtype=int) In [4]: ndx2 = np.array([0,1,1,0], dtype=bool) In [5]: arr[ndx1] Out[5]: array([7, 8, 8, 7]) In [6]: arr[ndx2] Out[6]: array([ 8, 12]) ndx1 and ndx2 are both nice things (and are both often programmatically constructed by operations in NumPy). But indexing using ndx1 gives us an array of the things in the listed positions in arr. In this case, we happen to choose two each of the things an index 0 and index 1 in the result. Indexing by ndx2 gives us a filter of only those positions in arr corresponding to 'True's. These are both nice things to be able to do, but if NumPy's True was a special kind of 1, it wouldn't work out unambiguously. However, recent versions of NumPy have gotten a bit smarter about recognizing the special type of Python bools, so it's less of a trap than it used to be. 
Still, contrast these (using actual Python lists for the indexes: In [10]: arr[[False, True, True, False]] Out[10]: array([ 8, 12]) In [11]: arr[[False, True, 1, 0]] Out[11]: array([7, 8, 8, 7]) On Mon, Feb 12, 2018 at 7:50 PM, Nick Coghlan > wrote: On 13 February 2018 at 02:14, David Mertz > wrote: > NumPy np.bool_ is specifically not a subclass of any np.int_. If it we're, > there would be an ambiguity between indexing with a Boolean array and an > array of ints. Both are meaningful, but they mean different things (mask vs > collection of indices). > > Do we have other examples a Python ABC that exists to accommodate something > outside the standard library or builtins? Even if not, NumPy is special... > the actual syntax for '@' exists primarily for that library! collections.abc.Sequence and collections.abc.Mapping come to mind - the standard library doesn't tend to distinguish between different kinds of subscriptable objects, but it's a distinction some third party libraries and tools want to be able to make reliably. The other comparison that comes to mind would be the distinction between "__int__" ("can be coerced to an integer, but may lose information in the process") and "__index__" ("can be losslessly converted to and from a builtin integer"). Right now, we only define boolean coercion via "__bool__" - there's no mechanism to say "this *is* a boolean value that can be losslessly converted to and from the builtin boolean constants". That isn't a distinction the standard library makes, but it sounds like it's one that NumPy cares about (and NumPy was also the main driver for introducing __index__). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. ______________________________________________________________________ This email has been scanned by the Symantec Email Security.cloud service. ______________________________________________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Feb 14 13:47:21 2018 From: guido at python.org (Guido van Rossum) Date: Wed, 14 Feb 2018 10:47:21 -0800 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: References: <20180212134502.GF26553@ando.pearwood.info> Message-ID: I am mystified how you can be using the numbers package with mypy. Example: import numbers def f(a: numbers.Integral, b: numbers.Integral) -> numbers.Integral: return a + b f(12, 12) This gives an two errors on the last line when checked by mypy: _.py:10: error: Argument 1 to "f" has incompatible type "int"; expected "Integral" _.py:10: error: Argument 2 to "f" has incompatible type "int"; expected "Integral" On Tue, Feb 13, 2018 at 1:21 AM, Sylvain MARIE < sylvain.marie at schneider-electric.com> wrote: > The main use case I had in mind was PEP484-based type hinting/checking > actually: > > > > def my_function(foo: Boolean): > > pass > > > > explicitly states that my_function accepts any Boolean value, whether it > is a python bool or a np.bool that would come from a numpy array or pandas > dataframe. > > Note that type hinting is also the use case for which I make extensive use > of the types from the ?numbers? 
package, for the same reasons. > > > > Sylvain > > > > *De :* Python-ideas [mailto:python-ideas-bounces+sylvain.marie=schneider- > electric.com at python.org] *De la part de* David Mertz > *Envoy? :* mardi 13 f?vrier 2018 07:08 > *? :* Nick Coghlan > *Cc :* python-ideas > *Objet :* Re: [Python-ideas] Boolean ABC similar to what's provided in > the 'numbers' module > > > > I'm not sure I'm convinced by Sylvain that Boolean needs to be an ABC in > the standard library; Guido expresses skepticism. Of course it is possible > to define it in some other library that actually needs to use > `isinstance(x, Boolean)` as Sylvain demonstraits in his post. I'm not sure > I'm unconvinced either, I can see a certain value to saying a given value > is "fully round-trippable to bool" (as is np.bool_). > > > > But just for anyone who doesn't know NumPy, here's a quick illustration of > what I alluded to: > > > > In [1]: import numpy as np > > In [2]: arr = np.array([7,8,12,33]) > > In [3]: ndx1 = np.array([0,1,1,0], dtype=int) > > In [4]: ndx2 = np.array([0,1,1,0], dtype=bool) > > In [5]: arr[ndx1] > > Out[5]: array([7, 8, 8, 7]) > > In [6]: arr[ndx2] > > Out[6]: array([ 8, 12]) > > > > ndx1 and ndx2 are both nice things (and are both often programmatically > constructed by operations in NumPy). But indexing using ndx1 gives us an > array of the things in the listed *positions* in arr. In this case, we > happen to choose two each of the things an index 0 and index 1 in the > result. > > > > Indexing by ndx2 gives us a filter of only those positions in arr > corresponding to 'True's. These are both nice things to be able to do, but > if NumPy's True was a special kind of 1, it wouldn't work out > unambiguously. However, recent versions of NumPy *have* gotten a bit > smarter about recognizing the special type of Python bools, so it's less of > a trap than it used to be. Still, contrast these (using actual Python > lists for the indexes: > > > > In [10]: arr[[False, True, True, False]] > > Out[10]: array([ 8, 12]) > > In [11]: arr[[False, True, 1, 0]] > > Out[11]: array([7, 8, 8, 7]) > > > > > > > > On Mon, Feb 12, 2018 at 7:50 PM, Nick Coghlan wrote: > > On 13 February 2018 at 02:14, David Mertz wrote: > > NumPy np.bool_ is specifically not a subclass of any np.int_. If it > we're, > > there would be an ambiguity between indexing with a Boolean array and an > > array of ints. Both are meaningful, but they mean different things (mask > vs > > collection of indices). > > > > Do we have other examples a Python ABC that exists to accommodate > something > > outside the standard library or builtins? Even if not, NumPy is > special... > > the actual syntax for '@' exists primarily for that library! > > collections.abc.Sequence and collections.abc.Mapping come to mind - > the standard library doesn't tend to distinguish between different > kinds of subscriptable objects, but it's a distinction some third > party libraries and tools want to be able to make reliably. > > The other comparison that comes to mind would be the distinction > between "__int__" ("can be coerced to an integer, but may lose > information in the process") and "__index__" ("can be losslessly > converted to and from a builtin integer"). > > Right now, we only define boolean coercion via "__bool__" - there's no > mechanism to say "this *is* a boolean value that can be losslessly > converted to and from the builtin boolean constants". 
That isn't a > distinction the standard library makes, but it sounds like it's one > that NumPy cares about (and NumPy was also the main driver for > introducing __index__). > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > > > > > > -- > > Keeping medicines from the bloodstreams of the sick; food > from the bellies of the hungry; books from the hands of the > uneducated; technology from the underdeveloped; and putting > advocates of freedom in prisons. Intellectual property is > to the 21st century what the slave trade was to the 16th. > > > ______________________________________________________________________ > This email has been scanned by the Symantec Email Security.cloud service. > ______________________________________________________________________ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at notcom.com Wed Feb 14 17:29:56 2018 From: eric at notcom.com (Eric Osborne) Date: Wed, 14 Feb 2018 22:29:56 +0000 Subject: [Python-ideas] Extending __format__ method in ipaddress Message-ID: Folks- The ipaddress library returns an IP address object which can represent itself in a number of ways: In [1]: import ipaddress In [2]: v4 = ipaddress.IPv4Address('1.2.3.4') In [3]: print(v4) 1.2.3.4 In [4]: v4 Out[4]: IPv4Address('1.2.3.4') In [6]: v4.packed Out[6]: b'\x01\x02\x03\x04' In [9]: str(v4) Out[9]: '1.2.3.4' In [10]: int(v4) Out[10]: 16909060 In [13]: bin(int(v4)) Out[13]: '0b1000000100000001100000100' In [14]: hex(int(v4)) Out[14]: '0x1020304' In [15]: oct(int(v4)) Out[15]: '0o100401404' There are IPv6 objects as well: In [6]: v6 = ipaddress.IPv6Address('2001:0db8:85a3:0000:0000:8a2e:0370:7334') In [7]: int(v6) Out[7]: 42540766452641154071740215577757643572 and what I'm proposing will work for both address families. In either case, bin/hex/oct don't work on them directly, but on the integer representation. This is a little annoying but not such a big deal. What is a big deal (at least to me) is that the binary representation isn't zero-padded. This makes it harder to compare two IP addresses by eye to see what the differences are, i.e.: In [16]: a = ipaddress.IPv4Address('0.2.3.4') In [30]: bin(int(a)) Out[30]: '0b100000001100000100' In [31]: bin(int(v4)) Out[31]: '0b1000000100000001100000100' It would be nice if there was a way to have an IP address always present itself in fully zero-padded binary (32 bits for IPv4, 128 bits for IPv6). I find this particularly convenient when putting together training material, as it's easier to show subnetting and aggregation if you point at the binary than if you give people dotted-quad addresses and ask them to do the binary conversion in their head. Hex is also handy when you're comparing a dotted-quad IP address to a hex sniffer trace. It's possible to do this in a one-liner (thanks to Eric Smith): f'{int(v4):#0{34}b}'. But this is a little cryptic. I opened bpo-32820 (https://github.com/python/cpython/pull/5627) to contribute a way to do this. I started with an __index__ method but Issue 15559 ( https://github.com/python/cpython/commit/e0c3f5edc0f20cc28363258df501758c1bdb1ca7) rules this out. I instead added a bits() class method so that v4.bits would return the fully padded string. 
This was not terribly pretty, but it mirrored packed(), at least. Nick Coghlan suggested I instead extend __format__, which is what the diffs in the current pull request do. This allows a great deal more flexibility: the current code takes 'b', 'n', or 'x' types, as well as the '#' option and support for the '_' separator. I realize now I didn't add 'o' but I certainly can for completeness. I debated adding rfc1924 encoding for IPv6 addresses but decided it was entirely too silly. This is just a convenience function, but IMO fills a need. Is this worth pursuing? eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From sylvain.marie at schneider-electric.com Wed Feb 14 17:36:56 2018 From: sylvain.marie at schneider-electric.com (Sylvain MARIE) Date: Wed, 14 Feb 2018 22:36:56 +0000 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: References: <20180212134502.GF26553@ando.pearwood.info> Message-ID: I see :) This does not seem to happen with PyCharm IDE + Anaconda distribution. Is PyCharm relying on MyPy under the hood ? I actually have no knowledge at all about MyPy and how it relates to PyCharm static code analysis warnings. I?m pretty sure though that the runtime checkers (enforce, pytypes) are not dependent on MyPy. Sylvain De : gvanrossum at gmail.com [mailto:gvanrossum at gmail.com] De la part de Guido van Rossum Envoy? : mercredi 14 f?vrier 2018 19:47 ? : Sylvain MARIE Cc : python-ideas Objet : Re: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module I am mystified how you can be using the numbers package with mypy. Example: import numbers def f(a: numbers.Integral, b: numbers.Integral) -> numbers.Integral: return a + b f(12, 12) This gives an two errors on the last line when checked by mypy: _.py:10: error: Argument 1 to "f" has incompatible type "int"; expected "Integral" _.py:10: error: Argument 2 to "f" has incompatible type "int"; expected "Integral" On Tue, Feb 13, 2018 at 1:21 AM, Sylvain MARIE > wrote: The main use case I had in mind was PEP484-based type hinting/checking actually: def my_function(foo: Boolean): pass explicitly states that my_function accepts any Boolean value, whether it is a python bool or a np.bool that would come from a numpy array or pandas dataframe. Note that type hinting is also the use case for which I make extensive use of the types from the ?numbers? package, for the same reasons. Sylvain De : Python-ideas [mailto:python-ideas-bounces+sylvain.marie=schneider-electric.com at python.org] De la part de David Mertz Envoy? : mardi 13 f?vrier 2018 07:08 ? : Nick Coghlan > Cc : python-ideas > Objet : Re: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module I'm not sure I'm convinced by Sylvain that Boolean needs to be an ABC in the standard library; Guido expresses skepticism. Of course it is possible to define it in some other library that actually needs to use `isinstance(x, Boolean)` as Sylvain demonstraits in his post. I'm not sure I'm unconvinced either, I can see a certain value to saying a given value is "fully round-trippable to bool" (as is np.bool_). 
But just for anyone who doesn't know NumPy, here's a quick illustration of what I alluded to: In [1]: import numpy as np In [2]: arr = np.array([7,8,12,33]) In [3]: ndx1 = np.array([0,1,1,0], dtype=int) In [4]: ndx2 = np.array([0,1,1,0], dtype=bool) In [5]: arr[ndx1] Out[5]: array([7, 8, 8, 7]) In [6]: arr[ndx2] Out[6]: array([ 8, 12]) ndx1 and ndx2 are both nice things (and are both often programmatically constructed by operations in NumPy). But indexing using ndx1 gives us an array of the things in the listed positions in arr. In this case, we happen to choose two each of the things an index 0 and index 1 in the result. Indexing by ndx2 gives us a filter of only those positions in arr corresponding to 'True's. These are both nice things to be able to do, but if NumPy's True was a special kind of 1, it wouldn't work out unambiguously. However, recent versions of NumPy have gotten a bit smarter about recognizing the special type of Python bools, so it's less of a trap than it used to be. Still, contrast these (using actual Python lists for the indexes: In [10]: arr[[False, True, True, False]] Out[10]: array([ 8, 12]) In [11]: arr[[False, True, 1, 0]] Out[11]: array([7, 8, 8, 7]) On Mon, Feb 12, 2018 at 7:50 PM, Nick Coghlan > wrote: On 13 February 2018 at 02:14, David Mertz > wrote: > NumPy np.bool_ is specifically not a subclass of any np.int_. If it we're, > there would be an ambiguity between indexing with a Boolean array and an > array of ints. Both are meaningful, but they mean different things (mask vs > collection of indices). > > Do we have other examples a Python ABC that exists to accommodate something > outside the standard library or builtins? Even if not, NumPy is special... > the actual syntax for '@' exists primarily for that library! collections.abc.Sequence and collections.abc.Mapping come to mind - the standard library doesn't tend to distinguish between different kinds of subscriptable objects, but it's a distinction some third party libraries and tools want to be able to make reliably. The other comparison that comes to mind would be the distinction between "__int__" ("can be coerced to an integer, but may lose information in the process") and "__index__" ("can be losslessly converted to and from a builtin integer"). Right now, we only define boolean coercion via "__bool__" - there's no mechanism to say "this *is* a boolean value that can be losslessly converted to and from the builtin boolean constants". That isn't a distinction the standard library makes, but it sounds like it's one that NumPy cares about (and NumPy was also the main driver for introducing __index__). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. ______________________________________________________________________ This email has been scanned by the Symantec Email Security.cloud service. 
______________________________________________________________________ _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -- --Guido van Rossum (python.org/~guido) ______________________________________________________________________ This email has been scanned by the Symantec Email Security.cloud service. ______________________________________________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Feb 14 18:34:51 2018 From: guido at python.org (Guido van Rossum) Date: Wed, 14 Feb 2018 15:34:51 -0800 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: References: <20180212134502.GF26553@ando.pearwood.info> Message-ID: No, PyCharm has its own annotation checker, which is much more lenient than mypy (and less compliant with PEP 484). And indeed the runtime checkers are also unrelated (though a runtime checker will have to work with the type objects created by typing.py). So as long as you are not expecting to ever need mypy you should be fine -- however if you're sharing code at some point someone is probably going to want to point mypy at it. On Wed, Feb 14, 2018 at 2:36 PM, Sylvain MARIE < sylvain.marie at schneider-electric.com> wrote: > I see :) > > > > This does not seem to happen with PyCharm IDE + Anaconda distribution. Is > PyCharm relying on MyPy under the hood ? > > I actually have no knowledge at all about MyPy and how it relates to > PyCharm static code analysis warnings. I?m pretty sure though that the > runtime checkers (enforce, pytypes) are not dependent on MyPy. > > > > Sylvain > > > > *De :* gvanrossum at gmail.com [mailto:gvanrossum at gmail.com] *De la part de* > Guido van Rossum > *Envoy? :* mercredi 14 f?vrier 2018 19:47 > *? :* Sylvain MARIE > > *Cc :* python-ideas > *Objet :* Re: [Python-ideas] Boolean ABC similar to what's provided in > the 'numbers' module > > > > I am mystified how you can be using the numbers package with mypy. Example: > > import numbers > def f(a: numbers.Integral, b: numbers.Integral) -> numbers.Integral: > return a + b > f(12, 12) > > This gives an two errors on the last line when checked by mypy: > > _.py:10: error: Argument 1 to "f" has incompatible type "int"; expected > "Integral" > _.py:10: error: Argument 2 to "f" has incompatible type "int"; expected > "Integral" > > > > On Tue, Feb 13, 2018 at 1:21 AM, Sylvain MARIE electric.com> wrote: > > The main use case I had in mind was PEP484-based type hinting/checking > actually: > > > > def my_function(foo: Boolean): > > pass > > > > explicitly states that my_function accepts any Boolean value, whether it > is a python bool or a np.bool that would come from a numpy array or pandas > dataframe. > > Note that type hinting is also the use case for which I make extensive use > of the types from the ?numbers? package, for the same reasons. > > > > Sylvain > > > > *De :* Python-ideas [mailto:python-ideas-bounces+sylvain.marie=schneider- > electric.com at python.org] *De la part de* David Mertz > *Envoy? :* mardi 13 f?vrier 2018 07:08 > *? :* Nick Coghlan > *Cc :* python-ideas > *Objet :* Re: [Python-ideas] Boolean ABC similar to what's provided in > the 'numbers' module > > > > I'm not sure I'm convinced by Sylvain that Boolean needs to be an ABC in > the standard library; Guido expresses skepticism. 
Of course it is possible > to define it in some other library that actually needs to use > `isinstance(x, Boolean)` as Sylvain demonstraits in his post. I'm not sure > I'm unconvinced either, I can see a certain value to saying a given value > is "fully round-trippable to bool" (as is np.bool_). > > > > But just for anyone who doesn't know NumPy, here's a quick illustration of > what I alluded to: > > > > In [1]: import numpy as np > > In [2]: arr = np.array([7,8,12,33]) > > In [3]: ndx1 = np.array([0,1,1,0], dtype=int) > > In [4]: ndx2 = np.array([0,1,1,0], dtype=bool) > > In [5]: arr[ndx1] > > Out[5]: array([7, 8, 8, 7]) > > In [6]: arr[ndx2] > > Out[6]: array([ 8, 12]) > > > > ndx1 and ndx2 are both nice things (and are both often programmatically > constructed by operations in NumPy). But indexing using ndx1 gives us an > array of the things in the listed *positions* in arr. In this case, we > happen to choose two each of the things an index 0 and index 1 in the > result. > > > > Indexing by ndx2 gives us a filter of only those positions in arr > corresponding to 'True's. These are both nice things to be able to do, but > if NumPy's True was a special kind of 1, it wouldn't work out > unambiguously. However, recent versions of NumPy *have* gotten a bit > smarter about recognizing the special type of Python bools, so it's less of > a trap than it used to be. Still, contrast these (using actual Python > lists for the indexes: > > > > In [10]: arr[[False, True, True, False]] > > Out[10]: array([ 8, 12]) > > In [11]: arr[[False, True, 1, 0]] > > Out[11]: array([7, 8, 8, 7]) > > > > > > > > On Mon, Feb 12, 2018 at 7:50 PM, Nick Coghlan wrote: > > On 13 February 2018 at 02:14, David Mertz wrote: > > NumPy np.bool_ is specifically not a subclass of any np.int_. If it > we're, > > there would be an ambiguity between indexing with a Boolean array and an > > array of ints. Both are meaningful, but they mean different things (mask > vs > > collection of indices). > > > > Do we have other examples a Python ABC that exists to accommodate > something > > outside the standard library or builtins? Even if not, NumPy is > special... > > the actual syntax for '@' exists primarily for that library! > > collections.abc.Sequence and collections.abc.Mapping come to mind - > the standard library doesn't tend to distinguish between different > kinds of subscriptable objects, but it's a distinction some third > party libraries and tools want to be able to make reliably. > > The other comparison that comes to mind would be the distinction > between "__int__" ("can be coerced to an integer, but may lose > information in the process") and "__index__" ("can be losslessly > converted to and from a builtin integer"). > > Right now, we only define boolean coercion via "__bool__" - there's no > mechanism to say "this *is* a boolean value that can be losslessly > converted to and from the builtin boolean constants". That isn't a > distinction the standard library makes, but it sounds like it's one > that NumPy cares about (and NumPy was also the main driver for > introducing __index__). > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > > > > > > -- > > Keeping medicines from the bloodstreams of the sick; food > from the bellies of the hungry; books from the hands of the > uneducated; technology from the underdeveloped; and putting > advocates of freedom in prisons. Intellectual property is > to the 21st century what the slave trade was to the 16th. 
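
To make the `isinstance(x, Boolean)` idea quoted above concrete, here is a minimal sketch of what such an ABC could look like if defined outside the standard library. This is illustrative only: it is not Sylvain's actual code, nothing like it ships in the stdlib, and the names Boolean and takes_flag are made up for the example (numpy is assumed to be installed only because the discussion is about np.bool_):

from abc import ABC

import numpy as np


class Boolean(ABC):
    """Marker ABC for values that round-trip losslessly to and from builtin bool."""


# Virtual registration: no inheritance needed, isinstance() simply starts saying yes.
Boolean.register(bool)
Boolean.register(np.bool_)


def takes_flag(flag):
    # Accept bool and np.bool_, but reject a plain int such as 1,
    # even though bool itself subclasses int.
    if not isinstance(flag, Boolean):
        raise TypeError(f"expected a boolean, got {type(flag).__name__}")
    return bool(flag)


print(takes_flag(True))              # True
print(takes_flag(np.bool_(False)))   # False
# takes_flag(1) would raise TypeError: expected a boolean, got int

The same registration trick is what the numbers ABCs rely on, which is why the proposal reads as "do for bool what numbers already does for int, float and complex".
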
> > > ______________________________________________________________________ > This email has been scanned by the Symantec Email Security.cloud service. > ______________________________________________________________________ > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > > -- > > --Guido van Rossum (python.org/~guido) > > > ______________________________________________________________________ > This email has been scanned by the Symantec Email Security.cloud service. > ______________________________________________________________________ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Wed Feb 14 19:18:23 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 15 Feb 2018 11:18:23 +1100 Subject: [Python-ideas] Give ipaddresses an __index__ method Message-ID: <20180215001823.GH26553@ando.pearwood.info> This idea is inspired by Eric Osborne's post "Extending __format__ method in ipaddress", but I wanted to avoid derailing that thread. I notice what seems to be an inconsistency in the ipaddress objects: py> v4 = ipaddress.IPv4Address('1.2.3.4') py> bin(v4) Traceback (most recent call last): File "", line 1, in TypeError: 'IPv4Address' object cannot be interpreted as an integer But that's surely not right: we just need to explicitly do so: py> bin(int(v4)) '0b1000000100000001100000100' IP addresses are, in a strong sense, integers: either 32 or 128 bits. And they can be explicitly converted losslessly to and from integers: py> v4 == ipaddress.IPv4Address(int(v4)) True Is there a good reason not to give them an __index__ method so that bin(), oct() and hex() will work directly? py> class X(ipaddress.IPv4Address): ... def __index__(self): ... return int(self) ... py> a = X('1.2.3.4') py> bin(a) '0b1000000100000001100000100' I acknowledge one potentially undesirable side-effect: this would allow using IP addresses as indexes into sequences: py> 'abcdef'[X('0.0.0.2')] 'c' but while it's weird to do this, I don't think it's logically wrong. -- Steve From rosuav at gmail.com Wed Feb 14 19:45:46 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 15 Feb 2018 11:45:46 +1100 Subject: [Python-ideas] Give ipaddresses an __index__ method In-Reply-To: <20180215001823.GH26553@ando.pearwood.info> References: <20180215001823.GH26553@ando.pearwood.info> Message-ID: On Thu, Feb 15, 2018 at 11:18 AM, Steven D'Aprano wrote: > This idea is inspired by Eric Osborne's post "Extending __format__ > method in ipaddress", but I wanted to avoid derailing that thread. > > I notice what seems to be an inconsistency in the ipaddress objects: > > py> v4 = ipaddress.IPv4Address('1.2.3.4') > py> bin(v4) > Traceback (most recent call last): > File "", line 1, in > TypeError: 'IPv4Address' object cannot be interpreted as an integer > > But that's surely not right: we just need to explicitly do so: > > py> bin(int(v4)) > '0b1000000100000001100000100' > > IP addresses are, in a strong sense, integers: either 32 or 128 bits. 
> And they can be explicitly converted losslessly to and from integers: > Except that this computer's IPv4 is not 3232235539, and I never want to enter it that way. I enter it as 192.168.0.19 - as four separate integers. The __index__ method means "this thing really is an integer, and can be used as an index". With IPv6, similar: you think about them as eight separate blocks of digits. IP addresses can be losslessly converted to and from strings, too, and that's a lot more useful. But they still don't have string methods, because they're not strings. > I acknowledge one potentially undesirable side-effect: this would > allow using IP addresses as indexes into sequences: > > py> 'abcdef'[X('0.0.0.2')] > 'c' > > but while it's weird to do this, I don't think it's logically wrong. That's not a side effect. That is the *primary* effect of __index__. If you call int() on something, you are *converting* it to an integer (eg int(2.3) ==> 2), and IMO that is the appropriate way to turn 192.168.0.19 into 3232235539 if ever you want that. Unless you have a use-case for using IP addresses as integers, distinct from Eric's ideas? ChrisA From chris.barker at noaa.gov Wed Feb 14 20:49:18 2018 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 14 Feb 2018 20:49:18 -0500 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: References: <20180212134502.GF26553@ando.pearwood.info> Message-ID: > > So as long as you are not expecting to ever need mypy you should be fine -- however if you're sharing code at some point someone is probably going to want to point mypy at it. mypy isn?t an ?official? tool, but PEP484 is ? and mypy is more or less a reference implimentation, yes? mypy support bool, as far as I can tell, will that not work for your case? Even though the python bools are integer subclasses, that doesn?t mean a type checker shouldn?t flag passing an integer in to a function that expects a bool. -CHB From ethan at stoneleaf.us Wed Feb 14 21:59:10 2018 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 14 Feb 2018 18:59:10 -0800 Subject: [Python-ideas] Extending __format__ method in ipaddress In-Reply-To: References: Message-ID: <5A84F77E.5040102@stoneleaf.us> On 02/14/2018 02:29 PM, Eric Osborne wrote: > Nick Coghlan suggested I instead extend __format__, which is what the diffs in the current pull request do. This > allows a great deal more flexibility: the current code takes 'b', 'n', or 'x' types, as well as the '#' option and > support for the '_' separator. I realize now I didn't add 'o' but I certainly can for completeness. I debated adding > rfc1924 encoding for IPv6 addresses but decided it was entirely too silly. > > This is just a convenience function, but IMO fills a need. Is this worth pursuing? Seems like a good idea to me! -- ~Ethan~ From ncoghlan at gmail.com Wed Feb 14 22:39:13 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 15 Feb 2018 13:39:13 +1000 Subject: [Python-ideas] Give ipaddresses an __index__ method In-Reply-To: <20180215001823.GH26553@ando.pearwood.info> References: <20180215001823.GH26553@ando.pearwood.info> Message-ID: On 15 February 2018 at 10:18, Steven D'Aprano wrote: > This idea is inspired by Eric Osborne's post "Extending __format__ > method in ipaddress", but I wanted to avoid derailing that thread. 
> > I notice what seems to be an inconsistency in the ipaddress objects: > > py> v4 = ipaddress.IPv4Address('1.2.3.4') > py> bin(v4) > Traceback (most recent call last): > File "", line 1, in > TypeError: 'IPv4Address' object cannot be interpreted as an integer That error message should probably either have an "implicitly" in it, or else use the word "handled" rather than "interpreted". There are tests that ensure IP addresses don't implement __index__, and the pragmatic reason for that is the downside you mentioned: to ensure they can't be used as indices, slice endpoints, or range endpoints. While IP addresses can be converted to an integer, they are *not* integers in any mathematical sense, and it doesn't make sense to treat them that way. A useful heuristic for answering the question "Should this type implement __index__?" is "Does this type conform to the numbers.Integral ABC?" (IP addresses definitely don't, as there's no concept of addition, subtraction, multiplication, division, etc - they're discrete entities with a numeric representation, not numbers) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Feb 14 22:52:13 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 15 Feb 2018 13:52:13 +1000 Subject: [Python-ideas] Extending __format__ method in ipaddress In-Reply-To: References: Message-ID: On 15 February 2018 at 08:29, Eric Osborne wrote: > Nick Coghlan suggested I instead extend __format__, which is what the > diffs in the current pull request do. This allows a great deal more > flexibility: the current code takes 'b', 'n', or 'x' types, as well as the > '#' option and support for the '_' separator. +1 from me (unsurprisingly). We added __format__ specifically to give types more control over how they're printed, and this approach is amenable to the simple explanation that the custom IP address formatting works via: - conversion to int - printing in a fixed width field (based on the address size) - in binary or hex based on either the given format character, or the address size ("n", where IPv4=b and IPv6=x) - with a suitable prefix if "#" is given - with four-digit separators if "_" is given > I realize now I didn't add > 'o' but I certainly can for completeness. I'd suggest leaving it out, as octal characters are 3 bits each, so they don't have a natural association with IP address representations any more than decimal representation does (neither 32 nor 128 are divisible by 3). > I debated adding rfc1924 encoding > for IPv6 addresses but decided it was entirely too silly. Yeah, if we decided to support that, we likely *would* add a separate method for it. __format__ works well for "print an IP address as an integer with zero-padding and an automatically calculated field width" though, since we can borrow the notation from regular integer formatting to select the digit base and tweak the display details. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Wed Feb 14 23:14:03 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 15 Feb 2018 15:14:03 +1100 Subject: [Python-ideas] Give ipaddresses an __index__ method In-Reply-To: References: <20180215001823.GH26553@ando.pearwood.info> Message-ID: <20180215041403.GI26553@ando.pearwood.info> On Thu, Feb 15, 2018 at 11:45:46AM +1100, Chris Angelico wrote: > Except that this computer's IPv4 is not 3232235539, and I never want > to enter it that way. I enter it as 192.168.0.19 - as four separate > integers. 
That's partly convention (and a useful convention: it is less error- prone than 3232235539) and partly that because you're a sys admin who can read the individual subfields of an IP address. I'm not suggesting you ought to change your habit. But to civilians, 192.168.0.19 is as opaque as 3232235539 or 0xC0A80013 would be. We allow creating IP address objects from a single int, we don't require four separate int arguments (one for each subfield), and unless I've missed something, IP addresses are not treated as a collection of four separate integers (or more for v6). I can't even find a method to split an address into four ints. (Nor am I sure that there is good reason to want to do so.) So calling a single address "four separate integers" is not really accurate. [...] > IP addresses can be losslessly converted to and from strings, too, and > that's a lot more useful. But they still don't have string methods, > because they're not strings. I agree they're not strings, I never suggested they were. Python only allows IP addresses to be entered as strings because we don't have a "dotted-quad" syntax for 32-bit integers. (Nor am I suggesting we should.) It is meaningless to perform string operations on IP addresses. What would it mean to call addr.replace('.', 'Z') or addr.split('2')? But doing *at least some* int operations on addresses isn't meaningless: py> a = ipaddress.ip_address('192.168.0.19') py> a + 1 IPv4Address('192.168.0.20') -- Steve From rosuav at gmail.com Wed Feb 14 23:43:59 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 15 Feb 2018 15:43:59 +1100 Subject: [Python-ideas] Give ipaddresses an __index__ method In-Reply-To: <20180215041403.GI26553@ando.pearwood.info> References: <20180215001823.GH26553@ando.pearwood.info> <20180215041403.GI26553@ando.pearwood.info> Message-ID: On Thu, Feb 15, 2018 at 3:14 PM, Steven D'Aprano wrote: > On Thu, Feb 15, 2018 at 11:45:46AM +1100, Chris Angelico wrote: > >> Except that this computer's IPv4 is not 3232235539, and I never want >> to enter it that way. I enter it as 192.168.0.19 - as four separate >> integers. > > That's partly convention (and a useful convention: it is less error- > prone than 3232235539) and partly that because you're a sys admin who > can read the individual subfields of an IP address. I'm not suggesting > you ought to change your habit. > > But to civilians, 192.168.0.19 is as opaque as 3232235539 or 0xC0A80013 > would be. To some people, any form of address is as opaque as any other, true. (That's part of why we have DNS.) Also true, however, is that the conventional notation has value and meaning. That's partly historical (before CIDR, all networks were sized as either class A (10.x.y.z for any x, y, z), class B (172.16.x.y), or class C (192.168.0.x)), partly self-perpetuating (we use a lot of /24 addresses in local networks, not because we HAVE to, but because a /24 lets you lock three parts of the address and have the last part vary), but also definitely practical. > We allow creating IP address objects from a single int, we don't require > four separate int arguments (one for each subfield), and unless I've > missed something, IP addresses are not treated as a collection of four > separate integers (or more for v6). I can't even find a method to split > an address into four ints. (Nor am I sure that there is good reason to > want to do so.) So calling a single address "four separate integers" is > not really accurate. 
The most common way to create an IPv4Address object is to construct it from a string, which has the four separate integers in it. The dots delimit those integers. It's not an arbitrary string; it is most definitely a tuple of four integers, represented in its standard string notation. Simply because it's not actually the Python type Tuple[Int] doesn't mean it isn't functionally and logically a sequence of numbers. And if ever you actually do have the four integers, you can use a one-liner anyway: >>> address = (192, 168, 0, 19) >>> ipaddress.IPv4Address("%d.%d.%d.%d" % address) IPv4Address('192.168.0.19') > It is meaningless to perform string operations on IP addresses. What > would it mean to call addr.replace('.', 'Z') or addr.split('2')? > > But doing *at least some* int operations on addresses isn't meaningless: > > py> a = ipaddress.ip_address('192.168.0.19') > py> a + 1 > IPv4Address('192.168.0.20') How meaningful is that, when you don't have the netmask? >>> a = ipaddress.ip_address('192.168.0.254') >>> a + 1 IPv4Address('192.168.0.255') >>> a + 2 IPv4Address('192.168.1.0') >>> a + 3 IPv4Address('192.168.1.1') If that's a /24, one of those is a broadcast address, one is an unrelated network address, and one is an unrelated host address. "Adding 1" to an IP address is meaningless. And it definitely does NOT mean that IP addresses should have __index__, because that implies that they truly are integers, which would mean you could do something like this: >>> ipaddress.IPv4Address('192.168.0.19') + ipaddress.IPv4Address("10.1.1.1") Traceback (most recent call last): File "", line 1, in TypeError: unsupported operand type(s) for +: 'IPv4Address' and 'IPv4Address' The __int__ method *converts* something to an integer. Nobody is disagreeing that you can convert an IP address into an integer. But they are NOT integers. It doesn't make sense to treat one as an integer implicitly. ChrisA From dan at tombstonezero.net Wed Feb 14 23:46:49 2018 From: dan at tombstonezero.net (Dan Sommers) Date: Thu, 15 Feb 2018 04:46:49 +0000 (UTC) Subject: [Python-ideas] Give ipaddresses an __index__ method References: <20180215001823.GH26553@ando.pearwood.info> <20180215041403.GI26553@ando.pearwood.info> Message-ID: On Thu, 15 Feb 2018 15:14:03 +1100, Steven D'Aprano wrote: > On Thu, Feb 15, 2018 at 11:45:46AM +1100, Chris Angelico wrote: > >> Except that this computer's IPv4 is not 3232235539, and I never want >> to enter it that way. I enter it as 192.168.0.19 - as four separate >> integers. > > That's partly convention (and a useful convention: it is less error- > prone than 3232235539) and partly that because you're a sys admin who > can read the individual subfields of an IP address. I'm not suggesting > you ought to change your habit. > > But to civilians, 192.168.0.19 is as opaque as 3232235539 or 0xC0A80013 > would be. There was a lengthy discussion (or more than one) about supporting decimal unicode code point literals. Is U+03B1 (GREEK SMALL LETTER ALPHA) somehow less clear than X+945? 192.168.0.19 speaks volumes, but 3232235539 is not only opaque, but also obtuse. > But doing *at least some* int operations on addresses isn't meaningless: > > py> a = ipaddress.ip_address('192.168.0.19') > py> a + 1 > IPv4Address('192.168.0.20') py> a = ipaddress.ip_address('192.168.1.255') > py> a + 1 > IPv4Address('192.168.1.256') Uh, oh. py> a = ipaddress.ip_address('255.255.255.255') > py> a + 1 Mu? 
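
For reference, here is what the stdlib ipaddress module actually does at those edges; the transcript above illustrates the conceptual problem rather than literal output, since an IPv4Address is carried as a single 32-bit integer, so adding 1 rolls over into the next octet and the all-ones address simply has no successor:

import ipaddress

a = ipaddress.ip_address('192.168.1.255')
print(a + 1)     # IPv4Address('192.168.2.0'): the carry goes into the third octet

b = ipaddress.ip_address('255.255.255.255')
try:
    print(b + 1)
except ValueError as exc:   # ipaddress.AddressValueError, a ValueError subclass
    print('no successor:', exc)

Either way, the point stands: "the next address" only means something relative to a network, which is the netmask objection raised earlier in the thread.
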
Yes, if I were writing a DHCP server, the notion of "the next IP address that meets certain constraints, or an exception if no such address exists" has meaning. But it's not as simple as "ip + 1." Dan From guido at python.org Wed Feb 14 23:51:28 2018 From: guido at python.org (Guido van Rossum) Date: Wed, 14 Feb 2018 20:51:28 -0800 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: References: <20180212134502.GF26553@ando.pearwood.info> Message-ID: On Wed, Feb 14, 2018 at 5:49 PM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: > > > > So as long as you are not expecting to ever need mypy you should be fine > -- however if you're sharing code at some point someone is probably going > to want to point mypy at it. > > mypy isn?t an ?official? tool, but PEP484 is ? and mypy is more or > less a reference implimentation, yes? > Except PEP 484 is silent on many, many details. So far from all of mypy's behavior is normative. However, in this case PEP 484 has an opinion on the numbers module (don't use it, just use int). > mypy support bool, as far as I can tell, will that not work for your case? > > Even though the python bools are integer subclasses, that doesn?t mean > a type checker shouldn?t flag passing an integer in to a function that > expects a bool. > That's not the issue. The issue is that, from mypy's POV, np.bool is not a subtype of builtins.bool, just like the various np.intXX types aren't subtypes of builtins.int. But IMO the solution is to lie about this in the stubs and make the np types subtypes of the builtin types, not to switch to numbers.Integral. And the reason is that few people (outside hardcore np fans) will want to write numbers.Integral instead of int. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Thu Feb 15 00:27:48 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 15 Feb 2018 16:27:48 +1100 Subject: [Python-ideas] Give ipaddresses an __index__ method In-Reply-To: References: <20180215001823.GH26553@ando.pearwood.info> Message-ID: <20180215052748.GJ26553@ando.pearwood.info> On Thu, Feb 15, 2018 at 01:39:13PM +1000, Nick Coghlan wrote: > There are tests that ensure IP addresses don't implement __index__, > and the pragmatic reason for that is the downside you mentioned: to > ensure they can't be used as indices, slice endpoints, or range > endpoints. If it is an intentional decision to disallow treating IP addresses as integers implicitly, I guess that is definite then. No change. I can see that this is a reasonable decision for pragmatic reasons. However, for the record (and under no illusion that I'll change your mind *wink*) ... > While IP addresses can be converted to an integer, they are > *not* integers in any mathematical sense, and it doesn't make sense to > treat them that way. I really don't think this is strictly correct. IP addresses already support adding to regular ints, and conceptually they are indexes into a 32-bit or 128-bit space. They define "successor" and "predecessor" relations via addition and subtraction, which is pretty much all you need to build all other int operations from, mathematically speaking. (Actually, you don't even need predecessor.) I think there's a good case to make that they are ordinal numbers (each IP address uniquely specifies a logical position in a sequence from 0 to 2**32-1). Python ints already do quadruple duty as: - ordinal numbers, e.g. 
indexing, "string".find("r"); - cardinal numbers, e.g. counting, len("string"); - nominal numbers, e.g. id(obj); - subset of the Reals in the numeric tower. But anyway, at this point the discussion is getting rather esoteric. I accept the argument from pragmatism that the benefit of supporting __index__ is less than the disadvantage, so I think we're done here :-) -- Steve From fhsxfhsx at 126.com Thu Feb 15 00:56:44 2018 From: fhsxfhsx at 126.com (fhsxfhsx) Date: Thu, 15 Feb 2018 13:56:44 +0800 (CST) Subject: [Python-ideas] Temporary variables in comprehensions Message-ID: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> As far as I can see, a comprehension like alist = [f(x) for x in range(10)] is better than a for-loop for x in range(10): alist.append(f(x)) because the previous one shows every element of the list explicitly so that we don't need to handle `append` mentally. But when it comes to something like [f(x) + g(f(x)) for x in range(10)] you find you have to sacrifice some readableness if you don't want two f(x) which might slow down your code. Someone may argue that one can write [y + g(y) for y in [f(x) for x in range(10)]] but it's not as clear as to show what `y` is in a subsequent clause, not to say there'll be another temporary list built in the process. We can even replace every comprehension with map and filter, but that would face the same problems. In a word, what I'm arguing is that we need a way to assign temporary variables in a comprehension. In my opinion, code like [y + g(y) for x in range(10) **some syntax for `y=f(x)` here**] is more natural than any solution we now have. And that's why I pro the new syntax, it's clear, explicit and readable, and is nothing beyond the functionality of the present comprehensions so it's not complicated. And I hope the discussion could focus more on whether we should allow assigning temporary variables in comprehensions rather than how to solve the specific example I mentioned above. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jw14896.2014 at my.bristol.ac.uk Thu Feb 15 04:32:25 2018 From: jw14896.2014 at my.bristol.ac.uk (Jamie Willis) Date: Thu, 15 Feb 2018 09:32:25 +0000 Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> Message-ID: I +1 this at surface level; Both Haskell list comprehensions and Scala for comprehensions have variable assignment in them, even between iterating and this is often very useful. Perhaps syntax can be generalised as: [expr_using_x_and_y for i in is x = expr_using_i for j in is y = expr_using_j_and_x] This demonstrates the scope of each assignment; available in main result and then every clause that follows it. Sorry to op who will receive twice, forgot reply to all On 15 Feb 2018 7:03 am, "fhsxfhsx" wrote: > As far as I can see, a comprehension like > alist = [f(x) for x in range(10)] > is better than a for-loop > for x in range(10): > alist.append(f(x)) > because the previous one shows every element of the list explicitly so > that we don't need to handle `append` mentally. > > But when it comes to something like > [f(x) + g(f(x)) for x in range(10)] > you find you have to sacrifice some readableness if you don't want two > f(x) which might slow down your code. 
> > Someone may argue that one can write > [y + g(y) for y in [f(x) for x in range(10)]] > but it's not as clear as to show what `y` is in a subsequent clause, not > to say there'll be another temporary list built in the process. > We can even replace every comprehension with map and filter, but that > would face the same problems. > > In a word, what I'm arguing is that we need a way to assign temporary > variables in a comprehension. > In my opinion, code like > [y + g(y) for x in range(10) **some syntax for `y=f(x)` here**] > is more natural than any solution we now have. > And that's why I pro the new syntax, it's clear, explicit and readable, > and is nothing beyond the functionality of the present comprehensions so > it's not complicated. > > And I hope the discussion could focus more on whether we should allow > assigning temporary variables in comprehensions rather than how to solve > the specific example I mentioned above. > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From evpok.padding at gmail.com Thu Feb 15 04:53:21 2018 From: evpok.padding at gmail.com (Evpok Padding) Date: Thu, 15 Feb 2018 10:53:21 +0100 Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> Message-ID: For simple cases such as `[y + g(y) for y in [f(x) for x in range(10)]]`, I don't really see what the issue is, if you really want to make it shorter, you can ``[y + g(y) for y in map(f,range(10))]` which is one of the rare case where I like `map` more than comprehensions. For more complex case, just define a intermediate generator along the lines ``` f_samples = (f(x) for x in range(10)) [y+g(y) for y in f_samples] ``` Which does exactly the same thing but - Is more readable and explicit - Has no memory overhead thanks to lazy evaluation (btw, you should consider generators for your nested comprenshions) While I am sometimes in the same state of mind, wishing for variables in comprehensions seems to me like a good indicator that your code needs refactoring. Best, E On 15 February 2018 at 10:32, Jamie Willis wrote: > > I +1 this at surface level; Both Haskell list comprehensions and Scala for comprehensions have variable assignment in them, even between iterating and this is often very useful. Perhaps syntax can be generalised as: > > [expr_using_x_and_y > for i in is > x = expr_using_i > for j in is > y = expr_using_j_and_x] > > This demonstrates the scope of each assignment; available in main result and then every clause that follows it. > > Sorry to op who will receive twice, forgot reply to all > > On 15 Feb 2018 7:03 am, "fhsxfhsx" wrote: >> >> As far as I can see, a comprehension like >> alist = [f(x) for x in range(10)] >> is better than a for-loop >> for x in range(10): >> alist.append(f(x)) >> because the previous one shows every element of the list explicitly so that we don't need to handle `append` mentally. >> >> But when it comes to something like >> [f(x) + g(f(x)) for x in range(10)] >> you find you have to sacrifice some readableness if you don't want two f(x) which might slow down your code. 
>> >> Someone may argue that one can write >> [y + g(y) for y in [f(x) for x in range(10)]] >> but it's not as clear as to show what `y` is in a subsequent clause, not to say there'll be another temporary list built in the process. >> We can even replace every comprehension with map and filter, but that would face the same problems. >> >> In a word, what I'm arguing is that we need a way to assign temporary variables in a comprehension. >> In my opinion, code like >> [y + g(y) for x in range(10) **some syntax for `y=f(x)` here**] >> is more natural than any solution we now have. >> And that's why I pro the new syntax, it's clear, explicit and readable, and is nothing beyond the functionality of the present comprehensions so it's not complicated. >> >> And I hope the discussion could focus more on whether we should allow assigning temporary variables in comprehensions rather than how to solve the specific example I mentioned above. >> >> >> >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Thu Feb 15 05:08:46 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 15 Feb 2018 10:08:46 +0000 Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> Message-ID: On 15 February 2018 at 05:56, fhsxfhsx wrote: > As far as I can see, a comprehension like > alist = [f(x) for x in range(10)] > is better than a for-loop > for x in range(10): > alist.append(f(x)) > because the previous one shows every element of the list explicitly so that > we don't need to handle `append` mentally. ... as long as the code inside the comprehension remains relatively simple. It's easy to abuse comprehensions to the point where they are less readable than a for loop, but that's true of a lot of things, so isn't a specific problem with comprehensions. > But when it comes to something like > [f(x) + g(f(x)) for x in range(10)] > you find you have to sacrifice some readableness if you don't want two f(x) > which might slow down your code. Agreed. I hit that quite often. > Someone may argue that one can write > [y + g(y) for y in [f(x) for x in range(10)]] > but it's not as clear as to show what `y` is in a subsequent clause, not to > say there'll be another temporary list built in the process. > We can even replace every comprehension with map and filter, but that would > face the same problems. That is a workaround (and one I'd not thought of before) but I agree it's ugly, and reduces readability. Actually, factoring out the inner comprehension like Evpok Padding suggests: f_samples = (f(x) for x in range(10)) [y+g(y) for y in f_samples] is very readable and effective, IMO, so it's not *that* obvious that local names are beneficial. > In a word, what I'm arguing is that we need a way to assign temporary > variables in a comprehension. "We need" is a bit strong here. "It would be useful to have" is probably true for some situations. 
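
Since several existing spellings have already come up in this thread, a small self-contained comparison may help make the trade-off concrete. The functions f and g below are placeholders (not from anyone's real code), and the counter shows how many times f actually runs under each spelling:

calls = 0

def f(x):
    global calls
    calls += 1
    return x * x

def g(y):
    return y + 1

# 1. Plain comprehension: f is evaluated twice per element.
calls = 0
a = [f(x) + g(f(x)) for x in range(10)]
assert calls == 20

# 2. Nested comprehension: f runs once per element, at the cost of a temporary list.
calls = 0
b = [y + g(y) for y in [f(x) for x in range(10)]]
assert calls == 10

# 3. Intermediate generator: once per element, with no temporary list.
calls = 0
f_samples = (f(x) for x in range(10))
c = [y + g(y) for y in f_samples]
assert calls == 10

assert a == b == c

The generator spelling avoids both the duplicated call and the temporary list, which is the cost/benefit the rest of the thread weighs against adding new syntax.
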
> In my opinion, code like > [y + g(y) for x in range(10) **some syntax for `y=f(x)` here**] > is more natural than any solution we now have. > And that's why I pro the new syntax, it's clear, explicit and readable, and > is nothing beyond the functionality of the present comprehensions so it's > not complicated. The problem is that you haven't proposed an actual syntax here, just that one should be invented. There have been discussions on this in the past (a quick search found https://mail.python.org/pipermail/python-ideas/2011-April/009863.html and https://mail.python.org/pipermail/python-ideas/2012-January/013468.html, for example). > And I hope the discussion could focus more on whether we should allow > assigning temporary variables in comprehensions rather than how to solve the > specific example I mentioned above. The problem isn't so much "whether we should allow it" as "can we find a syntax that is acceptable", and only then "does the new syntax give sufficient benefit to be worth adding". New syntax has a pretty high cost, and proposals that don't suggest explicit syntax will get stuck because you can't judge whether adding the capability is "worth it" without being clear on what the cost is - particularly when the benefit is relatively small (which this is). Agreed that it's important to focus on the general problem, but part of the discussion *will* include arguing as to why the existing workarounds and alternatives are less acceptable than new syntax. And that will need to include discussion of specific cases. Generally, in that sort of discussion, artificial examples like "y=f(x)" don't fare well because it's too easy to end up just debating subjective views on "readability". If you can provide examples from real-world code that clearly demonstrate the cost in terms of maintainability of the existing workarounds, that will help your argument a lot. Although you'll need to be prepared for questions like "would you be willing to drop support for versions of Python older than 3.8 in order to get this improvement?" - it's surprisingly hard to justify language (as opposed to library) changes when you really stop and think about it. Which is not to say that it can't be done, just that it's easy to underestimate the effort needed. Paul From stephanh42 at gmail.com Thu Feb 15 05:11:47 2018 From: stephanh42 at gmail.com (Stephan Houben) Date: Thu, 15 Feb 2018 11:11:47 +0100 Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> Message-ID: Note that you can already do: [y + g(y) for x in range(10) for y in [f(x)]] i.e. for y in [expr] does exactly what the OP wants. No new syntax needed. If you hang out on python-list , you'll soon notice that many newbies struggle already with the list comprehension syntax. It's a mini-language which is almost, but not entirely, exactly unlike normal Python code. Let's not complicate it further. Stephan 2018-02-15 10:53 GMT+01:00 Evpok Padding : > For simple cases such as `[y + g(y) for y in [f(x) for x in range(10)]]`, > I don't really see what the issue is, if you really want to make it > shorter, > you can ``[y + g(y) for y in map(f,range(10))]` which is one of the rare > case where I like `map` more than comprehensions. 
> > For more complex case, just define a intermediate generator along the lines > ``` > f_samples = (f(x) for x in range(10)) > [y+g(y) for y in f_samples] > ``` > Which does exactly the same thing but > - Is more readable and explicit > - Has no memory overhead thanks to lazy evaluation > (btw, you should consider generators for your nested comprenshions) > > While I am sometimes in the same state of mind, wishing for variables in > comprehensions seems to me like a good indicator that your code needs > refactoring. > > Best, > > E > > On 15 February 2018 at 10:32, Jamie Willis > wrote: > > > > I +1 this at surface level; Both Haskell list comprehensions and Scala > for comprehensions have variable assignment in them, even between iterating > and this is often very useful. Perhaps syntax can be generalised as: > > > > [expr_using_x_and_y > > for i in is > > x = expr_using_i > > for j in is > > y = expr_using_j_and_x] > > > > This demonstrates the scope of each assignment; available in main result > and then every clause that follows it. > > > > Sorry to op who will receive twice, forgot reply to all > > > > On 15 Feb 2018 7:03 am, "fhsxfhsx" wrote: > >> > >> As far as I can see, a comprehension like > >> alist = [f(x) for x in range(10)] > >> is better than a for-loop > >> for x in range(10): > >> alist.append(f(x)) > >> because the previous one shows every element of the list explicitly so > that we don't need to handle `append` mentally. > >> > >> But when it comes to something like > >> [f(x) + g(f(x)) for x in range(10)] > >> you find you have to sacrifice some readableness if you don't want two > f(x) which might slow down your code. > >> > >> Someone may argue that one can write > >> [y + g(y) for y in [f(x) for x in range(10)]] > >> but it's not as clear as to show what `y` is in a subsequent clause, > not to say there'll be another temporary list built in the process. > >> We can even replace every comprehension with map and filter, but that > would face the same problems. > >> > >> In a word, what I'm arguing is that we need a way to assign temporary > variables in a comprehension. > >> In my opinion, code like > >> [y + g(y) for x in range(10) **some syntax for `y=f(x)` here**] > >> is more natural than any solution we now have. > >> And that's why I pro the new syntax, it's clear, explicit and readable, > and is nothing beyond the functionality of the present comprehensions so > it's not complicated. > >> > >> And I hope the discussion could focus more on whether we should allow > assigning temporary variables in comprehensions rather than how to solve > the specific example I mentioned above. > >> > >> > >> > >> > >> > >> _______________________________________________ > >> Python-ideas mailing list > >> Python-ideas at python.org > >> https://mail.python.org/mailman/listinfo/python-ideas > >> Code of Conduct: http://python.org/psf/codeofconduct/ > >> > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jw14896.2014 at my.bristol.ac.uk Thu Feb 15 05:13:37 2018 From: jw14896.2014 at my.bristol.ac.uk (Jamie Willis) Date: Thu, 15 Feb 2018 10:13:37 +0000 Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> Message-ID: I'm not sure it does indicate a need for refactoring, I'd argue it's quite a common pattern, at least in functional languages from which this construct arises. In fact, in those languages, there are laws that govern interactions with the comprehensions (though this comes from monads and monads perhaps don't quite apply to pythons model). These laws define behaviour that is expected equivalent by users; [x for x in xs] = xs [f(x) for x in [x]] = f(x) [g(y) for y in [f(x) for x in xs]] = [g(y) for x in xs for y in f(x)] Even though that last law isn't completely analogous to the given example from the OP, the transformation he wants to be able to do does arise from the laws. So it could be argued that not being able to flatten the comprehension down via law 3 is unexpected behaviour and in order to achieve this you'd need a form of assignment in the comprehension or suffer inefficiencies. But that's probably just my functional brain talking... On 15 Feb 2018 9:53 am, "Evpok Padding" wrote: > For simple cases such as `[y + g(y) for y in [f(x) for x in range(10)]]`, > I don't really see what the issue is, if you really want to make it > shorter, > you can ``[y + g(y) for y in map(f,range(10))]` which is one of the rare > case where I like `map` more than comprehensions. > > For more complex case, just define a intermediate generator along the lines > ``` > f_samples = (f(x) for x in range(10)) > [y+g(y) for y in f_samples] > ``` > Which does exactly the same thing but > - Is more readable and explicit > - Has no memory overhead thanks to lazy evaluation > (btw, you should consider generators for your nested comprenshions) > > While I am sometimes in the same state of mind, wishing for variables in > comprehensions seems to me like a good indicator that your code needs > refactoring. > > Best, > > E > > On 15 February 2018 at 10:32, Jamie Willis > wrote: > > > > I +1 this at surface level; Both Haskell list comprehensions and Scala > for comprehensions have variable assignment in them, even between iterating > and this is often very useful. Perhaps syntax can be generalised as: > > > > [expr_using_x_and_y > > for i in is > > x = expr_using_i > > for j in is > > y = expr_using_j_and_x] > > > > This demonstrates the scope of each assignment; available in main result > and then every clause that follows it. > > > > Sorry to op who will receive twice, forgot reply to all > > > > On 15 Feb 2018 7:03 am, "fhsxfhsx" wrote: > >> > >> As far as I can see, a comprehension like > >> alist = [f(x) for x in range(10)] > >> is better than a for-loop > >> for x in range(10): > >> alist.append(f(x)) > >> because the previous one shows every element of the list explicitly so > that we don't need to handle `append` mentally. > >> > >> But when it comes to something like > >> [f(x) + g(f(x)) for x in range(10)] > >> you find you have to sacrifice some readableness if you don't want two > f(x) which might slow down your code. > >> > >> Someone may argue that one can write > >> [y + g(y) for y in [f(x) for x in range(10)]] > >> but it's not as clear as to show what `y` is in a subsequent clause, > not to say there'll be another temporary list built in the process. 
> >> We can even replace every comprehension with map and filter, but that > would face the same problems. > >> > >> In a word, what I'm arguing is that we need a way to assign temporary > variables in a comprehension. > >> In my opinion, code like > >> [y + g(y) for x in range(10) **some syntax for `y=f(x)` here**] > >> is more natural than any solution we now have. > >> And that's why I pro the new syntax, it's clear, explicit and readable, > and is nothing beyond the functionality of the present comprehensions so > it's not complicated. > >> > >> And I hope the discussion could focus more on whether we should allow > assigning temporary variables in comprehensions rather than how to solve > the specific example I mentioned above. > >> > >> > >> > >> > >> > >> _______________________________________________ > >> Python-ideas mailing list > >> Python-ideas at python.org > >> https://mail.python.org/mailman/listinfo/python-ideas > >> Code of Conduct: http://python.org/psf/codeofconduct/ > >> > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sylvain.marie at schneider-electric.com Wed Feb 14 13:23:47 2018 From: sylvain.marie at schneider-electric.com (Sylvain MARIE) Date: Wed, 14 Feb 2018 18:23:47 +0000 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: References: <20180212134502.GF26553@ando.pearwood.info> Message-ID: Yes, this is used in combination dynamic type checking, currently using enforce (https://github.com/RussBaz/enforce ) but I know that others exist (pytypes in particular) As per examples?all utility functions that we write that are receiving a number or a boolean in their parameters are now written using the numbers and additional Boolean classes: ------------- example where Integral is used instead of int ----------------- from numbers import Integral import pandas as pd from enforce import runtime_validation, config config(dict(mode='covariant')) # type validation will accept subclasses too @runtime_validation def only_keep_events_lasting_at_least(boolean_series: pd.Series, min_nb_occurrences: Integral): """ Filters boolean flags to keep 'true' only when it appears at least min_nb_occurrences times in a row :param boolean_series: :param min_nb_occurrences: :return: """ (contents skipped for clarity) ------------------------------------- Similarly when a bool type hint is in the signature we try to replace it with a Boolean, so that people can call it with a numpy bool. But maybe that?s too much of type checking for the python philosophy ? I?m wondering if we?re going too far here? Anyway, again, my point is just about consistency: if this is available for numbers, why not for simple Booleans? Sylvain De : Guido van Rossum [mailto:gvanrossum at gmail.com] Envoy? : mercredi 14 f?vrier 2018 17:14 ? : Sylvain MARIE Cc : Python-Ideas Objet : Re: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module Can you show some sample code that you have written that shows where this would be useful? Note that using the numbers package actually makes static type checking through e.g. mypy difficult. So I presume you are talking about dynamic checking? 
--Guido On Feb 14, 2018 12:42 AM, "Sylvain MARIE" > wrote: My point is just that today, I use the ?numbers? package classes (Integral, Real, ?) for PEP484 type-hinting, and I find it quite useful in term of input type validation (in combination with PEP484-compliant type checkers, whether static or dynamic). Adding a Boolean ABC with a similar behavior would certainly add consistency to that ?numbers? package ? only for users who already find it useful, of course. Note that my use case is not about converting an object to a Boolean, I?m just speaking about type validation of a ?true? boolean object, for example to be received as a function argument for a flag option. This is for example for users who want to define strongly-typed APIs for interaction with the ?outside world?, and keep using duck-typing for internals. Sylvain De : Python-ideas [mailto:python-ideas-bounces+sylvain.marie=schneider-electric.com at python.org] De la part de Chris Barker Envoy? : mardi 13 f?vrier 2018 21:12 ? : David Mertz > Cc : python-ideas > Objet : Re: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module On Mon, Feb 12, 2018 at 10:07 PM, David Mertz > wrote: I'm not sure I'm convinced by Sylvain that Boolean needs to be an ABC in the standard library; Guido expresses skepticism. Of course it is possible to define it in some other library that actually needs to use `isinstance(x, Boolean)` as Sylvain demonstraits in his post. I'm not sure I'm unconvinced either, I can see a certain value to saying a given value is "fully round-trippable to bool" (as is np.bool_). But is an ABC the way to do it? Personally, I'm skeptical that ABCs are a solution to, well, anything (as apposed to duck typing and EAFTP). Take Nick's example: """ The other comparison that comes to mind would be the distinction between "__int__" ("can be coerced to an integer, but may lose information in the process") and "__index__" ("can be losslessly converted to and from a builtin integer"). """ I suppose we could have had an Index ABC -- but that seems painful to me. so maybe we could use a __true_bool__ special method? (and an operator.true_bool() function ???) (this all makes me wish that python bools were more pure -- but way to late for that!) I guess it comes down to whether you want to: - Ask the question: "is this object a boolean?" or - Make this object a boolean __index__ (and operator.index()) is essentially the later -- you want to make an index out of whatever object you have, if you can do so. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov ______________________________________________________________________ This email has been scanned by the Symantec Email Security.cloud service. ______________________________________________________________________ _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ ______________________________________________________________________ This email has been scanned by the Symantec Email Security.cloud service. ______________________________________________________________________ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Thu Feb 15 11:27:57 2018 From: guido at python.org (Guido van Rossum) Date: Thu, 15 Feb 2018 08:27:57 -0800 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: References: <20180212134502.GF26553@ando.pearwood.info> Message-ID: A thought just occurred to me. Maybe we should just add a Boolean class to numbers? It's a subclass of Integral, presumably. And normally only builtins.bool is registered with it. But np.bool can be added at the same point you register the other np integral types. On Wed, Feb 14, 2018 at 10:23 AM, Sylvain MARIE < sylvain.marie at schneider-electric.com> wrote: > Yes, this is used in combination dynamic type checking, currently using > enforce (https://github.com/RussBaz/enforce ) but I know that others > exist (pytypes in particular) > > > > As per examples?all utility functions that we write that are receiving a > number or a boolean in their parameters are now written using the numbers > and additional Boolean classes: > > > > ------------- example where Integral is used instead of int > ----------------- > > > > *from *numbers *import *Integral > > *import *pandas *as *pd > > *from *enforce *import *runtime_validation, config > config(dict(mode=*'covariant'*)) > > > *# type validation will accept subclasses too *@runtime_validation > *def *only_keep_events_lasting_at_least(boolean_series: pd.Series, > min_nb_occurrences: Integral): > > > > *""" Filters boolean flags to keep 'true' only when it appears at > least min_nb_occurrences times in a row **:param* > * boolean_series: **:param* > * min_nb_occurrences: **:return* > *: """* > > (contents skipped for clarity) > > > > ------------------------------------- > > > > Similarly when a bool type hint is in the signature we try to replace it > with a Boolean, so that people can call it with a numpy bool. But maybe > that?s too much of type checking for the python philosophy ? I?m wondering > if we?re going too far here? > > > > Anyway, again, my point is just about consistency: if this is available > for numbers, why not for simple Booleans? > > > > Sylvain > > > > *De :* Guido van Rossum [mailto:gvanrossum at gmail.com] > *Envoy? :* mercredi 14 f?vrier 2018 17:14 > *? :* Sylvain MARIE > *Cc :* Python-Ideas > > *Objet :* Re: [Python-ideas] Boolean ABC similar to what's provided in > the 'numbers' module > > > > Can you show some sample code that you have written that shows where this > would be useful? > > > > Note that using the numbers package actually makes static type checking > through e.g. mypy difficult. So I presume you are talking about dynamic > checking? > > > > --Guido > > > > > > On Feb 14, 2018 12:42 AM, "Sylvain MARIE" electric.com> wrote: > > My point is just that today, I use the ?numbers? package classes > (Integral, Real, ?) for PEP484 type-hinting, and I find it quite useful in > term of input type validation (in combination with PEP484-compliant type > checkers, whether static or dynamic). Adding a Boolean ABC with a similar > behavior would certainly add consistency to that ?numbers? package ? only > for users who already find it useful, of course. > > > > Note that my use case is not about converting an object to a Boolean, I?m > just speaking about type validation of a ?true? boolean object, for example > to be received as a function argument for a flag option. This is for > example for users who want to define strongly-typed APIs for interaction > with the ?outside world?, and keep using duck-typing for internals. 
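(For illustration, a rough runnable sketch of the Boolean ABC Guido suggests at the top of this message: an abstract subclass of numbers.Integral with builtins.bool registered on it. The class name and the NumPy registration shown in the comment are assumptions for the sketch, not an existing stdlib API.)

```
import numbers

class Boolean(numbers.Integral):
    """Hypothetical ABC for bool-like values; left abstract on purpose."""
    __slots__ = ()

# builtins.bool would be the one implementation registered by default.
Boolean.register(bool)

# A NumPy bool could then be registered at the same point the other
# NumPy integer types are registered with numbers.Integral, e.g.:
#     import numpy as np
#     Boolean.register(np.bool_)

assert isinstance(True, Boolean)
assert not isinstance(1, Boolean)   # plain ints are Integral but not Boolean
```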
> > > > Sylvain > > > > *De :* Python-ideas [mailto:python-ideas-bounces+sylvain.marie=schneider- > electric.com at python.org] *De la part de* Chris Barker > *Envoy? :* mardi 13 f?vrier 2018 21:12 > *? :* David Mertz > *Cc :* python-ideas > *Objet :* Re: [Python-ideas] Boolean ABC similar to what's provided in > the 'numbers' module > > > > > > > > On Mon, Feb 12, 2018 at 10:07 PM, David Mertz wrote: > > I'm not sure I'm convinced by Sylvain that Boolean needs to be an ABC in > the standard library; Guido expresses skepticism. Of course it is possible > to define it in some other library that actually needs to use > `isinstance(x, Boolean)` as Sylvain demonstraits in his post. I'm not sure > I'm unconvinced either, I can see a certain value to saying a given value > is "fully round-trippable to bool" (as is np.bool_). > > > > But is an ABC the way to do it? Personally, I'm skeptical that ABCs are a > solution to, well, anything (as apposed to duck typing and EAFTP). Take > Nick's example: > > > > """ > > The other comparison that comes to mind would be the distinction > between "__int__" ("can be coerced to an integer, but may lose > information in the process") and "__index__" ("can be losslessly > converted to and from a builtin integer"). > > """ > > > > I suppose we could have had an Index ABC -- but that seems painful to me. > > > > so maybe we could use a __true_bool__ special method? > > > > (and an operator.true_bool() function ???) > > > > (this all makes me wish that python bools were more pure -- but way to > late for that!) > > > > I guess it comes down to whether you want to: > > > > - Ask the question: "is this object a boolean?" > > > > or > > > > - Make this object a boolean > > > > __index__ (and operator.index()) is essentially the later -- you want to > make an index out of whatever object you have, if you can do so. > > > > -CHB > > > > > > > > -- > > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > > ______________________________________________________________________ > This email has been scanned by the Symantec Email Security.cloud service. > ______________________________________________________________________ > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > > ______________________________________________________________________ > This email has been scanned by the Symantec Email Security.cloud service. > ______________________________________________________________________ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris.barker at noaa.gov Thu Feb 15 18:34:28 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 15 Feb 2018 15:34:28 -0800 Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> Message-ID: On Thu, Feb 15, 2018 at 2:13 AM, Jamie Willis wrote: > These laws define behaviour that is expected equivalent by users; > > [x for x in xs] = xs > OK -- that's the definition... > [f(x) for x in [x]] = f(x) > well, not quite: [f(x) for x in [x]] = [f(x)] Using x in two places where they mean different things makes this odd, but yes, again the definition (of a list comp, and a length-1 sequence) [g(y) for y in [f(x) for x in xs]] = [g(y) for x in xs for y in f(x)] > well, no. using two for expressions yields the outer product -- all combinations: In [*14*]: xs = range(3) In [*15*]: [(x,y) *for* x *in* xs *for* y *in* xs] Out[*15*]: [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)] so the result is a length len(seq1)*len(seq(2)) list. Or in this case, a len(xs)**2. But nesting the comps applies one expression, and then the other, yielding a length len(xs) list. but you wrote: [g(y) for x in xs for y in f(x)] which I'm not sure what you were expecting, as f(x) is not a sequence (probably)... To play with your examples: Define some functions that make it clear what's been applied: In [*16*]: *def* f(x): ...: *return* "f(*{}*)".format(x) ...: In [*17*]: *def* g(x): ...: *return* "g(*{}*)".format(x) and a simple sequence to use: In [*18*]: xs = range(3) Now your examples: In [*19*]: [x *for* x *in* xs] Out[*19*]: [0, 1, 2] In [*20*]: [f(x) *for* x *in* [x]] Out[*20*]: ['f(5)'] In [*21*]: [g(y) *for* y *in* [f(x) *for* x *in* xs]] Out[*21*]: ['g(f(0))', 'g(f(1))', 'g(f(2))'] OK -- all good f applied, then g, but then the last one: In [*27*]: [g(y) *for* x *in* xs *for* y *in* f(x)] Out[*27*]: ['g(f)', 'g(()', 'g(0)', 'g())', 'g(f)', 'g(()', 'g(1)', 'g())', 'g(f)', 'g(()', 'g(2)', 'g())'] in this case, f(x) is returning a string, which is a sequence, so you get that kind odd result. But what if f(x) was a simple scalr function: In [*29*]: *def* f(x): ...: *return* 2*x Then you just get an error: In [*30*]: [g(y) *for* x *in* xs *for* y *in* f(x)] --------------------------------------------------------------------------- TypeError Traceback (most recent call last) in () ----> 1 [g(y) for x in xs for y in f(x)] in (.0) ----> 1 [g(y) for x in xs for y in f(x)] TypeError: 'int' object is not iterable The the nested comp is what is desired here: In [*31*]: [g(y) *for* y *in* [f(x) *for* x *in* xs]] Out[*31*]: ['g(0)', 'g(2)', 'g(4)'] Except you probably want a generator expression in the inner loop to avoid bulding an extra list: In [*33*]: [g(y) *for* y *in* (f(x) *for* x *in* xs)] Out[*33*]: ['g(f(0))', 'g(f(1))', 'g(f(2))'] So back to the OP's example: In [*34*]: [f(x) + g(f(x)) *for* x *in* range(10)] Out[*34*]: ['f(0)g(f(0))', 'f(1)g(f(1))', 'f(2)g(f(2))', 'f(3)g(f(3))', 'f(4)g(f(4))', 'f(5)g(f(5))', 'f(6)g(f(6))', 'f(7)g(f(7))', 'f(8)g(f(8))', 'f(9)g(f(9))'] that is best done with comps as: In [*36*]: [fx + g(fx) *for* fx *in* (f(x) *for* x *in* range(10))] Out[*36*]: ['f(0)g(f(0))', 'f(1)g(f(1))', 'f(2)g(f(2))', 'f(3)g(f(3))', 'f(4)g(f(4))', 'f(5)g(f(5))', 'f(6)g(f(6))', 'f(7)g(f(7))', 'f(8)g(f(8))', 'f(9)g(f(9))'] which really doesn't seem bad to me. 
And if the function names are longer -- which they should be, you might want to use a temp as suggested earlier: In [*41*]: fx = (f(x) *for* x *in* range(10)) In [*42*]: [x + g(x) *for* x *in* fx] Out[*42*]: ['f(0)g(f(0))', 'f(1)g(f(1))', 'f(2)g(f(2))', 'f(3)g(f(3))', 'f(4)g(f(4))', 'f(5)g(f(5))', 'f(6)g(f(6))', 'f(7)g(f(7))', 'f(8)g(f(8))', 'f(9)g(f(9))'] The truth is, comprehensions really are a bit wordy, if you are doing a lot of this kind of thing (at least with numbers), you might be happier with an array-oriented language or library, such as numpy: In [*46*]: *import* *numpy* *as* *np* In [*47*]: *def* f(x): ...: *return* x * 2 ...: In [*48*]: *def* g(x): ...: *return* x * 3 ...: ...: In [*49*]: xs = np.arange(3) In [*50*]: f(xs) + g(f(xs)) Out[*50*]: array([ 0, 8, 16]) is pretty compact, and can be "optimized with a temp: In [*51*]: fx = f(xs) ...: fx + g(fx) ...: Out[*51*]: array([ 0, 8, 16]) pretty simple isn't it? So this gets back to -- does anyone have a suggestion for a syntax for comprehensions that would make this substantially clearer, more readable, or more compact? I'm guessing not :-) (the compact bit comes from having to type the "for x in" part twice -- it does *feel* a bit unnecessary. which is why I like numpy -- no "for" at all :-) (I'm still trying to figure out why folks think map() or filter() help here...) -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From jw14896 at my.bristol.ac.uk Thu Feb 15 19:10:26 2018 From: jw14896 at my.bristol.ac.uk (jw14896 at my.bristol.ac.uk) Date: Fri, 16 Feb 2018 00:10:26 -0000 Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> Message-ID: <003c01d3a6ba$8c1af0a0$a450d1e0$@my.bristol.ac.uk> I?d like to clarify that f(x) was indeed meant to be a sequence. As per monad law: do { y <- do { x <- m; f x } g y } === do { x <- m; y <- f x; g y } I think you might have misunderstood the types of things; f: Function[a, List[b]] and g: Function[b, List[c]]. m: List[a] But I DO think my old the fly write up went wrong, converting do-notation to list comprehensions isn?t completely straightforward The above is equivalent to [g y | x <- m, y <- f x] in Haskell and the top is [g y | y <- [z |x <- m, z <- f x]] These have analogous structures in python; [g(y) for x in m for y in f(x)] and [g(y) for y in [z for x in m for z in f(x)]] (I think?) And yes the left identity law I posted was missing the [f(x)] brackets. If I?ve not made another mistake, that *should* now work? From: Chris Barker [mailto:chris.barker at noaa.gov] Sent: 15 February 2018 23:34 To: jw14896.2014 at my.bristol.ac.uk Cc: Evpok Padding ; Python-Ideas Subject: Re: [Python-ideas] Temporary variables in comprehensions On Thu, Feb 15, 2018 at 2:13 AM, Jamie Willis > wrote: These laws define behaviour that is expected equivalent by users; [x for x in xs] = xs OK -- that's the definition... [f(x) for x in [x]] = f(x) well, not quite: [f(x) for x in [x]] = [f(x)] Using x in two places where they mean different things makes this odd, but yes, again the definition (of a list comp, and a length-1 sequence) [g(y) for y in [f(x) for x in xs]] = [g(y) for x in xs for y in f(x)] well, no. 
using two for expressions yields the outer product -- all combinations: In [14]: xs = range(3) In [15]: [(x,y) for x in xs for y in xs] Out[15]: [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)] so the result is a length len(seq1)*len(seq(2)) list. Or in this case, a len(xs)**2. But nesting the comps applies one expression, and then the other, yielding a length len(xs) list. but you wrote: [g(y) for x in xs for y in f(x)] which I'm not sure what you were expecting, as f(x) is not a sequence (probably)... To play with your examples: Define some functions that make it clear what's been applied: In [16]: def f(x): ...: return "f({})".format(x) ...: In [17]: def g(x): ...: return "g({})".format(x) and a simple sequence to use: In [18]: xs = range(3) Now your examples: In [19]: [x for x in xs] Out[19]: [0, 1, 2] In [20]: [f(x) for x in [x]] Out[20]: ['f(5)'] In [21]: [g(y) for y in [f(x) for x in xs]] Out[21]: ['g(f(0))', 'g(f(1))', 'g(f(2))'] OK -- all good f applied, then g, but then the last one: In [27]: [g(y) for x in xs for y in f(x)] Out[27]: ['g(f)', 'g(()', 'g(0)', 'g())', 'g(f)', 'g(()', 'g(1)', 'g())', 'g(f)', 'g(()', 'g(2)', 'g())'] in this case, f(x) is returning a string, which is a sequence, so you get that kind odd result. But what if f(x) was a simple scalr function: In [29]: def f(x): ...: return 2*x Then you just get an error: In [30]: [g(y) for x in xs for y in f(x)] --------------------------------------------------------------------------- TypeError Traceback (most recent call last) in () ----> 1 [g(y) for x in xs for y in f(x)] in (.0) ----> 1 [g(y) for x in xs for y in f(x)] TypeError: 'int' object is not iterable The the nested comp is what is desired here: In [31]: [g(y) for y in [f(x) for x in xs]] Out[31]: ['g(0)', 'g(2)', 'g(4)'] Except you probably want a generator expression in the inner loop to avoid bulding an extra list: In [33]: [g(y) for y in (f(x) for x in xs)] Out[33]: ['g(f(0))', 'g(f(1))', 'g(f(2))'] So back to the OP's example: In [34]: [f(x) + g(f(x)) for x in range(10)] Out[34]: ['f(0)g(f(0))', 'f(1)g(f(1))', 'f(2)g(f(2))', 'f(3)g(f(3))', 'f(4)g(f(4))', 'f(5)g(f(5))', 'f(6)g(f(6))', 'f(7)g(f(7))', 'f(8)g(f(8))', 'f(9)g(f(9))'] that is best done with comps as: In [36]: [fx + g(fx) for fx in (f(x) for x in range(10))] Out[36]: ['f(0)g(f(0))', 'f(1)g(f(1))', 'f(2)g(f(2))', 'f(3)g(f(3))', 'f(4)g(f(4))', 'f(5)g(f(5))', 'f(6)g(f(6))', 'f(7)g(f(7))', 'f(8)g(f(8))', 'f(9)g(f(9))'] which really doesn't seem bad to me. And if the function names are longer -- which they should be, you might want to use a temp as suggested earlier: In [41]: fx = (f(x) for x in range(10)) In [42]: [x + g(x) for x in fx] Out[42]: ['f(0)g(f(0))', 'f(1)g(f(1))', 'f(2)g(f(2))', 'f(3)g(f(3))', 'f(4)g(f(4))', 'f(5)g(f(5))', 'f(6)g(f(6))', 'f(7)g(f(7))', 'f(8)g(f(8))', 'f(9)g(f(9))'] The truth is, comprehensions really are a bit wordy, if you are doing a lot of this kind of thing (at least with numbers), you might be happier with an array-oriented language or library, such as numpy: In [46]: import numpy as np In [47]: def f(x): ...: return x * 2 ...: In [48]: def g(x): ...: return x * 3 ...: ...: In [49]: xs = np.arange(3) In [50]: f(xs) + g(f(xs)) Out[50]: array([ 0, 8, 16]) is pretty compact, and can be "optimized with a temp: In [51]: fx = f(xs) ...: fx + g(fx) ...: Out[51]: array([ 0, 8, 16]) pretty simple isn't it? 
So this gets back to -- does anyone have a suggestion for a syntax for comprehensions that would make this substantially clearer, more readable, or more compact? I'm guessing not :-) (the compact bit comes from having to type the "for x in" part twice -- it does *feel* a bit unnecessary. which is why I like numpy -- no "for" at all :-) (I'm still trying to figure out why folks think map() or filter() help here...) -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Thu Feb 15 19:57:40 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 16 Feb 2018 11:57:40 +1100 Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> Message-ID: <20180216005739.GA10142@ando.pearwood.info> Hi fhsxfhsx, and welcome. My comments below, interleaved with yours. On Thu, Feb 15, 2018 at 01:56:44PM +0800, fhsxfhsx wrote: [quoted out of order] > And I hope the discussion could focus more on whether we should allow > assigning temporary variables in comprehensions rather than how to > solve the specific example I mentioned above. Whether or not to allow this proposal will depend on what alternate solutions to the problem already exist, so your specific example is very relevant. Any proposed change has to compete with existing solutions. > As far as I can see, a comprehension like > alist = [f(x) for x in range(10)] > is better than a for-loop > for x in range(10): > alist.append(f(x)) > because the previous one shows every element of the list explicitly so > that we don't need to handle `append` mentally. While I personally agree with you, many others disagree. I know quite a few experienced, competent Python programmers who avoid list comprehensions because they consider them harder to read and reason about. They consider a regular for-loop better precisely because you do see the explicit call to append. (In my experience, those of us who get functional-programming idioms often forget that others find them tricky.) The point is that list comprehensions are already complex enough that they are difficult for many people to learn, and some people never come to grips with them. Adding even more features comes with a cost. The bottom line is that it isn't clear to me that allowing local variables inside comprehensions will make them more readable. > But when it comes to something like > [f(x) + g(f(x)) for x in range(10)] > you find you have to sacrifice some readableness if you don't want two > f(x) which might slow down your code. The usual comments about premature optimisation apply here. Setting a new comprehension variable is not likely to be free, and may even be more costly than calling f(x) twice if f() is a cheap expression: [x+1 + some_func(x+1) for x in range(10)] could be faster than [y + some_func(y) for x in range(10) let y = x + 1] or whatever syntax we come up with. > Someone may argue that one can write > [y + g(y) for y in [f(x) for x in range(10)]] Indeed. This would be the functional-programming solution, and I personally think it is an excellent one. 
The only changes are that I'd use a generator expression for the intermediate value, avoiding the need to make a full list, and I would lay it out more nicely, using whitespace to make the structure more clear: result = [y + g(y) for y in (f(x) for x in range(10)) ] > but it's not as clear as to show what `y` is in a subsequent clause, > not to say there'll be another temporary list built in the process. There's no need to build the temporary list. Use a generator comprehension. And I disagree that the value of y isn't as clear. An alternative is simply to refactor your list comprehension. Move the calls to f() and g() into a helper function: def func(x): y = f(x) return y + g(y) and now you can write the extremely clear comprehension [func(x) for x in range(10)] that needs no extra variable. [...] > In a word, what I'm arguing is that we need a way to assign temporary > variables in a comprehension. "Need" is very strong. I think that the two alternatives I mention above cover 95% of the cases where might use a local variable in a comprehension. And of the remaining cases, many of them will be so complex that they should be re-written as an explicit for-loop. So in my opinion, we're only talking about a "need" to solve the problem for a small proportion of cases: - most comprehensions don't need a local variable (apart from the loop variable) at all; - of those which do need a local variable, most can be easily solved using a nested comprehension or a helper function; - of those which cannot be solved that way, most are complicated enough that they should use a regular for-loop; - leaving only a small number of cases which are complicated enough to genuinely benefit from local variables but not too complicated. So this is very much a borderline feature. Occasionally it would be "nice to have", but on the negative side: - it adds complexity to the language; - makes comprehensions harder to read; - and people will use it unnecessarily where there is no readability or speed benefit (premature optimization again). It is not clear to me that we should burden *all* Python programmers with additional syntax and complexity of an *already complex* feature for such a marginal improvement. > In my opinion, code like > [y + g(y) for x in range(10) **some syntax for `y=f(x)` here**] > is more natural than any solution we now have. > And that's why I pro the new syntax, it's clear, explicit and readable How can you say that the new syntax is "clear, explicit and readable" when you haven't proposed any new syntax yet? For lack of anything better, I'm going to suggest "let y = f(x)" as the syntax, although personally I don't like it even a bit. Where should the assignment go? [(y, y**2) let y = x+1 for x in (1, 2, 3, 4)] [(y, y**2) for x in (1, 2, 3, 4) let y = x+1] I think they're both pretty ugly, but I can't think of anything else. Can we rename the loop variable, or is that an error? [(x, x**2) let x = x+1 for x in (1, 2, 3, 4)] How do they interact when you have multiple loops and if-clauses? 
[(w, w**2) for x in (1, 2, 3, 4) let y = x+1 for a in range(y) let z = a+1 if z > 2 for b in range(z) let w = z+1] For simplicity, perhaps we should limit any such local assignments to the very end of the comprehension: [expression for name in sequence ] but that means we can't optimise this sort of comprehension: [expression for x in sequence for y in (something_expensive(x) + function(something_expensive(x)) ] Or these: [expression for x in sequence if something_expensive(x) or condition(something_expensive(x)) ] I think these are very hard questions to answer. -- Steve From chris.barker at noaa.gov Thu Feb 15 20:37:33 2018 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 15 Feb 2018 20:37:33 -0500 Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: <003c01d3a6ba$8c1af0a0$a450d1e0$@my.bristol.ac.uk> References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> <003c01d3a6ba$8c1af0a0$a450d1e0$@my.bristol.ac.uk> Message-ID: I?d like to clarify that f(x) was indeed meant to be a sequence. As per monad law: *do* { y *<-* *do* { x *<-* m; f x } g y } === *do* { x *<-* m; y *<-* f x; g y } I think you might have misunderstood the types of things; f: Function[a, List[b]] and g: Function[b, List[c]]. m: List[a] I?m still confused ? f and g take a scale and return a list? Or take two parameters, a scalar and a list??? Either way, doesn?t really fit with python comprehensions, which are more or less expecting functions that except and return a single element (which could be a triple, but I digress) And the key point is that in python: for x in seq1 for y in seq2 Does all combinations, like a nested for loop would. But I DO think my old the fly write up went wrong, converting do-notation to list comprehensions isn?t completely straightforward Well, I don?t think I get the do notation at all.... The above is equivalent to [g y | x <- m, y <- f x] in Haskell and the top is [g y | y <- [z |x <- m, z <- f x]] These have analogous structures in python; [g(y) for x in m for y in f(x)] and [g(y) for y in [z for x in m for z in f(x)]] (I think?) Is this current Python or a new proposed syntax ? ?cause I can?t make any sense of it in python. Maybe define some f,g,m that behave as you expect, and see what python does.... If I?ve not made another mistake, that **should* *now work? Now on a phone, so can?t test, but I also can?t tell what you are expecting the result to be. Back to one of your examples: [f(x) for x in [x]] What does that mean??? for x in seq Means iterate through seq, and assign each item to the name x. If that seq has x in it ? I?m not sure that is even legal python ? the scope in a comprehension confuses me. But that is the equivalent is something like: it= iter(seq) while True: Try: x = next(it) Except StopIteration: Break (Excuse the caps ? hard to write code on a phone) So not sure how x gets into that sequence before the loop starts. -CHB *From:* Chris Barker [mailto:chris.barker at noaa.gov ] *Sent:* 15 February 2018 23:34 *To:* jw14896.2014 at my.bristol.ac.uk *Cc:* Evpok Padding ; Python-Ideas < python-ideas at python.org> *Subject:* Re: [Python-ideas] Temporary variables in comprehensions On Thu, Feb 15, 2018 at 2:13 AM, Jamie Willis wrote: These laws define behaviour that is expected equivalent by users; [x for x in xs] = xs OK -- that's the definition... 
[f(x) for x in [x]] = f(x) well, not quite: [f(x) for x in [x]] = [f(x)] Using x in two places where they mean different things makes this odd, but yes, again the definition (of a list comp, and a length-1 sequence) [g(y) for y in [f(x) for x in xs]] = [g(y) for x in xs for y in f(x)] well, no. using two for expressions yields the outer product -- all combinations: In [*14*]: xs = range(3) In [*15*]: [(x,y) *for* x *in* xs *for* y *in* xs] Out[*15*]: [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)] so the result is a length len(seq1)*len(seq(2)) list. Or in this case, a len(xs)**2. But nesting the comps applies one expression, and then the other, yielding a length len(xs) list. but you wrote: [g(y) for x in xs for y in f(x)] which I'm not sure what you were expecting, as f(x) is not a sequence (probably)... To play with your examples: Define some functions that make it clear what's been applied: In [*16*]: *def* f(x): ...: *return* "f(*{}*)".format(x) ...: In [*17*]: *def* g(x): ...: *return* "g(*{}*)".format(x) and a simple sequence to use: In [*18*]: xs = range(3) Now your examples: In [*19*]: [x *for* x *in* xs] Out[*19*]: [0, 1, 2] In [*20*]: [f(x) *for* x *in* [x]] Out[*20*]: ['f(5)'] In [*21*]: [g(y) *for* y *in* [f(x) *for* x *in* xs]] Out[*21*]: ['g(f(0))', 'g(f(1))', 'g(f(2))'] OK -- all good f applied, then g, but then the last one: In [*27*]: [g(y) *for* x *in* xs *for* y *in* f(x)] Out[*27*]: ['g(f)', 'g(()', 'g(0)', 'g())', 'g(f)', 'g(()', 'g(1)', 'g())', 'g(f)', 'g(()', 'g(2)', 'g())'] in this case, f(x) is returning a string, which is a sequence, so you get that kind odd result. But what if f(x) was a simple scalr function: In [*29*]: *def* f(x): ...: *return* 2*x Then you just get an error: In [*30*]: [g(y) *for* x *in* xs *for* y *in* f(x)] --------------------------------------------------------------------------- TypeError Traceback (most recent call last) in () ----> 1 [g(y) for x in xs for y in f(x)] in (.0) ----> 1 [g(y) for x in xs for y in f(x)] TypeError: 'int' object is not iterable The the nested comp is what is desired here: In [*31*]: [g(y) *for* y *in* [f(x) *for* x *in* xs]] Out[*31*]: ['g(0)', 'g(2)', 'g(4)'] Except you probably want a generator expression in the inner loop to avoid bulding an extra list: In [*33*]: [g(y) *for* y *in* (f(x) *for* x *in* xs)] Out[*33*]: ['g(f(0))', 'g(f(1))', 'g(f(2))'] So back to the OP's example: In [*34*]: [f(x) + g(f(x)) *for* x *in* range(10)] Out[*34*]: ['f(0)g(f(0))', 'f(1)g(f(1))', 'f(2)g(f(2))', 'f(3)g(f(3))', 'f(4)g(f(4))', 'f(5)g(f(5))', 'f(6)g(f(6))', 'f(7)g(f(7))', 'f(8)g(f(8))', 'f(9)g(f(9))'] that is best done with comps as: In [*36*]: [fx + g(fx) *for* fx *in* (f(x) *for* x *in* range(10))] Out[*36*]: ['f(0)g(f(0))', 'f(1)g(f(1))', 'f(2)g(f(2))', 'f(3)g(f(3))', 'f(4)g(f(4))', 'f(5)g(f(5))', 'f(6)g(f(6))', 'f(7)g(f(7))', 'f(8)g(f(8))', 'f(9)g(f(9))'] which really doesn't seem bad to me. 
And if the function names are longer -- which they should be, you might want to use a temp as suggested earlier: In [*41*]: fx = (f(x) *for* x *in* range(10)) In [*42*]: [x + g(x) *for* x *in* fx] Out[*42*]: ['f(0)g(f(0))', 'f(1)g(f(1))', 'f(2)g(f(2))', 'f(3)g(f(3))', 'f(4)g(f(4))', 'f(5)g(f(5))', 'f(6)g(f(6))', 'f(7)g(f(7))', 'f(8)g(f(8))', 'f(9)g(f(9))'] The truth is, comprehensions really are a bit wordy, if you are doing a lot of this kind of thing (at least with numbers), you might be happier with an array-oriented language or library, such as numpy: In [*46*]: *import* *numpy* *as* *np* In [*47*]: *def* f(x): ...: *return* x * 2 ...: In [*48*]: *def* g(x): ...: *return* x * 3 ...: ...: In [*49*]: xs = np.arange(3) In [*50*]: f(xs) + g(f(xs)) Out[*50*]: array([ 0, 8, 16]) is pretty compact, and can be "optimized with a temp: In [*51*]: fx = f(xs) ...: fx + g(fx) ...: Out[*51*]: array([ 0, 8, 16]) pretty simple isn't it? So this gets back to -- does anyone have a suggestion for a syntax for comprehensions that would make this substantially clearer, more readable, or more compact? I'm guessing not :-) (the compact bit comes from having to type the "for x in" part twice -- it does *feel* a bit unnecessary. which is why I like numpy -- no "for" at all :-) (I'm still trying to figure out why folks think map() or filter() help here...) -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Feb 15 20:50:51 2018 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 15 Feb 2018 20:50:51 -0500 Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: <20180216005739.GA10142@ando.pearwood.info> References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> <20180216005739.GA10142@ando.pearwood.info> Message-ID: > Setting a new comprehension variable is not likely to be free, and may even be > more costly than calling f(x) twice if f() is a cheap expression: > > [x+1 + some_func(x+1) for x in range(10)] > > could be faster than > > [y + some_func(y) for x in range(10) let y = x + 1] A bit of a nit ? function call overhead is substantial in python, so if that is an actual function, rather than a simple expression, it?ll likely be slower to call it twice for any but trivially small iterables. > [(y, y**2) let y = x+1 for x in (1, 2, 3, 4)] Do we need the let? [ g(y) for y = f(x) for c in seq] Or, with expressions: [y + y**2 for y = x+1 for x in (1,2,3)] Maybe that would be ambiguous? I haven?t thought carefully about it. -CHB From ncoghlan at gmail.com Thu Feb 15 21:06:04 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 16 Feb 2018 12:06:04 +1000 Subject: [Python-ideas] Coming up with an alternative to PEP 505's None-aware operators Message-ID: The recent thread on variable assignment in comprehensions has prompted me to finally share https://gist.github.com/ncoghlan/a1b0482fc1ee3c3a11fc7ae64833a315 with a wider audience (see the comments there for some notes on iterations I've already been through on the idea). 
== The general idea == The general idea would be to introduce a *single* statement local reference using a new keyword with a symbolic prefix: "?it" * `(?it=expr)` is a new atomic expression for an "it reference binding" (whitespace would be permitted around "?it" and "=", but PEP 8 would recommend against it in general) * subsequent subexpressions (in execution order) can reference the bound subexpression using `?it` (an "it reference") * `?it` is reset between statements, including before entering the suite within a compound statement (if you want a persistent binding, use a named variable) * for conditional expressions, put the reference binding in the conditional, as that gets executed first * to avoid ambiguity, especially in function calls (where it could be confused with keyword argument syntax), the parentheses around reference bindings are always required * unlike regular variables, you can't close over statement local references (the nested scope will get an UnboundLocalError if you try it) The core inspiration here is English pronouns (hence the choice of keyword): we don't generally define arbitrary terms in the middle of sentences, but we *do* use pronouns to refer back to concepts introduced earlier in the sentence. And while it's not an especially common practice, pronouns are sometimes even used in a sentence *before* the concept they refer to ;) If we did pursue this, then PEPs 505, 532, and 535 would all be withdrawn or rejected (with the direction being to use an it-reference instead). == Examples == `None`-aware attribute access: value = ?it.strip()[4:].upper() if (?it=var1) is not None else None `None`-aware subscript access: value = ?it[4:].upper() if (?it=var1) is not None else None `None`-coalescense: value = ?it if (?it=var1) is not None else ?it if (?it=var2) is not None else var3 `NaN`-coalescence: value = ?it if not math.isnan((?it=var1)) else ?it if not math.isnan((?that=var2)) else var3 Conditional function call: value = ?it() if (?it=calculate) is not None else default Avoiding repeated evaluation of a comprehension filter condition: filtered_values = [?it for x in keys if (?it=get_value(x)) is not None] Avoiding repeated evaluation for range and slice bounds: range((?it=calculate_start()), ?it+10) data[(?it=calculate_start()):?it+10] Avoiding repeated evaluation in chained comparisons: value if (?it=lower_bound()) <= value < ?it+tolerance else 0 Avoiding repeated evaluation in an f-string: print(f"{?it=get_value()!r} is printed in pure ASCII as {?it!a} and in Unicode as {?it}" == Possible future extensions == One possible future extension would be to pursue PEP 3150, treating the nested namespace as an it reference binding, giving: sorted_data = sorted(data, key=?it.sort_key) given ?it=: def sort_key(item): return item.attr1, item.attr2 (A potential bonus of that spelling is that it may be possible to make "given ?it=:" the syntactic keyword introducing the suite, allowing "given" itself to continue to be used as a variable name) Another possible extension would be to combine it references with `as` clauses on if statements and while loops: if (?it=pattern.match(data)) is not None as matched: ... while (?it=pattern.match(data)) is not None as matched: ... == Why not arbitrary embedded assignments? == Primarily because embedded assignments are inherently hard to read, especially in long expressions. 
Restricting things to one pronoun, and then pursuing PEP 3150's given clause in order to expand to multiple statement local names should help nudge folks towards breaking things up into multiple statements rather than writing ever more complex one-liners. That said, the ?-prefix notation is deliberately designed such that it *could* be used with arbitrary identifiers rather then being limited to a single specific keyword, and the explicit lack of closure support means that there wouldn't be any complex nested scope issues associated with lambda expressions, generator expressions, or container comprehensions. With that approach, "?it" would just be an idiomatic default name like "self" or "cls" rather than being a true keyword. Given arbitrary identifier support, some of the earlier examples might instead be written as: value = ?f() if (?f=calculate) is not None else default range((?start=calculate_start()), ?start+10) value if (?lower=lower_bound()) <= value < ?lower+tolerance else 0 The main practical downside to this approach is that *all* the semantic weight ends up resting on the symbolic "?" prefix, which makes it very difficult to look up as a new Python user. With a keyword embedded in the construct, there's a higher chance that folks will be able to guess the right term to search for (i.e. "python it expression" or "python it keyword"). Another downside of this more flexible option is that it likely *wouldn't* be amenable to the "if expr as name:" syntax extension, as there wouldn't be a single defined pronoun expression to bind the name to. However, the extension to PEP 3150 would allow the statement local namespace to be given an arbitrary name: sorted_data = sorted(data, key=?ns.sort_key) given ?ns=: def sort_key(item): return item.attr1, item.attr2 Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rymg19 at gmail.com Thu Feb 15 21:19:49 2018 From: rymg19 at gmail.com (rymg19 at gmail.com) Date: Thu, 15 Feb 2018 18:19:49 -0800 Subject: [Python-ideas] Coming up with an alternative to PEP 505's None-aware operators In-Reply-To: <> Message-ID: I don't know...to me this looks downright ugly and an awkward special case. It feels like it combines reading difficulty of inline assignment with the awkwardness of a magic word and the ugliness of using ?. Basically, every con of the other proposals combined... -- Ryan (????) Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone elsehttps://refi64.com/ On Feb 15, 2018 at 8:07 PM, > wrote: The recent thread on variable assignment in comprehensions has prompted me to finally share https://gist.github.com/ncoghlan/a1b0482fc1ee3c3a11fc7ae64833a315 with a wider audience (see the comments there for some notes on iterations I've already been through on the idea). 
== The general idea == The general idea would be to introduce a *single* statement local reference using a new keyword with a symbolic prefix: "?it" * `(?it=expr)` is a new atomic expression for an "it reference binding" (whitespace would be permitted around "?it" and "=", but PEP 8 would recommend against it in general) * subsequent subexpressions (in execution order) can reference the bound subexpression using `?it` (an "it reference") * `?it` is reset between statements, including before entering the suite within a compound statement (if you want a persistent binding, use a named variable) * for conditional expressions, put the reference binding in the conditional, as that gets executed first * to avoid ambiguity, especially in function calls (where it could be confused with keyword argument syntax), the parentheses around reference bindings are always required * unlike regular variables, you can't close over statement local references (the nested scope will get an UnboundLocalError if you try it) The core inspiration here is English pronouns (hence the choice of keyword): we don't generally define arbitrary terms in the middle of sentences, but we *do* use pronouns to refer back to concepts introduced earlier in the sentence. And while it's not an especially common practice, pronouns are sometimes even used in a sentence *before* the concept they refer to ;) If we did pursue this, then PEPs 505, 532, and 535 would all be withdrawn or rejected (with the direction being to use an it-reference instead). == Examples == `None`-aware attribute access: value = ?it.strip()[4:].upper() if (?it=var1) is not None else None `None`-aware subscript access: value = ?it[4:].upper() if (?it=var1) is not None else None `None`-coalescense: value = ?it if (?it=var1) is not None else ?it if (?it=var2) is not None else var3 `NaN`-coalescence: value = ?it if not math.isnan((?it=var1)) else ?it if not math.isnan((?that=var2)) else var3 Conditional function call: value = ?it() if (?it=calculate) is not None else default Avoiding repeated evaluation of a comprehension filter condition: filtered_values = [?it for x in keys if (?it=get_value(x)) is not None] Avoiding repeated evaluation for range and slice bounds: range((?it=calculate_start()), ?it+10) data[(?it=calculate_start()):?it+10] Avoiding repeated evaluation in chained comparisons: value if (?it=lower_bound()) <= value < ?it+tolerance else 0 Avoiding repeated evaluation in an f-string: print(f"{?it=get_value()!r} is printed in pure ASCII as {?it!a} and in Unicode as {?it}" == Possible future extensions == One possible future extension would be to pursue PEP 3150, treating the nested namespace as an it reference binding, giving: sorted_data = sorted(data, key=?it.sort_key) given ?it=: def sort_key(item): return item.attr1, item.attr2 (A potential bonus of that spelling is that it may be possible to make "given ?it=:" the syntactic keyword introducing the suite, allowing "given" itself to continue to be used as a variable name) Another possible extension would be to combine it references with `as` clauses on if statements and while loops: if (?it=pattern.match(data)) is not None as matched: ... while (?it=pattern.match(data)) is not None as matched: ... == Why not arbitrary embedded assignments? == Primarily because embedded assignments are inherently hard to read, especially in long expressions. 
Restricting things to one pronoun, and then pursuing PEP 3150's given clause in order to expand to multiple statement local names should help nudge folks towards breaking things up into multiple statements rather than writing ever more complex one-liners. That said, the ?-prefix notation is deliberately designed such that it *could* be used with arbitrary identifiers rather then being limited to a single specific keyword, and the explicit lack of closure support means that there wouldn't be any complex nested scope issues associated with lambda expressions, generator expressions, or container comprehensions. With that approach, "?it" would just be an idiomatic default name like "self" or "cls" rather than being a true keyword. Given arbitrary identifier support, some of the earlier examples might instead be written as: value = ?f() if (?f=calculate) is not None else default range((?start=calculate_start()), ?start+10) value if (?lower=lower_bound()) <= value < ?lower+tolerance else 0 The main practical downside to this approach is that *all* the semantic weight ends up resting on the symbolic "?" prefix, which makes it very difficult to look up as a new Python user. With a keyword embedded in the construct, there's a higher chance that folks will be able to guess the right term to search for (i.e. "python it expression" or "python it keyword"). Another downside of this more flexible option is that it likely *wouldn't* be amenable to the "if expr as name:" syntax extension, as there wouldn't be a single defined pronoun expression to bind the name to. However, the extension to PEP 3150 would allow the statement local namespace to be given an arbitrary name: sorted_data = sorted(data, key=?ns.sort_key) given ?ns=: def sort_key(item): return item.attr1, item.attr2 Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Thu Feb 15 21:31:33 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 15 Feb 2018 21:31:33 -0500 Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> <003c01d3a6ba$8c1af0a0$a450d1e0$@my.bristol.ac.uk> Message-ID: On 2/15/2018 8:37 PM, Chris Barker - NOAA Federal wrote: > Back to one of your examples: > > [f(x) for x in [x]] > > What does that mean??? > > for x in seq > > Means iterate through seq, and assign each item to the name x. > > If that seq has x in it ? I?m not sure that is even legal python ? the > scope in a comprehension confuses me. > > But that is the equivalent is something like: > > it= iter(seq) > while True: > ? ? Try: > ? ? ? ? x = next(it) > ? ? Except StopIteration: > ? ? ? ? Break > > (Excuse the caps ? hard to write code on a phone) > > So not sure how x gets into that sequence before the loop starts. Reusing a previously bound name as an iteration variable is a bad idea. It works in 3.x because the outermost iterable, but only the outermost iterable, is pre-calculated before executing the comprehension. Hence 'x in [x]' sometimes works, and sometimes not. ('Outermost' is topmost in nested loops, left most in comprehension.) 
>>> x = 2 >>> [x*x for x in [x]] [4] >>> [x*y for x in [3] for y in [x]] # here, local x is 3, not 2 [9] >>> [x*y for y in [x] for x in [3]] [6] >>> [x*y for y in [3] for x in [x]] Traceback (most recent call last): File "", line 1, in [x*y for y in [3] for x in [x]] File "", line 1, in [x*y for y in [3] for x in [x]] UnboundLocalError: local variable 'x' referenced before assignment >>> [z*y for y in [3] for z in [x]] # no confusion here [6] To put it another way, l = [x for x in [x]] is actually calculated as _temp = [x]; l = [x for x in _temp]. In general, other iterables cannot be precalculated since they might depend on prior iteration variables. -- Terry Jan Reedy From robertve92 at gmail.com Thu Feb 15 23:03:23 2018 From: robertve92 at gmail.com (Robert Vanden Eynde) Date: Fri, 16 Feb 2018 05:03:23 +0100 Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: References: Message-ID: Hello, talking about this syntax : [y+2 for x in range(5) let y = x+1] *Previous talks* I've had almost exactly the same idea in June 2017, see subject "variable assignment in functional context here: https://mail.python.org/ pipermail/python-ideas/2017-June/subject.html. Currently this can easily be done by iterating over a list of size 1: [y+2 for x in range(5) for y in [x+1]] (Other ways exist, see section "current possibilities" below). This comparison would answer a lot of questions like "can we write [x+2 for x in range(5) for x in [x+1]], so [x+2 for x in range(5) let x = x+1]" the new variable would indeed shadow the old one. In June 2017 I introduced the "let" syntax using the existing "for" keyword, using the "=" instead of a "in" like this: [y+2 for x in range(5) for y = x+1] The only difference I introduced was that it would be logical to accept the syntax in any expression: x = 5 print(y+2 for y = x + 1) or with the "let" keyword: x = 5 print(y+2 let y = x + 1) *Previous talk: Pep* In the June conversation one conclusion was that someone wrote a pep ( https://www.python.org/dev/peps/pep-3150/) in 2010 that's still pending (not exactly the same syntax but the same idea, he used a "given:" syntax) that looked like that : print(y+2 given: y=x+1) *Previous talk: GitHub implementation on Cython* Another conclusion was that someone on GitHub implemented a "where" statement in Cython (url: https://github.com/thektulu/cpython/commit/ 9e669d63d292a639eb6ba2ecea3ed2c0c23f2636) where one could write : print(y+2 where y = x+1) So in a list comprehension: [y+2 where y = x + 1 for x in range(5)] As the author thektulu said "just compile and have fun". *New syntax in list comprehension* However, we probably would like to be able to use this "where" after the for: [y+2 for x in range(5) where y = x+1] This would allow the new variable to be used in further "for" and "if" statement : [y+z for x in range(5) where y = x+1 for z in range(y+1)] *Choice of syntax* Initially I thought re-using the "for" keyword would be a good idea for backward comptability (a variable named "where" in old code wouldn't be a problem), however some people pointed out that the python Grammar wouldn't be LL1, when reading the "for" token it wouldn't be able to choose directly if the rest would be a "for in" or "for =" so actually introducing a dedicated keyword is probably better, like this : [y+2 for x in range(5) where y = x+1] print(y+2 where y = x+1) The "where" keyword is very readable in my opinion, very close to English sentences we use. 
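(A quick runnable check of the two claims earlier in this message: the "iterate over a list of size 1" idiom works today, and rebinding the same name in a later clause does shadow the loop variable for the rest of the comprehension.)

```
# The size-1-list idiom, runnable today.
assert [y + 2 for x in range(5) for y in [x + 1]] == [3, 4, 5, 6, 7]

# Reusing the name x: the inner "for x in [x + 1]" rebinds x, and the
# rebound value is what the result expression sees.
assert [x + 2 for x in range(5) for x in [x + 1]] == [3, 4, 5, 6, 7]
```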
Sometimes we introduce a new variable before using it, generally using "let" or after using it using "where". For example "let y = x+1; print(y+2)". Or "print(y+2 where y = x+1)". The first syntax is chosen in the "let in" syntax in haskell : print(let y = x+2 in y+2) Or chained: print(let x = 2 in let y = x+1 in y+2) But Haskell user would probably line break for clarity : print(let x = 2 in let y = x+1 in y+2) A postfix notation using "where" would probably be less verbose in my opinion. Another example of "postfix notation" python has is with the "a if condition else b" so adding a new one wouldn't be surprising. Furthermore, the postfix notation is preferred in the context of "presenting the result first, then the implementation" (context discussed already in the 2010 pep), the "presenting the result first" is also a goal of the list comprehension, indeed one does write [x+3 for x in range(5)] and not [for x in range(5): x+3], the latter would be more "imperative programming" style, and would be translated to a normal loop. The problem of chaining without parenthesis, how to remove the parenthesis in the following statement ? print((y+2 where y = x+1) where x = 2) We have two options : print(y+2 where x = 2 where y = x+1) print(y+2 where y = x+1 where x = 2) The first option would be probably closer to the way multiple "for" are linked in a list comprehension: [y+2 for x in range(5) for y in [x+1]] But the second option would be more "present result first" and more close to the parenthesized version, the user would create new variables as they go, "I want to compute y+2 but hey, what is y ? It's x+1 ! But what is x ? It's 5 !)). However, keeping the same order as in "multiple for in list comprehension" is better, so I'd choose first option. In the implementation on GitHub of thektulu the parenthesis are mandatory. Another syntax issue would probably surprise some users, the following statement, parenthesized : [(y+2 where y = x+1) for x in range(5)] Would have two totally legal ways to be done: [y+2 where y = x+1 for x in range(5)] [y+2 for x in range(5) where y = x+1] The first one is a consequence of the "where" keyword usable in an expression and the second one is a consequence of using it in a list comprehension. However I think it doesn't break the "there is only one obvious way to do it", because depending on the case, the "y" variable would be a consequence of the iteration or a consequence of the computation. *Goals of the new syntax* Personally, I find it very useful to do an assignment in such context, I use list comprehension for example to generate a big json with multiple for and if. I don't have here a list of big real world example thar would be simplified using this syntax but people interested could search arguments here. People could argue only the "list comprehension" case would be useful and not the "any expression" case: [y+2 for x in range(5) where y = x+1] would be accepted but not : print(y+2 where y=x+1) Because the latter could be written: y = x + 1 print(y+2) However, the idea of having an isolated scope can be a good idea. 
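(A small runnable aside on the isolated-scope point: in Python 3 a comprehension already gets its own scope, so names bound inside it do not leak into the surrounding namespace.)

```
squares = [y * y for y in range(3)]
assert squares == [0, 1, 4]

# The loop variable y stayed local to the comprehension.
try:
    y
except NameError:
    leaked = False
else:
    leaked = True
assert leaked is False
```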
*Current possibilities* Currently we have multiple options : [y+2 for x in range(5) for y in [x+1]] [y+2 for (x+1 for x in range(5))] [(lambda y:y+2)(y=x+1) for x in range(5)] The first one works well but it's not obvious we just want to assign a new variable, especially when the expression is long, or multiple, or both: [y+z for x in range(5) for y,z in [((x + 1) * 2, x ** 2 - 5)]] The second one makes it impossible to reuse the "x" variable and the y = x+1 relation is not obvious. The third example is what a functional programmer would think but is really too much complex for a beginner and very verbose. *Proposed syntax : Conclusion* As I said, I like the "where" syntax with the "where" keyword. [y+2 for x in range(5) where y = x+1] Also usable in any expression : print(y+2 where y = x+1) *Conclusion* Here is all the talk/work/argument I've already found about this syntax. Apparently it's been a while (2010) since such an idea was thought but I think having a new pep listing all the pros and cons would be a good idea. So that we can measure how much the community would want this concept to be introduced, and if it's refused, the community would have a document where the "cons" are clearly written. Robert Vanden Eynde -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Thu Feb 15 23:13:41 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 16 Feb 2018 15:13:41 +1100 Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> Message-ID: <20180216041340.GB10142@ando.pearwood.info> On Thu, Feb 15, 2018 at 10:13:37AM +0000, Jamie Willis wrote: > I'm not sure it does indicate a need for refactoring, I'd argue it's quite > a common pattern, at least in functional languages from which this > construct arises. > > In fact, in those languages, there are laws that govern interactions with > the comprehensions (though this comes from monads and monads perhaps don't > quite apply to pythons model). These laws define behaviour that is expected > equivalent by users; Python is not a functional language, and the usual Pythonic solution to an overly complex or inefficient functional expression is to refactor into non-functional style. In my experience, most Python users have never even heard of monads, and those who have, few grok them. (I know I don't.) > [x for x in xs] = xs > [f(x) for x in [x]] = f(x) > [g(y) for y in [f(x) for x in xs]] = [g(y) for x in xs for y in f(x)] I think these should be: [x for x in xs] = list(xs) # not to be confused with [x] [f(x) for x in [x]] = [f(x)] [g(y) for y in [f(x) for x in xs]] = [g(y) for x in xs for y in [f(x)]] > Even though that last law isn't completely analogous to the given example > from the OP, It can be used though. He has: [f(x) + g(f(x)) for x in xs] which can be written as [y + g(y) for x in xs for y in [f(x)]] Here's an example: # calls ord() twice for each x py> [ord(x) + ord(x)**2 for x in "abc"] [9506, 9702, 9900] # calls ord() once for each x py> [y + y**2 for x in "abc" for y in [ord(x)]] [9506, 9702, 9900] And one last version: py> [y + y**2 for y in [ord(x) for x in "abc"]] [9506, 9702, 9900] In production, I'd change the last example to use a generator comprehension. > the transformation he wants to be able to do does arise from > the laws. 
So it could be argued that not being able to flatten the > comprehension down via law 3 is unexpected behaviour Law 3 does apply, and I'm not sure what you mean by the statement that we can't flatten the comprehension down. -- Steve From ncoghlan at gmail.com Fri Feb 16 02:55:22 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 16 Feb 2018 17:55:22 +1000 Subject: [Python-ideas] Coming up with an alternative to PEP 505's None-aware operators In-Reply-To: References: Message-ID: On 16 February 2018 at 12:19, rymg19 at gmail.com wrote: > I don't know...to me this looks downright ugly and an awkward special case. > It feels like it combines reading difficulty of inline assignment with the > awkwardness of a magic word and the ugliness of using ?. Basically, every > con of the other proposals combined... Yeah, it's tricky to find a spelling that looks nice without being readily confusable with other existing constructs (most notably keyword arguments). The cleanest *looking* idea I've come up with would be to allow arbitrary embedded assignments to ordinary frame local variables using the "(expr as name)" construct: value = tmp.strip()[4:].upper() if (var1 as tmp) is not None else None value = tmp[4:].upper() if (var1 as tmp) is not None else None value = tmp if (var1 as tmp) is not None else tmp if (var2 as tmp) is not None else var3 value = tmp if not math.isnan((var1 as tmp)) else tmp if not math.isnan((var2 as tmp)) else var3 value = f() if (calculate as f) is not None else default filtered_values = [val for x in keys if (get_value(x) as val) is not None] range((calculate_start() as start), start+10) data[(calculate_start() as start):start+10] value if (lower_bound() as min_val) <= value < min_val+tolerance else 0 print(f"{(get_value() as tmp)!r} is printed in pure ASCII as {tmp!a} and in Unicode as {tmp}") However, while I think that looks nicer in general, we'd still have to choose between two surprising behaviours: * implicitly delete the statement locals after the statement where they're set (which still overwrites any values previously bound to those names, similar to what happens with exception clauses) * skip deleting, which means references to subexpressions may last longer than expected (and we'd have the problem where embedded assignments could overwrite existing local variables) The interaction with compound statements would also be tricky to figure out (especially if we went with the "delete after the statement" behaviour). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From kirillbalunov at gmail.com Fri Feb 16 03:36:14 2018 From: kirillbalunov at gmail.com (Kirill Balunov) Date: Fri, 16 Feb 2018 11:36:14 +0300 Subject: [Python-ideas] Coming up with an alternative to PEP 505's None-aware operators In-Reply-To: References: Message-ID: What about (| val = get_value(x) |) assignment expression which will be True if success, and None if not? So it will be value = f() if (| f = calculate |) else default?The idea is inspired from C?s assignment, but needs some special treatment for anything which is False in boolean context. With kind regards, -gdg ? 2018-02-16 10:55 GMT+03:00 Nick Coghlan : > On 16 February 2018 at 12:19, rymg19 at gmail.com wrote: > > I don't know...to me this looks downright ugly and an awkward special > case. > > It feels like it combines reading difficulty of inline assignment with > the > > awkwardness of a magic word and the ugliness of using ?. Basically, every > > con of the other proposals combined... 
> > Yeah, it's tricky to find a spelling that looks nice without being > readily confusable with other existing constructs (most notably > keyword arguments). > > The cleanest *looking* idea I've come up with would be to allow > arbitrary embedded assignments to ordinary frame local variables using > the "(expr as name)" construct: > > value = tmp.strip()[4:].upper() if (var1 as tmp) is not None else None > > value = tmp[4:].upper() if (var1 as tmp) is not None else None > > value = tmp if (var1 as tmp) is not None else tmp if (var2 as tmp) > is not None else var3 > > value = tmp if not math.isnan((var1 as tmp)) else tmp if not > math.isnan((var2 as tmp)) else var3 > > value = f() if (calculate as f) is not None else default > > filtered_values = [val for x in keys if (get_value(x) as val) is not > None] > > range((calculate_start() as start), start+10) > data[(calculate_start() as start):start+10] > > value if (lower_bound() as min_val) <= value < min_val+tolerance else 0 > > print(f"{(get_value() as tmp)!r} is printed in pure ASCII as > {tmp!a} and in Unicode as {tmp}") > > However, while I think that looks nicer in general, we'd still have to > choose between two surprising behaviours: > > * implicitly delete the statement locals after the statement where > they're set (which still overwrites any values previously bound to > those names, similar to what happens with exception clauses) > * skip deleting, which means references to subexpressions may last > longer than expected (and we'd have the problem where embedded > assignments could overwrite existing local variables) > > The interaction with compound statements would also be tricky to > figure out (especially if we went with the "delete after the > statement" behaviour). > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Feb 16 06:02:03 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 16 Feb 2018 21:02:03 +1000 Subject: [Python-ideas] Coming up with an alternative to PEP 505's None-aware operators In-Reply-To: References: Message-ID: On 16 February 2018 at 18:36, Kirill Balunov wrote: > What about (| val = get_value(x) |) assignment expression which will be True > if success, and None if not? > > So it will be value = f() if (| f = calculate |) else default?The idea is > inspired from C?s assignment, but needs some special treatment for anything > which is False in boolean context. If we're going to allow arbitrary embedded assignments, then "(expr as name)" is the most likely spelling, since: * "as" is already a keyword * "expr as name" is already used for name binding related purposes (albeit not for simple assignments) * "python as expression" and "python as keyword" are both things search engines will accept as queries (search engines tend not to cope very well when you try to search for punctuation characters) Cheers, Nick. 
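(For comparison, the comprehension example from earlier in the thread can already be approximated today by iterating over a one-element list; get_value and keys below are made-up stand-ins:)

    def get_value(x):
        return x if x % 3 else None   # stand-in for the real lookup

    keys = range(10)
    filtered_values = [val for x in keys
                       for val in [get_value(x)]
                       if val is not None]
    print(filtered_values)   # [1, 2, 4, 5, 7, 8]
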
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rhodri at kynesim.co.uk Fri Feb 16 09:37:48 2018 From: rhodri at kynesim.co.uk (Rhodri James) Date: Fri, 16 Feb 2018 14:37:48 +0000 Subject: [Python-ideas] Coming up with an alternative to PEP 505's None-aware operators In-Reply-To: References: Message-ID: On 16/02/18 02:06, Nick Coghlan wrote: > The recent thread on variable assignment in comprehensions has > prompted me to finally share > https://gist.github.com/ncoghlan/a1b0482fc1ee3c3a11fc7ae64833a315 with > a wider audience (see the comments there for some notes on iterations > I've already been through on the idea). > > == The general idea == > > The general idea would be to introduce a *single* statement local > reference using a new keyword with a symbolic prefix: "?it" [snip] > If we did pursue this, then PEPs 505, 532, and 535 would all be > withdrawn or rejected (with the direction being to use an it-reference > instead). I don't think that follows. > == Examples == > > `None`-aware attribute access: > > value = ?it.strip()[4:].upper() if (?it=var1) is not None else None > > `None`-aware subscript access: > > value = ?it[4:].upper() if (?it=var1) is not None else None > > `None`-coalescense: > > value = ?it if (?it=var1) is not None else ?it if (?it=var2) is > not None else var3 > > `NaN`-coalescence: > > value = ?it if not math.isnan((?it=var1)) else ?it if not > math.isnan((?that=var2)) else var3 > > > Conditional function call: > > value = ?it() if (?it=calculate) is not None else default > I have to say I don't find this an improvement on "value = var if var is not None else None" and the like. In fact I think it's markedly harder to read. It has the same repetition problem as the current situation, just with added glyphs. > Avoiding repeated evaluation of a comprehension filter condition: > > filtered_values = [?it for x in keys if (?it=get_value(x)) is not None] Definite win here. It doesn't read particularly naturally, but then list comprehensions don't read that naturally either. I would still prefer something that read better. > Avoiding repeated evaluation for range and slice bounds: > > range((?it=calculate_start()), ?it+10) > data[(?it=calculate_start()):?it+10] > > Avoiding repeated evaluation in chained comparisons: > > value if (?it=lower_bound()) <= value < ?it+tolerance else 0 > > Avoiding repeated evaluation in an f-string: > > print(f"{?it=get_value()!r} is printed in pure ASCII as {?it!a} > and in Unicode as {?it}" While these are wins, they don't read nicely at all. I still don't see what's wrong with start = calculate_start() values = range(start, start+10) which beats everything I've seen so far for clarity. -- Rhodri James *-* Kynesim Ltd From lkb.teichmann at gmail.com Fri Feb 16 11:24:10 2018 From: lkb.teichmann at gmail.com (Martin Teichmann) Date: Fri, 16 Feb 2018 17:24:10 +0100 Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> Message-ID: Hi list, > But when it comes to something like > [f(x) + g(f(x)) for x in range(10)] > you find you have to sacrifice some readableness if you don't want two f(x) > which might slow down your code. > > Someone may argue that one can write > [y + g(y) for y in [f(x) for x in range(10)]] personally I think that the biggest problem readability-wise is that "for" is a post-fix operator, which makes generators much harder to read. 
It's also very different from normal for loops, which have the "for" at the top. IMHO generators would be much easier to read with a prefix for, as in [for x in range(10): f(x) + g(f(x))] also nested generators get nicer like that: [for y in (for x in range(10): f(x)): y + g(y)] one could critique here that we shouldn't use colons in expressions, but that boat has sailed: we use them for lambdas. We do not write sq = x**2 lambda x and I am not proposing that. Also if/else could be written with colons, but I am not sure whether that's actually nicer: val = (if attr is None: 5 else: attr + 3) but it certainly is in case of ifs in generators: [for x in range(10): if x % 3 != 2: x] which could be problematic to parse if you compare that to [for x in range(10): if x % 3 == 2: x - 1 else: x + 1] one could even dig out the often-proposed always-rejected except-in-expression: [for x in range(10): try: f(x) except WhateverError: None] or even a with: [for x in file_names: with open(x) as f: f.read()] Greetings Martin -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Fri Feb 16 11:31:18 2018 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 16 Feb 2018 08:31:18 -0800 Subject: [Python-ideas] Coming up with an alternative to PEP 505's None-aware operators In-Reply-To: References: Message-ID: <5A870756.6050503@stoneleaf.us> On 02/15/2018 11:55 PM, Nick Coghlan wrote: > On 16 February 2018 at 12:19, rymg19 at gmail.com wrote: >> I don't know...to me this looks downright ugly and an awkward special case. >> It feels like it combines reading difficulty of inline assignment with the >> awkwardness of a magic word and the ugliness of using ?. Basically, every >> con of the other proposals combined... > > Yeah, it's tricky to find a spelling that looks nice without being > readily confusable with other existing constructs (most notably > keyword arguments). > > The cleanest *looking* idea I've come up with would be to allow > arbitrary embedded assignments to ordinary frame local variables using > the "(expr as name)" construct: -1 to ?it +1 to (name as expr) > However, while I think that looks nicer in general, we'd still have to > choose between two surprising behaviours: > > * implicitly delete the statement locals after the statement where > they're set (which still overwrites any values previously bound to > those names, similar to what happens with exception clauses) If we're overwriting locals anyway, don't delete it. The good reason for unsetting an exception variable doesn't apply here. > * skip deleting, which means references to subexpressions may last > longer than expected (and we'd have the problem where embedded > assignments could overwrite existing local variables) Odds are good that we'll want/need that assignment even after the immediate expression it's used in. Let it stick around. -- ~Ethan~ From chris.barker at noaa.gov Fri Feb 16 16:41:36 2018 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Fri, 16 Feb 2018 13:41:36 -0800 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: References: <20180212134502.GF26553@ando.pearwood.info> Message-ID: Sent from my iPhone > A thought just occurred to me. Maybe we should just add a Boolean class to numbers? This makes lots of sense to me. Bool is a subclass of int ? might as well embrace that fact. 
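A minimal sketch of what that could look like, assuming a hypothetical Boolean ABC registered the same way the rest of the numbers tower is wired up (the name and its placement are illustrative, not an existing API):

    import numbers

    class Boolean(numbers.Integral):
        """Hypothetical ABC; bool becomes a virtual subclass via register()."""
        __slots__ = ()

    Boolean.register(bool)

    print(issubclass(bool, int))        # True today
    print(isinstance(True, Boolean))    # True with the registration above
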
-CHB From fhsxfhsx at 126.com Sat Feb 17 09:47:22 2018 From: fhsxfhsx at 126.com (fhsxfhsx) Date: Sat, 17 Feb 2018 22:47:22 +0800 (CST) Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> Message-ID: <10643d49.9fb.161a43bde4e.Coremail.fhsxfhsx@126.com> A generator can be a good idea, however, I wonder if it's really readable to have a `f_samples`. And the structure is the same as in [y + g(y) for y in [f(x) for x in range(10)]] if you replace the list with a generator. And it's similar for other solutions you mentioned. Well, I know that it can be quite different for everyone to determine whether a piece of code is readable or not, maybe it's wise to wait for more opinions. There's another problem that, as you distinguished the `simple` and `more complex` case, the offered solutions seem very specific, I'm not sure if there's a universal solution for every case of temporary variables in comprehensions. If not, I think it's too high cost to reject a new syntax if you need to work out a new solution every time. At 2018-02-15 17:53:21, "Evpok Padding" wrote: For simple cases such as `[y + g(y) for y in [f(x) for x in range(10)]]`, I don't really see what the issue is, if you really want to make it shorter, you can ``[y + g(y) for y in map(f,range(10))]` which is one of the rare case where I like `map` more than comprehensions. For more complex case, just define a intermediate generator along the lines ``` f_samples = (f(x) for x in range(10)) [y+g(y) for y in f_samples] ``` Which does exactly the same thing but - Is more readable and explicit - Has no memory overhead thanks to lazy evaluation (btw, you should consider generators for your nested comprenshions) While I am sometimes in the same state of mind, wishing for variables in comprehensions seems to me like a good indicator that your code needs refactoring. Best, E On 15 February 2018 at 10:32, Jamie Willis wrote: > > I +1 this at surface level; Both Haskell list comprehensions and Scala for comprehensions have variable assignment in them, even between iterating and this is often very useful. Perhaps syntax can be generalised as: > > [expr_using_x_and_y > for i in is > x = expr_using_i > for j in is > y = expr_using_j_and_x] > > This demonstrates the scope of each assignment; available in main result and then every clause that follows it. > > Sorry to op who will receive twice, forgot reply to all > > On 15 Feb 2018 7:03 am, "fhsxfhsx" wrote: >> >> As far as I can see, a comprehension like >> alist = [f(x) for x in range(10)] >> is better than a for-loop >> for x in range(10): >> alist.append(f(x)) >> because the previous one shows every element of the list explicitly so that we don't need to handle `append` mentally. >> >> But when it comes to something like >> [f(x) + g(f(x)) for x in range(10)] >> you find you have to sacrifice some readableness if you don't want two f(x) which might slow down your code. >> >> Someone may argue that one can write >> [y + g(y) for y in [f(x) for x in range(10)]] >> but it's not as clear as to show what `y` is in a subsequent clause, not to say there'll be another temporary list built in the process. >> We can even replace every comprehension with map and filter, but that would face the same problems. >> >> In a word, what I'm arguing is that we need a way to assign temporary variables in a comprehension. 
>> In my opinion, code like >> [y + g(y) for x in range(10) **some syntax for `y=f(x)` here**] >> is more natural than any solution we now have. >> And that's why I pro the new syntax, it's clear, explicit and readable, and is nothing beyond the functionality of the present comprehensions so it's not complicated. >> >> And I hope the discussion could focus more on whether we should allow assigning temporary variables in comprehensions rather than how to solve the specific example I mentioned above. >> >> >> >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fhsxfhsx at 126.com Sat Feb 17 11:23:28 2018 From: fhsxfhsx at 126.com (fhsxfhsx) Date: Sun, 18 Feb 2018 00:23:28 +0800 (CST) Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> Message-ID: <21d694a9.b30.161a493d83e.Coremail.fhsxfhsx@126.com> Thank you Paul, what you said is enlightening and I agree on most part of it. I'll propose two candidate syntaxs. 1. `with ... as ...` This syntax is more paralles as there would be `for` and `with` clause as well as `for` and `with` statement. However, the existing `with` statement is semantically different from this one, although similar. 2. `for ... is ...` This syntax is more uniform as the existing `for` clause make an iterator which is a special kind of variable. However, I'm afraid this syntax might be confused with `for ... in ...` as they differ only on one letter. I like the latter one better. Other proposes are absolutely welcome. And here is an example which appears quite often in my code where I think a new syntax can help a lot: Suppose I have an list of goods showing by their ids in database, and I need to transform the ids into json including information from two tables, `Goods` and `GoodsCategory`, where the first table recording `id`, `name` and `category_id` indicating which category the goods belongs to, the second table recording `id`, `name` and `type` to the categories. With the new syntax, I can write [ { 'id': goods.id, 'name': goods.name, 'category': gc.name, 'category_type': gc.type, } for goods_id in goods_id_list for goods is Goods.get_by_id(goods_id) for gc is GoodsCategory.get_by_id(goods.category_id) ] And I cannot think of any good solutions as this one without it. To generalize this case, for each element of the list, I need two temporary variables (`goods` and `gc` in my case), and each one was used twice. And reply to the two past discussions you mentioned, 1.https://mail.python.org/pipermail/python-ideas/2011-April/009863.html This mail gave a solution to modify function `f` to keep the result. The weak point is obvious, you must modify the function `f`. 2.https://mail.python.org/pipermail/python-ideas/2012-January/013468.html This mail wrote >The important thing is that you name the thing you care about before using it. >I think this is a very natural way of writing: first you give the thing you >care about a name, then you refer to it by name. 
However, that's a problem every comprehension faces, not a problem drawn by the new syntax. At 2018-02-15 18:08:46, "Paul Moore" wrote: >On 15 February 2018 at 05:56, fhsxfhsx wrote: >> As far as I can see, a comprehension like >> alist = [f(x) for x in range(10)] >> is better than a for-loop >> for x in range(10): >> alist.append(f(x)) >> because the previous one shows every element of the list explicitly so that >> we don't need to handle `append` mentally. > >... as long as the code inside the comprehension remains relatively >simple. It's easy to abuse comprehensions to the point where they are >less readable than a for loop, but that's true of a lot of things, so >isn't a specific problem with comprehensions. > >> But when it comes to something like >> [f(x) + g(f(x)) for x in range(10)] >> you find you have to sacrifice some readableness if you don't want two f(x) >> which might slow down your code. > >Agreed. I hit that quite often. > >> Someone may argue that one can write >> [y + g(y) for y in [f(x) for x in range(10)]] >> but it's not as clear as to show what `y` is in a subsequent clause, not to >> say there'll be another temporary list built in the process. >> We can even replace every comprehension with map and filter, but that would >> face the same problems. > >That is a workaround (and one I'd not thought of before) but I agree >it's ugly, and reduces readability. Actually, factoring out the inner >comprehension like Evpok Padding suggests: > > f_samples = (f(x) for x in range(10)) > [y+g(y) for y in f_samples] > >is very readable and effective, IMO, so it's not *that* obvious that >local names are beneficial. > >> In a word, what I'm arguing is that we need a way to assign temporary >> variables in a comprehension. > >"We need" is a bit strong here. "It would be useful to have" is >probably true for some situations. > >> In my opinion, code like >> [y + g(y) for x in range(10) **some syntax for `y=f(x)` here**] >> is more natural than any solution we now have. >> And that's why I pro the new syntax, it's clear, explicit and readable, and >> is nothing beyond the functionality of the present comprehensions so it's >> not complicated. > >The problem is that you haven't proposed an actual syntax here, just >that one should be invented. There have been discussions on this in >the past (a quick search found >https://mail.python.org/pipermail/python-ideas/2011-April/009863.html >and https://mail.python.org/pipermail/python-ideas/2012-January/013468.html, >for example). > >> And I hope the discussion could focus more on whether we should allow >> assigning temporary variables in comprehensions rather than how to solve the >> specific example I mentioned above. > >The problem isn't so much "whether we should allow it" as "can we find >a syntax that is acceptable", and only then "does the new syntax give >sufficient benefit to be worth adding". New syntax has a pretty high >cost, and proposals that don't suggest explicit syntax will get stuck >because you can't judge whether adding the capability is "worth it" >without being clear on what the cost is - particularly when the >benefit is relatively small (which this is). > >Agreed that it's important to focus on the general problem, but part >of the discussion *will* include arguing as to why the existing >workarounds and alternatives are less acceptable than new syntax. And >that will need to include discussion of specific cases. 
Generally, in >that sort of discussion, artificial examples like "y=f(x)" don't fare >well because it's too easy to end up just debating subjective views on >"readability". If you can provide examples from real-world code that >clearly demonstrate the cost in terms of maintainability of the >existing workarounds, that will help your argument a lot. Although >you'll need to be prepared for questions like "would you be willing to >drop support for versions of Python older than 3.8 in order to get >this improvement?" - it's surprisingly hard to justify language (as >opposed to library) changes when you really stop and think about it. >Which is not to say that it can't be done, just that it's easy to >underestimate the effort needed. > >Paul -------------- next part -------------- An HTML attachment was scrubbed... URL: From fhsxfhsx at 126.com Sat Feb 17 11:57:16 2018 From: fhsxfhsx at 126.com (fhsxfhsx) Date: Sun, 18 Feb 2018 00:57:16 +0800 (CST) Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> Message-ID: <4df725f8.b50.161a4b2cd02.Coremail.fhsxfhsx@126.com> You are right and actually I sometimes did the same thing in a temporary script such as in ipython. Because in my opinion, it's not really elegant code as one may be puzzled for the list `[f(x)]`. Well, although that's quite subjective. And also I test the code and find another `for` clause can make time cost about 1.5 times in my computer, even when I optimize `[f(x)]` to a generator `(f(x), )`. Though that's not very big issue if you don't care about efficiency that much. But a temporary list or generator is redundant here. `[f(x)]` can be an alternative, but I think it is worth a new syntax. At 2018-02-15 18:11:47, "Stephan Houben" wrote: Note that you can already do: [y + g(y) for x in range(10) for y in [f(x)]] i.e. for y in [expr] does exactly what the OP wants. No new syntax needed. If you hang out on python-list , you'll soon notice that many newbies struggle already with the list comprehension syntax. It's a mini-language which is almost, but not entirely, exactly unlike normal Python code. Let's not complicate it further. Stephan 2018-02-15 10:53 GMT+01:00 Evpok Padding : For simple cases such as `[y + g(y) for y in [f(x) for x in range(10)]]`, I don't really see what the issue is, if you really want to make it shorter, you can ``[y + g(y) for y in map(f,range(10))]` which is one of the rare case where I like `map` more than comprehensions. For more complex case, just define a intermediate generator along the lines ``` f_samples = (f(x) for x in range(10)) [y+g(y) for y in f_samples] ``` Which does exactly the same thing but - Is more readable and explicit - Has no memory overhead thanks to lazy evaluation (btw, you should consider generators for your nested comprenshions) While I am sometimes in the same state of mind, wishing for variables in comprehensions seems to me like a good indicator that your code needs refactoring. Best, E On 15 February 2018 at 10:32, Jamie Willis wrote: > > I +1 this at surface level; Both Haskell list comprehensions and Scala for comprehensions have variable assignment in them, even between iterating and this is often very useful. Perhaps syntax can be generalised as: > > [expr_using_x_and_y > for i in is > x = expr_using_i > for j in is > y = expr_using_j_and_x] > > This demonstrates the scope of each assignment; available in main result and then every clause that follows it. 
> > Sorry to op who will receive twice, forgot reply to all > > On 15 Feb 2018 7:03 am, "fhsxfhsx" wrote: >> >> As far as I can see, a comprehension like >> alist = [f(x) for x in range(10)] >> is better than a for-loop >> for x in range(10): >> alist.append(f(x)) >> because the previous one shows every element of the list explicitly so that we don't need to handle `append` mentally. >> >> But when it comes to something like >> [f(x) + g(f(x)) for x in range(10)] >> you find you have to sacrifice some readableness if you don't want two f(x) which might slow down your code. >> >> Someone may argue that one can write >> [y + g(y) for y in [f(x) for x in range(10)]] >> but it's not as clear as to show what `y` is in a subsequent clause, not to say there'll be another temporary list built in the process. >> We can even replace every comprehension with map and filter, but that would face the same problems. >> >> In a word, what I'm arguing is that we need a way to assign temporary variables in a comprehension. >> In my opinion, code like >> [y + g(y) for x in range(10) **some syntax for `y=f(x)` here**] >> is more natural than any solution we now have. >> And that's why I pro the new syntax, it's clear, explicit and readable, and is nothing beyond the functionality of the present comprehensions so it's not complicated. >> >> And I hope the discussion could focus more on whether we should allow assigning temporary variables in comprehensions rather than how to solve the specific example I mentioned above. >> >> >> >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From fhsxfhsx at 126.com Sat Feb 17 13:54:42 2018 From: fhsxfhsx at 126.com (fhsxfhsx) Date: Sun, 18 Feb 2018 02:54:42 +0800 (CST) Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: <20180216005739.GA10142@ando.pearwood.info> References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> <20180216005739.GA10142@ando.pearwood.info> Message-ID: <6c02900d.50.161a51e4e16.Coremail.fhsxfhsx@126.com> Hi Steve, Thank you for so detailed comments. My comments also below interleaved with yours. At 2018-02-16 08:57:40, "Steven D'Aprano" wrote: >Hi fhsxfhsx, and welcome. > >My comments below, interleaved with yours. > > >On Thu, Feb 15, 2018 at 01:56:44PM +0800, fhsxfhsx wrote: > >[quoted out of order] >> And I hope the discussion could focus more on whether we should allow >> assigning temporary variables in comprehensions rather than how to >> solve the specific example I mentioned above. > >Whether or not to allow this proposal will depend on what alternate >solutions to the problem already exist, so your specific example is very >relevant. Any proposed change has to compete with existing solutions. 
>> >> As far as I can see, a comprehension like >> alist = [f(x) for x in range(10)] >> is better than a for-loop >> for x in range(10): >> alist.append(f(x)) >> because the previous one shows every element of the list explicitly so >> that we don't need to handle `append` mentally. > >While I personally agree with you, many others disagree. I know quite a >few experienced, competent Python programmers who avoid list >comprehensions because they consider them harder to read and reason >about. They consider a regular for-loop better precisely because you do >see the explicit call to append. > >(In my experience, those of us who get functional-programming idioms >often forget that others find them tricky.) > >The point is that list comprehensions are already complex enough that >they are difficult for many people to learn, and some people never come >to grips with them. Adding even more features comes with a cost. > >The bottom line is that it isn't clear to me that allowing local >variables inside comprehensions will make them more readable. > To be frank, I had not thought of this before. However, in my opinion, when considering adding a new syntax, we care more about the marginal cost. I mean, I think it is the functional-programming way which is tricky, but allowing a new syntax would not make things worse. Well, that's just a guess, maybe only those who are puzzled with comprehensions can give us an answer. > >> But when it comes to something like >> [f(x) + g(f(x)) for x in range(10)] >> you find you have to sacrifice some readableness if you don't want two >> f(x) which might slow down your code. > >The usual comments about premature optimisation apply here. > >Setting a new comprehension variable is not likely to be free, and may even be >more costly than calling f(x) twice if f() is a cheap expression: > > [x+1 + some_func(x+1) for x in range(10)] > >could be faster than > > [y + some_func(y) for x in range(10) let y = x + 1] > >or whatever syntax we come up with. > It is true. But since there are still so many cases where a temporary variable is faster. Also, even without let-clause, one can write a for-loop with a temporary variable which slow down the code. So, it seems that "setting a new comprehension variable may even be more costly" does not show any uselessness of temporary variables in comprehensions. > >> Someone may argue that one can write >> [y + g(y) for y in [f(x) for x in range(10)]] > >Indeed. This would be the functional-programming solution, and I >personally think it is an excellent one. The only changes are that I'd >use a generator expression for the intermediate value, avoiding the need >to make a full list, and I would lay it out more nicely, using >whitespace to make the structure more clear: > > result = [y + g(y) for y in > (f(x) for x in range(10)) > ] > In my opinion, [ y + g(y) for x in range(10) let y = f(x) ] is better because it's more corresponding to a for-loop for x in range(10): y = f(x) result.append(y + g(y)) In my opinion, comprehensions are not real functional-programming because there is not even a function. Though there're similarities, map and filter are real functional-programming. Since the similarity between for-clause in comprehensions and the for-loop, I think it's better to write comprehensions more close to for-loop. I don't know but I guess maybe it can also help those who fear comprehensions better embrace them? 
> >> but it's not as clear as to show what `y` is in a subsequent clause, >> not to say there'll be another temporary list built in the process. > >There's no need to build the temporary list. Use a generator >comprehension. And I disagree that the value of y isn't as clear. > >An alternative is simply to refactor your list comprehension. Move the >calls to f() and g() into a helper function: > >def func(x): > y = f(x) > return y + g(y) > >and now you can write the extremely clear comprehension > >[func(x) for x in range(10)] > >that needs no extra variable. > I think it can be a goods idea if there's a name to `func` which is easy to understand, or `func` is put close to the comprehension and is in a obvious place. But I feel it's not for the case I gave in another mail to Paul, https://mail.python.org/pipermail/python-ideas/2018-February/048997.html, (I'm sorry that the example is quite long, and I don't hope to copy it here) To me, it can be confusing to have several `func` when I have several lists at the same time and have to transform them each in a similar but different way. > > >[...] >> In a word, what I'm arguing is that we need a way to assign temporary >> variables in a comprehension. > >"Need" is very strong. I think that the two alternatives I mention above >cover 95% of the cases where might use a local variable in a >comprehension. And of the remaining cases, many of them will be so >complex that they should be re-written as an explicit for-loop. So in my >opinion, we're only talking about a "need" to solve the problem for a >small proportion of cases: > >- most comprehensions don't need a local variable (apart from > the loop variable) at all; > >- of those which do need a local variable, most can be easily > solved using a nested comprehension or a helper function; > >- of those which cannot be solved that way, most are complicated > enough that they should use a regular for-loop; > >- leaving only a small number of cases which are complicated enough > to genuinely benefit from local variables but not too complicated. > >So this is very much a borderline feature. Occasionally it would be >"nice to have", but on the negative side: > >- it adds complexity to the language; > >- makes comprehensions harder to read; > >- and people will use it unnecessarily where there is no readability > or speed benefit (premature optimization again). > >It is not clear to me that we should burden *all* Python programmers >with additional syntax and complexity of an *already complex* feature >for such a marginal improvement. > > >> In my opinion, code like >> [y + g(y) for x in range(10) **some syntax for `y=f(x)` here**] >> is more natural than any solution we now have. >> And that's why I pro the new syntax, it's clear, explicit and readable > >How can you say that the new syntax is "clear, explicit and readable" >when you haven't proposed any new syntax yet? > >For lack of anything better, I'm going to suggest "let y = f(x)" as the >syntax, although personally I don't like it even a bit. > >Where should the assignment go? > > [(y, y**2) let y = x+1 for x in (1, 2, 3, 4)] > > [(y, y**2) for x in (1, 2, 3, 4) let y = x+1] > >I think they're both pretty ugly, but I can't think of anything else. > >Can we rename the loop variable, or is that an error? > > [(x, x**2) let x = x+1 for x in (1, 2, 3, 4)] > >How do they interact when you have multiple loops and if-clauses? 
> > [(w, w**2) for x in (1, 2, 3, 4) let y = x+1 > for a in range(y) let z = a+1 if z > 2 > for b in range(z) let w = z+1] > > >For simplicity, perhaps we should limit any such local assignments to >the very end of the comprehension: > > [expression for name in sequence > > > ] > >but that means we can't optimise this sort of comprehension: > > [expression for x in sequence > for y in (something_expensive(x) + function(something_expensive(x)) > ] > >Or these: > > [expression for x in sequence > if something_expensive(x) or condition(something_expensive(x)) > ] > > >I think these are very hard questions to answer. > I think the assignment should be treated equally as for-clause and if-clause which means [(y, y**2) for x in (1, 2, 3, 4) let y = x+1] would be a right syntax. And [(x, x**2) for x in (1, 2, 3, 4) let x = x+1] would not cause an error because we can also write [(x, x**2) for x in (1, 2, 3, 4) for x in (4, 3, 2, 1)] now. I didn't see any problem in [(w, w**2) for x in (1, 2, 3, 4) let y = x+1 for a in range(y) let z = a+1 if z > 2 for b in range(z) let w = z+1] In my opinion, it would behave the same as for x in (1, 2, 3, 4): y = x+1 for a in range(y): z = a+1 if z > 2: for b in range(z): w = z+1 mylist.append((w, w**2)) According to my understanding, the present for-clause and if-clause does everything quite similar to this nested way, > >-- >Steve >_______________________________________________ >Python-ideas mailing list >Python-ideas at python.org >https://mail.python.org/mailman/listinfo/python-ideas >Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From fhsxfhsx at 126.com Sun Feb 18 09:07:16 2018 From: fhsxfhsx at 126.com (fhsxfhsx) Date: Sun, 18 Feb 2018 22:07:16 +0800 (CST) Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: References: Message-ID: <53f5a323.d4c.161a93d83cb.Coremail.fhsxfhsx@126.com> Thanks so much for the comments and the collect on this syntax! Comments on *previous talk* The list also mentioned some other previous proposals, so I myself search it and find that there're even more in the mailing list since 2008. https://mail.python.org/pipermail/python-ideas/2008-August/001842.html Comments on *previous talk: PEP* The PEP seems to be about an explicit temporary namespace where any objects (including functions, classes, etc.) could be held. However, I find that this syntax may not work for temporary variables in comprehensions. We might expect this syntax work like [?.y+2 for x in range(5) given: y = x+1] But notice that the proposed `given` statement here appeared in comprehensions, where syntax changes in comprehensions are needed, and nothing about this is mentioned in the PEP. What's more, the proposed syntax is a statement like `for` statement, `if` statement, rather than a for-clause or if-clause in comprehensions. In my opinion, it's not a good idea to have a statement in comprehensions. Instead, another given-clause should be added if one wants to write code like above, doing assignments in comprehensions, which can look like: [?.y+2 for x in range(5) given y = x+1] So the ? symbol seems useless here, so maybe [y+2 for x in range(5) given y = x+1] make it quite similar to the `where` syntax you proposed. So, I agree with you that it is a good idea to have a new PEP, though for a different reason. 
Comments on *Choice of syntax* I gave two candidate syntaxs in https://mail.python.org/pipermail/python-ideas/2008-August/001842.html, one said `for ... in ...`, another said `with ... as ...`. The first has the same problem as `for ... = ...` you proposed. And the biggest problem I think the second will face is the difference in semantic between the `with` statement and it. When it comes to `where ... = ...`, there are one possible problem I can think of. `where` is not now a keyword in Python. There are WHERE clauses in SQL, so in many modules including peewee and SQLAlchemy, `where` is an important method. The new syntax would cause quite incompatibilities. Personally I agreed with you that postfix notation would have advantage over prefix notation. Other than `where`, `with` is quite readable in my opinion. So maybe `with ... = ...` can be another candidate? Plus the `given ... = ...` I mentioned above, there are several more candidates now. Personally I perfer `with ... = ...`, because `with` is a python keyword so it would be good for backward compatibility. *About comprehensions and expressions* You gave `print(y+2 where y = x+1)` as an example, I think it should be clarified that it's not, or at least, does not look like a comprehension, but an expression. It should give an object `y+2` rather than a list or a generator. (otherwise what list can it give?) There are for-clause and if-clause for comprehensions, and if-clause (aka ternary operators) for expressions. So, In my opinion, there should be additional discuss and examples to show that it's helpful to have such syntax. For the where-clause in expressions, I think we could refer to how python handles if-clause. The following setences are legal: [x if x else 0 for x in mylist if x > 10] The following illegal: [x if x for x in mylist if x > 10] [x if x else 0 for x in mylist if x > 10 else 10] That's to say, the two kinds of if-clause, one is only used in expressions (aka `ternary operator`), the other is only used in comprehensions. They slightly differ in syntax. The where-clause might work in similar ways. Then [z+2 where z = y+1 for x in mylist where y = x+1] means [(z+2 where z=y+1) for x in mylist where y = x+1] where the parenthesis (also the part before the first `for`) is a expression, the rest is comprehension clauses. To be more accurate, the new syntax would be: test: where_test ['if' where_test 'else' test] | lambdef where_test: or_test | ( '(' or_test 'where' NAME '=' testlist_star_expr ')' ) Mandatory parenthesis in where_test is to resolve the ambiguity in code like print(y+2 where y=x+1 if x>0 else x-1 if x>1 else 0) It could be analysed like print((y+2 where y=x+1 if x>0 else x-1) if x>1 else 0) or print(y+2 where y=(x+1 if x>0 else x-1 if x>1 else 0)). I guess thektulu may have mandatory parenthesis for the same reason. I haven't check the new syntax very carefully so there might be other ambiguities. Another example is print(y+2 if x>0 else y-2 where y=x+1) with mandatory parenthesis, one must write print((y+2 if x>0 else y-2 where y=x+1)) or print(y+2 if x>0 else (y-2 where y=x+1)) However, it might still confuse many people. I wonder whether it's a good idea to have such syntax. It would be much easier to add assignments in comprehensions. 
comp_iter: comp_for | comp_if | comp_where comp_where: 'where' NAME '=' testlist_star_expr [comp_iter] Comments on *Goals of the new syntax* I have a real-world example in https://mail.python.org/pipermail/python-ideas/2018-February/048997.html, it's to generate a big json, you seem to have a very goods feeling of it though you didn't give a real-world example. At 2018-02-16 12:03:23, "Robert Vanden Eynde" wrote: Hello, talking about this syntax : [y+2 for x in range(5) let y = x+1] *Previous talks* I've had almost exactly the same idea in June 2017, see subject "variable assignment in functional context here: https://mail.python.org/pipermail/python-ideas/2017-June/subject.html. Currently this can easily be done by iterating over a list of size 1: [y+2 for x in range(5) for y in [x+1]] (Other ways exist, see section "current possibilities" below). This comparison would answer a lot of questions like "can we write [x+2 for x in range(5) for x in [x+1]], so [x+2 for x in range(5) let x = x+1]" the new variable would indeed shadow the old one. In June 2017 I introduced the "let" syntax using the existing "for" keyword, using the "=" instead of a "in" like this: [y+2 for x in range(5) for y = x+1] The only difference I introduced was that it would be logical to accept the syntax in any expression: x = 5 print(y+2 for y = x + 1) or with the "let" keyword: x = 5 print(y+2 let y = x + 1) *Previous talk: Pep* In the June conversation one conclusion was that someone wrote a pep (https://www.python.org/dev/peps/pep-3150/) in 2010 that's still pending (not exactly the same syntax but the same idea, he used a "given:" syntax) that looked like that : print(y+2 given: y=x+1) *Previous talk: GitHub implementation on Cython* Another conclusion was that someone on GitHub implemented a "where" statement in Cython (url: https://github.com/thektulu/cpython/commit/9e669d63d292a639eb6ba2ecea3ed2c0c23f2636) where one could write : print(y+2 where y = x+1) So in a list comprehension: [y+2 where y = x + 1 for x in range(5)] As the author thektulu said "just compile and have fun". *New syntax in list comprehension* However, we probably would like to be able to use this "where" after the for: [y+2 for x in range(5) where y = x+1] This would allow the new variable to be used in further "for" and "if" statement : [y+z for x in range(5) where y = x+1 for z in range(y+1)] *Choice of syntax* Initially I thought re-using the "for" keyword would be a good idea for backward comptability (a variable named "where" in old code wouldn't be a problem), however some people pointed out that the python Grammar wouldn't be LL1, when reading the "for" token it wouldn't be able to choose directly if the rest would be a "for in" or "for =" so actually introducing a dedicated keyword is probably better, like this : [y+2 for x in range(5) where y = x+1] print(y+2 where y = x+1) The "where" keyword is very readable in my opinion, very close to English sentences we use. Sometimes we introduce a new variable before using it, generally using "let" or after using it using "where". For example "let y = x+1; print(y+2)". Or "print(y+2 where y = x+1)". The first syntax is chosen in the "let in" syntax in haskell : print(let y = x+2 in y+2) Or chained: print(let x = 2 in let y = x+1 in y+2) But Haskell user would probably line break for clarity : print(let x = 2 in let y = x+1 in y+2) A postfix notation using "where" would probably be less verbose in my opinion. 
Another example of "postfix notation" python has is with the "a if condition else b" so adding a new one wouldn't be surprising. Furthermore, the postfix notation is preferred in the context of "presenting the result first, then the implementation" (context discussed already in the 2010 pep), the "presenting the result first" is also a goal of the list comprehension, indeed one does write [x+3 for x in range(5)] and not [for x in range(5): x+3], the latter would be more "imperative programming" style, and would be translated to a normal loop. The problem of chaining without parenthesis, how to remove the parenthesis in the following statement ? print((y+2 where y = x+1) where x = 2) We have two options : print(y+2 where x = 2 where y = x+1) print(y+2 where y = x+1 where x = 2) The first option would be probably closer to the way multiple "for" are linked in a list comprehension: [y+2 for x in range(5) for y in [x+1]] But the second option would be more "present result first" and more close to the parenthesized version, the user would create new variables as they go, "I want to compute y+2 but hey, what is y ? It's x+1 ! But what is x ? It's 5 !)). However, keeping the same order as in "multiple for in list comprehension" is better, so I'd choose first option. In the implementation on GitHub of thektulu the parenthesis are mandatory. Another syntax issue would probably surprise some users, the following statement, parenthesized : [(y+2 where y = x+1) for x in range(5)] Would have two totally legal ways to be done: [y+2 where y = x+1 for x in range(5)] [y+2 for x in range(5) where y = x+1] The first one is a consequence of the "where" keyword usable in an expression and the second one is a consequence of using it in a list comprehension. However I think it doesn't break the "there is only one obvious way to do it", because depending on the case, the "y" variable would be a consequence of the iteration or a consequence of the computation. *Goals of the new syntax* Personally, I find it very useful to do an assignment in such context, I use list comprehension for example to generate a big json with multiple for and if. I don't have here a list of big real world example thar would be simplified using this syntax but people interested could search arguments here. People could argue only the "list comprehension" case would be useful and not the "any expression" case: [y+2 for x in range(5) where y = x+1] would be accepted but not : print(y+2 where y=x+1) Because the latter could be written: y = x + 1 print(y+2) However, the idea of having an isolated scope can be a good idea. *Current possibilities* Currently we have multiple options : [y+2 for x in range(5) for y in [x+1]] [y+2 for (x+1 for x in range(5))] [(lambda y:y+2)(y=x+1) for x in range(5)] The first one works well but it's not obvious we just want to assign a new variable, especially when the expression is long, or multiple, or both: [y+z for x in range(5) for y,z in [((x + 1) * 2, x ** 2 - 5)]] The second one makes it impossible to reuse the "x" variable and the y = x+1 relation is not obvious. The third example is what a functional programmer would think but is really too much complex for a beginner and very verbose. *Proposed syntax : Conclusion* As I said, I like the "where" syntax with the "where" keyword. [y+2 for x in range(5) where y = x+1] Also usable in any expression : print(y+2 where y = x+1) *Conclusion* Here is all the talk/work/argument I've already found about this syntax. 
Apparently it's been a while (2010) since such an idea was thought but I think having a new pep listing all the pros and cons would be a good idea. So that we can measure how much the community would want this concept to be introduced, and if it's refused, the community would have a document where the "cons" are clearly written. Robert Vanden Eynde -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sun Feb 18 11:37:24 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 19 Feb 2018 03:37:24 +1100 Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: <21d694a9.b30.161a493d83e.Coremail.fhsxfhsx@126.com> References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> <21d694a9.b30.161a493d83e.Coremail.fhsxfhsx@126.com> Message-ID: <20180218163723.GF10142@ando.pearwood.info> On Sun, Feb 18, 2018 at 12:23:28AM +0800, fhsxfhsx wrote: > Thank you Paul, what you said is enlightening and I agree on most part of it. > > I'll propose two candidate syntaxs. > 1. `with ... as ...` > This syntax is more paralles as there would be `for` and `with` > clause as well as `for` and `with` statement. However, the > existing `with` statement is semantically different from this one, > although similar. I don't think they are even a little bit similar. The existing `with` statement is for controlling cleanup code (using a context manager). This proposal doesn't have anything to do with context managers or cleanup code. It's just a different way to spell "name = value" inside comprehensions. > 2. `for ... is ...` > This syntax is more uniform as the existing `for` clause make an > iterator which is a special kind of variable. However, I'm afraid > this syntax might be confused with `for ... in ...` as they differ > only on one letter. Indeed. And frankly, treated as English grammar, "for value is name" doesn't make sense and is horribly ugly to my eyes: result = [x for value in sequence for value+1 is x] > And here is an example which appears quite often in my code where I > think a new syntax can help a lot: Suppose I have an list of goods > showing by their ids in database, and I need to transform the ids into > json including information from two tables, `Goods` and > `GoodsCategory`, where the first table recording `id`, `name` and > `category_id` indicating which category the goods belongs to, the > second table recording `id`, `name` and `type` to the categories. > > With the new syntax, I can write > [ > { > 'id': goods.id, > 'name': goods.name, > 'category': gc.name, > 'category_type': gc.type, > } > for goods_id in goods_id_list > for goods is Goods.get_by_id(goods_id) > for gc is GoodsCategory.get_by_id(goods.category_id) > ] > > And I cannot think of any good solutions as this one without it. I can of a few, starting with the most simple: write a helper function. Not every problem needs to be solved with new syntax. def dict_from_id(goods_id): goods = Goods.get_by_id(goods_id) gc = GoodsCategory.get_by_id(goods.category_id) return {'id': goods.id, 'name': goods.name, 'category': gc.name, 'category_type': gc.type } result = [dict_from_id(goods_id) for goods_id in goods_id_list] That's much nicer to read, you can document and test the dict_from_id() function, no new systax is required, it is easy to refactor, and I very much doubt that adding one extra function call is going to be a significant slowdown compared to the cost of two calls to get_by_id() methods and constructing a dict. 
(And if as you add more fields to the dict, the overhead of the function call becomes an even smaller proportion.) Or you can make this a method of the goods object, which is arguably a better OO design. Let the goods object be responsible for creating the dict. result = [Goods.get_by_id(goods_id).make_dict() for goods_id in goods_id_list] # or if you prefer result = [goods.make_dict() for goods in map(Goods.get_by_id, goods_id_list)] Here is a third solution: use a for-loop iterating over a single-item tuple to get the effect of a local assignment to a temporary variable: result = [{ # dict display truncated for brevity... } for goods_id in goods_id_list for goods in (Goods.get_by_id(goods_id),) for gc in (GoodsCategory.get_by_id(goods.category_id),) ] If you don't like the look of single-item tuples (foo,) you can use single-item lists instead [x] but they are a tiny bit slower to create. Serhiy has suggested that the interpreter can optimize the single-item loop to make it as fast as a bare assignment: https://bugs.python.org/issue32856 I think this is a neat trick, although Yuri thinks it is an ugly hack and doesn't want to encourage it. Neat or ugly, I think it is better than "for value is name". -- Steve From ncoghlan at gmail.com Sun Feb 18 20:57:39 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 19 Feb 2018 11:57:39 +1000 Subject: [Python-ideas] Coming up with an alternative to PEP 505's None-aware operators In-Reply-To: <5A870756.6050503@stoneleaf.us> References: <5A870756.6050503@stoneleaf.us> Message-ID: On 17 February 2018 at 02:31, Ethan Furman wrote: > On 02/15/2018 11:55 PM, Nick Coghlan wrote: >> However, while I think that looks nicer in general, we'd still have to >> choose between two surprising behaviours: >> >> * implicitly delete the statement locals after the statement where >> they're set (which still overwrites any values previously bound to >> those names, similar to what happens with exception clauses) > > > If we're overwriting locals anyway, don't delete it. The good reason for > unsetting an exception variable doesn't apply here. > >> * skip deleting, which means references to subexpressions may last >> longer than expected (and we'd have the problem where embedded >> assignments could overwrite existing local variables) > > Odds are good that we'll want/need that assignment even after the immediate > expression it's used in. Let it stick around. If we want to use a subexpression in multiple statements, then regular assignment statements already work fine - breaking out a separate variable assignment only feels like an inconvenience when we use a subexpression twice in a single statement, and then don't need it any further. By contrast, if we have an implicit del immediately after the statement for any statement local variables, then naming a subexpression only extends its life to the end of the statement, not to the end of the current function, and it's semantically explicit that you *can't* use statement locals to name subexpressions that *aren't* statement local. The other concern I have with any form of statement local variables that can overwrite regular locals is that we'd be reintroducing the problem that comprehensions have in Python 2.x: unexpectedly rebinding things in non-obvious ways. 
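(For anyone who hasn't been bitten by it, a minimal illustration of that Python 2.x behaviour -- the comprehension variable leaks into, and clobbers, the enclosing scope:

    x = 10
    squares = [x * x for x in range(3)]
    print(x)    # Python 2.7 prints 2; Python 3 prints 10

Python 3 already fixed this for comprehensions by giving them their own scope.)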
At least with an implicit "del" the error would be more readily apparent, and if we disallow closing over statement local variables (which would be reasonable, since closures aren't statement local either), then we can avoid interfering with regular locals without needing to introduce a new execution scope. So let's consider a spec for statement local variable semantics that looks like this: 1. Statement local variables do *not* appear in locals() 2. Statement local variables are *not* visible in nested scopes (of any kind) 3. Statement local variables in compound statement headers are visible in the body of that compound statement 4. Due to 3, statement local variable references will be syntactically distinct from regular local variable references 5. Nested uses of the same statement local variable name will shadow outer uses, rather than overwriting them The most discreet syntactic marker we have available is a single leading dot, which would allow the following (note that in the simple assignment cases, breaking out a preceding assignment would be easy, but the perk of the statement local spelling is that it works in *any* expression context): value = .s.strip()[4:].upper() if (var1 as .s) is not None else None value = .s[4:].upper() if (var1 as .s) is not None else None value = .v if (var1 as .v) is not None else .v if (var2 as .v) is not None else var3 value = .v if not math.isnan((var1 as .v)) else tmp if not math.isnan((var2 as .v)) else var3 value = .f() if (calculate as .f) is not None else default filtered_values = [.v for x in keys if (get_value(x) as .v) is not None] range((calculate_start() as .start), .start+10) data[(calculate_start() as .start):.start+10] value if (lower_bound() as .min_val) <= value < .min_val+tolerance else 0 print(f"{(get_value() as .v)!r} is printed in pure ASCII as {.v!a} and in Unicode as {.v}") if (pattern.search(data) as .m) is not None: # .m is available here as the match result else: # .m is also available here (but will always be None given the condition) # .m is no longer available here Except clauses would be updated to allow the "except ExceptionType as .exc" spelling, which would give full statement local semantics (i.e. disallow closing over the variable, hide it from locals), rather than just deleting it at the end of the clause execution. Similarly, with statements would allow "with cm as .enter_result" to request statement local semantics for the enter result. (One potential concern here would be the not-immediately-obvious semantic difference between "with (cm as .the_cm):" and "with cm as .enter_result:"). To make that work at an implementation level we'd then need to track the following in the compiler: * the current nested statement level in the current compilation (so we can generate distinct names at each level) * a per-statement set of local variable names (so we can clear them at the end of the statement) * the largest number of concurrently set statement local variables (so we can allocate space for them in the frame) * the storage offset to use for each statement local variable and then frames would need an additional storage area for statement locals, as well as new opcodes for accessing them. 
Adding yet more complexity to an already complicated scoping model is an inherently dubious proposal, but at the same time, it does provide a way to express "and" and "or" semantics in terms of statement local variables and conditional expressions, and comparison chaining in terms of statement local variables and the "and" operator (so conceptually this kind of primitive does already exist in the language, just only as an operator-specific special case inside the interpreter). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From noreply81 at t-online.de Sun Feb 18 23:44:10 2018 From: noreply81 at t-online.de (guido) Date: Mon, 19 Feb 2018 05:44:10 +0100 (CET) Subject: [Python-ideas] (no subject) Message-ID: <1519015450546.827426.34e458954e2111fbae040c52204f24642a932cfd@spica.telekom.de> An HTML attachment was scrubbed... URL: From noreply81 at t-online.de Mon Feb 19 00:44:55 2018 From: noreply81 at t-online.de (guido) Date: Mon, 19 Feb 2018 06:44:55 +0100 Subject: [Python-ideas] PEP 468 Message-ID: <1eneFz-0lgbqq0@fwd25.t-online.de> Gesendet von Mail f?r Windows 10 -------------- next part -------------- An HTML attachment was scrubbed... URL: From noreply81 at t-online.de Mon Feb 19 06:37:49 2018 From: noreply81 at t-online.de (guido) Date: Mon, 19 Feb 2018 12:37:49 +0100 (CET) Subject: [Python-ideas] (no subject) Message-ID: <1519040269201.898187.9ef5d5aac0d403c0aa9406117a7553a7dd01f78c@spica.telekom.de> An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Python-ast.c Type: application/binary Size: 272553 bytes Desc: not available URL: From sylvain.marie at schneider-electric.com Mon Feb 19 08:58:06 2018 From: sylvain.marie at schneider-electric.com (Sylvain MARIE) Date: Mon, 19 Feb 2018 13:58:06 +0000 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: References: <20180212134502.GF26553@ando.pearwood.info> Message-ID: > A thought just occurred to me. Maybe we should just add a Boolean class to numbers? That would be great indeed > It's a subclass of Integral, presumably. And normally only builtins.bool is registered with it. But np.bool can be added at the same point you register the other np integral types. I would rather suggest to keep that Boolean ABC class independent of Integral (see proposal in first post) to let it remain 'pure', i.e. represent logical booleans only. However nothing prevents us to register python bool as a virtual subclass of *both* Integral and Boolean - while np.bool would be registered as a virtual subclass of Boolean only. This would reflect quite well the reality - the fact that python bool is both a Boolean and an Integer, while numpy bool is only a Boolean. By the way, is there a reason for the name "Integral" (algebraic theory) instead of "Integer" (computer science) ? Would it be practically feasible to add "Integer" as an alias to "Integral" in the numbers package ? Sylvain -----Message d'origine----- De?: Chris Barker - NOAA Federal [mailto:chris.barker at noaa.gov] Envoy??: vendredi 16 f?vrier 2018 22:42 ??: guido at python.org Cc?: Sylvain MARIE ; Python-Ideas Objet?: Re: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module Sent from my iPhone > A thought just occurred to me. Maybe we should just add a Boolean class to numbers? This makes lots of sense to me. Bool is a subclass of int ? might as well embrace that fact. 
-CHB ______________________________________________________________________ This email has been scanned by the Symantec Email Security.cloud service. ______________________________________________________________________ From storchaka at gmail.com Mon Feb 19 12:40:30 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Mon, 19 Feb 2018 19:40:30 +0200 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: References: <20180212134502.GF26553@ando.pearwood.info> Message-ID: 15.02.18 18:27, Guido van Rossum ????: > A thought just occurred to me. Maybe we should just add a Boolean class > to numbers? It's a subclass of Integral, presumably. Isn't bool a subclass of int only for historical reasons? I think that if bool was in Python from the beginning, it would not be an int subclass. Operations inherited from int like division or bits shift doesn't make sense as boolean operations. The only boolean operations are testing for truthfulness, `not`, `and` and `or`. But every object in Python supports them. The bool class is just a type of constants True and False. From ethan at stoneleaf.us Mon Feb 19 13:35:28 2018 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 19 Feb 2018 10:35:28 -0800 Subject: [Python-ideas] Coming up with an alternative to PEP 505's None-aware operators In-Reply-To: References: <5A870756.6050503@stoneleaf.us> Message-ID: <5A8B18F0.2080200@stoneleaf.us> On 02/18/2018 05:57 PM, Nick Coghlan wrote: > On 17 February 2018 at 02:31, Ethan Furman wrote: >> On 02/15/2018 11:55 PM, Nick Coghlan wrote: >>> However, while I think that looks nicer in general, we'd still have to >>> choose between two surprising behaviours: >>> >>> * implicitly delete the statement locals after the statement where >>> they're set (which still overwrites any values previously bound to >>> those names, similar to what happens with exception clauses) >> >> >> If we're overwriting locals anyway, don't delete it. The good reason for >> unsetting an exception variable doesn't apply here. >> >>> * skip deleting, which means references to subexpressions may last >>> longer than expected (and we'd have the problem where embedded >>> assignments could overwrite existing local variables) >> >> Odds are good that we'll want/need that assignment even after the immediate >> expression it's used in. Let it stick around. > > If we want to use a subexpression in multiple statements, then regular > assignment statements already work fine - breaking out a separate > variable assignment only feels like an inconvenience when we use a > subexpression twice in a single statement, and then don't need it any > further. > > By contrast, if we have an implicit del immediately after the > statement for any statement local variables, then naming a > subexpression only extends its life to the end of the statement, not > to the end of the current function, and it's semantically explicit > that you *can't* use statement locals to name subexpressions that > *aren't* statement local. > > The other concern I have with any form of statement local variables > that can overwrite regular locals is that we'd be reintroducing the > problem that comprehensions have in Python 2.x: unexpectedly rebinding > things in non-obvious ways. 
At least with an implicit "del" the error > would be more readily apparent, and if we disallow closing over > statement local variables (which would be reasonable, since closures > aren't statement local either), then we can avoid interfering with > regular locals without needing to introduce a new execution scope. Good points. I see two possibly good solutions: - don't override existing local variables, implicit del after statement - override existing local variables, no implicit del after statement I like the first one better, as it mirrors list comprehensions and is simple to understand. The second one is okay, and if significantly easier to implement I would be okay with. It's the combination of: - override existing local variables, implicit del after statement that I abhor. As I understand it, only try/except has that behavior -- and we have a really good reason for it, and it's the exception to the general rule, and... Okay, after further thought I like the second one better. List comps have the outside brackets as a reminder that they have their own scope, but these "statement local" variables only have internal parenthesis. I still don't like the third option. -- ~Ethan~ From davidfstr at gmail.com Mon Feb 19 14:13:27 2018 From: davidfstr at gmail.com (David Foster) Date: Mon, 19 Feb 2018 11:13:27 -0800 Subject: [Python-ideas] Coming up with an alternative to PEP 505's None-aware operators In-Reply-To: References: Message-ID: <5f00656b-20fe-daeb-8e99-5cf1f4d3b855@gmail.com> (1) This proposal serves well to eliminate repeated computations by allowing what is an inline assignment to a temporary variable. But it doesn't seem to make the case of None-aware operators any less verbose than they would be otherwise. Proposal: value = ?it.strip()[4:].upper() if (?it=var1) is not None else None Traditional: it = var1 value = it.strip()[4:].upper() if it is not None else None If we wanted to get rid of the "if it is not None else None" boilerplate, we'd need something more concise. For example - completely off the cuff - syntax like: value = var1?.strip()[4:].upper() I'm not really interested in going down a rabbit hole of discussing those kinds of syntax alternatives on this thread. I just want to point out that the current proposal don't seem to make None-aware operations much more concise than they were before except in the case of complex subexpressions being None-tested, which I find uncommon in my own programs. (2) In considering the proposal in the alternative light of specifically trying to eliminate complex subexpressions, I'll also put a +1 in for (expr as it) rather than (?it=) since I find it reads nicer. Also is consistent with existing syntax (with expr as var). Cheers, David -- David Foster | Seattle, WA, USA On 2/15/18 6:06 PM, Nick Coghlan wrote: > The recent thread on variable assignment in comprehensions has > prompted me to finally share > https://gist.github.com/ncoghlan/a1b0482fc1ee3c3a11fc7ae64833a315 with > a wider audience (see the comments there for some notes on iterations > I've already been through on the idea). 
> > == The general idea == > > The general idea would be to introduce a *single* statement local > reference using a new keyword with a symbolic prefix: "?it" > > * `(?it=expr)` is a new atomic expression for an "it reference > binding" (whitespace would be permitted around "?it" and "=", but PEP > 8 would recommend against it in general) > * subsequent subexpressions (in execution order) can reference the > bound subexpression using `?it` (an "it reference") > * `?it` is reset between statements, including before entering the > suite within a compound statement (if you want a persistent binding, > use a named variable) > * for conditional expressions, put the reference binding in the > conditional, as that gets executed first > * to avoid ambiguity, especially in function calls (where it could be > confused with keyword argument syntax), the parentheses around > reference bindings are always required > * unlike regular variables, you can't close over statement local > references (the nested scope will get an UnboundLocalError if you try > it) > > The core inspiration here is English pronouns (hence the choice of > keyword): we don't generally define arbitrary terms in the middle of > sentences, but we *do* use pronouns to refer back to concepts > introduced earlier in the sentence. And while it's not an especially > common practice, pronouns are sometimes even used in a sentence > *before* the concept they refer to ;) > > If we did pursue this, then PEPs 505, 532, and 535 would all be > withdrawn or rejected (with the direction being to use an it-reference > instead). > > == Examples == > > `None`-aware attribute access: > > value = ?it.strip()[4:].upper() if (?it=var1) is not None else None > > `None`-aware subscript access: > > value = ?it[4:].upper() if (?it=var1) is not None else None > > `None`-coalescense: > > value = ?it if (?it=var1) is not None else ?it if (?it=var2) is > not None else var3 > > `NaN`-coalescence: > > value = ?it if not math.isnan((?it=var1)) else ?it if not > math.isnan((?that=var2)) else var3 > > Conditional function call: > > value = ?it() if (?it=calculate) is not None else default > > Avoiding repeated evaluation of a comprehension filter condition: > > filtered_values = [?it for x in keys if (?it=get_value(x)) is not None] > > Avoiding repeated evaluation for range and slice bounds: > > range((?it=calculate_start()), ?it+10) > data[(?it=calculate_start()):?it+10] > > Avoiding repeated evaluation in chained comparisons: > > value if (?it=lower_bound()) <= value < ?it+tolerance else 0 > > Avoiding repeated evaluation in an f-string: > > print(f"{?it=get_value()!r} is printed in pure ASCII as {?it!a} > and in Unicode as {?it}" > > == Possible future extensions == > > One possible future extension would be to pursue PEP 3150, treating > the nested namespace as an it reference binding, giving: > > sorted_data = sorted(data, key=?it.sort_key) given ?it=: > def sort_key(item): > return item.attr1, item.attr2 > > (A potential bonus of that spelling is that it may be possible to make > "given ?it=:" the syntactic keyword introducing the suite, allowing > "given" itself to continue to be used as a variable name) > > Another possible extension would be to combine it references with `as` > clauses on if statements and while loops: > > if (?it=pattern.match(data)) is not None as matched: > ... > > while (?it=pattern.match(data)) is not None as matched: > ... > > == Why not arbitrary embedded assignments? 
== > > Primarily because embedded assignments are inherently hard to read, > especially in long expressions. Restricting things to one pronoun, and > then pursuing PEP 3150's given clause in order to expand to multiple > statement local names should help nudge folks towards breaking things > up into multiple statements rather than writing ever more complex > one-liners. > > That said, the ?-prefix notation is deliberately designed such that it > *could* be used with arbitrary identifiers rather then being limited > to a single specific keyword, and the explicit lack of closure support > means that there wouldn't be any complex nested scope issues > associated with lambda expressions, generator expressions, or > container comprehensions. > > With that approach, "?it" would just be an idiomatic default name like > "self" or "cls" rather than being a true keyword. Given arbitrary > identifier support, some of the earlier examples might instead be > written as: > > value = ?f() if (?f=calculate) is not None else default > range((?start=calculate_start()), ?start+10) > value if (?lower=lower_bound()) <= value < ?lower+tolerance else 0 > > The main practical downside to this approach is that *all* the > semantic weight ends up resting on the symbolic "?" prefix, which > makes it very difficult to look up as a new Python user. With a > keyword embedded in the construct, there's a higher chance that folks > will be able to guess the right term to search for (i.e. "python it > expression" or "python it keyword"). > > Another downside of this more flexible option is that it likely > *wouldn't* be amenable to the "if expr as name:" syntax extension, as > there wouldn't be a single defined pronoun expression to bind the name > to. > > However, the extension to PEP 3150 would allow the statement local > namespace to be given an arbitrary name: > > sorted_data = sorted(data, key=?ns.sort_key) given ?ns=: > def sort_key(item): > return item.attr1, item.attr2 > > Cheers, > Nick. > From stefan_ml at behnel.de Mon Feb 19 14:15:27 2018 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 19 Feb 2018 20:15:27 +0100 Subject: [Python-ideas] Why CPython is still behind in performance for some widely used patterns ? In-Reply-To: References: Message-ID: Nick Coghlan schrieb am 02.02.2018 um 06:47: > to make the various extension module authoring tools > easier to discover, rather than having folks assuming that handcrafted > calls directly into the CPython C API is their only option. Or even a competitive option. Tools like Cython or pybind11 go to great length to shave off every bit of overhead from C-API calls, commonly replacing high-level C-API functionality with macros and direct access to data structures. The C/C++ code that they generate is so complex and tuned that it would be infeasible to write and maintain something like that by hand, but it can perfectly be generated, and it usually performs visibly better than most hand-written modules, definitely much better than anything a non-expert could write. Basically, by not learning the C-API you can benefit from all that highly tuned and specialised code written by C-API experts that the documentation doesn't even tell you about. 
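(To give a sense of scale: even a deliberately trivial, untyped module like

    # add.pyx -- compiled with cythonize(); purely illustrative
    def add(a, b):
        return a + b

comes out of Cython as a couple of thousand lines of generated C, most of it argument unpacking, error propagation and shared helper code that nobody would want to write or maintain by hand.)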
Stefan From brett at python.org Mon Feb 19 15:18:13 2018 From: brett at python.org (Brett Cannon) Date: Mon, 19 Feb 2018 20:18:13 +0000 Subject: [Python-ideas] PEP 468 In-Reply-To: <1eneFz-0lgbqq0@fwd25.t-online.de> References: <1eneFz-0lgbqq0@fwd25.t-online.de> Message-ID: FYI I have unsubscribed this person from the mailing list. If they persist to send empty emails I will figure out how to block them. On Sun, 18 Feb 2018 at 22:04 guido wrote: > > > > > Gesendet von Mail f?r > Windows 10 > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Feb 19 17:33:06 2018 From: guido at python.org (Guido van Rossum) Date: Mon, 19 Feb 2018 14:33:06 -0800 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: References: <20180212134502.GF26553@ando.pearwood.info> Message-ID: On Mon, Feb 19, 2018 at 5:58 AM, Sylvain MARIE < sylvain.marie at schneider-electric.com> wrote: > > A thought just occurred to me. Maybe we should just add a Boolean class > to numbers? > > That would be great indeed > > > It's a subclass of Integral, presumably. And normally only builtins.bool > is registered with it. But np.bool can be added at the same point you > register the other np integral types. > > I would rather suggest to keep that Boolean ABC class independent of > Integral (see proposal in first post) to let it remain 'pure', i.e. > represent logical booleans only. However nothing prevents us to register > python bool as a virtual subclass of *both* Integral and Boolean - while > np.bool would be registered as a virtual subclass of Boolean only. This > would reflect quite well the reality - the fact that python bool is both a > Boolean and an Integer, while numpy bool is only a Boolean. > OK, that could work. At this point I think you should just file an issue on bugs.python.org (but since Python 3.7 is in feature freeze, expect this to be put on the 3.8 track). > By the way, is there a reason for the name "Integral" (algebraic theory) > instead of "Integer" (computer science) ? Would it be practically feasible > to add "Integer" as an alias to "Integral" in the numbers package ? > Hm, perhaps Integral is an adjective, just like Boolean? Though it's also possible that it was simply a mistake. In general I don't like adding aliases for different spellings -- it violates TOOWTDI. If we decide that this really was a mistake we should go ahead and make Integer the recommended way and define Integral as an alias for backwards compatibility. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Feb 19 23:08:41 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 20 Feb 2018 14:08:41 +1000 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: References: <20180212134502.GF26553@ando.pearwood.info> Message-ID: On 20 February 2018 at 08:33, Guido van Rossum wrote: > Hm, perhaps Integral is an adjective, just like Boolean? Though it's also > possible that it was simply a mistake. In general I don't like adding > aliases for different spellings -- it violates TOOWTDI. 
If we decide that > this really was a mistake we should go ahead and make Integer the > recommended way and define Integral as an alias for backwards compatibility. FWIW, I had to run `dir(numbers)` while tinkering at the REPL based on this discussion, because I was surprised that "numbers.Integer" was giving me an attribute error. Checking PEP 3141 doesn't shed any light on the question either - while that cites Scheme as the origin of the numeric tower structure, the given reference at https://groups.csail.mit.edu/mac/ftpdir/scheme-reports/r5rs-html/r5rs_8.html#SEC50 uses "integer" rather than "integral". Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From turnbull.stephen.fw at u.tsukuba.ac.jp Tue Feb 20 00:53:23 2018 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Tue, 20 Feb 2018 14:53:23 +0900 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: References: <20180212134502.GF26553@ando.pearwood.info> Message-ID: <23179.47059.454599.372593@turnbull.sk.tsukuba.ac.jp> Guido van Rossum writes: > Hm, perhaps Integral is an adjective, just like Boolean? I would guess so. This is the same idiom we use when we call [1, 2, 3] a "truth-y". From solipsis at pitrou.net Tue Feb 20 01:17:26 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 20 Feb 2018 07:17:26 +0100 Subject: [Python-ideas] Why CPython is still behind in performance for some widely used patterns ? References: Message-ID: <20180220071726.6c3330ca@fsol> On Mon, 19 Feb 2018 20:15:27 +0100 Stefan Behnel wrote: > Nick Coghlan schrieb am 02.02.2018 um 06:47: > > to make the various extension module authoring tools > > easier to discover, rather than having folks assuming that handcrafted > > calls directly into the CPython C API is their only option. > > Or even a competitive option. Tools like Cython or pybind11 go to great > length to shave off every bit of overhead from C-API calls, commonly > replacing high-level C-API functionality with macros and direct access to > data structures. The C/C++ code that they generate is so complex and tuned > that it would be infeasible to write and maintain something like that by > hand, but it can perfectly be generated, and it usually performs visibly > better than most hand-written modules, definitely much better than anything > a non-expert could write. > > Basically, by not learning the C-API you can benefit from all that highly > tuned and specialised code written by C-API experts that the documentation > doesn't even tell you about. Doesn't the documentation ever mention Cython? It probably should (no idea about pybind11, which I've never played with). Perhaps you can open an issue about that? As a sidenote, you can certainly use Cython without learning the C API, but to extract maximum performance it's better to know the C API anyway, to be aware of what kind of optimizations are available in which situations. Regards Antoine. From ncoghlan at gmail.com Tue Feb 20 08:12:15 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 20 Feb 2018 23:12:15 +1000 Subject: [Python-ideas] Why CPython is still behind in performance for some widely used patterns ? 
In-Reply-To: <20180220071726.6c3330ca@fsol> References: <20180220071726.6c3330ca@fsol> Message-ID: On 20 February 2018 at 16:17, Antoine Pitrou wrote: > On Mon, 19 Feb 2018 20:15:27 +0100 > Stefan Behnel wrote: >> Nick Coghlan schrieb am 02.02.2018 um 06:47: >> > to make the various extension module authoring tools >> > easier to discover, rather than having folks assuming that handcrafted >> > calls directly into the CPython C API is their only option. >> >> Or even a competitive option. Tools like Cython or pybind11 go to great >> length to shave off every bit of overhead from C-API calls, commonly >> replacing high-level C-API functionality with macros and direct access to >> data structures. The C/C++ code that they generate is so complex and tuned >> that it would be infeasible to write and maintain something like that by >> hand, but it can perfectly be generated, and it usually performs visibly >> better than most hand-written modules, definitely much better than anything >> a non-expert could write. >> >> Basically, by not learning the C-API you can benefit from all that highly >> tuned and specialised code written by C-API experts that the documentation >> doesn't even tell you about. > > Doesn't the documentation ever mention Cython? It probably should (no > idea about pybind11, which I've never played with). Perhaps you can > open an issue about that? We mention them in the Extending & Embedding guide, and link out to the page on packaging.python.org that describes them in more detail: https://docs.python.org/3/extending/index.html#recommended-third-party-tools Cheers, Nick. P.S. There are also a number of open issues at https://github.com/pypa/python-packaging-user-guide/issues regarding additional projects that should be mentioned in https://packaging.python.org/guides/packaging-binary-extensions/ -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Tue Feb 20 12:06:27 2018 From: guido at python.org (Guido van Rossum) Date: Tue, 20 Feb 2018 09:06:27 -0800 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: <23179.47059.454599.372593@turnbull.sk.tsukuba.ac.jp> References: <20180212134502.GF26553@ando.pearwood.info> <23179.47059.454599.372593@turnbull.sk.tsukuba.ac.jp> Message-ID: Looking at https://en.wikipedia.org/wiki/Number it seems that Integer is "special" -- every other number type is listed as " numbers" (e.g. rational numbers, complex numbers) but integers are listed as "Integers". So let's just switch it to that, and keep Integral as an alias for backwards compatibility. I don't think it's a huge problem to fix this in 3.7b2, if someone wants to do the work. On Mon, Feb 19, 2018 at 9:53 PM, Stephen J. Turnbull < turnbull.stephen.fw at u.tsukuba.ac.jp> wrote: > Guido van Rossum writes: > > > Hm, perhaps Integral is an adjective, just like Boolean? > > I would guess so. This is the same idiom we use when we call > [1, 2, 3] a "truth-y". > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Tue Feb 20 15:14:33 2018 From: brett at python.org (Brett Cannon) Date: Tue, 20 Feb 2018 20:14:33 +0000 Subject: [Python-ideas] importlib: making FileFinder easier to extend In-Reply-To: References: Message-ID: Basically what you're after is a way to extend the default finder with a new file type. 
Historically you didn't want this because of the performance hit of the extra stat call to check that new file extension (this has been greatly alleviated in Python 3 through the caching of directory contents). But I would still argue that you don't necessarily want this for e.g. the stdlib or any other random project which might just happen to have a file with the same file extension as the one you want to have special support for. I also don't think we want a class attribute to contain the default loaders since not everyone will want those default semantics in all cases either. Since we're diving into deep levels of customization I would eschew anything that makes assumptions about what you want. I think the best we could consider is making importlib.machinery._get_supported_loaders() a public API. That way you can easily construct a finder with the default loaders plus your custom ones. After that you can then provide a custom sys.path_hooks entry that recognizes the directories which contain your custom file type. If that seems reasonable then feel free to open an enhancement request at bugs.python.org to discuss the API and then we can discuss how to implement a PR for it. On Wed, 7 Feb 2018 at 07:04 Erik Bray wrote: > Hello, > > Brief problem statement: Let's say I have a custom file type (say, > with extension .foo) and these .foo files are included in a package > (along with other Python modules with standard extensions like .py and > .so), and I want to make these .foo files importable like any other > module. > > On its face, importlib.machinery.FileFinder makes this easy. I make a > loader for my custom file type (say, FooSourceLoader), and I can use > the FileFinder.path_hook helper like: > > sys.path_hooks.insert(0, FileFinder.path_hook((FooSourceLoader, ['.foo']))) > sys.path_importer_cache.clear() > > Great--now I can import my .foo modules like any other Python module. > However, any standard Python modules now cannot be imported. The way > PathFinder sys.meta_path hook works, sys.path_hooks entries are > first-come-first-serve, and furthermore FileFinder.path_hook is very > promiscuous--it will take over module loading for *any* directory on > sys.path, regardless what the file extensions are in that directory. > So although this mechanism is provided by the stdlib, it can't really > be used for this purpose without breaking imports of normal modules > (and maybe it's not intended for that purpose, but the documentation > is unclear). > > There are a number of different ways one could get around this. One > might be to pass FileFinder.path_hook loaders/extension pairs for all > the basic file types known by the Python interpreter. Unfortunately > there's no great way to get that information. *I* know that I want to > support .py, .pyc, .so etc. files, and I know which loaders to use for > them. But that's really information that should belong to the Python > interpreter, and not something that should be reverse-engineered. In > fact, there is such a mapping provided by > importlib.machinery._get_supported_file_loaders(), but this is not a > publicly documented function. > > One could probably think of other workarounds. For example you could > implement a custom sys.meta_path hook. But I think it shouldn't be > necessary to go to higher levels of abstraction in order to do > this--the default sys.path handler should be able to handle this use > case.
> > In order to support adding support for new file types to > sys.path_hooks, I ended up implementing the following hack: > > ############################################################# > import os > import sys > > from importlib.abc import PathEntryFinder > > > @PathEntryFinder.register > class MetaFileFinder: > """ > A 'middleware', if you will, between the PathFinder sys.meta_path hook, > and sys.path_hooks hooks--particularly FileFinder. > > The hook returned by FileFinder.path_hook is rather 'promiscuous' in > that > it will handle *any* directory. So if one wants to insert another > FileFinder.path_hook into sys.path_hooks, that will totally take over > importing for any directory, and previous path hooks will be ignored. > > This class provides its own sys.path_hooks hook as follows: If inserted > on sys.path_hooks (it should be inserted early so that it can supersede > anything else). Its find_spec method then calls each hook on > sys.path_hooks after itself and, for each hook that can handle the > given > sys.path entry, it calls the hook to create a finder, and calls that > finder's find_spec. So each sys.path_hooks entry is tried until a > spec is > found or all finders are exhausted. > """ > > def __init__(self, path): > if not os.path.isdir(path): > raise ImportError('only directories are supported', path=path) > > self.path = path > self._finder_cache = {} > > def __repr__(self): > return '{}({!r})'.format(self.__class__.__name__, self.path) > > def find_spec(self, fullname, target=None): > if not sys.path_hooks: > return None > > for hook in sys.path_hooks: > if hook is self.__class__: > continue > > finder = None > try: > if hook in self._finder_cache: > finder = self._finder_cache[hook] > if finder is None: > # We've tried this finder before and got an > ImportError > continue > except TypeError: > # The hook is unhashable > pass > > if finder is None: > try: > finder = hook(self.path) > except ImportError: > pass > > try: > self._finder_cache[hook] = finder > except TypeError: > # The hook is unhashable for some reason so we don't bother > # caching it > pass > > if finder is not None: > spec = finder.find_spec(fullname, target) > if spec is not None: > return spec > > # Module spec not found through any of the finders > return None > > def invalidate_caches(self): > for finder in self._finder_cache.values(): > finder.invalidate_caches() > > @classmethod > def install(cls): > sys.path_hooks.insert(0, cls) > sys.path_importer_cache.clear() > > ############################################################# > > This works, for example, like: > > >>> MetaFileFinder.install() > >>> sys.path_hooks.append(FileFinder.path_hook((SourceFileLoader, > ['.foo']))) > > And now, .foo modules are importable, without breaking support for the > built-in module types. > > This is still overkill though. I feel like there should instead be a > way to, say, extend a sys.path_hooks hook based on FileFinder so as to > be able to support loading other file types, without having to go > above the default sys.meta_path hooks. > > A small, but related problem I noticed in the way FileFinder.path_hook > is implemented, is that for almost *every directory* that gets cached > in sys.path_importer_cache, a new FileFinder instance is created with > its own self._loaders attribute, each containing a copy of the same > list of (loader, extensions) tuples. I calculated that on one large > project this alone accounted for nearly 1 MB. 
Not a big deal in the > grand scheme of things, but still a bit overkill. > > ISTM it would kill two birds with one stone if FileFinder were > changed, or there were a subclass thereof, that had a class attribute > containing the standard loader/extension mappings. This in turn could > simply be appended to in order to support new extension types. > > Thanks, > E > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at barrys-emacs.org Tue Feb 20 15:54:32 2018 From: barry at barrys-emacs.org (Barry) Date: Tue, 20 Feb 2018 20:54:32 +0000 Subject: [Python-ideas] Why CPython is still behind in performance for some widely used patterns ? In-Reply-To: References: <20180220071726.6c3330ca@fsol> Message-ID: <46C88D37-F929-46A2-9F5E-BE17A1823EAC@barrys-emacs.org> > On 20 Feb 2018, at 13:12, Nick Coghlan wrote: > >> On 20 February 2018 at 16:17, Antoine Pitrou wrote: >> On Mon, 19 Feb 2018 20:15:27 +0100 >> Stefan Behnel wrote: >>> Nick Coghlan schrieb am 02.02.2018 um 06:47: >>>> to make the various extension module authoring tools >>>> easier to discover, rather than having folks assuming that handcrafted >>>> calls directly into the CPython C API is their only option. >>> >>> Or even a competitive option. Tools like Cython or pybind11 go to great >>> length to shave off every bit of overhead from C-API calls, commonly >>> replacing high-level C-API functionality with macros and direct access to >>> data structures. The C/C++ code that they generate is so complex and tuned >>> that it would be infeasible to write and maintain something like that by >>> hand, but it can perfectly be generated, and it usually performs visibly >>> better than most hand-written modules, definitely much better than anything >>> a non-expert could write. >>> >>> Basically, by not learning the C-API you can benefit from all that highly >>> tuned and specialised code written by C-API experts that the documentation >>> doesn't even tell you about. >> >> Doesn't the documentation ever mention Cython? It probably should (no >> idea about pybind11, which I've never played with). Perhaps you can >> open an issue about that? > > We mention them in the Extending & Embedding guide, and link out to > the page on packaging.python.org that describes them in more detail: > https://docs.python.org/3/extending/index.html#recommended-third-party-tools Can you add PyCXX to the list please? Barry > > Cheers, > Nick. > > P.S. There are also a number of open issues at > https://github.com/pypa/python-packaging-user-guide/issues regarding > additional projects that should be mentioned in > https://packaging.python.org/guides/packaging-binary-extensions/ > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From george at fischhof.hu Tue Feb 20 17:11:10 2018 From: george at fischhof.hu (George Fischhof) Date: Tue, 20 Feb 2018 23:11:10 +0100 Subject: [Python-ideas] Please consider adding of functions file system operations to pathlib Message-ID: Good day all, as a continuation of thread "OS related file operations (copy, move, delete, rename...) 
should be placed into one module" https://mail.python.org/pipermail/python-ideas/2017-January/044217.html please consider making pathlib to a central file system module with putting file operations (copy, move, delete, rmtree etc) into pathlib. BR, George -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Feb 20 17:40:29 2018 From: guido at python.org (Guido van Rossum) Date: Tue, 20 Feb 2018 14:40:29 -0800 Subject: [Python-ideas] Why CPython is still behind in performance for some widely used patterns ? In-Reply-To: <46C88D37-F929-46A2-9F5E-BE17A1823EAC@barrys-emacs.org> References: <20180220071726.6c3330ca@fsol> <46C88D37-F929-46A2-9F5E-BE17A1823EAC@barrys-emacs.org> Message-ID: On Tue, Feb 20, 2018 at 12:54 PM, Barnstone Worthy wrote: I'm pretty sure that's an alias for Barry Warsaw. :-) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Feb 20 21:05:12 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 21 Feb 2018 12:05:12 +1000 Subject: [Python-ideas] Why CPython is still behind in performance for some widely used patterns ? In-Reply-To: References: <20180220071726.6c3330ca@fsol> <46C88D37-F929-46A2-9F5E-BE17A1823EAC@barrys-emacs.org> Message-ID: On 21 February 2018 at 08:40, Guido van Rossum wrote: > On Tue, Feb 20, 2018 at 12:54 PM, Barnstone Worthy > wrote: > > I'm pretty sure that's an alias for Barry Warsaw. :-) Different Barry :) I've expanded the existing issue at https://github.com/pypa/python-packaging-user-guide/issues/355 to note that there are more options we should at least mention in the binary extensions guide, even if we don't go into the same level of detail as we do for cffi/Cython/SWIG. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From sylvain.marie at schneider-electric.com Tue Feb 20 08:34:30 2018 From: sylvain.marie at schneider-electric.com (Sylvain MARIE) Date: Tue, 20 Feb 2018 13:34:30 +0000 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: <23179.47059.454599.372593@turnbull.sk.tsukuba.ac.jp> References: <20180212134502.GF26553@ando.pearwood.info> <23179.47059.454599.372593@turnbull.sk.tsukuba.ac.jp> Message-ID: https://bugs.python.org/issue32886 created. Don't hesitate to correct if anything is wrong in the text or associated tags Sylvain -----Message d'origine----- De?: Stephen J. Turnbull [mailto:turnbull.stephen.fw at u.tsukuba.ac.jp] Envoy??: mardi 20 f?vrier 2018 06:53 ??: guido at python.org Cc?: Sylvain MARIE ; Python-Ideas Objet?: Re: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module Guido van Rossum writes: > Hm, perhaps Integral is an adjective, just like Boolean? I would guess so. This is the same idiom we use when we call [1, 2, 3] a "truth-y". ______________________________________________________________________ This email has been scanned by the Symantec Email Security.cloud service. 
______________________________________________________________________ From greg.ewing at canterbury.ac.nz Tue Feb 20 18:43:31 2018 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 21 Feb 2018 12:43:31 +1300 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: References: <20180212134502.GF26553@ando.pearwood.info> Message-ID: <5A8CB2A3.4030500@canterbury.ac.nz> > On Mon, Feb 19, 2018 at 5:58 AM, Sylvain MARIE > > wrote: > > By the way, is there a reason for the name "Integral" (algebraic > theory) instead of "Integer" (computer science) ? Would it be > practically feasible to add "Integer" as an alias to "Integral" in > the numbers package ? Possibly inspired by Haskell, which has Integral (a type class, sort of like an abstract type) and also Integer and Int (which are concrete types). -- Greg From tjreedy at udel.edu Wed Feb 21 00:35:29 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 21 Feb 2018 00:35:29 -0500 Subject: [Python-ideas] Boolean ABC similar to what's provided in the 'numbers' module In-Reply-To: References: <23179.47059.454599.372593@turnbull.sk.tsukuba.ac.jp> Message-ID: On 2/20/2018 12:06 PM, Guido van Rossum wrote: > Looking at https://en.wikipedia.org/wiki/Number it seems that Integer is > "special" -- every other number type is listed as " numbers" > (e.g. rational numbers, complex numbers) but integers are listed as > "Integers". So let's just switch it to that, and keep Integral as an > alias for backwards compatibility. I don't think it's a huge problem to > fix this in 3.7b2, if someone wants to do the work. https://bugs.python.org/issue32891 -- Terry Jan Reedy From desmoulinmichel at gmail.com Wed Feb 21 02:11:04 2018 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Wed, 21 Feb 2018 08:11:04 +0100 Subject: [Python-ideas] Please consider adding of functions file system operations to pathlib In-Reply-To: References: Message-ID: <16dd3f57-cef6-b849-d186-97f2f66564c2@gmail.com> +1 We already merged os.path and glob with pathlib. Let's do all of os and shutil. It's weird enough for beginners to even stumble upon that many ways of doing things for the FS. On 20/02/2018 at 23:11, George Fischhof wrote: > Good day all, > > as a continuation of thread "OS related file operations (copy, move, > delete, rename...) should be placed into one module"? > > https://mail.python.org/pipermail/python-ideas/2017-January/044217.html > > please consider making pathlib to a central file system module with > putting file operations (copy, move, delete, rmtree etc) into pathlib. > > BR, > George > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From mistersheik at gmail.com Wed Feb 21 06:37:11 2018 From: mistersheik at gmail.com (Neil Girdhar) Date: Wed, 21 Feb 2018 03:37:11 -0800 (PST) Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> Message-ID: <6753a8af-35f4-4a7b-a54b-e6988ca2efcd@googlegroups.com> You should give an actual motivating example. I think none of these suggestions are more readable than just writing things out as a for loop. You argue that you want to avoid appending to a result list. In that case, I suggest writing your pattern as a generator function.
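As a rough sketch of what that looks like for the goods example upthread (Goods and GoodsCategory are just the illustrative names from that post, not real APIs):

    def iter_goods_dicts(goods_id_list):
        for goods_id in goods_id_list:
            goods = Goods.get_by_id(goods_id)
            gc = GoodsCategory.get_by_id(goods.category_id)
            yield {
                'id': goods.id,
                'name': goods.name,
                'category': gc.name,
                'category_type': gc.type,
            }

    result = list(iter_goods_dicts(goods_id_list))

Every temporary gets an ordinary name, it is easy to test, and no new comprehension syntax is needed.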
Best, Neil On Thursday, February 15, 2018 at 2:03:31 AM UTC-5, fhsxfhsx wrote: > > As far as I can see, a comprehension like > alist = [f(x) for x in range(10)] > is better than a for-loop > for x in range(10): > alist.append(f(x)) > because the previous one shows every element of the list explicitly so > that we don't need to handle `append` mentally. > > But when it comes to something like > [f(x) + g(f(x)) for x in range(10)] > you find you have to sacrifice some readableness if you don't want two > f(x) which might slow down your code. > > Someone may argue that one can write > [y + g(y) for y in [f(x) for x in range(10)]] > but it's not as clear as to show what `y` is in a subsequent clause, not > to say there'll be another temporary list built in the process. > We can even replace every comprehension with map and filter, but that > would face the same problems. > > In a word, what I'm arguing is that we need a way to assign temporary > variables in a comprehension. > In my opinion, code like > [y + g(y) for x in range(10) **some syntax for `y=f(x)` here**] > is more natural than any solution we now have. > And that's why I pro the new syntax, it's clear, explicit and readable, > and is nothing beyond the functionality of the present comprehensions so > it's not complicated. > > And I hope the discussion could focus more on whether we should allow > assigning temporary variables in comprehensions rather than how to solve > the specific example I mentioned above. > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at 2sn.net Thu Feb 22 17:18:36 2018 From: python at 2sn.net (Alexander Heger) Date: Fri, 23 Feb 2018 09:18:36 +1100 Subject: [Python-ideas] List assignment - extended slicing inconsistency Message-ID: ?What little documentation I could find, providing a stride on the assignment target for a list is supposed to trigger 'advanced slicing' causing element-wise replacement - and hence requiring that the source iterable has the appropriate number of elements. >>> a = [0,1,2,3] >>> a[::2] = [4,5] >>> a [4, 1, 5, 3] >>> a[::2] = [4,5,6] Traceback (most recent call last): File "", line 1, in ValueError: attempt to assign sequence of size 3 to extended slice of size 2 This is in contrast to regular slicing (*without* a stride), allowing to replace a *range* by another sequence of arbitrary length. >>> a = [0,1,2,3] >>> a[:3] = [4] >>> a [4, 3] Issue ===== When, however, a stride of `1` is specified, advanced slicing is not triggered. >>> a = [0,1,2,3] >>> a[:3:1] = [4] >>> a [4, 3] If advanced slicing had been triggered, there should have been a ValueError instead. Expected behaviour: >>> a = [0,1,2,3] >>> a[:3:1] = [4] Traceback (most recent call last): File "", line 1, in ValueError: attempt to assign sequence of size 1 to extended slice of size 3 I think that is an inconsistency in the language that should be fixed. Why do we need this? ==================== One may want this as extra check as well so that list does not change size. Depending on implementation, it may come with performance benefits as well. One could, though, argue that you still get the same result if you do all correctly >>> a = [0,1,2,3] >>> a[:3:1] = [4,5,6] >>> a [4, 5, 6, 3] But I disagree that there should be no error when it is wrong. *Strides that are not None should always trigger advanced slicing.* Other Data Types ================ This change should also be applied to bytearray, etc., though see below. 
Concerns ======== It may break some code that uses advanced slicing and expects regular slicing to occur? These cases should be rare, and the error message should be clear enough to allow fixes? I assume these cases should be exceptionally rare. If the implementation relies on `slice.indices(len(seq))[2] == 1` to determine about advance slicing or not, that would require some refactoring. If it is only `slice.stride in (1, None)` then this could easily replaced by checking against None. Will there be issues with syntax consistency with other data types, in particular outside the core library? - I always found that the dynamic behaviour of lists w/r non-advanced slicing to be somewhat peculiar in the first place, though, undeniably, it can be immensely useful. - Most external data types with fixed memory such as numpy do not have this dynamic flexibility, and the behavior of regular slicing on assignment is the same as regular slicing. The proposed change would increase consistency with these other data types. More surprises ============== >>> import array >>> a[1::2] = a[3:3] Traceback (most recent call last): File "", line 1, in ValueError: attempt to assign sequence of size 0 to extended slice of size 2 whereas >>> a = [1,2,3,4,5] >>> a[1::2] = a[3:3] Traceback (most recent call last): File "", line 1, in ValueError: attempt to assign sequence of size 0 to extended slice of size 2 >>> a = bytearray(b'12345') >>> a[1::2] = a[3:3] >>> a bytearray(b'135') but numpy >>> import numpy as np >>> a = np.array([1,2,3,4,5]) >>> a[1::2] = a[3:3] Traceback (most recent call last): File "", line 1, in ValueError: could not broadcast input array from shape (0) into shape (2) and >>> import numpy as np >>> a[1:2] = a[3:3] Traceback (most recent call last): File "", line 1, in ValueError: could not broadcast input array from shape (0) into shape (1) The latter two as expected. memoryview behaves the same. Issue 2 ======= Whereas NumPy is know to behave differently as a data type with fixed memory layout, and is not part of the standard library anyway, the difference in behaviour between lists and arrays I find disconcerting. This should be resolved to a consistent behaviour. Proposal 2 ========== Arrays and bytearrays should should adopt the same advanced slicing behaviour I suggest for lists. Concerns 2 ========== This has the potential for a lot more side effects in existing code, but as before in most cases error message should be triggered. Summary ======= I find it it not acceptable as a good language design that there is a large range of behaviour on slicing in assignment target for the different native (and standard library) data type of seemingly similar kind, and that users have to figure out for each data type by testing - or at the very least remember if documented - how it behaves on slicing in assignment targets. There should be a consistent behaviour at the very least, ideally even one with a clear user interface as suggested for lists. -Alexander -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Thu Feb 22 20:51:09 2018 From: guido at python.org (Guido van Rossum) Date: Thu, 22 Feb 2018 17:51:09 -0800 Subject: [Python-ideas] List assignment - extended slicing inconsistency In-Reply-To: References: Message-ID: On Thu, Feb 22, 2018 at 2:18 PM, Alexander Heger wrote: > ?What little documentation I could find, providing a stride on the > assignment target for a list is supposed to trigger 'advanced slicing' > causing element-wise replacement - and hence requiring that the source > iterable has the appropriate number of elements. > > >>> a = [0,1,2,3] > >>> a[::2] = [4,5] > >>> a > [4, 1, 5, 3] > >>> a[::2] = [4,5,6] > Traceback (most recent call last): > File "", line 1, in > ValueError: attempt to assign sequence of size 3 to extended slice of size > 2 > > This is in contrast to regular slicing (*without* a stride), allowing to > replace a *range* by another sequence of arbitrary length. > > >>> a = [0,1,2,3] > >>> a[:3] = [4] > >>> a > [4, 3] > > Issue > ===== > When, however, a stride of `1` is specified, advanced slicing is not > triggered. > > >>> a = [0,1,2,3] > >>> a[:3:1] = [4] > >>> a > [4, 3] > > If advanced slicing had been triggered, there should have been a > ValueError instead. > > Expected behaviour: > > >>> a = [0,1,2,3] > >>> a[:3:1] = [4] > Traceback (most recent call last): > File "", line 1, in > ValueError: attempt to assign sequence of size 1 to extended slice of size > 3 > > I think that is an inconsistency in the language that should be fixed. > > Why do we need this? > ==================== > One may want this as extra check as well so that list does not change > size. Depending on implementation, it may come with performance benefits > as well. > > One could, though, argue that you still get the same result if you do all > correctly > > >>> a = [0,1,2,3] > >>> a[:3:1] = [4,5,6] > >>> a > [4, 5, 6, 3] > > But I disagree that there should be no error when it is wrong. > *Strides that are not None should always trigger advanced slicing.* > This makes sense. (I wonder if the discrepancy is due to some internal interface that loses the distinction between None and 1 before the decision is made whether to use advanced slicing or not. But that's a possible explanation, not an excuse.) > Other Data Types > ================ > This change should also be applied to bytearray, etc., though see below. > Sure. > Concerns > ======== > It may break some code that uses advanced slicing and expects regular > slicing to occur? These cases should be rare, and the error message should > be clear enough to allow fixes? I assume these cases should be > exceptionally rare. > Yeah, backwards compatibility sometimes prevents fixing a design bug. I don't know if that's the case here, we'll need reports from real-world code. > If the implementation relies on `slice.indices(len(seq))[2] == 1` to > determine about advance slicing or not, that would require some > refactoring. If it is only `slice.stride in (1, None)` then this could > easily replaced by checking against None. > > Will there be issues with syntax consistency with other data types, in > particular outside the core library? > Things outside the stdlib are responsible for their own behavior. Usually they can move faster and with less worry about breaking backward compatibility. > > - I always found that the dynamic behaviour of lists w/r non-advanced > slicing to be somewhat peculiar in the first place, though, undeniably, it > can be immensely useful. 
> If you're talking about the ability to resize a list by assigning to a slice, that's as intended. It predates advanced slicing by a decade or more. > - Most external data types with fixed memory such as numpy do not have > this dynamic flexibility, and the behavior of regular slicing on assignment > is the same as regular slicing. The proposed change would increase > consistency with these other data types. > How? Resizing through slice assignment will stay for builtin types -- if numpy doesn't support that, so be it. > More surprises > ============== > >>> import array > >>> a[1::2] = a[3:3] > Traceback (most recent call last): > File "", line 1, in > ValueError: attempt to assign sequence of size 0 to extended slice of size > 2 > > whereas > > >>> a = [1,2,3,4,5] > >>> a[1::2] = a[3:3] > Traceback (most recent call last): > File "", line 1, in > ValueError: attempt to assign sequence of size 0 to extended slice of size > 2 > OK, so array doesn't use the same rules. That should be fixed too probably (assuming whatever is valid today remains valid). > >>> a = bytearray(b'12345') > >>> a[1::2] = a[3:3] > >>> a > bytearray(b'135') > Bytearray should also follow the same rules. > but numpy > > >>> import numpy as np > >>> a = np.array([1,2,3,4,5]) > >>> a[1::2] = a[3:3] > Traceback (most recent call last): > File "", line 1, in > ValueError: could not broadcast input array from shape (0) into shape (2) > > and > > >>> import numpy as np > >>> a[1:2] = a[3:3] > Traceback (most recent call last): > File "", line 1, in > ValueError: could not broadcast input array from shape (0) into shape (1) > > The latter two as expected. memoryview behaves the same. > Let's leave numpy out of this discussion. And memoryview is a special case because it can't change size (it provides a view into an inflexible structure). > Issue 2 > ======= > Whereas NumPy is know to behave differently as a data type with fixed > memory layout, and is not part of the standard library anyway, the > difference in behaviour between lists and arrays I find disconcerting. > This should be resolved to a consistent behaviour. > > Proposal 2 > ========== > Arrays and bytearrays should should adopt the same advanced slicing > behaviour I suggest for lists. > Sure. > Concerns 2 > ========== > This has the potential for a lot more side effects in existing code, but > as before in most cases error message should be triggered. > Side effects? No code that currently doesn't raise will break, right? > Summary > ======= > I find it it not acceptable as a good language design that there is a > large range of behaviour on slicing in assignment target for the different > native (and standard library) data type of seemingly similar kind, and that > users have to figure out for each data type by testing - or at the very > least remember if documented - how it behaves on slicing in assignment > targets. There should be a consistent behaviour at the very least, ideally > even one with a clear user interface as suggested for lists. > Fortunately, what *you* find acceptable or good language design is not all that important. (You can avoid making any mistakes in your own language. :-) You may by now realize that 100% consistent behavior is hard to obtain. However we'll gladly consider your feedback. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
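(A quick illustration of the point Guido wonders about here: once a slice is
normalised with slice.indices(), the None/1 distinction is gone, even though
it is still visible on the slice object itself. The variable names below are
just for illustration.)

>>> implicit = slice(None, 3, None)    # what a[:3] passes to __setitem__
>>> explicit = slice(None, 3, 1)       # what a[:3:1] passes to __setitem__
>>> implicit.indices(4)
(0, 3, 1)
>>> explicit.indices(4)
(0, 3, 1)
>>> implicit.step is None, explicit.step
(True, 1)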
URL: 

From ncoghlan at gmail.com Thu Feb 22 21:21:17 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 23 Feb 2018 12:21:17 +1000
Subject: [Python-ideas] List assignment - extended slicing inconsistency
In-Reply-To: References: Message-ID: 

On 23 February 2018 at 11:51, Guido van Rossum wrote:
> On Thu, Feb 22, 2018 at 2:18 PM, Alexander Heger wrote:
>> But I disagree that there should be no error when it is wrong.
>> *Strides that are not None should always trigger advanced slicing.*
>
> This makes sense.
>
> (I wonder if the discrepancy is due to some internal interface that loses
> the distinction between None and 1 before the decision is made whether to
> use advanced slicing or not. But that's a possible explanation, not an
> excuse.)

That explanation seems pretty likely to me, as for the data types
implemented in C, we tend to switch to the Py_ssize_t form of slices
pretty early, and that can't represent the None/1 distinction.

Even for Python level collections, you lose the distinction as soon as
you call slice.indices (as that promises to return 3-tuple of integers).

>> Concerns
>> ========
>> It may break some code that uses advanced slicing and expects regular
>> slicing to occur? These cases should be rare, and the error message should
>> be clear enough to allow fixes? I assume these cases should be
>> exceptionally rare.
>
> Yeah, backwards compatibility sometimes prevents fixing a design bug. I
> don't know if that's the case here, we'll need reports from real-world code.

In this case, we should be able to start with a DeprecationWarning in 3.8,
since we already have the checks in place to raise ValueError when the step
is 2 or more - any patch would just need to make sure those checks either
have access to the original slice object (so they can check the raw step
value), or else an internal flag indicating whether or not an explicit step
was provided.

So the next step would be to file an issue pointing back to this thread for
acknowledgement that this is a design bug to be handled with a
DeprecationWarning in 3.8, and a ValueError in 3.9+.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From klahnakoski at mozilla.com Fri Feb 23 11:14:39 2018
From: klahnakoski at mozilla.com (Kyle Lahnakoski)
Date: Fri, 23 Feb 2018 11:14:39 -0500
Subject: [Python-ideas] Temporary variables in comprehensions
In-Reply-To: <21d694a9.b30.161a493d83e.Coremail.fhsxfhsx@126.com>
References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> <21d694a9.b30.161a493d83e.Coremail.fhsxfhsx@126.com>
Message-ID: <628eb18c-cab0-c3a8-a39a-641b4d294fc7@mozilla.com>

On 2018-02-17 11:23, fhsxfhsx wrote:
> [
>     {
>         'id': goods.id,
>         'name': goods.name,
>         'category': gc.name,
>         'category_type': gc.type,
>     }
>     for goods_id in goods_id_list
>     for goods is Goods.get_by_id(goods_id)
>     for gc is GoodsCategory.get_by_id(goods.category_id)
> ]

in the short term, it seems for...in [...] is good enough:

> [
>     {
>         'id': goods.id,
>         'name': goods.name,
>         'category': gc.name,
>         'category_type': gc.type,
>     }
>     for goods_id in goods_id_list
>     for goods in [Goods.get_by_id(goods_id)]
>     for gc in [GoodsCategory.get_by_id(goods.category_id)]
> ]

I personally would like to see with...as... syntax allowed in list
comprehensions, despite `with` being limited to context managers to date.

> [
>     {
>         'id': goods.id,
>         'name': goods.name,
>         'category': gc.name,
>         'category_type': gc.type,
>     }
>     for goods_id in goods_id_list
>     with Goods.get_by_id(goods_id) as goods
>     with GoodsCategory.get_by_id(goods.category_id) as gc
> ]

...or maybe `let` reads easier?

> [
>     {
>         'id': goods.id,
>         'name': goods.name,
>         'category': gc.name,
>         'category_type': gc.type,
>     }
>     for goods_id in goods_id_list
>     let goods = Goods.get_by_id(goods_id)
>     let gc = GoodsCategory.get_by_id(goods.category_id)
> ]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From klahnakoski at mozilla.com Fri Feb 23 12:34:40 2018
From: klahnakoski at mozilla.com (Kyle Lahnakoski)
Date: Fri, 23 Feb 2018 12:34:40 -0500
Subject: [Python-ideas] Temporary variables in comprehensions
In-Reply-To: <20180216005739.GA10142@ando.pearwood.info>
References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> <20180216005739.GA10142@ando.pearwood.info>
Message-ID: <18779af5-aa44-cf1a-336a-d6d2236d07c1@mozilla.com>

I believe list comprehensions are difficult to read because they are not
formatted properly. For me, a list comprehension is an expression, followed
by clauses executed in order. Any list comprehension with more than one
clause should be written one line per clause.

Examples inline:

On 2018-02-15 19:57, Steven D'Aprano wrote:
> Where should the assignment go? [(y, y**2) let y = x+1 for x in (1, 2,
> 3, 4)] [(y, y**2) for x in (1, 2, 3, 4) let y = x+1]

Since y is a function of x, it must follow the for clause:

> [
>     (y, y ** 2)
>     for x in (1, 2, 3, 4)
>     let y = x + 1
> ]

> How do they interact when you have multiple loops and if-clauses?
[(w, > > w**2) for x in (1, 2, 3, 4) let y = x+1 for a in range(y) let z = a+1 > > if z > 2 for b in range(z) let w = z+1] > > They are applied in order: > > > [ > > (w, w**2) > > for x in (1, 2, 3, 4) > > let y = x+1 > > for a in range(y) > > let z = a+1 > > if z > 2 > > for b in range(z) > > let w = z+1 > > ] > > which is a short form for: > > > def stuff(): > > for x in (1, 2, 3, 4): > > y = x+1 > > for a in range(y): > > z = a+1 > > if z > 2: > > for b in range(z): > > w = z+1 > > yield (w, w**2) > > > > list(stuff()) > Is it that much shorter that it's worth giving up the benefit of indentation? > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/KwZtO4rpAGE/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Feb 23 13:24:55 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 23 Feb 2018 10:24:55 -0800 Subject: [Python-ideas] List assignment - extended slicing inconsistency In-Reply-To: References: Message-ID: On Thu, Feb 22, 2018 at 6:21 PM, Nick Coghlan wrote: > > (I wonder if the discrepancy is due to some internal interface that loses > > the distinction between None and 1 before the decision is made whether to > > use advanced slicing or not. But that's a possible explanation, not an > > excuse.) > > That explanation seems pretty likely to me, as for the data types > implemented in C, we tend to switch to the Py_ssize_t form of slices > pretty early, and that can't represent the None/1 distinction. > > Even for Python level collections, you lose the distinction as soon as > you call slice.indices (as that promises to return 3-tuple of > integers). If this is the case -- backward compatibility issues aside, wouldn't it be very hard to fix? Which means that should be investigated before going to far down the "how much code might this break" route. And certainly before adding a Deprecation Warning. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From klahnakoski at mozilla.com Fri Feb 23 13:39:06 2018 From: klahnakoski at mozilla.com (Kyle Lahnakoski) Date: Fri, 23 Feb 2018 13:39:06 -0500 Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> <20180216005739.GA10142@ando.pearwood.info> <18779af5-aa44-cf1a-336a-d6d2236d07c1@mozilla.com> Message-ID: On 2018-02-23 12:44, Neil Girdhar wrote: > > On Fri, Feb 23, 2018 at 12:35 PM Kyle Lahnakoski > > wrote: > > > > [ > >???? (w, w**2) > >???? for x in (1, 2, 3, 4) > >???? let y = x+1 > >???? for a in range(y) > >???? let z = a+1 > >???? if z > 2 > >???? for b in range(z) > >???? let w = z+1 > > ] > > which is a short form for: > > > def stuff(): > >???? 
for x in (1, 2, 3, 4): > >???????? y = x+1 > >???????? for a in range(y): > >???????????? z = a+1 > >???????????? if z > 2: > >???????????????? for b in range(z): > >???????????????????? w = z+1 > >???????????????????? yield (w, w**2) > > > > list(stuff()) > > > Is it that much shorter that it's worth giving up the benefit of > indentation?? > > Saving the indentation? Oh yes, for sure!? This code reads like a story, the indentation is superfluous to that story.? Should we add it to Python? I don't know; I quick scan through my own code, and I do not see much opportunity for list comprehensions of this complexity.? Either my data structures are not that complicated, or I have try/except blocks inside a loop, or I am using a real query language (like SQL).? pythonql seems to solve all these problems well enough (https://github.com/pythonql/pythonql). -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Fri Feb 23 13:45:56 2018 From: mistersheik at gmail.com (Neil Girdhar) Date: Fri, 23 Feb 2018 18:45:56 +0000 Subject: [Python-ideas] Temporary variables in comprehensions In-Reply-To: References: <61d0679e.79f.1619809580a.Coremail.fhsxfhsx@126.com> <20180216005739.GA10142@ando.pearwood.info> <18779af5-aa44-cf1a-336a-d6d2236d07c1@mozilla.com> Message-ID: On Fri, Feb 23, 2018 at 1:42 PM Kyle Lahnakoski wrote: > > > On 2018-02-23 12:44, Neil Girdhar wrote: > > On Fri, Feb 23, 2018 at 12:35 PM Kyle Lahnakoski > wrote: > >> >> > [ >> > (w, w**2) >> > for x in (1, 2, 3, 4) >> > let y = x+1 >> > for a in range(y) >> > let z = a+1 >> > if z > 2 >> > for b in range(z) >> > let w = z+1 >> > ] >> >> which is a short form for: >> >> > def stuff(): >> > for x in (1, 2, 3, 4): >> > y = x+1 >> > for a in range(y): >> > z = a+1 >> > if z > 2: >> > for b in range(z): >> > w = z+1 >> > yield (w, w**2) >> > >> > list(stuff()) >> > > Is it that much shorter that it's worth giving up the benefit of > indentation? > >> >> > Saving the indentation? Oh yes, for sure! This code reads like a story, > the indentation is superfluous to that story. Should we add it to Python? > I don't know; I quick scan through my own code, and I do not see much > opportunity for list comprehensions of this complexity. > That's a good thing that you are not writing code like this. I don't agree that the indentation is "superfluous". It makes the code easy to read. Anyway, Google's Python style guide also agrees that very long comprehensions like that are not worth it. ( https://google.github.io/styleguide/pyguide.html?showone=List_Comprehensions#List_Comprehensions ) Either my data structures are not that complicated, or I have try/except > blocks inside a loop, or I am using a real query language (like SQL). > pythonql seems to solve all these problems well enough ( > https://github.com/pythonql/pythonql). > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/KwZtO4rpAGE/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. 
> _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/KwZtO4rpAGE/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Fri Feb 23 13:50:15 2018 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 24 Feb 2018 05:50:15 +1100 Subject: [Python-ideas] List assignment - extended slicing inconsistency In-Reply-To: References: Message-ID: On Sat, Feb 24, 2018 at 5:24 AM, Chris Barker wrote: > On Thu, Feb 22, 2018 at 6:21 PM, Nick Coghlan wrote: >> >> > (I wonder if the discrepancy is due to some internal interface that >> > loses >> > the distinction between None and 1 before the decision is made whether >> > to >> > use advanced slicing or not. But that's a possible explanation, not an >> > excuse.) >> >> That explanation seems pretty likely to me, as for the data types >> implemented in C, we tend to switch to the Py_ssize_t form of slices >> pretty early, and that can't represent the None/1 distinction. >> >> Even for Python level collections, you lose the distinction as soon as >> you call slice.indices (as that promises to return 3-tuple of >> integers). > > > If this is the case -- backward compatibility issues aside, wouldn't it be > very hard to fix? > > Which means that should be investigated before going to far down the "how > much code might this break" route. > > And certainly before adding a Deprecation Warning. Ignoring backward compatibility, it ought to be possible to (ab)use a stride of zero for this. Calling slice.indices() on something with a stride of zero raises ValueError, so there's no ambiguity. But it would break code that iterates in a simple and obvious way, and (ugh ugh) break it in a very nasty way: an infinite loop. I'm not happy with that kind of breakage, even with multiple versions posting a warning. In the C API, there's PySlice_GetIndices "[y]ou probably do not want to use this function" and PySlice_GetIndicesEx, the "[u]sable replacement". Much as I dislike adding *yet another* function to do basically the same job, I think that might be the less-bad way to do this. ChrisA From storchaka at gmail.com Fri Feb 23 14:38:46 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 23 Feb 2018 21:38:46 +0200 Subject: [Python-ideas] List assignment - extended slicing inconsistency In-Reply-To: References: Message-ID: 23.02.18 20:50, Chris Angelico ????: > Ignoring backward compatibility, it ought to be possible to (ab)use a > stride of zero for this. Calling slice.indices() on something with a > stride of zero raises ValueError, so there's no ambiguity. But it > would break code that iterates in a simple and obvious way, and (ugh > ugh) break it in a very nasty way: an infinite loop. I'm not happy > with that kind of breakage, even with multiple versions posting a > warning. > > In the C API, there's PySlice_GetIndices "[y]ou probably do not want > to use this function" and PySlice_GetIndicesEx, the "[u]sable > replacement". 
Much as I dislike adding *yet another* function to do > basically the same job, I think that might be the less-bad way to do > this. Actually PySlice_GetIndicesEx is deprecated too. It is not safe for resizeable sequences since it is vulnerable to race condition. The pair of PySlice_Unpack() and PySlice_AdjustIndices() replaces it in new code. So now we have 4 functions for doing the same thing in C, 2 of them are deprecated. Do you want to deprecate the other two and add new replacements for them? From rosuav at gmail.com Fri Feb 23 15:00:40 2018 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 24 Feb 2018 07:00:40 +1100 Subject: [Python-ideas] List assignment - extended slicing inconsistency In-Reply-To: References: Message-ID: On Sat, Feb 24, 2018 at 6:38 AM, Serhiy Storchaka wrote: > 23.02.18 20:50, Chris Angelico ????: >> >> Ignoring backward compatibility, it ought to be possible to (ab)use a >> stride of zero for this. Calling slice.indices() on something with a >> stride of zero raises ValueError, so there's no ambiguity. But it >> would break code that iterates in a simple and obvious way, and (ugh >> ugh) break it in a very nasty way: an infinite loop. I'm not happy >> with that kind of breakage, even with multiple versions posting a >> warning. >> >> In the C API, there's PySlice_GetIndices "[y]ou probably do not want >> to use this function" and PySlice_GetIndicesEx, the "[u]sable >> replacement". Much as I dislike adding *yet another* function to do >> basically the same job, I think that might be the less-bad way to do >> this. > > > Actually PySlice_GetIndicesEx is deprecated too. It is not safe for > resizeable sequences since it is vulnerable to race condition. The pair of > PySlice_Unpack() and PySlice_AdjustIndices() replaces it in new code. > > So now we have 4 functions for doing the same thing in C, 2 of them are > deprecated. Do you want to deprecate the other two and add new replacements > for them? > Wow. Who'd have thought slice indexing was so hard... (If you look at the Python 3.6 docos, that deprecation isn't mentioned. Should it be?) I presume it's already too late for 3.7 to change anything to fix this. ChrisA From ncoghlan at gmail.com Sat Feb 24 01:03:08 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 24 Feb 2018 16:03:08 +1000 Subject: [Python-ideas] List assignment - extended slicing inconsistency In-Reply-To: References: Message-ID: On 24 February 2018 at 06:00, Chris Angelico wrote: > I presume it's already too late for 3.7 to change anything to fix this. > Yeah, any changes in relation to this would be 3.8+ only. To answer your previous question about "Wouldn't it be hard to fix this given the way slice processing works?", whether or not it's tricky depends more on the internal code structure of any given type implementation than it does the API that [C]Python exposes for converting a slice definition to a set of indices given a particular sequence length. For lists, for example, the code handling that dispatch is in the "list assign subscript" function, under a "PySlice_Check(item)" branch: https://github.com/python/cpython/blob/master/Objects/listobject.c#L2775 There's an early return there for the "step == 1" case, but at the point where we run that check, we still have access to "item", so that early return can be modified to instead check "((PySliceObject *) item->step == Py_None)". 
During the deprecation warning period, we'd then *also* add a second delegation point down where the exception normally gets raised, such that when "step == 1" we emit the deprecation warning and then call the same function as the existing early return does. In 3.9+, we'd delete that additional fallback code. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Tue Feb 27 17:27:59 2018 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 28 Feb 2018 09:27:59 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings Message-ID: This is a suggestion that comes up periodically here or on python-dev. This proposal introduces a way to bind a temporary name to the value of an expression, which can then be used elsewhere in the current statement. The nicely-rendered version will be visible here shortly: https://www.python.org/dev/peps/pep-0572/ ChrisA PEP: 572 Title: Syntax for Statement-Local Name Bindings Author: Chris Angelico Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 28-Feb-2018 Python-Version: 3.8 Post-History: 28-Feb-2018 Abstract ======== Programming is all about reusing code rather than duplicating it. When an expression needs to be used twice in quick succession but never again, it is convenient to assign it to a temporary name with very small scope. By permitting name bindings to exist within a single statement only, we make this both convenient and safe against collisions. Rationale ========= When an expression is used multiple times in a list comprehension, there are currently several suboptimal ways to spell this, and no truly good ways. A statement-local name allows any expression to be temporarily captured and then used multiple times. Syntax and semantics ==================== In any context where arbitrary Python expressions can be used, a named expression can appear. This must be parenthesized for clarity, and is of the form `(expr as NAME)` where `expr` is any valid Python expression, and `NAME` is a simple name. The value of such a named expression is the same as the incorporated expression, with the additional side-effect that NAME is bound to that value for the remainder of the current statement. Just as function-local names shadow global names for the scope of the function, statement-local names shadow other names for that statement. They can also shadow each other, though actually doing this should be strongly discouraged in style guides. Example usage ============= These list comprehensions are all approximately equivalent:: # Calling the function twice stuff = [[f(x), f(x)] for x in range(5)] # Helper function def pair(value): return [value, value] stuff = [pair(f(x)) for x in range(5)] # Inline helper function stuff = [(lambda v: [v,v])(f(x)) for x in range(5)] # Extra 'for' loop - see also Serhiy's optimization stuff = [[y, y] for x in range(5) for y in [f(x)]] # Expanding the comprehension into a loop stuff = [] for x in range(5): y = f(x) stuff.append([y, y]) # Using a statement-local name stuff = [[(f(x) as y), y] for x in range(5)] If calling `f(x)` is expensive or has side effects, the clean operation of the list comprehension gets muddled. Using a short-duration name binding retains the simplicity; while the extra `for` loop does achieve this, it does so at the cost of dividing the expression visually, putting the named part at the end of the comprehension instead of the beginning. 
Statement-local name bindings can be used in any context, but should be avoided where regular assignment can be used, just as `lambda` should be avoided when `def` is an option. Open questions ============== 1. What happens if the name has already been used? `(x, (1 as x), x)` Currently, prior usage functions as if the named expression did not exist (following the usual lookup rules); the new name binding will shadow the other name from the point where it is evaluated until the end of the statement. Is this acceptable? Should it raise a syntax error or warning? 2. The current implementation [1] implements statement-local names using a special (and mostly-invisible) name mangling. This works perfectly inside functions (including list comprehensions), but not at top level. Is this a serious limitation? Is it confusing? 3. The interaction with locals() is currently[1] slightly buggy. Should statement-local names appear in locals() while they are active (and shadow any other names from the same function), or should they simply not appear? 4. Syntactic confusion in `except` statements. While technically unambiguous, it is potentially confusing to humans. In Python 3.7, parenthesizing `except (Exception as e):` is illegal, and there is no reason to capture the exception type (as opposed to the exception instance, as is done by the regular syntax). Should this be made outright illegal, to prevent confusion? Can it be left to linters? 5. Similar confusion in `with` statements, with the difference that there is good reason to capture the result of an expression, and it is also very common for `__enter__` methods to return `self`. In many cases, `with expr as name:` will do the same thing as `with (expr as name):`, adding to the confusion. References ========== .. [1] Proof of concept / reference implementation (https://github.com/Rosuav/cpython/tree/statement-local-variables) Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: From rob.cliffe at btinternet.com Tue Feb 27 22:47:29 2018 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Wed, 28 Feb 2018 03:47:29 +0000 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: <54c90d32-ef5f-229d-28d8-2e3433eadf0b@btinternet.com> I hope nobody will mind too much if I throw in my (relatively uninformed) 2c before some of the big guns respond. First: Well done, Chris, for all the work on this.? IMHO this could be a useful Python enhancement (and reduce the newsgroup churn :-)). On 27/02/2018 22:27, Chris Angelico wrote: > This is a suggestion that comes up periodically here or on python-dev. > This proposal introduces a way to bind a temporary name to the value > of an expression, which can then be used elsewhere in the current > statement. > > The nicely-rendered version will be visible here shortly: > > https://www.python.org/dev/peps/pep-0572/ > > ChrisA > > PEP: 572 > Title: Syntax for Statement-Local Name Bindings > Author: Chris Angelico > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 28-Feb-2018 > Python-Version: 3.8 > Post-History: 28-Feb-2018 > > > Abstract > ======== > > Programming is all about reusing code rather than duplicating it. When > an expression needs to be used twice in quick succession but never again, > it is convenient to assign it to a temporary name with very small scope. 
> By permitting name bindings to exist within a single statement only, we > make this both convenient and safe against collisions. It may be pedantic of me (and it *will* produce a more pedantic-sounding sentence) but I honestly think that "safe against name collisions" is clearer than "safe against collisions", and that clarity matters. > > > Rationale > ========= > > When an expression is used multiple times in a list comprehension, there > are currently several suboptimal ways to spell this, and no truly good > ways. A statement-local name allows any expression to be temporarily > captured and then used multiple times. IMHO the first sentence is a bit of an overstatement (though of course it's a big part of the PEP's "sell"). How about "... there are currently several ways to spell this, none of them ideal." Also, given that a list comprehension is an expression, which in turn could be part of a larger expression, would it be appropriate to replace "expression" by "sub-expression" in the 2 places where it occurs in the above paragraph? > > > Syntax and semantics > ==================== > > In any context where arbitrary Python expressions can be used, a named > expression can appear. This must be parenthesized for clarity, I agree, pro tem (not that I am claiming that my opinion counts for much).? I'm personally somewhat allergic to making parentheses mandatory where they really don't need to be, but trying to think about where they could be unambiguously omitted makes my head spin. At least, if we run with this for now, then making them non-mandatory in some contexts, at some future time, won't lead to backwards incompatibility. > and is of > the form `(expr as NAME)` where `expr` is any valid Python expression, > and `NAME` is a simple name. > > The value of such a named expression is the same as the incorporated > expression, with the additional side-effect that NAME is bound to that > value for the remainder of the current statement. > > Just as function-local names shadow global names for the scope of the > function, statement-local names shadow other names for that statement. > They can also shadow each other, though actually doing this should be > strongly discouraged in style guides. > > > Example usage > ============= > > These list comprehensions are all approximately equivalent:: > > # Calling the function twice ??? ??? ??? # Calling the function twice (assuming that side effects can be ignored) > stuff = [[f(x), f(x)] for x in range(5)] > > # Helper function ??? ??? ??? # External helper function > def pair(value): return [value, value] > stuff = [pair(f(x)) for x in range(5)] > > # Inline helper function > stuff = [(lambda v: [v,v])(f(x)) for x in range(5)] > > # Extra 'for' loop - see also Serhiy's optimization > stuff = [[y, y] for x in range(5) for y in [f(x)]] > > # Expanding the comprehension into a loop > stuff = [] > for x in range(5): > y = f(x) > stuff.append([y, y]) Please feel free to ignore this, but (trying to improve on the above example): ??? ??? ??? # Using a generator: ??? ??? ??? def gen(): ??? ?? ??? ? ?? for x in range(5): ??? ?? ??? ? ?????? y = f(x) ??? ?? ??? ? ?????? yield y,y ??? ??? ??? stuff = list(gen()) > > # Using a statement-local name > stuff = [[(f(x) as y), y] for x in range(5)] > > If calling `f(x)` is expensive or has side effects, the clean operation of > the list comprehension gets muddled. 
Using a short-duration name binding > retains the simplicity; while the extra `for` loop does achieve this, it > does so at the cost of dividing the expression visually, putting the named > part at the end of the comprehension instead of the beginning. Maybe add to last sentence "and of adding (at least conceptually) extra steps: building a 1-element list, then extracting the first element" > > Statement-local name bindings can be used in any context, but should be > avoided where regular assignment can be used, just as `lambda` should be > avoided when `def` is an option. > > > Open questions > ============== > > 1. What happens if the name has already been used? `(x, (1 as x), x)` > Currently, prior usage functions as if the named expression did not > exist (following the usual lookup rules); the new name binding will > shadow the other name from the point where it is evaluated until the > end of the statement. Is this acceptable? Should it raise a syntax > error or warning? IMHO this is not only acceptable, but the (only) correct behaviour. Your crystal-clear statement "the new name binding will shadow the other name from the point where it is evaluated until the end of the statement " is *critical* and IMO what should happen. Perhaps an extra example or two, to clarify that *execution order* is what matters, might help, e.g. ??? y if (f() as y) > 0 else None will work as expected, because "(f() as y)" is evaluated before the initial "y" is (if it is). [Parenthetical comment: Advanced use of this new feature would require knowledge of Python's evaluation order.? But this is not an argument against the PEP, because the same could be said about almost any feature of Python, e.g. ??? [ f(x), f(x) ] where evaluating f(x) has side effects.] 2. The current implementation [1] implements statement-local names using > a special (and mostly-invisible) name mangling. This works perfectly > inside functions (including list comprehensions), but not at top > level. Is this a serious limitation? Is it confusing? It's great that it works perfectly in functions and list comprehensions, but it sounds as if, at top level, in rare circumstances it could produce a hard-to-track-down bug, which is not exactly desirable.? It's hard to say more without knowing more details.? As a stab in the dark, is it possible to avoid it by including the module name in the mangling?? Sorry if I'm talking rubbish. > > 3. The interaction with locals() is currently[1] slightly buggy. Should > statement-local names appear in locals() while they are active (and > shadow any other names from the same function), or should they simply > not appear? IMHO this is an implementation detail.? IMO you should have some idea what you're doing when you use locals().? But I think consistency matters - either the temporary variable *always* gets into locals() "from the point where it is evaluated until the end of the statement", or it *never* gets into locals().? (Possibly the language spec should specify one or the other - I'm not sure, time may tell.) > > 4. Syntactic confusion in `except` statements. While technically > unambiguous, it is potentially confusing to humans. In Python 3.7, > parenthesizing `except (Exception as e):` is illegal, and there is no > reason to capture the exception type (as opposed to the exception > instance, as is done by the regular syntax). Should this be made > outright illegal, to prevent confusion? Can it be left to linters? > > 5. 
Similar confusion in `with` statements, with the difference that there > is good reason to capture the result of an expression, and it is also > very common for `__enter__` methods to return `self`. In many cases, > `with expr as name:` will do the same thing as `with (expr as name):`, > adding to the confusion. This (4. and 5.) shows that we are using "as" in more than one sense, and in a perfect world we would use different keywords.? But IMHO (admittedly, without having thought about it much) this isn't much of a problem.? Again, perhaps some clarifying examples would help. Some pedantry: ??? One issue not so far explicitly mentioned: IMHO it should be perfectly legal to assign a value to a temporary variable, and then not use that temporary variable (just as it is legal to assign to a variable in a regular assignment statement, and then not use that variable) though linters should IMO point it out.? E.g. you might want to modify (perhaps only temporarily) ??? ??? a = [ (f() as b), b ] to ??? ??? a = [ (f() as b), c ] Also (and I'm relying on "In any context where arbitrary Python expressions can be used, a named expression can appear." ), linters should also IMO point to ??? a = (42 as b) which AFAICT is a laborious synonym for ??? a = 42 And here's a thought: What are the semantics of ??? a = (42 as a) # Of course a linter should point this out too At first I thought this was also a laborious synonym for "a=42". But then I re-read your statement (the one I described above as crystal-clear) and realised that its exact wording was even more critical than I had thought: ??? "the new name binding will shadow the other name from the point where it is evaluated until the end of the statement" Note: "until the end of the *statement*".? NOT "until the end of the *expression*".? The distinction matters. If we take this as gospel, all this will do is create a temporary variable "a", assign the value 42 to it twice, then discard it. I.e. it effectively does nothing, slowly. Have I understood correctly?? Very likely you have considered this and mean exactly what you say, but I am sure you will understand that I mean no offence by querying it. Best wishes Rob Cliffe -------------- next part -------------- An HTML attachment was scrubbed... URL: From rob.cliffe at btinternet.com Tue Feb 27 23:38:56 2018 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Wed, 28 Feb 2018 04:38:56 +0000 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com> On 27/02/2018 22:27, Chris Angelico wrote: > This is a suggestion that comes up periodically here or on python-dev. > This proposal introduces a way to bind a temporary name to the value > of an expression, which can then be used elsewhere in the current > statement. > > Hm, apologies.? This is in complete contrast to my previous post, where I was pretty enthusiastic about Chris's PEP.? But I can't resist sharing these thoughts ... There was some vague uneasiness at the back of my mind, which I think I have finally pinned down.? Consider Chris's example: ???? ??? # Using a statement-local name ??? ??? stuff = [[(f(x) as y), y] for x in range(5)] I think what bothered me was the *asymmetry* between the two uses of the calculated value of f(x).? It is not obvious at first glance that ??? [(f(x) as y), y] defines a 2-element list where the 2 elements are the *same*. Contrast something like (exact syntax bike-sheddable) ??? ????? stuff = [ (with f(x) as y: [y,y])? 
for x in range(5)] or ??? ??? ??? stuff = [ (y,y] with f(x) as y) for x in range(5)] This also has the advantage (if it is? I think probably it is) that the scope of the temporary variable ("y" here) can be limited to inside the parentheses of the "with" sub-expression. And that it is not dependent on Python's evaluation order. Ir gives the programmer explicit control over the scope, which might conceivably be an advantage in more complicated expressions. Sorry if this is re-hashing a suggestion that has been made before, as it probably is.? It just struck me as ... I don't know ... cleaner somehow. Regards Rob Cliffe -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Wed Feb 28 00:23:13 2018 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 28 Feb 2018 16:23:13 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: <54c90d32-ef5f-229d-28d8-2e3433eadf0b@btinternet.com> References: <54c90d32-ef5f-229d-28d8-2e3433eadf0b@btinternet.com> Message-ID: On Wed, Feb 28, 2018 at 2:47 PM, Rob Cliffe via Python-ideas wrote: > I hope nobody will mind too much if I throw in my (relatively uninformed) 2c > before some of the big guns respond. Not at all! Everyone's contributions are welcomed. Even after the "big guns" respond, other voices are definitely worth hearing. (One small tip though: Responding in plain text is appreciated, as it means information about who said what is entirely copy-and-pasteable.) > First: Well done, Chris, for all the work on this. IMHO this could be a > useful Python enhancement (and reduce the newsgroup churn :-)). Thanks :) It's one of those PEPs that can be immensely useful even if it's rejected. > On 27/02/2018 22:27, Chris Angelico wrote: > Programming is all about reusing code rather than duplicating it. When > an expression needs to be used twice in quick succession but never again, > it is convenient to assign it to a temporary name with very small scope. > By permitting name bindings to exist within a single statement only, we > make this both convenient and safe against collisions. > > It may be pedantic of me (and it will produce a more pedantic-sounding > sentence) but I honestly think that > "safe against name collisions" is clearer than "safe against collisions", > and that clarity matters. Sure. I'm also aware that I'm using the same words over and over, but I can add that one. > Rationale > ========= > > When an expression is used multiple times in a list comprehension, there > are currently several suboptimal ways to spell this, and no truly good > ways. A statement-local name allows any expression to be temporarily > captured and then used multiple times. > > IMHO the first sentence is a bit of an overstatement (though of course it's > a big part of the PEP's "sell"). > How about "... there are currently several ways to spell this, none of them > ideal." Hmm, I think I prefer the current wording, but maybe there's some other way to say it that's even better. > Also, given that a list comprehension is an expression, which in turn could > be part of a larger expression, would it be appropriate to replace > "expression" by "sub-expression" in the 2 places where it occurs in the > above paragraph? Thanks, done. > Syntax and semantics > ==================== > > In any context where arbitrary Python expressions can be used, a named > expression can appear. This must be parenthesized for clarity, > > I agree, pro tem (not that I am claiming that my opinion counts for much). 
> I'm personally somewhat allergic to making parentheses mandatory where they > really don't need to be, but trying to think about where they could be > unambiguously omitted makes my head spin. At least, if we run with this for > now, then making them non-mandatory in some contexts, at some future time, > won't lead to backwards incompatibility. Yeah, definitely. There've been other times when a new piece of syntax is extra restrictive at first, and then gets opened up later. It's way easier than the alternative. (For the record, I had some trouble with this syntax at first, and was almost going to launch this PEP with a syntax of "( > expr as NAME)" to disambiguate. That was never the intention, though, and I'm grateful to the folks on core-mentorship for helping me get that sorted.) > Example usage > ============= > > These list comprehensions are all approximately equivalent:: > > # Calling the function twice > > # Calling the function twice (assuming that side effects can be > ignored) That assumption should be roughly inherent in the problem. If the call has no side effects and low cost, none of this is necessary - just repeat the expression. > stuff = [[f(x), f(x)] for x in range(5)] > > # Helper function > > # External helper function Not a big deal either way, can toss in the extra word but I'm not really sure it's needed. > Please feel free to ignore this, but (trying to improve on the above > example): > # Using a generator: > def gen(): > for x in range(5): > y = f(x) > yield y,y > stuff = list(gen()) I think it's unnecessary; the direct loop is entirely better IMO. Since the point of these examples is just to contrast against the proposal, it's no biggie if there are EVEN MORE ways (and I haven't even mentioned the steak knives!) to do something, unless they're actually better. > If calling `f(x)` is expensive or has side effects, the clean operation of > the list comprehension gets muddled. Using a short-duration name binding > retains the simplicity; while the extra `for` loop does achieve this, it > does so at the cost of dividing the expression visually, putting the named > part at the end of the comprehension instead of the beginning. > > Maybe add to last sentence "and of adding (at least conceptually) extra > steps: building a 1-element list, then extracting the first element" That's precisely the point that Serhiy's optimization is aiming at, with the intention of making "for x in [expr]" a standard idiom for list comp assignment. If we assume that this does become standard, it won't add the extra steps, but it does still push that expression out to the far end of the comprehension, whereas a named subexpression places it at first use. > Open questions > ============== > > 1. What happens if the name has already been used? `(x, (1 as x), x)` > Currently, prior usage functions as if the named expression did not > exist (following the usual lookup rules); the new name binding will > shadow the other name from the point where it is evaluated until the > end of the statement. Is this acceptable? Should it raise a syntax > error or warning? > > IMHO this is not only acceptable, but the (only) correct behaviour. Your > crystal-clear statement "the new name binding will shadow the other name > from the point where it is evaluated until the end of the statement " is > critical and IMO what should happen. 
Regular function-locals don't work that way, though: x = "global" def f(): print(x) x = "local" print(x) This won't print "global" followed by "local" - it'll bomb with UnboundLocalError. I do still think this is correct behaviour, though; the only other viable option is for the SLNB to fail if it's shadowing anything at all, and even that has its weird edge cases. > Perhaps an extra example or two, to clarify that *execution order* is what > matters, might help, e.g. > y if (f() as y) > 0 else None > will work as expected, because "(f() as y)" is evaluated before the initial > "y" is (if it is). Unnecessary in the "open questions" section, but if this proves to be a point of confusion and I make a FAQ, then yeah, I could put in some examples like that. > [Parenthetical comment: Advanced use of this new feature would require > knowledge of Python's evaluation order. But this is not an argument against > the PEP, because the same could be said about almost any feature of Python, > e.g. > [ f(x), f(x) ] > where evaluating f(x) has side effects.] Yeah, people should have no problem figuring this out. > 2. The current implementation [1] implements statement-local names using > > a special (and mostly-invisible) name mangling. This works perfectly > inside functions (including list comprehensions), but not at top > level. Is this a serious limitation? Is it confusing? > > It's great that it works perfectly in functions and list comprehensions, but > it sounds as if, at top level, in rare circumstances it could produce a > hard-to-track-down bug, which is not exactly desirable. It's hard to say > more without knowing more details. As a stab in the dark, is it possible to > avoid it by including the module name in the mangling? Sorry if I'm talking > rubbish. The problem is that it's all done through the special "cell" slots in a function's locals. To try to do that at module level would potentially mean polluting the global namespace, which could interfere with other functions and cause extreme confusion. Currently, attempting to use an SLNB at top level produces a bizarre UnboundLocalError, and I don't truly understand why. The disassembly shows the same name mangling that happens inside a function, but it doesn't get properly undone. But I'm sure there are many other implementation bugs too. > 3. The interaction with locals() is currently[1] slightly buggy. Should > statement-local names appear in locals() while they are active (and > shadow any other names from the same function), or should they simply > not appear? > > IMHO this is an implementation detail. IMO you should have some idea what > you're doing when you use locals(). But I think consistency matters - > either the temporary variable *always* gets into locals() "from the point > where it is evaluated until the end of the statement", or it *never* gets > into locals(). (Possibly the language spec should specify one or the other > - I'm not sure, time may tell.) Yeah, and I would prefer the former, but that's still potentially confusing. 
Consider: y = "gg" def g(): x = 1 print(x, locals()) print((3 as x), x, locals()) print(y, (4 as y), y, locals()) print(x, locals()) del x print(locals()) Current output: 1 {} 3 3 {'x': 3} gg 4 4 {'y': 4} 1 {} {} Desired output: 1 {'x': 1} 3 3 {'x': 3} gg 4 4 {'x': 1, 'y': 4} 1 {'x': 1} {} Also acceptable (but depreferred) output: 1 {'x': 1} 3 3 {'x': 1} gg 4 4 {'x': 1} 1 {'x': 1} {} If the language spec mandates that "either this or that" happen, I'd be okay with that; it'd give other Pythons the option to implement this completely outside of locals() while still being broadly sane. > 4. Syntactic confusion in `except` statements. > 5. Similar confusion in `with` statements > > This (4. and 5.) shows that we are using "as" in more than one sense, and in > a perfect world we would use different keywords. But IMHO (admittedly, > without having thought about it much) this isn't much of a problem. Again, > perhaps some clarifying examples would help. No, we want to keep using the same keywords - otherwise there are too many keywords in the language. The "except" case isn't a big deal IMO, but the "with" one is more serious, and the subtle difference between "with (x as y):" and "with x as y:" is sure to trip someone up. But maybe that's one for linters and code review. > Some pedantry: > > One issue not so far explicitly mentioned: IMHO it should be perfectly > legal to assign a value to a temporary variable, and then not use that > temporary variable (just as it is legal to assign to a variable in a regular > assignment statement, and then not use that variable) though linters should > IMO point it out. E.g. you might want to modify (perhaps only temporarily) > a = [ (f() as b), b ] > to > a = [ (f() as b), c ] Yep, perfectly legal. Once linters learn that this is an assignment, they can flag this as "unused variable". Otherwise, it's not really hurting much. > Also (and I'm relying on "In any context where arbitrary Python expressions > can be used, a named expression can appear." ), > linters should also IMO point to > a = (42 as b) > which AFAICT is a laborious synonym for > a = 42 Ditto - an unused variable. You could also write "a = b = 42" and then never use b. > And here's a thought: What are the semantics of > a = (42 as a) # Of course a linter should point this out too > At first I thought this was also a laborious synonym for "a=42". But then I > re-read your statement (the one I described above as crystal-clear) and > realised that its exact wording was even more critical than I had thought: > "the new name binding will shadow the other name from the point where it > is evaluated until the end of the statement" > Note: "until the end of the statement". NOT "until the end of the > expression". The distinction matters. > If we take this as gospel, all this will do is create a temporary variable > "a", assign the value 42 to it twice, then discard it. I.e. it effectively > does nothing, slowly. > Have I understood correctly? Very likely you have considered this and mean > exactly what you say, but I am sure you will understand that I mean no > offence by querying it. Actually, that's a very good point, and I had to actually go and do that to confirm. You're correct that the "a =" part is also affected, but there may be more complicated edge cases. Disassembly can help track down what the compiler's actually doing: >>> def f(): ... a = 1 ... a = (2 as a) ... print(a) ... 
>>> dis.dis(f) 2 0 LOAD_CONST 1 (1) 2 STORE_FAST 0 (a) 3 4 LOAD_CONST 2 (2) 6 DUP_TOP 8 STORE_FAST 1 (a) 10 STORE_FAST 1 (a) 12 DELETE_FAST 1 (a) 4 14 LOAD_GLOBAL 0 (print) 16 LOAD_FAST 0 (a) 18 CALL_FUNCTION 1 20 POP_TOP 22 LOAD_CONST 0 (None) 24 RETURN_VALUE If you're not familiar with the output of dis.dis(), the first column (largely blank) is line numbers in the source, the second is byte code offsets, and then we have the operation and its parameter (if any). The STORE_FAST and LOAD_FAST opcodes work with local names, which are identified by their indices; the first such operation sets slot 0 (named "a"), but the two that happen in line 3 (byte positions 8 and 10) are manipulating slot 1 (also named "a"). So you can see that line 3 never touches slot 0, and it is entirely operating within the SLNB scope. Identical byte code is produced from this function: >>> def f(): ... a = 1 ... b = (2 as b) ... print(a) ... >>> dis.dis(f) 2 0 LOAD_CONST 1 (1) 2 STORE_FAST 0 (a) 3 4 LOAD_CONST 2 (2) 6 DUP_TOP 8 STORE_FAST 1 (b) 10 STORE_FAST 1 (b) 12 DELETE_FAST 1 (b) 4 14 LOAD_GLOBAL 0 (print) 16 LOAD_FAST 0 (a) 18 CALL_FUNCTION 1 20 POP_TOP 22 LOAD_CONST 0 (None) 24 RETURN_VALUE I love dis.dis(), it's such an awesome tool :) I'll push PEP changes based on your suggestions shortly. Am also going to add a "performance considerations" section, as features like this are potentially costly. Thanks for your input! ChrisA From rosuav at gmail.com Wed Feb 28 00:29:29 2018 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 28 Feb 2018 16:29:29 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com> References: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com> Message-ID: On Wed, Feb 28, 2018 at 3:38 PM, Rob Cliffe via Python-ideas wrote: > > Hm, apologies. This is in complete contrast to my previous post, where I > was pretty enthusiastic about Chris's PEP. But I can't resist sharing these > thoughts ... > > There was some vague uneasiness at the back of my mind, which I think I have > finally pinned down. Consider Chris's example: > # Using a statement-local name > stuff = [[(f(x) as y), y] for x in range(5)] > I think what bothered me was the asymmetry between the two uses of the > calculated value of f(x). It is not obvious at first glance that > [(f(x) as y), y] > defines a 2-element list where the 2 elements are the same. Hmm, very good point. In non-toy examples, I suspect this will be somewhat overshadowed by the actual work being done, but this is definitely a bit weird. > Contrast > something like (exact syntax bike-sheddable) > stuff = [ (with f(x) as y: [y,y]) for x in range(5)] > or > stuff = [ (y,y] with f(x) as y) for x in range(5)] > This also has the advantage (if it is? I think probably it is) that the > scope of the temporary variable ("y" here) can be limited to inside the > parentheses of the "with" sub-expression. True, but it's also extremely wordy. Your two proposed syntaxes, if I have this correct, are: 1) '(' 'with' EXPR 'as' NAME ':' EXPR ')' 2) '(' EXPR 'with' EXPR 'as' NAME ')' Of the two, I prefer the first, as the second has the same problem as the if/else expression: execution is middle-first. It also offers only a slight advantage (in a comprehension) over just slapping another 'for' clause at the end of the expression. The first one is most comparable to the lambda helper example, and is (IMO) still very wordy. Perhaps it can be added in a section of "alternative syntax forms"? 
> And that it is not dependent on Python's evaluation order.
> It gives the programmer explicit control over the scope, which might
> conceivably be an advantage in more complicated expressions.

That shouldn't normally be an issue (execution order is usually pretty
intuitive), but if there are weird enough edge cases found in my
current proposal, I'm happy to mention this as a possible solution to
them.

ChrisA

From robertve92 at gmail.com  Wed Feb 28 00:52:41 2018
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Wed, 28 Feb 2018 06:52:41 +0100
Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings
In-Reply-To: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com>
References: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com>
Message-ID: 

Hello Chris and Rob,

did you compare your proposal to the subject called "[Python-ideas]
Temporary variables in comprehensions" on this month's list?

If you don't want to go through all the mails, I tried to summarize the
ideas in this mail:
https://mail.python.org/pipermail/python-ideas/2018-February/048987.html

In a nutshell, one of the proposed syntaxes was:

stuff = [(y, y) where y = f(x) for x in range(5)]

Or with your choice of keyword:

stuff = [(y, y) with y = f(x) for x in range(5)]

Your proposal uses the *reverse* syntax:

stuff = [(y, y) with f(x) as y for x in range(5)]

Your syntax was already proposed in 2008, as you can see in the response
to my mail
(https://mail.python.org/pipermail/python-ideas/2018-February/049000.html).

The first example with the "where y = f(x)" can already be compiled on a
CPython branch called "where-expr" at
https://github.com/thektulu/cpython/commit/9e669d63d292a639eb6ba2ecea3ed2c0c23f2636

You can see the whole conversation on the subject "Temporary variables in
comprehensions" at
https://mail.python.org/pipermail/python-ideas/2018-February/thread.html#start

I wish I could write a PEP to summarize all the discussions (including
yours and my summary, with real world examples and pros and cons), or
should I do a gist on GitHub so that we can talk in a more "forum like"
manner where people can edit their answers and modify the document? This
mailing list is driving me a bit crazy.

As Rob pointed out, your syntax "(f(x) as y, y)" is really asymmetric, and
"(y, y) with f(x) as y" or "(y, y) with y = f(x)" is probably preferred.
Moreover, I agree with you that the choice of the "with" keyword could be
confused with the "with f(x) as x:" statement in context management, so
maybe "with x = f(x)" would clearly make it different? Or use a new
keyword like "where y = f(x)", "let y = f(x)" or probably better
"given y = f(x)"; "given" isn't used in current libraries, unlike
numpy.where or SQLAlchemy's "where".

In the conversation, a lot of people wanted real world examples where
it's obvious the new syntax is better.

Cheers,
Robert
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rosuav at gmail.com  Wed Feb 28 01:52:29 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 28 Feb 2018 17:52:29 +1100
Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings
In-Reply-To: 
References: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com>
Message-ID: 

On Wed, Feb 28, 2018 at 4:52 PM, Robert Vanden Eynde wrote:
> Hello Chris and Rob,
>
> did you compare your proposal to the subject called "[Python-ideas]
> Temporary variables in comprehensions" on this month's list?

Yes, I did; that's one of many threads on the subject, and is part of
why I'm writing this up.
One section that I have yet to write is "alternative syntax proposals", which is where I'd collect all of those together. > If you don't want to go through all the mails, I tried to summarize the > ideas in this mail : > https://mail.python.org/pipermail/python-ideas/2018-February/048987.html > > In a nutshell one of the proposed syntax was > > stuff = [(y, y) where y = f(x) for x in range(5)] > > Or with your choice of keyword, > > stuff = [(y, y) with y = f(x) for x in range(5)] > > Your proposal uses the *reverse* syntax : > > stuff = [(y, y) with f(x) as y for x in range (5)] > ... > I wish I could write a pep to summarize all the discussions (including > yours, my summary, including real world examples, including pro's and > con's), or should I do a gist on GitHub so that we can talk in a more "forum > like" manner where people can edit their answers and modify the document ? > This mailing is driving me a bit crazy. This is exactly why I am writing up a PEP. Ultimately, it should list every viable proposal (or group of similar proposals), with the arguments for and against. Contributions of actual paragraphs of text are VERY welcome; simply pointing out "hey, don't forget this one" is also welcome, but I then have to go and write something up, so it'll take a bit longer :) > As Rob pointed out, your syntax "(f(x) as y, y)" is really assymmetric and > "(y, y) with f(x) as y" or "(y, y) with y = f(x)" is probably prefered. > Moreover I agree with you the choice of "with" keyword could be confused > with the "with f(x) as x:" statement in context management, so maybe "with x > = f(x)" would cleary makes it different ? Or using a new keyword like "where > y = f(x)", "let y = f(x)" or probably better "given y = f(x)", "given" isn't > used in current librairies like numpy.where or sql alchemy "where". And also the standard library's tkinter.dnd.Icon, according to a quick 'git grep'; but that might be just an example, rather than actually being covered by backward-compatibility guarantees. I think "given" is the strongest contender of the three, but I'm just mentioning all three together. A new version of the PEP has been pushed, and should be live within a few minutes. https://www.python.org/dev/peps/pep-0572/ Whatever I've missed, do please let me know. This document should end up incorporating, or at least mentioning, all of the proposals you cited. ChrisA From greg at krypto.org Wed Feb 28 01:53:05 2018 From: greg at krypto.org (Gregory P. Smith) Date: Wed, 28 Feb 2018 06:53:05 +0000 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: On Tue, Feb 27, 2018 at 2:35 PM Chris Angelico wrote: > This is a suggestion that comes up periodically here or on python-dev. > This proposal introduces a way to bind a temporary name to the value > of an expression, which can then be used elsewhere in the current > statement. > > The nicely-rendered version will be visible here shortly: > > https://www.python.org/dev/peps/pep-0572/ > > ChrisA > > PEP: 572 > Title: Syntax for Statement-Local Name Bindings > Author: Chris Angelico > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 28-Feb-2018 > Python-Version: 3.8 > Post-History: 28-Feb-2018 > > > Abstract > ======== > > Programming is all about reusing code rather than duplicating it. When > an expression needs to be used twice in quick succession but never again, > it is convenient to assign it to a temporary name with very small scope. 
> By permitting name bindings to exist within a single statement only, we > make this both convenient and safe against collisions. > > > Rationale > ========= > > When an expression is used multiple times in a list comprehension, there > are currently several suboptimal ways to spell this, and no truly good > ways. A statement-local name allows any expression to be temporarily > captured and then used multiple times. > > > Syntax and semantics > ==================== > > In any context where arbitrary Python expressions can be used, a named > expression can appear. This must be parenthesized for clarity, and is of > the form `(expr as NAME)` where `expr` is any valid Python expression, > and `NAME` is a simple name. > > The value of such a named expression is the same as the incorporated > expression, with the additional side-effect that NAME is bound to that > value for the remainder of the current statement. > > Just as function-local names shadow global names for the scope of the > function, statement-local names shadow other names for that statement. > They can also shadow each other, though actually doing this should be > strongly discouraged in style guides. > > > Example usage > ============= > > These list comprehensions are all approximately equivalent:: > > # Calling the function twice > stuff = [[f(x), f(x)] for x in range(5)] > > # Helper function > def pair(value): return [value, value] > stuff = [pair(f(x)) for x in range(5)] > > # Inline helper function > stuff = [(lambda v: [v,v])(f(x)) for x in range(5)] > > # Extra 'for' loop - see also Serhiy's optimization > stuff = [[y, y] for x in range(5) for y in [f(x)]] > > # Expanding the comprehension into a loop > stuff = [] > for x in range(5): > y = f(x) > stuff.append([y, y]) > > # Using a statement-local name > stuff = [[(f(x) as y), y] for x in range(5)] > > If calling `f(x)` is expensive or has side effects, the clean operation of > the list comprehension gets muddled. Using a short-duration name binding > retains the simplicity; while the extra `for` loop does achieve this, it > does so at the cost of dividing the expression visually, putting the named > part at the end of the comprehension instead of the beginning. > > Statement-local name bindings can be used in any context, but should be > avoided where regular assignment can be used, just as `lambda` should be > avoided when `def` is an option. > > > Open questions > ============== > > 1. What happens if the name has already been used? `(x, (1 as x), x)` > Currently, prior usage functions as if the named expression did not > exist (following the usual lookup rules); the new name binding will > shadow the other name from the point where it is evaluated until the > end of the statement. Is this acceptable? Should it raise a syntax > error or warning? > > 2. The current implementation [1] implements statement-local names using > a special (and mostly-invisible) name mangling. This works perfectly > inside functions (including list comprehensions), but not at top > level. Is this a serious limitation? Is it confusing? > > 3. The interaction with locals() is currently[1] slightly buggy. Should > statement-local names appear in locals() while they are active (and > shadow any other names from the same function), or should they simply > not appear? > > 4. Syntactic confusion in `except` statements. While technically > unambiguous, it is potentially confusing to humans. 
In Python 3.7, > parenthesizing `except (Exception as e):` is illegal, and there is no > reason to capture the exception type (as opposed to the exception > instance, as is done by the regular syntax). Should this be made > outright illegal, to prevent confusion? Can it be left to linters? > > 5. Similar confusion in `with` statements, with the difference that there > is good reason to capture the result of an expression, and it is also > very common for `__enter__` methods to return `self`. In many cases, > `with expr as name:` will do the same thing as `with (expr as name):`, > adding to the confusion. > > > References > ========== > > .. [1] Proof of concept / reference implementation > (https://github.com/Rosuav/cpython/tree/statement-local-variables) > > -1 today My first concern for this proposal is that it gives parenthesis a non-intuitive meaning. ()s used to be innocuous. A way to group things, make explicit the desired order of operations, and allow spanning across multiple lines. With this proposal, ()s occasionally take on a new meaning to cause additional action to happen _outside_ of the ()s - an expression length name binding. Does this parse? print(fetch_the_comfy_chair(cardinal) as chair, "==", chair) SyntaxError? My read of the proposal suggests that this would be required: print((fetch_the_comfy_chair(cardinal) as chair), "==", chair) But that sets off my "excess unnecessary parenthesis" radar as this is a new need it isn't trained for. I could retrain the radar... I see people try to cram too much functionality into one liners. By the time you need to refer to a side effect value more than once in a single statement... Just use multiple statements. It is more clear and easier to debug. Anything else warps human minds trying to read, understand, review, and maintain the code. {see_also: "nested list comprehensions"} '''insert favorite Zen of Python quotes here''' We've got a whitespace delimited language. That is a feature. Lets keep it that way rather than adding a form of (non-curly) braces. I do understand the desire. So far I remain unconvinced of a need. Practical examples from existing code that become clearly easier to understand afterwards instead of made the examples with one letter names may help. 2cents, -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Wed Feb 28 02:12:01 2018 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 28 Feb 2018 18:12:01 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: On Wed, Feb 28, 2018 at 5:53 PM, Gregory P. Smith wrote: > > My first concern for this proposal is that it gives parenthesis a > non-intuitive meaning. ()s used to be innocuous. A way to group things, > make explicit the desired order of operations, and allow spanning across > multiple lines. > > With this proposal, ()s occasionally take on a new meaning to cause > additional action to happen _outside_ of the ()s - an expression length name > binding. > > Does this parse? > > print(fetch_the_comfy_chair(cardinal) as chair, "==", chair) > > SyntaxError? My read of the proposal suggests that this would be required: > > print((fetch_the_comfy_chair(cardinal) as chair), "==", chair) > > But that sets off my "excess unnecessary parenthesis" radar as this is a new > need it isn't trained for. I could retrain the radar... Hmm, interesting point. 
The value to be captured DOES need to be grouped, though, and the influence of the assignment is "until end of statement", which is usually fairly clear. Parentheses do already have quite a number of meanings. For instance, a function call does more than simply group a subexpression - there's a lot of difference between x=(1, 2, 3) and x(1, 2, 3) even before you look at all the other syntaxes that can be used inside function calls (eg keyword arguments). The way I see it, this proposal is about the "as" keyword being used for local name binding, and the purpose of the parentheses is to group the name with the value being bound. > I see people try to cram too much functionality into one liners. By the > time you need to refer to a side effect value more than once in a single > statement... Just use multiple statements. It is more clear and easier to > debug. This concern comes up frequently, and there are definitely times when it's true. But there are also times when the opposite is true, and pushing more syntax into a single logical action makes code MORE readable. For instance, Python allows us to quickly zero out a bunch of variables all at once: a = b = c = d = e = 0 Can this be abused? Sure! But when it's used correctly, it is far MORE readable than laying everything out vertically: a = 0 b = 0 c = 0 d = 0 e = 0 Code is attempting to express abstract ideas. If your idea of "readable code" is "code that makes it easy to see the concrete actions being done", you're missing out on a lot of the power of Python. > We've got a whitespace delimited language. That is a feature. Lets keep it > that way rather than adding a form of (non-curly) braces. I don't fully understand you here. Python has always used symbols as delimiters, and I doubt anyone would want to do things any differently. > Practical examples from existing code that become clearly easier to > understand afterwards instead of made the examples with one letter names may > help. Fair enough. I'll try to dig some up. The trouble is that practical examples are seldom as clear and simple as the shorter examples are. ChrisA From marcidy at gmail.com Wed Feb 28 02:46:50 2018 From: marcidy at gmail.com (Matt Arcidy) Date: Wed, 28 Feb 2018 07:46:50 +0000 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: I have been struggling to justify the need based on what I have read. I hope this isn't a dupe, I only saw caching mentioned in passing. Also please excuse some of the naive generalizations below for illustrative purposes. Is there a reason memoization doesn't work? If f is truly expensive, using this syntax only makes sense if you have few comprehensions (as opposed to many iterations) and few other calls to f. Calling f in 10 comprehensions would (naively) benefit from memoization more than this. It appears to me to be ad-hoc memoization with limited scope. is this a fair statement? >From readability, the examples put forth have been to explain the advantage, with which I agree. However, i do not believe this scales well. [(foo(x,y) as g)*(bar(y) as i) + g*foo(x,a) +baz(g,i) for x... for y...] That's 3 functions, 2 iterators, 3 calls saved ('a' is some constant just to trigger a new call on foo). I'm not trying to show ugly statements can be constructed, but show how quickly in _n iterators and _m functions readability declines. it's seems the utility is bounded by f being not complex/costly enough for memoization, and ( _m, _n) being small enough to pass code review. 
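To make the memoization comparison concrete, a minimal sketch using
functools.lru_cache; f here is a hypothetical pure function with
hashable arguments, so the second call with the same x is just a cache
lookup rather than a recomputation:

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def f(x):
        print("computing f for", x)  # printed once per distinct x
        return x * x

    stuff = [(f(x), f(x) + 1) for x in range(3)]
    print(stuff)  # [(0, 1), (1, 2), (4, 5)]

This only helps when the repeated sub-expression is a single call to a
pure function, which is the limitation pointed out later in the thread.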
The syntax does not create a readability issue, but by adding symbols, exacerbates an inherent issue with 'complex' comprehensions. If I have understood, does this bounding on f, _m, and _n yield a tool with sufficient applicability for a language change? I know it's already spoken about as an edge case, I'm just not clear on the bounds of that. I am not against it, I just haven't seen this addressed. Thank you for putting the PEP together. Again, I don't want to sound negative on it, I may have misunderstood wildly. I like the idea conceptually, and I don't think it's anymore confusing to me than comprehensions were when I first encountered them, and it will yield a 'cool!' statement much like they did when I learned them. On Tue, Feb 27, 2018, 22:55 Gregory P. Smith wrote: > > On Tue, Feb 27, 2018 at 2:35 PM Chris Angelico wrote: > >> This is a suggestion that comes up periodically here or on python-dev. >> This proposal introduces a way to bind a temporary name to the value >> of an expression, which can then be used elsewhere in the current >> statement. >> >> The nicely-rendered version will be visible here shortly: >> >> https://www.python.org/dev/peps/pep-0572/ >> >> ChrisA >> >> PEP: 572 >> Title: Syntax for Statement-Local Name Bindings >> Author: Chris Angelico >> Status: Draft >> Type: Standards Track >> Content-Type: text/x-rst >> Created: 28-Feb-2018 >> Python-Version: 3.8 >> Post-History: 28-Feb-2018 >> >> >> Abstract >> ======== >> >> Programming is all about reusing code rather than duplicating it. When >> an expression needs to be used twice in quick succession but never again, >> it is convenient to assign it to a temporary name with very small scope. >> By permitting name bindings to exist within a single statement only, we >> make this both convenient and safe against collisions. >> >> >> Rationale >> ========= >> >> When an expression is used multiple times in a list comprehension, there >> are currently several suboptimal ways to spell this, and no truly good >> ways. A statement-local name allows any expression to be temporarily >> captured and then used multiple times. >> >> >> Syntax and semantics >> ==================== >> >> In any context where arbitrary Python expressions can be used, a named >> expression can appear. This must be parenthesized for clarity, and is of >> the form `(expr as NAME)` where `expr` is any valid Python expression, >> and `NAME` is a simple name. >> >> The value of such a named expression is the same as the incorporated >> expression, with the additional side-effect that NAME is bound to that >> value for the remainder of the current statement. >> >> Just as function-local names shadow global names for the scope of the >> function, statement-local names shadow other names for that statement. >> They can also shadow each other, though actually doing this should be >> strongly discouraged in style guides. 
>> >> >> Example usage >> ============= >> >> These list comprehensions are all approximately equivalent:: >> >> # Calling the function twice >> stuff = [[f(x), f(x)] for x in range(5)] >> >> # Helper function >> def pair(value): return [value, value] >> stuff = [pair(f(x)) for x in range(5)] >> >> # Inline helper function >> stuff = [(lambda v: [v,v])(f(x)) for x in range(5)] >> >> # Extra 'for' loop - see also Serhiy's optimization >> stuff = [[y, y] for x in range(5) for y in [f(x)]] >> >> # Expanding the comprehension into a loop >> stuff = [] >> for x in range(5): >> y = f(x) >> stuff.append([y, y]) >> >> # Using a statement-local name >> stuff = [[(f(x) as y), y] for x in range(5)] >> >> If calling `f(x)` is expensive or has side effects, the clean operation of >> the list comprehension gets muddled. Using a short-duration name binding >> retains the simplicity; while the extra `for` loop does achieve this, it >> does so at the cost of dividing the expression visually, putting the named >> part at the end of the comprehension instead of the beginning. >> >> Statement-local name bindings can be used in any context, but should be >> avoided where regular assignment can be used, just as `lambda` should be >> avoided when `def` is an option. >> >> >> Open questions >> ============== >> >> 1. What happens if the name has already been used? `(x, (1 as x), x)` >> Currently, prior usage functions as if the named expression did not >> exist (following the usual lookup rules); the new name binding will >> shadow the other name from the point where it is evaluated until the >> end of the statement. Is this acceptable? Should it raise a syntax >> error or warning? >> >> 2. The current implementation [1] implements statement-local names using >> a special (and mostly-invisible) name mangling. This works perfectly >> inside functions (including list comprehensions), but not at top >> level. Is this a serious limitation? Is it confusing? >> >> 3. The interaction with locals() is currently[1] slightly buggy. Should >> statement-local names appear in locals() while they are active (and >> shadow any other names from the same function), or should they simply >> not appear? >> >> 4. Syntactic confusion in `except` statements. While technically >> unambiguous, it is potentially confusing to humans. In Python 3.7, >> parenthesizing `except (Exception as e):` is illegal, and there is no >> reason to capture the exception type (as opposed to the exception >> instance, as is done by the regular syntax). Should this be made >> outright illegal, to prevent confusion? Can it be left to linters? >> >> 5. Similar confusion in `with` statements, with the difference that there >> is good reason to capture the result of an expression, and it is also >> very common for `__enter__` methods to return `self`. In many cases, >> `with expr as name:` will do the same thing as `with (expr as name):`, >> adding to the confusion. >> >> >> References >> ========== >> >> .. [1] Proof of concept / reference implementation >> (https://github.com/Rosuav/cpython/tree/statement-local-variables) >> >> > -1 today > > My first concern for this proposal is that it gives parenthesis a > non-intuitive meaning. ()s used to be innocuous. A way to group things, > make explicit the desired order of operations, and allow spanning across > multiple lines. > > With this proposal, ()s occasionally take on a new meaning to cause > additional action to happen _outside_ of the ()s - an expression length > name binding. > > Does this parse? 
> > print(fetch_the_comfy_chair(cardinal) as chair, "==", chair) > > SyntaxError? My read of the proposal suggests that this would be required: > > print((fetch_the_comfy_chair(cardinal) as chair), "==", chair) > > But that sets off my "excess unnecessary parenthesis" radar as this is a > new need it isn't trained for. I could retrain the radar... > > I see people try to cram too much functionality into one liners. By the > time you need to refer to a side effect value more than once in a single > statement... Just use multiple statements. It is more clear and easier to > debug. > > Anything else warps human minds trying to read, understand, review, and > maintain the code. {see_also: "nested list comprehensions"} > > '''insert favorite Zen of Python quotes here''' > > We've got a whitespace delimited language. That is a feature. Lets keep it > that way rather than adding a form of (non-curly) braces. > > I do understand the desire. So far I remain unconvinced of a need. > > Practical examples from existing code that become clearly easier to > understand afterwards instead of made the examples with one letter names > may help. > > 2cents, > -gps > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Wed Feb 28 03:37:20 2018 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 28 Feb 2018 19:37:20 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: On Wed, Feb 28, 2018 at 6:46 PM, Matt Arcidy wrote: > I have been struggling to justify the need based on what I have read. I > hope this isn't a dupe, I only saw caching mentioned in passing. > > Also please excuse some of the naive generalizations below for illustrative > purposes. > > Is there a reason memoization doesn't work? If f is truly expensive, using > this syntax only makes sense if you have few comprehensions (as opposed to > many iterations) and few other calls to f. Calling f in 10 comprehensions > would (naively) benefit from memoization more than this. It appears to me > to be ad-hoc memoization with limited scope. is this a fair statement? Memoization is only an option if the expression in question is (a) a single cacheable function call, and (b) used twice without any variation. If it's any sort of more complicated expression, that concept doesn't work. ChrisA From robertve92 at gmail.com Wed Feb 28 04:04:13 2018 From: robertve92 at gmail.com (Robert Vanden Eynde) Date: Wed, 28 Feb 2018 10:04:13 +0100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com> Message-ID: True, but it's also extremely wordy. Your two proposed syntaxes, if I have this correct, are: 1) '(' 'with' EXPR 'as' NAME ':' EXPR ')' 2) '(' EXPR 'with' EXPR 'as' NAME ')' Of the two, I prefer the first, as the second has the same problem as the if/else expression: execution is middle-first. It also offers only a slight advantage (in a comprehension) over just slapping another 'for' clause at the end of the expression. The first one is most comparable to the lambda helper example, and is (IMO) still very wordy. Perhaps it can be added in a section of "alternative syntax forms"? 
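For comparison, the lambda-helper form that the quoted text calls the
closest relative of spelling 1 is already expressible today; a minimal
sketch, with f() as a hypothetical placeholder (the temporary name y
exists only inside the lambda's body):

    def f(x):
        return x + 100  # hypothetical stand-in for the repeated sub-expression

    stuff = [(lambda y: [y, y])(f(x)) for x in range(5)]
    print(stuff)  # [[100, 100], [101, 101], [102, 102], [103, 103], [104, 104]]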
Considering the 3rd syntax : '(' EXPR 'with' NAME '=' EXPR ')' Wouldn't have the problem of "execution being middle first and would clearly differenciate the "with NAME = CONTEXT" from the "with CONTEXT as NAME:" statement. Considering the PEP : 1) I think I spoke too fast for SqlAlchemy using "where", after looking up, they use "filter" (I was pretty sure I read it somewhere...) 2) talking about the implementation of thektulu in the "where =" part. 3) "C problem that an equals sign in an expression can now create a name binding, rather than performing a comparison." The "=" does variable assignement already, and there is no grammar problem of "=" vs "==" because the "with" keyword is used in the expression, therefore "with a == ..." is a SyntaxError whereas "where a = ..." is alright (See grammar in thektulu implemention of "where"). Remember that the lexer knows the difference between "=" and "==", so those two are clearly different tokens. 4) Would the syntax be allowed after the "for" in a list comprehension ? [[y, y] for x in range(5) with y = x+1] This is exactly the same as "for y in [ x+1 ]", allowing the syntax here would allow adding "if" to filter in the list comp using the new Variable. [[y, y] for x in range(5) with y = x+1 if y % 2 == 0] 5) Any expression vs "post for" only When I say "any expression" I mean: print(y+2 with y = x+1) When I say "post for in list comp" I mean the previous paragraph: [y+2 for x in range(5) with y = x+1] Allowing both cases leads to having two ways in the simple case [(y,y) with y = x+1 for x in range(5)] vs [(y,y) for x in range(5) with y = x+1] (but that's alright) Allowing "any expression" would result in having two ways to have variable assignement : y = x + 1 print(y+2) Vs: print(y+2 with y = x+1) One could argue the first is imperative programming whereas the second is Functional programming. The second case will have to have "y" being a Local variable as the new Variable in list comp are not in the outside scope. 6) with your syntax, how does the simple case work (y+2 with y = x+1) ? Would you write ((x+1 as y) + 2) ? That's very unclear where the variable are defined, in the [(x+1 as y), y] case, the scoping would suggest the "y" Variable is defined between the parenthesis whereas [x+1 as y, y] is not symmetric. The issue is not only about reusing variable. 7) the "lambda example", the "v" variable can be renamed "y" to be consistent with the other examples. 8) there are two ways of using a lamba, either positional args, either keyword arguments, writing (lambda y: [y, y])(x+1) Vs (lambda y: [y, y])(y=x+1) In the second example, the y = x+1 is explicit. 9) the issue is not only about reusing variable, but also readability, otherwise, why would we create Tempory variables we only use once ? 10) Chaining, in the case of the "with =", in thektulu, parenthesis were mandatory: print((z+3 with z = y+2) with y = x+2) What happens when the parenthesis are dropped ? print(z+3 with y = x+2 with z = y+2) Vs print(z+3 with y = x+2 with z = y+2) I prefer the first one be cause it's in the same order as the "post for" [z + 3 for y in [ x+2 ] for z in [ y+2 ]] 11) Scoping, in the case of the "with =" syntax, I think the parenthesis introduce a scope : print(y + (y+1 where y = 2)) Would raise a SyntaxError, it's probably better for the variable beeing local and not in the current function (that would be a mess). 
Remember that in list comp, the variable is not leaked : x = 5 stuff = [y+2 for y in [x+1] print(y) # SyntaxError Robert -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcidy at gmail.com Wed Feb 28 04:55:05 2018 From: marcidy at gmail.com (Matt Arcidy) Date: Wed, 28 Feb 2018 01:55:05 -0800 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: I appreciate that point as it is what I must be misunderstanding. I believe the performance speed up of [((f(x) as h), g(h)) for x range(10)] is that there are 10 calls to compute f, not 20. You can do this with a dictionary right now, at least for the example we're talking about: [(d[x], g(d[x])) for x in range(10) if d.update({x:f(x)}) is None] It's ugly but get's the job done. The proposed syntax is leagues better than that, but just to give my point a concrete example. If f is memoized, isn't [( f(x), g(f(x)) ) for range(10)] the same? You compute f 10 times, not 20. You get the second f(x) as cache retrieval instead of recomputing it, precisely because the argument x is the same. Here I can use 'd' later if needed, as opposed to with the proposal. That's really my point about memoization (or cacheing). If 'f' is really expensive, I don't really see the point of using an ad-hoc local caching of the value that lives just for one statement when I could use it where-ever, even persist it if it makes sense. I fully admit I'm at my depth here, so I can comfortably concede it's better than memoization and I just don't understand! On Wed, Feb 28, 2018 at 12:37 AM, Chris Angelico wrote: > On Wed, Feb 28, 2018 at 6:46 PM, Matt Arcidy wrote: >> I have been struggling to justify the need based on what I have read. I >> hope this isn't a dupe, I only saw caching mentioned in passing. >> >> Also please excuse some of the naive generalizations below for illustrative >> purposes. >> >> Is there a reason memoization doesn't work? If f is truly expensive, using >> this syntax only makes sense if you have few comprehensions (as opposed to >> many iterations) and few other calls to f. Calling f in 10 comprehensions >> would (naively) benefit from memoization more than this. It appears to me >> to be ad-hoc memoization with limited scope. is this a fair statement? > > Memoization is only an option if the expression in question is (a) a > single cacheable function call, and (b) used twice without any > variation. If it's any sort of more complicated expression, that > concept doesn't work. > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From rosuav at gmail.com Wed Feb 28 05:25:04 2018 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 28 Feb 2018 21:25:04 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: On Wed, Feb 28, 2018 at 8:55 PM, Matt Arcidy wrote: > I appreciate that point as it is what I must be misunderstanding. > > I believe the performance speed up of [((f(x) as h), g(h)) for x > range(10)] is that there are 10 calls to compute f, not 20. > > You can do this with a dictionary right now, at least for the example > we're talking about: > [(d[x], g(d[x])) for x in range(10) if d.update({x:f(x)}) is None] > > It's ugly but get's the job done. The proposed syntax is leagues > better than that, but just to give my point a concrete example. 
Definitely ugly... if I saw that in code review, I'd ask "When is that condition going to be false?". It also offers nothing that the "extra 'for' loop" syntax can't do better. Memoization can only be done for pure functions. If the call in question is actually, say, "next(iter)", you can't memoize it to avoid duplicate calls. Possibly not the greatest example (since you could probably zip() to solve all those sorts of cases), but anything else that has side effects could do the same thing. ChrisA From rosuav at gmail.com Wed Feb 28 05:43:05 2018 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 28 Feb 2018 21:43:05 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com> Message-ID: On Wed, Feb 28, 2018 at 8:04 PM, Robert Vanden Eynde wrote: > Considering the 3rd syntax : > '(' EXPR 'with' NAME '=' EXPR ')' > > Wouldn't have the problem of "execution being middle first and would clearly > differenciate the "with NAME = CONTEXT" from the "with CONTEXT as NAME:" > statement. It's still right-to-left, which is as bad as middle-outward once you combine it with normal left-to-right evaluation. Python has very little of this, and usually only in contexts where you wouldn't have much code on the left: >>> def d(): ... print("Getting dictionary") ... return {} ... >>> def k(): ... print("Getting key") ... return "spam" ... >>> def v(): ... print("Getting value") ... return 42 ... >>> d()[k()] = v() Getting value Getting dictionary Getting key Python executes the RHS of an assignment statement before the LHS, but the LHS is usually going to be so simple that you don't really care (or even notice, usually). By creating a name binding on the right and then evaluating the left, you create a complicated evaluation order that *will* have complex code on the left. > Considering the PEP : > > 1) I think I spoke too fast for SqlAlchemy using "where", after looking up, > they use "filter" (I was pretty sure I read it somewhere...) There's something with "select where exists" that uses .where(). It may not be as common as filter, but it's certainly out there. > 2) talking about the implementation of thektulu in the "where =" part. ? > 3) "C problem that an equals sign in an expression can now create a name > binding, rather than performing a comparison." The "=" does variable > assignement already, and there is no grammar problem of "=" vs "==" because > the "with" keyword is used in the expression, therefore "with a == ..." is a > SyntaxError whereas "where a = ..." is alright (See grammar in thektulu > implemention of "where"). Yes, but in Python, "=" does variable assignment *as a statement*. In C, you can do this: while (ch = getch()) do_something_with(ch) That's an assignment in an arbitrary condition, and that's a bug magnet. You cannot do that in Python. You cannot simply miss out one equals sign and have legal code that does what you don't want. With my proposed syntax, you'll be able to do this: while (getch() as ch): ... There's no way that you could accidentally write this when you really wanted to compare against the character. With yours, I'm not sure whether it handles a 'while' loop at all, but if it does, it would be something like: while (ch with ch = getch()): ... which doesn't read very well, doesn't really save much, but yes, I agree, it isn't going to accidentally assign. > Remember that the lexer knows the difference between "=" and "==", so those > two are clearly different tokens. 
It's not the lexer I'm worried about :) > 4) Would the syntax be allowed after the "for" in a list comprehension ? > > [[y, y] for x in range(5) with y = x+1] > > This is exactly the same as "for y in [ x+1 ]", allowing the syntax here > would allow adding "if" to filter in the list comp using the new Variable. > > [[y, y] for x in range(5) with y = x+1 if y % 2 == 0] I honestly don't know. With my "as" syntax, you would be able to, because it's simply first-use. The (expr as name) unit is itself an expression with a value. The 'with' clause has to bracket the value in some way. > 5) Any expression vs "post for" only > > When I say "any expression" I mean: > > print(y+2 with y = x+1) > > When I say "post for in list comp" I mean the previous paragraph: > > [y+2 for x in range(5) with y = x+1] > > Allowing both cases leads to having two ways in the simple case [(y,y) with > y = x+1 for x in range(5)] vs [(y,y) for x in range(5) with y = x+1] (but > that's alright) > > Allowing "any expression" would result in having two ways to have variable > assignement : > > y = x + 1 > print(y+2) > > Vs: > > print(y+2 with y = x+1) > > One could argue the first is imperative programming whereas the second is > Functional programming. > > The second case will have to have "y" being a Local variable as the new > Variable in list comp are not in the outside scope. I don't know what the benefit is here, but sure. As long as the grammar is unambiguous, I don't see any particular reason to reject this. > 6) with your syntax, how does the simple case work (y+2 with y = x+1) ? > > Would you write ((x+1 as y) + 2) ? That's very unclear where the variable > are defined, in the [(x+1 as y), y] case, the scoping would suggest the "y" > Variable is defined between the parenthesis whereas [x+1 as y, y] is not > symmetric. What simple case? The case where you only use the variable once? I'd write it like this: (x + 1) + 2 > The issue is not only about reusing variable. If you aren't using the variable multiple times, there's no point giving it a name. Unless I'm missing something here? > 7) the "lambda example", the "v" variable can be renamed "y" to be > consistent with the other examples. Oops, thanks, fixed. > 8) there are two ways of using a lamba, either positional args, either > keyword arguments, writing > > (lambda y: [y, y])(x+1) > > Vs > > (lambda y: [y, y])(y=x+1) > > In the second example, the y = x+1 is explicit. Ewww. Remind me what the benefit is of writing the variable name that many times? "Explicit" doesn't mean "utterly verbose". > 10) Chaining, in the case of the "with =", in thektulu, parenthesis were > mandatory: > > print((z+3 with z = y+2) with y = x+2) > > What happens when the parenthesis are dropped ? > > print(z+3 with y = x+2 with z = y+2) > > Vs > > print(z+3 with y = x+2 with z = y+2) > > I prefer the first one be cause it's in the same order as the "post for" > > [z + 3 for y in [ x+2 ] for z in [ y+2 ]] With my proposal, the parens are simply mandatory. Extending this to make them optional can come later. > 11) Scoping, in the case of the "with =" syntax, I think the parenthesis > introduce a scope : > > print(y + (y+1 where y = 2)) > > Would raise a SyntaxError, it's probably better for the variable beeing > local and not in the current function (that would be a mess). > > Remember that in list comp, the variable is not leaked : > > x = 5 > stuff = [y+2 for y in [x+1] > print(y) # SyntaxError Scoping is a fundamental part of both my proposal and the others I've seen here. 
(BTW, that would be a NameError, not a SyntaxError; it's perfectly legal to ask for the name 'y', it just hasn't been given any value.) By my definition, the variable is locked to the statement that created it, even if that's a compound statement. By the definition of a "(expr given var = expr)" proposal, it would be locked to that single expression. ChrisA From p.f.moore at gmail.com Wed Feb 28 06:49:28 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 28 Feb 2018 11:49:28 +0000 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: On 27 February 2018 at 22:27, Chris Angelico wrote: > This is a suggestion that comes up periodically here or on python-dev. > This proposal introduces a way to bind a temporary name to the value > of an expression, which can then be used elsewhere in the current > statement. > > The nicely-rendered version will be visible here shortly: > > https://www.python.org/dev/peps/pep-0572/ > > ChrisA > > PEP: 572 > Title: Syntax for Statement-Local Name Bindings > Author: Chris Angelico > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 28-Feb-2018 > Python-Version: 3.8 > Post-History: 28-Feb-2018 Thanks for writing this - as you mention, this will be a useful document even if the proposal ultimately gets rejected. FWIW, I'm currently -1 on this, so "rejected" is what I'm expecting, but it's possible that subsequent discussions could refine this into something that people are happy with (or that I'm in the minority, of course :-)) > Abstract > ======== > > Programming is all about reusing code rather than duplicating it. When > an expression needs to be used twice in quick succession but never again, > it is convenient to assign it to a temporary name with very small scope. > By permitting name bindings to exist within a single statement only, we > make this both convenient and safe against collisions. In the context of multi-line statements like with, or while, describing this as a "very small scope" is inaccurate (and maybe even misleading). See later for an example using def. > Rationale > ========= > > When an expression is used multiple times in a list comprehension, there > are currently several suboptimal ways to spell this, and no truly good > ways. A statement-local name allows any expression to be temporarily > captured and then used multiple times. I agree with Rob Cliffe, this is a bit overstated. How about "there are currently several ways to spell this, none of which is universally accepted as ideal". Specifically, the point to me is that opinions are (strongly) divided, rather than that everyone agrees that it's a problem but we haven't found a good solution yet. > Syntax and semantics > ==================== > > In any context where arbitrary Python expressions can be used, a named > expression can appear. This must be parenthesized for clarity, and is of > the form `(expr as NAME)` where `expr` is any valid Python expression, > and `NAME` is a simple name. I agree with requiring parentheses - although I dislike the tendency towards "mandatory parentheses" in recent syntax proposals :-( But "1 as x + 1 as y" is an abomination. Conversely "sqrt((1 as x))" is a little annoying. So it feels like a case of the lesser of two evils to me, rather than an actually good idea... > The value of such a named expression is the same as the incorporated > expression, with the additional side-effect that NAME is bound to that > value for the remainder of the current statement. 
While there's basically no justification for doing so, it should be noted that under this proposal, ((((((((1 as x) as y) as z) as w) as v) as u) as t) as s) is valid. Of course, "you can write confusing code using this" isn't an argument against a useful enhancement, but potential for abuse is something to be aware of. There's also (slightly more realistically) something like [(sqrt((b*b as bsq) + (4*a*c as fourac)) as root1), (sqrt(bsq - fourac) as root2)], which I can see someone thinking is a good idea! The question here is whether the readability of "reasonable" uses of the construct is sufficient to outweigh the risk of well-intentioned misuses. > Just as function-local names shadow global names for the scope of the > function, statement-local names shadow other names for that statement. > They can also shadow each other, though actually doing this should be > strongly discouraged in style guides. > > > Example usage > ============= > > These list comprehensions are all approximately equivalent:: > > # Calling the function twice > stuff = [[f(x), f(x)] for x in range(5)] > > # Helper function > def pair(value): return [value, value] > stuff = [pair(f(x)) for x in range(5)] > > # Inline helper function > stuff = [(lambda v: [v,v])(f(x)) for x in range(5)] > > # Extra 'for' loop - see also Serhiy's optimization > stuff = [[y, y] for x in range(5) for y in [f(x)]] > > # Expanding the comprehension into a loop > stuff = [] > for x in range(5): > y = f(x) > stuff.append([y, y]) > > # Using a statement-local name > stuff = [[(f(x) as y), y] for x in range(5)] Honestly, the asymmetry in [(f(x) as y), y] makes this the *least* readable option to me :-( All of the other options clearly show that the 2 elements of the list are the same, but the statement-local name version requires me to stop and think to confirm that it's a list of 2 copies of the same value. > If calling `f(x)` is expensive or has side effects, the clean operation of > the list comprehension gets muddled. Using a short-duration name binding > retains the simplicity; while the extra `for` loop does achieve this, it > does so at the cost of dividing the expression visually, putting the named > part at the end of the comprehension instead of the beginning. "retains the simplicity" is subjective. I'd prefer something like "makes it clear that f(x) is called only once". Of course, all of the other options to this too, the main question is whether it's as clear as in the named subexpression version. > Statement-local name bindings can be used in any context, but should be > avoided where regular assignment can be used, just as `lambda` should be > avoided when `def` is an option. > > > Open questions > ============== > > 1. What happens if the name has already been used? `(x, (1 as x), x)` > Currently, prior usage functions as if the named expression did not > exist (following the usual lookup rules); the new name binding will > shadow the other name from the point where it is evaluated until the > end of the statement. Is this acceptable? Should it raise a syntax > error or warning? IMO, relying on evaluation order is the only viable option, but it's confusing. I would immediately reject something like `(x, (1 as x), x)` as bad style, simply because the meaning of x at the two places it is used is non-obvious. I'm -1 on a warning. I'd prefer an error, but I can't see how you'd implement (or document) it. > 2. The current implementation [1] implements statement-local names using > a special (and mostly-invisible) name mangling. 
This works perfectly > inside functions (including list comprehensions), but not at top > level. Is this a serious limitation? Is it confusing? I'm strongly -1 on "works like the current implementation" as a definition of the behaviour. While having a proof of concept to clarify behaviour is great, my first question would be "how is the behaviour documented to work?" So what is the PEP proposing would happen if I did if ('.'.join((str(x) for x in sys.version_info[:2])) as ver) == '3.6': # Python 3.6 specific code here elif sys.version_info[0] < 3: print(f"Version {ver} is not supported") at the top level of a Python file? To me, that's a perfectly reasonable way of using the new feature to avoid exposing a binding for "ver" in my module... > 3. The interaction with locals() is currently[1] slightly buggy. Should > statement-local names appear in locals() while they are active (and > shadow any other names from the same function), or should they simply > not appear? > > 4. Syntactic confusion in `except` statements. While technically > unambiguous, it is potentially confusing to humans. In Python 3.7, > parenthesizing `except (Exception as e):` is illegal, and there is no > reason to capture the exception type (as opposed to the exception > instance, as is done by the regular syntax). Should this be made > outright illegal, to prevent confusion? Can it be left to linters? Wait - except (Exception as e): would set e to the type Exception, and not capture the actual exception object? Even though that's unambiguous, it's *incredibly* confusing. But saying we allow "except " is also bad, in the sense that it's an annoying exception (sic) to have to include in the documentation. Maybe it would be viable to say that a (top-level) expression can never be a name binding - after all, there's no point because the name will be immediately discarded. Only allow name bindings in subexpressions. That would disallow this case as part of that general rule. But I've no idea what that rule would do to the grammar (in particular, whether it would still be possible to parse without lookahead). (Actually no, this would prohibit constructs such as `while (f(x) as val) > 0`, which I presume you're trying to support[1], although you don't mention this in the rationale or example usage sections). [1] Based on the fact that you want the name binding to remain active for the enclosing *statement*, not just the enclosing *expression*. > 5. Similar confusion in `with` statements, with the difference that there > is good reason to capture the result of an expression, and it is also > very common for `__enter__` methods to return `self`. In many cases, > `with expr as name:` will do the same thing as `with (expr as name):`, > adding to the confusion. This seems to imply that the name in (expr as name) when used as a top level expression will persist after the closing parenthesis. Is that intended? It's not mentioned anywhere in the PEP (that I could see). On re-reading, I see that you say "for the remainder of the current *statement*" and not (as I had misread it) the remainder of the current *expression*. So multi-line statements will give it a larger scope? That strikes me as giving this proposal a much wider applicability than implied by the summary. Consider def f(x, y=(object() as default_value)): if y is default_value: print("You did not supply y") That's an "obvious" use of the new feature, and I could see it very quickly becoming the standard way to define sentinel values. 
I *think* it's a reasonable idiom, but honestly, I'm not sure. It certainly feels like scope creep from the original use case, which was naming bits of list comprehensions. Overall, it feels like the semantics of having the name bindings persist for the enclosing *statement* rather than the enclosing *expression* is a significant extension of the scope of the proposal, to the extent that the actual use cases that it would allow are mostly not mentioned in the rationale, and using it in comprehensions becomes merely yet another suboptimal solution to the problem of reusing calculated values in comprehensions!!! So IMO, either the semantics should be reduced to exposing the binding to just the enclosing top-level expression, or the rationale, use cases and examples should be significantly beefed up to reflect the much wider applicability of the feature (and then you'd need to address any questions that arise from that wider scope). Paul From rosuav at gmail.com Wed Feb 28 08:45:59 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 1 Mar 2018 00:45:59 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: On Wed, Feb 28, 2018 at 10:49 PM, Paul Moore wrote: > On 27 February 2018 at 22:27, Chris Angelico wrote: >> This is a suggestion that comes up periodically here or on python-dev. >> This proposal introduces a way to bind a temporary name to the value >> of an expression, which can then be used elsewhere in the current >> statement. >> >> The nicely-rendered version will be visible here shortly: >> >> https://www.python.org/dev/peps/pep-0572/ >> >> ChrisA >> >> PEP: 572 >> Title: Syntax for Statement-Local Name Bindings >> Author: Chris Angelico >> Status: Draft >> Type: Standards Track >> Content-Type: text/x-rst >> Created: 28-Feb-2018 >> Python-Version: 3.8 >> Post-History: 28-Feb-2018 > > Thanks for writing this - as you mention, this will be a useful > document even if the proposal ultimately gets rejected. > > FWIW, I'm currently -1 on this, so "rejected" is what I'm expecting, > but it's possible that subsequent discussions could refine this into > something that people are happy with (or that I'm in the minority, of > course :-)) TBH, I'm no more than +0.5 on the proposal being accepted, but a very strong +1 on it being all written up in a PEP :) >> Abstract >> ======== >> >> Programming is all about reusing code rather than duplicating it. When >> an expression needs to be used twice in quick succession but never again, >> it is convenient to assign it to a temporary name with very small scope. >> By permitting name bindings to exist within a single statement only, we >> make this both convenient and safe against collisions. > > In the context of multi-line statements like with, or while, > describing this as a "very small scope" is inaccurate (and maybe even > misleading). See later for an example using def. True. When I first wrote that paragraph, I wasn't thinking of compound statements at all. They are, however, a powerful use of the new syntax IMO. I'll remove the word "very" from there; "small scope" is still the intention though, as the point of this is to be a subscope within a (larger) function scope. >> Rationale >> ========= >> >> When an expression is used multiple times in a list comprehension, there >> are currently several suboptimal ways to spell this, and no truly good >> ways. A statement-local name allows any expression to be temporarily >> captured and then used multiple times. 
> > I agree with Rob Cliffe, this is a bit overstated. > > How about "there are currently several ways to spell this, none of > which is universally accepted as ideal". Specifically, the point to me > is that opinions are (strongly) divided, rather than that everyone > agrees that it's a problem but we haven't found a good solution yet. I can run with that wording. Thanks. >> Syntax and semantics >> ==================== >> >> In any context where arbitrary Python expressions can be used, a named >> expression can appear. This must be parenthesized for clarity, and is of >> the form `(expr as NAME)` where `expr` is any valid Python expression, >> and `NAME` is a simple name. > > I agree with requiring parentheses - although I dislike the tendency > towards "mandatory parentheses" in recent syntax proposals :-( But "1 > as x + 1 as y" is an abomination. Conversely "sqrt((1 as x))" is a > little annoying. So it feels like a case of the lesser of two evils to > me, rather than an actually good idea... The sqrt example could be changed in the future (either before or after the PEP's acceptance). It's like a genexp - parens mandatory but function calls are special-cased. >> The value of such a named expression is the same as the incorporated >> expression, with the additional side-effect that NAME is bound to that >> value for the remainder of the current statement. > > While there's basically no justification for doing so, it should be > noted that under this proposal, ((((((((1 as x) as y) as z) as w) as > v) as u) as t) as s) is valid. Of course, "you can write confusing > code using this" isn't an argument against a useful enhancement, but > potential for abuse is something to be aware of. There's also > (slightly more realistically) something like [(sqrt((b*b as bsq) + > (4*a*c as fourac)) as root1), (sqrt(bsq - fourac) as root2)], which I > can see someone thinking is a good idea! Sure! Though I'm not sure what you're representing there; it looks almost, but not quite, like the quadratic formula. If that was the intention, I'd be curious to see the discriminant broken out, with some kind of trap for the case where it's negative. >> Example usage >> ============= >> >> These list comprehensions are all approximately equivalent:: >> >> # Calling the function twice >> stuff = [[f(x), f(x)] for x in range(5)] >> >> # Helper function >> def pair(value): return [value, value] >> stuff = [pair(f(x)) for x in range(5)] >> >> # Inline helper function >> stuff = [(lambda v: [v,v])(f(x)) for x in range(5)] >> >> # Extra 'for' loop - see also Serhiy's optimization >> stuff = [[y, y] for x in range(5) for y in [f(x)]] >> >> # Expanding the comprehension into a loop >> stuff = [] >> for x in range(5): >> y = f(x) >> stuff.append([y, y]) >> >> # Using a statement-local name >> stuff = [[(f(x) as y), y] for x in range(5)] > > Honestly, the asymmetry in [(f(x) as y), y] makes this the *least* > readable option to me :-( All of the other options clearly show that > the 2 elements of the list are the same, but the statement-local name > version requires me to stop and think to confirm that it's a list of 2 > copies of the same value. I need some real-world examples where it's not as trivial as [y, y] so people don't get hung up on the symmetry issue. >> Open questions >> ============== >> >> 1. What happens if the name has already been used? 
`(x, (1 as x), x)` >> Currently, prior usage functions as if the named expression did not >> exist (following the usual lookup rules); the new name binding will >> shadow the other name from the point where it is evaluated until the >> end of the statement. Is this acceptable? Should it raise a syntax >> error or warning? > > IMO, relying on evaluation order is the only viable option, but it's > confusing. I would immediately reject something like `(x, (1 as x), > x)` as bad style, simply because the meaning of x at the two places it > is used is non-obvious. > > I'm -1 on a warning. I'd prefer an error, but I can't see how you'd > implement (or document) it. Sure. For now, I'm just going to leave it as a perfectly acceptable use of the feature; it can be rejected as poor style, but permitted by the language. >> 2. The current implementation [1] implements statement-local names using >> a special (and mostly-invisible) name mangling. This works perfectly >> inside functions (including list comprehensions), but not at top >> level. Is this a serious limitation? Is it confusing? > > I'm strongly -1 on "works like the current implementation" as a > definition of the behaviour. While having a proof of concept to > clarify behaviour is great, my first question would be "how is the > behaviour documented to work?" So what is the PEP proposing would > happen if I did > > if ('.'.join((str(x) for x in sys.version_info[:2])) as ver) == '3.6': > # Python 3.6 specific code here > elif sys.version_info[0] < 3: > print(f"Version {ver} is not supported") > > at the top level of a Python file? To me, that's a perfectly > reasonable way of using the new feature to avoid exposing a binding > for "ver" in my module... I agree, sounds good. I'll reword this to be a limitation of implementation. >> 4. Syntactic confusion in `except` statements. While technically >> unambiguous, it is potentially confusing to humans. In Python 3.7, >> parenthesizing `except (Exception as e):` is illegal, and there is no >> reason to capture the exception type (as opposed to the exception >> instance, as is done by the regular syntax). Should this be made >> outright illegal, to prevent confusion? Can it be left to linters? > > Wait - except (Exception as e): would set e to the type Exception, and > not capture the actual exception object? Correct. The expression "Exception" evaluates to the type Exception, and you can capture that. It's a WutFace moment but it's a logical consequence of the nature of Python. > Even though that's > unambiguous, it's *incredibly* confusing. But saying we allow "except > " is also bad, in the sense that it's an > annoying exception (sic) to have to include in the documentation. Agreed. I don't want to special-case it out; this is something for code review to catch. Fortunately, this would give fairly obvious results - you try to do something with the exception, and you don't actually have an exception object, you have . It's a little more problematic in a "with" block, because it'll often do the same thing. > Maybe it would be viable to say that a (top-level) expression can > never be a name binding - after all, there's no point because the name > will be immediately discarded. Only allow name bindings in > subexpressions. That would disallow this case as part of that general > rule. But I've no idea what that rule would do to the grammar (in > particular, whether it would still be possible to parse without > lookahead). 
(Actually no, this would prohibit constructs such as > `while (f(x) as val) > 0`, which I presume you're trying to > support[1], although you don't mention this in the rationale or > example usage sections). > > [1] Based on the fact that you want the name binding to remain active > for the enclosing *statement*, not just the enclosing *expression*. Not sure what you mean by a "top-level expression", if it's going to disallow 'while (f(x) as val) > 0:'. Can you elaborate? >> 5. Similar confusion in `with` statements, with the difference that there >> is good reason to capture the result of an expression, and it is also >> very common for `__enter__` methods to return `self`. In many cases, >> `with expr as name:` will do the same thing as `with (expr as name):`, >> adding to the confusion. > > This seems to imply that the name in (expr as name) when used as a top > level expression will persist after the closing parenthesis. Is that > intended? It's not mentioned anywhere in the PEP (that I could see). > On re-reading, I see that you say "for the remainder of the current > *statement*" and not (as I had misread it) the remainder of the > current *expression*. Yep. If you have an expression on its own, it's an "expression statement", and the subscope will end at the newline (or the semicolon, if you have one). Inside something larger, it'll persist. > So multi-line statements will give it a larger > scope? That strikes me as giving this proposal a much wider > applicability than implied by the summary. Consider > > def f(x, y=(object() as default_value)): > if y is default_value: > print("You did not supply y") > > That's an "obvious" use of the new feature, and I could see it very > quickly becoming the standard way to define sentinel values. I *think* > it's a reasonable idiom, but honestly, I'm not sure. It certainly > feels like scope creep from the original use case, which was naming > bits of list comprehensions. Eeeuaghh.... okay. Now I gotta think about this one. The 'def' statement is an executable one, to be sure; but the statement doesn't include *running* the function, only *creating* it. So as you construct the function, default_value has that value. Inside the actual running of it, that name doesn't exist any more. So this won't actually work. But you could use this to create annotations and such, I guess... >>> def f(): ... def g(x=(object() as default_value)) -> default_value: ... ... ... return g ... >>> f().__annotations__ {'return': } >>> dis.dis(f) 2 0 LOAD_GLOBAL 0 (object) 2 CALL_FUNCTION 0 4 DUP_TOP 6 STORE_FAST 0 (default_value) 8 BUILD_TUPLE 1 10 LOAD_FAST 0 (default_value) 12 LOAD_CONST 1 (('return',)) 14 BUILD_CONST_KEY_MAP 1 16 LOAD_CONST 2 (", line 2>) 18 LOAD_CONST 3 ('f..g') 20 MAKE_FUNCTION 5 22 STORE_FAST 1 (g) 24 DELETE_FAST 0 (default_value) 4 26 LOAD_FAST 1 (g) 28 RETURN_VALUE Disassembly of ", line 2>: 3 0 LOAD_CONST 0 (None) 2 RETURN_VALUE Not terribly useful. > Overall, it feels like the semantics of having the name bindings > persist for the enclosing *statement* rather than the enclosing > *expression* is a significant extension of the scope of the proposal, > to the extent that the actual use cases that it would allow are mostly > not mentioned in the rationale, and using it in comprehensions becomes > merely yet another suboptimal solution to the problem of reusing > calculated values in comprehensions!!! 
> > So IMO, either the semantics should be reduced to exposing the binding > to just the enclosing top-level expression, or the rationale, use > cases and examples should be significantly beefed up to reflect the > much wider applicability of the feature (and then you'd need to > address any questions that arise from that wider scope). I'll add some more examples. I think the if/while usage is potentially of value. Thanks for the feedback! Keep it coming! :) ChrisA From storchaka at gmail.com Wed Feb 28 08:49:05 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 28 Feb 2018 15:49:05 +0200 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: 28.02.18 00:27, Chris Angelico ????: > Example usage > ============= > > These list comprehensions are all approximately equivalent:: > > # Calling the function twice > stuff = [[f(x), f(x)] for x in range(5)] The simplest equivalent of [f(x), f(x)] is [f(x)]*2. It would be worth to use less trivial example, e.g. f(x) + x/f(x). > # Helper function > def pair(value): return [value, value] > stuff = [pair(f(x)) for x in range(5)] > > # Inline helper function > stuff = [(lambda v: [v,v])(f(x)) for x in range(5)] > > # Extra 'for' loop - see also Serhiy's optimization > stuff = [[y, y] for x in range(5) for y in [f(x)]] > > # Expanding the comprehension into a loop > stuff = [] > for x in range(5): > y = f(x) > stuff.append([y, y]) > > # Using a statement-local name > stuff = [[(f(x) as y), y] for x in range(5)] Other options: stuff = [[y, y] for y in (f(x) for x in range(5))] g = (f(x) for x in range(5)) stuff = [[y, y] for y in g] def g(): for x in range(5): y = f(x) yield [y, y] stuff = list(g) Seems the two last options are generally considered the most Pythonic. map() and itertools can be helpful in particular cases. From rosuav at gmail.com Wed Feb 28 09:06:32 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 1 Mar 2018 01:06:32 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: On Thu, Mar 1, 2018 at 12:49 AM, Serhiy Storchaka wrote: > 28.02.18 00:27, Chris Angelico ????: >> >> Example usage >> ============= >> >> These list comprehensions are all approximately equivalent:: >> >> # Calling the function twice >> stuff = [[f(x), f(x)] for x in range(5)] > > > The simplest equivalent of [f(x), f(x)] is [f(x)]*2. It would be worth to > use less trivial example, e.g. f(x) + x/f(x). Sure, I'll go with that. >> # Expanding the comprehension into a loop >> stuff = [] >> for x in range(5): >> y = f(x) >> stuff.append([y, y]) >> > Other options: > > g = (f(x) for x in range(5)) > stuff = [[y, y] for y in g] That's the same as the one-liner, but with the genexp broken out. Not sure it helps much as examples go? > def g(): > for x in range(5): > y = f(x) > yield [y, y] > stuff = list(g) You're not the first to mention this, but I thought it basically equivalent to the "expand into a loop" form. Is it really beneficial to expand it, not just into a loop, but into a generator function that contains a loop? 
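For concreteness, a minimal runnable sketch of the two forms being
compared here, using a stand-in f and the less trivial element
f(x) + x/f(x) suggested above (note that the generator function has
to be called before being handed to list()):

    def f(x):
        return x*x + 1   # stand-in for an expensive function

    # Expanded into a plain loop: x and y stay bound after the loop.
    stuff = []
    for x in range(5):
        y = f(x)
        stuff.append(y + x/y)

    # Expanded into a generator function: nothing leaks into the
    # enclosing scope, and the result can also be consumed lazily.
    def gen():
        for x in range(5):
            y = f(x)
            yield y + x/y

    stuff2 = list(gen())   # the generator function must be called
    assert stuff == stuff2
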
ChrisA From kirillbalunov at gmail.com Wed Feb 28 09:14:27 2018 From: kirillbalunov at gmail.com (Kirill Balunov) Date: Wed, 28 Feb 2018 17:14:27 +0300 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: 2018-02-28 16:49 GMT+03:00 Serhiy Storchaka : > 28.02.18 00:27, Chris Angelico ????: > >> Example usage >> ============= >> >> These list comprehensions are all approximately equivalent:: >> >> # Calling the function twice >> stuff = [[f(x), f(x)] for x in range(5)] >> > > Other options: > > stuff = [[y, y] for y in (f(x) for x in range(5))] > Why not `stuff = [[y, y] for y in map(f, range(5))]`? It is currently the fastest and most readable version IMHO. With kind regards, -gdg -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Wed Feb 28 09:32:39 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 28 Feb 2018 16:32:39 +0200 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: 28.02.18 16:14, Kirill Balunov ????: > 2018-02-28 16:49 GMT+03:00 Serhiy Storchaka > >: > Other options: > > ? ? stuff = [[y, y] for y in (f(x) for x in range(5))] > > > Why not `stuff = [[y, y] for y in map(f, range(5))]`? It is currently > the fastest and most readable version IMHO. Only in this particular case. This doesn't work if the expression for items depends on x or if the subexpression is not a function call. From storchaka at gmail.com Wed Feb 28 09:51:55 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 28 Feb 2018 16:51:55 +0200 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: 28.02.18 16:06, Chris Angelico ????: > On Thu, Mar 1, 2018 at 12:49 AM, Serhiy Storchaka wrote: >> Other options: >> >> g = (f(x) for x in range(5)) >> stuff = [[y, y] for y in g] > > That's the same as the one-liner, but with the genexp broken out. Not > sure it helps much as examples go? It is more readable. But can't be used as an expression. >> def g(): >> for x in range(5): >> y = f(x) >> yield [y, y] >> stuff = list(g) > > You're not the first to mention this, but I thought it basically > equivalent to the "expand into a loop" form. Is it really beneficial > to expand it, not just into a loop, but into a generator function that > contains a loop? It is slightly faster (if the list is not too small). It doesn't leak a temporary variable after loop. And in many cases you don't need a list, an iterator would work as well. In these cases it is easy to just drop calling list(). From rosuav at gmail.com Wed Feb 28 09:56:21 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 1 Mar 2018 01:56:21 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: On Thu, Mar 1, 2018 at 1:51 AM, Serhiy Storchaka wrote: > 28.02.18 16:06, Chris Angelico ????: >> >> On Thu, Mar 1, 2018 at 12:49 AM, Serhiy Storchaka >> wrote: >>> >>> Other options: >>> >>> g = (f(x) for x in range(5)) >>> stuff = [[y, y] for y in g] >> >> >> That's the same as the one-liner, but with the genexp broken out. Not >> sure it helps much as examples go? > > > It is more readable. But can't be used as an expression. > >>> def g(): >>> for x in range(5): >>> y = f(x) >>> yield [y, y] >>> stuff = list(g) >> >> >> You're not the first to mention this, but I thought it basically >> equivalent to the "expand into a loop" form. 
Is it really beneficial >> to expand it, not just into a loop, but into a generator function that >> contains a loop? > > > It is slightly faster (if the list is not too small). It doesn't leak a > temporary variable after loop. And in many cases you don't need a list, an > iterator would work as well. In these cases it is easy to just drop calling > list(). Doesn't leak a temporary? In Python 3, the list comp won't leak anything, but the function is itself a temporary variable with permanent scope. You're right about the generator being sufficient at times, but honestly, if we're going to say "maybe you don't need the same result", then all syntax questions go out the window :D ChrisA From ethan at stoneleaf.us Wed Feb 28 10:10:43 2018 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 28 Feb 2018 07:10:43 -0800 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: <54c90d32-ef5f-229d-28d8-2e3433eadf0b@btinternet.com> Message-ID: <5A96C673.4070804@stoneleaf.us> On 02/27/2018 09:23 PM, Chris Angelico wrote: > On Wed, Feb 28, 2018 at 2:47 PM, Rob Cliffe via Python-ideas wrote: >> And here's a thought: What are the semantics of >> a = (42 as a) # Of course a linter should point this out too >> At first I thought this was also a laborious synonym for "a=42". But then I >> re-read your statement (the one I described above as crystal-clear) and >> realised that its exact wording was even more critical than I had thought: >> "the new name binding will shadow the other name from the point where it >> is evaluated until the end of the statement" >> Note: "until the end of the statement". NOT "until the end of the >> expression". The distinction matters. >> If we take this as gospel, all this will do is create a temporary variable >> "a", assign the value 42 to it twice, then discard it. I.e. it effectively >> does nothing, slowly. >> Have I understood correctly? Very likely you have considered this and mean >> exactly what you say, but I am sure you will understand that I mean no >> offence by querying it. > > Actually, that's a very good point, and I had to actually go and do > that to confirm. You're correct that the "a =" part is also affected, > but there may be more complicated edge cases. Disassembly can help > track down what the compiler's actually doing: > >>>> def f(): > ... a = 1 > ... a = (2 as a) > ... print(a) > ... >>>> dis.dis(f) > 2 0 LOAD_CONST 1 (1) > 2 STORE_FAST 0 (a) > > 3 4 LOAD_CONST 2 (2) > 6 DUP_TOP > 8 STORE_FAST 1 (a) > 10 STORE_FAST 1 (a) > 12 DELETE_FAST 1 (a) > > 4 14 LOAD_GLOBAL 0 (print) > 16 LOAD_FAST 0 (a) > 18 CALL_FUNCTION 1 > 20 POP_TOP > 22 LOAD_CONST 0 (None) > 24 RETURN_VALUE > > If you're not familiar with the output of dis.dis(), the first column > (largely blank) is line numbers in the source, the second is byte code > offsets, and then we have the operation and its parameter (if any). > The STORE_FAST and LOAD_FAST opcodes work with local names, which are > identified by their indices; the first such operation sets slot 0 > (named "a"), but the two that happen in line 3 (byte positions 8 and > 10) are manipulating slot 1 (also named "a"). So you can see that line > 3 never touches slot 0, and it is entirely operating within the SLNB > scope. dis.dis may be great, but so is running the function so everyone can see the output. ;) If I understood your explanation, `print(a)` produces `1` ? 
That seems wrong -- the point of statement-local name bindings is twofold: - give a name to a value - evaluate to that value Which is why your first example works: stuff = [[(f(x) as y), y] for x in range(5)] (f(x) as y), y evaluates as f(x), and also assigns that result to y, so in a = (2 as a) there is a temporary variable 'a', which gets assigned 2, and the SLNB is evaluated as 2, which should then get assigned back to the local variable 'a'. In other words, the final print from `f()` above should be 2, not 1. (Slightly different names would help avoid confusion when referencing different locations of the PEP.) -- ~Ethan~ From storchaka at gmail.com Wed Feb 28 10:16:09 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 28 Feb 2018 17:16:09 +0200 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: 28.02.18 16:56, Chris Angelico ????: >>>> def g(): >>>> for x in range(5): >>>> y = f(x) >>>> yield [y, y] >>>> stuff = list(g) >>> >>> >>> You're not the first to mention this, but I thought it basically >>> equivalent to the "expand into a loop" form. Is it really beneficial >>> to expand it, not just into a loop, but into a generator function that >>> contains a loop? >> >> >> It is slightly faster (if the list is not too small). It doesn't leak a >> temporary variable after loop. And in many cases you don't need a list, an >> iterator would work as well. In these cases it is easy to just drop calling >> list(). > > Doesn't leak a temporary? In Python 3, the list comp won't leak > anything, but the function is itself a temporary variable with > permanent scope. You're right about the generator being sufficient at > times, but honestly, if we're going to say "maybe you don't need the > same result", then all syntax questions go out the window :D Explicit for loop leaks variables x and y after the loop. They can hold references to large objects. The generator function itself doesn't hold references to the proceeded data. From rosuav at gmail.com Wed Feb 28 10:18:45 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 1 Mar 2018 02:18:45 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: <5A96C673.4070804@stoneleaf.us> References: <54c90d32-ef5f-229d-28d8-2e3433eadf0b@btinternet.com> <5A96C673.4070804@stoneleaf.us> Message-ID: On Thu, Mar 1, 2018 at 2:10 AM, Ethan Furman wrote: > dis.dis may be great, but so is running the function so everyone can see the > output. ;) Oh, sorry. >>> f() 1 > If I understood your explanation, `print(a)` produces `1` ? That seems > wrong -- the point of statement-local name bindings is twofold: > > - give a name to a value > - evaluate to that value > > Which is why your first example works: > > stuff = [[(f(x) as y), y] for x in range(5)] > > (f(x) as y), y > > evaluates as f(x), and also assigns that result to y, so in > > a = (2 as a) > > there is a temporary variable 'a', which gets assigned 2, and the SLNB is > evaluated as 2, which should then get assigned back to the local variable > 'a'. In other words, the final print from `f()` above should be 2, not 1. > (Slightly different names would help avoid confusion when referencing > different locations of the PEP.) Except that assignment is evaluated RHS before LHS as part of a single statement. When Python goes to look up the name "a" to store it (as the final step of the assignment), the SLNB is still active (it's still the same statement - note that this is NOT expression-local), so it uses the temporary. 
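A rough, runnable desugaring of that behaviour, using an ordinary
(speakable) name _slnb_a as a stand-in for the mangled statement-local
slot, mirrors the disassembly above: both stores on line 3 hit the
temporary slot, and the ordinary local is never rebound.

    def f():
        a = 1              # the ordinary local; slot 0 in the disassembly
        _slnb_a = 2        # "(2 as a)" binds the statement-local name
        _slnb_a = _slnb_a  # the outer "a = ..." store also resolves to it
        del _slnb_a        # the SLNB is unbound once the statement ends
        print(a)           # -> 1: the ordinary local was never touched

    f()
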
Honestly, though, it's like writing "a = a++" in C, and then being confused by the result. Why are you using the same name in two assignments? Normal code shouldn't do this. :) ChrisA From rosuav at gmail.com Wed Feb 28 10:19:12 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 1 Mar 2018 02:19:12 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: On Thu, Mar 1, 2018 at 2:16 AM, Serhiy Storchaka wrote: > 28.02.18 16:56, Chris Angelico ????: >>>>> >>>>> def g(): >>>>> for x in range(5): >>>>> y = f(x) >>>>> yield [y, y] >>>>> stuff = list(g) >>>> >>>> >>>> >>>> You're not the first to mention this, but I thought it basically >>>> equivalent to the "expand into a loop" form. Is it really beneficial >>>> to expand it, not just into a loop, but into a generator function that >>>> contains a loop? >>> >>> >>> >>> It is slightly faster (if the list is not too small). It doesn't leak a >>> temporary variable after loop. And in many cases you don't need a list, >>> an >>> iterator would work as well. In these cases it is easy to just drop >>> calling >>> list(). >> >> >> Doesn't leak a temporary? In Python 3, the list comp won't leak >> anything, but the function is itself a temporary variable with >> permanent scope. You're right about the generator being sufficient at >> times, but honestly, if we're going to say "maybe you don't need the >> same result", then all syntax questions go out the window :D > > > Explicit for loop leaks variables x and y after the loop. They can hold > references to large objects. The generator function itself doesn't hold > references to the proceeded data. > Oh, gotcha. Yeah. Will add that as another example. ChrisA From storchaka at gmail.com Wed Feb 28 10:20:49 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 28 Feb 2018 17:20:49 +0200 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com> Message-ID: 28.02.18 08:52, Chris Angelico ????: > A new version of the PEP has been pushed, and should be live within a > few minutes. > > https://www.python.org/dev/peps/pep-0572/ > > Whatever I've missed, do please let me know. This document should end > up incorporating, or at least mentioning, all of the proposals you > cited. I have left comments on https://github.com/python/peps/commit/2cd352673896e84c4d30f22d0829fae65e253e85. Not sure that they are visible to you. From rosuav at gmail.com Wed Feb 28 10:24:08 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 1 Mar 2018 02:24:08 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com> Message-ID: On Thu, Mar 1, 2018 at 2:20 AM, Serhiy Storchaka wrote: > 28.02.18 08:52, Chris Angelico ????: >> >> A new version of the PEP has been pushed, and should be live within a >> few minutes. >> >> https://www.python.org/dev/peps/pep-0572/ >> >> Whatever I've missed, do please let me know. This document should end >> up incorporating, or at least mentioning, all of the proposals you >> cited. > > > I have left comments on > https://github.com/python/peps/commit/2cd352673896e84c4d30f22d0829fae65e253e85. > Not sure that they are visible to you. > They are, thanks. I've responded to them all, either inline in the GitHub commit notes, and/or by pushing a change that fixes the issue. Appreciated. 
ChrisA From ethan at stoneleaf.us Wed Feb 28 10:25:55 2018 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 28 Feb 2018 07:25:55 -0800 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com> Message-ID: <5A96CA03.1040501@stoneleaf.us> On 02/28/2018 02:43 AM, Chris Angelico wrote: > On Wed, Feb 28, 2018 at 8:04 PM, Robert Vanden Eynde wrote: >> 3) "C problem that an equals sign in an expression can now create a name >> binding, rather than performing a comparison." The "=" does variable >> assignement already, and there is no grammar problem of "=" vs "==" because >> the "with" keyword is used in the expression, therefore "with a == ..." is a >> SyntaxError whereas "where a = ..." is alright (See grammar in thektulu >> implemention of "where"). > > Yes, but in Python, "=" does variable assignment *as a statement*. In > C, you can do this: > > while (ch = getch()) > do_something_with(ch) > > That's an assignment in an arbitrary condition, and that's a bug > magnet. You cannot do that in Python. You cannot simply miss out one > equals sign and have legal code that does what you don't want. With my > proposed syntax, you'll be able to do this: > > while (getch() as ch): > ... > > There's no way that you could accidentally write this when you really > wanted to compare against the character. Given the current (posted) proposal, wouldn't 'ch' evaporate before the ':' and be unavailable in the 'while' body? -- ~Ethan~ From rosuav at gmail.com Wed Feb 28 10:36:14 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 1 Mar 2018 02:36:14 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: <5A96CA03.1040501@stoneleaf.us> References: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com> <5A96CA03.1040501@stoneleaf.us> Message-ID: On Thu, Mar 1, 2018 at 2:25 AM, Ethan Furman wrote: > On 02/28/2018 02:43 AM, Chris Angelico wrote: >> >> On Wed, Feb 28, 2018 at 8:04 PM, Robert Vanden Eynde wrote: > > >>> 3) "C problem that an equals sign in an expression can now create a name >>> binding, rather than performing a comparison." The "=" does variable >>> assignement already, and there is no grammar problem of "=" vs "==" >>> because >>> the "with" keyword is used in the expression, therefore "with a == ..." >>> is a >>> SyntaxError whereas "where a = ..." is alright (See grammar in thektulu >>> implemention of "where"). >> >> >> Yes, but in Python, "=" does variable assignment *as a statement*. In >> C, you can do this: >> >> while (ch = getch()) >> do_something_with(ch) >> >> That's an assignment in an arbitrary condition, and that's a bug >> magnet. You cannot do that in Python. You cannot simply miss out one >> equals sign and have legal code that does what you don't want. With my >> proposed syntax, you'll be able to do this: >> >> while (getch() as ch): >> ... >> >> There's no way that you could accidentally write this when you really >> wanted to compare against the character. > > > Given the current (posted) proposal, wouldn't 'ch' evaporate before the ':' > and be unavailable in the 'while' body? > Not my proposal. Others have suggested various forms of *expression* local name bindings, but the definition in the PEP is *statement* local. So it is indeed available. 
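For comparison, one runnable way to get the same shape today, when the
loop ends on a specific sentinel value such as an empty string, is
iter() with a sentinel; read_char below is just a hypothetical stand-in
for a getch-like callable:

    import io

    stream = io.StringIO("spam")

    def read_char():
        return stream.read(1)    # returns '' once the input is exhausted

    # Same shape as the proposed `while (read_char() as ch):` loop; the
    # current character is bound and usable throughout the body.
    for ch in iter(read_char, ''):
        print(ch)
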
ChrisA From p.f.moore at gmail.com Wed Feb 28 10:46:21 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 28 Feb 2018 15:46:21 +0000 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: <54c90d32-ef5f-229d-28d8-2e3433eadf0b@btinternet.com> <5A96C673.4070804@stoneleaf.us> Message-ID: On 28 February 2018 at 15:18, Chris Angelico wrote: >> a = (2 as a) >> >> there is a temporary variable 'a', which gets assigned 2, and the SLNB is >> evaluated as 2, which should then get assigned back to the local variable >> 'a'. In other words, the final print from `f()` above should be 2, not 1. >> (Slightly different names would help avoid confusion when referencing >> different locations of the PEP.) > > Except that assignment is evaluated RHS before LHS as part of a single > statement. When Python goes to look up the name "a" to store it (as > the final step of the assignment), the SLNB is still active (it's > still the same statement - note that this is NOT expression-local), so > it uses the temporary. > > Honestly, though, it's like writing "a = a++" in C, and then being > confused by the result. Why are you using the same name in two > assignments? Normal code shouldn't do this. :) Eww. I can understand the logic here, but this sort of weird gotcha is precisely why people dislike C/C++ and prefer Python. I don't consider it a selling point that this proposal allows Python coders to make the sort of mistakes C coders have suffered from for years. Can you make sure that the PEP includes a section that covers weird behaviours like this as problems with the proposal? I'm happy if you just list them, or even say "while this is a potential issue, the author doesn't think it's a major problem". I just don't think it should be forgotten. Paul From rosuav at gmail.com Wed Feb 28 11:03:40 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 1 Mar 2018 03:03:40 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: <54c90d32-ef5f-229d-28d8-2e3433eadf0b@btinternet.com> <5A96C673.4070804@stoneleaf.us> Message-ID: On Thu, Mar 1, 2018 at 2:46 AM, Paul Moore wrote: > On 28 February 2018 at 15:18, Chris Angelico wrote: > >>> a = (2 as a) >>> >>> there is a temporary variable 'a', which gets assigned 2, and the SLNB is >>> evaluated as 2, which should then get assigned back to the local variable >>> 'a'. In other words, the final print from `f()` above should be 2, not 1. >>> (Slightly different names would help avoid confusion when referencing >>> different locations of the PEP.) >> >> Except that assignment is evaluated RHS before LHS as part of a single >> statement. When Python goes to look up the name "a" to store it (as >> the final step of the assignment), the SLNB is still active (it's >> still the same statement - note that this is NOT expression-local), so >> it uses the temporary. >> >> Honestly, though, it's like writing "a = a++" in C, and then being >> confused by the result. Why are you using the same name in two >> assignments? Normal code shouldn't do this. :) > > Eww. I can understand the logic here, but this sort of weird gotcha is > precisely why people dislike C/C++ and prefer Python. I don't consider > it a selling point that this proposal allows Python coders to make the > sort of mistakes C coders have suffered from for years. > > Can you make sure that the PEP includes a section that covers weird > behaviours like this as problems with the proposal? 
I'm happy if you > just list them, or even say "while this is a potential issue, the > author doesn't think it's a major problem". I just don't think it > should be forgotten. Sure. Ultimately, it's like any other feature: it can be abused in ways that make no sense. You can write a list comprehension where you ignore the end result and work entirely with side effects; you can write threaded code that spawns threads and immediately joins them all; nobody's stopping you. In a non-toy example, assigning to the same name twice in one statement is almost certainly an error for other reasons, so I'm not too bothered by it here. I'll add something to the PEP about execution order. ChrisA From p.f.moore at gmail.com Wed Feb 28 11:30:59 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 28 Feb 2018 16:30:59 +0000 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: On 28 February 2018 at 13:45, Chris Angelico wrote: > On Wed, Feb 28, 2018 at 10:49 PM, Paul Moore wrote: >> While there's basically no justification for doing so, it should be >> noted that under this proposal, ((((((((1 as x) as y) as z) as w) as >> v) as u) as t) as s) is valid. Of course, "you can write confusing >> code using this" isn't an argument against a useful enhancement, but >> potential for abuse is something to be aware of. There's also >> (slightly more realistically) something like [(sqrt((b*b as bsq) + >> (4*a*c as fourac)) as root1), (sqrt(bsq - fourac) as root2)], which I >> can see someone thinking is a good idea! > > Sure! Though I'm not sure what you're representing there; it looks > almost, but not quite, like the quadratic formula. If that was the > intention, I'd be curious to see the discriminant broken out, with > some kind of trap for the case where it's negative. lol, it was meant to be the quadratic roots. If I got it wrong, that probably says something about how hard it is top maintain or write code that over-uses the proposed feature ;-) If I didn't get it wrong, that still makes the same point, I guess! >> Honestly, the asymmetry in [(f(x) as y), y] makes this the *least* >> readable option to me :-( All of the other options clearly show that >> the 2 elements of the list are the same, but the statement-local name >> version requires me to stop and think to confirm that it's a list of 2 >> copies of the same value. > > I need some real-world examples where it's not as trivial as [y, y] so > people don't get hung up on the symmetry issue. Well, I agree real world examples would be a lot more compelling here, but I don't necessarily agree that the asymmetry won't remain an issue. In practice, I've only ever wanted a feature like this when hacking at the command line, or writing *extremely* hacky one-off scripts. So any sort of demonstration of code that exists in a maintained, production codebase which would actually benefit from this feature would be a major advantage here. >>> Open questions >>> ============== >>> >>> 1. What happens if the name has already been used? `(x, (1 as x), x)` >>> Currently, prior usage functions as if the named expression did not >>> exist (following the usual lookup rules); the new name binding will >>> shadow the other name from the point where it is evaluated until the >>> end of the statement. Is this acceptable? Should it raise a syntax >>> error or warning? >> >> IMO, relying on evaluation order is the only viable option, but it's >> confusing. 
I would immediately reject something like `(x, (1 as x), >> x)` as bad style, simply because the meaning of x at the two places it >> is used is non-obvious. >> >> I'm -1 on a warning. I'd prefer an error, but I can't see how you'd >> implement (or document) it. > > Sure. For now, I'm just going to leave it as a perfectly acceptable > use of the feature; it can be rejected as poor style, but permitted by > the language. "Perfectly acceptable" I disagree with. "Unacceptable but impossible to catch in the compiler" is closer to my view. What I'm concerned with is less dealing with code that is written like that (delete it as soon as you see it is the only practical answer ;-)) but rather clearly documenting the feature without either drowning the reader in special cases and clarifications, or leaving important points unspecified or difficult to find the definition for. >>> 2. The current implementation [1] implements statement-local names using >>> a special (and mostly-invisible) name mangling. This works perfectly >>> inside functions (including list comprehensions), but not at top >>> level. Is this a serious limitation? Is it confusing? >> >> I'm strongly -1 on "works like the current implementation" as a >> definition of the behaviour. While having a proof of concept to >> clarify behaviour is great, my first question would be "how is the >> behaviour documented to work?" So what is the PEP proposing would >> happen if I did >> >> if ('.'.join((str(x) for x in sys.version_info[:2])) as ver) == '3.6': >> # Python 3.6 specific code here >> elif sys.version_info[0] < 3: >> print(f"Version {ver} is not supported") >> >> at the top level of a Python file? To me, that's a perfectly >> reasonable way of using the new feature to avoid exposing a binding >> for "ver" in my module... > > I agree, sounds good. I'll reword this to be a limitation of implementation. To put it another way, "Intended to work, but we haven't determined how to implement it yet"? Fair enough, although it needs to be possible to implement it. These names are a weird not-quite-scope construct, and interactions with real scopes are going to be tricky to get right (not just implement, but define). Consider x = 12 if (1 as x) == 1: def foo(): print(x) # Actually, you'd probably get a "Name used before definition" error here. # Would "global x" refer to x=12 or to the statement-local x (1)? # Would "nonlocal x" refer to the statement-local x? x = 13 return x print(foo()) print(x) print(foo()) print(x) What should that return? Not "what does the current implementation return", but what is the intended result according to the proposal, and how would you explain that result in the docs? I think I'd expect 1 13 # But see note about global/nonlocal 12 1 xxxxxxx Not sure? Maybe 1? Can you create a closure over a statement-local variable? 13 # But see note about global/nonlocal 12 The most charitable thing I can say here is that the semantics are currently under-specified in the PEP :-) >>> 4. Syntactic confusion in `except` statements. While technically >>> unambiguous, it is potentially confusing to humans. In Python 3.7, >>> parenthesizing `except (Exception as e):` is illegal, and there is no >>> reason to capture the exception type (as opposed to the exception >>> instance, as is done by the regular syntax). Should this be made >>> outright illegal, to prevent confusion? Can it be left to linters? >> >> Wait - except (Exception as e): would set e to the type Exception, and >> not capture the actual exception object? > > Correct. 
The expression "Exception" evaluates to the type Exception, > and you can capture that. It's a WutFace moment but it's a logical > consequence of the nature of Python. "Logical consequence of the rules" isn't enough for a Python language feature, IMO. Intuitive and easy to infer are key. Even if this is a corner case, it counts as a mildly ugly wart to me. >> Even though that's >> unambiguous, it's *incredibly* confusing. But saying we allow "except >> " is also bad, in the sense that it's an >> annoying exception (sic) to have to include in the documentation. > > Agreed. I don't want to special-case it out; this is something for > code review to catch. Fortunately, this would give fairly obvious > results - you try to do something with the exception, and you don't > actually have an exception object, you have . It's > a little more problematic in a "with" block, because it'll often do > the same thing. I value Python for making it easy to write correct code, not easy to spot your errors. Too many hings like this would start me thinking I should ban statement-local names from codebases I maintain, which is not a good start for a feature... >> Maybe it would be viable to say that a (top-level) expression can >> never be a name binding - after all, there's no point because the name >> will be immediately discarded. Only allow name bindings in >> subexpressions. That would disallow this case as part of that general >> rule. But I've no idea what that rule would do to the grammar (in >> particular, whether it would still be possible to parse without >> lookahead). (Actually no, this would prohibit constructs such as >> `while (f(x) as val) > 0`, which I presume you're trying to >> support[1], although you don't mention this in the rationale or >> example usage sections). >> >> [1] Based on the fact that you want the name binding to remain active >> for the enclosing *statement*, not just the enclosing *expression*. > > Not sure what you mean by a "top-level expression", if it's going to > disallow 'while (f(x) as val) > 0:'. Can you elaborate? Actually, I was wrong. The top level expression in '(f(x) as val) > 0' is the comparison, so the while usage survives. In 'exception (Exception as e)', the top-level expression is '(Exception as e)', so we can ban name bindings in top-level expressions and kill that. But all this feels pretty non-intuitive. Squint hard and you can work out how the rules mean what you think they mean, but it doesn't feel "obvious" (dare I say "Pythonic"?) >> This seems to imply that the name in (expr as name) when used as a top >> level expression will persist after the closing parenthesis. Is that >> intended? It's not mentioned anywhere in the PEP (that I could see). >> On re-reading, I see that you say "for the remainder of the current >> *statement*" and not (as I had misread it) the remainder of the >> current *expression*. > > Yep. If you have an expression on its own, it's an "expression > statement", and the subscope will end at the newline (or the > semicolon, if you have one). Inside something larger, it'll persist. Technically you can have more than one expression in a statement. Consider (from the grammar): for_stmt ::= "for" target_list "in" expression_list ":" suite ["else" ":" suite] expression_list ::= expression ( "," expression )* [","] Would a name binding in the first expression in an expression_list be visible in the second expression? Should it be? 
It will be, because it's visible to the end of the statement, not to the end of the expression, but there might be subtle technical implications here (I haven't checked the grammar to see what other statements allow multiple expressions - that's your job ;-)) To clarify this sort of question, you should probably document in the PEP precisely how the grammar will be modified. >> So multi-line statements will give it a larger >> scope? That strikes me as giving this proposal a much wider >> applicability than implied by the summary. Consider >> >> def f(x, y=(object() as default_value)): >> if y is default_value: >> print("You did not supply y") >> >> That's an "obvious" use of the new feature, and I could see it very >> quickly becoming the standard way to define sentinel values. I *think* >> it's a reasonable idiom, but honestly, I'm not sure. It certainly >> feels like scope creep from the original use case, which was naming >> bits of list comprehensions. > > Eeeuaghh.... okay. Now I gotta think about this one. > > The 'def' statement is an executable one, to be sure; but the > statement doesn't include *running* the function, only *creating* it. > So as you construct the function, default_value has that value. Inside > the actual running of it, that name doesn't exist any more. So this > won't actually work. But you could use this to create annotations and > such, I guess... lol, see? Closures rear their ugly heads, as I mentioned above. >>>> def f(): > ... def g(x=(object() as default_value)) -> default_value: > ... ... > ... return g > ... >>>> f().__annotations__ > {'return': } >>>> dis.dis(f) > 2 0 LOAD_GLOBAL 0 (object) > 2 CALL_FUNCTION 0 > 4 DUP_TOP > 6 STORE_FAST 0 (default_value) > 8 BUILD_TUPLE 1 > 10 LOAD_FAST 0 (default_value) > 12 LOAD_CONST 1 (('return',)) > 14 BUILD_CONST_KEY_MAP 1 > 16 LOAD_CONST 2 ( 0x7fe37d974ae0, file "", line 2>) > 18 LOAD_CONST 3 ('f..g') > 20 MAKE_FUNCTION 5 > 22 STORE_FAST 1 (g) > 24 DELETE_FAST 0 (default_value) > > 4 26 LOAD_FAST 1 (g) > 28 RETURN_VALUE > > Disassembly of ", line 2>: > 3 0 LOAD_CONST 0 (None) > 2 RETURN_VALUE > > Not terribly useful. "What the current proof of concept implementation does" isn't useful anyway, but even ignoring that I'd prefer to see what it *does* rather than what it *compiles to*. But what needs to be documented is what the PEP *proposes* it does. > I'll add some more examples. I think the if/while usage is potentially of value. I think it's an unexpected consequence of an overly-broad solution to the original problem, that accidentally solves another long-running debate. But it means you've opened a much bigger can of worms than it originally appeared, and I'm not sure you don't risk losing the simplicity that *expression* local names might have had. But I can even break expression local names: x = ((lambda: boom()) as boom) x() It's possible that the "boom" is just my head exploding, not the interpreter. But I think I just demonstrated a closure over an expression-local name. For added fun, replace "x" with "boom"... > Thanks for the feedback! Keep it coming! 
:) Ask and you shall receive :-) Paul From ethan at stoneleaf.us Wed Feb 28 12:34:19 2018 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 28 Feb 2018 09:34:19 -0800 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: <5A96E81B.5090008@stoneleaf.us> On 02/27/2018 02:27 PM, Chris Angelico wrote: > PEP: 572 Because: a = (2 as a) does not evaluate to 2 and because some statements, such as 'for' and 'while' can cover many, many lines (leaving plenty of potential for confusion over which variable disappear at the end and which persist): -1 I would love to see an expression assignment syntax, but Statement-Local-Name-Binding is confusing and non-intuitive. Is there already a PEP about expression assignment? If not, that would be a good one to write. -- ~Ethan~ From kirillbalunov at gmail.com Wed Feb 28 13:03:50 2018 From: kirillbalunov at gmail.com (Kirill Balunov) Date: Wed, 28 Feb 2018 21:03:50 +0300 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: The PEP says: > Omitting the parentheses from this PEP's proposed syntax introduces many > syntactic ambiguities. > and: As the name's scope extends to the full current statement, even a block > statement, this can be used to good effect in the header of an if or while > statement Will the `from ... import ... as ... statement be a special case, because currently the following form is valid: from math import (tau as two_pi) With kind regards, -gdg -------------- next part -------------- An HTML attachment was scrubbed... URL: From rob.cliffe at btinternet.com Wed Feb 28 13:09:33 2018 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Wed, 28 Feb 2018 18:09:33 +0000 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: <54c90d32-ef5f-229d-28d8-2e3433eadf0b@btinternet.com> Message-ID: <5de11694-b03b-e986-8b18-d95652b488d0@btinternet.com> On 28/02/2018 05:23, Chris Angelico wrote: > >> If calling `f(x)` is expensive or has side effects, the clean operation of >> the list comprehension gets muddled. Using a short-duration name binding >> retains the simplicity; while the extra `for` loop does achieve this, it >> does so at the cost of dividing the expression visually, putting the named >> part at the end of the comprehension instead of the beginning. >> >> Maybe add to last sentence "and of adding (at least conceptually) extra >> steps: building a 1-element list, then extracting the first element" > That's precisely the point that Serhiy's optimization is aiming at, > with the intention of making "for x in [expr]" a standard idiom for > list comp assignment. If we assume that this does become standard, it > won't add the extra steps, but it does still push that expression out > to the far end of the comprehension, whereas a named subexpression > places it at first use. I understand that creating the list could be avoided *at runtime*. My point was that in trying to *read and understand* ??? ??? stuff = [[y, y] for x in range(5) for y in [f(x)]] the brain must follow the creation and unpacking of the list.? I.e. this is an extra cost of this particular construction. > >> And here's a thought: What are the semantics of >> a = (42 as a) # Of course a linter should point this out too >> At first I thought this was also a laborious synonym for "a=42". 
But then I >> re-read your statement (the one I described above as crystal-clear) and >> realised that its exact wording was even more critical than I had thought: >> "the new name binding will shadow the other name from the point where it >> is evaluated until the end of the statement" >> Note: "until the end of the statement". NOT "until the end of the >> expression". The distinction matters. >> If we take this as gospel, all this will do is create a temporary variable >> "a", assign the value 42 to it twice, then discard it. I.e. it effectively >> does nothing, slowly. >> Have I understood correctly? Very likely you have considered this and mean >> exactly what you say, but I am sure you will understand that I mean no >> offence by querying it. > Actually, that's a very good point, and I had to actually go and do > that to confirm. You're correct that the "a =" part is also affected, > but there may be more complicated edge cases. I have read this thread so far - I can't say I have absorbed and understood it all, but I am left with a feeling that Expression-Local-Name-Bindings would be preferable to Statement-Local-Name_Bindings, so that the temporary variable wouldn't apply to the LHS in the above example.? I realise that this is a vague statement that needs far more definition, but - hand-waving for now - do you think it would be difficult to change the implementation accordingly? Rob Cliffe From brenbarn at brenbarn.net Wed Feb 28 14:35:18 2018 From: brenbarn at brenbarn.net (Brendan Barnwell) Date: Wed, 28 Feb 2018 11:35:18 -0800 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: <54c90d32-ef5f-229d-28d8-2e3433eadf0b@btinternet.com> <5A96C673.4070804@stoneleaf.us> Message-ID: <5A970476.3000101@brenbarn.net> On 2018-02-28 07:18, Chris Angelico wrote: > Except that assignment is evaluated RHS before LHS as part of a single > statement. When Python goes to look up the name "a" to store it (as > the final step of the assignment), the SLNB is still active (it's > still the same statement - note that this is NOT expression-local), so > it uses the temporary. Wait, so you're saying that if I do a = (2 as a) The "a = " assignment assigns to the SLNB, and so is then discarded after the statement finishes? That seems very bad to me. If there are SLNBs with this special "as" syntax, I think the ONLY way to assign to an SLNB should be with the "as" syntax. You shouldn't be able to assign to an SLNB with regular assignment syntax, even if you created an SNLB with the same name as the LHS within the RHS. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown From rosuav at gmail.com Wed Feb 28 14:41:10 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 1 Mar 2018 06:41:10 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: On Thu, Mar 1, 2018 at 3:30 AM, Paul Moore wrote: > On 28 February 2018 at 13:45, Chris Angelico wrote: >> On Wed, Feb 28, 2018 at 10:49 PM, Paul Moore wrote: > >>> While there's basically no justification for doing so, it should be >>> noted that under this proposal, ((((((((1 as x) as y) as z) as w) as >>> v) as u) as t) as s) is valid. Of course, "you can write confusing >>> code using this" isn't an argument against a useful enhancement, but >>> potential for abuse is something to be aware of. 
There's also >>> (slightly more realistically) something like [(sqrt((b*b as bsq) + >>> (4*a*c as fourac)) as root1), (sqrt(bsq - fourac) as root2)], which I >>> can see someone thinking is a good idea! >> >> Sure! Though I'm not sure what you're representing there; it looks >> almost, but not quite, like the quadratic formula. If that was the >> intention, I'd be curious to see the discriminant broken out, with >> some kind of trap for the case where it's negative. > > lol, it was meant to be the quadratic roots. If I got it wrong, that > probably says something about how hard it is top maintain or write > code that over-uses the proposed feature ;-) If I didn't get it wrong, > that still makes the same point, I guess! Or, more likely, it says something about what happens when a programmer bashes out some code to try to represent a famous formula, but doesn't actually debug it. As is often said, code that isn't tested is buggy :) Here's another equally untested piece of code: [(-b + (sqrt(b*b - 4*a*c) as disc)) / (2*a), (-b - disc) / (2*a)] >>>> Open questions >>>> ============== >>>> >>>> 1. What happens if the name has already been used? `(x, (1 as x), x)` >>>> Currently, prior usage functions as if the named expression did not >>>> exist (following the usual lookup rules); the new name binding will >>>> shadow the other name from the point where it is evaluated until the >>>> end of the statement. Is this acceptable? Should it raise a syntax >>>> error or warning? >>> >>> IMO, relying on evaluation order is the only viable option, but it's >>> confusing. I would immediately reject something like `(x, (1 as x), >>> x)` as bad style, simply because the meaning of x at the two places it >>> is used is non-obvious. >>> >>> I'm -1 on a warning. I'd prefer an error, but I can't see how you'd >>> implement (or document) it. >> >> Sure. For now, I'm just going to leave it as a perfectly acceptable >> use of the feature; it can be rejected as poor style, but permitted by >> the language. > > "Perfectly acceptable" I disagree with. "Unacceptable but impossible > to catch in the compiler" is closer to my view. Sorry, what I meant by "acceptable" was "legal". The compiler accepts it, the bytecode exec is fine with it, but a human may very well decide that it's unacceptable. >>>> 2. The current implementation [1] implements statement-local names using >>>> a special (and mostly-invisible) name mangling. This works perfectly >>>> inside functions (including list comprehensions), but not at top >>>> level. Is this a serious limitation? Is it confusing? >>> >>> I'm strongly -1 on "works like the current implementation" as a >>> definition of the behaviour. While having a proof of concept to >>> clarify behaviour is great, my first question would be "how is the >>> behaviour documented to work?" So what is the PEP proposing would >>> happen if I did >>> >>> if ('.'.join((str(x) for x in sys.version_info[:2])) as ver) == '3.6': >>> # Python 3.6 specific code here >>> elif sys.version_info[0] < 3: >>> print(f"Version {ver} is not supported") >>> >>> at the top level of a Python file? To me, that's a perfectly >>> reasonable way of using the new feature to avoid exposing a binding >>> for "ver" in my module... >> >> I agree, sounds good. I'll reword this to be a limitation of implementation. > > To put it another way, "Intended to work, but we haven't determined > how to implement it yet"? Fair enough, although it needs to be > possible to implement it. 
These names are a weird not-quite-scope > construct, and interactions with real scopes are going to be tricky to > get right (not just implement, but define). Yeah. My ideal is something like this: * The subscope names are actual real variables, with unspellable names. * These name bindings get created and destroyed just like other names do, with the exception that they are automatically destroyed when they "fall out of scope". * While a subscope name is visible, locals() will use that value for that name (shadowing any other). * Once that name is removed, locals() will return to the normal form of the name. And yes, ideally this will still work when locals() is globals(). There are a couple of issues, but that's my planned design. An alternative design that is also viable: These subscope names (easily detected internally) are simply hidden from locals() and globals(). I haven't dug into the implementation consequences of any of this at global scope. I know what parts of the code I need to look at, but my days have this annoying habit of having only 24 hours in them. Anyone got a bugfix for that? :| > Consider > > x = 12 > if (1 as x) == 1: > def foo(): > print(x) > # Actually, you'd probably get a "Name used before definition" > error here. > # Would "global x" refer to x=12 or to the statement-local x (1)? > # Would "nonlocal x" refer to the statement-local x? > x = 13 > return x > print(foo()) Yeah, that's going to give UnboundLocalError, because inside foo(), x has been flagged as local. That's independent of the global scope changes. I'd like to say that "global x" would catch the 12, but until I actually get around to implementing it, I'm not sure. > print(x) > print(foo()) > print(x) Anything that executes after the 'if' exits should see x as 12. The temporary variable is completely gone at that point. > What should that return? Not "what does the current implementation > return", but what is the intended result according to the proposal, > and how would you explain that result in the docs? > > I think I'd expect > > 1 > 13 # But see note about global/nonlocal > 12 > 1 xxxxxxx Not sure? Maybe 1? Can you create a closure over a > statement-local variable? > 13 # But see note about global/nonlocal > 12 > > The most charitable thing I can say here is that the semantics are > currently under-specified in the PEP :-) Hah. This is why I started out by saying that this ONLY applies inside a function. Extending this to global scope (and class scope; my guess is that it'll behave the same as global) is something that I'm only just now looking into. >>>> 4. Syntactic confusion in `except` statements. While technically >>>> unambiguous, it is potentially confusing to humans. In Python 3.7, >>>> parenthesizing `except (Exception as e):` is illegal, and there is no >>>> reason to capture the exception type (as opposed to the exception >>>> instance, as is done by the regular syntax). Should this be made >>>> outright illegal, to prevent confusion? Can it be left to linters? >>> >>> Wait - except (Exception as e): would set e to the type Exception, and >>> not capture the actual exception object? >> >> Correct. The expression "Exception" evaluates to the type Exception, >> and you can capture that. It's a WutFace moment but it's a logical >> consequence of the nature of Python. > > "Logical consequence of the rules" isn't enough for a Python language > feature, IMO. Intuitive and easy to infer are key. Even if this is a > corner case, it counts as a mildly ugly wart to me. 
Does it need to be special-cased as an error? I do *not* want to special-case it to capture the exception instance, as that would almost certainly misbehave in more complicated scenarios. > I value Python for making it easy to write correct code, not easy to > spot your errors. Too many things like this would start me thinking I > should ban statement-local names from codebases I maintain, which is > not a good start for a feature... Banning them from 'except' clauses isn't a bad thing, though. There's nothing that you need to capture; you're normally going to use a static lookup of a simple name (at best, a qualified name).
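The conventional spelling being referred to, for contrast: a static lookup of simple or qualified names, with the instance bound by the except statement itself.

```
try:
    open("/no/such/file")
except (OSError, ValueError) as err:   # plain names, nothing extra to capture
    print("caught:", type(err).__name__)
```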
The current implementation matches my proposed semantics, as long as the code in question is all inside a function. >> I'll add some more examples. I think the if/while usage is potentially of value. > > I think it's an unexpected consequence of an overly-broad solution to > the original problem, that accidentally solves another long-running > debate. But it means you've opened a much bigger can of worms than it > originally appeared, and I'm not sure you don't risk losing the > simplicity that *expression* local names might have had. > > But I can even break expression local names: > > x = ((lambda: boom()) as boom) > x() > > It's possible that the "boom" is just my head exploding, not the > interpreter. But I think I just demonstrated a closure over an > expression-local name. For added fun, replace "x" with "boom"... And this is why I am not trying for expression-local names. If someone wants to run with a competing proposal for list-comprehension-local names, sure, but I'm not in favour of that either. Expression-local is too narrow to be useful AND it still has the problems that statement-local has. >> Thanks for the feedback! Keep it coming! :) > > Ask and you shall receive :-) If I take a razor and cut myself with it, it's called "self-harm" and can get me dinged for a psych evaluation. If I take a mailing list and induce it to send me hundreds of emails and force myself to read them all... there's probably a padded cell with my name on it somewhere. You know, I'd be okay with that actually. Just as long as the cell has wifi. ChrisA From rosuav at gmail.com Wed Feb 28 14:45:16 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 1 Mar 2018 06:45:16 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: On Thu, Mar 1, 2018 at 5:03 AM, Kirill Balunov wrote: > > The PEP says: > >> >> Omitting the parentheses from this PEP's proposed syntax introduces many >> syntactic ambiguities. > > > and: > >> As the name's scope extends to the full current statement, even a block >> statement, this can be used to good effect in the header of an if or while >> statement > > > > Will the `from ... import ... as ... statement be a special case, because > currently the following form is valid: > > from math import (tau as two_pi) No, because that statement doesn't have any expressions in it - it's a series of names. The "tau" in that line is not looked up in the current scope; you can't write a function that returns the symbol "tau" and then use that in the import. So the grammatical hook that enables "(... as ...)" doesn't apply here. ChrisA From rosuav at gmail.com Wed Feb 28 14:47:11 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 1 Mar 2018 06:47:11 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: <5de11694-b03b-e986-8b18-d95652b488d0@btinternet.com> References: <54c90d32-ef5f-229d-28d8-2e3433eadf0b@btinternet.com> <5de11694-b03b-e986-8b18-d95652b488d0@btinternet.com> Message-ID: On Thu, Mar 1, 2018 at 5:09 AM, Rob Cliffe via Python-ideas wrote: > > > On 28/02/2018 05:23, Chris Angelico wrote: >> >> >>> If calling `f(x)` is expensive or has side effects, the clean operation >>> of >>> the list comprehension gets muddled. Using a short-duration name binding >>> retains the simplicity; while the extra `for` loop does achieve this, it >>> does so at the cost of dividing the expression visually, putting the >>> named >>> part at the end of the comprehension instead of the beginning. 
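The workaround described there (the extra `for` loop over a one-element list) is already valid today; spelled out as a runnable sketch, with `f` standing in for some expensive call:

```
def f(x):          # stand-in for an expensive or side-effecting call
    return x * x

stuff = [[y, y] for x in range(5) for y in [f(x)]]   # f(x) evaluated once per x
print(stuff)       # [[0, 0], [1, 1], [4, 4], [9, 9], [16, 16]]
```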
>>> >>> Maybe add to last sentence "and of adding (at least conceptually) extra >>> steps: building a 1-element list, then extracting the first element" >> >> That's precisely the point that Serhiy's optimization is aiming at, >> with the intention of making "for x in [expr]" a standard idiom for >> list comp assignment. If we assume that this does become standard, it >> won't add the extra steps, but it does still push that expression out >> to the far end of the comprehension, whereas a named subexpression >> places it at first use. > > I understand that creating the list could be avoided *at runtime*. My point > was that in trying to *read and understand* > stuff = [[y, y] for x in range(5) for y in [f(x)]] > the brain must follow the creation and unpacking of the list. I.e. this is > an extra cost of this particular construction. >> >> >>> And here's a thought: What are the semantics of >>> a = (42 as a) # Of course a linter should point this out too >>> At first I thought this was also a laborious synonym for "a=42". But >>> then I >>> re-read your statement (the one I described above as crystal-clear) and >>> realised that its exact wording was even more critical than I had >>> thought: >>> "the new name binding will shadow the other name from the point >>> where it >>> is evaluated until the end of the statement" >>> Note: "until the end of the statement". NOT "until the end of the >>> expression". The distinction matters. >>> If we take this as gospel, all this will do is create a temporary >>> variable >>> "a", assign the value 42 to it twice, then discard it. I.e. it >>> effectively >>> does nothing, slowly. >>> Have I understood correctly? Very likely you have considered this and >>> mean >>> exactly what you say, but I am sure you will understand that I mean no >>> offence by querying it. >> >> Actually, that's a very good point, and I had to actually go and do >> that to confirm. You're correct that the "a =" part is also affected, >> but there may be more complicated edge cases. > > I have read this thread so far - I can't say I have absorbed and understood > it all, but I am left with a feeling that Expression-Local-Name-Bindings > would be preferable to Statement-Local-Name_Bindings, so that the temporary > variable wouldn't apply to the LHS in the above example. I realise that > this is a vague statement that needs far more definition, but - hand-waving > for now - do you think it would be difficult to change the implementation > accordingly? > Yes it would, but AIUI (I haven't tested it) a competing implementation already exists. So now we just need a competing PEP and we can run with it! ChrisA From brett at python.org Wed Feb 28 14:54:04 2018 From: brett at python.org (Brett Cannon) Date: Wed, 28 Feb 2018 19:54:04 +0000 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com> Message-ID: Thanks for taking the time to write this PEP, Chris, even though I'm -1 on the idea. I'm glad to just have this as a historical document for the idea. On Tue, 27 Feb 2018 at 22:53 Chris Angelico wrote: > On Wed, Feb 28, 2018 at 4:52 PM, Robert Vanden Eynde > wrote: > > Hello Chris and Rob, > > > > did you compare your proposal tothe subject called "[Python-ideas] > Temporary > > variables in comprehensions" on this month list ? > > Yes, I did; that's one of many threads on the subject, and is part of > why I'm writing this up. 
> > One section that I have yet to write is "alternative syntax > proposals", which is where I'd collect all of those together. > > > If you don't want to go through all the mails, I tried to summarize the > > ideas in this mail : > > https://mail.python.org/pipermail/python-ideas/2018-February/048987.html > > > > In a nutshell one of the proposed syntax was > > > > stuff = [(y, y) where y = f(x) for x in range(5)] > > > > Or with your choice of keyword, > > > > stuff = [(y, y) with y = f(x) for x in range(5)] > > > > Your proposal uses the *reverse* syntax : > > > > stuff = [(y, y) with f(x) as y for x in range (5)] > > ... > > I wish I could write a pep to summarize all the discussions (including > > yours, my summary, including real world examples, including pro's and > > con's), or should I do a gist on GitHub so that we can talk in a more > "forum > > like" manner where people can edit their answers and modify the document > ? > > This mailing is driving me a bit crazy. > > This is exactly why I am writing up a PEP. Ultimately, it should list > every viable proposal (or group of similar proposals), with the > arguments for and against. Contributions of actual paragraphs of text > are VERY welcome; simply pointing out "hey, don't forget this one" is > also welcome, but I then have to go and write something up, so it'll > take a bit longer :) > > > As Rob pointed out, your syntax "(f(x) as y, y)" is really assymmetric > and > > "(y, y) with f(x) as y" or "(y, y) with y = f(x)" is probably prefered. > > Moreover I agree with you the choice of "with" keyword could be confused > > with the "with f(x) as x:" statement in context management, so maybe > "with x > > = f(x)" would cleary makes it different ? Or using a new keyword like > "where > > y = f(x)", "let y = f(x)" or probably better "given y = f(x)", "given" > isn't > > used in current librairies like numpy.where or sql alchemy "where". > > And also the standard library's tkinter.dnd.Icon, according to a quick > 'git grep'; but that might be just an example, rather than actually > being covered by backward-compatibility guarantees. I think "given" is > the strongest contender of the three, but I'm just mentioning all > three together. > > A new version of the PEP has been pushed, and should be live within a > few minutes. > > https://www.python.org/dev/peps/pep-0572/ > > Whatever I've missed, do please let me know. This document should end > up incorporating, or at least mentioning, all of the proposals you > cited. > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Wed Feb 28 15:00:09 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 1 Mar 2018 07:00:09 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: <5A970476.3000101@brenbarn.net> References: <54c90d32-ef5f-229d-28d8-2e3433eadf0b@btinternet.com> <5A96C673.4070804@stoneleaf.us> <5A970476.3000101@brenbarn.net> Message-ID: On Thu, Mar 1, 2018 at 6:35 AM, Brendan Barnwell wrote: > On 2018-02-28 07:18, Chris Angelico wrote: >> >> Except that assignment is evaluated RHS before LHS as part of a single >> statement. 
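That ordering is easy to observe in today's Python: the whole right-hand side is evaluated before the assignment target is touched. A tiny sketch, with a made-up trace helper:

```
def trace(tag, value):
    print("evaluating", tag)
    return value

d = {}
d[trace("target subscript", "k")] = trace("right-hand side", 1)
# prints "evaluating right-hand side" before "evaluating target subscript"
```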
When Python goes to look up the name "a" to store it (as >> the final step of the assignment), the SLNB is still active (it's >> still the same statement - note that this is NOT expression-local), so >> it uses the temporary. > > > Wait, so you're saying that if I do > > a = (2 as a) > > The "a = " assignment assigns to the SLNB, and so is then discarded > after the statement finishes? > > That seems very bad to me. If there are SLNBs with this special > "as" syntax, I think the ONLY way to assign to an SLNB should be with the > "as" syntax. You shouldn't be able to assign to an SLNB with regular > assignment syntax, even if you created an SNLB with the same name as the LHS > within the RHS. That seems a reasonable requirement on the face of it, but what about these variants? a = (x as a) a[b] = (x as a) b[a] = (x as a) a[b].c = (x as a) b[a].c = (x as a) Which of these should use the SLNB, which should be errors, which should use the previously-visible binding of 'a'? It wouldn't be too hard to put in a trap for assignment per se, but where do you draw the line? I think "a[b] =" is just as problematic as "a =", but "b[a] =" could be useful. Maybe the rule could be that direct assignment or mutation is disallowed, but using that value to assign to something else isn't? That would permit the last three and disallow only the first two. ChrisA From rosuav at gmail.com Wed Feb 28 15:01:26 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 1 Mar 2018 07:01:26 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com> Message-ID: On Thu, Mar 1, 2018 at 6:54 AM, Brett Cannon wrote: > Thanks for taking the time to write this PEP, Chris, even though I'm -1 on > the idea. I'm glad to just have this as a historical document for the idea. I'm going to get a reputation for writing up PEPs for dead ideas. PEP 463 (exception-catching expressions) was the same. In this particular case, I'm fairly in favour of it, but only because I think it's cool - not because I have actual need for it - and if the PEP's rejected, so be it. ChrisA From tritium-list at sdamon.com Wed Feb 28 15:16:52 2018 From: tritium-list at sdamon.com (Alex Walters) Date: Wed, 28 Feb 2018 15:16:52 -0500 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com> Message-ID: <16d501d3b0d1$12755850$376008f0$@sdamon.com> For what its worth, I'm +1 on it. I actually like that it would allow: while (something() as var): something_else(var) ...without being a bug magnet. The bug magnet isn't the assignment of a name in the condition of a while loop, it's the fact that assignment is a simple typo away from comparison. This is not a simple typo away from comparison (the operands are a different order, too). (sorry Chris, I didn't hit reply all the first time.) > -----Original Message----- > From: Python-ideas [mailto:python-ideas-bounces+tritium- > list=sdamon.com at python.org] On Behalf Of Chris Angelico > Sent: Wednesday, February 28, 2018 3:01 PM > To: python-ideas > Subject: Re: [Python-ideas] PEP 572: Statement-Local Name Bindings > > On Thu, Mar 1, 2018 at 6:54 AM, Brett Cannon wrote: > > Thanks for taking the time to write this PEP, Chris, even though I'm -1 on > > the idea. I'm glad to just have this as a historical document for the idea. > > I'm going to get a reputation for writing up PEPs for dead ideas. 
PEP > 463 (exception-catching expressions) was the same. In this particular > case, I'm fairly in favour of it, but only because I think it's cool - > not because I have actual need for it - and if the PEP's rejected, so > be it. > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From p.f.moore at gmail.com Wed Feb 28 15:28:30 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 28 Feb 2018 20:28:30 +0000 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: On 28 February 2018 at 19:41, Chris Angelico wrote: > On Thu, Mar 1, 2018 at 3:30 AM, Paul Moore wrote: > Here's another equally untested piece of code: > > [(-b + (sqrt(b*b - 4*a*c) as disc)) / (2*a), (-b - disc) / (2*a)] Yeah, that's a good real world example. And IMO it hides the symmetry between the two expressions, so for me it's a good example of a case where there's a readability *disadvantage* with the proposed syntax. > I haven't dug into the implementation consequences of any of this at > global scope. I know what parts of the code I need to look at, but my > days have this annoying habit of having only 24 hours in them. Anyone > got a bugfix for that? :| I can give you daylight saving time and a few leap seconds, but that's more of a workaround than a fix :-) >> Consider >> >> x = 12 >> if (1 as x) == 1: >> def foo(): >> print(x) >> # Actually, you'd probably get a "Name used before definition" >> error here. >> # Would "global x" refer to x=12 or to the statement-local x (1)? >> # Would "nonlocal x" refer to the statement-local x? >> x = 13 >> return x >> print(foo()) > > Yeah, that's going to give UnboundLocalError, because inside foo(), x > has been flagged as local. That's independent of the global scope > changes. > > I'd like to say that "global x" would catch the 12, but until I > actually get around to implementing it, I'm not sure. > >> print(x) >> print(foo()) >> print(x) > > Anything that executes after the 'if' exits should see x as 12. The > temporary variable is completely gone at that point. So simplifying x = 12 if (1 as x) == 1: def foo(): return x print(foo()) print(foo()) prints 1 12 Personally, I can't work out how to think about that in a way that isn't confusing :-( I gather from what you've said elsewhere that your current implementation uses a hidden name-mangled variable. IIRC, list comprehensions used something similar in the original implementation, but it resulted in weird consequences, and ultimately the implementation switched to a "proper" scoped semantics and implementation. I've no particular reason to think your implementation might suffer in the same way, but understanding the semantics in terms of an "assignment to a hidden variable" bothers me. >> The most charitable thing I can say here is that the semantics are >> currently under-specified in the PEP :-) > > Hah. This is why I started out by saying that this ONLY applies inside > a function. Extending this to global scope (and class scope; my guess > is that it'll behave the same as global) is something that I'm only > just now looking into. And yet, "only works within a function" is IMO an unacceptable limitation - so we need to look into the implications here. > So if, in the "in" expression list, you capture something, that thing > will be visible all through the suite. 
Here's an example: > > for item in (get_items() as items): > print(item) > print(items) > print(items) > > What actually happens is kinda this: > > for item in (get_items() as items_0x142857): > print(item) > print(items_0x142857) > del items_0x142857 > print(items) OK, so that explains why trying to write a closure over a variable introduced via "as" won't work. But I wouldn't want such a renaming to be mandated by the semantics, and I don't know how I'd explain the (implied) behaviour without insisting on a name mangling implementation, so I'm stuck here. >> "What the current proof of concept implementation does" isn't useful >> anyway, but even ignoring that I'd prefer to see what it *does* rather >> than what it *compiles to*. But what needs to be documented is what >> the PEP *proposes* it does. > > The current implementation matches my proposed semantics, as long as > the code in question is all inside a function. Understood. But I'd like the PEP to fully explain more of the intended semantics, *without* referring to the specific implementation. Remember, PyPy, Jython, IronPython, Cython, etc, will all have to implement it too. >> But I can even break expression local names: >> >> x = ((lambda: boom()) as boom) >> x() >> >> It's possible that the "boom" is just my head exploding, not the >> interpreter. But I think I just demonstrated a closure over an >> expression-local name. For added fun, replace "x" with "boom"... > > And this is why I am not trying for expression-local names. If someone > wants to run with a competing proposal for list-comprehension-local > names, sure, but I'm not in favour of that either. Expression-local is > too narrow to be useful AND it still has the problems that > statement-local has. I've no idea how that would work under statement-local names either, though. boom = lambda: boom() boom() is just an infinite recursion. I'm less sure that the as version is. Or the alternative form ((lambda: boom()) as boom)() I know you can tell me what the implementation does - but I can't reason it out from the spec. >>> Thanks for the feedback! Keep it coming! :) >> >> Ask and you shall receive :-) > > If I take a razor and cut myself with it, it's called "self-harm" and > can get me dinged for a psych evaluation. If I take a mailing list and > induce it to send me hundreds of emails and force myself to read them > all... there's probably a padded cell with my name on it somewhere. > > You know, I'd be okay with that actually. Just as long as the cell has wifi. lol. Have fun :-) Paul From rosuav at gmail.com Wed Feb 28 15:34:59 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 1 Mar 2018 07:34:59 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: Message-ID: On Thu, Mar 1, 2018 at 7:28 AM, Paul Moore wrote: > I've no idea how that would work under statement-local names either, though. > > boom = lambda: boom() > boom() > > is just an infinite recursion. I'm less sure that the as version is. > Or the alternative form > > ((lambda: boom()) as boom)() > > I know you can tell me what the implementation does - but I can't > reason it out from the spec. The only part that isn't yet properly specified is how this interacts with closures, and that's because I really don't know what makes sense. Honestly, *any* situation where you're closing over a SLNB is going to have readability penalties, no matter what the spec says. 
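For reference, with ordinary names the `boom` example is just late binding: the name is looked up when the lambda is called, so today it is a plain unbounded recursion.

```
boom = lambda: boom()      # ordinary name, looked up at call time
try:
    boom()
except RecursionError:
    print("late binding of the ordinary name: recursion, as expected")
```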
ChrisA

From p.f.moore at gmail.com  Wed Feb 28 15:30:43 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 28 Feb 2018 20:30:43 +0000
Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings
In-Reply-To: <16d501d3b0d1$12755850$376008f0$@sdamon.com>
References: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com>
 <16d501d3b0d1$12755850$376008f0$@sdamon.com>
Message-ID:

On 28 February 2018 at 20:16, Alex Walters wrote:
> For what its worth, I'm +1 on it.
>
> I actually like that it would allow:
>
> while (something() as var):
>     something_else(var)
>
> ...without being a bug magnet. The bug magnet isn't the assignment of a
> name in the condition of a while loop, it's the fact that assignment is a
> simple typo away from comparison. This is not a simple typo away from
> comparison (the operands are a different order, too).

For me, the problem isn't so much with the expected use cases (with the ironic exception that I dislike it for the original motivating example of list comprehensions), it's that any "unusual" use of the construct seems to raise more questions about the semantics than it answers.

I remain -1, I'm afraid.
Paul

From ethan at stoneleaf.us  Wed Feb 28 15:46:24 2018
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 28 Feb 2018 12:46:24 -0800
Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings
In-Reply-To: <16d501d3b0d1$12755850$376008f0$@sdamon.com>
References: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com>
 <16d501d3b0d1$12755850$376008f0$@sdamon.com>
Message-ID: <5A971520.1040207@stoneleaf.us>

On 02/28/2018 12:16 PM, Alex Walters wrote:
> For what its worth, I'm +1 on it.
>
> I actually like that it would allow:
>
> while (something() as var):
>     something_else(var)
>
> ...without being a bug magnet. The bug magnet isn't the assignment of a
> name in the condition of a while loop, it's the fact that assignment is a
> simple typo away from comparison. This is not a simple typo away from
> comparison (the operands are a different order, too).

I also like the above, but as a more general assignment-in-expression, not as a statement-local -- the difference being that there would be no auto-deletion of the variable, no confusion about when it goes away, no visual name collisions (aside from what we have already), etc., etc.

Maybe I'll write yet-another-competing-PEP.  ;)

--
~Ethan~

From robertve92 at gmail.com  Wed Feb 28 16:38:34 2018
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Wed, 28 Feb 2018 22:38:34 +0100
Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings
In-Reply-To:
References: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com>
Message-ID:

Le 28 févr. 2018 11:43, "Chris Angelico" a écrit :

> It's still right-to-left, which is as bad as middle-outward once you
> combine it with normal left-to-right evaluation. Python has very
> little of this [..]

I agree [....]

>> 2) talking about the implementation of thektulu in the "where =" part.

> ?

In the Alternate Syntax, I was talking about adding a link to the thektulu (branch where-expr) implementation as a basis of proof of concept (as you did with the other syntax).

>> 3) "C problem that an equals sign in an expression can now create a name binding, rather than performing a comparison."

As you agreed, with the "ch with ch = getch()" syntax we won't accidentally switch a "==" for a "=".

I agree this syntax:

```
while (ch with ch = getch()):
    ...
```

doesn't read very well, but in the same way as in C or Java while(ch = getch()){} or worse ((ch = getch()) != null) syntax.
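The pattern both spellings are trying to improve on is today's loop-and-a-half; a minimal runnable sketch, with a BytesIO standing in for a real socket:

```
import io

sock = io.BytesIO(b"spam and eggs")   # stand-in for a real socket/file
while True:
    data = sock.read(4)
    if not data:                      # falsy b"" signals end of stream
        break
    print(data)
```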
Your syntax "while (getch() as ch):" may have a less words, but is still not clearer. As we spoke on Github, having this syntax in a while is only useful if the variable does leak. >> 5) Any expression vs "post for" only > I don't know what the benefit is here, but sure. As long as the > grammar is unambiguous, I don't see any particular reason to reject > this. I would like to see a discussion of pros and cons, some might think like me or disagree, that's a strong langage question. > 6) with your syntax, how does the simple case work (y+2 with y = x+1) ? What simple case? The case where you only use the variable once? I'd write it like this: (x + 1) + 2 >> The issue is not only about reusing variable. > If you aren't using the variable multiple times, there's no point > giving it a name. Unless I'm missing something here? Yes, variables are not there "just because we reuse them", but also to include temporary variables to better understand the code. Same for functions, you could inline functions when used only once, but you introduce them for clarity no ? ``` a = v ** 2 / R # the acceleration in a circular motion f = m * a # law of Newton ``` could be written as ``` f = m * (v ** 2 / R) # compute the force, trivial ``` But having temporary variables help a lot to understand the code, otherwise why would we create temporary variables ? I can give you an example where you do a process and each time the variable is used only one. >> 8) >> (lambda y: [y, y])(x+1) >> Vs >> (lambda y: [y, y])(y=x+1) Ewww. Remind me what the benefit is of writing the variable name that many times? "Explicit" doesn't mean "utterly verbose". Yep it's verbose, lambdas are verbose, that's why we created this PEP isn't it :) > 10) Chaining, in the case of the "with =", in thektulu, parenthesis were > mandatory: > > print((z+3 with z = y+2) with y = x+2) > > What happens when the parenthesis are dropped ? > > print(z+3 with y = x+2 with z = y+2) > > Vs > > print(z+3 with y = x+2 with z = y+2) > > I prefer the first one be cause it's in the same order as the "post for" > > [z + 3 for y in [ x+2 ] for z in [ y+2 ]] > With my proposal, the parens are simply mandatory. Extending this to > make them optional can come later. Indeed, but that's still questions that can be asked. >> 11) Scoping, in the case of the "with =" syntax, I think the parenthesis >> introduce a scope : >> >> print(y + (y+1 where y = 2)) >> >> Would raise a SyntaxError, it's probably better for the variable beeing >> local and not in the current function (that would be a mess). >> >> Remember that in list comp, the variable is not leaked : >> >> x = 5 >> stuff = [y+2 for y in [x+1] >> print(y) # SyntaxError > Scoping is a fundamental part of both my proposal and the others I've > seen here. (BTW, that would be a NameError, not a SyntaxError; it's > perfectly legal to ask for the name 'y', it just hasn't been given any > value.) By my definition, the variable is locked to the statement that > created it, even if that's a compound statement. By the definition of > a "(expr given var = expr)" proposal, it would be locked to that > single expression. Confer the discussion on scoping on github ( https://github.com/python/peps/commit/2b4ca20963a24cf5faac054226857ea9705471e5) : """ In the current implementation it looks like it is like a regular assignment (function local then). Therefore in the expression usage, the usefulness would be debatable (just assign before). But in a list comprehension *after the for* (as I mentioned in my mail), aka. 
when used as a replacement for for y in [ x + 1 ] this would make sense. But I think that it would be much better to have a local scope, in the parenthesis. So that print(y+2 where y = x + 1) wouldn't leak y. And when there are no parenthesis like in a = y+2 where y = x+1, it would imply one, giving the same effect as a = (y+2 where y = x+1). Moreover, it would naturally shadow variables in the outermost scope. This would imply while data where data = sock.read(): does not leak data but as a comparison with C and Java, the syntax while((data = sock.read()) != null) is really really ugly and confusing. """ -------------- next part -------------- An HTML attachment was scrubbed... URL: From robertve92 at gmail.com Wed Feb 28 16:48:18 2018 From: robertve92 at gmail.com (Robert Vanden Eynde) Date: Wed, 28 Feb 2018 22:48:18 +0100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com> Message-ID: We are currently like a dozen of people talking about multiple sections of a single subject. Isn't it easier to talk on a forum? *Am I the only one* who thinks mailing list isn't easy when lots of people talking about multiple subjects? Of course we would put the link in the mailing list so that everyone can join. A forum (or just few "issues" thread on github) is where we could have different thread in parallel, in my messages I end up with like *10 comments not all related*, in a forum we could talk about everything and it would still be organized by subjects. Also, it's more interactive than email on a global list, people can talk to each other in parallel, if I want to answer about a mail that was 10 mail ago, it gets quickly messy. We could all discuss on a gist or some "Issues" thread on GitHub. 2018-02-28 22:38 GMT+01:00 Robert Vanden Eynde : > Le 28 f?vr. 2018 11:43, "Chris Angelico" a ?crit : > > > It's still right-to-left, which is as bad as middle-outward once you > > combine it with normal left-to-right evaluation. Python has very > > little of this [..] > > I agree [....] > > >> 2) talking about the implementation of thektulu in the "where =" part. > > > ? > > In the Alternate Syntax, I was talking about adding a link to the thektulu > (branch where-expr) > > implementation as a basis of proof of concept (as you did with the other > syntax). > > >> 3) "C problem that an equals sign in an expression can now create a > name inding, rather than performing a comparison." > > As you agreed, with the "ch with ch = getch()" syntax we won't > accidentally switch a "==" for a "=". > > I agree this syntax : > > ``` > while (ch with ch = getch()): > ... > ``` > > doesn't read very well, but in the same way as in C or Java while(ch = > getch()){} or worse ((ch = getch()) != null) syntax. > Your syntax "while (getch() as ch):" may have a less words, but is still > not clearer. > > As we spoke on Github, having this syntax in a while is only useful if the > variable does leak. > > >> 5) Any expression vs "post for" only > > > I don't know what the benefit is here, but sure. As long as the > > grammar is unambiguous, I don't see any particular reason to reject > > this. > > I would like to see a discussion of pros and cons, some might think like > me or disagree, that's a strong langage question. > > > 6) with your syntax, how does the simple case work (y+2 with y = x+1) ? > > What simple case? The case where you only use the variable once? 
I'd > write it like this: > > (x + 1) + 2 > > >> The issue is not only about reusing variable. > > > If you aren't using the variable multiple times, there's no point > > giving it a name. Unless I'm missing something here? > > Yes, variables are not there "just because we reuse them", but also to > include temporary variables to better understand the code. > Same for functions, you could inline functions when used only once, but > you introduce them for clarity no ? > > ``` > a = v ** 2 / R # the acceleration in a circular motion > f = m * a # law of Newton > ``` > > could be written as > > ``` > f = m * (v ** 2 / R) # compute the force, trivial > ``` > > But having temporary variables help a lot to understand the code, > otherwise why would we create temporary variables ? > I can give you an example where you do a process and each time the > variable is used only one. > > >> 8) > >> (lambda y: [y, y])(x+1) > >> Vs > >> (lambda y: [y, y])(y=x+1) > > Ewww. Remind me what the benefit is of writing the variable name that > many times? "Explicit" doesn't mean "utterly verbose". > > Yep it's verbose, lambdas are verbose, that's why we created this PEP > isn't it :) > > > 10) Chaining, in the case of the "with =", in thektulu, parenthesis were > > mandatory: > > > > print((z+3 with z = y+2) with y = x+2) > > > > What happens when the parenthesis are dropped ? > > > > print(z+3 with y = x+2 with z = y+2) > > > > Vs > > > > print(z+3 with y = x+2 with z = y+2) > > > > I prefer the first one be cause it's in the same order as the "post for" > > > > [z + 3 for y in [ x+2 ] for z in [ y+2 ]] > > > With my proposal, the parens are simply mandatory. Extending this to > > make them optional can come later. > > Indeed, but that's still questions that can be asked. > > >> 11) Scoping, in the case of the "with =" syntax, I think the parenthesis > >> introduce a scope : > >> > >> print(y + (y+1 where y = 2)) > >> > >> Would raise a SyntaxError, it's probably better for the variable beeing > >> local and not in the current function (that would be a mess). > >> > >> Remember that in list comp, the variable is not leaked : > >> > >> x = 5 > >> stuff = [y+2 for y in [x+1] > >> print(y) # SyntaxError > > > Scoping is a fundamental part of both my proposal and the others I've > > seen here. (BTW, that would be a NameError, not a SyntaxError; it's > > perfectly legal to ask for the name 'y', it just hasn't been given any > > value.) By my definition, the variable is locked to the statement that > > created it, even if that's a compound statement. By the definition of > > a "(expr given var = expr)" proposal, it would be locked to that > > single expression. > > Confer the discussion on scoping on github (https://github.com/python/ > peps/commit/2b4ca20963a24cf5faac054226857ea9705471e5) : > > """ > In the current implementation it looks like it is like a regular > assignment (function local then). > > Therefore in the expression usage, the usefulness would be debatable (just > assign before). > > But in a list comprehension *after the for* (as I mentioned in my mail), > aka. when used as a replacement for for y in [ x + 1 ] this would make > sense. > > But I think that it would be much better to have a local scope, in the > parenthesis. So that print(y+2 where y = x + 1) wouldn't leak y. And when > there are no parenthesis like in a = y+2 where y = x+1, it would imply > one, giving the same effect as a = (y+2 where y = x+1). Moreover, it > would naturally shadow variables in the outermost scope. 
> > This would imply while data where data = sock.read(): does not leak data > but as a comparison with C and Java, the syntax while((data = sock.read()) > != null) is really really ugly and confusing. > """ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Wed Feb 28 16:49:50 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 1 Mar 2018 08:49:50 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com> Message-ID: On Thu, Mar 1, 2018 at 8:38 AM, Robert Vanden Eynde wrote: > Le 28 f?vr. 2018 11:43, "Chris Angelico" a ?crit : >> If you aren't using the variable multiple times, there's no point >> giving it a name. Unless I'm missing something here? > > Yes, variables are not there "just because we reuse them", but also to > include temporary variables to better understand the code. > Same for functions, you could inline functions when used only once, but you > introduce them for clarity no ? Sure, but if you're creating temporaries for clarity, you should probably make them regular variables, not statement-local ones. If you're going to use something only once and you don't want to break it out into a separate statement, just use a comment. > > ``` > a = v ** 2 / R # the acceleration in a circular motion > f = m * a # law of Newton > ``` > > could be written as > > ``` > f = m * (v ** 2 / R) # compute the force, trivial > ``` > > But having temporary variables help a lot to understand the code, otherwise > why would we create temporary variables ? > I can give you an example where you do a process and each time the variable > is used only one. Neither of your examples needs SLNBs. >> Scoping is a fundamental part of both my proposal and the others I've >> seen here. (BTW, that would be a NameError, not a SyntaxError; it's >> perfectly legal to ask for the name 'y', it just hasn't been given any >> value.) By my definition, the variable is locked to the statement that >> created it, even if that's a compound statement. By the definition of >> a "(expr given var = expr)" proposal, it would be locked to that >> single expression. > > Confer the discussion on scoping on github > (https://github.com/python/peps/commit/2b4ca20963a24cf5faac054226857ea9705471e5) > : > > """ > In the current implementation it looks like it is like a regular assignment > (function local then). > > Therefore in the expression usage, the usefulness would be debatable (just > assign before). > > But in a list comprehension after the for (as I mentioned in my mail), aka. > when used as a replacement for for y in [ x + 1 ] this would make sense. > > But I think that it would be much better to have a local scope, in the > parenthesis. So that print(y+2 where y = x + 1) wouldn't leak y. And when > there are no parenthesis like in a = y+2 where y = x+1, it would imply one, > giving the same effect as a = (y+2 where y = x+1). Moreover, it would > naturally shadow variables in the outermost scope. So the question is: what is the benefit of the local name 'y'? In any non-trivial example, it's not going to fit in a single line, so you have to either wrap it as a single expression (one that's been made larger by the "where" clause at the end), or break it out into a real variable as a separate assignment. 
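Those two existing options, side by side, reusing the force example from earlier in the thread (the values are made up just to have numbers):

```
m, v, R = 2.0, 3.0, 1.5

f_inline = m * (v ** 2 / R)   # option 1: wrap it all in a single expression

a = v ** 2 / R                # option 2: break the temporary out as a real variable
f_named = m * a

assert f_inline == f_named
```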
ChrisA From marcidy at gmail.com Wed Feb 28 16:53:29 2018 From: marcidy at gmail.com (Matt Arcidy) Date: Wed, 28 Feb 2018 21:53:29 +0000 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com> Message-ID: -0 unless archived appropriately. List is the standard for decades. but I guess things change and I get old. On Wed, Feb 28, 2018, 13:49 Robert Vanden Eynde wrote: > We are currently like a dozen of people talking about multiple sections of > a single subject. > > Isn't it easier to talk on a forum? > *Am I the only one* who thinks mailing list isn't easy when lots of > people talking about multiple subjects? > > Of course we would put the link in the mailing list so that everyone can > join. > > A forum (or just few "issues" thread on github) is where we could have > different thread in parallel, in my messages I end up with like *10 > comments not all related*, in a forum we could talk about everything and > it would still be organized by subjects. > > Also, it's more interactive than email on a global list, people can talk > to each other in parallel, if I want to answer about a mail that was 10 > mail ago, it gets quickly messy. > > We could all discuss on a gist or some "Issues" thread on GitHub. > > 2018-02-28 22:38 GMT+01:00 Robert Vanden Eynde : > >> Le 28 f?vr. 2018 11:43, "Chris Angelico" a ?crit : >> >> > It's still right-to-left, which is as bad as middle-outward once you >> > combine it with normal left-to-right evaluation. Python has very >> > little of this [..] >> >> I agree [....] >> >> >> 2) talking about the implementation of thektulu in the "where =" part. >> >> > ? >> >> In the Alternate Syntax, I was talking about adding a link to the thektulu >> (branch where-expr) >> >> implementation as a basis of proof of concept (as you did with the other >> syntax). >> >> >> 3) "C problem that an equals sign in an expression can now create a >> name inding, rather than performing a comparison." >> >> As you agreed, with the "ch with ch = getch()" syntax we won't >> accidentally switch a "==" for a "=". >> >> I agree this syntax : >> >> ``` >> while (ch with ch = getch()): >> ... >> ``` >> >> doesn't read very well, but in the same way as in C or Java while(ch = >> getch()){} or worse ((ch = getch()) != null) syntax. >> Your syntax "while (getch() as ch):" may have a less words, but is still >> not clearer. >> >> As we spoke on Github, having this syntax in a while is only useful if >> the variable does leak. >> >> >> 5) Any expression vs "post for" only >> >> > I don't know what the benefit is here, but sure. As long as the >> > grammar is unambiguous, I don't see any particular reason to reject >> > this. >> >> I would like to see a discussion of pros and cons, some might think like >> me or disagree, that's a strong langage question. >> >> > 6) with your syntax, how does the simple case work (y+2 with y = x+1) ? >> >> What simple case? The case where you only use the variable once? I'd >> write it like this: >> >> (x + 1) + 2 >> >> >> The issue is not only about reusing variable. >> >> > If you aren't using the variable multiple times, there's no point >> > giving it a name. Unless I'm missing something here? >> >> Yes, variables are not there "just because we reuse them", but also to >> include temporary variables to better understand the code. >> Same for functions, you could inline functions when used only once, but >> you introduce them for clarity no ? 
>> >> ``` >> a = v ** 2 / R # the acceleration in a circular motion >> f = m * a # law of Newton >> ``` >> >> could be written as >> >> ``` >> f = m * (v ** 2 / R) # compute the force, trivial >> ``` >> >> But having temporary variables help a lot to understand the code, >> otherwise why would we create temporary variables ? >> I can give you an example where you do a process and each time the >> variable is used only one. >> >> >> 8) >> >> (lambda y: [y, y])(x+1) >> >> Vs >> >> (lambda y: [y, y])(y=x+1) >> >> Ewww. Remind me what the benefit is of writing the variable name that >> many times? "Explicit" doesn't mean "utterly verbose". >> >> Yep it's verbose, lambdas are verbose, that's why we created this PEP >> isn't it :) >> >> > 10) Chaining, in the case of the "with =", in thektulu, parenthesis were >> > mandatory: >> > >> > print((z+3 with z = y+2) with y = x+2) >> > >> > What happens when the parenthesis are dropped ? >> > >> > print(z+3 with y = x+2 with z = y+2) >> > >> > Vs >> > >> > print(z+3 with y = x+2 with z = y+2) >> > >> > I prefer the first one be cause it's in the same order as the "post for" >> > >> > [z + 3 for y in [ x+2 ] for z in [ y+2 ]] >> >> > With my proposal, the parens are simply mandatory. Extending this to >> > make them optional can come later. >> >> Indeed, but that's still questions that can be asked. >> >> >> 11) Scoping, in the case of the "with =" syntax, I think the >> parenthesis >> >> introduce a scope : >> >> >> >> print(y + (y+1 where y = 2)) >> >> >> >> Would raise a SyntaxError, it's probably better for the variable beeing >> >> local and not in the current function (that would be a mess). >> >> >> >> Remember that in list comp, the variable is not leaked : >> >> >> >> x = 5 >> >> stuff = [y+2 for y in [x+1] >> >> print(y) # SyntaxError >> >> > Scoping is a fundamental part of both my proposal and the others I've >> > seen here. (BTW, that would be a NameError, not a SyntaxError; it's >> > perfectly legal to ask for the name 'y', it just hasn't been given any >> > value.) By my definition, the variable is locked to the statement that >> > created it, even if that's a compound statement. By the definition of >> > a "(expr given var = expr)" proposal, it would be locked to that >> > single expression. >> >> Confer the discussion on scoping on github ( >> https://github.com/python/peps/commit/2b4ca20963a24cf5faac054226857ea9705471e5) >> : >> >> """ >> In the current implementation it looks like it is like a regular >> assignment (function local then). >> >> Therefore in the expression usage, the usefulness would be debatable >> (just assign before). >> >> But in a list comprehension *after the for* (as I mentioned in my mail), >> aka. when used as a replacement for for y in [ x + 1 ] this would make >> sense. >> >> But I think that it would be much better to have a local scope, in the >> parenthesis. So that print(y+2 where y = x + 1) wouldn't leak y. And >> when there are no parenthesis like in a = y+2 where y = x+1, it would >> imply one, giving the same effect as a = (y+2 where y = x+1). Moreover, >> it would naturally shadow variables in the outermost scope. >> >> This would imply while data where data = sock.read(): does not leak data >> but as a comparison with C and Java, the syntax while((data = sock.read()) >> != null) is really really ugly and confusing. 
>> """ >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tritium-list at sdamon.com Wed Feb 28 16:54:31 2018 From: tritium-list at sdamon.com (Alex Walters) Date: Wed, 28 Feb 2018 16:54:31 -0500 Subject: [Python-ideas] Medium for discussion potential changes to python (was: PEP 572: Statement-Local Name Bindings) Message-ID: <172401d3b0de$b6ad10e0$240732a0$@sdamon.com> That should probably be its own thread From: Python-ideas [mailto:python-ideas-bounces+tritium-list=sdamon.com at python.org] On Behalf Of Robert Vanden Eynde Sent: Wednesday, February 28, 2018 4:48 PM Cc: python-ideas Subject: Re: [Python-ideas] PEP 572: Statement-Local Name Bindings We are currently like a dozen of people talking about multiple sections of a single subject. Isn't it easier to talk on a forum? Am I the only one who thinks mailing list isn't easy when lots of people talking about multiple subjects? Of course we would put the link in the mailing list so that everyone can join. A forum (or just few "issues" thread on github) is where we could have different thread in parallel, in my messages I end up with like 10 comments not all related, in a forum we could talk about everything and it would still be organized by subjects. Also, it's more interactive than email on a global list, people can talk to each other in parallel, if I want to answer about a mail that was 10 mail ago, it gets quickly messy. We could all discuss on a gist or some "Issues" thread on GitHub. 2018-02-28 22:38 GMT+01:00 Robert Vanden Eynde >: Le 28 f?vr. 2018 11:43, "Chris Angelico" > a ?crit : > It's still right-to-left, which is as bad as middle-outward once you > combine it with normal left-to-right evaluation. Python has very > little of this [..] I agree [....] >> 2) talking about the implementation of thektulu in the "where =" part. > ? In the Alternate Syntax, I was talking about adding a link to the thektulu (branch where-expr) implementation as a basis of proof of concept (as you did with the other syntax). >> 3) "C problem that an equals sign in an expression can now create a name inding, rather than performing a comparison." As you agreed, with the "ch with ch = getch()" syntax we won't accidentally switch a "==" for a "=". I agree this syntax : ``` while (ch with ch = getch()): ... ``` doesn't read very well, but in the same way as in C or Java while(ch = getch()){} or worse ((ch = getch()) != null) syntax. Your syntax "while (getch() as ch):" may have a less words, but is still not clearer. As we spoke on Github, having this syntax in a while is only useful if the variable does leak. >> 5) Any expression vs "post for" only > I don't know what the benefit is here, but sure. As long as the > grammar is unambiguous, I don't see any particular reason to reject > this. I would like to see a discussion of pros and cons, some might think like me or disagree, that's a strong langage question. > 6) with your syntax, how does the simple case work (y+2 with y = x+1) ? What simple case? The case where you only use the variable once? I'd write it like this: (x + 1) + 2 >> The issue is not only about reusing variable. > If you aren't using the variable multiple times, there's no point > giving it a name. Unless I'm missing something here? 
Yes, variables are not there "just because we reuse them", but also to include temporary variables to better understand the code. Same for functions, you could inline functions when used only once, but you introduce them for clarity no ? ``` a = v ** 2 / R # the acceleration in a circular motion f = m * a # law of Newton ``` could be written as ``` f = m * (v ** 2 / R) # compute the force, trivial ``` But having temporary variables help a lot to understand the code, otherwise why would we create temporary variables ? I can give you an example where you do a process and each time the variable is used only one. >> 8) >> (lambda y: [y, y])(x+1) >> Vs >> (lambda y: [y, y])(y=x+1) Ewww. Remind me what the benefit is of writing the variable name that many times? "Explicit" doesn't mean "utterly verbose". Yep it's verbose, lambdas are verbose, that's why we created this PEP isn't it :) > 10) Chaining, in the case of the "with =", in thektulu, parenthesis were > mandatory: > > print((z+3 with z = y+2) with y = x+2) > > What happens when the parenthesis are dropped ? > > print(z+3 with y = x+2 with z = y+2) > > Vs > > print(z+3 with y = x+2 with z = y+2) > > I prefer the first one be cause it's in the same order as the "post for" > > [z + 3 for y in [ x+2 ] for z in [ y+2 ]] > With my proposal, the parens are simply mandatory. Extending this to > make them optional can come later. Indeed, but that's still questions that can be asked. >> 11) Scoping, in the case of the "with =" syntax, I think the parenthesis >> introduce a scope : >> >> print(y + (y+1 where y = 2)) >> >> Would raise a SyntaxError, it's probably better for the variable beeing >> local and not in the current function (that would be a mess). >> >> Remember that in list comp, the variable is not leaked : >> >> x = 5 >> stuff = [y+2 for y in [x+1] >> print(y) # SyntaxError > Scoping is a fundamental part of both my proposal and the others I've > seen here. (BTW, that would be a NameError, not a SyntaxError; it's > perfectly legal to ask for the name 'y', it just hasn't been given any > value.) By my definition, the variable is locked to the statement that > created it, even if that's a compound statement. By the definition of > a "(expr given var = expr)" proposal, it would be locked to that > single expression. Confer the discussion on scoping on github (https://github.com/python/peps/commit/2b4ca20963a24cf5faac054226857ea9705471e5) : """ In the current implementation it looks like it is like a regular assignment (function local then). Therefore in the expression usage, the usefulness would be debatable (just assign before). But in a list comprehension after the for (as I mentioned in my mail), aka. when used as a replacement for for y in [ x + 1 ] this would make sense. But I think that it would be much better to have a local scope, in the parenthesis. So that print(y+2 where y = x + 1) wouldn't leak y. And when there are no parenthesis like in a = y+2 where y = x+1, it would imply one, giving the same effect as a = (y+2 where y = x+1). Moreover, it would naturally shadow variables in the outermost scope. This would imply while data where data = sock.read(): does not leak data but as a comparison with C and Java, the syntax while((data = sock.read()) != null) is really really ugly and confusing. """ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ethan at stoneleaf.us Wed Feb 28 16:58:52 2018 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 28 Feb 2018 13:58:52 -0800 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com> Message-ID: <5A97261C.5080900@stoneleaf.us> On 02/28/2018 01:48 PM, Robert Vanden Eynde wrote: > We are currently like a dozen of people talking about multiple sections of a single subject. > > Isn't it easier to talk on a forum? No. > *Am I the only one* who thinks mailing list isn't easy when lots of people talking about multiple subjects? Maybe. > Of course we would put the link in the mailing list so that everyone can join. Python Ideas is a mailing list. This is where we discuss ideas for future versions of Python. If you're using Google Groups or a lousy mail reader then I sympathize, but it's on you to use the appropriate tools. -- ~Ethan~ From rosuav at gmail.com Wed Feb 28 16:57:49 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 1 Mar 2018 08:57:49 +1100 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: <3c8a2267-1605-8e47-7e62-f79b230e7aec@btinternet.com> Message-ID: On Thu, Mar 1, 2018 at 8:53 AM, Matt Arcidy wrote: > -0 unless archived appropriately. List is the standard for decades. but I > guess things change and I get old. Archived, searchable, and properly threaded AND properly notifying the correct participants. Every few years, a spiffy new thing comes along, and it usually isn't enough of an improvement to survive. Mailing lists continue to be more than adequate, and the alternatives end up being aimed at point-and-click people who dislike mailing lists. I'm much happier sticking to the list. ChrisA From robertve92 at gmail.com Wed Feb 28 16:59:09 2018 From: robertve92 at gmail.com (Robert Vanden Eynde) Date: Wed, 28 Feb 2018 22:59:09 +0100 Subject: [Python-ideas] Medium for discussion potential changes to python (was: PEP 572: Statement-Local Name Bindings) In-Reply-To: <172401d3b0de$b6ad10e0$240732a0$@sdamon.com> References: <172401d3b0de$b6ad10e0$240732a0$@sdamon.com> Message-ID: By "thread" you mean like "email thread" ? Meaning when I want to talk about multiple stuffs, I send multiple mails with different "Subject Lines" ? I have like multiple issues : - Base Syntax (multiple choice, list pros and cons) - Extended Syntax (what about parenthesis, and multiple assignement, and while, and list comprehension) - Scoping - Use cases (list multiple ones) - Cons (list multiple ones and solution). It looks like, because it's a PEP, people think of it like "I should say +1 or ?1 until something else is created" :/ 2018-02-28 22:54 GMT+01:00 Alex Walters : > That should probably be its own thread > > > > *From:* Python-ideas [mailto:python-ideas-bounces+tritium-list=sdamon.com@ > python.org] *On Behalf Of *Robert Vanden Eynde > *Sent:* Wednesday, February 28, 2018 4:48 PM > *Cc:* python-ideas > *Subject:* Re: [Python-ideas] PEP 572: Statement-Local Name Bindings > > > > We are currently like a dozen of people talking about multiple sections of > a single subject. > > Isn't it easier to talk on a forum? > *Am I the only one* who thinks mailing list isn't easy when lots of > people talking about multiple subjects? > > Of course we would put the link in the mailing list so that everyone can > join. 
>
> A forum (or just a few "issues" threads on GitHub) is where we could have
> different threads in parallel; in my messages I end up with like *10
> comments not all related*. In a forum we could talk about everything and
> it would still be organized by subject.
>
> Also, it's more interactive than email on a global list: people can talk
> to each other in parallel, whereas if I want to answer a mail that was 10
> mails ago, it gets quickly messy.
>
> We could all discuss on a gist or some "Issues" thread on GitHub.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From phd at phdru.name Wed Feb 28 17:12:10 2018
From: phd at phdru.name (Oleg Broytman)
Date: Wed, 28 Feb 2018 23:12:10 +0100
Subject: [Python-ideas] Medium for discussion potential changes to python (was: PEP 572: Statement-Local Name Bindings)
In-Reply-To: <172401d3b0de$b6ad10e0$240732a0$@sdamon.com>
References: <172401d3b0de$b6ad10e0$240732a0$@sdamon.com>
Message-ID: <20180228221210.adiz2s4lyityrv26@phdru.name>

The topic has been discussed to death with all possible pros and cons.
I've published my personal collection of pros and cons at
http://phdru.name/Software/mail-vs-web.html

And my personal bottom line is: I still prefer mailing lists but I
know their advantages and disadvantages and I've invested a lot of
resources to learn and configure my tools.

Oleg.
-- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From tritium-list at sdamon.com Wed Feb 28 17:18:43 2018 From: tritium-list at sdamon.com (Alex Walters) Date: Wed, 28 Feb 2018 17:18:43 -0500 Subject: [Python-ideas] Medium for discussion potential changes to python (was: PEP 572: Statement-Local Name Bindings) In-Reply-To: <20180228221210.adiz2s4lyityrv26@phdru.name> References: <172401d3b0de$b6ad10e0$240732a0$@sdamon.com> <20180228221210.adiz2s4lyityrv26@phdru.name> Message-ID: <173d01d3b0e2$189a01c0$49ce0540$@sdamon.com> "This page was intentionally left blank." > -----Original Message----- > From: Python-ideas [mailto:python-ideas-bounces+tritium- > list=sdamon.com at python.org] On Behalf Of Oleg Broytman > Sent: Wednesday, February 28, 2018 5:12 PM > To: python-ideas at python.org > Subject: Re: [Python-ideas] Medium for discussion potential changes to > python (was: PEP 572: Statement-Local Name Bindings) > > The topic has been discussed to death with all possible pros and cons. > I've published my personal collection of pros and cons at > http://phdru.name/Software/mail-vs-web.html > > And my personal bottom line is: I still prefer mailing lists but I > know their advantages and disadvantages and I've invested a lot of > resources to learn and configure my tools. > > Oleg. > -- > Oleg Broytman http://phdru.name/ phd at phdru.name > Programmers don't die, they just GOSUB without RETURN. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From phd at phdru.name Wed Feb 28 17:20:15 2018 From: phd at phdru.name (Oleg Broytman) Date: Wed, 28 Feb 2018 23:20:15 +0100 Subject: [Python-ideas] Medium for discussion potential changes to python (was: PEP 572: Statement-Local Name Bindings) In-Reply-To: <173d01d3b0e2$189a01c0$49ce0540$@sdamon.com> References: <172401d3b0de$b6ad10e0$240732a0$@sdamon.com> <20180228221210.adiz2s4lyityrv26@phdru.name> <173d01d3b0e2$189a01c0$49ce0540$@sdamon.com> Message-ID: <20180228222015.gapf6jpmsw7ylmi4@phdru.name> Oops, sorry, fixed. On Wed, Feb 28, 2018 at 05:18:43PM -0500, Alex Walters wrote: > "This page was intentionally left blank." > > > -----Original Message----- > > From: Python-ideas [mailto:python-ideas-bounces+tritium- > > list=sdamon.com at python.org] On Behalf Of Oleg Broytman > > Sent: Wednesday, February 28, 2018 5:12 PM > > To: python-ideas at python.org > > Subject: Re: [Python-ideas] Medium for discussion potential changes to > > python (was: PEP 572: Statement-Local Name Bindings) > > > > The topic has been discussed to death with all possible pros and cons. > > I've published my personal collection of pros and cons at > > http://phdru.name/Software/mail-vs-web.html > > > > And my personal bottom line is: I still prefer mailing lists but I > > know their advantages and disadvantages and I've invested a lot of > > resources to learn and configure my tools. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. 
From marcidy at gmail.com Wed Feb 28 17:23:08 2018
From: marcidy at gmail.com (Matt Arcidy)
Date: Wed, 28 Feb 2018 22:23:08 +0000
Subject: [Python-ideas] Medium for discussion potential changes to python (was: PEP 572: Statement-Local Name Bindings)
In-Reply-To: <20180228221210.adiz2s4lyityrv26@phdru.name>
References: <172401d3b0de$b6ad10e0$240732a0$@sdamon.com> <20180228221210.adiz2s4lyityrv26@phdru.name>
Message-ID: 

If the Linux kernel can handle it, there is no argument for it being
factually superior or inferior. It is only a preference.

There is nothing stopping a forum link being created and posted to the
list as an alternative right now.

The result of that experiment would be the answer.

On Wed, Feb 28, 2018, 14:14 Oleg Broytman wrote:

> The topic has been discussed to death with all possible pros and cons.
> I've published my personal collection of pros and cons at
> http://phdru.name/Software/mail-vs-web.html
>
> And my personal bottom line is: I still prefer mailing lists but I
> know their advantages and disadvantages and I've invested a lot of
> resources to learn and configure my tools.
>
> Oleg.
> --
> Oleg Broytman http://phdru.name/ phd at phdru.name
> Programmers don't die, they just GOSUB without RETURN.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From phd at phdru.name Wed Feb 28 17:39:46 2018
From: phd at phdru.name (Oleg Broytman)
Date: Wed, 28 Feb 2018 23:39:46 +0100
Subject: [Python-ideas] Medium for discussion potential changes to python (was: PEP 572: Statement-Local Name Bindings)
In-Reply-To: 
References: <172401d3b0de$b6ad10e0$240732a0$@sdamon.com> <20180228221210.adiz2s4lyityrv26@phdru.name>
Message-ID: <20180228223946.ytgtnbcv32pgyfno@phdru.name>

On Wed, Feb 28, 2018 at 10:23:08PM +0000, Matt Arcidy wrote:
> If the Linux kernel can handle it, there is no argument for it being
> factually superior or inferior. It is only a preference.
>
> There is nothing stopping a forum link being created and posted to the
> list as an alternative right now.
>
> The result of that experiment would be the answer.

The problem with that approach is division inside the community and
miscommunication between subgroups. One medium (currently it's the
mailing list) is preferable. Once an idea is discussed using the
preferred medium and code is created, the other groups will feel they
have been singled out and their ideas ignored.

Oleg.
--
Oleg Broytman http://phdru.name/ phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.

From mertz at gnosis.cx Wed Feb 28 20:21:55 2018
From: mertz at gnosis.cx (David Mertz)
Date: Wed, 28 Feb 2018 17:21:55 -0800
Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings
In-Reply-To: 
References: 
Message-ID: 

On Tue, Feb 27, 2018 at 11:46 PM, Matt Arcidy wrote:

> From readability, the examples put forth have been to explain the
> advantage, with which I agree. However, i do not believe this scales well.
>
> [(foo(x,y) as g)*(bar(y) as i) + g*foo(x,a) +baz(g,i) for x... for y...]
>

This definitely looks hard to read. Let's compare it to:

lst = []
for x in something:
    for y in other_thing:
        g = foo(x, y)
        i = bar(y)
        lst.append(g*foo(x,a) + baz(g,i))

Obviously the one-liner is shorter, but the full loop looks a heck of a
lot more readable to me.
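(As an aside, the intermediate names can also be kept without any new
syntax by factoring the body into a small helper function. The sketch
below is purely illustrative: "term" is a made-up name, and foo, bar,
baz, a, something, and other_thing are just the placeholders from the
example above.)

def term(x, y):
    # Name the sub-expressions once, in an ordinary function scope...
    g = foo(x, y)
    i = bar(y)
    # ...then use them as often as needed.
    return g*foo(x, a) + baz(g, i)

lst = [term(x, y) for x in something for y in other_thing]

Same result as the loop above, the names don't leak anywhere, and no new
syntax is needed.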
I was thinking of an example closer to the PEP like this:

[((my_object.calculate_the_quantity(quant1, vect2, arr3) as x), log(x))
 for quant1 in quants]

Just one "as" clause, but a long enough expression that I wouldn't want to
repeat it. I still feel this suffers in readability compared to the
existing option of (even as a non-unrolled comprehension):

[(x, log(x)) for x in
 (my_object.calculate_the_quantity(quant1, vect2, arr3) for quant1 in quants)]

Sure, again we save a couple of characters under the PEP, but readability
feels harmed, not helped. And most likely this is another thing better
spelled as a regular loop.

--
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons. Intellectual property is
to the 21st century what the slave trade was to the 16th.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From klahnakoski at mozilla.com Wed Feb 28 22:11:06 2018
From: klahnakoski at mozilla.com (Kyle Lahnakoski)
Date: Wed, 28 Feb 2018 22:11:06 -0500
Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings
In-Reply-To: 
References: 
Message-ID: <222e8110-2b74-2a9d-38a4-d28af9486dec@mozilla.com>

On 2018-02-28 02:46, Matt Arcidy wrote:
> From readability, the examples put forth have been to explain the
> advantage, with which I agree. However, i do not believe this scales
> well.
>
> [(foo(x,y) as g)*(bar(y) as i) + g*foo(x,a) +baz(g,i) for x... for y...]
>
> That's 3 functions, 2 iterators, 3 calls saved ('a' is some constant
> just to trigger a new call on foo). I'm not trying to show ugly
> statements can be constructed, but show how quickly in _n iterators
> and _m functions readability declines.

You could put it on multiple lines

[
    (g * i) + g * foo(x, a) + baz(g, i)
    for x in X
    for y in Y
    for g in [foo(x,y)]
    for i in [bar(y)]
]

and then notice a common factor! :)

[
    g * (i + foo(x, a)) + baz(g, i)
    for x in X
    for y in Y
    for g in [foo(x,y)]
    for i in [bar(y)]
]

From ncoghlan at gmail.com Wed Feb 28 23:31:37 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 1 Mar 2018 14:31:37 +1000
Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings
In-Reply-To: 
References: 
Message-ID: 

On 28 February 2018 at 08:27, Chris Angelico wrote:
> This is a suggestion that comes up periodically here or on python-dev.
> This proposal introduces a way to bind a temporary name to the value
> of an expression, which can then be used elsewhere in the current
> statement.
>
> The nicely-rendered version will be visible here shortly:
>
> https://www.python.org/dev/peps/pep-0572/

Thanks for putting this together!

> Statement-local name bindings can be used in any context, but should be
> avoided where regular assignment can be used, just as `lambda` should be
> avoided when `def` is an option.
>
> [snip]
>
> 2. The current implementation [1] implements statement-local names using
> a special (and mostly-invisible) name mangling. This works perfectly
> inside functions (including list comprehensions), but not at top
> level. Is this a serious limitation? Is it confusing?
>

It isn't clear to me from the current PEP what the intended lifecycle of
the bound names actually is, especially for compound statements.
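For instance, a purely hypothetical sketch (read_chunk and process are
made-up names, and the spelling is just the proposed one, which the PEP
doesn't pin down for this case):

    while (read_chunk() as chunk):
        process(chunk)
    print(chunk)  # Still bound here? Or NameError once the while statement ends?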
Using list comprehensions as your example currently hides that confusion, since they create an implicit nested scope anyway, so the references will go away when the list comprehension terminates no matter what. So I think it would be useful to have some other examples in the PEP to more explicitly answer that question: x = (expr as y) assert x == y # Does this pass? Or raise NameError for 'y'? if (condition as c): assert c # Does this pass? Or raise NameError for 'c'? else: assert not c # Does this pass? Or raise NameError for 'c'? assert c or not c # Does this pass? Or raise NameError for 'c'? class C: x = (True as y) assert C.y # Does this pass? Or raise AttributeError for 'y'? I think it would also be worth explicitly considering a syntactic variant that requires statement local references to be explicitly disambiguated from regular variable names by way of a leading dot: result = [[(f(x) as .y), .y] for x in range(5)] (.x, (1 as .x), .x) # UnboundLocalError on first subexpression (x, (1 as .x), .x) # Valid if 'x' is a visible name x = (expr as .y) assert x == .y # UnboundLocalError for '.y' if (condition as .c): assert .c # Passes else: assert not .c # Passes assert .c or not .c # UnboundLocalError for '.c' class C: x = (True as .y) assert C..y # SyntaxError on the second dot Since ".NAME" is illegal for both variable and attribute names, this makes the fact statement locals are a distinct namespace visible to readers as well as to the compiler, and also reduces the syntactic ambiguity in with statements and exception handlers. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Feb 28 23:54:52 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 1 Mar 2018 14:54:52 +1000 Subject: [Python-ideas] PEP 572: Statement-Local Name Bindings In-Reply-To: References: <54c90d32-ef5f-229d-28d8-2e3433eadf0b@btinternet.com> <5A96C673.4070804@stoneleaf.us> <5A970476.3000101@brenbarn.net> Message-ID: On 1 March 2018 at 06:00, Chris Angelico wrote: > On Thu, Mar 1, 2018 at 6:35 AM, Brendan Barnwell > wrote: > > On 2018-02-28 07:18, Chris Angelico wrote: > >> > >> Except that assignment is evaluated RHS before LHS as part of a single > >> statement. When Python goes to look up the name "a" to store it (as > >> the final step of the assignment), the SLNB is still active (it's > >> still the same statement - note that this is NOT expression-local), so > >> it uses the temporary. > > > > > > Wait, so you're saying that if I do > > > > a = (2 as a) > > > > The "a = " assignment assigns to the SLNB, and so is then > discarded > > after the statement finishes? > > > > That seems very bad to me. If there are SLNBs with this special > > "as" syntax, I think the ONLY way to assign to an SLNB should be with the > > "as" syntax. You shouldn't be able to assign to an SLNB with regular > > assignment syntax, even if you created an SNLB with the same name as the > LHS > > within the RHS. > > That seems a reasonable requirement on the face of it, but what about > these variants? > > a = (x as a) > a[b] = (x as a) > b[a] = (x as a) > a[b].c = (x as a) > b[a].c = (x as a) > > Which of these should use the SLNB, which should be errors, which > should use the previously-visible binding of 'a'? 
> This is the kind of ambiguity of intent that goes away if statement locals are made syntactically distinct in addition to being semantically distinct: .a = (2 as .a) # Syntax error (persistent bindings can't target statement locals) a = (2 as .a) # Binds both ".a" (ephemerally) and "a" (persistently) to "2" .a[b] = (x as .a) # Syntax error (persistent bindings can't target statement locals) b[.a] = (x as .a) # LHS references .a .a[b].c = (x as .a) # Syntax error (persistent bindings can't target statement locals) b[.a].c = (x as .a) # LHS references .a We may still decide that even the syntactically distinct variant poses a net loss to overall readability, but I do think it avoids many of the confusability problems that arise when statement locals use the same reference syntax as regular variable names. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: