From storchaka at gmail.com Thu Jun 1 02:47:57 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 1 Jun 2017 09:47:57 +0300
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
Message-ID:

What do you think about adding Unicode aliases for some mathematical names in the math module? ;-)

math.π = math.pi
math.τ = math.tau
math.γ = math.gamma
math.ℯ = math.e

Unfortunately we can't use ∑, √ and ∫ as identifiers. :-(

From mertz at gnosis.cx Thu Jun 1 03:03:18 2017
From: mertz at gnosis.cx (David Mertz)
Date: Thu, 1 Jun 2017 00:03:18 -0700
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: References: Message-ID:

It's awfully easy to add in your own code. Since they are simply aliases, I don't see why we should bother putting the duplication in the standard library. You can even monkey patch if you want it in the 'math' namespace. I'd prefer a bare 'π' in my own code though.

On May 31, 2017 11:48 PM, "Serhiy Storchaka" wrote:

What do you think about adding Unicode aliases for some mathematical names in the math module? ;-)

math.π = math.pi
math.τ = math.tau
math.γ = math.gamma
math.ℯ = math.e

Unfortunately we can't use ∑, √ and ∫ as identifiers. :-(

_______________________________________________
Python-ideas mailing list
Python-ideas at python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tomuxiong at gmx.com Thu Jun 1 03:12:31 2017
From: tomuxiong at gmx.com (Thomas Nyberg)
Date: Thu, 1 Jun 2017 00:12:31 -0700
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: References: Message-ID: <1e7b2c33-79b2-7c52-6b0f-a73508c37da0@gmx.com>

On 05/31/2017 11:47 PM, Serhiy Storchaka wrote:
> What do you think about adding Unicode aliases for some mathematical
> names in the math module? ;-)

I personally don't like there being multiple symbols with the same meaning, and I never find myself confused by the longer names versus the shorter symbols I would use when writing them.

Cheers,
Thomas

From greg.ewing at canterbury.ac.nz Thu Jun 1 03:11:39 2017
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 01 Jun 2017 19:11:39 +1200
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: References: Message-ID: <592FBE2B.1090004@canterbury.ac.nz>

Serhiy Storchaka wrote:
>
> Unfortunately we can't use ∑, √ and ∫ as identifiers. :-(

Maybe LaTeX formulas should be allowed in Python. Then you could pretty-print your mathematical programs.

--
Greg

From levkivskyi at gmail.com Thu Jun 1 03:14:34 2017
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Thu, 1 Jun 2017 09:14:34 +0200
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: References: Message-ID:

Although it is trivial, I like the idea (except for math.e maybe).
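David Mertz's monkey-patching remark above can be made concrete with a minimal sketch (modules are ordinary objects, so Greek aliases can be attached at runtime; `math.tau` requires Python 3.6+):

```python
import math

# Attach Greek aliases at runtime; no change to the stdlib is needed.
math.π = math.pi
math.τ = math.tau  # math.tau exists since Python 3.6

assert math.π == math.pi
assert math.τ == 2 * math.pi
```

The same trick works for any module attribute, which is why several replies consider a stdlib change unnecessary.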
(And it would be cool to be able to also write ∑ = sum, etc.)

--
Ivan

On 1 June 2017 at 08:47, Serhiy Storchaka wrote:

> What do you think about adding Unicode aliases for some mathematical names
> in the math module? ;-)
>
> math.π = math.pi
> math.τ = math.tau
> math.γ = math.gamma
> math.ℯ = math.e
>
> Unfortunately we can't use ∑, √ and ∫ as identifiers. :-(
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From contact at brice.xyz Thu Jun 1 03:17:28 2017
From: contact at brice.xyz (Brice PARENT)
Date: Thu, 1 Jun 2017 09:17:28 +0200
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: References: Message-ID: <3b2bce0a-97a3-3908-db0d-054ee54c4a71@brice.xyz>

Why not simply use

from math import pi as π

and so on? It makes your math formulas more readable (taking out the "math." entirely), without requiring any change to the module.

On 01/06/17 at 08:47, Serhiy Storchaka wrote:
> What do you think about adding Unicode aliases for some mathematical
> names in the math module? ;-)
>
> math.π = math.pi
> math.τ = math.tau
> math.γ = math.gamma
> math.ℯ = math.e
>
> Unfortunately we can't use ∑, √ and ∫ as identifiers. :-(
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From stephanh42 at gmail.com Thu Jun 1 03:20:51 2017
From: stephanh42 at gmail.com (Stephan Houben)
Date: Thu, 1 Jun 2017 09:20:51 +0200
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: References: Message-ID:

Hi all,

Two remarks:

1. Note that ℯ also doesn't really work.
While you can assign to this identifier, it actually gets normalized into a plain "e".

2. Unicode has a Σ : GREEK CAPITAL LETTER SIGMA and a ∑ : N-ARY SUMMATION

The first is a valid Python identifier, the second is not. Unfortunately, the second has the desired semantics...

Stephan

2017-06-01 9:14 GMT+02:00 Ivan Levkivskyi :
> Although it is trivial, I like the idea (except for math.e maybe).
> (And it would be cool to be able to also write ∑ = sum, etc.)
>
> --
> Ivan
>
> On 1 June 2017 at 08:47, Serhiy Storchaka wrote:
>>
>> What do you think about adding Unicode aliases for some mathematical names
>> in the math module? ;-)
>>
>> math.π = math.pi
>> math.τ = math.tau
>> math.γ = math.gamma
>> math.ℯ = math.e
>>
>> Unfortunately we can't use ∑, √ and ∫ as identifiers. :-(
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/

From srkunze at mail.de Thu Jun 1 03:19:56 2017
From: srkunze at mail.de (Sven R. Kunze)
Date: Thu, 1 Jun 2017 09:19:56 +0200
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: References: Message-ID:

On 01.06.2017 08:47, Serhiy Storchaka wrote:
> What do you think about adding Unicode aliases for some mathematical
> names in the math module? ;-)
>
> math.π = math.pi
> math.τ = math.tau
> math.γ = math.gamma
> math.ℯ = math.e
>
> Unfortunately we can't use ∑, √ and ∫ as identifiers. :-(
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

No, thank you.
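Stephan's two remarks can be verified directly: Python applies NFKC normalization to identifiers, which folds ℯ (U+212F) into a plain "e", and `str.isidentifier()` distinguishes the letter Σ from the operator ∑:

```python
import unicodedata

# ℯ (U+212F, SCRIPT SMALL E) NFKC-normalizes to "e", so the statement
# "ℯ = 2.718" would actually bind the ordinary name "e".
assert unicodedata.normalize("NFKC", "\u212f") == "e"

# Σ (GREEK CAPITAL LETTER SIGMA) is a valid identifier;
# ∑ (N-ARY SUMMATION) is a math symbol, not an identifier character.
assert "\u03a3".isidentifier()       # Σ
assert not "\u2211".isidentifier()   # ∑
```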
:) One way to do it is perfectly fine.

Regards,
Sven
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From stephanh42 at gmail.com Thu Jun 1 03:30:28 2017
From: stephanh42 at gmail.com (Stephan Houben)
Date: Thu, 1 Jun 2017 09:30:28 +0200
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: <3b2bce0a-97a3-3908-db0d-054ee54c4a71@brice.xyz> References: <3b2bce0a-97a3-3908-db0d-054ee54c4a71@brice.xyz> Message-ID:

Or perhaps create a small module:

============unimath.py==============
import math

__all__ = ["π", "τ", "γ"]

π = math.pi
τ = math.tau
γ = math.gamma
====================================

Then do:

from unimath import *

Put it on the Python Package Index. If it gets wildly popular the case for putting it in `math` will be greatly strengthened.

Stephan

2017-06-01 9:17 GMT+02:00 Brice PARENT :
> Why not simply use
>
> from math import pi as π
>
> and so on? It makes your math formulas more readable (taking out the "math."
> entirely), without requiring any change to the module.
>
> On 01/06/17 at 08:47, Serhiy Storchaka wrote:
>
>> What do you think about adding Unicode aliases for some mathematical names
>> in the math module? ;-)
>>
>> math.π = math.pi
>> math.τ = math.tau
>> math.γ = math.gamma
>> math.ℯ = math.e
>>
>> Unfortunately we can't use ∑, √ and ∫ as identifiers.
:-(
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From lucas.wiman at gmail.com Thu Jun 1 03:31:32 2017
From: lucas.wiman at gmail.com (Lucas Wiman)
Date: Thu, 1 Jun 2017 00:31:32 -0700
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: References: Message-ID:

I like the aesthetic of the idea, but it seems like this would be a better fit in a library namespace like sympy or jupyter.

On Thu, Jun 1, 2017 at 12:19 AM, Sven R. Kunze wrote:
> On 01.06.2017 08:47, Serhiy Storchaka wrote:
>
> What do you think about adding Unicode aliases for some mathematical names
> in the math module? ;-)
>
> math.π = math.pi
> math.τ = math.tau
> math.γ = math.gamma
> math.ℯ = math.e
>
> Unfortunately we can't use ∑, √ and ∫ as identifiers. :-(
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
> No, thank you. :)
>
> One way to do it is perfectly fine.
>
> Regards,
> Sven
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From phd at phdru.name Thu Jun 1 03:08:46 2017
From: phd at phdru.name (Oleg Broytman)
Date: Thu, 1 Jun 2017 09:08:46 +0200
Subject: [Python-ideas] π
= math.pi
In-Reply-To: References: Message-ID: <20170601070846.GA19860@phdru.name>

On Thu, Jun 01, 2017 at 09:47:57AM +0300, Serhiy Storchaka wrote:
> What do you think about adding Unicode aliases for some mathematical names in
> the math module? ;-)
>
> math.π = math.pi
> math.τ = math.tau
> math.γ = math.gamma
> math.ℯ = math.e
>
> Unfortunately we can't use ∑, √ and ∫ as identifiers. :-(

-1. "There should be one-- and preferably only one --obvious way to do it."

And -1 for non-ASCII in stdlib.

Oleg.
--
Oleg Broytman http://phdru.name/ phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.

From storchaka at gmail.com Thu Jun 1 03:47:12 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 1 Jun 2017 10:47:12 +0300
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: References: Message-ID:

01.06.17 10:03, David Mertz wrote:
> It's awfully easy to add in your own code. Since they are simply
> aliases, I don't see why we should bother putting the duplication in the
> standard library. You can even monkey patch if you want it in the 'math'
> namespace. I'd prefer a bare 'π' in my own code though.

As well as adding tau = 2*math.pi in your own code. But this deserved the whole PEP and was added as a feature in 3.6.

From stephanh42 at gmail.com Thu Jun 1 03:53:00 2017
From: stephanh42 at gmail.com (Stephan Houben)
Date: Thu, 1 Jun 2017 09:53:00 +0200
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: References: Message-ID:

Tau was kind of a joke.

Stephan

On 1 Jun 2017 at 09:47, "Serhiy Storchaka" wrote:

01.06.17 10:03, David Mertz wrote:

> It's awfully easy to add in your own code. Since they are simply aliases, I
> don't see why we should bother putting the duplication in the standard
> library. You can even monkey patch if you want it in the 'math' namespace.
> I'd prefer a bare 'π' in my own code though.

As well as adding tau = 2*math.pi in your own code. But this deserved the whole PEP and was added as a feature in 3.6.
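Brice's `from math import pi as π` suggestion and the tau relation mentioned above can be checked in a short sketch (Python 3.6+, where `math.tau` was added):

```python
import math
from math import pi as π, tau as τ  # π and τ are valid Python identifiers

assert τ == 2 * π           # math.tau is exactly twice math.pi
assert math.tau == 2 * math.pi
```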
_______________________________________________
Python-ideas mailing list
Python-ideas at python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From antoine.rozo at gmail.com Thu Jun 1 03:54:17 2017
From: antoine.rozo at gmail.com (Antoine Rozo)
Date: Thu, 1 Jun 2017 09:54:17 +0200
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: References: Message-ID:

Does this really make expressions more readable? Moreover, these characters are difficult to type.

2017-06-01 9:53 GMT+02:00 Stephan Houben :
> Tau was kind of a joke.
>
> Stephan
>
> On 1 Jun 2017 at 09:47, "Serhiy Storchaka" wrote:
>
> 01.06.17 10:03, David Mertz wrote:
>
>> It's awfully easy to add in your own code. Since they are simply aliases,
>> I don't see why we should bother putting the duplication in the standard
>> library. You can even monkey patch if you want it in the 'math' namespace.
>> I'd prefer a bare 'π' in my own code though.
>
> As well as adding tau = 2*math.pi in your own code. But this deserved the
> whole PEP and was added as a feature in 3.6.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

--
Antoine Rozo
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From storchaka at gmail.com Thu Jun 1 04:06:21 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 1 Jun 2017 11:06:21 +0300
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: References: Message-ID:

01.06.17 10:53, Stephan Houben wrote:
> Tau was kind of a joke.

math.τ is a kind of joke too.

Honestly, it is strange that Python allows Unicode identifiers but doesn't have a single one in its stdlib!

From stephanh42 at gmail.com Thu Jun 1 05:32:38 2017
From: stephanh42 at gmail.com (Stephan Houben)
Date: Thu, 1 Jun 2017 11:32:38 +0200
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: References: Message-ID:

Hi Serhiy,

> math.τ is a kind of joke too.

The point is that tau, being a joke, should not be considered as setting a precedent.

> Honestly, it is strange that Python allows Unicode identifiers but doesn't
> have a single one in its stdlib!

Actually it is policy: https://www.python.org/dev/peps/pep-3131/#policy-specification

"""
Policy Specification

As an addition to the Python Coding style, the following policy is prescribed: All identifiers in the Python standard library MUST use ASCII-only identifiers, and SHOULD use English words wherever feasible (in many cases, abbreviations and technical terms are used which aren't English).

In addition, string literals and comments must also be in ASCII. The only exceptions are (a) test cases testing the non-ASCII features, and (b) names of authors. Authors whose names are not based on the Latin alphabet MUST provide a Latin transliteration of their names.
"""

Stephan

2017-06-01 10:06 GMT+02:00 Serhiy Storchaka :
> 01.06.17 10:53, Stephan Houben wrote:
>> Tau was kind of a joke.
>
> math.τ is a kind of joke too.
>
> Honestly, it is strange that Python allows Unicode identifiers but doesn't
> have a single one in its stdlib!
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From victor.stinner at gmail.com Thu Jun 1 05:49:43 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 1 Jun 2017 11:49:43 +0200
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: References: Message-ID:

2017-06-01 8:47 GMT+02:00 Serhiy Storchaka :
> What do you think about adding Unicode aliases for some mathematical names in
> the math module? ;-)
>
> math.π = math.pi

How do you write π (pi) with a keyboard on Windows, Linux or macOS?

Victor

From stephanh42 at gmail.com Thu Jun 1 06:09:44 2017
From: stephanh42 at gmail.com (Stephan Houben)
Date: Thu, 1 Jun 2017 12:09:44 +0200
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: References: Message-ID:

> How do you write π (pi) with a keyboard on Windows, Linux or macOS?

On macOS, ⌥P (Option-P) works.

On all platforms:
1. Make sure you are using Vim.
2. In insert mode: Ctrl-K *p

You can also define abbrevs which will allow you to type pi\ and it gets replaced by π. See:
https://gist.github.com/stephanh42/fc466e62bfb022a890ff2c4643eaf3a5

I presume Emacs can do something similar.

Or you get this keyboard: https://imgur.com/gallery/tCNvP ;-)

Stephan

2017-06-01 11:49 GMT+02:00 Victor Stinner :
> 2017-06-01 8:47 GMT+02:00 Serhiy Storchaka :
>> What do you think about adding Unicode aliases for some mathematical names in
>> the math module? ;-)
>>
>> math.π = math.pi
>
> How do you write π (pi) with a keyboard on Windows, Linux or macOS?
>
> Victor
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From storchaka at gmail.com Thu Jun 1 06:14:36 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 1 Jun 2017 13:14:36 +0300
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: References: Message-ID:

01.06.17 12:49, Victor Stinner wrote:
> 2017-06-01 8:47 GMT+02:00 Serhiy Storchaka :
>> What do you think about adding Unicode aliases for some mathematical names in
>> the math module? ;-)
>>
>> math.π = math.pi
>
> How do you write π (pi) with a keyboard on Windows, Linux or macOS?

This shouldn't be a problem for Greek users. ;-)

Personally I copied them from a character table.

From storchaka at gmail.com Thu Jun 1 06:16:40 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 1 Jun 2017 13:16:40 +0300
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: References: Message-ID:

01.06.17 12:32, Stephan Houben wrote:
>> math.τ is a kind of joke too.
>
> The point is that tau, being a joke, should not be considered as
> setting a precedent.

If we add one joke feature per release, this one looks harmless enough.

From stephanh42 at gmail.com Thu Jun 1 06:29:11 2017
From: stephanh42 at gmail.com (Stephan Houben)
Date: Thu, 1 Jun 2017 12:29:11 +0200
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: References: Message-ID:

> This shouldn't be a problem for Greek users. ;-)

Well, they still need to switch between keymaps, since presumably they used the Latin keymap to enter `math.` before they can enter π.

That is actually another general solution: just install the Greek keymap in addition to your native keymap. The OS typically provides keyboard shortcuts to switch between keymaps.

??? ? ??? ??? ????? ?????????!

OK, that works.
Stephan

2017-06-01 12:14 GMT+02:00 Serhiy Storchaka :
> 01.06.17 12:49, Victor Stinner wrote:
>>
>> 2017-06-01 8:47 GMT+02:00 Serhiy Storchaka :
>>>
>>> What do you think about adding Unicode aliases for some mathematical names
>>> in the math module? ;-)
>>>
>>> math.π = math.pi
>>
>> How do you write π (pi) with a keyboard on Windows, Linux or macOS?
>
> This shouldn't be a problem for Greek users. ;-)
>
> Personally I copied them from a character table.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From stephanh42 at gmail.com Thu Jun 1 06:36:54 2017
From: stephanh42 at gmail.com (Stephan Houben)
Date: Thu, 1 Jun 2017 12:36:54 +0200
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: References: Message-ID:

Hi Serhiy,

For the record, *I* am a complete nobody, it's Guido you will have to convince.

I am not sure that presenting it as a joke feature is the road to success, but admittedly it has at least worked once before.

Stephan

2017-06-01 12:16 GMT+02:00 Serhiy Storchaka :
> 01.06.17 12:32, Stephan Houben wrote:
>>> math.τ is a kind of joke too.
>>
>> The point is that tau, being a joke, should not be considered as
>> setting a precedent.
>
> If we add one joke feature per release, this one looks harmless enough.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From victor.stinner at gmail.com Thu Jun 1 08:17:05 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 1 Jun 2017 14:17:05 +0200
Subject: [Python-ideas] SealedMock proposal for unittest.mock
In-Reply-To: References: Message-ID:

A stricter mock object cannot be a bad idea :-) I am not sure about your proposed API: some random code may already use this attribute. Maybe it could be a seal(mock) function which sets a secret attribute with a less common name?

Yeah, please open an issue on bugs.python.org ;-)

Victor

On 29 May 2017 at 11:33 PM, "Mario Corchero" wrote:

> Hello Everyone!
>
> First time writing to python-ideas.
>
> *Overview*
> Add a new mock class within the mock module,
> SealedMock (or RestrictedMock), that allows restricting, in a dynamic and
> recursive way, the addition of attributes to it. The new class just defines
> a special attribute "sealed" which, once set to True, blocks the behaviour of
> automatically creating mocks, for the mock itself and all its "submocks".
> See sealedmock. Don't
> focus on the implementation, it is ugly; it would be much simpler within
> *mock.py*.
>
> *Rationale*
> Inspired by GMock's RestrictedMock,
> SealedMock aims to allow the developer to define a narrow interface to the
> mock that defines what the mock allows to be called on it.
> The feature of mocks returning mocks by default is extremely useful but
> not always desired. Quite often you rely on it only at the time you are
> writing the test, but you want it to be disabled at the time the mock is
> passed into your code; that is what SealedMock aims to address.
>
> This solution also prevents user errors when mocking incorrect paths or
> having typos when calling attributes/methods of the mock.
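The sealing behaviour proposed above can be demonstrated with `unittest.mock.seal`, the helper that eventually shipped in Python 3.7 (earlier versions would need a hand-rolled equivalent):

```python
from unittest.mock import Mock, seal  # seal() requires Python 3.7+

m = Mock()
m.method.return_value = 42  # attributes configured before sealing keep working

seal(m)  # recursively blocks automatic creation of new attribute mocks

assert m.method() == 42
try:
    m.unknown_attribute  # before seal() this would silently return a new Mock
except AttributeError:
    pass
else:
    raise AssertionError("sealed mock should reject unknown attributes")
```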
> We have tried it internally in our company and it gives a much nicer user
> experience for many use cases, especially for new users of mock, as it helps
> out when you mock the wrong path.
>
> *Alternatives*
>
> - Using auto_spec/spec is a possible solution but removes flexibility
> and is rather painful to write for each of the mocks and submocks being
> used.
> - Leaving it outside of mock.py as it is not interesting enough. I
> am fine with it :) just proposing it in case you think otherwise.
> - Make it part of the standard Mock base class. Works for me, but I'm
> concerned about how we can do it in a backward compatible way. (Say someone is
> mocking something that has a "sealed" attribute already.)
>
> Let me know what you think, happy to open an enhancement in
> https://bugs.python.org/ and send a PR.
>
> Regards,
> Mario
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From julien at gns3.net Thu Jun 1 08:29:19 2017
From: julien at gns3.net (Julien Duponchelle)
Date: Thu, 01 Jun 2017 12:29:19 +0000
Subject: [Python-ideas] SealedMock proposal for unittest.mock
In-Reply-To: References: Message-ID:

Perhaps it could be set via configure_mock. This would prevent conflicts with existing code.

Julien

On Thu, Jun 1, 2017 at 2:17 PM Victor Stinner wrote:

> A stricter mock object cannot be a bad idea :-) I am not sure about your
> proposed API: some random code may already use this attribute. Maybe it could
> be a seal(mock) function which sets a secret attribute with a less common
> name?
>
> Yeah, please open an issue on bugs.python.org ;-)
>
> Victor
>
> On 29 May 2017 at 11:33 PM, "Mario Corchero" wrote:
>
>> Hello Everyone!
>>
>> First time writing to python-ideas.
>>
>> *Overview*
>> Add a new mock class within the mock module,
>> SealedMock (or RestrictedMock), that allows restricting, in a dynamic and
>> recursive way, the addition of attributes to it. The new class just defines
>> a special attribute "sealed" which, once set to True, blocks the behaviour of
>> automatically creating mocks, for the mock itself and all its "submocks".
>> See sealedmock. Don't
>> focus on the implementation, it is ugly; it would be much simpler within
>> *mock.py*.
>>
>> *Rationale*
>> Inspired by GMock's RestrictedMock,
>> SealedMock aims to allow the developer to define a narrow interface to the
>> mock that defines what the mock allows to be called on it.
>> The feature of mocks returning mocks by default is extremely useful but
>> not always desired. Quite often you rely on it only at the time you are
>> writing the test, but you want it to be disabled at the time the mock is
>> passed into your code; that is what SealedMock aims to address.
>>
>> This solution also prevents user errors when mocking incorrect paths or
>> having typos when calling attributes/methods of the mock.
>> We have tried it internally in our company and it gives a much nicer
>> user experience for many use cases, especially for new users of mock, as it
>> helps out when you mock the wrong path.
>>
>> *Alternatives*
>>
>> - Using auto_spec/spec is a possible solution but removes flexibility
>> and is rather painful to write for each of the mocks and submocks being
>> used.
>> - Leaving it outside of mock.py as it is not interesting enough.
>> I am fine with it :) just proposing it in case you think otherwise.
>> - Make it part of the standard Mock base class. Works for me, but I'm
>> concerned about how we can do it in a backward compatible way. (Say someone is
>> mocking something that has a "sealed" attribute already.)
>>
>> Let me know what you think, happy to open an enhancement in
>> https://bugs.python.org/ and send a PR.
>>
>> Regards,
>> Mario
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From z+py+pyideas at m0g.net Thu Jun 1 08:32:40 2017
From: z+py+pyideas at m0g.net (Guyzmo)
Date: Thu, 1 Jun 2017 14:32:40 +0200
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: References: Message-ID: <20170601123240.tyymxd6fid6wofte@BuGz.eclipse.m0g.net>

On Thu, Jun 01, 2017 at 11:49:43AM +0200, Victor Stinner wrote:
> How do you write π (pi) with a keyboard on Windows, Linux or macOS?

Use the compose key.

for linux: https://help.ubuntu.com/community/ComposeKey
for windows: https://github.com/SamHocevar/wincompose
for macosx: http://lol.zoy.org/blog/2012/06/17/compose-key-on-os-x

Then I wrote my own ~/.XCompose file with:
: "π" U03C0 # GREEK SMALL LETTER PI

so it's like the vim digraphs.

Cheers,

--
zmo

From mariocj89 at gmail.com Thu Jun 1 08:43:44 2017
From: mariocj89 at gmail.com (Mario Corchero)
Date: Thu, 1 Jun 2017 13:43:44 +0100
Subject: [Python-ideas] SealedMock proposal for unittest.mock
In-Reply-To: References: Message-ID:

Having it part of the existing Mock class might be great. I really like the idea of mock.seal(mock_object). Let me try it out and draft some code and I'll open the issue. Thanks :)

On 1 June 2017 at 13:29, Julien Duponchelle wrote:

> Perhaps it could be set via configure_mock. This would prevent conflicts with
> existing code.
>
> Julien
>
> On Thu, Jun 1, 2017 at 2:17 PM Victor Stinner wrote:
>
>> A stricter mock object cannot be a bad idea :-) I am not sure about your
>> proposed API: some random code may already use this attribute. Maybe it could
>> be a seal(mock) function which sets a secret attribute with a less common
>> name?
>>
>> Yeah, please open an issue on bugs.python.org ;-)
>>
>> Victor
>>
>> On 29 May 2017 at 11:33 PM, "Mario Corchero" wrote:
>>
>>> Hello Everyone!
>>>
>>> First time writing to python-ideas.
>>>
>>> *Overview*
>>> Add a new mock class within the mock module,
>>> SealedMock (or RestrictedMock), that allows restricting, in a dynamic and
>>> recursive way, the addition of attributes to it. The new class just defines
>>> a special attribute "sealed" which, once set to True, blocks the behaviour of
>>> automatically creating mocks, for the mock itself and all its "submocks".
>>> See sealedmock. Don't
>>> focus on the implementation, it is ugly; it would be much simpler within
>>> *mock.py*.
>>>
>>> *Rationale*
>>> Inspired by GMock's RestrictedMock,
>>> SealedMock aims to allow the developer to define a narrow interface to the
>>> mock that defines what the mock allows to be called on it.
>>> The feature of mocks returning mocks by default is extremely useful but
>>> not always desired.
Quite often you rely on it only at the time you are >>> writing the test but you want it to be disabled at the time the mock is >>> passed into your code, that is what SealedMock aims to address. >>> >>> This solution also prevents user errors when mocking incorrect paths or >>> having typos when calling attributes/methods of the mock. >>> We have tried it internally in our company and it gives quite a nicer >>> user experience for many use cases, specially for new users of mock as it >>> helps out when you mock the wrong path. >>> >>> *Alternatives* >>> >>> - Using auto_spec/spec is a possible solution but removes >>> flexibility and is rather painful to write for each of the mocks and >>> submocks being used. >>> - Leaving it outside of the mock.py as it is not interesting enough. >>> I am fine with it :) just proposing it in case you think otherwise. >>> - Make it part of the standard Mock base class. Works for me, but >>> I'd concerned on how can we do it in a backward compatible way. (Say >>> someone is mocking something that has a "sealed" attribute already). >>> >>> Let me know what you think, happy to open a enhancement in >>> https://bugs.python.org/ and send a PR. >>> >>> Regards, >>> Mario >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >>> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ma3yuki.8mamo10 at gmail.com Thu Jun 1 10:08:27 2017 From: ma3yuki.8mamo10 at gmail.com (Masayuki YAMAMOTO) Date: Thu, 1 Jun 2017 23:08:27 +0900 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: The width of Greek letters is East Asian Ambiguous. Using ambiguous-width characters can break source code layout in some locales. Masayuki 2017-06-01 15:47 GMT+09:00 Serhiy Storchaka : > What you are think about adding Unicode aliases for some mathematic names > in the math module? ;-) > > math.π = math.pi > math.τ = math.tau > math.Γ = math.gamma > math.ℯ = math.e > > Unfortunately we can't use ?, ? and ? as identifiers. :-( > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nanjekyejoannah at gmail.com Thu Jun 1 10:17:49 2017 From: nanjekyejoannah at gmail.com (joannah nanjekye) Date: Thu, 1 Jun 2017 17:17:49 +0300 Subject: [Python-ideas] Allow function to return multiple values Message-ID: Hello Team, I am Joannah. I am currently working on a book on python compatibility and publishing it with Apress. I have worked with python for a while; we are talking about four years. Today I was writing an example snippet for the book and needed to write a function that returns two values something like this: def return_multiplevalues(num1, num2): return num1, num2 I noticed that this actually returns a tuple of the values which I did not want in the first place. I wanted python to return two values in their own types so I can work with them as they are but here I was stuck with working around a tuple. My proposal is we provide a way of functions returning multiple values.
This has been implemented in languages like Go and I have found many cases where I needed and used such a functionality. I wish for this convenience in python so that I don't have to suffer going around a tuple. I will appreciate discussing this. You may also bring to light any current way of returning multiple values from a function that I may not know of in python if there is. Kind regards, Joannah -- Joannah Nanjekye +256776468213 F : Nanjekye Captain Joannah S : joannah.nanjekye T : @Captain_Joannah SO : joannah *"You think you know when you learn, are more sure when you can write, even more when you can teach, but certain when you can program." Alan J. Perlis* -------------- next part -------------- An HTML attachment was scrubbed... URL: From markusmeskanen at gmail.com Thu Jun 1 10:21:05 2017 From: markusmeskanen at gmail.com (Markus Meskanen) Date: Thu, 1 Jun 2017 17:21:05 +0300 Subject: [Python-ideas] Allow function to return multiple values In-Reply-To: References: Message-ID: Why isn't a tuple enough? You can do automatic tuple unpack: v1, v2 = return_multiplevalues(1, 2) On Jun 1, 2017 17:18, "joannah nanjekye" wrote: Hello Team, I am Joannah. I am currently working on a book on python compatibility and publishing it with apress. I have worked with python for a while we are talking about four years. Today I was writing an example snippet for the book and needed to write a function that returns two values something like this: def return_multiplevalues(num1, num2): return num1, num2 I noticed that this actually returns a tuple of the values which I did not want in the first place.I wanted python to return two values in their own types so I can work with them as they are but here I was stuck with working around a tuple. My proposal is we provide a way of functions returning multiple values. This has been implemented in languages like Go and I have found many cases where I needed and used such a functionality. 
I wish for this convenience > in python so that I don't have to suffer going around a tuple. > > I will appreciate discussing this. You may also bring to light any current > way of returning multiple values from a function that I may not know of in > python if there is. > > Kind regards, > Joannah > > -- > Joannah Nanjekye > +256776468213 > F : Nanjekye Captain Joannah > S : joannah.nanjekye > T : @Captain_Joannah > SO : joannah > > > *"You think you know when you learn, are more sure when you can write, > even more when you can teach, but certain when you can program." Alan J. > Perlis* > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From elazarg at gmail.com Thu Jun 1 10:27:13 2017 From: elazarg at gmail.com (=?UTF-8?B?15DXnNei15bXqA==?=) Date: Thu, 01 Jun 2017 14:27:13 +0000 Subject: [Python-ideas] Allow function to return multiple values In-Reply-To: References: Message-ID: What is the difference between returning a tuple and returning two values? I think at least theoretically it's different wording for precisely the same thing. Elazar On Thu, 1 Jun 2017 at 17:21, Markus Meskanen < markusmeskanen at gmail.com> wrote: > Why isn't a tuple enough? You can do automatic tuple unpack: > > v1, v2 = return_multiplevalues(1, 2) > > > On Jun 1, 2017 17:18, "joannah nanjekye" > wrote: > > Hello Team, > > I am Joannah. I am currently working on a book on python compatibility and > publishing it with apress. I have worked with python for a while we are > talking about four years.
> > Today I was writing an example snippet for the book and needed to write a > function that returns two values something like this: > > def return_multiplevalues(num1, num2): > return num1, num2 > > I noticed that this actually returns a tuple of the values which I did > not want in the first place.I wanted python to return two values in their > own types so I can work with them as they are but here I was stuck with > working around a tuple. > > My proposal is we provide a way of functions returning multiple values. > This has been implemented in languages like Go and I have found many cases > where I needed and used such a functionality. I wish for this convenience > in python so that I don't have to suffer going around a tuple. > > I will appreciate discussing this. You may also bring to light any current > way of returning multiple values from a function that I may not know of in > python if there is. > > Kind regards, > Joannah > > -- > Joannah Nanjekye > +256776468213 > F : Nanjekye Captain Joannah > S : joannah.nanjekye > T : @Captain_Joannah > SO : joannah > > > *"You think you know when you learn, are more sure when you can write, > even more when you can teach, but certain when you can program." Alan J. > Perlis* > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From stephanh42 at gmail.com Thu Jun 1 10:52:28 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Thu, 1 Jun 2017 16:52:28 +0200 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: Hi Masayuki, I admit that my understanding of this issue is very limited. Nevertheless, I would like to point out that the encoding assumed for a Python3 source file never depends on the locale. My understanding is that in the default encoding for Python source files (utf-8), East Asian Ambiguous characters must be assumed narrow. Now there are also legacy encodings where they are fullwidth. But it is always determined by the encoding, which in turn is specified or implied in the source file. So I don't actually see an issue here. Am I missing something? Stephan On 1 Jun 2017 at 16:08, "Masayuki YAMAMOTO" wrote: The width of Greek letters is East Asian Ambiguous. Using ambiguous width characters possibly will be a reason that is source code layout break on specific locale. Masayuki 2017-06-01 15:47 GMT+09:00 Serhiy Storchaka : > What you are think about adding Unicode aliases for some mathematic names > in the math module? ;-) > > math.π = math.pi > math.τ = math.tau > math.Γ = math.gamma > math.ℯ = math.e > > Unfortunately we can't use ?, ? and ? as identifiers. :-( > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed...
URL: From tomuxiong at gmx.com Thu Jun 1 10:57:54 2017 From: tomuxiong at gmx.com (Thomas Nyberg) Date: Thu, 1 Jun 2017 07:57:54 -0700 Subject: [Python-ideas] Allow function to return multiple values In-Reply-To: References: Message-ID: <96f875db-8508-dbdf-6fb8-cb13eb54d68a@gmx.com> On 06/01/2017 07:17 AM, joannah nanjekye wrote: > a function that returns two values something like this: > > def return_multiplevalues(num1, num2): > return num1, num2 > > I noticed that this actually returns a tuple of the values which I did > not want in the first place.I wanted python to return two values in > their own types so I can work with them as they are but here I was stuck > with working around a tuple. > Why not just unpack the values? E.g. test.py ----------------- def f(a, b): return a, b a, b = f(1, 2) print(a) print(b) ----------------- $ python3 test.py 1 2 Cheers, Thomas From ethan at stoneleaf.us Thu Jun 1 11:16:23 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 01 Jun 2017 08:16:23 -0700 Subject: [Python-ideas] Allow function to return multiple values In-Reply-To: References: Message-ID: <59302FC7.5060804@stoneleaf.us> On 06/01/2017 07:17 AM, joannah nanjekye wrote: > Today I was writing an example snippet for the book and needed to write a function that returns two values something > like this: > > def return_multiplevalues(num1, num2): > return num1, num2 > > I noticed that this actually returns a tuple of the values which I did not want in the first place.I wanted python to > return two values in their own types so I can work with them as they are but here I was stuck with working around a tuple. If you had a function that returned two values, how would you assign them? Maybe something like: var1, var2 = return_multiplevalues(num1, num2) ? That is exactly how Python works. > I will appreciate discussing this. You may also bring to light any current way of returning multiple values from a > function that I may not know of in python if there is. 
While I am somewhat alarmed that you don't know this already after four years of Python programming, I greatly appreciate you taking the time to find out. Thank you. -- ~Ethan~ From victor.stinner at gmail.com Thu Jun 1 12:30:57 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 1 Jun 2017 18:30:57 +0200 Subject: [Python-ideas] Security: remove "." from sys.path? Message-ID: Hi, Perl 5.26 succeeded in removing the current working directory from the default include path (our Python sys.path): https://metacpan.org/pod/release/XSAWYERX/perl-5.26.0/pod/perldelta.pod#Removal-of-the-current-directory-(%22.%22)-from- at INC Would it be technically possible to make this change in Python? Or would it destroy the world? Sorry, it's a naive question (but honestly, I don't know the answer.) My main use case for "." in sys.path is to be able to run an application without installing it: run ./hachoir-metadata which loads the Python "hachoir" module from the script directory. Sometimes, I run explicitly "PYTHONPATH=$PWD ./hachoir-metadata". But I don't think that running an application from the source without installing it is the most common way to run an application. Most users install applications to use them, no? Enabling the isolated mode already prevents "." from being added to sys.path: -I command line option. https://docs.python.org/dev/using/cmdline.html#cmdoption-I There is also an old idea of a "restricted" system Python which would use a "fixed" sys.path. Victor From random832 at fastmail.com Thu Jun 1 12:40:34 2017 From: random832 at fastmail.com (Random832) Date: Thu, 01 Jun 2017 12:40:34 -0400 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: <1496335234.573610.995625920.79AD2E14@webmail.messagingengine.com> On Thu, Jun 1, 2017, at 10:08, Masayuki YAMAMOTO wrote: > The width of Greek letters is East Asian Ambiguous.
Using ambiguous width > characters possibly will be a reason that is source code layout break on > specific locale. I don't think PEP 8 approves anyway of doing the kind of column alignment that this (or for that matter proportional fonts) would break. One example is specifically called out as a "pet peeve". Of course, it also doesn't exactly approve of non-ASCII identifiers (PEP 3131 specifically forbids them in the standard library). From rosuav at gmail.com Thu Jun 1 12:46:47 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 2 Jun 2017 02:46:47 +1000 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: Message-ID: On Fri, Jun 2, 2017 at 2:30 AM, Victor Stinner wrote: > Perl 5.26 succeeded to remove the current working directory from the > default include path (our Python sys.path): > > https://metacpan.org/pod/release/XSAWYERX/perl-5.26.0/pod/perldelta.pod#Removal-of-the-current-directory-(%22.%22)-from- at INC > > Would it technically possible to make this change in Python? Or would > it destroy the world? Sorry, it's a naive question (but honestly, I > don't know the answer.) (AIUI, the *current directory* is never on Python's path, but the *script directory* is. They're the same thing a lot of the time.) All it'd take is one tiny change to Python, and then one tiny change to any multi-file non-package Python app. 1) Make the script directory implicitly function as a package. In effect, assume that there is an empty __init__.py in the same directory as the thing you just ran. 2) Any time a Python app wants to import from its own directory, it needs to "from . import blah" instead of simply "import blah". Then the removal you suggest could be done, without any loss of functionality. The change could alternatively be done as an import hack rather than an actual fake package if that's easier, such that "from . import blah" means either "import from the script directory" or "import from the current package" as appropriate. 
Or, it could be more simply done: 1) Make script-directory-local imports raise a warning, citing packages as the best solution. IMO it's a logical extension to relative imports, and a good solution to the "Idle crashes on startup" problem that comes from someone creating a "random.py" in the current directory. Big +1 from me for this. ChrisA From rosuav at gmail.com Thu Jun 1 12:53:43 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 2 Jun 2017 02:53:43 +1000 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: <1496335234.573610.995625920.79AD2E14@webmail.messagingengine.com> References: <1496335234.573610.995625920.79AD2E14@webmail.messagingengine.com> Message-ID: On Fri, Jun 2, 2017 at 2:40 AM, Random832 wrote: > On Thu, Jun 1, 2017, at 10:08, Masayuki YAMAMOTO wrote: >> The width of Greek letters is East Asian Ambiguous. Using ambiguous width >> characters possibly will be a reason that is source code layout break on >> specific locale. > > I don't think PEP 8 approves anyway of doing the kind of column > alignment that this (or for that matter proportional fonts) would break. > One example is specifically called out as a "pet peeve". > > Of course, it also doesn't exactly approve of non-ASCII identifiers (PEP > 3131 specifically forbids them in the standard library). PEP 8 has nothing against non-ASCII identifiers where they make sense. The Py3 grammar was changed to be full Unicode specifically to permit that sort of thing. Personally, I would continue to use math.pi because it's easier to type *on my keyboard* than something involving letters I have to compose, but it may well be different for someone who already shifts keyboard from Latin to Greek regularly. Regardless, the stdlib does, as you say, avoid non-ASCII. 
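[Editor's note: as a minimal sketch of the PEP 3131 behaviour discussed above, Greek-letter identifiers are ordinary names in any Python 3 interpreter, and identifiers are NFKC-normalized before use. The variable names below are illustrative only.]

```python
import math
import unicodedata

# PEP 3131: Python 3 identifiers may contain non-ASCII letters.
π = math.pi
τ = 2 * π

print(π)  # 3.141592653589793

# Identifiers are NFKC-normalized, which is why some lookalike
# characters (e.g. U+212F SCRIPT SMALL E) collapse into the same
# name as their ASCII counterpart:
print(unicodedata.normalize("NFKC", "\u212f"))  # e
```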
ChrisA From mertz at gnosis.cx Thu Jun 1 12:58:57 2017 From: mertz at gnosis.cx (David Mertz) Date: Thu, 1 Jun 2017 09:58:57 -0700 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: I think having math.pi is just for backward compatibility. We all use tau now, I assume. That's why the true definition is: pi = math.tau/2 :-) ... and yes, as someone else comments, adding tau was a bit whimsical in the spirit of 'import antigravity' and the line. On Jun 1, 2017 12:48 AM, "Serhiy Storchaka" wrote: 01.06.17 10:03, David Mertz пише: It's awfully easy to add in your own code. Since they are simply aliases, I > don't see why bother put the duplication in the standard library. You can > even monkey patch if you want it in the 'math' namespace. I'd prefer a bare > 'π' in my own code though. > As well as adding tau = 2*math.pi in your own code. But this deserved the whole PEP and was added as a feature in 3.6. _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From k7hoven at gmail.com Thu Jun 1 13:16:10 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Thu, 1 Jun 2017 20:16:10 +0300 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: On Thu, Jun 1, 2017 at 9:47 AM, Serhiy Storchaka wrote: > What you are think about adding Unicode aliases for some mathematic names in > the math module? ;-) > > math.π = math.pi > math.τ = math.tau > math.Γ = math.gamma > math.ℯ = math.e > > Unfortunately we can't use ?, ? and ? as identifiers. :-( > If this were to happen, I would only add π and forget about the others. It is the only one that is nearly 100 percent unambiguous. And seeing that in code or listed in math or builtins might have a nice wow factor to some. If π
were in builtins, it might actually be useful as being more readable and faster to type than math.pi or np.pi. As math.π, I'm not sure it's worth it, although less harmful than math.tau. In IPython/Jupyter, you can type \pi + tab, and you'll get π. This even works on command line! -- Koos (mobile) From Nikolaus at rath.org Thu Jun 1 14:11:24 2017 From: Nikolaus at rath.org (Nikolaus Rath) Date: Thu, 01 Jun 2017 11:11:24 -0700 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: (Victor Stinner's message of "Thu, 1 Jun 2017 11:49:43 +0200") References: Message-ID: <87efv3tw6r.fsf@thinkpad.rath.org> On Jun 01 2017, Victor Stinner wrote: > 2017-06-01 8:47 GMT+02:00 Serhiy Storchaka : >> What you are think about adding Unicode aliases for some mathematic names in >> the math module? ;-) >> >> math.π = math.pi > > How do you write π (pi) with a keyboard on Windows, Linux or macOS? Under Linux, you'd use the Compose facility. Take a look at eg. /usr/share/X11/locale/en_US.UTF-8/Compose for all the nice things it lets you enter: $ egrep '[Γπτ]' /usr/share/X11/locale/en_US.UTF-8/Compose : "Γ" U0393 # GREEK CAPITAL LETTER GAMMA

: "π" U03C0 # GREEK SMALL LETTER PI : "τ" U03C4 # GREEK SMALL LETTER TAU Best, -Nikolaus -- GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F “Time flies like an arrow, fruit flies like a Banana.” From ericsnowcurrently at gmail.com Thu Jun 1 14:37:16 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 1 Jun 2017 12:37:16 -0600 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: Message-ID: On Thu, Jun 1, 2017 at 10:30 AM, Victor Stinner wrote: > Hi, > > Perl 5.26 succeeded to remove the current working directory from the > default include path (our Python sys.path): > > https://metacpan.org/pod/release/XSAWYERX/perl-5.26.0/pod/perldelta.pod#Removal-of-the-current-directory-(%22.%22)-from- at INC > > Would it technically possible to make this change in Python? Or would > it destroy the world? Sorry, it's a naive question (but honestly, I > don't know the answer.) > > My main use case for "." in sys.path is to be to run an application > without installing it: run ./hachoir-metadata which loads the Python > "hachoir" module from the script directory. Sometimes, I run > explicitly "PYTHONPATH=$PWD ./hachoir-metadata". > > But I don't think that running an application from the source without > installing it is the most common way to run an application. Most users > install applications to use them, no? > > Enabling the isolated mode already prevents "." to be added to > sys.path: -I command line option. > https://docs.python.org/dev/using/cmdline.html#cmdoption-I > > There is also an old idea of a "restricted" system Python which would > use a "fixed" sys.path. FYI, PEP 432 (Restructuring the CPython startup sequence) [1] facilitates efforts along these lines. It even proposes adding a system-python binary to the cpython build. More specific to the matter of sys.path[0], there's a long-standing issue (#13475) [2] that discusses the topic and some solutions.
Note that progress on the issue was blocked by desire to clean up interpreter startup (i.e. PEP 432) first, which has now been accomplished as of the recent PyCon sprints. -eric [1] https://www.python.org/dev/peps/pep-0432/ [2] http://bugs.python.org/issue13475 From tjreedy at udel.edu Thu Jun 1 15:10:48 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 1 Jun 2017 15:10:48 -0400 Subject: [Python-ideas] Allow function to return multiple values In-Reply-To: References: Message-ID: On 6/1/2017 10:17 AM, joannah nanjekye wrote: > Today I was writing an example snippet for the book and needed to write > a function that returns two values something like this: > > def return_multiplevalues(num1, num2): > return num1, num2 > > I noticed that this actually returns a tuple of the values which I did > not want in the first place.I wanted python to return two values in > their own types so I can work with them as they are but here I was stuck > with working around a tuple. Others have pointed out that you are not stuck at all. Returning a tuple that can be unpacked is Python's concrete implementation of the abstract concept 'return multiple values'. Note that Python gives one a choice whether to keep the values bundled or to immediately unbundle them. -- Terry Jan Reedy From ma3yuki.8mamo10 at gmail.com Thu Jun 1 15:27:08 2017 From: ma3yuki.8mamo10 at gmail.com (Masayuki YAMAMOTO) Date: Fri, 2 Jun 2017 04:27:08 +0900 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: Hi Stephan, > Nevertheless, I would like to point out that the encoding assumed for a > Python3 source file never depends on the locale. Yeah, as you pointed out. I'd like to correct what I said. > My understanding is that in the default encoding for Python source files > (utf-8), East Asian Ambiguous characters must be assumed narrow. Now there > are also legacy encodings where they are fullwidth.
But it is always > determined by the encoding, which in turn is specified or implied in the > source file. > The ambiguous-width mapping was designed for East Asian and non-East-Asian legacy encodings; it is not recommended for UTF-8 and other Unicode encodings. By default, ambiguous-width characters are displayed narrow; this is unrelated to the encoding. [*] Let me see... Several programs have a setting that switches ambiguous-width characters to halfwidth or fullwidth regardless of encoding (e.g. gnome-terminal, vim). And some fonts used in East Asia draw Greek letters and other signs as fullwidth glyphs, which breaks the layout under halfwidth settings. It is possible to avoid these fonts and use a multi-language font instead, yet signs that are only used in East Asia have no halfwidth glyph at all, whatever the ambiguous-width setting. Therefore, when using an East Asian language, it is difficult to display Greek letters as halfwidth. Regards, Masayuki [*] http://unicode.org/reports/tr11/#Recommendations -------------- next part -------------- An HTML attachment was scrubbed... URL: From rymg19 at gmail.com Thu Jun 1 16:05:34 2017 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Thu, 1 Jun 2017 15:05:34 -0500 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: I'm slightly confused as to what you mean, but here goes: So you're saying that: - Glyphs like pi have an ambiguous width. - Most text editors/terminals let you choose between halfwidth (roughly normal monospace width?) and fullwidth (double the size). - However, many East Asian fonts do NOT have halfwidth support. Is this correct? -- Ryan (????) Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else http://refi64.com On Jun 1, 2017 2:27 PM, "Masayuki YAMAMOTO" wrote: Hi Stephan, Nevertheless, I would like to point out that the encoding assumed for a > Python3 source file never depends on the locale. > Yeah, as you pointed out.
I'd like to correct my said. > My understanding is that in the default encoding for Python source files > (utf-8), East Asian Ambiguous characters must be assumed narrow. Now there > are also legacy encodings where they are fullwidth. But it is always > determined by the encoding, which in turn is specified or implied in the > source file. > The mapping for ambiguous width assumes on East Asia legacy encodings and non East Asia legacy encodings, but not recommend to UTF-8 and other Unicode encodings. Displaying ambiguous width characters behave narrow by default, it isn't related to encoding. [*] Let me see... Several softwares have a setting that changes ambiguous width to halfwidth or fullwidth regardless for encoding (e.g. gnome-terminal, vim). And some fonts that are used in East Asia make glyph that is Greek letters and other signs to adjust to fullwidth, they break layout under halfwidth settings. It is possible that avoids these fonts, and uses multi language support font, yet signs that are only used in East Asia don't have halfwidth glyph no matter the ambiguous width. Therefore, in case of using East Asia language, it is difficult that set displaying Greek letters as halfwidth. Regards, Masayuki [*] http://unicode.org/reports/tr11/#Recommendations _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From gvanrossum at gmail.com Thu Jun 1 17:37:28 2017 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu, 1 Jun 2017 14:37:28 -0700 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: I think it's silliness and I don't even know how to type those characters. 
On May 31, 2017 11:49 PM, "Serhiy Storchaka" wrote: > What you are think about adding Unicode aliases for some mathematic names > in the math module? ;-) > > math.π = math.pi > math.τ = math.tau > math.Γ = math.gamma > math.ℯ = math.e > > Unfortunately we can't use ?, ? and ? as identifiers. :-( > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eryksun at gmail.com Thu Jun 1 18:32:40 2017 From: eryksun at gmail.com (eryk sun) Date: Thu, 1 Jun 2017 22:32:40 +0000 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: Message-ID: On Thu, Jun 1, 2017 at 4:46 PM, Chris Angelico wrote: > (AIUI, the *current directory* is never on Python's path, but the > *script directory* is. They're the same thing a lot of the time.) sys.path includes the current directory (i.e. an empty string) when there's no script, which includes the REPL, -c, and -m. It's removed by [-I]solated mode, which also removes the script directory. From victor.stinner at gmail.com Thu Jun 1 18:58:54 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 2 Jun 2017 00:58:54 +0200 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: Message-ID: > (AIUI, the *current directory* is never on Python's path, but the *script directory* is. They're the same thing a lot of the time.) Oh, it's very common that I run a script from its directory, so yeah script
URL: From rosuav at gmail.com Thu Jun 1 19:22:16 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 2 Jun 2017 09:22:16 +1000 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: Message-ID: On Fri, Jun 2, 2017 at 8:58 AM, Victor Stinner wrote: >> (AIUI, the *current directory* is never on Python's path, but the > *script directory* is. They're the same thing a lot of the time.) > > Oh, it's very common that I run a script from its directory, so yeah script > directory = current directory on such case. Sorry for the confusion. You are > right, it's the script directory that it added to sys.path and I would like > to know if it would be possible to change that? Yeah. The rest of my post assumed you meant script directory and, on that basis, wholeheartedly agrees with you. Ultimately, what I would like is for "import random" to be absolutely dependably going to grab the stdlib "random" module, or at very least, something that someone *deliberately* is shadowing that module with. You shouldn't be able to accidentally shadow a stdlib module. ChrisA From steve at pearwood.info Thu Jun 1 20:11:28 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 2 Jun 2017 10:11:28 +1000 Subject: [Python-ideas] Allow function to return multiple values In-Reply-To: References: Message-ID: <20170602001126.GQ23443@ando.pearwood.info> Hi Joannah, and welcome! On Thu, Jun 01, 2017 at 05:17:49PM +0300, joannah nanjekye wrote: [...] > My proposal is we provide a way of functions returning multiple values. > This has been implemented in languages like Go and I have found many cases > where I needed and used such a functionality. I wish for this convenience > in python so that I don't have to suffer going around a tuple. Can you explain how Go's multiple return values differ from Python's? Looking at the example here: https://golang.org/doc/effective_go.html it looks like there's very little difference. 
In Go, you have to declare the return type(s) of the function, which Python doesn't require, but the caller treats the function the same. The Go example is:

func nextInt(b []byte, i int) (int, int) {
    for ; i < len(b) && !isDigit(b[i]); i++ {
    }
    x := 0
    for ; i < len(b) && isDigit(b[i]); i++ {
        x = x*10 + int(b[i]) - '0'
    }
    return x, i
}

which the caller uses like this:

x, i = nextInt(b, i)

In Python, we would write exactly the same thing: inside the nextInt function, we'd return multiple values: return x, i and the caller would accept them like this: x, i = nextInt(b, i) So the syntax is even the same. How is Go different from Python? -- Steve From breamoreboy at yahoo.co.uk Thu Jun 1 20:37:33 2017 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Fri, 2 Jun 2017 01:37:33 +0100 Subject: [Python-ideas] Allow function to return multiple values In-Reply-To: <20170602001126.GQ23443@ando.pearwood.info> References: <20170602001126.GQ23443@ando.pearwood.info> Message-ID: On 02/06/2017 01:11, Steven D'Aprano wrote: > Hi Joannah, and welcome! > > On Thu, Jun 01, 2017 at 05:17:49PM +0300, joannah nanjekye wrote: > > [...] >> My proposal is we provide a way of functions returning multiple values. >> This has been implemented in languages like Go and I have found many cases >> where I needed and used such a functionality. I wish for this convenience >> in python so that I don't have to suffer going around a tuple. > > Can you explain how Go's multiple return values differ from Python's? > Looking at the example here: > > https://golang.org/doc/effective_go.html > > it looks like there's very little difference. In Go, you have to declare > the return type(s) of the function, which Python doesn't require, but > the caller treats the function the same. > I suggest you look up the Go reflect package and the use of interface{}, meaning that you can pass anything you like both into and out of a function.
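The comparison above is easy to check directly: the Go nextInt transcribes into Python almost token for token. A sketch of such a transcription follows (the is_digit helper is an assumption, standing in for the isDigit function the Go snippet relies on):

```python
def is_digit(byte):
    """Stand-in for the isDigit helper the Go example assumes (ASCII '0'..'9')."""
    return 0x30 <= byte <= 0x39

def next_int(b, i):
    """Scan bytes b from index i; return the next integer found and the
    index just past it -- the same (int, int) pair the Go version returns."""
    while i < len(b) and not is_digit(b[i]):
        i += 1
    x = 0
    while i < len(b) and is_digit(b[i]):
        x = x * 10 + b[i] - 0x30  # indexing bytes yields ints in Python 3
        i += 1
    return x, i  # a single tuple, built here and unpacked by the caller

x, i = next_int(b"ab12cd", 0)
print(x, i)  # 12 4
```

The "multiple values" on both sides are just a tuple being returned and unpacked; there is no separate mechanism for Python to add.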
There lies The Road To Hell as far as I'm concerned, I'll stick with duck typing thank you. -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence From steve at pearwood.info Thu Jun 1 21:05:59 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 2 Jun 2017 11:05:59 +1000 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: Message-ID: <20170602010559.GS23443@ando.pearwood.info> On Fri, Jun 02, 2017 at 09:22:16AM +1000, Chris Angelico wrote: > Ultimately, what I would like is for "import random" to be absolutely > dependably going to grab the stdlib "random" module, or at very least, > something that someone *deliberately* is shadowing that module with. > You shouldn't be able to accidentally shadow a stdlib module. If that's the only problem you want to solve, then I would expect that moving the script/current directory to the *end* of sys.path instead of the start will accomplish that, without breaking any scripts that rely on '' to be in the path. I expect that moving '' to the end of sys.path will be a less disruptive change than removing it. -- Steve From rosuav at gmail.com Thu Jun 1 22:36:59 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 2 Jun 2017 12:36:59 +1000 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: <20170602010559.GS23443@ando.pearwood.info> References: <20170602010559.GS23443@ando.pearwood.info> Message-ID: On Fri, Jun 2, 2017 at 11:05 AM, Steven D'Aprano wrote: > On Fri, Jun 02, 2017 at 09:22:16AM +1000, Chris Angelico wrote: > >> Ultimately, what I would like is for "import random" to be absolutely >> dependably going to grab the stdlib "random" module, or at very least, >> something that someone *deliberately* is shadowing that module with. >> You shouldn't be able to accidentally shadow a stdlib module. 
> > If that's the only problem you want to solve, then I would expect that > moving the script/current directory to the *end* of sys.path instead of > the start will accomplish that, without breaking any scripts that rely > on '' to be in the path. > > I expect that moving '' to the end of sys.path will be a less disruptive > change than removing it. This is true. However, anything that depends on the current behaviour (intentionally or otherwise) would be just as broken as if it were removed, and it's still possible for "import foo" to mean a local import today and a stdlib import tomorrow. It'd just move the ambiguity around; instead of not being sure if it's been accidentally shadowed, instead you have to wonder if your code will break when a future Python version adds a new stdlib module. Or your code could break because someone pip-installs a third-party module. You can take your pick which thing shadows which by choosing where you place '' in sys.path, but there'll always be a risk one way or another. Different packages don't conflict with each other because intra-package imports are explicit. They're safe. ChrisA From greg.ewing at canterbury.ac.nz Fri Jun 2 01:52:44 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 02 Jun 2017 17:52:44 +1200 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: <5930FD2C.1070904@canterbury.ac.nz> Victor Stinner wrote: > How do you write π (pi) with a keyboard on Windows, Linux or macOS? On a Mac, π is Option-p and ∑ is Option-w. -- Greg From ma3yuki.8mamo10 at gmail.com Fri Jun 2 03:09:57 2017 From: ma3yuki.8mamo10 at gmail.com (Masayuki YAMAMOTO) Date: Fri, 2 Jun 2017 16:09:57 +0900 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: Yes, it's correct. I'd show you a link to vim help for ambiguous width setting.
http://vimdoc.sourceforge.net/htmldoc/options.html#'ambiwidth' Masayuki 2017-06-02 5:05 GMT+09:00 Ryan Gonzalez : > I'm slightly confused as to what you mean, but here goes: > > So you're saying that: > > - Glyphs like pi have an ambiguous width. > - Most text editors/terminals let you choose between halfwidth (roughly > normal monospace width?) and fullwidth (double the size). > - However, many East Asian fonts do NOT have halfwidth support. > > Is this correct? > > -- > Ryan (????) > Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else > http://refi64.com > > On Jun 1, 2017 2:27 PM, "Masayuki YAMAMOTO" > wrote: > > Hi Stephan, > > > Nevertheless, I would like to point out that the encoding assumed for a >> Python3 source file never depends on the locale. >> > Yeah, as you pointed out. I'd like to correct my said. > > >> My understanding is that in the default encoding for Python source files >> (utf-8), East Asian Ambiguous characters must be assumed narrow. Now there >> are also legacy encodings where they are fullwidth. But it is always >> determined by the encoding, which in turn is specified or implied in the >> source file. >> > The mapping for ambiguous width assumes on East Asia legacy encodings and > non East Asia legacy encodings, but not recommend to UTF-8 and other > Unicode encodings. Displaying ambiguous width characters behave narrow by > default, it isn't related to encoding. [*] > > Let me see... Several softwares have a setting that changes ambiguous > width to halfwidth or fullwidth regardless for encoding (e.g. > gnome-terminal, vim). And some fonts that are used in East Asia make glyph > that is Greek letters and other signs to adjust to fullwidth, they break > layout under halfwidth settings. It is possible that avoids these fonts, > and uses multi language support font, yet signs that are only used in East > Asia don't have halfwidth glyph no matter the ambiguous width. 
Therefore, > in case of using East Asia language, it is difficult that set displaying > Greek letters as halfwidth. > > Regards, > Masayuki > > [*] http://unicode.org/reports/tr11/#Recommendations > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Fri Jun 2 03:12:29 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 02 Jun 2017 19:12:29 +1200 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: Message-ID: <59310FDD.3010702@canterbury.ac.nz> Victor Stinner wrote: > You are right, it's the script directory that it added to > sys.path and I would like to know if it would be possible to change that? Why do you want to change it? -- Greg From erik.m.bray at gmail.com Fri Jun 2 03:46:56 2017 From: erik.m.bray at gmail.com (Erik Bray) Date: Fri, 2 Jun 2017 09:46:56 +0200 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: <5930FD2C.1070904@canterbury.ac.nz> References: <5930FD2C.1070904@canterbury.ac.nz> Message-ID: On Fri, Jun 2, 2017 at 7:52 AM, Greg Ewing wrote: > Victor Stinner wrote: >> >> How do you write ? (pi) with a keyboard on Windows, Linux or macOS? > > > On a Mac, ? is Option-p and ? is Option-w. I don't have a strong opinion about it being in the stdlib, but I'd also point out that a strong advantage to having these defined in a module at all is that third-party interpreters (e.g. IPython, bpython, some IDEs) that support tab-completion make these easy to type as well, and I find them to be very readable for math-heavy code. From niki.spahiev at gmail.com Fri Jun 2 03:52:28 2017 From: niki.spahiev at gmail.com (Niki Spahiev) Date: Fri, 2 Jun 2017 10:52:28 +0300 Subject: [Python-ideas] Security: remove "." 
from sys.path? In-Reply-To: References: Message-ID: On 1.06.2017 19:46, Chris Angelico wrote: > On Fri, Jun 2, 2017 at 2:30 AM, Victor Stinner wrote: >> Perl 5.26 succeeded to remove the current working directory from the >> default include path (our Python sys.path): >> >> https://metacpan.org/pod/release/XSAWYERX/perl-5.26.0/pod/perldelta.pod#Removal-of-the-current-directory-(%22.%22)-from- at INC >> >> Would it technically possible to make this change in Python? Or would >> it destroy the world? Sorry, it's a naive question (but honestly, I >> don't know the answer.) > > (AIUI, the *current directory* is never on Python's path, but the > *script directory* is. They're the same thing a lot of the time.) > > All it'd take is one tiny change to Python, and then one tiny change > to any multi-file non-package Python app. > > 1) Make the script directory implicitly function as a package. In > effect, assume that there is an empty __init__.py in the same > directory as the thing you just ran. > > 2) Any time a Python app wants to import from its own directory, it > needs to "from . import blah" instead of simply "import blah". > > Then the removal you suggest could be done, without any loss of > functionality. The change could alternatively be done as an import > hack rather than an actual fake package if that's easier, such that > "from . import blah" means either "import from the script directory" > or "import from the current package" as appropriate. Hack for testing this idea:

#!/bin/env python3
__path__ = '.'
import sys; sys.modules[''] = sys.modules['__main__']
# rest of the script

+1 Niki From victor.stinner at gmail.com Fri Jun 2 06:02:28 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 2 Jun 2017 12:02:28 +0200 Subject: [Python-ideas] Security: remove "." from sys.path?
In-Reply-To: <59310FDD.3010702@canterbury.ac.nz> References: <59310FDD.3010702@canterbury.ac.nz> Message-ID: 2017-06-02 9:12 GMT+02:00 Greg Ewing : > Why do you want to change it? To make Python more secure. To prevent untrusted modules hijacking stdlib modules on purpose to inject code for example. Victor From g.rodola at gmail.com Fri Jun 2 06:17:26 2017 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Fri, 2 Jun 2017 12:17:26 +0200 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: On Thu, Jun 1, 2017 at 8:47 AM, Serhiy Storchaka wrote: > What you are think about adding Unicode aliases for some mathematic names > in the math module? ;-) > > math.? = math.pi > math.? = math.tau > math.? = math.gamma > math.? = math.e > > Unfortunately we can't use ?, ? and ? as identifiers. :-( > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > * it duplicates functionality * I have no idea how to write those chars on Linux; if I did, I'm not sure it'd be the same on OSX and Windows (probably not) * duplicated aliases might make sense if they add readability; in this case they don't unless (maybe) you have a mathematical background. I can infer what "math.gamma" stands for but not being a mathematician math.? makes absolutely zero sense to me. * if you really want to do that you can simply do "from math import gamma as ?" 
but it's something I wouldn't like if I were to read your code * I generally dislike any non-ASCII API; the fact that Python 3 allows you to do that should not be an incentive to promote such habit in the stdlib or anywhere else except in the end-user code, and it's something I still wouldn't like it except if in comments or docstrings -1 -- Giampaolo - http://grodola.blogspot.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Jun 2 12:48:16 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 3 Jun 2017 02:48:16 +1000 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: <59310FDD.3010702@canterbury.ac.nz> Message-ID: On 2 June 2017 at 20:02, Victor Stinner wrote: > 2017-06-02 9:12 GMT+02:00 Greg Ewing : >> Why do you want to change it? > > To make Python more secure. To prevent untrusted modules hijacking > stdlib modules on purpose to inject code for example. As long as user site packages are enabled, folks are pretty much hosed on that front (drop a *.pth file in there and you can run arbitrary code at startup). Hence isolated mode and the system-python idea (which can potentially be implemented even while PEP 432 is still a private API, although it would require several more config settings to be migrated to the new structure first). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From levkivskyi at gmail.com Fri Jun 2 18:48:35 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Sat, 3 Jun 2017 00:48:35 +0200 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: On 2 June 2017 at 12:17, Giampaolo Rodola' wrote: > On Thu, Jun 1, 2017 at 8:47 AM, Serhiy Storchaka > wrote: > >> What you are think about adding Unicode aliases for some mathematic names >> in the math module? ;-) >> >> math.? = math.pi >> math.? = math.tau >> math.? = math.gamma >> math.? 
= math.ℯ = math.e >> >> Unfortunately we can't use ∑, √ and ∫ as identifiers. :-( >> >> [...] > * duplicated aliases might make sense if they add readability; in this > case they don't unless (maybe) you have a mathematical background. I can > infer what "math.gamma" stands for but not being a mathematician math.γ > makes absolutely zero sense to me. > > There is a significant number of scientific Python programmers (21% according to PyCharm 2016), so it is not that rare to meet someone who knows what is Gamma function. And for many of them π is much more readable than np.pi. Also there is another problem, confusion between Gamma function and Euler–Mascheroni constant, the first one is Γ, the second one is γ (perfectly opposite to PEP 8 capitalization rules :-), while both of them are frequently denoted as just gamma (in particular math.gamma follows the PEP8 rules, but is counter-intuitive for most scientist). All that said, I agree that these problems are easily solved by a custom import from. Still there is something in (or related to?) this proposal I think is worth considering: Can we also allow identifiers like ∑ or ∫. This will make many expressions more similar to usual TeX, plus it will be useful for projects like SymPy. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Jun 2 18:55:01 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 2 Jun 2017 15:55:01 -0700 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: I would love to show how easy it is to write from math import pi as π, gamma as γ but I had to cheat by copying from the OP since I don't know how to type these (and even if you were to tell me how I'd forget tomorrow). So, I am still in favor of the rule "only ASCII in the stdlib".
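As David Mertz pointed out earlier in the thread, the aliases really are one import away in user code — π, τ and γ are already legal Python 3 identifiers, so nothing in the stdlib needs to change:

```python
import math
from math import pi as π, tau as τ, gamma as γ

area = π * 2 ** 2           # the alias is an ordinary name
print(area)                 # 12.566370614359172
print(γ(5))                 # the Gamma function: Γ(5) == 4! == 24.0

# Monkey-patching the module namespace, as suggested in the thread,
# is equally trivial:
math.π = math.pi
print(math.π == math.pi)    # True
```

math.tau only exists since Python 3.6, so the τ import assumes a reasonably recent interpreter.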
On Fri, Jun 2, 2017 at 3:48 PM, Ivan Levkivskyi wrote: > On 2 June 2017 at 12:17, Giampaolo Rodola' wrote: > >> On Thu, Jun 1, 2017 at 8:47 AM, Serhiy Storchaka >> wrote: >> >>> What you are think about adding Unicode aliases for some mathematic >>> names in the math module? ;-) >>> >>> math.? = math.pi >>> math.? = math.tau >>> math.? = math.gamma >>> math.? = math.e >>> >>> Unfortunately we can't use ?, ? and ? as identifiers. :-( >>> >>> [...] >> * duplicated aliases might make sense if they add readability; in this >> case they don't unless (maybe) you have a mathematical background. I can >> infer what "math.gamma" stands for but not being a mathematician math.? >> makes absolutely zero sense to me. >> >> > There is a significant number of scientific Python programmers (21% > according to PyCharm 2016), so it is not that rare to meet someone who > knows what is Gamma function. > And for many of them ? is much more readable than np.pi. Also there is > another problem, confusion between Gamma function and Euler?Mascheroni > constant, the first one is ?, > the second one is ? (perfectly opposite to PEP 8 capitalization rules :-), > while both of them are frequently denoted as just gamma (in particular > math.gamma follows the PEP8 rules, > but is counter-intuitive for most scientist). > > All that said, I agree that these problems are easily solved by a custom > import from. Still there is something in (or related to?) this proposal > I think is worth considering: Can we also allow identifiers like ? or ?. > This will make many expressions more similar to usual TeX, > plus it will be useful for projects like SymPy. 
> > -- > Ivan > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From levkivskyi at gmail.com Fri Jun 2 19:02:12 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Sat, 3 Jun 2017 01:02:12 +0200 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: On 3 June 2017 at 00:55, Guido van Rossum wrote: > [...] > So, I am still in favor of the rule "only ASCII in the stdlib". > But what about the other question? Currently, integral, sum, infinity, square root etc. Unicode symbols are all prohibited in identifiers. Is it possible to allow them? (Btw IPython just supports normal TeX notations like \pi, \lambda etc, so it is very easy to remember) -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Jun 2 19:29:16 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 2 Jun 2017 16:29:16 -0700 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: Are those characters not considered Unicode letters? Maybe we could add their category to the allowed set? On Jun 2, 2017 4:02 PM, "Ivan Levkivskyi" wrote: > On 3 June 2017 at 00:55, Guido van Rossum wrote: > >> [...] >> So, I am still in favor of the rule "only ASCII in the stdlib". >> > > But what about the other question? Currently, integral, sum, infinity, > square root etc. Unicode symbols are all prohibited in identifiers. > Is it possible to allow them? > > (Btw IPython just supports normal TeX notations like \pi, \lambda etc, so > it is very easy to remember) > > -- > Ivan > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bussonniermatthias at gmail.com Fri Jun 2 19:44:41 2017 From: bussonniermatthias at gmail.com (Matthias Bussonnier) Date: Fri, 2 Jun 2017 16:44:41 -0700 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: > (Btw IPython just supports normal TeX notations like \pi, \lambda etc, so it is very easy to remember) IPython dev here. I am the one who implemented (most of) that. We do support it, but it's not easy to remember unless you know how to write latex, and know said character. Question, how would you type the following: In [3]: ℸ = 1 Hint it's easy it's \CHARACTERNAME, but if you don't know how ℸ is named[1], you are screwed[3]. It's cute, it's compact, it's extremely useful for some internal code, but _exporting_ this as an interface is IMHO an extremely bad idea that hinders readability[2] and usability of the code. On Fri, Jun 2, 2017 at 4:29 PM, Guido van Rossum wrote: > Are those characters not considered Unicode letters? Maybe we could add > their category to the allowed set? +1 on allowing more of math symbols and be more flexible on allowed identifiers though. In particular the one mentioned above are part of mathematical operators[4]. It also would be great for some of these to be parsed as infix operators, but that's another topic :-) -- M [1] \daleth [2] and that's assuming your font support said character. [3] Tab completion on full unicode character name does work as well so \GREEK SMALL LETTER GAMMA will give you γ. And \γ will expand to \gamma, so you can figure it out, but users still struggle for unknown symbols [4] http://www.fileformat.info/info/unicode/block/mathematical_operators/images.htm On Fri, Jun 2, 2017 at 4:02 PM, Ivan Levkivskyi wrote: > On 3 June 2017 at 00:55, Guido van Rossum wrote: >> >> [...] >> So, I am still in favor of the rule "only ASCII in the stdlib". > > But what about the other question? Currently, integral, sum, infinity, > square root etc.
Unicode symbols are all prohibited in identifiers. > Is it possible to allow them? > > (Btw IPython just supports normal TeX notations like \pi, \lambda etc, so it > is very easy to remember) > > -- > Ivan > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From steve at pearwood.info Fri Jun 2 19:45:05 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 3 Jun 2017 09:45:05 +1000 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: <20170602234458.GA17170@ando.pearwood.info> On Sat, Jun 03, 2017 at 01:02:12AM +0200, Ivan Levkivskyi wrote: > On 3 June 2017 at 00:55, Guido van Rossum wrote: > > > [...] > > So, I am still in favor of the rule "only ASCII in the stdlib". > > > > But what about the other question? Currently, integral, sum, infinity, > square root etc. Unicode symbols are all prohibited in identifiers. > Is it possible to allow them? In the last few months, I've been making a lot of use of the TI Nspire CAS calculator, and I think that there is very little benefit to allowing symbols like ∑ √ ∫ (sum, radical/root, integral) unless you have a proper 2-dimensional template system. There's not much, if any, benefit to writing: ∫(expression, lower_limit, upper_limit, name) In fact, that's probably *harder* to read than integrate(expression, lower_limit, upper_limit, name) because the important thing, the fact that this is an integral, is barely visible. It's only a single character. That's not how mathematicians write it! If we had a 2D template system, like the Nspire, we could write what mathematicians do: (best viewed with a non-proportional font)

      b
      ⌠
      ⎮   3      2
      ⎮  x  + 2⋅x  + 1
      ⎮  ───────────── dx
      ⎮        x
      ⌡
      a

I say "best", but of course even with a monospaced font, it still looks pretty awful.
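The identifier split Ivan and Steven are discussing here is visible from Python itself: Greek letters are already legal identifiers, while the math-symbol characters are rejected, and str.isidentifier() reports the difference directly:

```python
import unicodedata

# Greek letters are category Ll (lowercase letter) and valid in identifiers;
# the sum, root, integral and infinity symbols are category Sm and are not.
for ch in "πτγ∑√∫∞":
    print(ch, unicodedata.category(ch), ch.isidentifier())
```

This prints Ll/True for the three letters and Sm/False for the four math symbols.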
You really need a proper GUI interface and support for resizing characters. I'm not suggesting this be part of Python the language! But It might be a nice application written for users of Python, perhaps part of Sage or IPython/Jupiter or a GUI interface to Sympy. You don't need ? to be legal in identifies for that. -- Steve From levkivskyi at gmail.com Fri Jun 2 19:56:49 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Sat, 3 Jun 2017 01:56:49 +0200 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: On 3 June 2017 at 01:29, Guido van Rossum wrote: > Are those characters not considered Unicode letters? Maybe we could add > their category to the allowed set? > > Yes, they are not considered letters, they are in category Sm. Unfortunately, +, -, |, and other symbol that clearly should not be in identifiers are also in this category, so we cannot add the whole category. It is possible to include particular ranges, but there should be a discussion about what exactly can/should be included. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Fri Jun 2 20:10:26 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 3 Jun 2017 10:10:26 +1000 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: <20170603001026.GB17170@ando.pearwood.info> On Fri, Jun 02, 2017 at 04:29:16PM -0700, Guido van Rossum wrote: > Are those characters not considered Unicode letters? Maybe we could add > their category to the allowed set? They're not letters: py> {unicodedata.category(c) for c in '????'} {'Sm'} That's Symbol, Math. One problem is that the 'Sm' category includes a whole lot of mathematical symbols that we probably don't want in identifiers: ? ? ? ? ? ? ? ? (plus MANY more variations on = < and > operators) including some "Confusables": ? ? ? ? ? etc C ? v * ? 
http://www.unicode.org/reports/tr39/ Of course a language can define identifiers however it likes, but I think it is relevant that the Unicode Consortium's default algorithm for determining an identifier excludes Sm. http://www.unicode.org/reports/tr31/ I also disagree with Ivan that these symbols would be particularly useful in general, even for maths-heavy code, although I wouldn't say no to special casing ∞ (infinity) and maybe √ as a unary square root operator. -- Steve From guido at python.org Fri Jun 2 20:31:05 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 2 Jun 2017 17:31:05 -0700 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: <20170603001026.GB17170@ando.pearwood.info> References: <20170603001026.GB17170@ando.pearwood.info> Message-ID: OK, I think this discussion is pretty much dead then. We definitely shouldn't allow math operators in identifiers, otherwise in Python 4 or 5 we couldn't introduce them as operators. On Fri, Jun 2, 2017 at 5:10 PM, Steven D'Aprano wrote: > On Fri, Jun 02, 2017 at 04:29:16PM -0700, Guido van Rossum wrote: > > > Are those characters not considered Unicode letters? Maybe we could add > > their category to the allowed set? > > They're not letters: > > py> {unicodedata.category(c) for c in '????'} > {'Sm'} > > > That's Symbol, Math. > > One problem is that the 'Sm' category includes a whole lot of > mathematical symbols that we probably don't want in identifiers: > > ? ? ? ? ? ? ? ? (plus MANY more variations on = < and > operators) > > including some "Confusables": > > ? ? ? ? ? etc > > C ? v * ? > > http://www.unicode.org/reports/tr39/ > > Of course a language can define identifiers however it likes, but I > think it is relevant that the Unicode Consortium's default algorithm for > determining an identifier excludes Sm.
> > http://www.unicode.org/reports/tr31/ > > I also disagree with Ivan that these symbols would be particularly > useful in general, even for maths-heavy code, although I wouldn't say no > to special casing ∞ (infinity) and maybe √ as a unary square root > operator. > > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From apalala at gmail.com Fri Jun 2 20:38:38 2017 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Fri, 2 Jun 2017 20:38:38 -0400 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: On Fri, Jun 2, 2017 at 7:29 PM, Guido van Rossum wrote: Are those characters not considered Unicode letters? Maybe we could add > their category to the allowed set? Speaking of which, it would be convenient to be able to build strings with non-ascii characters using their Unicode codepoint name: greek_pi = "\u:greek_small_letter_pi" Or something like that. ? -- Juancarlo *Añez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Fri Jun 2 20:43:56 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 3 Jun 2017 10:43:56 +1000 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: On Sat, Jun 3, 2017 at 10:38 AM, Juancarlo Añez wrote: > Speaking of which, it would be convenient to be able to build strings with > non-ascii characters using their Unicode codepoint name: > > greek_pi = "\u:greek_small_letter_pi" > > Or something like that. You mean: >>> greek_pi = "\N{GREEK SMALL LETTER PI}" >>> greek_pi 'π'
Time machine strikes again :) ChrisA From tjreedy at udel.edu Fri Jun 2 21:13:47 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 2 Jun 2017 21:13:47 -0400 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: On 6/2/2017 7:56 PM, Ivan Levkivskyi wrote: > On 3 June 2017 at 01:29, Guido van Rossum > > wrote: > > Are those characters not considered Unicode letters? Maybe we could > add their category to the allowed set? > > > Yes, they are not considered letters, they are in category Sm. I presume that is Symbol - math. > Unfortunately, +, -, |, and other symbol that clearly should not be in > identifiers are also in this category, > so we cannot add the whole category. It is possible to include > particular ranges, Having to test ranges will slow down identifier recognition. > but there should be a discussion > about what exactly can/should be included. I believe the current python definition of 'identifier' is taken from the Unicode Standard for default identifiers. Any change would have to be propagated to regex engines, IDEs, and anything else that parses python. I suggest that you ask Martin Loewis for his opinion on changing the identifier definition. -- Terry Jan Reedy From greg.ewing at canterbury.ac.nz Fri Jun 2 21:22:24 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 03 Jun 2017 13:22:24 +1200 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: <20170602234458.GA17170@ando.pearwood.info> References: <20170602234458.GA17170@ando.pearwood.info> Message-ID: <59320F50.4030401@canterbury.ac.nz> Steven D'Aprano wrote: > There's not much, if any, benefit to writing: > > ?(expression, lower_limit, upper_limit, name) More generally, there's a kind of culture clash between mathematical notation and programming notation. 
Mathematical notation tends to almost exclusively use single-character names, relying on different fonts and alphabets, and superscripts and subscripts, to get a large enough set of identifiers. Whereas in programming we use a much smaller alphabet and longer names. Having terse symbols for just a few things, and having to spell everything else out longhand, doesn't really help.

-- 
Greg

From pavol.lisy at gmail.com Sat Jun 3 01:42:28 2017
From: pavol.lisy at gmail.com (Pavol Lisy)
Date: Sat, 3 Jun 2017 07:42:28 +0200
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: 
References: <20170603001026.GB17170@ando.pearwood.info>
Message-ID: 

Sorry for probably stupid question! Is something like ->

    class A:
        def __oper__(self, '?', other):
            return something(self.value, other)

    a = A()
    a ? 3

thinkable?

On 6/3/17, Guido van Rossum wrote:
> OK, I think this discussion is pretty much dead then. We definitely
> shouldn't allow math operators in identifiers, otherwise in Python 4 or 5
> we couldn't introduce them as operators.
>
> On Fri, Jun 2, 2017 at 5:10 PM, Steven D'Aprano wrote:
>
>> On Fri, Jun 02, 2017 at 04:29:16PM -0700, Guido van Rossum wrote:
>>
>> > Are those characters not considered Unicode letters? Maybe we could add
>> > their category to the allowed set?
>>
>> They're not letters:
>>
>> py> {unicodedata.category(c) for c in '????'}
>> {'Sm'}
>>
>> That's Symbol, Math.
>>
>> One problem is that the 'Sm' category includes a whole lot of
>> mathematical symbols that we probably don't want in identifiers:
>>
>> ? ? ? ? ? ? ? ? (plus MANY more variations on = < and > operators)
>>
>> including some "Confusables":
>>
>> ? ? ? ? ? etc
>>
>> C ? v * ?
>>
>> http://www.unicode.org/reports/tr39/
>>
>> Of course a language can define identifiers however it likes, but I
>> think it is relevant that the Unicode Consortium's default algorithm for
>> determining an identifier excludes Sm.
>>
>> http://www.unicode.org/reports/tr31/
>>
>> I also disagree with Ivan that these symbols would be particularly
>> useful in general, even for maths-heavy code, although I wouldn't say no
>> to special casing ∞ (infinity) and maybe √ as a unary square root
>> operator.
>>
>> --
>> Steve
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
> --
> --Guido van Rossum (python.org/~guido)
>

From rosuav at gmail.com Sat Jun 3 01:55:38 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 3 Jun 2017 15:55:38 +1000
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: 
References: <20170603001026.GB17170@ando.pearwood.info>
Message-ID: 

On Sat, Jun 3, 2017 at 3:42 PM, Pavol Lisy wrote:
> Sorry for probably stupid question! Is something like ->
>
> class A:
>     def __oper__(self, '?', other):
>         return something(self.value, other)
>
> a = A()
> a ? 3
>
> thinkable?

No, because operators need to be defined before you get to individual objects, and they need precedence and associativity. So it'd have to be defined at the compiler level.

Also, having arbitrary operators gets extremely confusing. It's not easy to reason about code when you don't know what's even an operator.

Not a stupid question, but one for which the answer is "definitely not like that".

ChrisA

From joshua.morton13 at gmail.com Sat Jun 3 01:59:06 2017
From: joshua.morton13 at gmail.com (Joshua Morton)
Date: Sat, 03 Jun 2017 05:59:06 +0000
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: 
References: <20170603001026.GB17170@ando.pearwood.info>
Message-ID: 

For reference, haskell is perhaps the closest language to providing arbitrary infix operators, and it requires that they be surrounded by backticks.
That is

    A `op` B

is equivalent to

    op(A, B)

That doesn't work for python (backtick is taken) and I don't think anything similar is a good idea.

On Sat, Jun 3, 2017 at 1:56 AM Chris Angelico wrote:
> On Sat, Jun 3, 2017 at 3:42 PM, Pavol Lisy wrote:
> > Sorry for probably stupid question! Is something like ->
> >
> > class A:
> >     def __oper__(self, '?', other):
> >         return something(self.value, other)
> >
> > a = A()
> > a ? 3
> >
> > thinkable?
>
> No, because operators need to be defined before you get to individual
> objects, and they need precedence and associativity. So it'd have to
> be defined at the compiler level.
>
> Also, having arbitrary operators gets extremely confusing. It's not
> easy to reason about code when you don't know what's even an operator.
>
> Not a stupid question, but one for which the answer is "definitely not
> like that".
>
> ChrisA
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pavol.lisy at gmail.com Sat Jun 3 02:26:28 2017
From: pavol.lisy at gmail.com (Pavol Lisy)
Date: Sat, 3 Jun 2017 08:26:28 +0200
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: 
References: 
Message-ID: 

On 6/1/17, Serhiy Storchaka wrote:
> What you are think about adding Unicode aliases for some mathematic
> names in the math module? ;-)
>
> math.? = math.pi
> math.? = math.tau
> math.? = math.gamma
> math.? = math.e
>
> Unfortunately we can't use ?, ? and ? as identifiers. :-(

My humble opinion: I would rather like to see something like:

    from some_wide_used_scientific_library.math_constants import *

with good acceptance from scientific users before thinking to add it into stdlib.

PS.
Maybe this could be interesting for some vim users -> http://gnosis.cx/bin/.vim/after/syntax/python.vim (I am not author of this file)

If you apply it in vim then vim show lines (where is not cursor) "translated". Means for example that math.pi is replaced by π.

So you still edit "math.pi" but if you move cursor outside of this line then you could see formula simplified/prettified. (or more complicated - because you need a little train your brain to accept new view)

I am pretty skeptic how popular this conceal technique could be in vim pythonistas community!

(
why skeptic?
For example I am testing to improve readability of line similar to

    self.a = self.b + self.c

using with this technique and see in vim

    ᐠa = ᐠb + ᐠc

but **I am not sure if it is really useful** (probably I have to add that I am editing code much more in other editor than vim)

ᐠ U+1420 CANADIAN SYLLABICS FINAL GRAVE
)

From stephanh42 at gmail.com Sat Jun 3 02:32:27 2017
From: stephanh42 at gmail.com (Stephan Houben)
Date: Sat, 3 Jun 2017 08:32:27 +0200
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: 
References: <20170603001026.GB17170@ando.pearwood.info>
Message-ID: 

Hi Joshua,

> A `op` B
>
> is equivalent to
>
> op(A, B)

This can of course be faked in Python.
https://gist.github.com/stephanh42/a4d6d66b10cfecf935c9531150afb247

Now you can do:

========
@BinopCallable
def add(x, y):
    return x + y

print(3 @add@ 5)
===========

Stephan

2017-06-03 7:59 GMT+02:00 Joshua Morton :
> For reference, haskell is perhaps the closest language to providing
> arbitrary infix operators, and it requires that they be surrounded by
> backticks.
Is something like -> >> > >> > class A: >> > def __oper__(self, '?', other): >> > return something(self.value, other) >> > >> > a = A() >> > a ? 3 >> > >> > thinkable? >> >> No, because operators need to be defined before you get to individual >> objects, and they need precedence and associativity. So it'd have to >> be defined at the compiler level. >> >> Also, having arbitrary operators gets extremely confusing. It's not >> easy to reason about code when you don't know what's even an operator. >> >> Not a stupid question, but one for which the answer is "definitely not >> like that". >> >> ChrisA >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From mertz at gnosis.cx Sat Jun 3 02:50:05 2017 From: mertz at gnosis.cx (David Mertz) Date: Fri, 2 Jun 2017 23:50:05 -0700 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: This is a horrible thing that nobody else should do :-), but I *am* the author of the file linked by Pavol. It's based on someone else's version (credited in the file), but I fine tuned it for what I want. I'm also not the author of the conceal plugin for vim, which is pretty much exactly for just this. The screenshot attached is a little bit of a vim session editing a Python file. The key thing is that I *type* only regular ASCII characters; I just tell vim to show my something different on lines I am not currently editing. On Fri, Jun 2, 2017 at 11:26 PM, Pavol Lisy wrote: > On 6/1/17, Serhiy Storchaka wrote: > > What you are think about adding Unicode aliases for some mathematic > > names in the math module? 
;-) > > > > math.? = math.pi > > math.? = math.tau > > math.? = math.gamma > > math.? = math.e > > > > Unfortunately we can't use ?, ? and ? as identifiers. :-( > > My humble opinion: I would rather like to see something like: > > from some_wide_used_scientific_library.math_constants import * > > with good acceptance from scientific users before thinking to add it > into stdlib. > > PS. > Maybe this could be interesting for some vim users -> > http://gnosis.cx/bin/.vim/after/syntax/python.vim (I am not author of > this file) > > If you apply it in vim then vim show lines (where is not cursor) > "translated". Means for example that math.pi is replaced by ?. > > So you still edit "math.pi" but if you move cursor outside of this > line then you could see formula simplified/prettified. (or more > complicated - because you need a little train your brain to accept new > view) > > I am pretty skeptic how popular this conceal technique could be in vim > pythonistas community! > > ( > why skeptic? > For example I am testing to improve readability of line similar to > > self.a = self.b + self.c > > using with this technique and see in vim > > ?a = ?b + ?c > > but **I am not sure if it is really useful** (probably I have to add > that I am editing code much more in other editor than vim) > > ? U+1420 CANADIAN SYLLABICS FINAL GRAVE > ) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: conceal.png
Type: image/png
Size: 165902 bytes
Desc: not available
URL: 

From stephanh42 at gmail.com Sat Jun 3 03:10:21 2017
From: stephanh42 at gmail.com (Stephan Houben)
Date: Sat, 3 Jun 2017 09:10:21 +0200
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: 
References: 
Message-ID: 

I use the conceal feature for this purpose, but I use https://github.com/ehamberg/vim-cute-python instead, which is a more toned-down version of the same idea.

I personally find x ≤ y, a ∉ B more readable than x <= y, a not in B etc. Especially the conceal lambda => λ is useful for de-cluttering the code.

Stephan

2017-06-03 8:50 GMT+02:00 David Mertz :
> This is a horrible thing that nobody else should do :-), but I *am* the
> author of the file linked by Pavol. It's based on someone else's version
> (credited in the file), but I fine tuned it for what I want. I'm also not
> the author of the conceal plugin for vim, which is pretty much exactly for
> just this.
>
> The screenshot attached is a little bit of a vim session editing a Python
> file. The key thing is that I *type* only regular ASCII characters; I just
> tell vim to show my something different on lines I am not currently editing.
>
> On Fri, Jun 2, 2017 at 11:26 PM, Pavol Lisy wrote:
>>
>> On 6/1/17, Serhiy Storchaka wrote:
>> > What you are think about adding Unicode aliases for some mathematic
>> > names in the math module? ;-)
>> >
>> > math.? = math.pi
>> > math.? = math.tau
>> > math.? = math.gamma
>> > math.? = math.e
>> >
>> > Unfortunately we can't use ?, ? and ? as identifiers. :-(
>>
>> My humble opinion: I would rather like to see something like:
>>
>> from some_wide_used_scientific_library.math_constants import *
>>
>> with good acceptance from scientific users before thinking to add it
>> into stdlib.
>>
>> PS.
>> Maybe this could be interesting for some vim users -> >> http://gnosis.cx/bin/.vim/after/syntax/python.vim (I am not author of >> this file) >> >> If you apply it in vim then vim show lines (where is not cursor) >> "translated". Means for example that math.pi is replaced by ?. >> >> So you still edit "math.pi" but if you move cursor outside of this >> line then you could see formula simplified/prettified. (or more >> complicated - because you need a little train your brain to accept new >> view) >> >> I am pretty skeptic how popular this conceal technique could be in vim >> pythonistas community! >> >> ( >> why skeptic? >> For example I am testing to improve readability of line similar to >> >> self.a = self.b + self.c >> >> using with this technique and see in vim >> >> ?a = ?b + ?c >> >> but **I am not sure if it is really useful** (probably I have to add >> that I am editing code much more in other editor than vim) >> >> ? U+1420 CANADIAN SYLLABICS FINAL GRAVE >> ) >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > > > > -- > Keeping medicines from the bloodstreams of the sick; food > from the bellies of the hungry; books from the hands of the > uneducated; technology from the underdeveloped; and putting > advocates of freedom in prisons. Intellectual property is > to the 21st century what the slave trade was to the 16th. 
> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From stephanh42 at gmail.com Sat Jun 3 03:14:10 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Sat, 3 Jun 2017 09:14:10 +0200 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: > My humble opinion: I would rather like to see something like: > > from some_wide_used_scientific_library.math_constants import * > > with good acceptance from scientific users before thinking to add it > into stdlib. Agree with this, but note that a similar proposal has once been made to scipy. It was rejected for now since they still also target Python2. So once scipy drops Python2 supports (presumable around 2020 at the latest), this could be re-proposed there. Stephan 2017-06-03 8:26 GMT+02:00 Pavol Lisy : > On 6/1/17, Serhiy Storchaka wrote: >> What you are think about adding Unicode aliases for some mathematic >> names in the math module? ;-) >> >> math.? = math.pi >> math.? = math.tau >> math.? = math.gamma >> math.? = math.e >> >> Unfortunately we can't use ?, ? and ? as identifiers. :-( > > My humble opinion: I would rather like to see something like: > > from some_wide_used_scientific_library.math_constants import * > > with good acceptance from scientific users before thinking to add it > into stdlib. > > PS. > Maybe this could be interesting for some vim users -> > http://gnosis.cx/bin/.vim/after/syntax/python.vim (I am not author of > this file) > > If you apply it in vim then vim show lines (where is not cursor) > "translated". Means for example that math.pi is replaced by ?. > > So you still edit "math.pi" but if you move cursor outside of this > line then you could see formula simplified/prettified. 
(or more complicated - because you need a little train your brain to accept new view)

> I am pretty skeptic how popular this conceal technique could be in vim
> pythonistas community!
>
> (
> why skeptic?
> For example I am testing to improve readability of line similar to
>
> self.a = self.b + self.c
>
> using with this technique and see in vim
>
> ?a = ?b + ?c
>
> but **I am not sure if it is really useful** (probably I have to add
> that I am editing code much more in other editor than vim)
>
> ? U+1420 CANADIAN SYLLABICS FINAL GRAVE
> )
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From pavol.lisy at gmail.com Sat Jun 3 03:22:14 2017
From: pavol.lisy at gmail.com (Pavol Lisy)
Date: Sat, 3 Jun 2017 09:22:14 +0200
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: 
References: <20170603001026.GB17170@ando.pearwood.info>
Message-ID: 

On 6/3/17, Chris Angelico wrote:
> On Sat, Jun 3, 2017 at 3:42 PM, Pavol Lisy wrote:
>> Sorry for probably stupid question! Is something like ->
>>
>> class A:
>>     def __oper__(self, '?', other):
>>         return something(self.value, other)
>>
>> a = A()
>> a ? 3
>>
>> thinkable?
>
> No, because operators need to be defined before you get to individual
> objects, and they need precedence and associativity. So it'd have to
> be defined at the compiler level.

Thanks for clarifying this point.

Sorry for another stupid question: coding import machinery couldn't be used too, right? (I mean something like hylang.org ) Because ast could not understand these operators (and precedence and associativity)?

BTW there could be also question about "multipliability". I mean something like a↑↑↑n ( see https://en.wikipedia.org/wiki/Knuth%27s_up-arrow_notation )

> Also, having arbitrary operators gets extremely confusing.
It's not > easy to reason about code when you don't know what's even an operator. I was thinking about it, but python is like this! You couldn't be really sure what is operator + doing! :) And it could be much easier to learn what some operator means in some library than for example understand async paradigm. (at least for some people) > Not a stupid question, but one for which the answer is "definitely not > like that". Thanks again! :) Although I am not sure it is definitely impossible I see that it is pretty pretty difficult. From ncoghlan at gmail.com Sat Jun 3 03:44:07 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 3 Jun 2017 17:44:07 +1000 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: <20170603001026.GB17170@ando.pearwood.info> Message-ID: On 3 June 2017 at 15:55, Chris Angelico wrote: > On Sat, Jun 3, 2017 at 3:42 PM, Pavol Lisy wrote: >> Sorry for probably stupid question! Is something like -> >> >> class A: >> def __oper__(self, '?', other): >> return something(self.value, other) >> >> a = A() >> a ? 3 >> >> thinkable? > > No, because operators need to be defined before you get to individual > objects, and they need precedence and associativity. So it'd have to > be defined at the compiler level. > > Also, having arbitrary operators gets extremely confusing. It's not > easy to reason about code when you don't know what's even an operator. > > Not a stupid question, but one for which the answer is "definitely not > like that". A useful background read on this question specifically in the context of Python is PEP 465 (which added A at B for matrix multiplication), and in particular its discussion of the rejected alternatives: https://www.python.org/dev/peps/pep-0465/#rejected-alternatives-to-adding-a-new-operator For most purposes, the existing set of operators is sufficient, since we can alias them for unusual purposes (e.g. 
"/" for path joining in pathlib) when we don't need access to the more conventional meaning (division in that case, since "dividing one path segment by another" is nonsensical) and context makes it possible for the reader to understand what is going on ("filepath = segment1 / segment2 / segment3" looks a lot like writing out a filesystem path as a string and the name of the assignment target makes it clear this is a filesystem path operation, not a division operation). Matrix multiplication turned out to be a genuine expection, since all the other binary operators had well defined meanings as elementwise-operators, so borrowing one of them for matrix multiplication meant losing access to the corresponding elementwise operation, and there typically *weren't* enough hints from the context to let you know whether "*" was by element or the dot product. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Jun 3 03:54:19 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 3 Jun 2017 17:54:19 +1000 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: <20170603001026.GB17170@ando.pearwood.info> Message-ID: On 3 June 2017 at 17:22, Pavol Lisy wrote: > On 6/3/17, Chris Angelico wrote: >> On Sat, Jun 3, 2017 at 3:42 PM, Pavol Lisy wrote: >>> Sorry for probably stupid question! Is something like -> >>> >>> class A: >>> def __oper__(self, '?', other): >>> return something(self.value, other) >>> >>> a = A() >>> a ? 3 >>> >>> thinkable? >> >> No, because operators need to be defined before you get to individual >> objects, and they need precedence and associativity. So it'd have to >> be defined at the compiler level. > > Thanks for clarifying this point. > > Sorry for another stupid question: coding import machinery couldn't be > used too, right? (I mean something like hylang.org ) Because ast could > not understand these operators (and precedence and associativity)? 
Source translation frontends *can* define new in-fix operators, but they need to translate them into explicit method and/or function calls before they reach the AST. So a frontend that added "A @ B" support to Python 2.7 (for example), would need to translate it into something like "numpy.dot(A, B)" or "matmul(A, B)" at the Python AST level. It would then be up to that function to emulate Python 3's __matmul__ special method support (or not, if the frontend was happy to only support a particular type, such as NumPy arrays) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Sat Jun 3 06:36:30 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 3 Jun 2017 20:36:30 +1000 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: <20170602010559.GS23443@ando.pearwood.info> Message-ID: <20170603103629.GE17170@ando.pearwood.info> On Fri, Jun 02, 2017 at 12:36:59PM +1000, Chris Angelico wrote: [...] > > I expect that moving '' to the end of sys.path will be a less disruptive > > change than removing it. > > This is true. However, anything that depends on the current behaviour > (intentionally or otherwise) would be just as broken as if it were > removed, I don't think we've agreed that the current behaviour is broken. I think we agree that: - it is unfortunate when people accidentally shadow the stdlib; - it is a feature to be able to intentionally shadow the stdlib. I believe that it is also a feature for scripts to be able to depend on resources in their directory, including other modules. That's the current behaviour. I don't know if you agree, but if you want to argue that's "broken", you should do so explicitly. Broken or not, removing '' from the sys.path will break scripts that expect to import modules in their directory. So even if we conclude that the current behaviour is broken, we still need a deprecation period. How about... ? 
- in 3.7, we add a pair of command line flags, let's say:

      --script-directory      # add '' to sys.path when running scripts
      --no-script-directory   # don't add '' to sys.path

  with the default remaining to add it

- add a warning to 3.7 whenever you import a module from ''

- in 3.9, we move '' to the end of sys.path instead of the start

- and in 3.9, the default changes to not adding '' to sys.path unless explicitly requested.

Alternatively, we could use a single flag:

    --script-directory-location=FIRST|LAST|NONE

> and it's still possible for "import foo" to mean a local
> import today and a stdlib import tomorrow.

Of course -- that's the downside of any search path. Without a central authority to allocate library names, how do you deal with name conflicts? Any library can clash with any other library, all you can do is choose the order in which clashes are resolved.

The advantage of moving '' to the end is that shadowing the stdlib becomes an affirmative, deliberate act, rather than something easy to do by accident. Experts who want to shadow a stdlib module can be assumed to be able to cope with the consequences of moving '' back to the front (or whatever technique they use to shadow a module).

The disadvantage is that now the std lib can shadow your modules! Naive users will instead find the std lib shadowing *their* modules, which will be no less mysterious when it happens, but at least you can't break things by shadowing a module you don't directly import. E.g. you import A, which requires B, but you've shadowed B, and now A is broken. That's an especially frustrating error for beginners to diagnose, even with help. Moving '' to the end of the path will, I think, all but eliminate that sort of shadowing error.

> It'd just move the
> ambiguity around; instead of not being sure if it's been accidentally
> shadowed, instead you have to wonder if your code will break when a
> future Python version adds a new stdlib module.

You don't have to wonder.
You just have to read the What's New document. Your code can also break if you install a third-party library which (accidentally?) shadows the std lib, or another library which you rely on. There are only so many short, descriptive library names, and any time anyone (std lib or not) uses one, it risks clashing with somebody else's use of the same name.

> Or your code could
> break because someone pip-installs a third-party module. You can take
> your pick which thing shadows which by choosing where you place '' in
> sys.path, but there'll always be a risk one way or another.

Indeed.

> Different packages don't conflict with each other because
> intra-package imports are explicit. They're safe.

Unless the package shadows another package of the same name.

By the way, to give you an idea of how hairy this can get, there was a recent bug report (now closed) complaining that intra-package imports can shadow each other:

    spam/
    +-- __init__.py
    +-- eggs/
    |   +-- __init__.py
    |   +-- cheese
    +-- eggs.py

Inside spam, the eggs.py module shadows the eggs subdirectory and makes it impossible to reach eggs.cheese. (Or perhaps the other way -- I don't think the language guarantees one way or the other.)

-- 
Steve

From bepshatsky at yandex.ru Sat Jun 3 06:59:59 2017
From: bepshatsky at yandex.ru (Daniel Bershatsky)
Date: Sat, 03 Jun 2017 13:59:59 +0300
Subject: [Python-ideas] Defer Statement
Message-ID: <4774921496487599@web21m.yandex.ru>

An HTML attachment was scrubbed...
URL: 

From ncoghlan at gmail.com Sat Jun 3 08:24:33 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 3 Jun 2017 22:24:33 +1000
Subject: [Python-ideas] Defer Statement
In-Reply-To: <4774921496487599@web21m.yandex.ru>
References: <4774921496487599@web21m.yandex.ru>
Message-ID: 

On 3 June 2017 at 20:59, Daniel Bershatsky wrote:
> Dear Python Developers,
>
> We have a potential idea for enhancing Python. You will find a kind of draft
> bellow.

Thank you for taking the time to write this up!
> Best regards,
> Daniel Bershatsky
>
> Abstract
> ========
>
> This PEP proposes the introduction of new syntax to create community
> standard, readable and clear way to defered function execution in basic
> block on all control flows.
>
> Proposal
> ========
>
> There is not any mechanism to defer the execution of function in python.

There is, thanks to context managers:

    import contextlib

    def foo(i):
        print(i)

    def bar():
        with contextlib.ExitStack() as stack:
            stack.callback(foo, 42)
            print(3.14)

    bar()

Now, defer is certainly pithier, but thanks to contextlib2, the above code can be used all the way back to Python 2.6, rather than being limited to running on 3.7+.

I was also motivated enough to *write* ExitStack() to solve this problem, but even I don't use it often enough to consider it worthy of being a builtin, let alone syntax.

So while I'm definitely sympathetic to the use case (otherwise ExitStack wouldn't have a callback() method), "this would be useful" isn't a sufficient argument in this particular case - what's needed is a justification that this pattern of resource management is common enough to justify giving functions an optional implicit ExitStack instance and assigning a dedicated keyword for adding entries to it.

Alternatively, the case could be made that there's a discoverability problem, where folks aren't necessarily being pointed towards ExitStack as a dynamic resource management tool when that's what they need, and to consider what could be done to help resolve that (with adding a new kind of statement being just one of the options evaluated).

Cheers,
Nick.

P.S.
Nikolas Rauth has a more in-depth write-up of the utility of ExitStack here: https://www.rath.org/on-the-beauty-of-pythons-exitstack.html -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From toddrjen at gmail.com Sat Jun 3 08:30:16 2017 From: toddrjen at gmail.com (Todd) Date: Sat, 3 Jun 2017 08:30:16 -0400 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: <20170603001026.GB17170@ando.pearwood.info> Message-ID: Julia lets you define new infix operators directly, including using mathematical symbols as operators. Not that I think that is a good idea, but you can do it. On Jun 3, 2017 2:00 AM, "Joshua Morton" wrote: For reference, haskell is perhaps the closest language to providing arbitrary infix operators, and it requires that they be surrounded by backticks. That is A `op` B is equivalent to op(A, B) That doesn't work for python (backtick is taken) and I don't think anything similar is a good idea. On Sat, Jun 3, 2017 at 1:56 AM Chris Angelico wrote: > On Sat, Jun 3, 2017 at 3:42 PM, Pavol Lisy wrote: > > Sorry for probably stupid question! Is something like -> > > > > class A: > > def __oper__(self, '?', other): > > return something(self.value, other) > > > > a = A() > > a ? 3 > > > > thinkable? > > No, because operators need to be defined before you get to individual > objects, and they need precedence and associativity. So it'd have to > be defined at the compiler level. > > Also, having arbitrary operators gets extremely confusing. It's not > easy to reason about code when you don't know what's even an operator. > > Not a stupid question, but one for which the answer is "definitely not > like that". 
> > ChrisA
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
_______________________________________________
Python-ideas mailing list
Python-ideas at python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From storchaka at gmail.com Sat Jun 3 08:51:50 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sat, 3 Jun 2017 15:51:50 +0300
Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?=
In-Reply-To: 
References: <20170603001026.GB17170@ando.pearwood.info>
Message-ID: 

03.06.17 03:31, Guido van Rossum wrote:
> OK, I think this discussion is pretty much dead then. We definitely
> shouldn't allow math operators in identifiers, otherwise in Python 4 or
> 5 we couldn't introduce them as operators.

Sorry. I proposed this idea as a joke. math.π is useless, but mostly harmless. But I don't want to change Python grammar.

The rule for Python identifiers already is not easy, there is no simple regular expression for them, and I'm sure most tools processing Python sources (even the tokenize module and IDLE) do not handle all Python identifiers correctly. For example they don't recognize the symbol ℘ (U+2118, SCRIPT CAPITAL P) as a valid identifier.

From rosuav at gmail.com Sat Jun 3 09:23:05 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 3 Jun 2017 23:23:05 +1000
Subject: [Python-ideas] Security: remove "." from sys.path?
In-Reply-To: <20170603103629.GE17170@ando.pearwood.info>
References: <20170602010559.GS23443@ando.pearwood.info> <20170603103629.GE17170@ando.pearwood.info>
Message-ID: 

On Sat, Jun 3, 2017 at 8:36 PM, Steven D'Aprano wrote:
> On Fri, Jun 02, 2017 at 12:36:59PM +1000, Chris Angelico wrote:
>
> [...]
>> > I expect that moving '' to the end of sys.path will be a less disruptive >> > change than removing it. >> >> This is true. However, anything that depends on the current behaviour >> (intentionally or otherwise) would be just as broken as if it were >> removed, > > I don't think we've agreed that the current behaviour is broken. I > think we agree that: > > - it is unfortunate when people accidentally shadow the stdlib; > > - it is a feature to be able to intentionally shadow the stdlib. > > I believe that it is also a feature for scripts to be able to depend on > resources in their directory, including other modules. That's the > current behaviour. I don't know if you agree, but if you want to argue > that's "broken", you should do so explicitly. No, I'm not arguing that that behaviour is broken. Unideal, perhaps, but definitely not broken. What I said was that an application that depends on "import secrets" picking up secrets.py in the current directory is just as broken if '' is moved to the end as if it's removed altogether. By moving it to the end, we increase the chances that a minor version will break someone's code; by removing it altogether and forcing people to write "from . import secrets" (either with an implicit package or making people explicitly create __init__.py), we also force the issue to be fixed earlier. Instead of a potential future breakage, we have an immediate breakage with an easy and obvious solution. That's not to say that I don't think moving '' to the end would be an advantage. I just think that, if we're proposing to change the current behaviour and thus potentially break people's current code, we should fix the problem completely rather than merely reducing it. ChrisA From fakedme+py at gmail.com Sat Jun 3 09:49:47 2017 From: fakedme+py at gmail.com (Soni L.) Date: Sat, 3 Jun 2017 10:49:47 -0300 Subject: [Python-ideas] Security: remove "." from sys.path? 
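To make the intent concrete, here is a rough sketch of how `self.`/`super.` names might resolve to paths. This is purely illustrative pseudo-logic, not the real import machinery - the function name and signature are made up for this sketch, and it ignores the distinction between packages and plain modules:

```python
import os

def resolve(name, current_dir, parent_dir=None):
    # Toy resolution of "self."/"super." imports to candidate file paths.
    # current_dir: directory of the importing module; parent_dir: directory
    # of its parent package, or None at top level.
    head, _, rest = name.partition('.')
    parts = rest.split('.')
    if head == 'self':
        return os.path.join(current_dir, *parts) + '.py'
    if head == 'super':
        if parent_dir is None:
            # fail-by-default at top level, as described above
            raise ImportError("'super' import outside any package")
        return os.path.join(parent_dir, *parts) + '.py'
    raise ImportError('not a self/super import: %r' % (name,))

print(resolve('self.xy', '.'))           # e.g. ./xy.py on POSIX
print(resolve('super.zy', './xy', '.'))  # e.g. ./zy.py on POSIX
```

The point is only that resolution is anchored to directories, never to sys.path, so '' in sys.path becomes irrelevant for these imports.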
In-Reply-To: References: <20170602010559.GS23443@ando.pearwood.info> <20170603103629.GE17170@ando.pearwood.info> Message-ID: How about `import self.thing` (where "self" implies same dir as the current .py - if it's a dir-based package, then "self" is that dir) and `import super.thing` (where "super" implies parent package *locked to parent dir*. if there isn't any (top level package or main script), fail-by-default but let scripts override this behaviour (some scripts may want to reconfigure it to ignore "super" if there is no super)) With `import __future__.self_super_packages` to enable it. This should then allow '' to be completely removed, since you can just use `self` and/or `super`. Imports using `self.thing` should have their `super` set to the current `self`, e.g. ./main.py import self.xy ./xy/__init__.py import super.zy ./zy/__init__.py print "hello world" Should print "hello world" when you run main.py, even if there are modules `xy` and `zy` in the python path and no ''. On 2017-06-03 10:23 AM, Chris Angelico wrote: > On Sat, Jun 3, 2017 at 8:36 PM, Steven D'Aprano wrote: >> On Fri, Jun 02, 2017 at 12:36:59PM +1000, Chris Angelico wrote: >> >> [...] >>>> I expect that moving '' to the end of sys.path will be a less disruptive >>>> change than removing it. >>> This is true. However, anything that depends on the current behaviour >>> (intentionally or otherwise) would be just as broken as if it were >>> removed, >> I don't think we've agreed that the current behaviour is broken. I >> think we agree that: >> >> - it is unfortunate when people accidentally shadow the stdlib; >> >> - it is a feature to be able to intentionally shadow the stdlib. >> >> I believe that it is also a feature for scripts to be able to depend on >> resources in their directory, including other modules. That's the >> current behaviour. I don't know if you agree, but if you want to argue >> that's "broken", you should do so explicitly. 
> No, I'm not arguing that that behaviour is broken. Unideal, perhaps, > but definitely not broken. What I said was that an application that > depends on "import secrets" picking up secrets.py in the current > directory is just as broken if '' is moved to the end as if it's > removed altogether. By moving it to the end, we increase the chances > that a minor version will break someone's code; by removing it > altogether and forcing people to write "from . import secrets" (either > with an implicit package or making people explicitly create > __init__.py), we also force the issue to be fixed earlier. Instead of > a potential future breakage, we have an immediate breakage with an > easy and obvious solution. > > That's not to say that I don't think moving '' to the end would be an > advantage. I just think that, if we're proposing to change the current > behaviour and thus potentially break people's current code, we should > fix the problem completely rather than merely reducing it. > > ChrisA
From ncoghlan at gmail.com Sat Jun 3 10:42:41 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 4 Jun 2017 00:42:41 +1000 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: Message-ID: On 2 June 2017 at 02:30, Victor Stinner wrote: > Hi, > > Perl 5.26 succeeded to remove the current working directory from the > default include path (our Python sys.path): > > https://metacpan.org/pod/release/XSAWYERX/perl-5.26.0/pod/perldelta.pod#Removal-of-the-current-directory-(%22.%22)-from- at INC > > Would it technically possible to make this change in Python? Or would > it destroy the world? Sorry, it's a naive question (but honestly, I > don't know the answer.) > > My main use case for "." in sys.path is to be to run an application > without installing it: run ./hachoir-metadata which loads the Python > "hachoir" module from the script directory. Sometimes, I run > explicitly "PYTHONPATH=$PWD ./hachoir-metadata".
> > But I don't think that running an application from the source without > installing it is the most common way to run an application. Most users > install applications to use them, no? Scripts are very frequently run without installing them, as are things like Jupyter Notebooks, so any change along these lines would need to be carefully planned to avoid being unduly disruptive. It's entirely feasible at a technical level, though - https://bugs.python.org/issue29929 describes one way to move away from "import X" for __main__ relative imports and towards "from . import X", which essentially involves turning __main__ into a package in its own right when it's a directly executed script. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tjol at tjol.eu Sat Jun 3 11:33:43 2017 From: tjol at tjol.eu (Thomas Jollans) Date: Sat, 3 Jun 2017 17:33:43 +0200 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: <87efv3tw6r.fsf@thinkpad.rath.org> References: <87efv3tw6r.fsf@thinkpad.rath.org> Message-ID: <8dc35096-96a2-767b-7a8b-e1359dddd38f@tjol.eu> On 01/06/17 20:11, Nikolaus Rath wrote: > On Jun 01 2017, Victor Stinner wrote: >> 2017-06-01 8:47 GMT+02:00 Serhiy Storchaka : >>> What you are think about adding Unicode aliases for some mathematic names in >>> the math module? ;-) >>> >>> math.π = math.pi >> How do you write π (pi) with a keyboard on Windows, Linux or macOS? > Under Linux, you'd use the Compose facility. Take a look at eg. > /usr/share/X11/locale/en_US.UTF-8/Compose for all the nice things it > lets you enter: > > $ egrep '[Γπτ]' /usr/share/X11/locale/en_US.UTF-8/Compose > : "Γ" U0393 # GREEK CAPITAL LETTER GAMMA >

: "π" U03C0 # GREEK SMALL LETTER PI > : "τ" U03C4 # GREEK SMALL LETTER TAU > Have you ever seen a keyboard with a Compose key in real life, though? -- Thomas Jollans m ? +31 6 42630259 e ? tjol at tjol.eu From steve at pearwood.info Sat Jun 3 12:36:50 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 4 Jun 2017 02:36:50 +1000 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: <20170603001026.GB17170@ando.pearwood.info> Message-ID: <20170603163647.GF17170@ando.pearwood.info> On Sat, Jun 03, 2017 at 03:51:50PM +0300, Serhiy Storchaka wrote: > The rule for Python identifiers already is not easy, there is no simple > regular expression for them, and I'm sure most tools processing Python > sources (even the tokenize module and IDLE) do not handle all Python > identifiers correctly. For example they don't recognize the symbol ℘ > (U+2118, SCRIPT CAPITAL P) as a valid identifier. They shouldn't, because it isn't a valid identifier: it's a Maths Symbol, not a letter, same as ? ? ? ? etc. https://en.wikipedia.org/wiki/Weierstrass_p py> unicodedata.category('℘') 'Sm' But Python 3.5 does treat it as an identifier! py> ℘ = 1 # should be a SyntaxError ? py> ℘ 1 There's a bug here, somewhere, I'm just not sure where... The PEP for non-ASCII identifiers is quite old now (it was written for Unicode 4!) but it excludes category 'Sm' in its identifier algorithm: https://www.python.org/dev/peps/pep-3131/#id16 -- Steve From ericsnowcurrently at gmail.com Sat Jun 3 12:45:20 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sat, 3 Jun 2017 10:45:20 -0600 Subject: [Python-ideas] Security: remove "." from sys.path?
In-Reply-To: <20170603103629.GE17170@ando.pearwood.info> References: <20170602010559.GS23443@ando.pearwood.info> <20170603103629.GE17170@ando.pearwood.info> Message-ID: On Sat, Jun 3, 2017 at 4:36 AM, Steven D'Aprano wrote: > I believe that it is also a feature for scripts to be able to depend on > resources in their directory, including other modules. That's the > current behaviour. [snip] > > Broken or not, removing '' from the sys.path will break scripts that > expect to import modules in their directory. [snip] Which is why the implicit sys.path entry probably can't go away, even though it would be nice in some ways. IIRC, in the past Guido has indicated he's opposed to dropping the implicit sys.path entry for reasons along these lines. > > How about... ? > > - in 3.7, we add a pair of command line flags, let's say: > > --script-directory # add '' to sys.path when running scripts > --no-script-directory # don't add '' to sys.path http://bugs.python.org/issue13475 spells these as "--path0" and "--nopath0". Also see http://www.python.org/dev/peps/pep-0395/ for "ways that the current automatic initialisation of sys.path[0] can go wrong" (quoted from the issue). > > with the default remaining to add it > > - add a warning to 3.7 whenever you import a module from '' > > - in 3.9, we move '' to the end of sys.path instead of the start Both seem okay. Doing so would help with some of the reasons detailed in PEP 395. However we'd need to be sure the consequences are as minimal as they seem. :) > > - and in 3.9, the default changes to not adding '' to sys.path unless > explicitly requested. We'd need to make sure there was a simple, obvious replacement. I'm not convinced dropping the implicit sys.path entry is worth doing though.
-eric From steve at pearwood.info Sat Jun 3 12:48:24 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 4 Jun 2017 02:48:24 +1000 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: <20170603163647.GF17170@ando.pearwood.info> References: <20170603001026.GB17170@ando.pearwood.info> <20170603163647.GF17170@ando.pearwood.info> Message-ID: <20170603164824.GG17170@ando.pearwood.info> On Sun, Jun 04, 2017 at 02:36:50AM +1000, Steven D'Aprano wrote: > But Python 3.5 does treat it as an identifier! > > py> ℘ = 1 # should be a SyntaxError ? > py> ℘ > 1 > > There's a bug here, somewhere, I'm just not sure where... That appears to be the only Symbol Math character which is accepted as an identifier in Python 3.5: py> import unicodedata py> all_unicode = map(chr, range(0x110000)) py> symbols = [c for c in all_unicode if unicodedata.category(c) == 'Sm'] py> len(symbols) 948 py> ns = {} py> for c in symbols: ... try: ... exec(c + " = 1", ns) ... except SyntaxError: ... pass ... else: ... print(c, unicodedata.name(c)) ... ℘ SCRIPT CAPITAL P py> -- Steve From brett at python.org Sat Jun 3 13:45:43 2017 From: brett at python.org (Brett Cannon) Date: Sat, 03 Jun 2017 17:45:43 +0000 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: On Fri, 2 Jun 2017 at 15:56 Guido van Rossum wrote: > I would love to show how easy it is to write > > from math import pi as π, gamma as Γ > > but I had to cheat by copying from the OP since I don't know how to type > these (and even if you were to tell me how I'd forget tomorrow). So, I am > still in favor of the rule "only ASCII in the stdlib". > Since this regularly comes up, why don't we add a note to the math module that you can do the above import(s) to bind various mathematical constants to their traditional symbol counterparts?
The note can even start off with something like "While Python's standard library only uses ASCII characters to maximize ease of use and contribution, individuals are allowed to use various Unicode characters for variable names." This would also help with making sure people don't come back later and say, "why don't you just add the constants to the module?" -Brett > > On Fri, Jun 2, 2017 at 3:48 PM, Ivan Levkivskyi > wrote: > >> On 2 June 2017 at 12:17, Giampaolo Rodola' wrote: >> >>> On Thu, Jun 1, 2017 at 8:47 AM, Serhiy Storchaka >>> wrote: >>> >>>> What you are think about adding Unicode aliases for some mathematic >>>> names in the math module? ;-) >>>> >>>> math.π = math.pi >>>> math.τ = math.tau >>>> math.Γ = math.gamma >>>> math.ℯ = math.e >>>> >>>> Unfortunately we can't use ?, ? and ? as identifiers. :-( >>>> >>>> [...] >>> * duplicated aliases might make sense if they add readability; in this >>> case they don't unless (maybe) you have a mathematical background. I can >>> infer what "math.gamma" stands for but not being a mathematician math.Γ >>> makes absolutely zero sense to me. >>> >>> >> There is a significant number of scientific Python programmers (21% >> according to PyCharm 2016), so it is not that rare to meet someone who >> knows what is Gamma function. >> And for many of them π is much more readable than np.pi. Also there is >> another problem, confusion between Gamma function and Euler-Mascheroni >> constant, the first one is Γ, >> the second one is γ (perfectly opposite to PEP 8 capitalization rules >> :-), while both of them are frequently denoted as just gamma (in particular >> math.gamma follows the PEP8 rules, >> but is counter-intuitive for most scientist). >> >> All that said, I agree that these problems are easily solved by a custom >> import from. Still there is something in (or related to?) this proposal >> I think is worth considering: Can we also allow identifiers like ? or ?.
>> This will make many expressions more similar to usual TeX, >> plus it will be useful for projects like SymPy. >> >> -- >> Ivan >> >> > > -- > --Guido van Rossum (python.org/~guido) From storchaka at gmail.com Sat Jun 3 14:39:41 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 3 Jun 2017 21:39:41 +0300 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: 03.06.17 20:45, Brett Cannon wrote: > Since this regularly comes up, why don't we add a note to the math > module that you can do the above import(s) to bind various mathematical > constants to their traditional symbol counterparts? The note can even > start off with something like "While Python's standard library only uses > ASCII characters to maximize ease of use and contribution, individuals > are allowed to use various Unicode characters for variable names." This > would also help with making sure people don't come back later and say, > "why don't you just add the constants to the module?" Nice idea!
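[Editor's note: the aliasing discussed in this subthread is ordinary name binding, nothing special-cased by CPython. A minimal sketch, assuming Python 3.6+ for math.tau:]

```python
import math

# Module attributes are plain bindings, so a program can add the
# proposed aliases itself (monkey-patching math, as Mertz suggested):
math.π = math.pi
math.τ = math.tau

# Or, leaving the math module untouched, bind the symbols locally:
from math import pi as π, tau as τ

area = π * 2 ** 2   # π is the same float object as math.pi
print(area)         # 12.566370614359172
```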
From rosuav at gmail.com Sat Jun 3 14:41:22 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 4 Jun 2017 04:41:22 +1000 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: <20170603164824.GG17170@ando.pearwood.info> References: <20170603001026.GB17170@ando.pearwood.info> <20170603163647.GF17170@ando.pearwood.info> <20170603164824.GG17170@ando.pearwood.info> Message-ID: On Sun, Jun 4, 2017 at 2:48 AM, Steven D'Aprano wrote: > On Sun, Jun 04, 2017 at 02:36:50AM +1000, Steven D'Aprano wrote: > >> But Python 3.5 does treat it as an identifier! >> >> py> ℘ = 1 # should be a SyntaxError ? >> py> ℘ >> 1 >> >> There's a bug here, somewhere, I'm just not sure where... > > That appears to be the only Symbol Math character which is accepted as > an identifier in Python 3.5: > > py> import unicodedata > py> all_unicode = map(chr, range(0x110000)) > py> symbols = [c for c in all_unicode if unicodedata.category(c) == 'Sm'] > py> len(symbols) > 948 > py> ns = {} > py> for c in symbols: > ... try: > ... exec(c + " = 1", ns) > ... except SyntaxError: > ... pass > ... else: > ... print(c, unicodedata.name(c)) > ... > ℘ SCRIPT CAPITAL P > py> Curious. And not specific to 3.5 - the exact same thing happens in 3.7. Here's the full category breakdown: cats = collections.defaultdict(int) ns = {} for c in map(chr, range(1, 0x110000)): try: exec(c + " = 1", ns) except SyntaxError: pass except UnicodeEncodeError: if unicodedata.category(c) != "Cs": raise else: cats[unicodedata.category(c)] += 1 defaultdict(<class 'int'>, {'Po': 1, 'Lu': 1702, 'Pc': 1, 'Ll': 2063, 'Lo': 112703, 'Lt': 31, 'Lm': 245, 'Nl': 236, 'Mn': 2, 'Sm': 1, 'So': 1}) For reference, as well as the 948 Sm, there are 1690 Mn and 5777 So, but only these characters are valid from them: \u1885 Mn MONGOLIAN LETTER ALI GALI BALUDA \u1886 Mn MONGOLIAN LETTER ALI GALI THREE BALUDA ℘ Sm SCRIPT CAPITAL P ℮
So ESTIMATED SYMBOL 2118 SCRIPT CAPITAL P and 212E ESTIMATED SYMBOL are listed in PropList.txt as Other_ID_Start, so they make sense. But that doesn't explain the two characters from category Mn. It also doesn't explain why U+309B and U+309C are *not* valid, despite being declared Other_ID_Start. Maybe it's a bug? Maybe 309B and 309C somehow got switched into 1885 and 1886?? ChrisA From rosuav at gmail.com Sat Jun 3 14:44:35 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 4 Jun 2017 04:44:35 +1000 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: Message-ID: On Sun, Jun 4, 2017 at 12:42 AM, Nick Coghlan wrote: >> But I don't think that running an application from the source without >> installing it is the most common way to run an application. Most users >> install applications to use them, no? > > Scripts are very frequently run without installing them, as are things > like Jupyter Notebooks, so any change along these lines would need to > be carefully planned to avoid being unduly disruptive. > A single-file script wouldn't be affected; only something that has more than one file "side by side" in an arbitrary directory, and imports one from the other. Do Jupyter notebooks do that? I've no idea how they work under the covers. ChrisA From tjol at tjol.eu Sat Jun 3 14:50:28 2017 From: tjol at tjol.eu (Thomas Jollans) Date: Sat, 3 Jun 2017 20:50:28 +0200 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: <20170603164824.GG17170@ando.pearwood.info> References: <20170603001026.GB17170@ando.pearwood.info> <20170603163647.GF17170@ando.pearwood.info> <20170603164824.GG17170@ando.pearwood.info> Message-ID: On 03/06/17 18:48, Steven D'Aprano wrote: > On Sun, Jun 04, 2017 at 02:36:50AM +1000, Steven D'Aprano wrote: > >> But Python 3.5 does treat it as an identifier! >> >> py> ℘ = 1 # should be a SyntaxError ? >> py> ℘ >> 1 >> >> There's a bug here, somewhere, I'm just not sure where...
> That appears to be the only Symbol Math character which is accepted as > an identifier in Python 3.5: > > py> import unicodedata > py> all_unicode = map(chr, range(0x110000)) > py> symbols = [c for c in all_unicode if unicodedata.category(c) == 'Sm'] > py> len(symbols) > 948 > py> ns = {} > py> for c in symbols: > ... try: > ... exec(c + " = 1", ns) > ... except SyntaxError: > ... pass > ... else: > ... print(c, unicodedata.name(c)) > ... > ℘ SCRIPT CAPITAL P > py> This is actually not a bug in Python, but a quirk in Unicode. I've had a closer look at PEP 3131 [1], which specifies that Python identifiers follow the Unicode classes XID_Start and XID_Continue. ℘ is listed in the standard [2][3] as XID_Start, so Python correctly accepts it as an identifier. >>> import unicodedata >>> all_unicode = map(chr, range(0x110000)) >>> for c in all_unicode: ... category = unicodedata.category(c) ... if not category.startswith('L') and category != 'Nl': # neither letter nor letter-number ... if c.isidentifier(): ... print('%s\tU+%04X\t%s' % (c, ord(c), unicodedata.name(c))) ... _ U+005F LOW LINE ℘ U+2118 SCRIPT CAPITAL P ℮ U+212E ESTIMATED SYMBOL >>> ℘ and ℮ are actually explicitly mentioned in the Unicode annex [3]: > > 2.5 Backward Compatibility > > Unicode General_Category values are kept as stable as possible, but > they can change across versions of the Unicode Standard. The bulk of > the characters having a given value are determined by other > properties, and the coverage expands in the future according to the > assignment of those properties. In addition, the Other_ID_Start > property provides a small list of characters that qualified as > ID_Start characters in some previous version of Unicode solely on the > basis of their General_Category properties, but that no longer qualify > in the current version. These are called /grandfathered/ characters. > > The Other_ID_Start property includes characters such as the following: > > U+2118 ( ℘
) SCRIPT CAPITAL P > U+212E ( ℮ ) ESTIMATED SYMBOL > U+309B ( ゛ ) KATAKANA-HIRAGANA VOICED SOUND MARK > U+309C ( ゜ ) KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK > > I have no idea why U+309B and U+309C are not accepted as identifiers by Python 3.5. This could be a question of Python following an old version of the Unicode standard, or it *could* be a bug. Thomas [1] https://www.python.org/dev/peps/pep-3131/#specification-of-language-changes [2] http://www.unicode.org/Public/4.1.0/ucd/DerivedCoreProperties.txt [3] http://www.unicode.org/reports/tr31/ From dan at tombstonezero.net Sat Jun 3 15:02:28 2017 From: dan at tombstonezero.net (Dan Sommers) Date: Sat, 3 Jun 2017 19:02:28 +0000 (UTC) Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= References: Message-ID: On Sat, 03 Jun 2017 17:45:43 +0000, Brett Cannon wrote: > On Fri, 2 Jun 2017 at 15:56 Guido van Rossum wrote: > >> I would love to show how easy it is to write >> >> from math import pi as π, gamma as Γ [...] >> but I had to cheat by copying from the OP since I don't know how to type >> these (and even if you were to tell me how I'd forget tomorrow). So, I am >> still in favor of the rule "only ASCII in the stdlib". > > Since this regularly comes up, why don't we add a note to the math module > that you can do the above import(s) to bind various mathematical constants > to their traditional symbol counterparts? ... Because in order to add that note to the math module, you have to violate the "only ASCII in the stdlib" rule. ;-) People who would benefit from seeing π (or Γ) in their code will arrange to type it in proportion to that benefit (and probably already have). I know how to type those characters in my environments, but it might not be that easy if I had to do so on a random computer with a random keyboard running a random OS. Ob XKCD: https://xkcd.com/1806/ (my apologies if someone else already brought this up; I haven't been following along that closely).
And I want to say something about this argument being like the one about being able to represent people's names correctly, but while the ratio between the circumference of a circle to its diameter has a name, it isn't a person. Dan From tjol at tjol.eu Sat Jun 3 15:02:37 2017 From: tjol at tjol.eu (Thomas Jollans) Date: Sat, 3 Jun 2017 21:02:37 +0200 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: <20170603001026.GB17170@ando.pearwood.info> <20170603163647.GF17170@ando.pearwood.info> <20170603164824.GG17170@ando.pearwood.info> Message-ID: On 03/06/17 20:41, Chris Angelico wrote: > [snip] > For reference, as well as the 948 Sm, there are 1690 Mn and 5777 So, > but only these characters are valid from them: > > \u1885 Mn MONGOLIAN LETTER ALI GALI BALUDA > \u1886 Mn MONGOLIAN LETTER ALI GALI THREE BALUDA > ℘ Sm SCRIPT CAPITAL P > ℮ So ESTIMATED SYMBOL > > 2118 SCRIPT CAPITAL P and 212E ESTIMATED SYMBOL are listed in > PropList.txt as Other_ID_Start, so they make sense. But that doesn't > explain the two characters from category Mn. It also doesn't explain > why U+309B and U+309C are *not* valid, despite being declared > Other_ID_Start. Maybe it's a bug? Maybe 309B and 309C somehow got > switched into 1885 and 1886?? \u1885 and \u1886 are categorised as letters (category Lo) by my Python 3.5. (Which makes sense, right?) If your system puts them in category Mn, that's bound to be a bug somewhere. As for \u309B and \u309C - it turns out this is a question of normalisation. PEP 3131 requires NFKC normalisation: >>> for c in unicodedata.normalize('NFKC', '\u309B'): ... print('%s\tU+%04X\t%s' % (c, ord(c), unicodedata.name(c))) ... U+0020 SPACE U+3099 COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK >>> for c in unicodedata.normalize('NFKC', '\u309C'): ... print('%s\tU+%04X\t%s' % (c, ord(c), unicodedata.name(c))) ... U+0020 SPACE U+309A COMBINING KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK >>> This is.... interesting.
Thomas From toddrjen at gmail.com Sat Jun 3 15:14:04 2017 From: toddrjen at gmail.com (Todd) Date: Sat, 3 Jun 2017 15:14:04 -0400 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: Message-ID: On Jun 3, 2017 2:45 PM, "Chris Angelico" wrote: On Sun, Jun 4, 2017 at 12:42 AM, Nick Coghlan wrote: >> But I don't think that running an application from the source without >> installing it is the most common way to run an application. Most users >> install applications to use them, no? > > Scripts are very frequently run without installing them, as are things > like Jupyter Notebooks, so any change along these lines would need to > be carefully planned to avoid being unduly disruptive. > A single-file script wouldn't be affected; only something that has more than one file "side by side" in an arbitrary directory, and imports one from the other. Do Jupyter notebooks do that? I've no idea how they work under the covers. ChrisA It seems to be pretty common in unit tests in my experience. From python at mrabarnett.plus.com Sat Jun 3 15:55:05 2017 From: python at mrabarnett.plus.com (MRAB) Date: Sat, 3 Jun 2017 20:55:05 +0100 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: <20170603001026.GB17170@ando.pearwood.info> <20170603163647.GF17170@ando.pearwood.info> <20170603164824.GG17170@ando.pearwood.info> Message-ID: <475b3dcd-a950-77ad-c39c-df92c19f5b78@mrabarnett.plus.com> On 2017-06-03 19:50, Thomas Jollans wrote: > On 03/06/17 18:48, Steven D'Aprano wrote: >> On Sun, Jun 04, 2017 at 02:36:50AM +1000, Steven D'Aprano wrote: >> >>> But Python 3.5 does treat it as an identifier! >>> >>> py> ℘ = 1 # should be a SyntaxError ? >>> py> ℘ >>> 1 >>> >>> There's a bug here, somewhere, I'm just not sure where...
>> That appears to be the only Symbol Math character which is accepted as >> an identifier in Python 3.5: >> >> py> import unicodedata >> py> all_unicode = map(chr, range(0x110000)) >> py> symbols = [c for c in all_unicode if unicodedata.category(c) == 'Sm'] >> py> len(symbols) >> 948 >> py> ns = {} >> py> for c in symbols: >> ... try: >> ... exec(c + " = 1", ns) >> ... except SyntaxError: >> ... pass >> ... else: >> ... print(c, unicodedata.name(c)) >> ... >> ℘ SCRIPT CAPITAL P >> py> > > This is actually not a bug in Python, but a quirk in Unicode. > > I've had a closer look at PEP 3131 [1], which specifies that Python > identifiers follow the Unicode classes XID_Start and XID_Continue. ℘ is > listed in the standard [2][3] as XID_Start, so Python correctly accepts > it as an identifier. > >>>> import unicodedata >>>> all_unicode = map(chr, range(0x110000)) >>>> for c in all_unicode: > ... category = unicodedata.category(c) > ... if not category.startswith('L') and category != 'Nl': # neither > letter nor letter-number > ... if c.isidentifier(): > ... print('%s\tU+%04X\t%s' % (c, ord(c), unicodedata.name(c))) > ... > _ U+005F LOW LINE > ℘ U+2118 SCRIPT CAPITAL P > ℮ U+212E ESTIMATED SYMBOL >>>> > > ℘ and ℮ are actually explicitly mentioned in the Unicode annex [3]: > >> >> 2.5 Backward Compatibility >> >> Unicode General_Category values are kept as stable as possible, but >> they can change across versions of the Unicode Standard. The bulk of >> the characters having a given value are determined by other >> properties, and the coverage expands in the future according to the >> assignment of those properties. In addition, the Other_ID_Start >> property provides a small list of characters that qualified as >> ID_Start characters in some previous version of Unicode solely on the >> basis of their General_Category properties, but that no longer qualify >> in the current version. These are called /grandfathered/ characters.
>> >> The Other_ID_Start property includes characters such as the following: >> >> U+2118 ( ℘ ) SCRIPT CAPITAL P >> U+212E ( ℮ ) ESTIMATED SYMBOL >> U+309B ( ゛ ) KATAKANA-HIRAGANA VOICED SOUND MARK >> U+309C ( ゜ ) KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK >> >> > I have no idea why U+309B and U+309C are not accepted as identifiers by > Python 3.5. This could be a question of Python following an old version > of the Unicode standard, or it *could* be a bug. > [snip] U+309B and U+309C have had the property ID_Start since at least Unicode 6.0 (August 2010). Interestingly, '_' doesn't have that property, although Python does allow identifiers to start with it. From levkivskyi at gmail.com Sat Jun 3 16:07:18 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Sat, 3 Jun 2017 22:07:18 +0200 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: <475b3dcd-a950-77ad-c39c-df92c19f5b78@mrabarnett.plus.com> References: <20170603001026.GB17170@ando.pearwood.info> <20170603163647.GF17170@ando.pearwood.info> <20170603164824.GG17170@ando.pearwood.info> <475b3dcd-a950-77ad-c39c-df92c19f5b78@mrabarnett.plus.com> Message-ID: On 3 June 2017 at 21:55, MRAB wrote: > [...] >> > > Interestingly, '_' doesn't have that property, although Python does allow > identifiers to start with it. Yes, it is special cased: if (!_PyUnicode_IsXidStart(first) && first != 0x5F /* LOW LINE */) return 0; -- Ivan From julien at palard.fr Sat Jun 3 16:10:48 2017 From: julien at palard.fr (Julien Palard) Date: Sat, 03 Jun 2017 16:10:48 -0400 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: Hi, What you are think about adding Unicode aliases for some mathematic names in the math module? ;-) math.π = math.pi math.τ = math.tau math.Γ = math.gamma math.ℯ = math.e Unfortunately we can't use ?, ? and ? as identifiers.
:-( It may be the role of editors to do it if one really wants it. I personally use the `pretty-mode` emacs mode, it replaces pi, sum, sqrt, None (?), tau, gamma, but not math.e obviously. It replaces some other strings like lambda (λ x: ...), and is customizable (I added ∈ ∉ for "in" and "not in"). It only "messes" with visual numbers of characters in a line, but flake8 sees the correct python through flycheck so it reports errors properly, and you're still notified you're more than 80 columns long, even if visually there may be a longer valid line. Oh and it's also messing with copying and pasting to share, found myself yesterday pasting a "prettyfied" line of code on IRC and it (legitimately) surprised someone. -- Julien Palard https://mdk.fr -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjol at tjol.eu Sat Jun 3 16:32:28 2017 From: tjol at tjol.eu (Thomas Jollans) Date: Sat, 3 Jun 2017 22:32:28 +0200 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: <20170603001026.GB17170@ando.pearwood.info> <20170603163647.GF17170@ando.pearwood.info> <20170603164824.GG17170@ando.pearwood.info> Message-ID: <475995c9-a5e9-a3ec-b048-367ecc17ce80@tjol.eu> On 03/06/17 21:02, Thomas Jollans wrote: > On 03/06/17 20:41, Chris Angelico wrote: >> [snip] >> For reference, as well as the 948 Sm, there are 1690 Mn and 5777 So, >> but only these characters are valid from them: >> >> \u1885 Mn MONGOLIAN LETTER ALI GALI BALUDA >> \u1886 Mn MONGOLIAN LETTER ALI GALI THREE BALUDA >> ℘ Sm SCRIPT CAPITAL P >> ℮ So ESTIMATED SYMBOL >> >> 2118 SCRIPT CAPITAL P and 212E ESTIMATED SYMBOL are listed in >> PropList.txt as Other_ID_Start, so they make sense. But that doesn't >> explain the two characters from category Mn. It also doesn't explain >> why U+309B and U+309C are *not* valid, despite being declared >> Other_ID_Start. Maybe it's a bug? Maybe 309B and 309C somehow got >> switched into 1885 and 1886??
> \u1885 and \u1886 are categorised as letters (category Lo) by my Python > 3.5. (Which makes sense, right?) If your system puts them in category > Mn, that's bound to be a bug somewhere. Actually it turns out that these characters were changed to category Mn in Unicode 9.0, but remain in (X)ID_Start for compatibility. All is right with the world. (All of this just goes to show how much subtlety there is in the science that goes into making Unicode) See: http://www.unicode.org/reports/tr44/tr44-18.html#Unicode_9.0.0 > > As for \u309B and \u309C - it turns out this is a question of > normalisation. PEP 3131 requires NFKC normalisation: > >>>> for c in unicodedata.normalize('NFKC', '\u309B'): > ... print('%s\tU+%04X\t%s' % (c, ord(c), unicodedata.name(c))) > ... > U+0020 SPACE > U+3099 COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK >>>> for c in unicodedata.normalize('NFKC', '\u309C'): > ... print('%s\tU+%04X\t%s' % (c, ord(c), unicodedata.name(c))) > ... > U+0020 SPACE > U+309A COMBINING KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK > This is.... interesting. > > > Thomas > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -- Thomas Jollans m ? +31 6 42630259 e ? tjol at tjol.eu From joshua.morton13 at gmail.com Sat Jun 3 17:11:06 2017 From: joshua.morton13 at gmail.com (Joshua Morton) Date: Sat, 03 Jun 2017 21:11:06 +0000 Subject: [Python-ideas] Defer Statement In-Reply-To: References: <4774921496487599@web21m.yandex.ru> Message-ID: It's also worth mentioning that the `defer` statement has come up in other contexts, and is often already used as an identifier (see https://mail.python.org/pipermail/python-ideas/2017-February/044682.html ), so there are a lot of practical considerations for this to overcome even if it's deemed necessary (which I think Nick shows that it probably shouldn't be).
--Josh On Sat, Jun 3, 2017 at 8:24 AM Nick Coghlan wrote: > On 3 June 2017 at 20:59, Daniel Bershatsky wrote: > > Dear Python Developers, > > > > We have a potential idea for enhancing Python. You will find a kind of > draft > > bellow. > > Thank you for taking the time to write this up! > > > Best regards, > > Daniel Bershatsky > > > > > > Abstract > > ======== > > > > This PEP proposes the introduction of new syntax to create community > > standard, > > readable and clear way to defered function execution in basic block on > all > > control flows. > > > > Proposal > > ======== > > > > There is not any mechanism to defer the execution of function in python. > > There is, thanks to context managers: > > import contextlib > def foo(i): > print(i) > > def bar(): > with contextlib.ExitStack() as stack: > stack.callback(foo, 42) > print(3.14) > > bar() > > Now, defer is certainly pithier, but thanks to contextlib2, the above > code can be used all the way back to Python 2.6, rather than being > limited to running on 3.7+. I was also motivated enough to *write* > ExitStack() to solve this problem, but even I don't use it often > enough to consider it worthy of being a builtin, let alone syntax. > > So while I'm definitely sympathetic to the use case (otherwise > ExitStack wouldn't have a callback() method), "this would be useful" > isn't a sufficient argument in this particular case - what's needed is > a justification that this pattern of resource management is common > enough to justify giving functions an optional implicit ExitStack > instance and assigning a dedicated keyword for adding entries to it. > > Alternatively, the case could be made that there's a discoverability > problem, where folks aren't necessarily being pointed towards > ExitStack as a dynamic resource management tool when that's what they > need, and to consider what could be done to help resolve that (with > adding a new kind of statement being just one of the options > evaluated). 
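The quoted ExitStack pattern is runnable as-is; the sketch below restates it in self-contained form, with an `order` list (an illustrative addition, not part of the quoted code) to show that callbacks fire in last-in, first-out order, the same unwinding order as Go's defer:

```python
import contextlib

order = []

def cleanup(label):
    # Stand-in for a resource-release step such as fin.close()
    order.append(label)

def work():
    with contextlib.ExitStack() as stack:
        stack.callback(cleanup, "first registered")
        stack.callback(cleanup, "second registered")
        order.append("body")
    # Callbacks run when the with block exits, in reverse
    # registration order, even if the body raises.

work()
print(order)  # ['body', 'second registered', 'first registered']
```

The LIFO unwinding is what makes the pattern safe for dependent resources: the last resource acquired is the first one released.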
> > Cheers, > Nick. > > P.S. Nikolaus Rath has a more in-depth write-up of the utility of > ExitStack here: > https://www.rath.org/on-the-beauty-of-pythons-exitstack.html > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Sat Jun 3 18:04:12 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 4 Jun 2017 08:04:12 +1000 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: <20170603001026.GB17170@ando.pearwood.info> <20170603163647.GF17170@ando.pearwood.info> <20170603164824.GG17170@ando.pearwood.info> Message-ID: On Sun, Jun 4, 2017 at 5:02 AM, Thomas Jollans wrote: > On 03/06/17 20:41, Chris Angelico wrote: >> [snip] >> For reference, as well as the 948 Sm, there are 1690 Mn and 5777 So, >> but only these characters are valid from them: >> >> \u1885 Mn MONGOLIAN LETTER ALI GALI BALUDA >> \u1886 Mn MONGOLIAN LETTER ALI GALI THREE BALUDA >> ℘ Sm SCRIPT CAPITAL P >> ℮ So ESTIMATED SYMBOL >> >> 2118 SCRIPT CAPITAL P and 212E ESTIMATED SYMBOL are listed in >> PropList.txt as Other_ID_Start, so they make sense. But that doesn't >> explain the two characters from category Mn. It also doesn't explain >> why U+309B and U+309C are *not* valid, despite being declared >> Other_ID_Start. Maybe it's a bug? Maybe 309B and 309C somehow got >> switched into 1885 and 1886?? > > \u1885 and \u1886 are categorised as letters (category Lo) by my Python > 3.5. (Which makes sense, right?)
rosuav at sikorsky:~$ python3.7 -c "import unicodedata; print(unicodedata.unidata_version, unicodedata.category('\u1885'))" 9.0.0 Mn rosuav at sikorsky:~$ python3.6 -c "import unicodedata; print(unicodedata.unidata_version, unicodedata.category('\u1885'))" 8.0.0 Lo rosuav at sikorsky:~$ python3.5 -c "import unicodedata; print(unicodedata.unidata_version, unicodedata.category('\u1885'))" 8.0.0 Lo rosuav at sikorsky:~$ python3.4 -c "import unicodedata; print(unicodedata.unidata_version, unicodedata.category('\u1885'))" 6.3.0 Lo Is it possible that there's a discrepancy between the Unicode version used by the unicodedata module and the one used by the parser? ChrisA From tjol at tjol.eu Sat Jun 3 19:28:42 2017 From: tjol at tjol.eu (Thomas Jollans) Date: Sun, 4 Jun 2017 01:28:42 +0200 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: <20170603001026.GB17170@ando.pearwood.info> <20170603163647.GF17170@ando.pearwood.info> <20170603164824.GG17170@ando.pearwood.info> Message-ID: <0ecf5ab7-f29e-5144-917c-6dcab48ad039@tjol.eu> On 04/06/17 00:04, Chris Angelico wrote: > On Sun, Jun 4, 2017 at 5:02 AM, Thomas Jollans wrote: >> On 03/06/17 20:41, Chris Angelico wrote: >>> [snip] >>> For reference, as well as the 948 Sm, there are 1690 Mn and 5777 So, >>> but only these characters are valid from them: >>> >>> \u1885 Mn MONGOLIAN LETTER ALI GALI BALUDA >>> \u1886 Mn MONGOLIAN LETTER ALI GALI THREE BALUDA >>> ? Sm SCRIPT CAPITAL P >>> ? So ESTIMATED SYMBOL >>> >>> 2118 SCRIPT CAPITAL P and 212E ESTIMATED SYMBOL are listed in >>> PropList.txt as Other_ID_Start, so they make sense. But that doesn't >>> explain the two characters from category Mn. It also doesn't explain >>> why U+309B and U+309C are *not* valid, despite being declared >>> Other_ID_Start. Maybe it's a bug? Maybe 309B and 309C somehow got >>> switched into 1885 and 1886?? >> \u1885 and \u1886 are categorised as letters (category Lo) by my Python >> 3.5. (Which makes sense, right?) 
If your system puts them in category >> Mn, that's bound to be a bug somewhere. > rosuav at sikorsky:~$ python3.7 -c "import unicodedata; > print(unicodedata.unidata_version, unicodedata.category('\u1885'))" > 9.0.0 Mn > rosuav at sikorsky:~$ python3.6 -c "import unicodedata; > print(unicodedata.unidata_version, unicodedata.category('\u1885'))" > 8.0.0 Lo > rosuav at sikorsky:~$ python3.5 -c "import unicodedata; > print(unicodedata.unidata_version, unicodedata.category('\u1885'))" > 8.0.0 Lo > rosuav at sikorsky:~$ python3.4 -c "import unicodedata; > print(unicodedata.unidata_version, unicodedata.category('\u1885'))" > 6.3.0 Lo > > Is it possible that there's a discrepancy between the Unicode version > used by the unicodedata module and the one used by the parser? It appears to be Unicode policy to keep characters in ID_Start (etc) even if this no longer fits their character category. So in Unicode 9.0, 1885 and 1886 were added to Other_ID_Start for backwards compatibility (like ℘). Thomas From apalala at gmail.com Sat Jun 3 19:59:13 2017 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Sat, 3 Jun 2017 19:59:13 -0400 Subject: [Python-ideas] Defer Statement In-Reply-To: <4774921496487599@web21m.yandex.ru> References: <4774921496487599@web21m.yandex.ru> Message-ID: On Sat, Jun 3, 2017 at 6:59 AM, Daniel Bershatsky wrote: Or with usage defer keyword > > ``` > fin = open(filename) > defer fin.close() > # some stuff > IMHO, a block in which the intention of a `finally:` is not well understood, needs refactoring. Some *old* code is like that, but it doesn't mean it's *bad*. Then, as a matter of personal preference, I'm not comfortable with that the *defer* idiom talks first about things that should be done last. It's a debt acquired without enough syntactic evidence (Oh! Mi gosh!, There were those defers at the start of the function I just changed). From import this: Explicit is better than implicit. ?
-- Juancarlo *Añez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Sat Jun 3 20:00:58 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 04 Jun 2017 12:00:58 +1200 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: <20170602010559.GS23443@ando.pearwood.info> <20170603103629.GE17170@ando.pearwood.info> Message-ID: <59334DBA.3030904@canterbury.ac.nz> Is this really much of a security issue? Seems to me that for someone to exploit it, they would have to inject a malicious .py file alongside one of my script files. If they can do that, they can probably do all kinds of bad things directly. -- Greg From greg.ewing at canterbury.ac.nz Sat Jun 3 20:16:36 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 04 Jun 2017 12:16:36 +1200 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: <20170602010559.GS23443@ando.pearwood.info> <20170603103629.GE17170@ando.pearwood.info> Message-ID: <59335164.7050604@canterbury.ac.nz> Soni L. wrote: > How about `import self.thing` (where "self" implies same dir as the > current .py That wouldn't provide quite the same functionality, since currently a module alongside the main py file can be imported from anywhere, including .py files inside a package. Also I think it would be confusing to have two very similar but subtly different relative import mechanisms. -- Greg From apalala at gmail.com Sat Jun 3 21:06:50 2017 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Sat, 3 Jun 2017 21:06:50 -0400 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: <59335164.7050604@canterbury.ac.nz> References: <20170602010559.GS23443@ando.pearwood.info> <20170603103629.GE17170@ando.pearwood.info> <59335164.7050604@canterbury.ac.nz> Message-ID: I'm not comfortable with the subject title of this thread.
I'm comfortable with the experiences and solutions proposed by Nick. On Sat, Jun 3, 2017 at 8:16 PM, Greg Ewing wrote: > Soni L. wrote: > >> How about `import self.thing` (where "self" implies same dir as the >> current .py >> > > That wouldn't provide quite the same functionality, since > currently a module alongside the main py file can be imported > from anywhere, including .py files inside a package. > > Also I think it would be confusing to have two very similar > but subtly different relative import mechanisms. > > -- > Greg > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Juancarlo *Añez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sun Jun 4 02:23:31 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 4 Jun 2017 16:23:31 +1000 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: On 4 June 2017 at 05:02, Dan Sommers wrote: > On Sat, 03 Jun 2017 17:45:43 +0000, Brett Cannon wrote: > >> On Fri, 2 Jun 2017 at 15:56 Guido van Rossum wrote: >> >>> I would love to show how easy it is to write >>> >>> from math import pi as π, gamma as γ > > [...] > >>> but I had to cheat by copying from the OP since I don't know how to type >>> these (and even if you were to tell me how I'd forget tomorrow). So, I am >>> still in favor of the rule "only ASCII in the stdlib". >> >> Since this regularly comes up, why don't we add a note to the math module >> that you can do the above import(s) to bind various mathematical constants >> to their traditional symbol counterparts? ... > > Because in order to add that note to the math module, you have to > violate the "only ASCII in the stdlib" rule.
;-) The ASCII-only restriction in the standard library is merely "all public APIs will use ASCII-only identifiers", rather than "We don't allow the use of Unicode anywhere" (Several parts of the documentation would be rather unreadable if they were restricted to ASCII characters). However, clarifying that made me realise we've never actually written that down anywhere - it's just been an assumed holdover from the fact that Python 2.7 is still being supported, and doesn't allow for Unicode identifiers in the first place. https://github.com/python/peps/pull/285 is a PR to explicitly document the standard library API restriction in the "Names to Avoid" part of PEP 8. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sun Jun 4 02:38:27 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 4 Jun 2017 16:38:27 +1000 Subject: [Python-ideas] Defer Statement In-Reply-To: References: <4774921496487599@web21m.yandex.ru> Message-ID: On 3 June 2017 at 22:24, Nick Coghlan wrote: > So while I'm definitely sympathetic to the use case (otherwise > ExitStack wouldn't have a callback() method), "this would be useful" > isn't a sufficient argument in this particular case - what's needed is > a justification that this pattern of resource management is common > enough to justify giving functions an optional implicit ExitStack > instance and assigning a dedicated keyword for adding entries to it. It occurred to me that I should elaborate a bit further here, and point out explicitly that one of the main benefits of ExitStack (and, indeed, the main reason it exists) is that it allows resource lifecycle management to be deterministic, *without* necessarily tying it to function calls or with statements. The behave BDD test framework, for example, defines hooks that run before and after each feature and scenario, as well as before and after the entire test run. 
I use those to set up "scenario_cleanup", "_feature_cleanup" and "_global_cleanup" stacks as part of the testing context: https://github.com/leapp-to/prototype/blob/master/integration-tests/features/environment.py#L49 If a test step implementor allocates a resource that needs to be cleaned up, they register it with "context.scenario_cleanup", and then the "after scenario" hook takes care of closing the ExitStack instance and cleaning everything up appropriately. For me, that kind of situation is when I'm most inclined to reach for ExitStack, whereas when the cleanup needs align with the function call stack, I'm more likely to reach for contextlib.contextmanager or an explicit try/finally. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From lucas.wiman at gmail.com Sun Jun 4 03:23:01 2017 From: lucas.wiman at gmail.com (Lucas Wiman) Date: Sun, 4 Jun 2017 00:23:01 -0700 Subject: [Python-ideas] Defer Statement In-Reply-To: References: <4774921496487599@web21m.yandex.ru> Message-ID: I agree that the stated use cases are better handled with ExitStack. One area where `defer` might be useful is in lazy-evaluating global constants. For example in a genomics library used at my work, one module involves compiling a large number of regular expressions, and setting them as global constants in the module, like: FOO1_re = re.compile(r'...') FOO_TO_BAR_re = {foo: complicated_computation_of_regex(foo) for foo in LONG_LIST_OF_THINGS} ... This utility module is imported in a lot of places in the codebase, which meant that importing almost anything from our codebase involved precompiling all these regular expressions, which took around 500ms just to run anything (the test runner, manually testing code in the shell, etc.). It would be ideal to only do these computations if/when they are needed. This is a more general issue than this specific example, e.g.
for libraries which parse large data sources like pycountry (see my PR for one possible ugly solution using proxy objects; the author instead went with the simpler, less general solution of manually deciding when the data is needed). See also django.utils.functional.lazy, which is used extensively in the framework. A statement like: `defer FOO = lengthy_computation_of_foo()` which defers the lengthy computation until it is used for something would be useful to allow easily fixing these issues without writing ugly hacks like proxy objects or refactoring code into high-overhead cached properties or the like. Alternatively, if there are less ugly or bespoke ways to handle this kind of issue, I'd be interested in hearing them. Best, Lucas On Sat, Jun 3, 2017 at 11:38 PM, Nick Coghlan wrote: > On 3 June 2017 at 22:24, Nick Coghlan wrote: > > So while I'm definitely sympathetic to the use case (otherwise > > ExitStack wouldn't have a callback() method), "this would be useful" > > isn't a sufficient argument in this particular case - what's needed is > > a justification that this pattern of resource management is common > > enough to justify giving functions an optional implicit ExitStack > > instance and assigning a dedicated keyword for adding entries to it. > > It occurred to me that I should elaborate a bit further here, and > point out explicitly that one of the main benefits of ExitStack (and, > indeed, the main reason it exists) is that it allows resource > lifecycle management to be deterministic, *without* necessarily tying > it to function calls or with statements. > > The behave BDD test framework, for example, defines hooks that run > before and after each feature and scenario, as well as before and > after the entire test run.
I use those to set up "scenario_cleanup", > "_feature_cleanup" and "_global_cleanup" stacks as part of the testing > context: https://github.com/leapp-to/prototype/blob/master/ > integration-tests/features/environment.py#L49 > > If a test step implementor allocates a resource that needs to be > cleaned up, they register it with "context.scenario_cleanup", and then > the "after scenario" hook takes care of closing the ExitStack instance > and cleaning everything up appropriately. > > For me, that kind of situation is when I'm most inclined to reach for > ExitStack, whereas when the cleanup needs align with the function call > stack, I'm more likely to reach for contexlib.contextmanager or an > explicit try/finally. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sun Jun 4 03:35:26 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 4 Jun 2017 17:35:26 +1000 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: <59334DBA.3030904@canterbury.ac.nz> References: <20170602010559.GS23443@ando.pearwood.info> <20170603103629.GE17170@ando.pearwood.info> <59334DBA.3030904@canterbury.ac.nz> Message-ID: On 4 June 2017 at 10:00, Greg Ewing wrote: > Is this really much of a security issue? Seems to me that > for someone to exploit it, they would have to inject a > malicious .py file alongside one of my script files. If > they can do that, they can probably do all kinds of bad > things directly. There are genuine problems with it, which is why we have the -I switch to enable "isolated mode" (where pretty much all per-user settings get ignored). 
However, just dropping the current directory from sys.path without also disabling those other features (like user site-packages processing and environment variable processing) really doesn't buy you much. So the better answer from a security perspective is PEP 432 and the separate system-python binary (and Eric Snow recently got us started down that path by merging the initial aspects of that PEP as a private development API, so we can adopt the new settings management architecture incrementally before deciding whether or not we want to support it as a public API). So rather than anything security related, the key reasons I'm personally interested in moving towards requiring main-relative imports to be explicit are a matter of making it easier to reason about a piece of code just by reading it, as well as automatically avoiding certain classes of beginner bugs (i.e. essentially the same arguments PEP 328 put forward for the previous switch away from implicit relative imports in package submodules: https://www.python.org/dev/peps/pep-0328/#rationale-for-absolute-imports). Currently, main relative imports look like this: import helper This means that at the point of reading it, you don't know whether "helper" is independently redistributed, or if it's expected to be distributed alongside the main script. By contrast: from . import helper Makes it clear that "helper" isn't a 3rd party thing, it's meant to be distributed alongside the main script, and if it's missing, you don't want to pick up any arbitrary top level module that happens to be called "helper". Reaching a point where we require main relative imports to be written as "from . import helper" also means that a script called "socket.py" could include the statement "import socket" and actually get the standard library's socket module as it expected - the developer of such a script would have to write "from . import socket" in order to reimport the main script as a module. Cheers, Nick. 
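The stdlib-shadowing behaviour described above (a script's neighbour named socket.py winning over the standard library module) can be reproduced today; the sketch below uses a temporary directory, and the file names and module contents are illustrative only:

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmpname:
    tmp = Path(tmpname)
    # A module that shadows the stdlib's socket, because sys.path[0]
    # is set to the directory containing the main script.
    (tmp / "socket.py").write_text("x = 42\n")
    (tmp / "main.py").write_text(
        "import socket\n"
        "print(getattr(socket, 'x', 'stdlib socket'))\n"
    )
    out = subprocess.run(
        [sys.executable, str(tmp / "main.py")],
        capture_output=True, text=True, check=True,
    ).stdout.strip()

print(out)  # 42 -- main.py imported its neighbour, not the stdlib module
```

Under the proposal, the neighbour would only be reachable as `from . import socket`, so the bare `import socket` above would reach the standard library.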
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From njs at pobox.com Sun Jun 4 03:37:20 2017 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 4 Jun 2017 00:37:20 -0700 Subject: [Python-ideas] Defer Statement In-Reply-To: References: <4774921496487599@web21m.yandex.ru> Message-ID: On Sun, Jun 4, 2017 at 12:23 AM, Lucas Wiman wrote: > I agree that the stated use cases are better handled with ExitStack. One > area where `defer` might be useful is in lazy-evaluating global constants. > For example in a genomics library used at my work, one module involves > compiling a large number of regular expressions, and setting them as global > contants in the module, like: > > FOO1_re = re.compile(r'...') > FOO_TO_BAR_re = {foo: complicated_computation_of_regex(foo) for foo in > LONG_LIST_OF_THINGS} > ... > > This utility module is imported in a lot of places in the codebase, which > meant that importing almost anything from our codebase involved precompiling > all these regular expressions, which took around 500ms to to run the > anything (the test runner, manually testing code in the shell, etc.) It > would be ideal to only do these computations if/when they are needed. This > is a more general issue than this specific example, e.g. for libraries which > parse large data sources like pycountry (see my PR for one possible ugly > solution using proxy objects; the author instead went with the simpler, less > general solution of manually deciding when the data is needed). See also > django.utils.functional.lazy, which is used extensively in the framework. > > A statement like: `defer FOO = lengthy_computation_of_foo()` which deferred > the lengthy computation until it is used for something would be useful to > allow easily fixing these issues without writing ugly hacks like proxy > objects or refactoring code into high-overhead cached properties or the > like. 
I think in general I'd recommend making the API for accessing these things be a function call interface, so that it's obvious to the caller that some expensive computation might be going on. But if you're stuck with an attribute-lookup based interface, then you can use a __getattr__ hook to compute them the first time they're accessed: class LazyConstants: def __getattr__(self, name): value = compute_value_for(name) setattr(self, name, value) return value __getattr__ is only called as a fallback, so by setting the computed value on the object we make any future attribute lookups just as cheap as they would be otherwise. You can get this behavior onto a module object by doing "sys.modules[__name__] = Constants()" inside the module body, or by using a hack elegant bit of code like https://github.com/njsmith/metamodule/ (mostly the latter would only be preferred if you have a bunch of other attributes exported from this same module and trying to move all of them onto the LazyConstants object would be difficult). -n -- Nathaniel J. Smith -- https://vorpus.org From ncoghlan at gmail.com Sun Jun 4 04:12:20 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 4 Jun 2017 18:12:20 +1000 Subject: [Python-ideas] Defer Statement In-Reply-To: References: <4774921496487599@web21m.yandex.ru> Message-ID: On 4 June 2017 at 17:37, Nathaniel Smith wrote: > I think in general I'd recommend making the API for accessing these > things be a function call interface, so that it's obvious to the > caller that some expensive computation might be going on. 
But if > you're stuck with an attribute-lookup based interface, then you can > use a __getattr__ hook to compute them the first time they're > accessed: > > class LazyConstants: > def __getattr__(self, name): > value = compute_value_for(name) > setattr(self, name, value) > return value > > __getattr__ is only called as a fallback, so by setting the computed > value on the object we make any future attribute lookups just as cheap > as they would be otherwise. > > You can get this behavior onto a module object by doing > "sys.modules[__name__] = Constants()" inside the module body, or by > using a hack elegant bit of code like > https://github.com/njsmith/metamodule/ (mostly the latter would only > be preferred if you have a bunch of other attributes exported from > this same module and trying to move all of them onto the LazyConstants > object would be difficult). This reminds me: we could really use some documentation help in relation to https://bugs.python.org/issue22986 making module __class__ attributes mutable in Python 3.5+ At the moment, that is just reported in Misc/NEWS as "Issue #22986: Allow changing an object's __class__ between a dynamic type and static type in some cases.", which doesn't do anything to convey the significant *implications* of now being able to define module level properties as follows: >>> x = 10 >>> x 10 >>> import __main__ as main >>> main.x 10 >>> from types import ModuleType >>> class SpecialMod(ModuleType): ... @property ... def x(self): ... return 42 ... 
>>> main.__class__ = SpecialMod >>> x 10 >>> main.x 42 (I know that's what metamodule does under the hood, but if the 3.5+ only limitation is acceptable, then it's likely to be clearer to just do this inline rather than hiding it behind a 3rd party API) One potentially good option would be a HOWTO guide on "Lazy attribute initialization" in https://docs.python.org/3/howto/index.html that walked through from the basics of using read-only properties with double-underscore prefixed result caching, through helper functions & methods decorated with lru_cache, and all the way up to using __class__ assignment to enable the definition of module level properties. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From 4kir4.1i at gmail.com Sun Jun 4 07:17:11 2017 From: 4kir4.1i at gmail.com (Akira Li) Date: Sun, 04 Jun 2017 14:17:11 +0300 Subject: [Python-ideas] Defer Statement References: <4774921496487599@web21m.yandex.ru> Message-ID: <87efv0knns.fsf@gmail.com> Daniel Bershatsky writes: > ... > Proposal > ======== > There is not any mechanism to defer the execution of function in > python. In > order to mimic defer statement one could use either try/except > construction or > use context manager in with statement. > ... 
Related: "Python equivalent of golang's defer statement" https://stackoverflow.com/questions/34625089/python-equivalent-of-golangs-defer-statement From zuo at chopin.edu.pl Sun Jun 4 08:09:33 2017 From: zuo at chopin.edu.pl (Jan Kaliszewski) Date: Sun, 4 Jun 2017 14:09:33 +0200 Subject: [Python-ideas] Defer Statement In-Reply-To: References: <4774921496487599@web21m.yandex.ru> Message-ID: <20170604140933.043b60a3@grzmot> Hello, 2017-06-04 Nathaniel Smith dixit: > class LazyConstants: > def __getattr__(self, name): > value = compute_value_for(name) > setattr(self, name, value) > return value > > __getattr__ is only called as a fallback, so by setting the computed > value on the object we make any future attribute lookups just as cheap > as they would be otherwise. Another solution is to use a Pyramid's-@reify-like decorator to make a caching non-data descriptor (i.e., the kind of descriptor that can be shadowed with an ordinary instance attribute): ``` class LazyConstants: @reify def FOO(self): return @reify def BAR(self): return ``` In my work we use just @pyramid.decorator.reify [1], but the mechanism is so simple that you can always implement it by yourself [2], though providing some features related to __doc__/introspection-ability etc. may need some additional deliberation to do it right... That may mean it would be worth adding it to the standard library. Wouldn't it be? Cheers.
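The caching behaviour Jan describes can be demonstrated with a minimal reify-style descriptor (a sketch along the lines of his footnote [2]; the module-level call counter is an illustrative addition to show the maker runs only once):

```python
class reify:
    """Non-data descriptor: computes the value once, then shadows
    itself with an ordinary instance attribute of the same name."""

    def __init__(self, maker):
        self.maker = maker

    def __get__(self, instance, owner):
        if instance is None:
            return self
        value = self.maker(instance)
        setattr(instance, self.maker.__name__, value)
        return value

calls = 0

class Constants:
    @reify
    def FOO(self):
        global calls
        calls += 1
        return 42  # stand-in for an expensive computation

c = Constants()
assert c.FOO == 42 and c.FOO == 42
assert calls == 1            # the maker ran only once
assert "FOO" in c.__dict__   # the result now shadows the descriptor
```

Because the descriptor defines no `__set__`, the instance attribute written by `setattr` takes precedence on every later lookup, so there is no per-access overhead after the first call.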
*j [1] See: http://docs.pylonsproject.org/projects/pyramid/en/latest/api/decorator.html#pyramid.decorator.reify [2] The gist of the implementation is just: ``` class lazyproperty(object): def __init__(self, maker): self.maker = maker def __get__(self, instance, owner): if instance is None: return self value = self.maker(instance) setattr(instance, self.maker.__name__, value) return value ``` From gvanrossum at gmail.com Sun Jun 4 17:00:41 2017 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun, 4 Jun 2017 14:00:41 -0700 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: AFAIK it was in whatever PEP introduced Unicode identifiers. On Jun 3, 2017 11:24 PM, "Nick Coghlan" wrote: > On 4 June 2017 at 05:02, Dan Sommers wrote: > > On Sat, 03 Jun 2017 17:45:43 +0000, Brett Cannon wrote: > > > >> On Fri, 2 Jun 2017 at 15:56 Guido van Rossum wrote: > >> > >>> I would love to show how easy it is to write > >>> > >>> from math import pi as π, gamma as γ > > > > [...] > > > >>> but I had to cheat by copying from the OP since I don't know how to > type > >>> these (and even if you were to tell me how I'd forget tomorrow). So, I > am > >>> still in favor of the rule "only ASCII in the stdlib". > >> > >> Since this regularly comes up, why don't we add a note to the math > module > >> that you can do the above import(s) to bind various mathematical > constants > >> to their traditional symbol counterparts? ... > > > > Because in order to add that note to the math module, you have to > > violate the "only ASCII in the stdlib" rule. ;-) > > The ASCII-only restriction in the standard library is merely "all > public APIs will use ASCII-only identifiers", rather than "We don't > allow the use of Unicode anywhere" (Several parts of the documentation > would be rather unreadable if they were restricted to ASCII > characters).
> > However, clarifying that made me realise we've never actually written > that down anywhere - it's just been an assumed holdover from the fact > that Python 2.7 is still being supported, and doesn't allow for > Unicode identifiers in the first place. > > https://github.com/python/peps/pull/285 is a PR to explicitly document > the standard library API restriction in the "Names to Avoid" part of > PEP 8. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sun Jun 4 18:51:40 2017 From: guido at python.org (Guido van Rossum) Date: Sun, 4 Jun 2017 15:51:40 -0700 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: <20170602010559.GS23443@ando.pearwood.info> <20170603103629.GE17170@ando.pearwood.info> <59334DBA.3030904@canterbury.ac.nz> Message-ID: I really don't want people to start using the "from . import foo" idiom for their first steps into programming. It seems a reasonable "defensive programming" maneuver to put in scripts and apps made by professional Python programmers for surprise-free wide distribution, but (like many of those) should not be part of the learning experience. On Sun, Jun 4, 2017 at 12:35 AM, Nick Coghlan wrote: > On 4 June 2017 at 10:00, Greg Ewing wrote: > > Is this really much of a security issue? Seems to me that > > for someone to exploit it, they would have to inject a > > malicious .py file alongside one of my script files. If > > they can do that, they can probably do all kinds of bad > > things directly. 
> > There are genuine problems with it, which is why we have the -I switch > to enable "isolated mode" (where pretty much all per-user settings get > ignored). However, just dropping the current directory from sys.path > without also disabling those other features (like user site-packages > processing and environment variable processing) really doesn't buy you > much. > > So the better answer from a security perspective is PEP 432 and the > separate system-python binary (and Eric Snow recently got us started > down that path by merging the initial aspects of that PEP as a private > development API, so we can adopt the new settings management > architecture incrementally before deciding whether or not we want to > support it as a public API). > > So rather than anything security related, the key reasons I'm > personally interested in moving towards requiring main-relative > imports to be explicit are a matter of making it easier to reason > about a piece of code just by reading it, as well as automatically > avoiding certain classes of beginner bugs (i.e. essentially the same > arguments PEP 328 put forward for the previous switch away from > implicit relative imports in package submodules: > https://www.python.org/dev/peps/pep-0328/#rationale-for-absolute-imports). > > Currently, main relative imports look like this: > > import helper > > This means that at the point of reading it, you don't know whether > "helper" is independently redistributed, or if it's expected to be > distributed alongside the main script. > > By contrast: > > from . import helper > > Makes it clear that "helper" isn't a 3rd party thing, it's meant to be > distributed alongside the main script, and if it's missing, you don't > want to pick up any arbitrary top level module that happens to be > called "helper". > > Reaching a point where we require main relative imports to be written > as "from . 
> import helper" also means that a script called "socket.py" > could include the statement "import socket" and actually get the > standard library's socket module as it expected - the developer of > such a script would have to write "from . import socket" in order to > reimport the main script as a module. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From apalala at gmail.com Sun Jun 4 20:12:44 2017 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Sun, 4 Jun 2017 20:12:44 -0400 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: <20170602010559.GS23443@ando.pearwood.info> <20170603103629.GE17170@ando.pearwood.info> <59334DBA.3030904@canterbury.ac.nz> Message-ID: On Sun, Jun 4, 2017 at 6:51 PM, Guido van Rossum wrote: > I really don't want people to start using the "from . import foo" idiom > for their first steps into programming. It seems a reasonable "defensive > programming" maneuver to put in scripts and apps made by professional > Python programmers for surprise-free wide distribution, but (like many of > those) should not be part of the learning experience. At the same time, someday you may want the support for 2.7-kind-of issues to stop. Requiring "from" (or other ways to make the source of the import unambiguous) is common in programming languages. Cheers, -- Juancarlo *Añez* -------------- next part -------------- An HTML attachment was scrubbed...
URL: From python-ideas at mgmiller.net Sun Jun 4 20:44:53 2017 From: python-ideas at mgmiller.net (Mike Miller) Date: Sun, 4 Jun 2017 17:44:53 -0700 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: <59310FDD.3010702@canterbury.ac.nz> Message-ID: I'd like to throw some cold water on this one, for the same reason I always add "." to the path in my shell, when some well-meaning soul has removed it. Why? It's 2017 and I've not shared a machine since the 1980's. I use immutable containers in the cloud that are not at this particular risk either. At a small company you might share a file server, but can trust fellow employees. At a large company, you might be at risk, but after many years at one I'd never heard of this actually happening. Guess that leaves hackers? Well, if they are already in... In short I submit this problem is mostly theoretical, as it hasn't occurred the decades(*cough*) of my experience. From small company to large, to the cloud. Has it ever occurred in the history of the world? Sure. On the other hand, requiring "from . " in front of many imports would make python a bit more tedious every single day, for everyone. -1 -Mike p.s. Rearranging sys.path should be tolerable. Have wondered why the current dir was first. From ncoghlan at gmail.com Sun Jun 4 23:33:24 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 5 Jun 2017 13:33:24 +1000 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: On 5 June 2017 at 07:00, Guido van Rossum wrote: > AFAK it was in whatever PEP introduced Unicode identifiers. 
Ah, indeed it is: https://www.python.org/dev/peps/pep-3131/#policy-specification Interestingly, that's stricter than my draft PR for PEP 8, and I'm not entirely sure we follow the "string literals and comments must be in ASCII" part in its entirety: ============ All identifiers in the Python standard library MUST use ASCII-only identifiers, and SHOULD use English words wherever feasible (in many cases, abbreviations and technical terms are used which aren't English). In addition, string literals and comments must also be in ASCII. The only exceptions are (a) test cases testing the non-ASCII features, and (b) names of authors. Authors whose names are not based on the Latin alphabet MUST provide a Latin transliteration of their names. ============ That said, all the potential counter-examples that come to mind are in the documentation, but *not* in the corresponding docstrings (e.g. the Euro symbol used in in the docs for chr() and ord()). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From gvanrossum at gmail.com Mon Jun 5 00:08:50 2017 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun, 4 Jun 2017 21:08:50 -0700 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: I think the strictness comes from the observation that the stdlib is read and edited using *lots* of different tools and not every tool is with the program. That argument may be weaker now than when that PEP was written, but I still get emails and see websites with mojibake. (Most recently, the US-PyCon badges had spaces for all non-ASCII letters.) The argument is also weaker for comments than it is for identifiers, since stdlib identifiers will be used through *even more* tools (anyone who uses a name imported from stdlib). Docstrings are perhaps halfway in between. On Sun, Jun 4, 2017 at 8:33 PM, Nick Coghlan wrote: > On 5 June 2017 at 07:00, Guido van Rossum wrote: > > AFAK it was in whatever PEP introduced Unicode identifiers. 
> > Ah, indeed it is: https://www.python.org/dev/peps/pep-3131/#policy- > specification > > Interestingly, that's stricter than my draft PR for PEP 8, and I'm not > entirely sure we follow the "string literals and comments must be in > ASCII" part in its entirety: > > ============ > All identifiers in the Python standard library MUST use ASCII-only > identifiers, and SHOULD use English words wherever feasible (in many > cases, abbreviations and technical terms are used which aren't > English). In addition, string literals and comments must also be in > ASCII. The only exceptions are (a) test cases testing the non-ASCII > features, and (b) names of authors. Authors whose names are not based > on the Latin alphabet MUST provide a Latin transliteration of their > names. > ============ > > That said, all the potential counter-examples that come to mind are in > the documentation, but *not* in the corresponding docstrings (e.g. the > Euro symbol used in in the docs for chr() and ord()). > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephanh42 at gmail.com Mon Jun 5 05:49:40 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Mon, 5 Jun 2017 11:49:40 +0200 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: <59310FDD.3010702@canterbury.ac.nz> Message-ID: What about just adding the -I (isolated mode) flag to the #! line of installed scripts? I was actually surprised this is not already done for the Python scripts in /use/bin on my Ubuntu box. Stephan Op 5 jun. 2017 02:52 schreef "Mike Miller" : > I'd like to throw some cold water on this one, for the same reason I > always add "." to the path in my shell, when some well-meaning soul has > removed it. Why? > > It's 2017 and I've not shared a machine since the 1980's. 
I use immutable > containers in the cloud that are not at this particular risk either. At a > small company you might share a file server, but can trust fellow > employees. At a large company, you might be at risk, but after many years > at one I'd never heard of this actually happening. > > Guess that leaves hackers? Well, if they are already in... > > In short I submit this problem is mostly theoretical, as it hasn't > occurred the decades(*cough*) of my experience. From small company to > large, to the cloud. Has it ever occurred in the history of the world? > Sure. > > On the other hand, requiring "from . " in front of many imports would make > python a bit more tedious every single day, for everyone. > > -1 > > -Mike > > p.s. Rearranging sys.path should be tolerable. Have wondered why the > current dir was first. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Mon Jun 5 05:59:35 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 05 Jun 2017 21:59:35 +1200 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: <59310FDD.3010702@canterbury.ac.nz> Message-ID: <59352B87.4000304@canterbury.ac.nz> Stephan Houben wrote: > What about just adding the -I (isolated mode) flag to the #! line of > installed scripts? Not all unix systems support passing extra arguments on a #! line. -- Greg From stephanh42 at gmail.com Mon Jun 5 06:06:45 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Mon, 5 Jun 2017 12:06:45 +0200 Subject: [Python-ideas] Security: remove "." from sys.path? 
In-Reply-To: <59352B87.4000304@canterbury.ac.nz> References: <59310FDD.3010702@canterbury.ac.nz> <59352B87.4000304@canterbury.ac.nz> Message-ID: What about doing it on the systems which *do* support it? (Which probably covers 99% of the installed base...) Stephan Op 5 jun. 2017 11:59 schreef "Greg Ewing" : > Stephan Houben wrote: > >> What about just adding the -I (isolated mode) flag to the #! line of >> installed scripts? >> > > Not all unix systems support passing extra arguments on a #! line. > > -- > Greg > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From phd at phdru.name Mon Jun 5 06:11:35 2017 From: phd at phdru.name (Oleg Broytman) Date: Mon, 5 Jun 2017 12:11:35 +0200 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: <59352B87.4000304@canterbury.ac.nz> References: <59310FDD.3010702@canterbury.ac.nz> <59352B87.4000304@canterbury.ac.nz> Message-ID: <20170605101135.GA596@phdru.name> On Mon, Jun 05, 2017 at 09:59:35PM +1200, Greg Ewing wrote: > Stephan Houben wrote: > >What about just adding the -I (isolated mode) flag to the #! line of > >installed scripts? > > Not all unix systems support passing extra arguments on a #! line. In case of #!/usr/bin/env - yes. In case of #!/usr/bin/python - combined one-letter arguments can be passed. > -- > Greg Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From phd at phdru.name Mon Jun 5 06:12:34 2017 From: phd at phdru.name (Oleg Broytman) Date: Mon, 5 Jun 2017 12:12:34 +0200 Subject: [Python-ideas] Security: remove "." from sys.path? 
In-Reply-To: References: <59310FDD.3010702@canterbury.ac.nz> <59352B87.4000304@canterbury.ac.nz> Message-ID: <20170605101234.GB596@phdru.name> On Mon, Jun 05, 2017 at 12:06:45PM +0200, Stephan Houben wrote: > What about doing it on the systems which *do* support it? > (Which probably covers 99% of the installed base...) That's the job of Linux distribution packagers, not Python Core developers. > Stephan > > Op 5 jun. 2017 11:59 schreef "Greg Ewing" : > > > Stephan Houben wrote: > > > >> What about just adding the -I (isolated mode) flag to the #! line of > >> installed scripts? > >> > > > > Not all unix systems support passing extra arguments on a #! line. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From stephanh42 at gmail.com Mon Jun 5 06:41:25 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Mon, 5 Jun 2017 12:41:25 +0200 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: <20170605101234.GB596@phdru.name> References: <59310FDD.3010702@canterbury.ac.nz> <59352B87.4000304@canterbury.ac.nz> <20170605101234.GB596@phdru.name> Message-ID: Would it not be a job for setuptools? Setuptools creates the scripts. Stephan Op 5 jun. 2017 12:12 schreef "Oleg Broytman" : On Mon, Jun 05, 2017 at 12:06:45PM +0200, Stephan Houben < stephanh42 at gmail.com> wrote: > What about doing it on the systems which *do* support it? > (Which probably covers 99% of the installed base...) That's the job of Linux distribution packagers, not Python Core developers. > Stephan > > Op 5 jun. 2017 11:59 schreef "Greg Ewing" : > > > Stephan Houben wrote: > > > >> What about just adding the -I (isolated mode) flag to the #! line of > >> installed scripts? > >> > > > > Not all unix systems support passing extra arguments on a #! line. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. 
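The effect of the -I flag under discussion is easy to check from Python itself; a small sketch comparing sys.path with and without isolated mode (assuming only that sys.executable can be re-invoked, and noting that a PYTHONSAFEPATH-style setting in the environment could change the first result):

```python
import subprocess
import sys

# does sys.path make the current directory importable?
code = 'import os, sys; print("" in sys.path or os.getcwd() in sys.path)'

normal = subprocess.run(
    [sys.executable, "-c", code],
    capture_output=True, text=True,
).stdout.strip()

isolated = subprocess.run(
    [sys.executable, "-I", "-c", code],   # -I: isolated mode
    capture_output=True, text=True,
).stdout.strip()

# typically: True False -- isolated mode drops the current directory
# (and also ignores PYTHON* environment variables and user site-packages)
print(normal, isolated)
```

An installed script on a system that supports shebang arguments would get the same effect from a first line of `#!/usr/bin/python3 -I`.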
_______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Mon Jun 5 06:55:17 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 5 Jun 2017 12:55:17 +0200 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: <20170602010559.GS23443@ando.pearwood.info> <20170603103629.GE17170@ando.pearwood.info> <59334DBA.3030904@canterbury.ac.nz> Message-ID: On 5 Jun 2017 at 00:52, "Guido van Rossum" wrote: I really don't want people to start using the "from . import foo" idiom for their first steps into programming. It seems a reasonable "defensive programming" maneuver to put in scripts and apps made by professional Python programmers for surprise-free wide distribution, but (like many of those) should not be part of the learning experience. A minimum change would be to add the (empty string) at the end of sys.path in Python 3.7 rather than adding it at the start. It would increase Python usability since it avoids the "random has no randint() function" caused by a random.py file in the script directory. In my experience, this bug hits every developer starting to learn Python and it can be very strange when you get the error when trying to run IDLE. I don't think that a new command line parameter is required. It's already easy enough to prepend something to sys.path directly in the script. And I consider that it's a very rare use case. Victor -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Jun 5 07:06:46 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 5 Jun 2017 21:06:46 +1000 Subject: [Python-ideas] Security: remove "." from sys.path?
In-Reply-To: References: <59310FDD.3010702@canterbury.ac.nz> Message-ID: On 5 June 2017 at 19:49, Stephan Houben wrote: > What about just adding the -I (isolated mode) flag to the #! line of > installed scripts? Fedora & derivatives generally do do that, but as others noted, it can sometimes cause issues with shebang line parsers. It's also easy to lose the setting when a subprocess gets started based on sys.executable. Wrapper scripts can be a little more robust (as long as they use -a to get sys.executable set appropriately), but things still end up being quite intricate and fiddly, and it's hard to prove you've plugged all the gaps. Providing a separate binary with different defaults baked in at build time doesn't magically fix everything (since you still need to change shebang lines to refer to that binary), but it does make it much easier to *stay* in system mode once you're there. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Mon Jun 5 07:14:40 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 5 Jun 2017 21:14:40 +1000 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: <20170602010559.GS23443@ando.pearwood.info> <20170603103629.GE17170@ando.pearwood.info> <59334DBA.3030904@canterbury.ac.nz> Message-ID: On 5 June 2017 at 20:55, Victor Stinner wrote: > A minimum change would be to add the (empty string) at the end of sys.path > in Python 3.7 rather than adding it at the start. > > It would increase Python usability since it avoids the "random has no > randint() function" caused by a random.py file in the script directory. In > my experience, this bug hits every developers starting to learn Python and > it can be very strange when you get the error when trying to run IDLE. > > I don't think that a new command line parameter is required. It's already > easy enough to prepend something to sys.path directly in the script. 
And I > consider that it's a very rare use case. The biggest problem with this approach is that it means that adding new standard library modules becomes a backwards compatibility break - scripts that used to work will now fail since they'll get the standard library module rather than the previously implicit main relative import. At the moment we don't have that problem - as with new builtins, adding a new standard library module may mean people have to rename things to get access to it, but their current code won't actually *break* as a result of the new name being assigned. Hence the "from . import helper" idea - that's unambiguous, so it will always get the co-located library, even if we later add "helper" to the standard library. That said, if we *just* wanted to fix the "random has no attribute randint" problem, without any other side effects, we could potentially special case __main__.__file__ in the import system such that we always ignored it, even if it could technically satisfy the current import request. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stephanh42 at gmail.com Mon Jun 5 07:30:32 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Mon, 5 Jun 2017 13:30:32 +0200 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: <59310FDD.3010702@canterbury.ac.nz> Message-ID: So it seems the best thing would be to have a system-python executable which always runs in isolated mode? In fact I could imagine that security-conscious distributions would only install system-python by default and relegate the ordinary python to some python-dev package. Stephan Op 5 jun. 2017 13:06 schreef "Nick Coghlan" : > On 5 June 2017 at 19:49, Stephan Houben wrote: > > What about just adding the -I (isolated mode) flag to the #! line of > > installed scripts? > > Fedora & derivatives generally do do that, but as others noted, it can > sometimes cause issues with shebang line parsers. 
It's also easy to > lose the setting when a subprocess gets started based on > sys.executable. > > Wrapper scripts can be a little more robust (as long as they use -a to > get sys.executable set appropriately), but things still end up being > quite intricate and fiddly, and it's hard to prove you've plugged all > the gaps. > > Providing a separate binary with different defaults baked in at build > time doesn't magically fix everything (since you still need to change > shebang lines to refer to that binary), but it does make it much > easier to *stay* in system mode once you're there. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Jun 5 08:25:06 2017 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 5 Jun 2017 05:25:06 -0700 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: <20170602010559.GS23443@ando.pearwood.info> <20170603103629.GE17170@ando.pearwood.info> <59334DBA.3030904@canterbury.ac.nz> Message-ID: On Mon, Jun 5, 2017 at 4:14 AM, Nick Coghlan wrote: > The biggest problem with this approach is that it means that adding > new standard library modules becomes a backwards compatibility break - > scripts that used to work will now fail since they'll get the standard > library module rather than the previously implicit main relative > import. At the moment we don't have that problem - as with new > builtins, adding a new standard library module may mean people have to > rename things to get access to it, but their current code won't > actually *break* as a result of the new name being assigned. Python is a bit inconsistent about this. The standard library currently doesn't shadow modules in the script directory, but it does shadow site-packages, which means that new stdlib modules already can break working code. 
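Both halves of the trade-off discussed in the thread - the "random has no randint" beginner trap, and the proposed fix of moving the script-directory entry to the end of sys.path - can be demonstrated in a self-contained sketch (it writes a throwaway random.py to a temp directory and deliberately pokes at sys.modules, so it is a demo, not production code):

```python
import os
import sys
import tempfile

tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "random.py"), "w") as f:
    f.write("# innocent-looking local file that shadows the stdlib module\n")

# mimic the script directory sitting at the *front* of sys.path
sys.modules.pop("random", None)   # forget any cached stdlib import
sys.path.insert(0, tmp)
import random
shadowed = hasattr(random, "randint")
print(shadowed)  # False -- the classic "random has no randint" trap

# the proposal, sketched: put that entry at the *end* instead
sys.modules.pop("random", None)
sys.path.append(sys.path.pop(0))
import random
print(hasattr(random, "randint"))  # True -- the stdlib wins again

# (a real script would clean up: remove tmp from sys.path and delete it)
```

With the entry at the end, the local file can still be imported explicitly, but it no longer silently overrides standard library names.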
It also makes it impossible to pip install backport modules that intentionally shadow old stdlib modules, which might not be a great idea but is at least plausibly useful in some situations, while the kind of accidental shadowing one gets in the script directory is pretty much always bad IME. -n -- Nathaniel J. Smith -- https://vorpus.org From chris.barker at noaa.gov Mon Jun 5 13:51:18 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 5 Jun 2017 10:51:18 -0700 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: <20170602010559.GS23443@ando.pearwood.info> <20170603103629.GE17170@ando.pearwood.info> <59334DBA.3030904@canterbury.ac.nz> Message-ID: On Mon, Jun 5, 2017 at 3:55 AM, Victor Stinner wrote: > A minimum change would be to add the (empty string) at the end of sys.path > in Python 3.7 rather than adding it at the start. > > It would increase Python usability since it avoids the "random has no > randint() function" caused by a random.py file in the script directory. In > my experience, this bug hits every developers starting to learn Python and > it can be very strange when you get the error when trying to run IDLE. > But it would add the "why won't python import my file?!??!" problem, which newbies also struggle with. Which leaves me with no suggestion for a solution... -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From pavol.lisy at gmail.com Mon Jun 5 16:13:53 2017 From: pavol.lisy at gmail.com (Pavol Lisy) Date: Mon, 5 Jun 2017 22:13:53 +0200 Subject: [Python-ideas] Security: remove "." from sys.path? 
In-Reply-To: References: <20170602010559.GS23443@ando.pearwood.info> <20170603103629.GE17170@ando.pearwood.info> <59334DBA.3030904@canterbury.ac.nz> Message-ID: On 6/5/17, Chris Barker wrote: > On Mon, Jun 5, 2017 at 3:55 AM, Victor Stinner > wrote: > >> A minimum change would be to add the (empty string) at the end of >> sys.path >> in Python 3.7 rather than adding it at the start. >> >> It would increase Python usability since it avoids the "random has no >> randint() function" caused by a random.py file in the script directory. >> In >> my experience, this bug hits every developers starting to learn Python >> and >> it can be very strange when you get the error when trying to run IDLE. >> > > But it would add the "why won't python import my file?!??!" problem, which > newbies also struggle with. > > Which leaves me with no suggestion for a solution... Maybe help() could check sys.last_value and get some hints? From greg at krypto.org Mon Jun 5 16:15:39 2017 From: greg at krypto.org (Gregory P. Smith) Date: Mon, 05 Jun 2017 20:15:39 +0000 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: <20170602010559.GS23443@ando.pearwood.info> <20170603103629.GE17170@ando.pearwood.info> <59334DBA.3030904@canterbury.ac.nz> Message-ID: On Mon, Jun 5, 2017 at 10:52 AM Chris Barker wrote: > On Mon, Jun 5, 2017 at 3:55 AM, Victor Stinner > wrote: > >> A minimum change would be to add the (empty string) at the end of >> sys.path in Python 3.7 rather than adding it at the start. >> >> It would increase Python usability since it avoids the "random has no >> randint() function" caused by a random.py file in the script directory. In >> my experience, this bug hits every developers starting to learn Python and >> it can be very strange when you get the error when trying to run IDLE. >> > > But it would add the "why won't python import my file?!??!" problem, which > newbies also struggle with. > > Which leaves me with no suggestion for a solution... 
> We already got rid of implicit relative imports within packages in Python 3. The primary value in continuing to treat the __main__ module differently is simplicity for learners. If the problem we're trying to solve (really - that needs to be nailed down) is that of overriding standard library modules with your own .py files being an issue (it is! it comes up time and time again)... This simple move from beginning to end is the right way to do it. It does not remove implicit relative imports for the main module but does remove implicit stdlib shadowing. which is something nobody ever wants. and when they do want that unwantable thing, they should be explicit about it instead of relying on magic that depends on code being executed as __main__ vs imported as a module from elsewhere. +0.667 on moving the empty string to the end of sys.path in 3.7 from me. -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Jun 6 00:14:10 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 6 Jun 2017 14:14:10 +1000 Subject: [Python-ideas] Security: remove "." from sys.path? In-Reply-To: References: <20170602010559.GS23443@ando.pearwood.info> <20170603103629.GE17170@ando.pearwood.info> <59334DBA.3030904@canterbury.ac.nz> Message-ID: On 5 June 2017 at 22:25, Nathaniel Smith wrote: > On Mon, Jun 5, 2017 at 4:14 AM, Nick Coghlan wrote: >> The biggest problem with this approach is that it means that adding >> new standard library modules becomes a backwards compatibility break - >> scripts that used to work will now fail since they'll get the standard >> library module rather than the previously implicit main relative >> import. At the moment we don't have that problem - as with new >> builtins, adding a new standard library module may mean people have to >> rename things to get access to it, but their current code won't >> actually *break* as a result of the new name being assigned. 
> > Python is a bit inconsistent about this. The standard library > currently doesn't shadow modules in the script directory, but it does > shadow site-packages, which means that new stdlib modules already can > break working code. It also makes it impossible to pip install > backport modules that intentionally shadow old stdlib modules, which > might not be a great idea but is at least plausibly useful in some > situations, while the kind of accidental shadowing one gets in the > script directory is pretty much always bad IME. And if folks want to *reliably* shadow the standard library (whether with their own modules or with third party ones), we already have a solution for that: zipapp and "pip install --target .". You do need to rename your scripts in development from "script.py" to "script/__main__.py" if you want to go down that path, though. I'm still somewhat inclined towards special casing __main__.__spec__.origin, but the point about standard library additions already shadowing site-packages does move me to being +0 on changing the relative precedence of the current directory on sys.path, rather than my previous -1. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From rosuav at gmail.com Tue Jun 6 15:51:55 2017 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 7 Jun 2017 05:51:55 +1000 Subject: [Python-ideas] Improved exception messages Message-ID: A question came up on python-list regarding the message given when you call float(""). It's somewhat unclear due to the way humans tend to ignore a lack of content:

>>> float("spam")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not convert string to float: 'spam'
>>> float("")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not convert string to float:

Firstly, is there a reason for the empty string to not be surrounded with quotes?
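The asymmetry is easy to reproduce from Python itself. A small sketch (the exact message text varies between interpreter versions, so this only captures the message rather than asserting its form):

```python
def conversion_error(text):
    """Return the ValueError message float(text) produces, or None on success."""
    try:
        float(text)
    except ValueError as exc:
        return str(exc)
    return None

# Both inputs fail, but (on the interpreters discussed in this thread)
# only the non-empty string's message ends with a repr of the input.
msg_spam = conversion_error("spam")
msg_empty = conversion_error("")
```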
The source code, AIUI, is this:

    x = PyOS_string_to_double(s, (char **)&end, NULL);
    if (end != last) {
        PyErr_Format(PyExc_ValueError,
                     "could not convert string to float: "
                     "%R", obj);
        return NULL;
    }

which, by my reading, should always be repr'ing the string. Secondly, the actual feature suggestion/request: Incorporate the original string in the exception's arguments. That way, if there's any confusion, e.args[1] is the safest way to check what string was actually being floated. Feasible? Useful? ChrisA

From tomuxiong at gmx.com Tue Jun 6 16:15:21 2017 From: tomuxiong at gmx.com (Thomas Nyberg) Date: Tue, 6 Jun 2017 13:15:21 -0700 Subject: [Python-ideas] Improved exception messages In-Reply-To: References: Message-ID: On 06/06/2017 12:51 PM, Chris Angelico wrote:
> Firstly, is there a reason for the empty string to not be surrounded
> with quotes? The source code, AIUI, is this:
>
>     x = PyOS_string_to_double(s, (char **)&end, NULL);
>     if (end != last) {
>         PyErr_Format(PyExc_ValueError,
>                      "could not convert string to float: "
>                      "%R", obj);
>         return NULL;
>     }
>
> which, by my reading, should always be repr'ing the string.
>
The confusing part is that if the string is empty, then the line

    if (end != last) {

does not evaluate to true. So you never enter that part of the if statement. Instead you get here:

    else if (x == -1.0 && PyErr_Occurred()) {
        return NULL;
    }

(You can check in a debugger that this happens.) Cheers, Thomas

From tomuxiong at gmx.com Tue Jun 6 16:26:29 2017 From: tomuxiong at gmx.com (Thomas Nyberg) Date: Tue, 6 Jun 2017 13:26:29 -0700 Subject: [Python-ideas] Improved exception messages In-Reply-To: References: Message-ID: <7f0eb6bc-fd8c-8212-f3b8-ffaeb6148aac@gmx.com> I think this diff is probably the correct solution. Basically it just checks if there's anything left after spaces are stripped and then throws an error if not: (By the way sorry for not being clearer in my other message. This diff is against the current 3.7 master branch.
I didn't look at the original 2.7 because I wanted to check that it wasn't fixed in a future version (which apparently it isn't).)

---------------------
$ git diff
diff --git a/Objects/floatobject.c b/Objects/floatobject.c
index 8c4fe74..c1886cc 100644
--- a/Objects/floatobject.c
+++ b/Objects/floatobject.c
@@ -144,7 +144,13 @@ float_from_string_inner(const char *s, Py_ssize_t len, void *obj)
     while (s < last - 1 && Py_ISSPACE(last[-1])) {
         last--;
     }
-
+    /* If nothing is left after stripping spaces, return error. */
+    if (s == last) {
+        PyErr_Format(PyExc_ValueError,
+                     "could not convert string to float: "
+                     "%R", obj);
+        return NULL;
+    }
     /* We don't care about overflow or underflow.  If the platform
      * supports them, infinities and signed zeroes (on underflow) are
      * fine. */
---------------------

I just compiled and tested it and it seems to do what we want:

>>> float(" ")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not convert string to float: ' '
>>> float(" ")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not convert string to float: ' '
>>> float("a")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not convert string to float: 'a'
>>> float("a ")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not convert string to float: 'a '
>>> float(" a ")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not convert string to float: ' a '
>>> float(" 1 ")
1.0

Cheers, Thomas

From mikhailwas at gmail.com Tue Jun 6 20:03:44 2017 From: mikhailwas at gmail.com (Mikhail V) Date: Wed, 7 Jun 2017 02:03:44 +0200 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= Message-ID: Greg Ewing wrote: >Steven D'Aprano wrote: >> There's not much, if any, benefit to writing: >> >> ∑(expression, lower_limit, upper_limit, name) >More generally, there's a kind of culture clash between mathematical >notation and programming notation. Mathematical notation tends to
Mathematical notation tends to >almost exclusively use single-character names, relying on different >fonts and alphabets, and superscripts and subscripts, to get a large >enough set of identifiers. Whereas in programming we use a much >smaller alphabet and longer names. That's probably because mathematicians grown up writing everything with a chalk on a blackboard. Hands are tired after hours of writing and blackboards are limitited, need to erase everything and start over. I find actually symbols ? ? (inclusive comparison) nice. They look ok and have usage in context of writing code. But that's merely an exception in world of math symbols. OTOH I'm strongly against unicode. Mikhail From greg.ewing at canterbury.ac.nz Wed Jun 7 01:34:52 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 07 Jun 2017 17:34:52 +1200 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: <5937907C.30306@canterbury.ac.nz> Mikhail V wrote: > I find actually symbols ? ? (inclusive comparison) nice. Yes, there are a few symbols it would be nice to have. A proper ? symbol would have avoided the wars between <> and !=. :-) -- Greg From storchaka at gmail.com Wed Jun 7 02:03:50 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 7 Jun 2017 09:03:50 +0300 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: <5937907C.30306@canterbury.ac.nz> References: <5937907C.30306@canterbury.ac.nz> Message-ID: 07.06.17 08:34, Greg Ewing ????: > Mikhail V wrote: >> I find actually symbols ? ? (inclusive comparison) nice. > > Yes, there are a few symbols it would be nice to have. > A proper ? symbol would have avoided the wars between > <> and !=. :-) But this would start the war between ? and ? (symbols used by mathematicians of different countries for less-or-equal). 
From contact at brice.xyz Wed Jun 7 02:48:32 2017 From: contact at brice.xyz (Brice PARENT) Date: Wed, 7 Jun 2017 08:48:32 +0200 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: <5937907C.30306@canterbury.ac.nz> References: <5937907C.30306@canterbury.ac.nz> Message-ID: <069e94e5-6c47-8b4e-be27-2956ab47b353@brice.xyz> Le 07/06/17 à 07:34, Greg Ewing a écrit : > > Yes, there are a few symbols it would be nice to have. > A proper ≠ symbol would have avoided the wars between > <> and !=. :-) I'm not sure it's worth any change in the language, it's already really easy to read and write as is. But I agree this can be great to have for example for reviewers (Python being what it is, you can have reviewers who are not really pythonistas but just here to check the logic and maths for example). And it's already available by using some fonts that provide a good ligature support, like Fira Code (https://twitter.com/pycharm/status/804786040775045123?lang=fr). I'm not sure about the support in other editors/terminals tho. From stephanh42 at gmail.com Wed Jun 7 03:11:30 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Wed, 7 Jun 2017 09:11:30 +0200 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: <069e94e5-6c47-8b4e-be27-2956ab47b353@brice.xyz> References: <5937907C.30306@canterbury.ac.nz> <069e94e5-6c47-8b4e-be27-2956ab47b353@brice.xyz> Message-ID: As already mentioned, Vim can display <= as ≤ using the 'conceal' feature. (And in fact arbitrary substitutions, of course.) Stephan Op 7 jun. 2017 8:48 a.m. schreef "Brice PARENT" : Le 07/06/17 à 07:34, Greg Ewing a écrit : > Yes, there are a few symbols it would be nice to have. > A proper ≠ symbol would have avoided the wars between > <> and !=. :-) > I'm not sure it's worth any change in the language, it's already really easy to read and write as is.
But I agree this can be great to have for example for reviewers (Python being what it is, you can have reviewers who are not really pythonistas but just here to check the logic and maths for example). And it's already available by using some fonts that provide a good ligature support, like Fira Code (https://twitter.com/pycharm/status/804786040775045123?lang=fr). I'm not sure about the support in other editors/terminals tho. _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjol at tjol.eu Wed Jun 7 03:59:04 2017 From: tjol at tjol.eu (Thomas Jollans) Date: Wed, 7 Jun 2017 09:59:04 +0200 Subject: [Python-ideas] =?utf-8?b?z4AgPSBtYXRoLnBp?= In-Reply-To: References: Message-ID: <6bf7c03a-de21-75d9-6ce8-6eb104079f2f@tjol.eu> On 2017-06-07 02:03, Mikhail V wrote: > Greg Ewing wrote: > >> Steven D'Aprano wrote: >>> There's not much, if any, benefit to writing: >>> >>> ∑(expression, lower_limit, upper_limit, name) > >> More generally, there's a kind of culture clash between mathematical >> notation and programming notation. Mathematical notation tends to >> almost exclusively use single-character names, relying on different >> fonts and alphabets, and superscripts and subscripts, to get a large >> enough set of identifiers. Whereas in programming we use a much >> smaller alphabet and longer names. > > That's probably because mathematicians grew up writing > everything with chalk on a blackboard. > Hands get tired after hours of writing and blackboards > are limited; you need to erase everything and start over.
> Also don't forget that mathematical formalism is *always* accompanied by an explanation of what the symbols mean in the particular situation (either orally by the person doing the writing, or in prose if it's in a paper or book). Valid code, no matter how badly chosen the identifier names, is always self-explanatory (at least to the computer); a mathematical formula almost never is. > I find actually symbols ≤ ≥ (inclusive comparison) > nice. They look ok and have usage in context of writing > code. > But that's merely an exception in the world of math symbols. > OTOH I'm strongly against unicode. > > > > Mikhail > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From nick at humrich.us Wed Jun 7 14:14:08 2017 From: nick at humrich.us (Nick Humrich) Date: Wed, 07 Jun 2017 18:14:08 +0000 Subject: [Python-ideas] Dictionary destructing and unpacking. Message-ID: In python, we have beautiful unpacking:

a, b, c = [1,2,3]

and even

a, b, *c = [1,2,3,4,5]

We also have dictionary destructing for purposes of keywords:

myfunc(**mydict)

You can currently unpack a dictionary, but it's almost certainly not what you would intend.

a, b, c = {'a': 1, 'c': 3, 'b': 2}.values()

In python 3.6+ this is better since the dictionary is insertion-ordered, but is still not really what one would probably want. It would be cool to have a syntax that would unpack the dictionary to values based on the names of the variables. Something perhaps like:

a, b, c = **mydict

which would assign the values of the keys 'a', 'b', 'c' to the variables. The problem with this approach is that it only works if the key is also a valid variable name. Another syntax could potentially be used to specify the keys you care about (and the order). Perhaps:

a, b, c = **mydict('a', 'b', 'c')

I don't really like that syntax, but it gives a good idea.
One way to possibly achieve this today without adding syntax support could be simply adding a builtin method to the dict class:

a, b, c = mydict.unpack('a', 'b', 'c')

The real goal of this is to easily get multiple values from a dictionary. The current ways of doing this are:

a, b, c = mydict['a'], mydict['b'], mydict['c']

or

a = mydict['a']
b = mydict['b']
c = mydict['c']

The latter seems to be more common. Both are overly verbose in my mind. One thing to consider however is the getitem vs get behavior. mydict['a'] would raise a KeyError if 'a' wasn't in the dict, whereas mydict.get('a') would return a "default" (None if not specified). Which behavior is chosen? Maybe there is no clean solution, but those are my thoughts. Anyone have feedback/ideas on this? Nick -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at lucidity.plus.com Wed Jun 7 18:11:20 2017 From: python at lucidity.plus.com (Erik) Date: Wed, 7 Jun 2017 23:11:20 +0100 Subject: [Python-ideas] Dictionary destructing and unpacking. In-Reply-To: References: Message-ID: On 07/06/17 19:14, Nick Humrich wrote:
> a, b, c = mydict.unpack('a', 'b', 'c')

def retrieve(mapping, *keys):
    return (mapping[key] for key in keys)

$ python3
Python 3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> def retrieve(mapping, *keys):
...     return (mapping[key] for key in keys)
...
>>> d = {'a': 1, 'b': None, 100: 'Foo' }
>>> a, b, c = retrieve(d, 'a', 'b', 100)
>>> a, b, c
(1, None, 'Foo')

E. From matt at getpattern.com Wed Jun 7 18:15:05 2017 From: matt at getpattern.com (Matt Gilson) Date: Wed, 7 Jun 2017 15:15:05 -0700 Subject: [Python-ideas] Dictionary destructing and unpacking.
In-Reply-To: References: Message-ID: On Wed, Jun 7, 2017 at 3:11 PM, Erik wrote: > On 07/06/17 19:14, Nick Humrich wrote: > >> a, b, c = mydict.unpack('a', 'b', 'c') >> > > def retrieve(mapping, *keys): > return (mapping[key] for key in keys) > > > Or even: from operator import itemgetter retrieve = itemgetter('a', 'b', 'c') a, b, c = retrieve(dictionary) > > $ python3 > Python 3.5.2 (default, Nov 17 2016, 17:05:23) > [GCC 5.4.0 20160609] on linux > Type "help", "copyright", "credits" or "license" for more information. > >>> def retrieve(mapping, *keys): > ... return (mapping[key] for key in keys) > ... > >>> d = {'a': 1, 'b': None, 100: 'Foo' } > >>> a, b, c = retrieve(d, 'a', 'b', 100) > >>> a, b, c > (1, None, 'Foo') > > > E. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Matt Gilson | Pattern Software Engineer getpattern.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From c at anthonyrisinger.com Wed Jun 7 18:42:44 2017 From: c at anthonyrisinger.com (C Anthony Risinger) Date: Wed, 7 Jun 2017 17:42:44 -0500 Subject: [Python-ideas] Dictionary destructing and unpacking. In-Reply-To: References: Message-ID: On Jun 7, 2017 5:15 PM, "Matt Gilson" wrote: On Wed, Jun 7, 2017 at 3:11 PM, Erik wrote: > On 07/06/17 19:14, Nick Humrich wrote: > >> a, b, c = mydict.unpack('a', 'b', 'c') >> > > def retrieve(mapping, *keys): > return (mapping[key] for key in keys) > > > Or even: from operator import itemgetter retrieve = itemgetter('a', 'b', 'c') a, b, c = retrieve(dictionary) Neither of these are really comparable to destructuring. If you take a look at how Erlang and Elixir do it, and any related code, you'll find it used constantly, all over the place. 
Recent ECMAScript is very similar, allowing both destructuring into vars matching the key names, or arbitrary var names. They both allow destructuring in the function header (IIRC python can do this with at least tuples). Erlang/Elixir goes beyond this by using the pattern matching to select the appropriate function clause within a function definition, but that's less relevant to Python. This feature has been requested before. It's easily one of the most, if not the top, feature I personally wish Python had. Incredibly useful and intuitive, and for me again, way more generally applicable than iterable unpacking. Maps are ubiquitous. -- C Anthony [mobile] -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at lucidity.plus.com Wed Jun 7 18:54:28 2017 From: python at lucidity.plus.com (Erik) Date: Wed, 7 Jun 2017 23:54:28 +0100 Subject: [Python-ideas] Dictionary destructing and unpacking. In-Reply-To: References: Message-ID: On 07/06/17 23:42, C Anthony Risinger wrote: > Neither of these are really comparable to destructuring. No, but they are comparable to the OP's suggested new built-in method (without requiring each mapping type - not just dicts - to implement it). That was what _I_ was responding to. E. From antoine.rozo at gmail.com Wed Jun 7 18:59:16 2017 From: antoine.rozo at gmail.com (Antoine Rozo) Date: Thu, 8 Jun 2017 00:59:16 +0200 Subject: [Python-ideas] Dictionary destructing and unpacking. In-Reply-To: References: Message-ID: I think you want something similar to locals.update(mydict)? 2017-06-08 0:54 GMT+02:00 Erik : > On 07/06/17 23:42, C Anthony Risinger wrote: > >> Neither of these are really comparable to destructuring. >> > > No, but they are comparable to the OP's suggested new built-in method > (without requiring each mapping type - not just dicts - to implement it). > That was what _I_ was responding to. > > > E. 
> _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Antoine Rozo -------------- next part -------------- An HTML attachment was scrubbed... URL: From antoine.rozo at gmail.com Wed Jun 7 18:59:39 2017 From: antoine.rozo at gmail.com (Antoine Rozo) Date: Thu, 8 Jun 2017 00:59:39 +0200 Subject: [Python-ideas] Dictionary destructing and unpacking. In-Reply-To: References: Message-ID: * locals().update(mydict) 2017-06-08 0:59 GMT+02:00 Antoine Rozo : > I think you want something similar to locals.update(mydict)? > > 2017-06-08 0:54 GMT+02:00 Erik : > >> On 07/06/17 23:42, C Anthony Risinger wrote: >> >>> Neither of these are really comparable to destructuring. >>> >> >> No, but they are comparable to the OP's suggested new built-in method >> (without requiring each mapping type - not just dicts - to implement it). >> That was what _I_ was responding to. >> >> >> E. >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > > > -- > Antoine Rozo > -- Antoine Rozo -------------- next part -------------- An HTML attachment was scrubbed... URL: From c at anthonyrisinger.com Wed Jun 7 19:00:32 2017 From: c at anthonyrisinger.com (C Anthony Risinger) Date: Wed, 7 Jun 2017 18:00:32 -0500 Subject: [Python-ideas] Dictionary destructing and unpacking. 
In-Reply-To: References: Message-ID: On Jun 7, 2017 5:42 PM, "C Anthony Risinger" wrote: On Jun 7, 2017 5:15 PM, "Matt Gilson" wrote: On Wed, Jun 7, 2017 at 3:11 PM, Erik wrote: > On 07/06/17 19:14, Nick Humrich wrote: > >> a, b, c = mydict.unpack('a', 'b', 'c') >> > > def retrieve(mapping, *keys): > return (mapping[key] for key in keys) > > > Or even: from operator import itemgetter retrieve = itemgetter('a', 'b', 'c') a, b, c = retrieve(dictionary) Neither of these are really comparable to destructuring. If you take a look at how Erlang and Elixir do it, and any related code, you'll find it used constantly, all over the place. Recent ECMAScript is very similar, allowing both destructuring into vars matching the key names, or arbitrary var names. They both allow destructuring in the function header (IIRC python can do this with at least tuples). Erlang/Elixir goes beyond this by using the pattern matching to select the appropriate function clause within a function definition, but that's less relevant to Python. This feature has been requested before. It's easily one of the most, if not the top, feature I personally wish Python had. Incredibly useful and intuitive, and for me again, way more generally applicable than iterable unpacking. Maps are ubiquitous. Also in the Erlang/Elixir (not sure about ECMAScript) the destructuring is about both matching *and* assignment. So something like this (in Python): payload = {"id": 123, "data": {...}} {"id": None, "data": data} = payload Would raise a MatchError or similar. It's a nice way to assert some values and bind others in one shot. Those languages often use atoms for keys though, which typically don't require quoting (and ECMAScript is more lax), so that extended format is less useful and pretty if the Python variant expected quotes all over the place. -- C Anthony [mobile] -------------- next part -------------- An HTML attachment was scrubbed... 
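The match-and-bind behaviour described above can be approximated in today's Python with a small helper. `MatchError`, `ANY`, and `match` below are invented names for illustration, not a real library API:

```python
class MatchError(Exception):
    """Raised when a mapping does not match the given pattern."""

ANY = object()  # placeholder meaning "bind this key, don't compare it"

def match(pattern, mapping):
    """Check `mapping` against `pattern` and return the bound values.

    Every key in `pattern` must be present in `mapping`; a pattern
    value of ANY only binds, any other value must also compare equal.
    """
    bound = {}
    for key, expected in pattern.items():
        if key not in mapping:
            raise MatchError('missing key: %r' % (key,))
        value = mapping[key]
        if expected is not ANY and expected != value:
            raise MatchError('%r does not match %r for key %r'
                             % (value, expected, key))
        bound[key] = value
    return bound

payload = {"id": 123, "data": {"x": 1}}
bound = match({"id": 123, "data": ANY}, payload)  # asserts id, binds data
data = bound["data"]
```

This is of course far more verbose than real destructuring syntax would be, but it captures the "assert some values and bind others in one shot" idea.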
URL: From c at anthonyrisinger.com Wed Jun 7 19:15:43 2017 From: c at anthonyrisinger.com (C Anthony Risinger) Date: Wed, 7 Jun 2017 18:15:43 -0500 Subject: [Python-ideas] Dictionary destructing and unpacking. In-Reply-To: References: Message-ID: On Jun 7, 2017 5:54 PM, "Erik" wrote: On 07/06/17 23:42, C Anthony Risinger wrote: > Neither of these are really comparable to destructuring. > No, but they are comparable to the OP's suggested new built-in method (without requiring each mapping type - not just dicts - to implement it). That was what _I_ was responding to. No worries, I only meant to emphasize that destructuring is much much more powerful and less verbose/duplicative than anything based on functions. It could readily apply/fallback against any object's __dict__ because maps underpin the entire Python object system. -- C Anthony [mobile] -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Wed Jun 7 21:18:06 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 8 Jun 2017 11:18:06 +1000 Subject: [Python-ideas] Dictionary destructing and unpacking. In-Reply-To: References: Message-ID: <20170608011804.GC3149@ando.pearwood.info> On Wed, Jun 07, 2017 at 06:14:08PM +0000, Nick Humrich wrote: > It would be cool to have a syntax that would unpack the dictionary to > values based on the names of the variables. Something perhaps like: > > a, b, c = **mydict This was discussed (briefly, to very little interest) in March/April 2008: https://mail.python.org/pipermail/python-ideas/2008-March/001511.html https://mail.python.org/pipermail/python-ideas/2008-April/001513.html and then again in 2016, when it spawned a very large thread starting here: https://mail.python.org/pipermail/python-ideas/2016-May/040430.html I know there's a lot of messages, but I STRONGLY encourage anyone, whether you are for or against this idea, to read the previous discussion before continuing it here. 
Guido was luke-warm about the **mapping syntax: https://mail.python.org/pipermail/python-ideas/2016-May/040466.html Nathan Schneider proposed making dict.values() take optional key names: https://mail.python.org/pipermail/python-ideas/2016-May/040517.html Guido suggested that this should be a different method: https://mail.python.org/pipermail/python-ideas/2016-May/040518.html My recollection is that the discussion evertually petered out with a more-or-less consensus that having a dict method (perhaps "getvalues"?) plus regular item unpacking is sufficient for the common use-case of unpacking a subset of keys: prefs = {'width': 80, 'height': 200, 'verbose': False, 'mode': PLAIN, 'name': 'Fnord', 'flags': spam|eggs|cheese, ... } # dict includes many more items width, height, size = prefs.getvalues( 'width', 'height', 'papersize', ) This trivially supports the cases where keys are not strings or valid identifiers: class_, spam, eggs = mapping.getvalues('class', 42, '~') It easily supports assignment targets which aren't simple variable names: obj.attribute[index], spam().attr = mapping.getvalues('foo', 'bar') An optional (defaults to False) "pop" keyword argument supports extracting and removing values from the dict in one call, which is commonly needed inside __init__ methods with **kwargs: class K(parent): def __init__(self, a, b, c, **kwargs): self.spam = kwargs.pop('spam') self.eggs = kwargs.pop('eggs') self.cheese = kwargs.pop('cheese') super().__init__(a, b, c, **kwargs) becomes: self.spam, self.eggs, self.cheese = kwargs.getvalues( 'spam eggs cheese'.split(), pop=True ) I don't recall this being proposed at the time, but we could support keyword arguments for missing or default values: DEFAULTS = {'height': 100, 'width': 50} prefs = get_prefs() # returns a dict height, width, size = prefs.getvalues( 'height', 'width', 'papersize', defaults=DEFAULTS, missing=None ) A basic implementation might be: # Untested. 
def getvalues(self, *keys, pop=False, defaults=None, missing=SENTINEL): values = [] for key in keys: try: x = self[key] except KeyError: if defaults is not None: x = defaults.get(key, SENTINEL) if x is SENTINEL: x = missing if x is SENTINEL: raise KeyError('missing key %r' % key) if pop: del self[key] values.append(x) return tuple(values) It's a bit repetitive for the common case where keys are the same as the assignment targets, but that's a hard problem to solve, and besides, "explicit is better than implicit". It also doesn't really work well for the case where you want to blindly create new assignment targets for *every* key, but: - my recollection is that nobody really came up with a convincing use-case for this (apologies if I missed any); - and if you really need this, you can do: locals().update(mapping) inside a class body or at the top-level of the module (but not inside a function). Please, let's save a lot of discussion here and now, and just read the 2016 thread: it is extremely comprehensive. -- Steve From phd at phdru.name Thu Jun 8 01:51:39 2017 From: phd at phdru.name (Oleg Broytman) Date: Thu, 8 Jun 2017 07:51:39 +0200 Subject: [Python-ideas] Dictionary destructing and unpacking. In-Reply-To: <20170608011804.GC3149@ando.pearwood.info> References: <20170608011804.GC3149@ando.pearwood.info> Message-ID: <20170608055139.GA22713@phdru.name> Thank you! This overview really helps! On Thu, Jun 08, 2017 at 11:18:06AM +1000, Steven D'Aprano wrote: > On Wed, Jun 07, 2017 at 06:14:08PM +0000, Nick Humrich wrote: > > > It would be cool to have a syntax that would unpack the dictionary to > > values based on the names of the variables. 
Something perhaps like: > > > > a, b, c = **mydict > > This was discussed (briefly, to very little interest) in March/April > 2008: > > https://mail.python.org/pipermail/python-ideas/2008-March/001511.html > https://mail.python.org/pipermail/python-ideas/2008-April/001513.html > > and then again in 2016, when it spawned a very large thread starting > here: > > https://mail.python.org/pipermail/python-ideas/2016-May/040430.html > > I know there's a lot of messages, but I STRONGLY encourage anyone, > whether you are for or against this idea, to read the previous > discussion before continuing it here. > > Guido was luke-warm about the **mapping syntax: > > https://mail.python.org/pipermail/python-ideas/2016-May/040466.html > > Nathan Schneider proposed making dict.values() take optional key names: > > https://mail.python.org/pipermail/python-ideas/2016-May/040517.html > > Guido suggested that this should be a different method: > > https://mail.python.org/pipermail/python-ideas/2016-May/040518.html > > My recollection is that the discussion evertually petered out with a > more-or-less consensus that having a dict method (perhaps "getvalues"?) > plus regular item unpacking is sufficient for the common use-case of > unpacking a subset of keys: > > prefs = {'width': 80, 'height': 200, 'verbose': False, 'mode': PLAIN, > 'name': 'Fnord', 'flags': spam|eggs|cheese, ... 
} > # dict includes many more items > > width, height, size = prefs.getvalues( > 'width', 'height', 'papersize', > ) > > > This trivially supports the cases where keys are not strings or valid > identifiers: > > class_, spam, eggs = mapping.getvalues('class', 42, '~') > > It easily supports assignment targets which aren't simple variable > names: > > obj.attribute[index], spam().attr = mapping.getvalues('foo', 'bar') > > An optional (defaults to False) "pop" keyword argument supports > extracting and removing values from the dict in one call, which is > commonly needed inside __init__ methods with **kwargs: > > class K(parent): > def __init__(self, a, b, c, **kwargs): > self.spam = kwargs.pop('spam') > self.eggs = kwargs.pop('eggs') > self.cheese = kwargs.pop('cheese') > super().__init__(a, b, c, **kwargs) > > > becomes: > > self.spam, self.eggs, self.cheese = kwargs.getvalues( > 'spam eggs cheese'.split(), pop=True > ) > > > I don't recall this being proposed at the time, but we could support > keyword arguments for missing or default values: > > DEFAULTS = {'height': 100, 'width': 50} > prefs = get_prefs() # returns a dict > > height, width, size = prefs.getvalues( > 'height', 'width', 'papersize', > defaults=DEFAULTS, > missing=None > ) > > > A basic implementation might be: > > # Untested. > def getvalues(self, *keys, pop=False, defaults=None, missing=SENTINEL): > values = [] > for key in keys: > try: > x = self[key] > except KeyError: > if defaults is not None: > x = defaults.get(key, SENTINEL) > if x is SENTINEL: > x = missing > if x is SENTINEL: > raise KeyError('missing key %r' % key) > if pop: > del self[key] > values.append(x) > return tuple(values) > > > > It's a bit repetitive for the common case where keys are the same as the > assignment targets, but that's a hard problem to solve, and besides, > "explicit is better than implicit". 
> > It also doesn't really work well for the case where you want to blindly create new assignment targets for > *every* key, but: > > - my recollection is that nobody really came up with a convincing > use-case for this (apologies if I missed any); > > - and if you really need this, you can do: > > locals().update(mapping) > > inside a class body or at the top-level of the module (but not inside a > function). > > Please, let's save a lot of discussion here and now, and just read the > 2016 thread: it is extremely comprehensive. > > > -- > Steve Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From victor.stinner at gmail.com Thu Jun 8 01:57:44 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 8 Jun 2017 07:57:44 +0200 Subject: [Python-ideas] Dictionary destructing and unpacking. In-Reply-To: References: Message-ID: > In python 3.6+ this is better since the dictionary is insertion-ordered, but is still not really what one would probably want. Be careful: ordered dict is an implementation detail. You must use explicitly collections.OrderedDict() to avoid bad surprises. In CPython 3.7, dict might change again. Victor -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Thu Jun 8 02:28:12 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 08 Jun 2017 18:28:12 +1200 Subject: [Python-ideas] Dictionary destructing and unpacking. In-Reply-To: References: Message-ID: <5938EE7C.9050505@canterbury.ac.nz> One existing way to do this: a, b, c = (mydict[k] for k in ('a', 'b', 'c')) -- Greg From greg.ewing at canterbury.ac.nz Thu Jun 8 02:32:40 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 08 Jun 2017 18:32:40 +1200 Subject: [Python-ideas] Dictionary destructing and unpacking. 
In-Reply-To: References: Message-ID: <5938EF88.2040408@canterbury.ac.nz> C Anthony Risinger wrote: > Incredibly useful and > intuitive, and for me again, way more generally applicable than iterable > unpacking. Maps are ubiquitous. Maps with a known, fixed set of keys are relatively uncommon in Python, though. Such an object is more likely to be an object with named attributes. -- Greg From lucas.wiman at gmail.com Thu Jun 8 02:53:21 2017 From: lucas.wiman at gmail.com (Lucas Wiman) Date: Wed, 7 Jun 2017 23:53:21 -0700 Subject: [Python-ideas] Dictionary destructing and unpacking. In-Reply-To: <5938EF88.2040408@canterbury.ac.nz> References: <5938EF88.2040408@canterbury.ac.nz> Message-ID: > > Maps with a known, fixed set of keys are relatively uncommon > in Python, though. This is false in interacting with HTTP services, where frequently you're working with deserialized JSON dictionaries you expect to be in a precise format (and fail if not). On Wed, Jun 7, 2017 at 11:32 PM, Greg Ewing wrote: > C Anthony Risinger wrote: > >> Incredibly useful and intuitive, and for me again, way more generally >> applicable than iterable unpacking. Maps are ubiquitous. >> > > Maps with a known, fixed set of keys are relatively uncommon > in Python, though. Such an object is more likely to be an > object with named attributes. > > -- > Greg > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephanh42 at gmail.com Thu Jun 8 03:09:43 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Thu, 8 Jun 2017 09:09:43 +0200 Subject: [Python-ideas] Dictionary destructing and unpacking. In-Reply-To: References: <5938EF88.2040408@canterbury.ac.nz> Message-ID: Hi Lucas, I would consider converting the dict into a namedtuple then. 
Essentially the namedtuple acts as a specification for the expected fields:

    abc = namedtuple("ABC", "a b c")
    d = {"a": 1, "b": 2, "c": 3}  # presumably you got this from reading some JSON
    abc(**d)
    # returns: ABC(a=1, b=2, c=3)

Stephan

2017-06-08 8:53 GMT+02:00 Lucas Wiman :

>> Maps with a known, fixed set of keys are relatively uncommon
>> in Python, though.
>
> This is false in interacting with HTTP services, where frequently you're
> working with deserialized JSON dictionaries you expect to be in a precise
> format (and fail if not).
>
> On Wed, Jun 7, 2017 at 11:32 PM, Greg Ewing
> wrote:
>>
>> C Anthony Risinger wrote:
>>>
>>> Incredibly useful and intuitive, and for me again, way more generally
>>> applicable than iterable unpacking. Maps are ubiquitous.
>>
>> Maps with a known, fixed set of keys are relatively uncommon
>> in Python, though. Such an object is more likely to be an
>> object with named attributes.
>>
>> --
>> Greg

From turnbull.stephen.fw at u.tsukuba.ac.jp Thu Jun 8 03:15:06 2017
From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Thu, 8 Jun 2017 16:15:06 +0900
Subject: [Python-ideas] Dictionary destructing and unpacking.
In-Reply-To:
References: <5938EF88.2040408@canterbury.ac.nz>
Message-ID: <22840.63866.935987.467598@turnbull.sk.tsukuba.ac.jp>

Lucas Wiman writes:

> > Maps with a known, fixed set of keys are relatively uncommon
> > in Python, though.
> > This is false in interacting with HTTP services, where frequently you're > working with deserialized JSON dictionaries you expect to be in a precise > format (and fail if not). It's still true. In Python, if I need those things in variables *frequently*, I write a destructuring function that returns a sequence and use sequence unpacking. If I don't need the values in a particular use case, I use the "a, _, c = second_item_ignorable()" idiom. Or, in a larger or more permanent program, I write a class that is initialized with such a dictionary. If you like this feature, and wish it were in Python, I genuinely wish you good luck getting it in. My point is just that in precisely that use case I wouldn't be passing dictionaries that need destructuring around. I believe that to be the case for most Pythonistas. (Although several have posted in favor of some way to destructure dictionaries, typically those in favor of the status quo don't speak up until it looks like there will be a change.) From lucas.wiman at gmail.com Thu Jun 8 03:24:48 2017 From: lucas.wiman at gmail.com (Lucas Wiman) Date: Thu, 8 Jun 2017 00:24:48 -0700 Subject: [Python-ideas] Dictionary destructing and unpacking. In-Reply-To: <22840.63866.935987.467598@turnbull.sk.tsukuba.ac.jp> References: <5938EF88.2040408@canterbury.ac.nz> <22840.63866.935987.467598@turnbull.sk.tsukuba.ac.jp> Message-ID: > > It's still true. In Python, if I need those things in variables > *frequently*, I write a destructuring function that returns a sequence > and use sequence unpacking. > Yes, you frequently need to write destructuring functions. Of course there are ways to write them within Python, but they're often sort of ugly to write, and this syntax would make writing them nicer. I was disagreeing with your assertion that this is an uncommon use case: it's not, by your own admission. 
Of course, it is business logic which can be factored into a separate method, it's just that it would often be nice to not need to write such a method or define a NamedTuple class. If this feature does get added, I think it would make more sense to add it as part of a general framework of unification and pattern matching. This was previously discussed in this thread: https://mail.python.org/pipermail/python-ideas/2015-April/thread.html#32907 - Lucas On Thu, Jun 8, 2017 at 12:15 AM, Stephen J. Turnbull < turnbull.stephen.fw at u.tsukuba.ac.jp> wrote: > Lucas Wiman writes: > > > > Maps with a known, fixed set of keys are relatively uncommon > > > in Python, though. > > > > This is false in interacting with HTTP services, where frequently you're > > working with deserialized JSON dictionaries you expect to be in a > precise > > format (and fail if not). > > It's still true. In Python, if I need those things in variables > *frequently*, I write a destructuring function that returns a sequence > and use sequence unpacking. If I don't need the values in a > particular use case, I use the "a, _, c = second_item_ignorable()" > idiom. > > Or, in a larger or more permanent program, I write a class that is > initialized with such a dictionary. > > If you like this feature, and wish it were in Python, I genuinely wish > you good luck getting it in. My point is just that in precisely that > use case I wouldn't be passing dictionaries that need destructuring > around. I believe that to be the case for most Pythonistas. > (Although several have posted in favor of some way to destructure > dictionaries, typically those in favor of the status quo don't speak > up until it looks like there will be a change.) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Thu Jun 8 03:49:13 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 8 Jun 2017 08:49:13 +0100 Subject: [Python-ideas] Dictionary destructing and unpacking. 
In-Reply-To: <22840.63866.935987.467598@turnbull.sk.tsukuba.ac.jp> References: <5938EF88.2040408@canterbury.ac.nz> <22840.63866.935987.467598@turnbull.sk.tsukuba.ac.jp> Message-ID: On 8 June 2017 at 08:15, Stephen J. Turnbull wrote: > If you like this feature, and wish it were in Python, I genuinely wish > you good luck getting it in. My point is just that in precisely that > use case I wouldn't be passing dictionaries that need destructuring > around. I believe that to be the case for most Pythonistas. > (Although several have posted in favor of some way to destructure > dictionaries, typically those in favor of the status quo don't speak > up until it looks like there will be a change.) The most common use case I find for this is when dealing with JSON (as someone else pointed out). But that's a definite case of dealing with data in a format that's "unnatural" for Python (by definition, JSON is "natural" for JavaScript). While having better support for working with JSON would be nice, I typically find myself wishing for better JSON handling libraries (ones that deal better with mappings with known keys) than for language features. But of course, I could write such a library myself, if it mattered sufficiently to me - and it never seems *that* important :-) Paul From c at anthonyrisinger.com Thu Jun 8 05:00:38 2017 From: c at anthonyrisinger.com (C Anthony Risinger) Date: Thu, 8 Jun 2017 04:00:38 -0500 Subject: [Python-ideas] Dictionary destructing and unpacking. In-Reply-To: <5938EF88.2040408@canterbury.ac.nz> References: <5938EF88.2040408@canterbury.ac.nz> Message-ID: On Jun 8, 2017 1:35 AM, "Greg Ewing" wrote: C Anthony Risinger wrote: > Incredibly useful and intuitive, and for me again, way more generally > applicable than iterable unpacking. Maps are ubiquitous. > Maps with a known, fixed set of keys are relatively uncommon in Python, though. Such an object is more likely to be an object with named attributes. 
I would generally agree, but in the 3 languages I mentioned at least, map
destructuring does not require the LHS to exactly match the RHS (a complete
match is required for lists and tuples though). Because pattern matching is
central to the language, Elixir takes it even further, providing syntax that
allows you to choose whether a variable on the LHS is treated as a match
(similar to the None constant in my example) or normal variable binding. In
all cases though, the LHS need only include the attributes you are actually
interested in matching and/or binding.

I need to review the linked thread still, but the way ECMAScript does it:

    const {one, two} = {one: 1, two: 2};

I think could also be useful in Python, especially if we defined some
default handling of objects and dicts via __getattr__ and/or __getitem__,
because people could use this to destructure `self` (I've seen this
requested a couple of times too):

    self.one = 1
    self.two = 2
    self.three = 3
    {one, two} = self

Or maybe even:

    def fun({one, two}, ...):

Which is also supported (and common) in those 3 langs, but probably less
pretty to Python eyes (and I think a bit less useful without function
clauses).

--
C Anthony [mobile]

From nanjekyejoannah at gmail.com Thu Jun 8 08:22:41 2017
From: nanjekyejoannah at gmail.com (joannah nanjekye)
Date: Thu, 8 Jun 2017 15:22:41 +0300
Subject: [Python-ideas] Allow function to return multiple values
In-Reply-To:
References:
Message-ID:

Thanks for the response on automatic tuple unpack. My bad, I didn't know
about this all along.

In fact, this works the same way Go does. I have been analyzing why we would
really need such a function (allow function to return multiple values) in
Python given we have this feature (automatic tuple unpack) and have not yet
got good ground. When I come across good ground I will talk about it.

So I will say this automatic tuple unpack pretty much works for my needs.
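[Editor's note: for reference, the Go-style idiom being discussed is just tuple packing on return plus unpacking at the call site; the function and names below are illustrative only.]

```python
# "Multiple return values" in Python: the return statement packs a
# tuple, and the assignment at the call site unpacks it again -- the
# same shape as Go's `return a, b`.

def divide(dividend, divisor):
    # Builds and returns a single 2-tuple.
    return dividend // divisor, dividend % divisor

quotient, remainder = divide(22, 7)  # unpacks the tuple
```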
Thanks On Thu, Jun 1, 2017 at 5:21 PM, Markus Meskanen wrote: > Why isn't a tuple enough? You can do automatic tuple unpack: > > v1, v2 = return_multiplevalues(1, 2) > > > On Jun 1, 2017 17:18, "joannah nanjekye" > wrote: > > Hello Team, > > I am Joannah. I am currently working on a book on python compatibility and > publishing it with apress. I have worked with python for a while we are > talking about four years. > > Today I was writing an example snippet for the book and needed to write a > function that returns two values something like this: > > def return_multiplevalues(num1, num2): > return num1, num2 > > I noticed that this actually returns a tuple of the values which I did > not want in the first place.I wanted python to return two values in their > own types so I can work with them as they are but here I was stuck with > working around a tuple. > > My proposal is we provide a way of functions returning multiple values. > This has been implemented in languages like Go and I have found many cases > where I needed and used such a functionality. I wish for this convenience > in python so that I don't have to suffer going around a tuple. > > I will appreciate discussing this. You may also bring to light any current > way of returning multiple values from a function that I may not know of in > python if there is. > > Kind regards, > Joannah > > -- > Joannah Nanjekye > +256776468213 > F : Nanjekye Captain Joannah > S : joannah.nanjekye > T : @Captain_Joannah > SO : joannah > > > *"You think you know when you learn, are more sure when you can write, > even more when you can teach, but certain when you can program." Alan J. 
> Perlis* > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > > -- Joannah Nanjekye +256776468213 F : Nanjekye Captain Joannah S : joannah.nanjekye T : @Captain_Joannah SO : joannah *"You think you know when you learn, are more sure when you can write, even more when you can teach, but certain when you can program." Alan J. Perlis* -------------- next part -------------- An HTML attachment was scrubbed... URL: From antoine.pietri1 at gmail.com Thu Jun 8 09:42:16 2017 From: antoine.pietri1 at gmail.com (Antoine Pietri) Date: Thu, 8 Jun 2017 15:42:16 +0200 Subject: [Python-ideas] Define __fspath__() for NamedTemporaryFile and TemporaryDirectory Message-ID: Hello everyone! A very common pattern when dealing with temporary files is code like this: with tempfile.TemporaryDirectory() as tmpdir: tmp_path = tmpdir.name os.chmod(tmp_path) os.foobar(tmp_path) open(tmp_path).read(barquux) PEP 519 (https://www.python.org/dev/peps/pep-0519/) introduced the concept of "path-like objects", objects that define a __fspath__() method. Most of the standard library has been adapted so that the functions accept path-like objects. My proposal is to define __fspath__() for TemporaryDirectory and NamedTemporaryFile so that we can pass those directly to the library functions instead of having to use the .name attribute explicitely. Thoughts? :-) -- Antoine Pietri From ncoghlan at gmail.com Thu Jun 8 10:00:10 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 9 Jun 2017 00:00:10 +1000 Subject: [Python-ideas] Dictionary destructing and unpacking. In-Reply-To: References: <5938EF88.2040408@canterbury.ac.nz> <22840.63866.935987.467598@turnbull.sk.tsukuba.ac.jp> Message-ID: On 8 June 2017 at 17:49, Paul Moore wrote: > On 8 June 2017 at 08:15, Stephen J. 
Turnbull > wrote: >> If you like this feature, and wish it were in Python, I genuinely wish >> you good luck getting it in. My point is just that in precisely that >> use case I wouldn't be passing dictionaries that need destructuring >> around. I believe that to be the case for most Pythonistas. >> (Although several have posted in favor of some way to destructure >> dictionaries, typically those in favor of the status quo don't speak >> up until it looks like there will be a change.) > > The most common use case I find for this is when dealing with JSON (as > someone else pointed out). But that's a definite case of dealing with > data in a format that's "unnatural" for Python (by definition, JSON is > "natural" for JavaScript). While having better support for working > with JSON would be nice, I typically find myself wishing for better > JSON handling libraries (ones that deal better with mappings with > known keys) than for language features. But of course, I could write > such a library myself, if it mattered sufficiently to me - and it > never seems *that* important :-) Aye, I've had good experiences with using JSL to define JSON schemas for ad hoc JSON data structures that didn't already have them: https://jsl.readthedocs.io/en/latest/ And then, if you really wanted to, something like JSON Schema Objects provides automated destructuring and validation based on those schemas: https://python-jsonschema-objects.readthedocs.io/en/latest/Introduction.html However, it really isn't an ad hoc scripting friendly way to go - it's an "I'm writing a tested-and-formally-released application and want to strictly manage the data processing boundaries between components" style solution. pandas.read_json is pretty nice (https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_json.html), but would be a heavy dependency to bring in *just* for JSON -> DataFrame conversions. 
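[Editor's note: between a full schema library and hand-rolled lookups there is also a zero-dependency idiom for the "JSON with known, fixed keys" case discussed above; shown here purely as an illustration, not a proposal.]

```python
import json
from operator import itemgetter

# Fail-fast destructuring of a JSON object with a known shape:
# itemgetter raises KeyError as soon as an expected key is absent,
# which is the "expect a precise format (and fail if not)" behavior
# described earlier in the thread.
payload = json.loads('{"width": 50, "height": 100, "papersize": "A4"}')
width, height, size = itemgetter('width', 'height', 'papersize')(payload)
```

Note that itemgetter with two or more keys returns a tuple, so the result unpacks directly.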
For myself, the things I mainly miss are: * getitem/setitem/delitem counterparts to getattr/setattr/delattr * getattrs and getitems builtins for retrieving multiple attributes or items in a single call (with the default value for missing results moved to a keyword-only argument) Now, these aren't hard to write yourself (and you can even use operator.attrgetter and operator.itemgetter as part of building them), but it's a sufficiently irritating niggle not to have them at my fingertips whenever they'd be convenient that I'll often end up writing out the long form equivalents instead. Are these necessary? Clearly not (although we did decide operator.itemgetter and operator.attrgetter were important enough to add for use with the map() and filter() builtins and other itertools). Is it a source of irritation that they're not there? Absolutely, at least for me. Cheers, Nick. P.S. Just clearly not irritating enough for me to actually put a patch together and push for a final decision one way or the other regarding adding them ;) -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ethan at stoneleaf.us Thu Jun 8 11:27:09 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 08 Jun 2017 08:27:09 -0700 Subject: [Python-ideas] Define __fspath__() for NamedTemporaryFile and TemporaryDirectory In-Reply-To: References: Message-ID: <59396CCD.1080703@stoneleaf.us> On 06/08/2017 06:42 AM, Antoine Pietri wrote: > Hello everyone! > > A very common pattern when dealing with temporary files is code like this: > > with tempfile.TemporaryDirectory() as tmpdir: > tmp_path = tmpdir.name > > os.chmod(tmp_path) > os.foobar(tmp_path) > open(tmp_path).read(barquux) > > PEP 519 (https://www.python.org/dev/peps/pep-0519/) introduced the > concept of "path-like objects", objects that define a __fspath__() > method. Most of the standard library has been adapted so that the > functions accept path-like objects. 
> > My proposal is to define __fspath__() for TemporaryDirectory and
> > NamedTemporaryFile so that we can pass those directly to the library
> > functions instead of having to use the .name attribute explicitely.
> >
> > Thoughts? :-)

Good idea. Check bugs.python.org to see if there is already an issue for
this, and if not create one! :)

--
~Ethan~

From apalala at gmail.com Thu Jun 8 11:44:50 2017
From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=)
Date: Thu, 8 Jun 2017 11:44:50 -0400
Subject: [Python-ideas] Define __fspath__() for NamedTemporaryFile and
 TemporaryDirectory
In-Reply-To:
References:
Message-ID:

On Thu, Jun 8, 2017 at 9:42 AM, Antoine Pietri wrote:

> My proposal is to define __fspath__() for TemporaryDirectory and
> NamedTemporaryFile so that we can pass those directly to the library
> functions instead of having to use the .name attribute explicitely.

+1

--
Juancarlo *Añez*

From nbadger1 at gmail.com Thu Jun 8 15:16:56 2017
From: nbadger1 at gmail.com (Nick Badger)
Date: Thu, 8 Jun 2017 12:16:56 -0700
Subject: [Python-ideas] Dictionary destructing and unpacking.
In-Reply-To:
References: <5938EF88.2040408@canterbury.ac.nz>
 <22840.63866.935987.467598@turnbull.sk.tsukuba.ac.jp>
Message-ID:

Well, it's not deliberately not destructive, but I'd be more in favor of
dict unpacking-assignment if it were spelled more like this:

    >>> foo = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
    >>> {'a': bar, 'b': baz, **rest} = foo
    >>> bar
    1
    >>> baz
    2
    >>> rest
    {'c': 3, 'd': 4}
    >>> foo
    {'a': 1, 'b': 2, 'c': 3, 'd': 4}

That also takes care of the ordering issue, and any ambiguity about "am I
unpacking the keys, the values, or both?", at the cost of a bit more
typing. However, I'm a bit on the fence about this syntax as well: it's
pretty easily confused with dictionary creation. Maybe the same thing but
without the brackets?

Just a thought I had this morning.
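[Editor's note: while no such syntax exists, the non-destructive semantics sketched above can be approximated today with a small helper; `unpack_dict` is a hypothetical name, not an existing function.]

```python
# Hypothetical helper approximating the proposed unpacking-assignment:
# extract the named keys, return the leftovers, and leave the source
# mapping intact.

def unpack_dict(mapping, *keys):
    rest = dict(mapping)  # shallow copy, so the original is not modified
    return [rest.pop(key) for key in keys] + [rest]

foo = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
bar, baz, rest = unpack_dict(foo, 'a', 'b')
```

This reproduces the example's results: bar is 1, baz is 2, rest holds the remaining items, and foo is unchanged.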
Nick Nick Badger https://www.nickbadger.com 2017-06-08 7:00 GMT-07:00 Nick Coghlan : > On 8 June 2017 at 17:49, Paul Moore wrote: > > On 8 June 2017 at 08:15, Stephen J. Turnbull > > wrote: > >> If you like this feature, and wish it were in Python, I genuinely wish > >> you good luck getting it in. My point is just that in precisely that > >> use case I wouldn't be passing dictionaries that need destructuring > >> around. I believe that to be the case for most Pythonistas. > >> (Although several have posted in favor of some way to destructure > >> dictionaries, typically those in favor of the status quo don't speak > >> up until it looks like there will be a change.) > > > > The most common use case I find for this is when dealing with JSON (as > > someone else pointed out). But that's a definite case of dealing with > > data in a format that's "unnatural" for Python (by definition, JSON is > > "natural" for JavaScript). While having better support for working > > with JSON would be nice, I typically find myself wishing for better > > JSON handling libraries (ones that deal better with mappings with > > known keys) than for language features. But of course, I could write > > such a library myself, if it mattered sufficiently to me - and it > > never seems *that* important :-) > > Aye, I've had good experiences with using JSL to define JSON schemas > for ad hoc JSON data structures that didn't already have them: > https://jsl.readthedocs.io/en/latest/ > > And then, if you really wanted to, something like JSON Schema Objects > provides automated destructuring and validation based on those > schemas: https://python-jsonschema-objects.readthedocs.io/en/ > latest/Introduction.html > > However, it really isn't an ad hoc scripting friendly way to go - it's > an "I'm writing a tested-and-formally-released application and want to > strictly manage the data processing boundaries between components" > style solution. 
> > pandas.read_json is pretty nice > (https://pandas.pydata.org/pandas-docs/stable/generated/ > pandas.read_json.html), > but would be a heavy dependency to bring in *just* for JSON -> > DataFrame conversions. > > For myself, the things I mainly miss are: > > * getitem/setitem/delitem counterparts to getattr/setattr/delattr > * getattrs and getitems builtins for retrieving multiple attributes or > items in a single call (with the default value for missing results > moved to a keyword-only argument) > > Now, these aren't hard to write yourself (and you can even use > operator.attrgetter and operator.itemgetter as part of building them), > but it's a sufficiently irritating niggle not to have them at my > fingertips whenever they'd be convenient that I'll often end up > writing out the long form equivalents instead. > > Are these necessary? Clearly not (although we did decide > operator.itemgetter and operator.attrgetter were important enough to > add for use with the map() and filter() builtins and other itertools). > > Is it a source of irritation that they're not there? Absolutely, at > least for me. > > Cheers, > Nick. > > P.S. Just clearly not irritating enough for me to actually put a patch > together and push for a final decision one way or the other regarding > adding them ;) > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brett at python.org Thu Jun 8 15:22:25 2017 From: brett at python.org (Brett Cannon) Date: Thu, 08 Jun 2017 19:22:25 +0000 Subject: [Python-ideas] Define __fspath__() for NamedTemporaryFile and TemporaryDirectory In-Reply-To: <59396CCD.1080703@stoneleaf.us> References: <59396CCD.1080703@stoneleaf.us> Message-ID: On Thu, 8 Jun 2017 at 08:27 Ethan Furman wrote: > On 06/08/2017 06:42 AM, Antoine Pietri wrote: > > Hello everyone! > > > > A very common pattern when dealing with temporary files is code like > this: > > > > with tempfile.TemporaryDirectory() as tmpdir: > > tmp_path = tmpdir.name > > > > os.chmod(tmp_path) > > os.foobar(tmp_path) > > open(tmp_path).read(barquux) > > > > PEP 519 (https://www.python.org/dev/peps/pep-0519/) introduced the > > concept of "path-like objects", objects that define a __fspath__() > > method. Most of the standard library has been adapted so that the > > functions accept path-like objects. > > > > My proposal is to define __fspath__() for TemporaryDirectory and > > NamedTemporaryFile so that we can pass those directly to the library > > functions instead of having to use the .name attribute explicitely. > > > > Thoughts? :-) > > Good idea. Check bugs.python.org to see if there is already an issue for > this, and if not create one! :) > Already exists, been discussed, and rejected: https://bugs.python.org/issue29447 . -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Thu Jun 8 15:49:30 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 08 Jun 2017 12:49:30 -0700 Subject: [Python-ideas] Define __fspath__() for NamedTemporaryFile and TemporaryDirectory In-Reply-To: References: <59396CCD.1080703@stoneleaf.us> Message-ID: <5939AA4A.2050302@stoneleaf.us> On 06/08/2017 12:22 PM, Brett Cannon wrote: > Already exists, been discussed, and rejected: https://bugs.python.org/issue29447 . Ah, right, because the returned object is not a file path. Makes sense. 
--
~Ethan~

From python at mrabarnett.plus.com Thu Jun 8 16:02:40 2017
From: python at mrabarnett.plus.com (MRAB)
Date: Thu, 8 Jun 2017 21:02:40 +0100
Subject: [Python-ideas] Dictionary destructing and unpacking.
In-Reply-To:
References: <5938EF88.2040408@canterbury.ac.nz>
 <22840.63866.935987.467598@turnbull.sk.tsukuba.ac.jp>
Message-ID: <145c1990-5fd9-f04c-e007-bbf59214fcad@mrabarnett.plus.com>

On 2017-06-08 20:16, Nick Badger wrote:
> Well, it's not deliberately not destructive, but I'd be more in favor of
> dict unpacking-assignment if it were spelled more like this:
>
>     >>> foo = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
>     >>> {'a': bar, 'b': baz, **rest} = foo
>     >>> bar
>     1
>     >>> baz
>     2
>     >>> rest
>     {'c': 3, 'd': 4}
>     >>> foo
>     {'a': 1, 'b': 2, 'c': 3, 'd': 4}
>
> That also takes care of the ordering issue, and any ambiguity about "am
> I unpacking the keys, the values, or both?", at the cost of a bit more
> typing. However, I'm a bit on the fence about this syntax as well: it's
> pretty easily confused with dictionary creation. Maybe the same thing
> but without the brackets?
>
[snip]

Maybe the braces could be doubled-up:

    >>> foo = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
    >>> {{a, b, **rest}} = foo
    >>> a
    1
    >>> b
    2
    >>> rest
    {'c': 3, 'd': 4}

It could also be used on the RHS to pack:

    >>> a = 1
    >>> b = 2
    >>> c = 3
    >>> d = 4
    >>> foo = {{a, b, c, d}}
    >>> foo
    {'a': 1, 'b': 2, 'c': 3, 'd': 4}

From abedillon at gmail.com Thu Jun 8 16:27:40 2017
From: abedillon at gmail.com (Abe Dillon)
Date: Thu, 8 Jun 2017 15:27:40 -0500
Subject: [Python-ideas] Allow function to return multiple values
In-Reply-To:
References:
Message-ID:

Welcome to the group, Joannah!

Now that you've been introduced to packing and unpacking in Python, I
would suggest learning the complete syntax, because it's a very useful
feature.
    >>> a, b = "hi"  # you can unpack any iterable
    >>> a
    'h'
    >>> b
    'i'
    >>> a, b = 1, 2  # this is the same as: a, b = (1, 2)
    >>> a
    1
    >>> b
    2
    >>> a, (b, c) = [1, (2, 3)]  # you can unpack nested iterables using parentheses
    >>> first, *rest = "spam"  # you can use '*' to capture multiple elements
    >>> first
    's'
    >>> rest
    ['p', 'a', 'm']
    >>> *rest, last = "eggs"  # which elements are captured by `*` is implied by the other assignment targets
    >>> rest
    ['e', 'g', 'g']
    >>> last
    's'
    >>> first, second, *middle, before_last, last = "lumberjack"
    >>> first
    'l'
    >>> second
    'u'
    >>> middle
    ['m', 'b', 'e', 'r', 'j', 'a']
    >>> before_last
    'c'
    >>> last
    'k'
    >>> a, b, *c = range(2)  # a '*' variable can be empty
    >>> c
    []
    >>> a, b, *c, d, e = range(3)  # the number of non-star variables has to make sense
    ValueError
    >>> a, *b, c, *d = "african swallow"  # multiple '*'s are FORBIDDEN!
    SyntaxError
    >>> a, *b = 1, 2, 3, 4, 5  # NOTE: a starred variable is always bound to a list
    >>> type(b)
    <class 'list'>
    >>> a, *b = {1, 2, 3, 4, 5}
    >>> type(b)
    <class 'list'>
    >>> a, *b = [1, 2, 3, 4, 5]
    >>> type(b)
    <class 'list'>
    >>> a, *b = dict(zip("spam", range(4)))
    >>> type(b)
    <class 'list'>
    >>> a, *b = "even for strings"
    >>> type(b)
    <class 'list'>

All of these rules apply just as well to assignment targets in for-loops:

    >>> for num, (first, *rest) in {1: "dead", 2: "parrot"}.items():
    ...     print("num=%r, first=%r, rest=%r" % (num, first, rest))
    ...
    num=1, first='d', rest=['e', 'a', 'd']
    num=2, first='p', rest=['a', 'r', 'r', 'o', 't']

Hope that helps!

On Thu, Jun 8, 2017 at 7:22 AM, joannah nanjekye wrote:

> Thanks for the response on automatic tuple unpack. My bad, I didn't know
> about this all along.
>
> In fact, this works the same way Go does. I have been analyzing why we
> would really need such a function (allow function to return multiple
> values) in Python given we have this feature (automatic tuple unpack) and
> have not yet got good ground. When I come across good ground I will talk
> about it.
>
> So I will say this automatic tuple unpack pretty much works for my needs.
> > Thanks > > On Thu, Jun 1, 2017 at 5:21 PM, Markus Meskanen > wrote: > >> Why isn't a tuple enough? You can do automatic tuple unpack: >> >> v1, v2 = return_multiplevalues(1, 2) >> >> >> On Jun 1, 2017 17:18, "joannah nanjekye" >> wrote: >> >> Hello Team, >> >> I am Joannah. I am currently working on a book on python compatibility >> and publishing it with apress. I have worked with python for a while we are >> talking about four years. >> >> Today I was writing an example snippet for the book and needed to write a >> function that returns two values something like this: >> >> def return_multiplevalues(num1, num2): >> return num1, num2 >> >> I noticed that this actually returns a tuple of the values which I did >> not want in the first place.I wanted python to return two values in their >> own types so I can work with them as they are but here I was stuck with >> working around a tuple. >> >> My proposal is we provide a way of functions returning multiple values. >> This has been implemented in languages like Go and I have found many cases >> where I needed and used such a functionality. I wish for this convenience >> in python so that I don't have to suffer going around a tuple. >> >> I will appreciate discussing this. You may also bring to light any >> current way of returning multiple values from a function that I may not >> know of in python if there is. >> >> Kind regards, >> Joannah >> >> -- >> Joannah Nanjekye >> +256776468213 <+256%20776%20468213> >> F : Nanjekye Captain Joannah >> S : joannah.nanjekye >> T : @Captain_Joannah >> SO : joannah >> >> >> *"You think you know when you learn, are more sure when you can write, >> even more when you can teach, but certain when you can program." Alan J. 
>> Perlis* >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> >> > > > -- > Joannah Nanjekye > +256776468213 <+256%20776%20468213> > F : Nanjekye Captain Joannah > S : joannah.nanjekye > T : @Captain_Joannah > SO : joannah > > > *"You think you know when you learn, are more sure when you can write, > even more when you can teach, but certain when you can program." Alan J. > Perlis* > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Thu Jun 8 16:40:22 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 9 Jun 2017 06:40:22 +1000 Subject: [Python-ideas] Dictionary destructing and unpacking. In-Reply-To: <145c1990-5fd9-f04c-e007-bbf59214fcad@mrabarnett.plus.com> References: <5938EF88.2040408@canterbury.ac.nz> <22840.63866.935987.467598@turnbull.sk.tsukuba.ac.jp> <145c1990-5fd9-f04c-e007-bbf59214fcad@mrabarnett.plus.com> Message-ID: On Fri, Jun 9, 2017 at 6:02 AM, MRAB wrote: > It could also be used on the RHS to pack: > >>>> a = 1 >>>> b = 2 >>>> c = 3 >>>> d = 4 >>>> foo = {{a, b, c, d}} The trouble with that is that it's syntactically identical to creating a set containing a set containing four values. That may cause problems. ChrisA From joshua.morton13 at gmail.com Thu Jun 8 17:00:21 2017 From: joshua.morton13 at gmail.com (Joshua Morton) Date: Thu, 08 Jun 2017 21:00:21 +0000 Subject: [Python-ideas] Dictionary destructing and unpacking. 
In-Reply-To: References: <5938EF88.2040408@canterbury.ac.nz> <22840.63866.935987.467598@turnbull.sk.tsukuba.ac.jp> <145c1990-5fd9-f04c-e007-bbf59214fcad@mrabarnett.plus.com> Message-ID: While I don't like that syntax, we do know that sets are unhashable, so we can be certain that that would be a TypeError if it was meant to construct a set containing a set. (ie. {{foo}} will always result in a TypeError in python). On Thu, Jun 8, 2017 at 1:40 PM Chris Angelico wrote: > On Fri, Jun 9, 2017 at 6:02 AM, MRAB wrote: > > It could also be used on the RHS to pack: > > > >>>> a = 1 > >>>> b = 2 > >>>> c = 3 > >>>> d = 4 > >>>> foo = {{a, b, c, d}} > > The trouble with that is that it's syntactically identical to creating > a set containing a set containing four values. That may cause > problems. > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nanjekyejoannah at gmail.com Fri Jun 9 03:15:25 2017 From: nanjekyejoannah at gmail.com (joannah nanjekye) Date: Fri, 9 Jun 2017 10:15:25 +0300 Subject: [Python-ideas] Allow function to return multiple values In-Reply-To: References: Message-ID: Thanks Abe for the insight. On Thu, Jun 8, 2017 at 11:27 PM, Abe Dillon wrote: > Welcome to the group, Joannah! > > Now that you've been introduced to packing and unpacking in Python, I > would suggest learning the complete syntax, because it's a very useful > feature. 
> > >>> a, b = "hi" # you can unpack any iterable > >>> a > 'h' > >>> b > 'i' > > >>> a, b = 1, 2 # this is the same as: a, b = (1, 2) > >>> a > 1 > >>> b > 2 > > >>> a, (b, c) = [1, (2, 3)] # you can unpack nested iterables using > parentheses > >>> first, *rest = "spam" # you can use '*' to capture multiple elements > >>> first > 's' > >>> rest > 'pam' > > >>> *rest, last = "eggs" # which elements are captured by `*` is implied > by the other assignment targets > >>> rest > 'egg' > >>> last > 's' > > >>> first, second, *middle, before_last, last = "lumberjack" > >>> first > 'l' > >>> second > 'u' > >>> middle > 'mberja' > >>> before_last > 'c' > >>> last > 'k' > > >>> a, b, *c = range(2) # a '*' variable can be empty > >>> c > [] > > >>> a, b, *c, d, e = range(3) # the number of non-star variables has to > make sense > ValueError > > >>> a, *b, c, *d = "african swallow" # multiple '*'s are FORBIDDEN! > SyntaxError > > >>> a, *b = 1, 2, 3, 4, 5 # NOTE: Most itterables unpack starred > variables as a list > >>> type(b) > > > >>> a, *b = {1, 2, 3, 4, 5} > >>> type(b) > > > >>> a, *b = [1, 2, 3, 4, 5] > >>> type(b) > > > >>> a, *b = dict(zip("spam", range(4))) > >>> type(b) > > > >>> a, *b = "except strings" > >>> type(b) > > > All of these rules apply just as well to assignment targets in for-loops: > > >>> for num, (first, *rest) in {1: "dead", 2: "parrot"}.items(): > ... print("num=%r, first=%r, rest=%r"%(num, first, rest)) > ... > num=1, first='d', rest='ead' > num=2, first='p', rest='arrot' > > Hope that helps! > > On Thu, Jun 8, 2017 at 7:22 AM, joannah nanjekye < > nanjekyejoannah at gmail.com> wrote: > >> Thanks for response on automatic tuple unpack. My bad I dint know about >> this all along. >> >> Infact this works same way Go does. I have been analyzing why we would >> really need such a function (allow function to return multiple types) in >> python given we have this feature( automatic tuple unpack) and have not yet >> got good ground. 
When I come across good ground I will talk about it. >> >> So I will say this automatic tuple unpack pretty much works for my needs. >> >> Thanks >> >> On Thu, Jun 1, 2017 at 5:21 PM, Markus Meskanen > > wrote: >> >>> Why isn't a tuple enough? You can do automatic tuple unpack: >>> >>> v1, v2 = return_multiplevalues(1, 2) >>> >>> >>> On Jun 1, 2017 17:18, "joannah nanjekye" >>> wrote: >>> >>> Hello Team, >>> >>> I am Joannah. I am currently working on a book on python compatibility >>> and publishing it with apress. I have worked with python for a while we are >>> talking about four years. >>> >>> Today I was writing an example snippet for the book and needed to write >>> a function that returns two values something like this: >>> >>> def return_multiplevalues(num1, num2): >>> return num1, num2 >>> >>> I noticed that this actually returns a tuple of the values which I did >>> not want in the first place.I wanted python to return two values in their >>> own types so I can work with them as they are but here I was stuck with >>> working around a tuple. >>> >>> My proposal is we provide a way of functions returning multiple values. >>> This has been implemented in languages like Go and I have found many cases >>> where I needed and used such a functionality. I wish for this convenience >>> in python so that I don't have to suffer going around a tuple. >>> >>> I will appreciate discussing this. You may also bring to light any >>> current way of returning multiple values from a function that I may not >>> know of in python if there is. >>> >>> Kind regards, >>> Joannah >>> >>> -- >>> Joannah Nanjekye >>> +256776468213 <+256%20776%20468213> >>> F : Nanjekye Captain Joannah >>> S : joannah.nanjekye >>> T : @Captain_Joannah >>> SO : joannah >>> >>> >>> *"You think you know when you learn, are more sure when you can write, >>> even more when you can teach, but certain when you can program." Alan J. 
>>> Perlis* >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >>> >>> >> >> >> -- >> Joannah Nanjekye >> +256776468213 <+256%20776%20468213> >> F : Nanjekye Captain Joannah >> S : joannah.nanjekye >> T : @Captain_Joannah >> SO : joannah >> >> >> *"You think you know when you learn, are more sure when you can write, >> even more when you can teach, but certain when you can program." Alan J. >> Perlis* >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> > -- Joannah Nanjekye +256776468213 F : Nanjekye Captain Joannah S : joannah.nanjekye T : @Captain_Joannah SO : joannah *"You think you know when you learn, are more sure when you can write, even more when you can teach, but certain when you can program." Alan J. Perlis* -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsbueno at python.org.br Fri Jun 9 05:49:42 2017 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Fri, 9 Jun 2017 11:49:42 +0200 Subject: [Python-ideas] Dictionary destructing and unpacking. 
In-Reply-To: References: <5938EF88.2040408@canterbury.ac.nz> <22840.63866.935987.467598@turnbull.sk.tsukuba.ac.jp> <145c1990-5fd9-f04c-e007-bbf59214fcad@mrabarnett.plus.com> Message-ID:

I had not read all the e-mails in the thread - but just to announce to people eager for a similar feature - I have a package on pypi that enables one to do:

In [74]: with extradict.MapGetter({'a': 1}) as d:
    ...:     from d import a
    ...:

In [75]: a
Out[75]: 1

(just pip install extradict)

On 8 June 2017 at 23:00, Joshua Morton wrote:
> While I don't like that syntax, we do know that sets are unhashable, so we
> can be certain that that would be a TypeError if it was meant to construct a
> set containing a set. (ie. {{foo}} will always result in a TypeError in
> python).
>
> On Thu, Jun 8, 2017 at 1:40 PM Chris Angelico wrote:
>>
>> On Fri, Jun 9, 2017 at 6:02 AM, MRAB wrote:
>> > It could also be used on the RHS to pack:
>> >
>> >>>> a = 1
>> >>>> b = 2
>> >>>> c = 3
>> >>>> d = 4
>> >>>> foo = {{a, b, c, d}}
>>
>> The trouble with that is that it's syntactically identical to creating
>> a set containing a set containing four values. That may cause
>> problems.
>>
>> ChrisA
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>

From mehaase at gmail.com Fri Jun 9 08:46:49 2017
From: mehaase at gmail.com (Mark E.
Haase)
Date: Fri, 9 Jun 2017 08:46:49 -0400
Subject: [Python-ideas] Allow function to return multiple values
In-Reply-To: References: Message-ID:

On Thu, Jun 8, 2017 at 4:27 PM, Abe Dillon wrote:

> >>> a, *b = 1, 2, 3, 4, 5  # NOTE: Most iterables unpack starred
> variables as a list
> >>> type(b)
>
> >>> a, *b = "except strings"
> >>> type(b)
>

I was just playing around with this, and on Python 3.5.3, I see strings unpacked as lists:

>>> first, *rest = 'spam'
>>> type(rest)
<class 'list'>

Am I doing something different, or is this something that changed in the language?

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From abedillon at gmail.com Fri Jun 9 11:15:28 2017
From: abedillon at gmail.com (Abe Dillon)
Date: Fri, 9 Jun 2017 10:15:28 -0500
Subject: [Python-ideas] Allow function to return multiple values
In-Reply-To: References: Message-ID:

No. You're right. I don't know why I thought strings were treated differently.

On Jun 9, 2017 7:47 AM, "Mark E. Haase" wrote:

> On Thu, Jun 8, 2017 at 4:27 PM, Abe Dillon wrote:
>
>> >>> a, *b = 1, 2, 3, 4, 5  # NOTE: Most iterables unpack starred
>> variables as a list
>> >>> type(b)
>>
>> >>> a, *b = "except strings"
>> >>> type(b)
>>
>
> I was just playing around with this, and on Python 3.5.3, I see strings
> unpacked as lists:
>
> >>> first, *rest = 'spam'
> >>> type(rest)
> <class 'list'>
>
> Am I doing something different, or is this something that changed in the
> language?
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From chris.barker at noaa.gov Fri Jun 9 14:45:54 2017
From: chris.barker at noaa.gov (Chris Barker)
Date: Fri, 9 Jun 2017 11:45:54 -0700
Subject: [Python-ideas] Dictionary destructing and unpacking.
In-Reply-To: References: Message-ID:

a few notes:

From the OP:

> It would be cool to have a syntax that would unpack the dictionary to
> values based on the names of the variables.
> Something perhaps like:
>
> a, b, c = **mydict
>
> which would assign the values of the keys 'a', 'b', 'c' to the variables.

A number of alternatives have been brought up, but I think it's important to note that this doesn't require naming the variables twice -- I think that's key to the proposal.

Personally, I don't think that use case is common enough that it's something we should support with syntax. And while the OP drew a parallel with sequence unpacking, THAT doesn't make any assumptions about variable names on either side of the equals...

The solutions that work with locals() are the only ones that do satisfy this; maybe a utility function that lets you specify which names you want unpacked would be handy. (It is a bit tricky to get the right locals() dict, though.)

Also, Paul Moore wrote:

> The most common use case I find for this is when dealing with JSON (as
> someone else pointed out). But that's a definite case of dealing with
> data in a format that's "unnatural" for Python (by definition, JSON is
> "natural" for JavaScript).

I suppose so - what is a dict in python (data structure) is an object in Javascript (code). I often find teasing out what is data and what is code to be a tricky code structure issue. However, most of the time when you pass JSON around it really is "data", rather than code, so the Python interpretation is the more appropriate one.

> While having better support for working
> with JSON would be nice, I typically find myself wishing for better
> JSON handling libraries (ones that deal better with mappings with
> known keys) than for language features.
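[The utility function mused about above is a few lines if you let it return a plain tuple instead of poking at locals() -- a hypothetical sketch (`unpack` is not a real stdlib function, and note that it does repeat the names, which is exactly what the proposed syntax would avoid):]

```python
def unpack(mapping, *names):
    """Return the values for *names*, in order, ready for tuple assignment."""
    return tuple(mapping[name] for name in names)

mydict = {'a': 1, 'b': 2, 'c': 3}
a, b, c = unpack(mydict, 'a', 'b', 'c')
print(a, b, c)  # 1 2 3
```

[A missing key surfaces naturally as a KeyError, which is arguably better than whatever a dedicated syntax form would have to invent.]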
exactly -- after all, under the hood, python objects have a __dict__ -- so mapping a JSON "object" to a Python object is really easy code to write:

class JSObject:
    def __init__(self, js):
        self.__dict__.update(json.loads(js))

Of course, this doesn't deal with nested objects, but you can do that pretty easily, too:

class JSObject:
    def __init__(self, js):
        """
        initialize an object that matches the JSON passed in

        js is a JSON compatible python object tree
        (as produced by json.load*)
        """
        for key, val in js.items():
            if type(val) is dict:
                self.__dict__[key] = JSObject(val)
            else:
                self.__dict__[key] = val

which won't deal with objects nested inside arrays. So, as I am having fun:

class JSObject:
    def __new__(cls, js_obj):
        """
        create an object that matches the JSON passed in

        js_obj is a JSON compatible python object tree
        (as produced by json.load*)
        """
        if type(js_obj) is dict:
            self = super().__new__(cls)
            for key, val in js_obj.items():
                self.__dict__[key] = JSObject(val)
            return self
        elif type(js_obj) is list:
            return [JSObject(item) for item in js_obj]
        else:
            return js_obj

    def __repr__(self):
        # note -- this does not do indentation...
        s = ['{\n']
        for key, val in self.__dict__.items():
            s.append('"%s": %s\n' % (key, str(val)))
        s.append('}')
        return "".join(s)

I'm sure there are all sorts of edge cases this doesn't handle, but it shows it's pretty easy. However, I haven't seen a lib like this (though it may exist). I think the reason is that if you really want to convert JSON to Python objects, you probably want to use a schema and do validation, etc. And there are packages that do that.

Also -- the Python data model is richer than the javascript one (dict keys that aren't identifiers, tuples vs lists, float vs int, ...) so that's another reason you probably don't want to unpack JSON into python objects rather than dicts.
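[For what it's worth, the standard library can get the nested behaviour without a custom class: json.loads takes an object_hook callable that is applied to every decoded JSON object -- including ones nested inside arrays -- so pairing it with types.SimpleNamespace gives attribute access for free. A small sketch (the sample JSON is made up):]

```python
import json
from types import SimpleNamespace

js = '{"name": "polly", "traits": [{"plumage": "beautiful"}, {"volts": 4000000}]}'

# object_hook runs on every decoded JSON object, even those inside arrays,
# so the whole tree comes back as nested SimpleNamespace instances.
obj = json.loads(js, object_hook=lambda d: SimpleNamespace(**d))

print(obj.name)               # polly
print(obj.traits[0].plumage)  # beautiful
```

[This shares the caveats above: it only works cleanly when every key is a valid Python identifier.]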
This has gotten pretty off topic, though I think it shows that having a simple way to unpack a dict is probably not really helpful to the language - when you need to do that, you really need to have a bunch of extra logic in there anyway.

> But of course, I could write
> such a library myself, if it mattered sufficiently to me - and it
> never seems *that* important :-)

So here it is :-) Not that it seems important to me -- just fun.

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: json_unpack.py
Type: text/x-python-script
Size: 2327 bytes
Desc: not available
URL:

From nfultz at gmail.com Sat Jun 10 22:20:07 2017
From: nfultz at gmail.com (Neal Fultz)
Date: Sat, 10 Jun 2017 19:20:07 -0700
Subject: [Python-ideas] Run length encoding
Message-ID:

Hello python-ideas,

I am very new to this, but on a different forum and after a couple conversations, I really wished Python came with run-length encoding built-in; after all, it ships with zip, which is much more complicated :)

The general idea is to be able to go back and forth between two representations of a sequence:

[1,1,1,1,2,3,4,4,3,3,3]

and

[(1, 4), (2, 1), (3, 1), (4, 2), (3, 3)]

where the first element is the data element, and the second is how many times it is repeated.

I wrote an encoder/decoder in about 20 lines ( https://github.com/nfultz/rle.py/blob/master/rle.py ) and would like to offer it for the next version; I think it might fit in nicely in the itertools module, for example. I am curious about your thoughts.

Best,

-Neal

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mafagafogigante at gmail.com Sat Jun 10 22:55:14 2017
From: mafagafogigante at gmail.com (Bernardo Sulzbach)
Date: Sat, 10 Jun 2017 23:55:14 -0300
Subject: [Python-ideas] Run length encoding
In-Reply-To: References: Message-ID: <9081aa6b-693d-da0b-e35d-6f4745a20097@gmail.com>

On 2017-06-10 23:20, Neal Fultz wrote:
> Hello python-ideas,
>
> I am very new to this, but on a different forum and after a couple
> conversations, I really wished Python came with run-length encoding
> built-in; after all, it ships with zip, which is much more complicated :)
>
> The general idea is to be able to go back and forth between two
> representations of a sequence:
>
> [1,1,1,1,2,3,4,4,3,3,3]
>
> and
>
> [(1, 4), (2, 1), (3, 1), (4, 2), (3, 3)]
>
> where the first element is the data element, and the second is how many
> times it is repeated.
>

We can currently do it like this in one line:

[(k, sum(1 for _ in g)) for k, g in groupby(sequence)]

However, it is slower than a "dedicated" solution. Additionally, I don't know if what you are proposing is generic enough for the standard library.

--
Bernardo Sulzbach
http://www.mafagafogigante.org/
mafagafogigante at gmail.com

From mertz at gnosis.cx Sat Jun 10 22:59:05 2017
From: mertz at gnosis.cx (David Mertz)
Date: Sat, 10 Jun 2017 19:59:05 -0700
Subject: [Python-ideas] Run length encoding
In-Reply-To: References: Message-ID:

Here's a one-line version:

from itertools import groupby
rle_encode = lambda it: (
    (l[0], len(l)) for g in groupby(it) for l in [list(g[1])])

Since "not every one line function needs to be in the standard library" is a guiding principle of Python, and even moreso of `itertools`, probably this is a recipe in the documentation at most. Or maybe it would have a home in `more_itertools`.
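[The decode direction is just as short with itertools, so the round trip Neal asks for can be sketched in a few lines -- a minimal sketch built on groupby, not the rle.py implementation linked above:]

```python
from itertools import chain, groupby, repeat

def rle_encode(seq):
    # each run of equal values becomes a (value, run_length) pair
    return [(k, len(list(g))) for k, g in groupby(seq)]

def rle_decode(pairs):
    # expand each (value, count) pair back into count copies of the value
    return list(chain.from_iterable(repeat(k, n) for k, n in pairs))

data = [1, 1, 1, 1, 2, 3, 4, 4, 3, 3, 3]
encoded = rle_encode(data)
print(encoded)                      # [(1, 4), (2, 1), (3, 1), (4, 2), (3, 3)]
print(rle_decode(encoded) == data)  # True
```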
On Sat, Jun 10, 2017 at 7:20 PM, Neal Fultz wrote:

> Hello python-ideas,
>
> I am very new to this, but on a different forum and after a couple
> conversations, I really wished Python came with run-length encoding
> built-in; after all, it ships with zip, which is much more complicated :)
>
> The general idea is to be able to go back and forth between two
> representations of a sequence:
>
> [1,1,1,1,2,3,4,4,3,3,3]
>
> and
>
> [(1, 4), (2, 1), (3, 1), (4, 2), (3, 3)]
>
> where the first element is the data element, and the second is how many
> times it is repeated.
>
> I wrote an encoder/decoder in about 20 lines (
> https://github.com/nfultz/rle.py/blob/master/rle.py ) and would like to
> offer it for the next version; I think it might fit in nicely in the
> itertools module, for example. I am curious about your thoughts.
>
> Best,
>
> -Neal
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>

--
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons. Intellectual property is
to the 21st century what the slave trade was to the 16th.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From joshua.morton13 at gmail.com Sat Jun 10 23:12:41 2017
From: joshua.morton13 at gmail.com (Joshua Morton)
Date: Sun, 11 Jun 2017 03:12:41 +0000
Subject: [Python-ideas] Run length encoding
In-Reply-To: References: Message-ID:

Another is

[(k, len(list(g))) for k, g in groupby(l)]

It might be worth adding it to the list of recipes either at https://docs.python.org/2/library/itertools.html#itertools.groupby or at https://docs.python.org/2/library/itertools.html#recipes, though.
On Sat, Jun 10, 2017 at 8:07 PM David Mertz wrote: > Here's a one-line version: > > from itertools import groupby > rle_encode = lambda it: ( > (l[0],len(l)) for g in groupby(it) for l in [list(g[1])]) > > Since "not every one line function needs to be in the standard library" is > a guiding principle of Python, and even moreso of `itertools`, probably > this is a recipe in the documentation at most. Or maybe it would have a > home in `more_itertools`. > > > On Sat, Jun 10, 2017 at 7:20 PM, Neal Fultz wrote: > >> Hello python-ideas, >> >> I am very new to this, but on a different forum and after a couple >> conversations, I really wished Python came with run-length encoding >> built-in; after all, it ships with zip, which is much more complicated :) >> >> The general idea is to be able to go back and forth between two >> representations of a sequence: >> >> [1,1,1,1,2,3,4,4,3,3,3] >> >> and >> >> [(1, 4), (2, 1), (3, 1), (4, 2), (3, 3)] >> >> where the first element is the data element, and the second is how many >> times it is repeated. >> >> I wrote an encoder/decoder in about 20 lines ( >> https://github.com/nfultz/rle.py/blob/master/rle.py ) and would like to >> offer it for the next version; I think it might fit in nicely in the >> itertools module, for example. I am curious about your thoughts. >> >> Best, >> >> -Neal >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> > > > -- > Keeping medicines from the bloodstreams of the sick; food > from the bellies of the hungry; books from the hands of the > uneducated; technology from the underdeveloped; and putting > advocates of freedom in prisons. Intellectual property is > to the 21st century what the slave trade was to the 16th. 
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mertz at gnosis.cx Sat Jun 10 23:13:43 2017
From: mertz at gnosis.cx (David Mertz)
Date: Sat, 10 Jun 2017 20:13:43 -0700
Subject: [Python-ideas] Run length encoding
In-Reply-To: References: Message-ID:

Bernardo Sulzbach posted a much prettier version than mine that is a bit shorter. But his is also somewhat slower (and I believe asymptotically so as the number of equal elements in subsequence goes up). He needs to sum up a bunch of 1's repeatedly rather than do the O(1) `len()` function.

For a list with 1000 run lengths of 1000 each, we get:

In [53]: %timeit [(k, sum(1 for _ in g)) for k, g in groupby(lst)]
10 loops, best of 3: 66.2 ms per loop

In [54]: %timeit [(k, len(l)) for k, g in groupby(lst) for l in [list(g)]]
100 loops, best of 3: 17.5 ms per loop

On Sat, Jun 10, 2017 at 7:59 PM, David Mertz wrote:

> Here's a one-line version:
>
> from itertools import groupby
> rle_encode = lambda it: (
>     (l[0], len(l)) for g in groupby(it) for l in [list(g[1])])
>
> Since "not every one line function needs to be in the standard library" is
> a guiding principle of Python, and even moreso of `itertools`, probably
> this is a recipe in the documentation at most. Or maybe it would have a
> home in `more_itertools`.
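[Whatever the constant factors, the two spellings being timed above are interchangeable in output; a quick equivalence check (the %timeit numbers themselves will of course vary by machine):]

```python
from itertools import groupby

lst = [1, 1, 1, 1, 2, 3, 4, 4, 3, 3, 3]

via_sum = [(k, sum(1 for _ in g)) for k, g in groupby(lst)]
via_len = [(k, len(l)) for k, g in groupby(lst) for l in [list(g)]]

print(via_sum == via_len)  # True
print(via_sum)             # [(1, 4), (2, 1), (3, 1), (4, 2), (3, 3)]
```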
> > > On Sat, Jun 10, 2017 at 7:20 PM, Neal Fultz wrote: > >> Hello python-ideas, >> >> I am very new to this, but on a different forum and after a couple >> conversations, I really wished Python came with run-length encoding >> built-in; after all, it ships with zip, which is much more complicated :) >> >> The general idea is to be able to go back and forth between two >> representations of a sequence: >> >> [1,1,1,1,2,3,4,4,3,3,3] >> >> and >> >> [(1, 4), (2, 1), (3, 1), (4, 2), (3, 3)] >> >> where the first element is the data element, and the second is how many >> times it is repeated. >> >> I wrote an encoder/decoder in about 20 lines ( >> https://github.com/nfultz/rle.py/blob/master/rle.py ) and would like to >> offer it for the next version; I think it might fit in nicely in the >> itertools module, for example. I am curious about your thoughts. >> >> Best, >> >> -Neal >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> > > > -- > Keeping medicines from the bloodstreams of the sick; food > from the bellies of the hungry; books from the hands of the > uneducated; technology from the underdeveloped; and putting > advocates of freedom in prisons. Intellectual property is > to the 21st century what the slave trade was to the 16th. > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From nfultz at gmail.com Sat Jun 10 23:14:06 2017
From: nfultz at gmail.com (Neal Fultz)
Date: Sat, 10 Jun 2017 20:14:06 -0700
Subject: [Python-ideas] Run length encoding
In-Reply-To: References: Message-ID:

Thanks, that's cool. Maybe the root problem is that the docs aren't using the right words when I google. Run-length encoding is particularly relevant for sparse matrices, but there's probably a library for those as well. On the data science side of things, there are a few hundred R packages that use it[1].

Can you explicate the guiding principle a bit? I'm perplexed that python would come with zip and gzip but not rle.

[1]: https://github.com/search?l=R&q=user%3Acran+rle&type=Code&utf8=%E2%9C%93

On Sat, Jun 10, 2017 at 7:59 PM, David Mertz wrote:

> Here's a one-line version:
>
> from itertools import groupby
> rle_encode = lambda it: (
>     (l[0], len(l)) for g in groupby(it) for l in [list(g[1])])
>
> Since "not every one line function needs to be in the standard library" is
> a guiding principle of Python, and even moreso of `itertools`, probably
> this is a recipe in the documentation at most. Or maybe it would have a
> home in `more_itertools`.
>
> On Sat, Jun 10, 2017 at 7:20 PM, Neal Fultz wrote:
>
>> Hello python-ideas,
>>
>> I am very new to this, but on a different forum and after a couple
>> conversations, I really wished Python came with run-length encoding
>> built-in; after all, it ships with zip, which is much more complicated :)
>>
>> The general idea is to be able to go back and forth between two
>> representations of a sequence:
>>
>> [1,1,1,1,2,3,4,4,3,3,3]
>>
>> and
>>
>> [(1, 4), (2, 1), (3, 1), (4, 2), (3, 3)]
>>
>> where the first element is the data element, and the second is how many
>> times it is repeated.
>> >> I wrote an encoder/decoder in about 20 lines ( >> https://github.com/nfultz/rle.py/blob/master/rle.py ) and would like to >> offer it for the next version; I think it might fit in nicely in the >> itertools module, for example. I am curious about your thoughts. >> >> Best, >> >> -Neal >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> > > > -- > Keeping medicines from the bloodstreams of the sick; food > from the bellies of the hungry; books from the hands of the > uneducated; technology from the underdeveloped; and putting > advocates of freedom in prisons. Intellectual property is > to the 21st century what the slave trade was to the 16th. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mafagafogigante at gmail.com Sat Jun 10 23:26:15 2017 From: mafagafogigante at gmail.com (Bernardo Sulzbach) Date: Sun, 11 Jun 2017 00:26:15 -0300 Subject: [Python-ideas] Run length encoding In-Reply-To: References: Message-ID: <4ddc1812-c5ef-66cc-550e-abb9837d7b42@gmail.com> On 2017-06-11 00:13, David Mertz wrote: > Bernardo Sulzbach posted a much prettier version than mine that is a bit > shorter. But his is also somewhat slower (and I believe asymptotically > so as the number of equal elements in subsequence goes up). He needs to > sum up a bunch of 1's repeatedly rather than do the O(1) `len()` function. > Constructing a list from an iterator of size N is in O(N). Summing N elements is in O(N). I don't think it is asymptotically slower, just slower because of implementation details. 
-- Bernardo Sulzbach http://www.mafagafogigante.org/ mafagafogigante at gmail.com From joshua.morton13 at gmail.com Sat Jun 10 23:27:34 2017 From: joshua.morton13 at gmail.com (Joshua Morton) Date: Sun, 11 Jun 2017 03:27:34 +0000 Subject: [Python-ideas] Run length encoding In-Reply-To: References: Message-ID: David: You're absolutely right, s/2/3 in my prior post! Neal: As for why zip (at first I thought you meant the zip function, not the zip compression scheme) is included and rle is not, zip is (or was), I believe, used as part of python's packaging infrastructure, hopefully someone else can correct me if that's untrue. --Josh On Sat, Jun 10, 2017 at 8:20 PM David Mertz wrote: > God no! Not in the Python 2 docs! ... if the recipe belongs somewhere it's > in the Python 3 docs. Although, I suppose it could go under 2 also, since > it's not actually a behavior change in the feature-frozen interpreter. But > as a Python instructor (and someone who remembers the cool new features of > Python 1.5 over 1.4 pretty well), my attitude about Python 2 is "kill it > with fire!" > > Your spelling of the one-liner is prettier, shorter, and more intuitive > than mine, and the same speed. > > On Sat, Jun 10, 2017 at 8:12 PM, Joshua Morton > wrote: > >> Another is >> >> [(k, len(list(g))) for k, g in groupby(l)] >> >> >> It might be worth adding it to the list of recipies either at >> https://docs.python.org/2/library/itertools.html#itertools.groupby or at >> https://docs.python.org/2/library/itertools.html#recipes, though. >> >> On Sat, Jun 10, 2017 at 8:07 PM David Mertz wrote: >> >>> Here's a one-line version: >>> >>> from itertools import groupby >>> rle_encode = lambda it: ( >>> (l[0],len(l)) for g in groupby(it) for l in [list(g[1])]) >>> >>> Since "not every one line function needs to be in the standard library" >>> is a guiding principle of Python, and even moreso of `itertools`, probably >>> this is a recipe in the documentation at most. 
Or maybe it would have a >>> home in `more_itertools`. >>> >>> >>> On Sat, Jun 10, 2017 at 7:20 PM, Neal Fultz wrote: >>> >>>> Hello python-ideas, >>>> >>>> I am very new to this, but on a different forum and after a couple >>>> conversations, I really wished Python came with run-length encoding >>>> built-in; after all, it ships with zip, which is much more complicated :) >>>> >>>> The general idea is to be able to go back and forth between two >>>> representations of a sequence: >>>> >>>> [1,1,1,1,2,3,4,4,3,3,3] >>>> >>>> and >>>> >>>> [(1, 4), (2, 1), (3, 1), (4, 2), (3, 3)] >>>> >>>> where the first element is the data element, and the second is how many >>>> times it is repeated. >>>> >>>> I wrote an encoder/decoder in about 20 lines ( >>>> https://github.com/nfultz/rle.py/blob/master/rle.py ) and would like >>>> to offer it for the next version; I think it might fit in nicely in the >>>> itertools module, for example. I am curious about your thoughts. >>>> >>>> Best, >>>> >>>> -Neal >>>> >>>> _______________________________________________ >>>> Python-ideas mailing list >>>> Python-ideas at python.org >>>> https://mail.python.org/mailman/listinfo/python-ideas >>>> Code of Conduct: http://python.org/psf/codeofconduct/ >>>> >>>> >>> >>> >>> -- >>> Keeping medicines from the bloodstreams of the sick; food >>> from the bellies of the hungry; books from the hands of the >>> uneducated; technology from the underdeveloped; and putting >>> advocates of freedom in prisons. Intellectual property is >>> to the 21st century what the slave trade was to the 16th. 
>>> _______________________________________________
>>> Python-ideas mailing list
>>> Python-ideas at python.org
>>> https://mail.python.org/mailman/listinfo/python-ideas
>>> Code of Conduct: http://python.org/psf/codeofconduct/
>>>
>>
>
>
> --
> Keeping medicines from the bloodstreams of the sick; food
> from the bellies of the hungry; books from the hands of the
> uneducated; technology from the underdeveloped; and putting
> advocates of freedom in prisons. Intellectual property is
> to the 21st century what the slave trade was to the 16th.
> -------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mertz at gnosis.cx Sat Jun 10 23:29:29 2017
From: mertz at gnosis.cx (David Mertz)
Date: Sat, 10 Jun 2017 20:29:29 -0700
Subject: [Python-ideas] Run length encoding
In-Reply-To: References: Message-ID:

If what you really want is sparse matrices, you should use those: https://docs.scipy.org/doc/scipy/reference/sparse.html. Or maybe from the experimental Dask offshoot that I contributed a few lines to: https://github.com/mrocklin/sparse. Either of those will be about two orders of magnitude faster than working with Python lists for numeric data.

The reason, I think, that there's no RLE module in Python (standard library, there's probably something on PyPI) is that it's so easy to roll your own with the building blocks in itertools. The `zipfile` and `gzip` modules are written in hundreds or thousands of lines of C code, and more importantly they are for dealing with *files* mostly, not generic sequences... that's the domain of `itertools`, but `itertools` is kept to a bare minimum collection of building blocks from which 1-10 line functions can be built efficiently.

On Sat, Jun 10, 2017 at 8:14 PM, Neal Fultz wrote:
> Thanks, that's cool. Maybe the root problem is that the docs aren't
> using the right words when I google. Run-length-encoding is particularly
> relevant for sparse matrices, but there's probably a library for those as
> well.
On the data science side of things, there's a few hundred R packages > that use it there[1]. > > Can you explicate the guiding principle a bit? I'm perplexed that python > would come with zip and gzip but not rle. > > [1] : https://github.com/search?l=R&q=user%3Acran+rle&type=Code& > utf8=%E2%9C%93 > > On Sat, Jun 10, 2017 at 7:59 PM, David Mertz wrote: > >> Here's a one-line version: >> >> from itertools import groupby >> rle_encode = lambda it: ( >> (l[0],len(l)) for g in groupby(it) for l in [list(g[1])]) >> >> Since "not every one line function needs to be in the standard library" >> is a guiding principle of Python, and even moreso of `itertools`, probably >> this is a recipe in the documentation at most. Or maybe it would have a >> home in `more_itertools`. >> >> >> On Sat, Jun 10, 2017 at 7:20 PM, Neal Fultz wrote: >> >>> Hello python-ideas, >>> >>> I am very new to this, but on a different forum and after a couple >>> conversations, I really wished Python came with run-length encoding >>> built-in; after all, it ships with zip, which is much more complicated :) >>> >>> The general idea is to be able to go back and forth between two >>> representations of a sequence: >>> >>> [1,1,1,1,2,3,4,4,3,3,3] >>> >>> and >>> >>> [(1, 4), (2, 1), (3, 1), (4, 2), (3, 3)] >>> >>> where the first element is the data element, and the second is how many >>> times it is repeated. >>> >>> I wrote an encoder/decoder in about 20 lines ( >>> https://github.com/nfultz/rle.py/blob/master/rle.py ) and would like to >>> offer it for the next version; I think it might fit in nicely in the >>> itertools module, for example. I am curious about your thoughts. 
>>> >>> Best, >>> >>> -Neal >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >>> >> >> >> -- >> Keeping medicines from the bloodstreams of the sick; food >> from the bellies of the hungry; books from the hands of the >> uneducated; technology from the underdeveloped; and putting >> advocates of freedom in prisons. Intellectual property is >> to the 21st century what the slave trade was to the 16th. >> > > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From nfultz at gmail.com Sat Jun 10 23:33:04 2017 From: nfultz at gmail.com (Neal Fultz) Date: Sat, 10 Jun 2017 20:33:04 -0700 Subject: [Python-ideas] Run length encoding In-Reply-To: References: Message-ID: Yes, I mean zip compression :) Also, everyone's been posting decode functions, but encode is a bit harder :). I think it should be equally easy to go one direction as the other. Hopefully this email chain builds up enough info to update the docs for posterity / future me. On Sat, Jun 10, 2017 at 8:27 PM, Joshua Morton wrote: > David: You're absolutely right, s/2/3 in my prior post! > > Neal: As for why zip (at first I thought you meant the zip function, not > the zip compression scheme) is included and rle is not, zip is (or was), I > believe, used as part of python's packaging infrastructure, hopefully > someone else can correct me if that's untrue. > > --Josh > > On Sat, Jun 10, 2017 at 8:20 PM David Mertz wrote: > >> God no! Not in the Python 2 docs! ... 
if the recipe belongs somewhere >> it's in the Python 3 docs. Although, I suppose it could go under 2 also, >> since it's not actually a behavior change in the feature-frozen >> interpreter. But as a Python instructor (and someone who remembers the >> cool new features of Python 1.5 over 1.4 pretty well), my attitude about >> Python 2 is "kill it with fire!" >> >> Your spelling of the one-liner is prettier, shorter, and more intuitive >> than mine, and the same speed. >> >> On Sat, Jun 10, 2017 at 8:12 PM, Joshua Morton > > wrote: >> >>> Another is >>> >>> [(k, len(list(g))) for k, g in groupby(l)] >>> >>> >>> It might be worth adding it to the list of recipies either at >>> https://docs.python.org/2/library/itertools.html#itertools.groupby or >>> at https://docs.python.org/2/library/itertools.html#recipes, though. >>> >>> On Sat, Jun 10, 2017 at 8:07 PM David Mertz wrote: >>> >>>> Here's a one-line version: >>>> >>>> from itertools import groupby >>>> rle_encode = lambda it: ( >>>> (l[0],len(l)) for g in groupby(it) for l in [list(g[1])]) >>>> >>>> Since "not every one line function needs to be in the standard library" >>>> is a guiding principle of Python, and even moreso of `itertools`, probably >>>> this is a recipe in the documentation at most. Or maybe it would have a >>>> home in `more_itertools`. >>>> >>>> >>>> On Sat, Jun 10, 2017 at 7:20 PM, Neal Fultz wrote: >>>> >>>>> Hello python-ideas, >>>>> >>>>> I am very new to this, but on a different forum and after a couple >>>>> conversations, I really wished Python came with run-length encoding >>>>> built-in; after all, it ships with zip, which is much more complicated :) >>>>> >>>>> The general idea is to be able to go back and forth between two >>>>> representations of a sequence: >>>>> >>>>> [1,1,1,1,2,3,4,4,3,3,3] >>>>> >>>>> and >>>>> >>>>> [(1, 4), (2, 1), (3, 1), (4, 2), (3, 3)] >>>>> >>>>> where the first element is the data element, and the second is how >>>>> many times it is repeated. 
>>>>> >>>>> I wrote an encoder/decoder in about 20 lines ( >>>>> https://github.com/nfultz/rle.py/blob/master/rle.py ) and would like >>>>> to offer it for the next version; I think it might fit in nicely in the >>>>> itertools module, for example. I am curious about your thoughts. >>>>> >>>>> Best, >>>>> >>>>> -Neal >>>>> >>>>> _______________________________________________ >>>>> Python-ideas mailing list >>>>> Python-ideas at python.org >>>>> https://mail.python.org/mailman/listinfo/python-ideas >>>>> Code of Conduct: http://python.org/psf/codeofconduct/ >>>>> >>>>> >>>> >>>> >>>> -- >>>> Keeping medicines from the bloodstreams of the sick; food >>>> from the bellies of the hungry; books from the hands of the >>>> uneducated; technology from the underdeveloped; and putting >>>> advocates of freedom in prisons. Intellectual property is >>>> to the 21st century what the slave trade was to the 16th. >>>> _______________________________________________ >>>> Python-ideas mailing list >>>> Python-ideas at python.org >>>> https://mail.python.org/mailman/listinfo/python-ideas >>>> Code of Conduct: http://python.org/psf/codeofconduct/ >>>> >>> >> >> >> -- >> Keeping medicines from the bloodstreams of the sick; food >> from the bellies of the hungry; books from the hands of the >> uneducated; technology from the underdeveloped; and putting >> advocates of freedom in prisons. Intellectual property is >> to the 21st century what the slave trade was to the 16th. >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nfultz at gmail.com Sat Jun 10 23:35:01 2017 From: nfultz at gmail.com (Neal Fultz) Date: Sat, 10 Jun 2017 20:35:01 -0700 Subject: [Python-ideas] Run length encoding In-Reply-To: References: Message-ID: Whoops, scratch that part about encode /decode. On Sat, Jun 10, 2017 at 8:33 PM, Neal Fultz wrote: > Yes, I mean zip compression :) > > Also, everyone's been posting decode functions, but encode is a bit harder > :). 
> > I think it should be equally easy to go one direction as the other. > Hopefully this email chain builds up enough info to update the docs for > posterity / future me. > > On Sat, Jun 10, 2017 at 8:27 PM, Joshua Morton > wrote: > >> David: You're absolutely right, s/2/3 in my prior post! >> >> Neal: As for why zip (at first I thought you meant the zip function, not >> the zip compression scheme) is included and rle is not, zip is (or was), I >> believe, used as part of python's packaging infrastructure, hopefully >> someone else can correct me if that's untrue. >> >> --Josh >> >> On Sat, Jun 10, 2017 at 8:20 PM David Mertz wrote: >> >>> God no! Not in the Python 2 docs! ... if the recipe belongs somewhere >>> it's in the Python 3 docs. Although, I suppose it could go under 2 also, >>> since it's not actually a behavior change in the feature-frozen >>> interpreter. But as a Python instructor (and someone who remembers the >>> cool new features of Python 1.5 over 1.4 pretty well), my attitude about >>> Python 2 is "kill it with fire!" >>> >>> Your spelling of the one-liner is prettier, shorter, and more intuitive >>> than mine, and the same speed. >>> >>> On Sat, Jun 10, 2017 at 8:12 PM, Joshua Morton < >>> joshua.morton13 at gmail.com> wrote: >>> >>>> Another is >>>> >>>> [(k, len(list(g))) for k, g in groupby(l)] >>>> >>>> >>>> It might be worth adding it to the list of recipies either at >>>> https://docs.python.org/2/library/itertools.html#itertools.groupby or >>>> at https://docs.python.org/2/library/itertools.html#recipes, though. 
>>>> >>>> On Sat, Jun 10, 2017 at 8:07 PM David Mertz wrote: >>>> >>>>> Here's a one-line version: >>>>> >>>>> from itertools import groupby >>>>> rle_encode = lambda it: ( >>>>> (l[0],len(l)) for g in groupby(it) for l in [list(g[1])]) >>>>> >>>>> Since "not every one line function needs to be in the standard >>>>> library" is a guiding principle of Python, and even moreso of `itertools`, >>>>> probably this is a recipe in the documentation at most. Or maybe it would >>>>> have a home in `more_itertools`. >>>>> >>>>> >>>>> On Sat, Jun 10, 2017 at 7:20 PM, Neal Fultz wrote: >>>>> >>>>>> Hello python-ideas, >>>>>> >>>>>> I am very new to this, but on a different forum and after a couple >>>>>> conversations, I really wished Python came with run-length encoding >>>>>> built-in; after all, it ships with zip, which is much more complicated :) >>>>>> >>>>>> The general idea is to be able to go back and forth between two >>>>>> representations of a sequence: >>>>>> >>>>>> [1,1,1,1,2,3,4,4,3,3,3] >>>>>> >>>>>> and >>>>>> >>>>>> [(1, 4), (2, 1), (3, 1), (4, 2), (3, 3)] >>>>>> >>>>>> where the first element is the data element, and the second is how >>>>>> many times it is repeated. >>>>>> >>>>>> I wrote an encoder/decoder in about 20 lines ( >>>>>> https://github.com/nfultz/rle.py/blob/master/rle.py ) and would like >>>>>> to offer it for the next version; I think it might fit in nicely in the >>>>>> itertools module, for example. I am curious about your thoughts. 
>>>>>> >>>>>> Best, >>>>>> >>>>>> -Neal >>>>>> >>>>>> _______________________________________________ >>>>>> Python-ideas mailing list >>>>>> Python-ideas at python.org >>>>>> https://mail.python.org/mailman/listinfo/python-ideas >>>>>> Code of Conduct: http://python.org/psf/codeofconduct/ >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> Keeping medicines from the bloodstreams of the sick; food >>>>> from the bellies of the hungry; books from the hands of the >>>>> uneducated; technology from the underdeveloped; and putting >>>>> advocates of freedom in prisons. Intellectual property is >>>>> to the 21st century what the slave trade was to the 16th. >>>>> _______________________________________________ >>>>> Python-ideas mailing list >>>>> Python-ideas at python.org >>>>> https://mail.python.org/mailman/listinfo/python-ideas >>>>> Code of Conduct: http://python.org/psf/codeofconduct/ >>>>> >>>> >>> >>> >>> -- >>> Keeping medicines from the bloodstreams of the sick; food >>> from the bellies of the hungry; books from the hands of the >>> uneducated; technology from the underdeveloped; and putting >>> advocates of freedom in prisons. Intellectual property is >>> to the 21st century what the slave trade was to the 16th. >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Sat Jun 10 23:35:14 2017 From: mertz at gnosis.cx (David Mertz) Date: Sat, 10 Jun 2017 20:35:14 -0700 Subject: [Python-ideas] Run length encoding In-Reply-To: <4ddc1812-c5ef-66cc-550e-abb9837d7b42@gmail.com> References: <4ddc1812-c5ef-66cc-550e-abb9837d7b42@gmail.com> Message-ID: You are right. I made a thinko. List construction from an iterator is O(N) just as is `sum(1 for _ in it)`. Both of them need to march through every element. But as a constant multiplier, just constructing the list should be faster than needing an addition (Python append is O(1) because of smart dynamic memory pre-allocation). 
So the "just read the iterator" approach is about 2-3 times faster than read-then-accumulate. Of course, if the run-lengths are LARGE, we can start worrying about the extra memory allocation needed as a tradeoff. Your sum uses constant memory.

On Sat, Jun 10, 2017 at 8:26 PM, Bernardo Sulzbach < mafagafogigante at gmail.com> wrote:
> On 2017-06-11 00:13, David Mertz wrote:
>
>> Bernardo Sulzbach posted a much prettier version than mine that is a bit
>> shorter. But his is also somewhat slower (and I believe asymptotically so
>> as the number of equal elements in subsequence goes up). He needs to sum
>> up a bunch of 1's repeatedly rather than do the O(1) `len()` function.
>>
> Constructing a list from an iterator of size N is in O(N).
>
> Summing N elements is in O(N).
>
> I don't think it is asymptotically slower, just slower because of
> implementation details.
>
> --
> Bernardo Sulzbach
> http://www.mafagafogigante.org/
> mafagafogigante at gmail.com
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

--
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons. Intellectual property is
to the 21st century what the slave trade was to the 16th.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From tjreedy at udel.edu Sat Jun 10 23:46:01 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 10 Jun 2017 23:46:01 -0400 Subject: [Python-ideas] Run length encoding In-Reply-To: References: Message-ID: On 6/10/2017 11:27 PM, Joshua Morton wrote: > Neal: As for why zip (at first I thought you meant the zip function, not > the zip compression scheme) is included and rle is not, zip is (or was), > I believe, used as part of python's packaging infrastructure, hopefully > someone else can correct me if that's untrue. cpython can run from a zipped version of the stdlib. In fact, sys.path contains 'C:\\Programs\\Python36\\python36.zip' -- Terry Jan Reedy From mertz at gnosis.cx Sat Jun 10 23:20:43 2017 From: mertz at gnosis.cx (David Mertz) Date: Sat, 10 Jun 2017 20:20:43 -0700 Subject: [Python-ideas] Run length encoding In-Reply-To: References: Message-ID: God no! Not in the Python 2 docs! ... if the recipe belongs somewhere it's in the Python 3 docs. Although, I suppose it could go under 2 also, since it's not actually a behavior change in the feature-frozen interpreter. But as a Python instructor (and someone who remembers the cool new features of Python 1.5 over 1.4 pretty well), my attitude about Python 2 is "kill it with fire!" Your spelling of the one-liner is prettier, shorter, and more intuitive than mine, and the same speed. On Sat, Jun 10, 2017 at 8:12 PM, Joshua Morton wrote: > Another is > > [(k, len(list(g))) for k, g in groupby(l)] > > > It might be worth adding it to the list of recipies either at > https://docs.python.org/2/library/itertools.html#itertools.groupby or at > https://docs.python.org/2/library/itertools.html#recipes, though. 
> > On Sat, Jun 10, 2017 at 8:07 PM David Mertz wrote: > >> Here's a one-line version: >> >> from itertools import groupby >> rle_encode = lambda it: ( >> (l[0],len(l)) for g in groupby(it) for l in [list(g[1])]) >> >> Since "not every one line function needs to be in the standard library" >> is a guiding principle of Python, and even moreso of `itertools`, probably >> this is a recipe in the documentation at most. Or maybe it would have a >> home in `more_itertools`. >> >> >> On Sat, Jun 10, 2017 at 7:20 PM, Neal Fultz wrote: >> >>> Hello python-ideas, >>> >>> I am very new to this, but on a different forum and after a couple >>> conversations, I really wished Python came with run-length encoding >>> built-in; after all, it ships with zip, which is much more complicated :) >>> >>> The general idea is to be able to go back and forth between two >>> representations of a sequence: >>> >>> [1,1,1,1,2,3,4,4,3,3,3] >>> >>> and >>> >>> [(1, 4), (2, 1), (3, 1), (4, 2), (3, 3)] >>> >>> where the first element is the data element, and the second is how many >>> times it is repeated. >>> >>> I wrote an encoder/decoder in about 20 lines ( >>> https://github.com/nfultz/rle.py/blob/master/rle.py ) and would like to >>> offer it for the next version; I think it might fit in nicely in the >>> itertools module, for example. I am curious about your thoughts. >>> >>> Best, >>> >>> -Neal >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >>> >> >> >> -- >> Keeping medicines from the bloodstreams of the sick; food >> from the bellies of the hungry; books from the hands of the >> uneducated; technology from the underdeveloped; and putting >> advocates of freedom in prisons. Intellectual property is >> to the 21st century what the slave trade was to the 16th. 
>> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From nfultz at gmail.com Sat Jun 10 23:56:15 2017 From: nfultz at gmail.com (Neal Fultz) Date: Sat, 10 Jun 2017 20:56:15 -0700 Subject: [Python-ideas] Run length encoding In-Reply-To: References: Message-ID: I would also submit there's some value in the obvious readability of z = runlength.encode(sequence) vs z = [(k, len(list(g))) for k, g in itertools.groupby(sequence)] but that's my personal opinion. Everyone is welcome to use my code, but I probably won't submit to pypi for a two function module, it was just an idea :) I do think it's worth adding to the docs, though, if only for future people / me googling "run length encoding python" and only finding stack overflow. On Sat, Jun 10, 2017 at 8:46 PM, Terry Reedy wrote: > On 6/10/2017 11:27 PM, Joshua Morton wrote: > > Neal: As for why zip (at first I thought you meant the zip function, not >> the zip compression scheme) is included and rle is not, zip is (or was), I >> believe, used as part of python's packaging infrastructure, hopefully >> someone else can correct me if that's untrue. >> > > cpython can run from a zipped version of the stdlib. 
> In fact, sys.path contains 'C:\\Programs\\Python36\\python36.zip'
>
> --
> Terry Jan Reedy
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tim.peters at gmail.com Sun Jun 11 00:00:03 2017
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 10 Jun 2017 23:00:03 -0500
Subject: [Python-ideas] Run length encoding
In-Reply-To: References: Message-ID:

In the other direction, e.g.,

    def expand_rle(rle):
        from itertools import repeat, chain
        return list(chain.from_iterable(repeat(x, n) for x, n in rle))

Then

    >>> expand_rle([('a', 5), ('bc', 3)])
    ['a', 'a', 'a', 'a', 'a', 'bc', 'bc', 'bc']

As to why zip is in the distribution, but not RLE, zip is a very widely used, general-purpose compression standard. RLE is a special-purpose thing.

From ncoghlan at gmail.com Sun Jun 11 00:17:56 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 11 Jun 2017 14:17:56 +1000
Subject: [Python-ideas] Run length encoding
In-Reply-To: References: Message-ID:

On 11 June 2017 at 13:27, Joshua Morton wrote:
> David: You're absolutely right, s/2/3 in my prior post!
>
> Neal: As for why zip (at first I thought you meant the zip function, not the
> zip compression scheme) is included and rle is not, zip is (or was), I
> believe, used as part of python's packaging infrastructure, hopefully
> someone else can correct me if that's untrue.

There are a variety of reasons why things end up in the standard library:

- they're part of how the language works (e.g. contextlib, types, operator, numbers, pickle, abc, copy, sys, importlib, zipimport)
- they're modeling core computing concepts (e.g. math, cmath, decimal, fractions, re, encodings)
- they're widely useful, and harder to get right than they may first appear (e.g.
collections, itertools, secrets, statistics, struct, shelve)
- they're widely useful (or at least once were), and just plain hard to get right (e.g. ssl, http, json, xml, xmlrpc, zip, bz2)
- they provide cross-platform abstractions of operating system level interfaces (e.g. os, io, shutil, pathlib)
- they're helpful in enabling ecosystem level interoperability between tools (e.g. logging, unittest, asyncio)
- they're part of the way *nix systems work (e.g. grp, pwd)
- they're useful in working on other parts of the standard library (e.g. unittest.mock, enum)

"zip" and the other compression libraries check a couple of those boxes: they're broadly useful *and* they're needed in other parts of the standard library (e.g. lots of network protocols include compression support, we support importing from zip archives, and we support generating them through distutils, shutil, and zipapp).

Run-length-encoding, on the other hand, is one of those things where the algorithm is pretty simple, and you're almost always going to be able to get the best results by creating an implementation tailored specifically to your use case, rather than working through a general purpose abstraction like the iterator protocol. Even when that isn't the case, implementing your own utility function is still going to be competitive time-wise with finding and using a standard implementation.
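As a hypothetical illustration of that kind of tailoring (an editorial sketch, not anything proposed in the thread): an RLE specialized to strings can emit a compact textual form directly, skipping the generic list-of-pairs representation entirely:

```python
from itertools import groupby

def rle_str(text):
    # Tailored to str input: produce "a4b2c1"-style output directly,
    # rather than a general-purpose list of (item, count) pairs.
    return "".join(f"{ch}{len(list(g))}" for ch, g in groupby(text))

assert rle_str("aaaabbc") == "a4b2c1"
```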
I suspect the main long term value of offering a recommended implementation as an itertools recipe wouldn't be in using it directly, but rather in making it easier for people to:

- find the recipe if they already know the phrase "run length encoding"
- test their own implementations that are more tailored to their specific use case

The one itertools building block I do sometimes wish we had was a `counted_len` helper that:

- used `len(obj)` if the given object implemented `__len__`
- fell back to `sum(1 for __ in obj)` otherwise

Otherwise you have to make the choice between:

- use `len(obj)`, and only support sequences
- use `len(list(obj))` and potentially make a pointless copy
- use `sum(1 for __ in obj)` and ignore the possible O(1) fast path
- writing your own `counted_len` helper:

    def counted_len(iterable):
        try:
            return len(iterable)
        except TypeError:
            pass
        return sum(1 for __ in iter(iterable))

If there was an itertools.counted_len function then the obvious option would be "use itertools.counted_len". Such a function would also become the obvious way to consume an iterator when you don't care about the individual results - you'd just process them all, and get the count of how many iterations happened.

Given such a helper, the recipe for run-length-encoding would then be:

    def run_length_encode(iterable):
        return ((k, counted_len(g)) for k, g in groupby(iterable))

Cheers, Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Sun Jun 11 00:24:35 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 11 Jun 2017 14:24:35 +1000
Subject: [Python-ideas] Run length encoding
In-Reply-To: References: Message-ID:

On 11 June 2017 at 13:35, Neal Fultz wrote:
> Whoops, scratch that part about encode /decode.
Aye, decode is a relatively straightforward nested comprehension:

    def run_length_decode(iterable):
        return (item for item, item_count in iterable for __ in range(item_count))

It's only encode that is currently missing a clear self-evidently correct spelling, and I think that's more due to the lack of an obvious spelling for "tell me how many items are in this iterable, exhausting it if necessary" than it is to anything else.

Cheers, Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From greg.ewing at canterbury.ac.nz Sun Jun 11 00:39:26 2017
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 11 Jun 2017 16:39:26 +1200
Subject: [Python-ideas] Run length encoding
In-Reply-To: References: Message-ID: <593CC97E.4000901@canterbury.ac.nz>

In my experience, RLE isn't something you often find on its own. Usually it's used as part of some compression scheme that also has ways of encoding verbatim runs of data and maybe other things.

So I'm skeptical that it can be usefully provided as a library function. It seems more like a design pattern than something you can capture in a library.

--
Greg

From ncoghlan at gmail.com Sun Jun 11 00:50:17 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 11 Jun 2017 14:50:17 +1000
Subject: [Python-ideas] Run length encoding
In-Reply-To: References: <4ddc1812-c5ef-66cc-550e-abb9837d7b42@gmail.com>
Message-ID:

On 11 June 2017 at 13:35, David Mertz wrote:
> You are right. I made a thinko.
>
> List construction from an iterator is O(N) just as is `sum(1 for _ in it)`.
> Both of them need to march through every element. But as a constant
> multiplier, just constructing the list should be faster than needing an
> addition (Python append is O(1) because of smart dynamic memory
> pre-allocation).
>
> So the "just read the iterator" approach is about 2-3 times faster than
> read-then-accumulate. Of course, if the run-lengths are LARGE, we can
> start worrying about the extra memory allocation needed as a tradeoff.
Your > sum uses constant memory. This would be another argument in favour of providing an itertools.counted_len function, as that would be able to avoid all the overheads currently associated with the "sum(1 for __ in iterable)" counting strategy. Without that, the best you can do in pure Python is to use __length_hint__ to choose your preferred counting strategy at runtime based on the specific input. Something like: from operator import length_hint # 10k 64-bit pointers ~= 640k max temporary list _USE_COUNTED_SUM = 10_001 def counted_len(iterable): # For sized containers, just return their length try: return len(iterable) except TypeError: pass # For probably-large inputs & those with no length hint, count them hint = length_hint(iterable, default=_USE_COUNTED_SUM) if hint >= _USE_COUNTED_SUM: return sum(1 for __ in iter(iterable)) # Otherwise, make a temporary list and report its length # as the specifics of the CPython implementation make this # faster than using the generator expression return len(list(iterable)) Cheers, Nick. P.S. itertools._grouper objects don't currently provide a length hint, and it would be tricky to add one, since it would need to be based on the number of remaining items in the original sequence, which would it turn depend on *that* defining either __len__ or __length_hint__. 
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From storchaka at gmail.com Sun Jun 11 00:57:24 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sun, 11 Jun 2017 07:57:24 +0300
Subject: [Python-ideas] Run length encoding
In-Reply-To: References: Message-ID:

11.06.17 05:20, Neal Fultz wrote:
> I am very new to this, but on a different forum and after a couple
> conversations, I really wished Python came with run-length encoding
> built-in; after all, it ships with zip, which is much more complicated :)
>
> The general idea is to be able to go back and forth between two
> representations of a sequence:
>
> [1,1,1,1,2,3,4,4,3,3,3]
>
> and
>
> [(1, 4), (2, 1), (3, 1), (4, 2), (3, 3)]
>
> where the first element is the data element, and the second is how many
> times it is repeated.
>
> I wrote an encoder/decoder in about 20 lines (
> https://github.com/nfultz/rle.py/blob/master/rle.py ) and would like to
> offer it for the next version; I think it might fit in nicely in the
> itertools module, for example. I am curious about your thoughts.

RLE is just a general idea. Concrete implementations in file formats and protocols have different limitations and peculiarities. Different schemes are used for encoding lengths and values; short repetition runs (and runs of specific values) usually are not RLE-encoded at all, and there are limits on the maximal run length. The implementation of the general idea is simple, but it is not helpful in concrete cases.
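For example, a toy byte-oriented scheme (hypothetical, not any real format) that caps run lengths at 255 so each count fits in a single byte shows the kind of constraint such concrete formats impose:

```python
def rle_encode_bytes(data: bytes, max_run: int = 255) -> bytes:
    # Emit (count, value) byte pairs; a run longer than max_run is split.
    out = bytearray()
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i] and j - i < max_run:
            j += 1
        out += bytes([j - i, data[i]])
        i = j
    return bytes(out)

def rle_decode_bytes(encoded: bytes) -> bytes:
    out = bytearray()
    for count, value in zip(encoded[::2], encoded[1::2]):
        out += bytes([value]) * count
    return bytes(out)

payload = b"\x00" * 300 + b"AB"  # the 300-byte run must be split
assert rle_decode_bytes(rle_encode_bytes(payload)) == payload
```

Real schemes (PackBits, PCX, the RLE strategy inside zlib's deflate) add further wrinkles on top of this, such as literal runs for stretches that don't repeat.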
From nfultz at gmail.com Sun Jun 11 01:08:22 2017
From: nfultz at gmail.com (Neal Fultz)
Date: Sat, 10 Jun 2017 22:08:22 -0700
Subject: [Python-ideas] Run length encoding
In-Reply-To: <593CC97E.4000901@canterbury.ac.nz> References: <593CC97E.4000901@canterbury.ac.nz>
Message-ID:

Agreed to a degree about providing it as code, but it may also be worth mentioning that zlib itself implements RLE [1], and if there was ever a desire to go "python all the way down" you need an RLE somewhere anyway :)

That said, I'll be pretty happy with anything that replaces an hour of google/coding/testing/(hour later find out I'm an idiot from a random listserv) with 1 minute of googling. Again, my issue isn't that it was difficult to code, but it *was* hard to make the research-y jump from googling for "run length encoding python", where I knew *exactly* what algorithm I wanted, to "itertools.groupby", which appears to be more general purpose and needs a little tweaking. Adjusting the docs/recipes would probably solve that problem.

-- To me this is roughly on the same level as googling for 'binary search python' and not having bisect show up.

However, the fact that `itertools.groupby` doesn't group over elements that are not contiguous is a bit surprising to me coming from SQL/pandas/R land (that is probably a large part of my disconnect here). This is actually explicitly called out in the current docs, but I wonder how many people search for one thing and find the other: I googled for RLE and the solution was actually groupby, but probably a lot of other people who wanted a SQL group-by accidentally got an RLE and have to work around that... Then again, I don't know if you all can easily change names of functions at this point.

-Neal

[1] https://github.com/madler/zlib/blob/master/deflate.c#L2057

On Sat, Jun 10, 2017 at 9:39 PM, Greg Ewing wrote:
> In my experience, RLE isn't something you often find on its own.
> Usually it's used as part of some compression scheme that also > has ways of encoding verbatim runs of data and maybe other > things. > > So I'm skeptical that it can be usefully provided as a library > function. It seems more like a design pattern than something > you can capture in a library. > > -- > Greg > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Sun Jun 11 01:19:00 2017 From: mertz at gnosis.cx (David Mertz) Date: Sat, 10 Jun 2017 22:19:00 -0700 Subject: [Python-ideas] Run length encoding In-Reply-To: References: <593CC97E.4000901@canterbury.ac.nz> Message-ID: If you understand what iterators do, the fact that itertools.groupby collects contiguous elements is both obvious and necessary. Iterators might be infinitely long... you cannot ask for every "A" that might eventually occur in an infinite sequence of letters. On Sat, Jun 10, 2017 at 10:08 PM, Neal Fultz wrote: > Agreed to a degree about providing it as code, but it may also be worth > mentioning also that zlib itself implements rle [1], and if there was ever > a desire to go "python all the way down" you need an RLE somewhere anyway > :) > > That said, I'll be pretty happy with anything that replaces an hour of > google/coding/testing/(hour later find out I'm an idiot from a random > listserv) with 1 minute of googling. Again, my issue isn't that it was > difficult to code, but it *was* hard to make the research-y jump from > googling for "run length encoding python", where I knew *exactly* what > algorithm I wanted, to "itertools.groupby" which appears to be more > general purpose and needs a little tweaking. Adjusting the docs/recipes > would probably solve that problem. 
> > -- To me this is roughly on the same level as googling for 'binary search > python' and not having bisect show up. > > However, the fact that `itertools.groupby` doesn't group over elements > that are not contiguous is a bit surprising to me coming from SQL/pandas/R > land (that is probably a large part of my disconnect here). This is > actually explicitly called out in the current docs, but I wonder how many > people search for one thing and find the other: > > I googled for RLE and the solution was actually groupby, but probably a > lot of other people want a SQL group-by accidentally got an RLE and have to > work around that... Then again, I don't know if you all can easily change > names of functions at this point. > > -Neal > > [1] https://github.com/madler/zlib/blob/master/deflate.c#L2057 > > > > On Sat, Jun 10, 2017 at 9:39 PM, Greg Ewing > wrote: > >> In my experience, RLE isn't something you often find on its own. >> Usually it's used as part of some compression scheme that also >> has ways of encoding verbatim runs of data and maybe other >> things. >> >> So I'm skeptical that it can be usefully provided as a library >> function. It seems more like a design pattern than something >> you can capture in a library. >> >> -- >> Greg >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. 
Intellectual property is to the 21st century what the slave trade was to the 16th.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From nfultz at gmail.com Sun Jun 11 02:17:33 2017
From: nfultz at gmail.com (Neal Fultz)
Date: Sat, 10 Jun 2017 23:17:33 -0700
Subject: [Python-ideas] Run length encoding
In-Reply-To: References: <593CC97E.4000901@canterbury.ac.nz>
Message-ID:

If the consensus is "Let's add ten lines to the recipes" I'm all aboard, ignore the rest: if I could have googled a good answer I would have stopped there.

I won't argue the necessity or obviousness of itertools.groupby, just its name:

* I myself am a false negative that wanted the RLE behavior *and* couldn't find it easily
  * so we should update the docs
* other people have been false positives who wanted a SQL-type group by, but got burned
  * hence the warnings in the docs
* If the docs explicitly say "by run", some extra group of them will then know what that means vs the current wording.

I would definitely also support adding helper functions though; I think this is a very common use case which turns up in math/optimization applied to geology, biology, ..., and also fax machines: https://en.wikipedia.org/wiki/Run-length_encoding

Also, if someone rewrote zip in pure Python, would many people actually notice a slowdown vs network latency, disk IO, etc? RLE is a building block just like bisect. :)

Anyway, I'm not claiming my implementation is some huge gift, but let's at least add a recipe or documentation so people can find y'all's way later without reinventing the wheel.

On Sat, Jun 10, 2017 at 10:19 PM, David Mertz wrote:
> If you understand what iterators do, the fact that itertools.groupby
> collects contiguous elements is both obvious and necessary. Iterators
> might be infinitely long... you cannot ask for every "A" that might
> eventually occur in an infinite sequence of letters.
> > On Sat, Jun 10, 2017 at 10:08 PM, Neal Fultz wrote: > >> Agreed to a degree about providing it as code, but it may also be worth >> mentioning also that zlib itself implements rle [1], and if there was ever >> a desire to go "python all the way down" you need an RLE somewhere anyway >> :) >> >> That said, I'll be pretty happy with anything that replaces an hour of >> google/coding/testing/(hour later find out I'm an idiot from a random >> listserv) with 1 minute of googling. Again, my issue isn't that it was >> difficult to code, but it *was* hard to make the research-y jump from >> googling for "run length encoding python", where I knew *exactly* what >> algorithm I wanted, to "itertools.groupby" which appears to be more >> general purpose and needs a little tweaking. Adjusting the docs/recipes >> would probably solve that problem. >> >> -- To me this is roughly on the same level as googling for 'binary >> search python' and not having bisect show up. >> >> However, the fact that `itertools.groupby` doesn't group over elements >> that are not contiguous is a bit surprising to me coming from SQL/pandas/R >> land (that is probably a large part of my disconnect here). This is >> actually explicitly called out in the current docs, but I wonder how many >> people search for one thing and find the other: >> >> I googled for RLE and the solution was actually groupby, but probably a >> lot of other people want a SQL group-by accidentally got an RLE and have to >> work around that... Then again, I don't know if you all can easily change >> names of functions at this point. >> >> -Neal >> >> [1] https://github.com/madler/zlib/blob/master/deflate.c#L2057 >> >> >> >> On Sat, Jun 10, 2017 at 9:39 PM, Greg Ewing >> wrote: >> >>> In my experience, RLE isn't something you often find on its own. >>> Usually it's used as part of some compression scheme that also >>> has ways of encoding verbatim runs of data and maybe other >>> things. 
>>> So I'm skeptical that it can be usefully provided as a library
>>> function. It seems more like a design pattern than something
>>> you can capture in a library.
>>>
>>> --
>>> Greg
>>>
>>> _______________________________________________
>>> Python-ideas mailing list
>>> Python-ideas at python.org
>>> https://mail.python.org/mailman/listinfo/python-ideas
>>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>
> --
> Keeping medicines from the bloodstreams of the sick; food
> from the bellies of the hungry; books from the hands of the
> uneducated; technology from the underdeveloped; and putting
> advocates of freedom in prisons. Intellectual property is
> to the 21st century what the slave trade was to the 16th.
> -------------- next part --------------
An HTML attachment was scrubbed...
URL:

From storchaka at gmail.com Sun Jun 11 04:53:08 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sun, 11 Jun 2017 11:53:08 +0300
Subject: [Python-ideas] Run length encoding
In-Reply-To: References: <593CC97E.4000901@canterbury.ac.nz>
Message-ID:

11.06.17 09:17, Neal Fultz wrote:
> * other people have been false positive and wanted a SQL-type group
> by, but got burned
> * hence the warnings in the docs.

This wouldn't help if people don't read the docs.

> Also, if someone rewrote zip in pure python, would many people actually
> notice a slow down vs network latency, disk IO, etc?

Definitely yes.

> RLE is a building block just like bisect.

This is a very specific building block, and if ZIP compression were rewritten in pure Python it wouldn't use it. FYI, there are multiple compression methods supported in ZIP files, but the zipfile module does not implement all of them.
In particular, simple RLE based methods are not implemented (they are almost never used in the real world now). I suppose that if the zipfile module implemented these algorithms it wouldn't use any general RLE implementation.

https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT

From barry at barrys-emacs.org Tue Jun 13 14:30:35 2017
From: barry at barrys-emacs.org (Barry Scott)
Date: Tue, 13 Jun 2017 19:30:35 +0100
Subject: [Python-ideas] ImportError raised for a circular import
Message-ID:

Recently I fell into the trap of creating a circular import and yet again it took time to figure out what was wrong.

I'm wondering why the python import code does not detect this error and raise an exception.

I took a look at the code and got as far as figuring out that I would need to add the detection to the python 3 import code. Unless I missed something I cannot get the detection without modifying the core code as I could see no way to hook the process cleanly.

Is it a reasonable idea to add this detection to python?

I am willing to work on a patch.

Barry

From antoine.rozo at gmail.com Tue Jun 13 15:13:01 2017
From: antoine.rozo at gmail.com (Antoine Rozo)
Date: Tue, 13 Jun 2017 21:13:01 +0200
Subject: [Python-ideas] ImportError raised for a circular import
In-Reply-To: References: Message-ID:

But circular imports are sometimes needed in modules.
For example when you have two classes in two different modules that reference each other in their methods (and because you can't pre-declare classes like in C++).

2017-06-13 20:30 GMT+02:00 Barry Scott :
> Recently I fell into the trap of creating a circular import and yet again
> it took time to figure out what was wrong.
>
> I'm wondering why the python import code does not detect this error and
> raise an exception.
>
> I took a look at the code and got as far as figuring out that I would need
> to add the detection to the
> python 3 import code.
> Unless I missed something I cannot get the detection
> without modifying the core code as I could see no way to hook the process cleanly.
>
> Is it reasonable idea to add this detection to python?
>
> I am willing to work on a patch.
>
> Barry
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

--
Antoine Rozo
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From barry at barrys-emacs.org Tue Jun 13 14:42:55 2017
From: barry at barrys-emacs.org (Barry Scott)
Date: Tue, 13 Jun 2017 19:42:55 +0100
Subject: [Python-ideas] Restore the __members__ behavior to python3 for C extension writers
Message-ID:

I have been trying to get dir(c_ext_obj) to work for PyCXX as the method used with python2 was removed in python3, namely use the list of names returned from the __members__ attribute. I have failed to find a simple replacement for the python3 API.

It seems that I have to implement __dir__. But to do that I need to know what dir() will return and add the member variables to the answer. I have not been able to figure out what is necessary to write such code. No one on python users or python dev responded to an earlier query on this subject.

Would it be possible to simply put back the support for the __members__ attribute in python3? Or provide an API call to get the list that dir() would produce for my object.

Barry

From guettliml at thomas-guettler.de Tue Jun 13 16:13:44 2017
From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=)
Date: Tue, 13 Jun 2017 22:13:44 +0200
Subject: [Python-ideas] socket module: plain stuples vs named tuples
Message-ID: <07d5cb54-e6f4-7f2d-7995-d6c01c94401f@thomas-guettler.de>

AFAIK the socket module returns plain tuples in Python3:

https://docs.python.org/3/library/socket.html

Why not use named tuples?
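(An illustrative aside, not part of the original message: the kind of wrapper the suggestion implies could look like the sketch below. AddrInfo and getaddrinfo_named are hypothetical names invented here, not stdlib API; socket.getaddrinfo itself does return plain five-element tuples.)

```python
import socket
from collections import namedtuple

# Hypothetical field names matching the five-element tuples
# that socket.getaddrinfo returns.
AddrInfo = namedtuple("AddrInfo", "family type proto canonname sockaddr")

def getaddrinfo_named(*args, **kwargs):
    """Wrap the plain tuples from socket.getaddrinfo in named tuples."""
    return [AddrInfo(*entry) for entry in socket.getaddrinfo(*args, **kwargs)]

# A named tuple stays compatible with index-based access...
info = AddrInfo(socket.AF_INET, socket.SOCK_STREAM, 6, "", ("127.0.0.1", 80))
print(info[4])           # ('127.0.0.1', 80)
# ...while also allowing self-documenting attribute access.
print(info.sockaddr)     # ('127.0.0.1', 80)
```

Because namedtuple instances are still tuples, existing code that unpacks or indexes the result keeps working unchanged.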
Regards,
Thomas Güttler

--
I am looking for feedback for my personal programming guidelines: https://github.com/guettli/programming-guidelines

From barry at barrys-emacs.org Tue Jun 13 16:35:28 2017
From: barry at barrys-emacs.org (Barry)
Date: Tue, 13 Jun 2017 21:35:28 +0100
Subject: [Python-ideas] ImportError raised for a circular import
In-Reply-To: References: Message-ID: <6A4A2E1B-73D3-46D1-A27A-925772214284@barrys-emacs.org>

> On 13 Jun 2017, at 20:13, Antoine Rozo wrote:
>
> But circular imports are sometimes needed in modules.
> For example when you have two classes in two different modules that reference each other in their methods (and because you can't pre-declare classes like in C++).

Really? It has always been a strong sign of a design bug in all the cases I have ever seen. The example you suggest always fails when I accidentally write it.

Pylint will certainly shout loudly that this case is an error.

Barry

>
> 2017-06-13 20:30 GMT+02:00 Barry Scott :
>> Recently I fell into the trap of creating a circular import and yet again it took time to figure out what was wrong.
>>
>> I'm wondering why the python import code does not detect this error and raise an exception.
>>
>> I took a look at the code and got as far as figuring out that I would need to add the detection to the
>> python 3 import code. Unless I missed something I cannot get the detection without
>> modifying the core code as I could see no way to hook the process cleanly.
>>
>> Is it reasonable idea to add this detection to python?
>>
>> I am willing to work on a patch.
>>
>> Barry
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>
> --
> Antoine Rozo
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rosuav at gmail.com Tue Jun 13 16:43:16 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 14 Jun 2017 06:43:16 +1000
Subject: [Python-ideas] ImportError raised for a circular import
In-Reply-To: <6A4A2E1B-73D3-46D1-A27A-925772214284@barrys-emacs.org> References: <6A4A2E1B-73D3-46D1-A27A-925772214284@barrys-emacs.org>
Message-ID:

On Wed, Jun 14, 2017 at 6:35 AM, Barry wrote:
> On 13 Jun 2017, at 20:13, Antoine Rozo wrote:
>
> But circular imports are sometimes needed in modules.
> For example when you have two classes in two different modules that
> reference each other in their methods (and because you can't pre-declare
> classes like in C++).
>
> Really? It has always been a strong sign of a design bug in all the cases I
> have ever seen.
> The example you suggest always fails when I accidentally write it.
>
> Pylint will certainly shout loud that this case is an error.

Depends on your definition of "circular". Consider this:

# __init__.py
from flask import Flask
app = Flask(__name__)
from . import views

# views.py
from . import app

@app.route("/")
def home():
    ...

Technically this is circular. During the loading of __init__, views will be imported, which then imports something from __init__. But it's perfectly well-defined (there's no way that views will ever be the first one imported, per the rules of packages) and it makes good sense. An error on circular imports, or even a warning, would be very annoying here.

ChrisA

From tjreedy at udel.edu Tue Jun 13 18:09:23 2017
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 13 Jun 2017 18:09:23 -0400
Subject: [Python-ideas] Restore the __members__ behavior to python3 for C extension writers
In-Reply-To: References: Message-ID:

On 6/13/2017 2:42 PM, Barry Scott wrote:
> I have been trying to get dir(c_ext_obj) to work for PyCXX as the method used with
> python2 was removed in python3, namely use the list of names returned from
> the __members__ attribute.
__members__ was deprecated about 15 years ago and gone from the stdlib before 2.7. The 2.7 doc says "object.__members__ Deprecated since version 2.2: Use the built-in function dir() to get a list of an object's attributes. This attribute is no longer available." Ditto for __methods__. These were pre-2.2, pre type-class unification hacks. Only a few builtin types had a .__members__ for non-function data attributes.

> I have failed to find a simple replacement for the python3 API.
>
> It seems that I have implement __dir__.
> But to do that I need to know what dir() will return and add the member variables to the answer.

dir includes __methods__ + __members__.

> I have been able to figure out what is necessary to write such code.
> No one on python users or python dev responded to an earlier query on this subject.

Framed in terms of something so ancient, the question makes no sense to most people. If you ask again, don't refer to __members__.

> Would it be possible to simply put back the support for the __members__ attribute in python3?

No. Perhaps you are looking for __dir__, called by dir().

" dir([object])

Without arguments, return the list of names in the current local scope. With an argument, attempt to return a list of valid attributes for that object.

If the object has a method named __dir__(), this method will be called and must return the list of attributes."

--
Terry Jan Reedy

From mahmoud at hatnote.com Tue Jun 13 18:10:01 2017
From: mahmoud at hatnote.com (Mahmoud Hashemi)
Date: Tue, 13 Jun 2017 15:10:01 -0700
Subject: [Python-ideas] ImportError raised for a circular import
In-Reply-To: References: <6A4A2E1B-73D3-46D1-A27A-925772214284@barrys-emacs.org>
Message-ID:

I didn't interpret the initial email as wanting an error on *all* circular imports. Merely those which are unresolvable.
I've definitely helped people diagnose circular imports and wished there was an error that called that out programmatically, even if it's just a string admonition to check for circular imports, appended to the ImportError message.

On Tue, Jun 13, 2017 at 1:43 PM, Chris Angelico wrote:
> On Wed, Jun 14, 2017 at 6:35 AM, Barry wrote:
> > On 13 Jun 2017, at 20:13, Antoine Rozo wrote:
> >
> > But circular imports are sometimes needed in modules.
> > For example when you have two classes in two different modules that
> > reference each other in their methods (and because you can't pre-declare
> > classes like in C++).
> >
> > Really? It has always been a strong sign of a design bug in all the cases I
> > have ever seen.
> > The example you suggest always fails when I accidentally write it.
> >
> > Pylint will certainly shout loud that this case is an error.
>
> Depends on your definition of "circular". Consider this:
>
> # __init__.py
> from flask import Flask
> app = Flask(__name__)
> from . import views
>
> # views.py
> from . import app
> @app.route("/")
> def home():
>     ...
>
> Technically this is circular. During the loading of __init__, views
> will be imported, which then imports something from __init__. But it's
> perfectly well-defined (there's no way that views will ever be the
> first one imported, per the rules of packages) and it makes good
> sense. An error on circular imports, or even a warning, would be very
> annoying here.
>
> ChrisA
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rosuav at gmail.com Tue Jun 13 18:36:08 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 14 Jun 2017 08:36:08 +1000
Subject: [Python-ideas] ImportError raised for a circular import
In-Reply-To: References: <6A4A2E1B-73D3-46D1-A27A-925772214284@barrys-emacs.org>
Message-ID:

On Wed, Jun 14, 2017 at 8:10 AM, Mahmoud Hashemi wrote:
> I didn't interpret the initial email as wanting an error on *all* circular
> imports. Merely those which are unresolvable. I've definitely helped people
> diagnose circular imports and wished there was an error that called that out
> programmatically, even if it's just a string admonition to check for
> circular imports, appended to the ImportError message.

Oh! That could be interesting. How about a traceback in the import chain?

# a.py
import b
q = 1

# b.py
import c

# c.py
from a import q

c.py will trigger an ImportError, but that could say something like... oh look:

$ python3 a.py
Traceback (most recent call last):
  File "a.py", line 1, in <module>
    import b
  File "/home/rosuav/tmp/asdf/b.py", line 1, in <module>
    import c
  File "/home/rosuav/tmp/asdf/c.py", line 1, in <module>
    from a import q
ImportError: cannot import name 'q' from 'a' (/home/rosuav/tmp/asdf/a.py)

Already happens :)

A bit harder, but also possible, would be to have an AttributeError on a module recognize that an import is happening, and report a possible circular import. That'd take some engineering, but it would be helpful. That'd catch cases like:

# c.py
import a
print(a.q)

Is that what you're looking for?

ChrisA

From boehm.matthew at gmail.com Tue Jun 13 18:38:56 2017
From: boehm.matthew at gmail.com (Matt)
Date: Tue, 13 Jun 2017 18:38:56 -0400
Subject: [Python-ideas] ImportError raised for a circular import
In-Reply-To: References: <6A4A2E1B-73D3-46D1-A27A-925772214284@barrys-emacs.org>
Message-ID:

I've also been thinking about this lately. I can remember being confused the first time I saw "ImportError: cannot import name X".
As there are multiple things that can cause this error, it took me a while to find a stackoverflow post that suggested that this might be due to circular imports. After learning this, I still had to read a few sources to understand what circular imports were and how to fix the problem. A quick stackoverflow search reveals that people frequently have questions about this error message:

https://www.google.com/search?q=stackoverflow+python+import+error&oq=stackoverflow+python+import+error&aqs=chrome..69i57j0j69i64.6423j0j7&sourceid=chrome&ie=UTF-8#q=site:stackoverflow.com+python+importerror+%22cannot+import+name%22

At the very least, it would be nice if the error message could differentiate between different causes for this error. Ideally, however, I'd love it if, for circular imports, it included text on what they are and how to resolve them.

On Tue, Jun 13, 2017 at 6:10 PM, Mahmoud Hashemi wrote:
> I didn't interpret the initial email as wanting an error on *all* circular
> imports. Merely those which are unresolvable. I've definitely helped people
> diagnose circular imports and wished there was an error that called that
> out programmatically, even if it's just a string admonition to check for
> circular imports, appended to the ImportError message.
>
> On Tue, Jun 13, 2017 at 1:43 PM, Chris Angelico wrote:
>
>> On Wed, Jun 14, 2017 at 6:35 AM, Barry wrote:
>> > On 13 Jun 2017, at 20:13, Antoine Rozo wrote:
>> >
>> > But circular imports are sometimes needed in modules.
>> > For example when you have two classes in two different modules that
>> > reference each other in their methods (and because you can't pre-declare
>> > classes like in C++).
>> >
>> > Really? It has always been a strong sign of a design bug in all the
>> cases I
>> > have ever seen.
>> > The example you suggest always fails when I accidentally write it.
>> >
>> > Pylint will certainly shout loud that this case is an error.
>>
>> Depends on your definition of "circular".
>> Consider this:
>>
>> # __init__.py
>> from flask import Flask
>> app = Flask(__name__)
>> from . import views
>>
>> # views.py
>> from . import app
>> @app.route("/")
>> def home():
>>     ...
>>
>> Technically this is circular. During the loading of __init__, views
>> will be imported, which then imports something from __init__. But it's
>> perfectly well-defined (there's no way that views will ever be the
>> first one imported, per the rules of packages) and it makes good
>> sense. An error on circular imports, or even a warning, would be very
>> annoying here.
>>
>> ChrisA
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rosuav at gmail.com Tue Jun 13 18:49:43 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 14 Jun 2017 08:49:43 +1000
Subject: [Python-ideas] Restore the __members__ behavior to python3 for C extension writers
In-Reply-To: References: Message-ID:

On Wed, Jun 14, 2017 at 8:09 AM, Terry Reedy wrote:
> Perhaps you are looking for __dir__, called by dir().
>
> " dir([object])
>
> Without arguments, return the list of names in the current local scope.
> With an argument, attempt to return a list of valid attributes for that
> object.
>
> If the object has a method named __dir__(), this method will be called
> and must return the list of attributes."

AIUI the OP is looking to implement __dir__, but make use of *what dir() would have returned* in that function.
Something like:

class Magic:
    def __getattr__(self, attr):
        if attr in self.generatables:
            return self.generated_value(attr)
        raise AttributeError

    def __dir__(self):
        return default_dir(self) + self.generatables

For that purpose, is it possible to use super().__dir__()? Are there any considerations where that would fail?

ChrisA

From mahmoud at hatnote.com Tue Jun 13 18:49:54 2017
From: mahmoud at hatnote.com (Mahmoud Hashemi)
Date: Tue, 13 Jun 2017 15:49:54 -0700
Subject: [Python-ideas] ImportError raised for a circular import
In-Reply-To: References: <6A4A2E1B-73D3-46D1-A27A-925772214284@barrys-emacs.org>
Message-ID:

Oh I know the traceback, I've had many brought to my desk by a confused junior dev, looking a lot like yours truly a few years back. :)

My only problem with it is that it makes people look at "a.py". And if you look at "a.py", you'll see there's a "q" there. While most of us on this list will check for the circular import, it's quite a bit of headscratching as to why Python can't find the "q", when it's right there in the file. Calling out the circular import possibility explicitly makes people look at the _whole_ stack trace, and even if real stack traces are quite a bit longer, they'll probably make the connection.

The AttributeError idea is definitely interesting because it's also a major player in circular import confusions. I think it's an ambitious idea, and would be very exciting if it were implemented.

On Tue, Jun 13, 2017 at 3:36 PM, Chris Angelico wrote:
> On Wed, Jun 14, 2017 at 8:10 AM, Mahmoud Hashemi
> wrote:
> > I didn't interpret the initial email as wanting an error on *all*
> circular
> > imports. Merely those which are unresolvable. I've definitely helped
> people
> > diagnose circular imports and wished there was an error that called that
> out
> > programmatically, even if it's just a string admonition to check for
> > circular imports, appended to the ImportError message.
>
> Oh! That could be interesting.
> How about a traceback in the import chain?
>
> # a.py
> import b
> q = 1
>
> # b.py
> import c
>
> # c.py
> from a import q
>
> c.py will trigger an ImportError, but that could say something like... oh
> look:
>
> $ python3 a.py
> Traceback (most recent call last):
>   File "a.py", line 1, in <module>
>     import b
>   File "/home/rosuav/tmp/asdf/b.py", line 1, in <module>
>     import c
>   File "/home/rosuav/tmp/asdf/c.py", line 1, in <module>
>     from a import q
> ImportError: cannot import name 'q' from 'a' (/home/rosuav/tmp/asdf/a.py)
>
> Already happens :)
>
> A bit harder, but also possible, would be to have an AttributeError on
> a module recognize that an import is happening, and report a possible
> circular import. That'd take some engineering, but it would be
> helpful. That'd catch cases like:
>
> # c.py
> import a
> print(a.q)
>
> Is that what you're looking for?
>
> ChrisA
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ncoghlan at gmail.com Tue Jun 13 22:41:13 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 14 Jun 2017 12:41:13 +1000
Subject: [Python-ideas] ImportError raised for a circular import
In-Reply-To: References: <6A4A2E1B-73D3-46D1-A27A-925772214284@barrys-emacs.org>
Message-ID:

On 14 June 2017 at 08:49, Mahmoud Hashemi wrote:
> Oh I know the traceback, I've had many brought to my desk by a confused
> junior dev, looking a lot like yours truly a few years back.
:) Something worth noting is that as of 3.7, all circular imports that actually *are* resolvable at runtime will be resolved: https://bugs.python.org/issue30024 However, that only impacts submodules where the submodule entry exists in sys.modules, but the name hasn't been bound in the parent module yet - it doesn't help with module level attributes that would be defined eventually, but we're still too early in the module's import process for them to exist yet. As Chris pointed out, there are two key points of name resolution to take into account for those cases: * ModuleType.__getattr__ ("import a; a.q") * from_list processing in the import system ("from a import q") Since the import system already keeps track of "currently in progress imports" to manage the per-module import locks, both of those could potentially be updated to query _frozen_importlib._module_locks to find out if the source module was currently in the process of being imported and raise a new CircularImportError that inherited from both AttributeError and ImportError when that was the case. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From mahmoud at hatnote.com Tue Jun 13 23:02:38 2017 From: mahmoud at hatnote.com (Mahmoud Hashemi) Date: Tue, 13 Jun 2017 20:02:38 -0700 Subject: [Python-ideas] ImportError raised for a circular import In-Reply-To: References: <6A4A2E1B-73D3-46D1-A27A-925772214284@barrys-emacs.org> Message-ID: That would be amazing! If there's anything I can do to help make that happen, please let me know. It'll almost certainly save that much time for me alone down the line, anyway :) On Tue, Jun 13, 2017 at 7:41 PM, Nick Coghlan wrote: > On 14 June 2017 at 08:49, Mahmoud Hashemi wrote: > > Oh I know the traceback, I've had many brought to my desk by a confused > > junior dev, looking a lot like yours truly a few years back. 
:) > > Something worth noting is that as of 3.7, all circular imports that > actually *are* resolvable at runtime will be resolved: > https://bugs.python.org/issue30024 > > However, that only impacts submodules where the submodule entry exists > in sys.modules, but the name hasn't been bound in the parent module > yet - it doesn't help with module level attributes that would be > defined eventually, but we're still too early in the module's import > process for them to exist yet. > > As Chris pointed out, there are two key points of name resolution to > take into account for those cases: > > * ModuleType.__getattr__ ("import a; a.q") > * from_list processing in the import system ("from a import q") > > Since the import system already keeps track of "currently in progress > imports" to manage the per-module import locks, both of those could > potentially be updated to query _frozen_importlib._module_locks to > find out if the source module was currently in the process of being > imported and raise a new CircularImportError that inherited from both > AttributeError and ImportError when that was the case. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Jun 14 02:33:27 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 14 Jun 2017 16:33:27 +1000 Subject: [Python-ideas] ImportError raised for a circular import In-Reply-To: References: <6A4A2E1B-73D3-46D1-A27A-925772214284@barrys-emacs.org> Message-ID: On 14 June 2017 at 13:02, Mahmoud Hashemi wrote: > That would be amazing! If there's anything I can do to help make that > happen, please let me know. 
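The failure mode under discussion — a from-import reading an attribute
that a partially initialised module has not bound yet — is easy to
reproduce end to end. A minimal self-contained sketch (module names
follow the thread's examples; the main.py driver script is an
assumption), showing the status quo that a dedicated CircularImportError
would improve on:

```python
import os
import subprocess
import sys
import tempfile

# a.py imports b before binding q; b.py then does "from a import q"
# while module a is still only partially initialised, so it fails.
modules = {
    "main.py": "import a\n",
    "a.py": "import b\nq = 1\n",
    "b.py": "from a import q\n",
}

with tempfile.TemporaryDirectory() as tmpdir:
    for name, source in modules.items():
        with open(os.path.join(tmpdir, name), "w") as f:
            f.write(source)
    result = subprocess.run(
        [sys.executable, "main.py"],
        cwd=tmpdir, capture_output=True, text=True,
    )

print(result.returncode != 0)                     # True
print("cannot import name 'q'" in result.stderr)  # True
```

Recent interpreters already append a hint about partial initialisation
to the message; the error type is still plain ImportError.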
It'll almost certainly save that much time for > me alone down the line, anyway :) The `IMPORT_FROM` opcode's error handling would probably be the best place to start poking around: https://github.com/python/cpython/blob/master/Python/ceval.c#L5055 If you can prove the concept there, that would: 1. Directly handle the "from x import y" and "import x.y as name" cases 2. Provide a starting point for factoring out a "report missing module attribute" helper that could be shared with ModuleType As an example of querying _frozen_importlib state from C code, I'd point to https://github.com/python/cpython/blob/master/Python/import.c#L478 Cheers, Nick. P.S. I also double checked that ImportError & AttributeError have compatible binary layouts, so dual inheritance from them works :) -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Wed Jun 14 03:59:25 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 14 Jun 2017 08:59:25 +0100 Subject: [Python-ideas] ImportError raised for a circular import In-Reply-To: References: <6A4A2E1B-73D3-46D1-A27A-925772214284@barrys-emacs.org> Message-ID: On 13 June 2017 at 23:36, Chris Angelico wrote: > On Wed, Jun 14, 2017 at 8:10 AM, Mahmoud Hashemi wrote: >> I didn't interpret the initial email as wanting an error on *all* circular >> imports. Merely those which are unresolvable. I've definitely helped people >> diagnose circular imports and wished there was an error that called that out >> programmatically, even if it's just a string admonition to check for >> circular imports, appended to the ImportError message. > > Oh! That could be interesting. How about a traceback in the import chain? I have a feeling that mypy might flag circular imports. I've not used mypy myself, but I just saw the output from a project where we enabled very basic use of mypy (no type hints at all, yet) and saw an error reported about a circular import. 
So with suitable configuration, mypy could help here (and may lead to other benefits if you want to use more of its capabilities). OTOH, having the interpreter itself flag that it had got stuck in an import loop with an explicit message explaining the problem sounds like a reasonable idea. Paul From levkivskyi at gmail.com Wed Jun 14 04:26:42 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Wed, 14 Jun 2017 10:26:42 +0200 Subject: [Python-ideas] ImportError raised for a circular import In-Reply-To: References: <6A4A2E1B-73D3-46D1-A27A-925772214284@barrys-emacs.org> Message-ID: On 14 June 2017 at 09:59, Paul Moore wrote: > On 13 June 2017 at 23:36, Chris Angelico wrote: > > On Wed, Jun 14, 2017 at 8:10 AM, Mahmoud Hashemi > wrote: > >> I didn't interpret the initial email as wanting an error on *all* > circular > >> imports. Merely those which are unresolvable. I've definitely helped > people > >> diagnose circular imports and wished there was an error that called > that out > >> programmatically, even if it's just a string admonition to check for > >> circular imports, appended to the ImportError message. > > > > Oh! That could be interesting. How about a traceback in the import chain? > > I have a feeling that mypy might flag circular imports. I've not used > mypy myself, but I just saw the output from a project where we enabled > very basic use of mypy (no type hints at all, yet) and saw an error > reported about a circular import. So with suitable configuration, mypy > could help here (and may lead to other benefits if you want to use > more of its capabilities). > Mypy doesn't always flag invalid circular imports, there is an old issue about this, see https://github.com/python/mypy/issues/61 But yes, one gets many other benefits like static type checking (including checking the types of things imported in a circular manner). -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From arj.python at gmail.com Wed Jun 14 05:17:16 2017 From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer) Date: Wed, 14 Jun 2017 13:17:16 +0400 Subject: [Python-ideas] ImportError raised for a circular import In-Reply-To: References: Message-ID: anyways circular imports seem to be making people go for node.js rather than using python . . . Abdur-Rahmaan Janhangeer, Mauritius abdurrahmaanjanhangeer.wordpress.com On 13 Jun 2017 23:04, "Barry Scott" wrote: > Recently I fell into the trap of creating a circular import and yet again > it took time to figure out what was wrong. > > I'm wondering why the python import code does not detect this error and > raise an exception. > > I took a look at the code and got as far as figuring out that I would need > to add the detection to the > python 3 import code. Unless I missed something I cannot get the detection > without > modifying the core code as I could see no way to hook the process cleanly. > > Is it reasonable idea to add this detection to python? > > I am willing to work on a patch. > > Barry > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephanh42 at gmail.com Wed Jun 14 06:16:14 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Wed, 14 Jun 2017 12:16:14 +0200 Subject: [Python-ideas] ImportError raised for a circular import In-Reply-To: References: Message-ID: Huh? The semantics of node.js seem to be completely similar to Python in this respect. In both cases the circular import works if you go through the mutable module object but fails if both modules circularly try to import module members directly. Stephan Op 14 jun. 2017 11:17 a.m. 
schreef "Abdur-Rahmaan Janhangeer" < arj.python at gmail.com>: > anyways circular imports seem to be making people go for node.js rather > than using python . . . > > Abdur-Rahmaan Janhangeer, > Mauritius > abdurrahmaanjanhangeer.wordpress.com > > On 13 Jun 2017 23:04, "Barry Scott" wrote: > >> Recently I fell into the trap of creating a circular import and yet again >> it took time to figure out what was wrong. >> >> I'm wondering why the python import code does not detect this error and >> raise an exception. >> >> I took a look at the code and got as far as figuring out that I would >> need to add the detection to the >> python 3 import code. Unless I missed something I cannot get the >> detection without >> modifying the core code as I could see no way to hook the process cleanly. >> >> Is it reasonable idea to add this detection to python? >> >> I am willing to work on a patch. >> >> Barry >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at barrys-emacs.org Wed Jun 14 16:54:11 2017 From: barry at barrys-emacs.org (Barry Scott) Date: Wed, 14 Jun 2017 21:54:11 +0100 Subject: [Python-ideas] Restore the __members__ behavior to python3 for C extension writers In-Reply-To: References: Message-ID: > On 13 Jun 2017, at 23:49, Chris Angelico wrote: > > On Wed, Jun 14, 2017 at 8:09 AM, Terry Reedy wrote: >> Perhaps you are looking for __dir__, called by dir(). >> >> " dir([object]) >> >> Without arguments, return the list of names in the current local scope. 
>> With an argument, attempt to return a list of valid attributes for that >> object. >> >> If the object has a method named __dir__(), this method will be called >> and must return the list of attributes." > > AIUI the OP is looking to implement __dir__, but make use of *what > dir() would have returned* in that function. Something like: Yes. > > class Magic: > def __getattr__(self, attr): > if attr in self.generatables: > return self.generated_value(attr) > raise AttributeError > def __dir__(self): > return default_dir(self) + self.generatables > > For that purpose, is it possible to use super().__dir__()? Are there > any considerations where that would fail? Remember that I need to do this in the C API and I want default_dir of self in C not python. super().__dir__ looks at the class above me that is typically object() and so is not useful as it does not list the member function from my class or __mro__ or other stuff I may not be aware of that is important to return. Today I solve the problem in 2.7 C extension code by providing a value for __members__. In python3 I have no idea how to do this in C. I can find no example code that addresses this problem. How am I supposed to code this without the __members__ trick? Did I miss the C API that implements default_dir(self)? 
Barry > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From rosuav at gmail.com Wed Jun 14 19:51:09 2017 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 15 Jun 2017 09:51:09 +1000 Subject: [Python-ideas] Restore the __members__ behavior to python3 for C extension writers In-Reply-To: References: Message-ID: On Thu, Jun 15, 2017 at 6:54 AM, Barry Scott wrote: >> class Magic: >> def __getattr__(self, attr): >> if attr in self.generatables: >> return self.generated_value(attr) >> raise AttributeError >> def __dir__(self): >> return default_dir(self) + self.generatables >> >> For that purpose, is it possible to use super().__dir__()? Are there >> any considerations where that would fail? > > Remember that I need to do this in the C API and I want default_dir of self in C not python. > > super().__dir__ looks at the class above me that is typically object() and so is not useful > as it does not list the member function from my class or __mro__ or other stuff I may not be aware of > that is important to return. Right, thank you. That's the bit I wasn't aware of - the reason that __dir__ can't be used. So in terms of defining the problem, this is a solid piece of info. ChrisA From ethan at stoneleaf.us Wed Jun 14 20:38:36 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 14 Jun 2017 17:38:36 -0700 Subject: [Python-ideas] Restore the __members__ behavior to python3 for C extension writers In-Reply-To: References: Message-ID: <5941D70C.4090802@stoneleaf.us> On 06/14/2017 01:54 PM, Barry Scott wrote: > super().__dir__ looks at the class above me that is typically object() and so is not useful > as it does not list the member function from my class or __mro__ or other stuff I may not be aware of > that is important to return. 
__dir__ should return whatever you think is interesting about your object. It does not have to return everything, and in fact makes no guarantees that it will return everything. Enum's __dir__, for example, only returns a handful of entries. -- ~Ethan~ From njs at pobox.com Wed Jun 14 21:06:02 2017 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 14 Jun 2017 18:06:02 -0700 Subject: [Python-ideas] Restore the __members__ behavior to python3 for C extension writers In-Reply-To: References: Message-ID: On Wed, Jun 14, 2017 at 1:54 PM, Barry Scott wrote: > > On 13 Jun 2017, at 23:49, Chris Angelico wrote: > > For that purpose, is it possible to use super().__dir__()? Are there > > any considerations where that would fail? > > Remember that I need to do this in the C API and I want default_dir of self in C not python. > > super().__dir__ looks at the class above me that is typically object() and so is not useful > as it does not list the member function from my class or __mro__ or other stuff I may not be aware of > that is important to return. object.__dir__(your_class_instance) should generally return everything you would get if you didn't override __dir__ at all. Remember, that code doesn't mean "return the methods and attributes defined on the object class", it's "run the object class's __dir__ method with self=your_class_instance". I don't know off-hand if there's a nicer way to do this from C than to manually look up the "__dir__" attribute on PyBaseObject_Type. (And of course if you wanted to get really fancy and handle cases where your object inherits from some type other than 'object', or where some user sticks your type into a multiple-inheritance hierarchy, you might potentially want to find "__dir__" using super lookup instead of going directly to PyBaseObject_Type. From a quick google it looks like this page gives an approach for doing that: https://pythonextensionpatterns.readthedocs.io/en/latest/super_call.html) -n -- Nathaniel J. 
Smith -- https://vorpus.org

From ncoghlan at gmail.com  Wed Jun 14 23:45:13 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 15 Jun 2017 13:45:13 +1000
Subject: [Python-ideas] Restore the __members__ behavior to python3 for C extension writers
In-Reply-To: 
References: 
Message-ID: 

On 15 June 2017 at 11:06, Nathaniel Smith wrote:
> On Wed, Jun 14, 2017 at 1:54 PM, Barry Scott wrote:
>> > On 13 Jun 2017, at 23:49, Chris Angelico wrote:
>> > For that purpose, is it possible to use super().__dir__()? Are there
>> > any considerations where that would fail?
>>
>> Remember that I need to do this in the C API and I want default_dir of
>> self in C not python.
>>
>> super().__dir__ looks at the class above me that is typically object()
>> and so is not useful as it does not list the member function from my
>> class or __mro__ or other stuff I may not be aware of that is important
>> to return.
>
> object.__dir__(your_class_instance) should generally return everything
> you would get if you didn't override __dir__ at all. Remember, that
> code doesn't mean "return the methods and attributes defined on the
> object class", it's "run the object class's __dir__ method with
> self=your_class_instance".
>
> I don't know off-hand if there's a nicer way to do this from C than to
> manually look up the "__dir__" attribute on PyBaseObject_Type.

This is the kind of case where
https://docs.python.org/3/c-api/object.html#c.PyObject_CallMethod is
useful:

    dir_result = PyObject_CallMethod(base_type, "__dir__", "O", self);
    /* Add any additional attributes to the dir_result list */
    return dir_result;

Fully supporting multiple inheritance is more work (as your link
shows), and often not needed.

Cheers,
Nick.
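The same lookup is easy to sanity-check from pure Python before wiring
it into C; a minimal runnable sketch (the Magic class and its
generatables list are hypothetical, echoing the earlier example in this
thread):

```python
class Magic:
    # Hypothetical dynamically generated attribute names.
    generatables = ["alpha", "beta"]

    def __getattr__(self, attr):
        if attr in self.generatables:
            return "generated:" + attr
        raise AttributeError(attr)

    def __dir__(self):
        # object.__dir__(self) is the Python form of the C-level
        # base_type.__dir__ call: the default listing for *this*
        # instance (class attributes, MRO contents, instance dict),
        # not just object's own attributes.
        return sorted(set(object.__dir__(self)) | set(self.generatables))

m = Magic()
print("alpha" in dir(m))        # True: generated names are listed
print("__getattr__" in dir(m))  # True: the default entries survive
print(m.beta)                   # generated:beta
```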
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From spencerb21 at live.com Thu Jun 15 05:47:31 2017 From: spencerb21 at live.com (Spencer Brown) Date: Thu, 15 Jun 2017 09:47:31 +0000 Subject: [Python-ideas] Restore the __members__ behavior to python3 for C extension writers In-Reply-To: References: , Message-ID: Maybe it would make sense to implement a C-API function to perform a super() lookup, without all the contortions needed currently. It seems wasteful to have to make a super object, use it then immediately discard all the time. However the current logic is entangled into the type a fair bit, so that might take some work. Spencer Brown > On 15 Jun 2017, at 1:46 pm, Nick Coghlan wrote: > >> On 15 June 2017 at 11:06, Nathaniel Smith wrote: >> On Wed, Jun 14, 2017 at 1:54 PM, Barry Scott wrote: >>>> On 13 Jun 2017, at 23:49, Chris Angelico wrote: >>>> For that purpose, is it possible to use super().__dir__()? Are there >>>> any considerations where that would fail? >>> >>> Remember that I need to do this in the C API and I want default_dir of self in C not python. >>> >>> super().__dir__ looks at the class above me that is typically object() and so is not useful >>> as it does not list the member function from my class or __mro__ or other stuff I may not be aware of >>> that is important to return. >> >> object.__dir__(your_class_instance) should generally return everything >> you would get if you didn't override __dir__ at all. Remember, that >> code doesn't mean "return the methods and attributes defined on the >> object class", it's "run the object class's __dir__ method with >> self=your_class_instance". >> >> I don't know off-hand if there's a nicer way to do this from C than to >> manually look up the "__dir__" attribute on PyBaseObject_Type. 
> > This is the kind of case where > https://docs.python.org/3/c-api/object.html#c.PyObject_CallMethod is > useful: > > dir_result = PyObject_CallMethod(base_type, "__dir__", "O", self); > /* Add any additional attributes to the dir_result list */ > return dir_result; > > Fully supporting multiple inheritance is more work (as your link > shows), and often not needed. > > Cheers, > Nick. From njs at pobox.com Thu Jun 15 18:15:22 2017 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 15 Jun 2017 15:15:22 -0700 Subject: [Python-ideas] Restore the __members__ behavior to python3 for C extension writers In-Reply-To: References: Message-ID: On Thu, Jun 15, 2017 at 2:44 PM, Barry Scott wrote: > > On 15 Jun 2017, at 04:45, Nick Coghlan wrote: >> dir_result = PyObject_CallMethod(base_type, "__dir__", "O", self); >> /* Add any additional attributes to the dir_result list */ >> return dir_result; > > > But I need the result of __dir__ for my object not its base. Yes, that's what that code should give you. Try it :-) -n -- Nathaniel J. Smith -- https://vorpus.org From barry at barrys-emacs.org Thu Jun 15 17:44:48 2017 From: barry at barrys-emacs.org (Barry Scott) Date: Thu, 15 Jun 2017 22:44:48 +0100 Subject: [Python-ideas] Restore the __members__ behavior to python3 for C extension writers In-Reply-To: References: Message-ID: > On 15 Jun 2017, at 04:45, Nick Coghlan wrote: > > On 15 June 2017 at 11:06, Nathaniel Smith > wrote: >> On Wed, Jun 14, 2017 at 1:54 PM, Barry Scott wrote: >>>> On 13 Jun 2017, at 23:49, Chris Angelico wrote: >>>> For that purpose, is it possible to use super().__dir__()? Are there >>>> any considerations where that would fail? >>> >>> Remember that I need to do this in the C API and I want default_dir of self in C not python. 
>>> >>> super().__dir__ looks at the class above me that is typically object() and so is not useful >>> as it does not list the member function from my class or __mro__ or other stuff I may not be aware of >>> that is important to return. >> >> object.__dir__(your_class_instance) should generally return everything >> you would get if you didn't override __dir__ at all. Remember, that >> code doesn't mean "return the methods and attributes defined on the >> object class", it's "run the object class's __dir__ method with >> self=your_class_instance". >> >> I don't know off-hand if there's a nicer way to do this from C than to >> manually look up the "__dir__" attribute on PyBaseObject_Type. > > This is the kind of case where > https://docs.python.org/3/c-api/object.html#c.PyObject_CallMethod is > useful: > > dir_result = PyObject_CallMethod(base_type, "__dir__", "O", self); > /* Add any additional attributes to the dir_result list */ > return dir_result; But I need the result of __dir__ for my object not its base. Then I need to add in the list of member attributes that are missing because python itself has no knowledge of them they are accessed via getattr(). Now I cannot call __dir__ for my object in the implementation of __dir__ for the obvious reason. Today all classes defined using PyCXX for python3 cannot return the list of member variables via dir(obj). Where as in python2 I can make dir() work. > Fully supporting multiple inheritance is more work (as your link > shows), and often not needed. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Fri Jun 16 04:46:05 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 16 Jun 2017 18:46:05 +1000 Subject: [Python-ideas] Restore the __members__ behavior to python3 for C extension writers In-Reply-To: References: Message-ID: On 16 June 2017 at 07:44, Barry Scott wrote: > But I need the result of __dir__ for my object not its base. Then I need to > add in the list of member attributes that are missing because python > itself has no knowledge of them they are accessed via getattr(). The C code: dir_result = PyObject_CallMethod(base_type, "__dir__", "O", self); is roughly equivalent to the Python code: dir_result = BaseType.__dir__(self) That is, it's calling the base type's __dir__ method, but it's still using the subclass *instance*. It's the same pattern people use to call a base type's __getattr__ or __getattribute__ for the subclass implementation of those methods, just without multiple inheritance support (since calling super() from C is painful). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tjol at tjol.eu Fri Jun 16 13:36:23 2017 From: tjol at tjol.eu (Thomas Jollans) Date: Fri, 16 Jun 2017 19:36:23 +0200 Subject: [Python-ideas] Define __fspath__() for NamedTemporaryFile and TemporaryDirectory In-Reply-To: References: Message-ID: <7af7b19c-dc85-3fa0-1308-ab00b0c82d2b@tjol.eu> On 08/06/17 15:42, Antoine Pietri wrote: > Hello everyone! > > A very common pattern when dealing with temporary files is code like this: > > with tempfile.TemporaryDirectory() as tmpdir: > tmp_path = tmpdir.name > > os.chmod(tmp_path) > os.foobar(tmp_path) > open(tmp_path).read(barquux) Is it? py> import tempfile py> with tempfile.TemporaryDirectory() as tmpdir: ... print(tmpdir, type(tmpdir)) ... /tmp/tmp2kiqzmi9 py> > > PEP 519 (https://www.python.org/dev/peps/pep-0519/) introduced the > concept of "path-like objects", objects that define a __fspath__() > method. 
Most of the standard library has been adapted so that the
> functions accept path-like objects.
>
> My proposal is to define __fspath__() for TemporaryDirectory and
> NamedTemporaryFile so that we can pass those directly to the library
> functions instead of having to use the .name attribute explicitly.
>
> Thoughts? :-)
>

--
Thomas Jollans
m: +31 6 42630259
e: tjol at tjol.eu

From ethan at stoneleaf.us  Fri Jun 16 16:21:42 2017
From: ethan at stoneleaf.us (Ethan Furman)
Date: Fri, 16 Jun 2017 13:21:42 -0700
Subject: [Python-ideas] Define __fspath__() for NamedTemporaryFile and TemporaryDirectory
In-Reply-To: <7af7b19c-dc85-3fa0-1308-ab00b0c82d2b@tjol.eu>
References: <7af7b19c-dc85-3fa0-1308-ab00b0c82d2b@tjol.eu>
Message-ID: <59443DD6.7010608@stoneleaf.us>

On 06/16/2017 10:36 AM, Thomas Jollans wrote:
> On 08/06/17 15:42, Antoine Pietri wrote:
>> Hello everyone!
>>
>> A very common pattern when dealing with temporary files is code like this:
>>
>> with tempfile.TemporaryDirectory() as tmpdir:
>>     tmp_path = tmpdir.name
>>
>>     os.chmod(tmp_path)
>>     os.foobar(tmp_path)
>>     open(tmp_path).read(barquux)
>
> Is it?
>
> py> import tempfile
> py> with tempfile.TemporaryDirectory() as tmpdir:
> ...     print(tmpdir, type(tmpdir))
> ...
> /tmp/tmp2kiqzmi9 <class 'str'>
> py>

Interesting... on 3.4 and 3.5 I get:

--> import tempfile
--> tempfile.TemporaryDirectory()
<TemporaryDirectory '/tmp/tmp...'>
--> with tempfile.TemporaryDirectory() as tmpdir:
...     tmpdir
...
'/tmp/tmpo63icqfe'

So a TemporaryDirectory object if used directly, and a string if used as
a context manager.  I don't have a copy of 3.6 nor the future 3.7 handy,
so maybe it changed there?
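Both behaviours are easy to confirm against a current interpreter; a
quick runnable check of the existing tempfile API (no proposed changes
involved, only what ships today):

```python
import os
import tempfile

# Used directly you get the TemporaryDirectory object and read .name;
# used as a context manager, __enter__ hands back the path string.
td = tempfile.TemporaryDirectory()
print(type(td).__name__)       # TemporaryDirectory
print(os.path.isdir(td.name))  # True
td.cleanup()
print(os.path.isdir(td.name))  # False

with tempfile.TemporaryDirectory() as path:
    existed_inside = os.path.isdir(path)

print(isinstance(path, str))   # True
print(existed_inside)          # True
print(os.path.isdir(path))     # False: removed on exit
```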
-- ~Ethan~ From abrault at mapgears.com Fri Jun 16 16:37:39 2017 From: abrault at mapgears.com (Alexandre Brault) Date: Fri, 16 Jun 2017 16:37:39 -0400 Subject: [Python-ideas] Define __fspath__() for NamedTemporaryFile and TemporaryDirectory In-Reply-To: <59443DD6.7010608@stoneleaf.us> References: <7af7b19c-dc85-3fa0-1308-ab00b0c82d2b@tjol.eu> <59443DD6.7010608@stoneleaf.us> Message-ID: On 2017-06-16 04:21 PM, Ethan Furman wrote: > On 06/16/2017 10:36 AM, Thomas Jollans wrote: >> On 08/06/17 15:42, Antoine Pietri wrote: >>> Hello everyone! >>> >>> A very common pattern when dealing with temporary files is code like >>> this: >>> >>> with tempfile.TemporaryDirectory() as tmpdir: >>> tmp_path = tmpdir.name >>> >>> os.chmod(tmp_path) >>> os.foobar(tmp_path) >>> open(tmp_path).read(barquux) >> >> Is it? >> >> py> import tempfile >> py> with tempfile.TemporaryDirectory() as tmpdir: >> ... print(tmpdir, type(tmpdir)) >> ... >> /tmp/tmp2kiqzmi9 >> py> > > Interesting... on 3.4 and 3.5 I get: > > --> import tempfile > > --> tempfile.TemporaryDirectory() > > > --> with tempfile.TemporaryDirectory() as tmpdir: > ... tmpdir > ... > '/tmp/tmpo63icqfe' > > So a if used directly, and a if used as a > context manager. I don't have a copy of 3.6 nor the future 3.7 handy, > so maybe it changed there? > > -- > ~Ethan~ The code in master has the context manager return `self.name`. This behaviour has (based on looking at the 3.2 tag where TemporaryDirectory was added) always been used. Alex From ethan at stoneleaf.us Fri Jun 16 17:02:40 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 16 Jun 2017 14:02:40 -0700 Subject: [Python-ideas] Define __fspath__() for NamedTemporaryFile and TemporaryDirectory In-Reply-To: References: <7af7b19c-dc85-3fa0-1308-ab00b0c82d2b@tjol.eu> <59443DD6.7010608@stoneleaf.us> Message-ID: <59444770.8030304@stoneleaf.us> On 06/16/2017 01:37 PM, Alexandre Brault wrote: >> So a if used directly, and a if used as a >> context manager. 
I don't have a copy of 3.6 nor the future 3.7 handy, >> so maybe it changed there? > > The code in master has the context manager return `self.name`. This > behaviour has (based on looking at the 3.2 tag where TemporaryDirectory > was added) always been used. It is an often overlooked fact that a context manager is free to return anything, not just itself. So TemporaryDirectory can create the folder, return the name of the folder, and remove the folder on exit. So no need for the "extract the name from the object" dance. Interestingly enough, that is what the docs say [1]. Pretty cool. -- ~Ethan~ [1] https://docs.python.org/3/library/tempfile.html#tempfile.TemporaryDirectory From python at lucidity.plus.com Fri Jun 16 19:35:42 2017 From: python at lucidity.plus.com (Erik) Date: Sat, 17 Jun 2017 00:35:42 +0100 Subject: [Python-ideas] [Python-Dev] Language proposal: variable assignment in functional context In-Reply-To: References: Message-ID: [cross-posted to python-ideas] Hi Robert, On 16/06/17 12:32, Robert Vanden Eynde wrote: > Hello, I would like to propose an idea for the language but I don't know > where I can talk about it. Can you please explain what the problem is that you are trying to solve? > In a nutshell, I would like to be able to write: > y = (b+2 for b = a + 1) The above is (almost) equivalent to: y = (a+1)+2 I realize the parentheses are not required, but I've included them because if your example mixed operators with different precedence then they might be necessary. Other than binding 'b' (you haven't defined what you expect the scope of that to be, but I'll assume it's the outer scope for now), what is it about the form you're proposing that's different? 
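The reduction is mechanical and easy to verify; a small sketch comparing
the intermediate-binding trick that already works today with the reduced
form (the list L and its values are arbitrary):

```python
L = [1, 2, 3]

# Intent of the proposal: Y = [b+2 for a in L for b = a+1]
# Workaround that already works, binding b once per iteration:
Y_workaround = [b + 2 for a in L for b in [a + 1]]

# Direct reduction of the same expression:
Y_reduced = [(a + 1) + 2 for a in L]

print(Y_workaround)               # [4, 5, 6]
print(Y_workaround == Y_reduced)  # True
```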
> Or in list comprehension: > Y = [b+2 for a in L for b = a+1] > > Which can already be done like this: > Y = [b+2 for a in L for b in [a+1]] Y = [(a+1)+2 for a in L] > Which is less obvious, has a small overhead (iterating over a list) and > get messy with multiple assignment: > Y = [b+c+2 for a in L for b,c in [(a+1,a+2)]] > > New syntax would allow to write: > Y = [b+c+2 for a in L for b,c = (a+1,a+2)] Y = [(a+1)+(a+2)+2 for a in L] > My first example (b+2 for b = a+1) can already be done using ugly syntax > using lambda > > y = (lambda b: b+2)(b=a+1) > y = (lambda b: b+2)(a+1) > y = (lambda b=a+1: b+2)() > > Choice of syntax: for is good because it uses current keyword, and the > analogy for x = 5 vs for x in [5] is natural. > > But the "for" loses the meaning of iteration. > The use of "with" would maybe sound more logical. > > Python already have the "functional if", lambdas, list comprehension, > but not simple assignment functional style. Can you present an example that can't be re-written simply by reducing the expression as I have done above? Regards, E. From tjol at tjol.eu Fri Jun 16 16:31:57 2017 From: tjol at tjol.eu (Thomas Jollans) Date: Fri, 16 Jun 2017 22:31:57 +0200 Subject: [Python-ideas] Define __fspath__() for NamedTemporaryFile and TemporaryDirectory In-Reply-To: <59443DD6.7010608@stoneleaf.us> References: <7af7b19c-dc85-3fa0-1308-ab00b0c82d2b@tjol.eu> <59443DD6.7010608@stoneleaf.us> Message-ID: <1a11e06d-e660-09d9-e616-a3254146b37a@tjol.eu> On 16/06/17 22:21, Ethan Furman wrote: > On 06/16/2017 10:36 AM, Thomas Jollans wrote: >> On 08/06/17 15:42, Antoine Pietri wrote: >>> Hello everyone! >>> >>> A very common pattern when dealing with temporary files is code like >>> this: >>> >>> with tempfile.TemporaryDirectory() as tmpdir: >>> tmp_path = tmpdir.name >>> >>> os.chmod(tmp_path) >>> os.foobar(tmp_path) >>> open(tmp_path).read(barquux) >> >> Is it? >> >> py> import tempfile >> py> with tempfile.TemporaryDirectory() as tmpdir: >> ... 
print(tmpdir, type(tmpdir)) >> ... >> /tmp/tmp2kiqzmi9 <class 'str'> >> py> > > Interesting... on 3.4 and 3.5 I get: > > --> import tempfile > > --> tempfile.TemporaryDirectory() > <TemporaryDirectory '/tmp/...'> > > --> with tempfile.TemporaryDirectory() as tmpdir: > ... tmpdir > ... > '/tmp/tmpo63icqfe' > > So a TemporaryDirectory object if used directly, and a str if used as a > context manager. I don't have a copy of 3.6 nor the future 3.7 handy, > so maybe it changed there? No, this is still the same in py37. The point is that Antoine's code example does not work (in any Python). For NamedTemporaryFile, the situation is different: py> import tempfile py> import os.path py> py> tf1 = tempfile.NamedTemporaryFile() py> tf1.name '/tmp/tmpotcmslpp' py> os.path.exists(tf1.name) True py> with tf1 as tf2: ... print(tf2) ... print(tf2 is tf1) ... <tempfile._TemporaryFileWrapper object at 0x...> True py> os.path.exists(tf1.name) False py> I was wondering about this since the objection that we're dealing with a file object and not a path does not apply for TemporaryDirectory - however, if used "the right way" with a context manager the issue simply doesn't exist. -- Thomas From steve at pearwood.info Fri Jun 16 20:27:56 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 17 Jun 2017 10:27:56 +1000 Subject: [Python-ideas] [Python-Dev] Language proposal: variable assignment in functional context In-Reply-To: References: Message-ID: <20170617002754.GH3149@ando.pearwood.info> Welcome Robert. My response below. Follow-ups to Python-Ideas, thanks. You'll need to subscribe to see any further discussion. On Fri, Jun 16, 2017 at 11:32:19AM +0000, Robert Vanden Eynde wrote: > In a nutshell, I would like to be able to write: > y = (b+2 for b = a + 1) I think this is somewhat similar to a suggestion of Nick Coghlan's. One possible syntax as a statement might be: y = b + 2 given: b = a + 1 https://www.python.org/dev/peps/pep-3150/ In mathematics, I might write: y = b + 2 where b = a + 1 although of course I wouldn't do so for anything so simple.
Here's a better example, the quadratic formula: x = (-b ± √Δ) / 2a, where Δ = b² - 4ac, although even there I'd usually write Δ out in place. > Python already has the "functional if", lambdas, and list comprehensions, > but not simple assignment in functional style. I think you mean "if *expression*" rather than "functional if". The term "functional" in programming usually refers to a particular paradigm: https://en.wikipedia.org/wiki/Functional_programming -- Steve From srkunze at mail.de Sat Jun 17 03:03:54 2017 From: srkunze at mail.de (Sven R. Kunze) Date: Sat, 17 Jun 2017 09:03:54 +0200 Subject: [Python-ideas] Language proposal: variable assignment in functional context In-Reply-To: <20170617002754.GH3149@ando.pearwood.info> References: <20170617002754.GH3149@ando.pearwood.info> Message-ID: <279b1f21-0e16-465e-7e7b-1c3693c77141@mail.de> On 17.06.2017 02:27, Steven D'Aprano wrote: > I think this is somewhat similar to a suggestion of Nick Coghlan's. One > possible syntax as a statement might be: > > y = b + 2 given: > b = a + 1 Just to get this right: this proposal is about reversing the order of chaining expressions? Instead of: b = a + 1 c = b + 2 we could write it in reverse order: c = b + 2 given/for: b = a + 1
URL: From steve at pearwood.info Sat Jun 17 06:27:02 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 17 Jun 2017 20:27:02 +1000 Subject: [Python-ideas] Language proposal: variable assignment in functional context In-Reply-To: <279b1f21-0e16-465e-7e7b-1c3693c77141@mail.de> References: <20170617002754.GH3149@ando.pearwood.info> <279b1f21-0e16-465e-7e7b-1c3693c77141@mail.de> Message-ID: <20170617102659.GI3149@ando.pearwood.info> On Sat, Jun 17, 2017 at 09:03:54AM +0200, Sven R. Kunze wrote: > On 17.06.2017 02:27, Steven D'Aprano wrote: > >I think this is somewhat similar to a suggestion of Nick Coghlan's. One > >possible syntax as a statement might be: > > > >y = b + 2 given: > > b = a + 1 > > Just to get this right:this proposal is about reversing the order of > chaining expressions? Partly. Did you read the PEP? https://www.python.org/dev/peps/pep-3150/ I quote: The primary motivation is to enable a more declarative style of programming, where the operation to be performed is presented to the reader first, and the details of the necessary subcalculations are presented in the following indented suite. [...] A secondary motivation is to simplify interim calculations in module and class level code without polluting the resulting namespaces. It is not *just* about reversing the order, it is also about avoiding polluting the current namespace (global, or class) with unnecessary temporary variables. This puts the emphasis on the important part of the expression, not the temporary/implementation variables: page = header + body + footer where: header = ... body = ... footer = ... There is prior art: the "where" and "let" clauses in Haskell, as well as mathematics, where it is very common to defer the definition of temporary variables until after they are used. > Instead of: > > b = a + 1 > c = b + 2 > > we could write it in reverse order: > > c = b + 2 given/for: > b = a + 1 Right. But of course such a trivial example doesn't demonstrate any benefit. 
This might be a better example. Imagine you have this code, where the regular expression and the custom sort function are used in one place only. Because they're only used *once*, we don't really need them to be top-level global names, but currently we have little choice. regex = re.compile(r'.*?(\d*).*') def custom_sort(string): mo = regex.match(string) ... some implementation return key # Later results = sorted(some_strings, key=custom_sort) # Optional del custom_sort, regex Here we get the order of definitions backwards: the thing we actually care about, results = sorted(...), is defined last, and mere implementation details are given priority as top-level names that either hang around forever, or need to be explicitly deleted. Some sort of "where" clause could allow: results = sorted(some_strings, key=custom_sort) where: regex = re.compile(r'.*?(\d*).*') def custom_sort(string): mo = regex.match(string) ... some implementation return key If this syntax was introduced, editors would soon allow you to fold the "where" block and hide it. The custom_sort and regex names would be local to the where block and the results = ... line. Another important use-case is comprehensions, where we often have to repeat ourselves: [obj[0].field.method(spam)[eggs] for obj in sequence if obj[0].field.method] One workaround: [m(spam)[eggs] for m in [obj[0].field.method for obj in sequence] if m] But perhaps we could do something like: [m(spam)[eggs] for obj in sequence where m = obj[0].field.method if m] or something similar.
We already do this with classes, functions, even loops: class K: ... implementation of K def func(arg): ... implementation of func for x in seq: ... implementation of loop body page = header + body + footer where: ... implementation of page As a general rule, any two lines at the same level of indentation are read as being of equal importance. When we care about the implementation details, we "drop down" into the lower indentation block. But when skimming the code for a high-level overview, we skip the details of indented blocks and focus only on the current level: class K: def func(arg): for x in seq: page = header + body + footer where: (That's why editors often provide code folding, to hide the details of an indented block. But even without that feature, we can do it in our own head, although not as effectively.) -- Steve From ncoghlan at gmail.com Sat Jun 17 11:51:59 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 18 Jun 2017 01:51:59 +1000 Subject: [Python-ideas] Language proposal: variable assignment in functional context In-Reply-To: <279b1f21-0e16-465e-7e7b-1c3693c77141@mail.de> References: <20170617002754.GH3149@ando.pearwood.info> <279b1f21-0e16-465e-7e7b-1c3693c77141@mail.de> Message-ID: On 17 June 2017 at 17:03, Sven R. Kunze wrote: > If so, I don't know if it just complicates the language with a feature which > does not save writing nor reading nor cpu cycles nor memory and which adds a > functionality which is already there (but in reverse order). > > Maybe there are more benefits I don't see right now. You've pretty much hit on why that PEP's been deferred for ~5 years or so - I'm waiting to see use cases where we can genuinely say "this would be so much easier and more readable if we had a given construct!" :) One of the original motivations was that it may potentially make writing callback based code easier. 
Then asyncio (and variants like curio and trio) came along and asked the question: what if we built on the concepts explored by Twisted's inlineDeferreds, and instead made it easier to write asynchronous code without explicitly constructing callback chains? I do still think the idea has potential (and Steven's post does a good job of summarising why), since mathematical discussion includes the "statement (given these assumptions)" form for a reason, and pragmatically such a clause offers an interim "single-use namespace" refactoring step between "inline mess of spaghetti code" and "out of order execution using a named function". However, in my own work, having to come up with a sensible name for the encapsulated operation generally comes with a readability benefit as well, so... Cheers, Nick. P.S. One potentially interesting area of application might be in SymPy, as it may make it possible to write symbolic mathematical expressions that track almost identically with their conventionally written counterparts. That's not as compelling a use case as PEP 465's matrix multiplication, but it's also not hard to be more compelling than the limited set of examples I had previously collected :) -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From mital.vaja at googlemail.com Sat Jun 17 17:27:47 2017 From: mital.vaja at googlemail.com (Mital Ashok) Date: Sat, 17 Jun 2017 22:27:47 +0100 Subject: [Python-ideas] Make functools.singledispatch register method return original function.
Message-ID: Right now, an example for single dispatch would be: from functools import singledispatch @singledispatch def fun(arg, verbose=True): if verbose: print("Let me just say,", end=" ") print(arg) @fun.register(int) def _(arg, verbose=True): if verbose: print("Strength in numbers, eh?", end=" ") print(arg) @fun.register(list) def _(arg, verbose=True): if verbose: print("Enumerate this:") for i, elem in enumerate(arg): print(i, elem) But this makes a useless _ function, that should either be deleted or ignored. For properties, a common pattern is this: class Foo: @property def bar(self): return self._bar @bar.setter def bar(self, value): self._bar = value So I'm suggesting that @function.register for single dispatch functions returns the same function, so you would end up with something like: @singledispatch def fun(arg, verbose=True): if verbose: print("Let me just say,", end=" ") print(arg) @fun.register(int) def fun(arg, verbose=True): if verbose: print("Strength in numbers, eh?", end=" ") print(arg) And to get back the old behaviour, where you can get the function being decorated, just call it afterwards: @singledispatch def fun(arg, verbose=True): if verbose: print("Let me just say,", end=" ") print(arg) def used_elsewhere(arg, verbose=True): if verbose: print("Strength in numbers, eh?", end=" ") print(arg) fun.register(int)(used_elsewhere) But this goes against what a single-dispatch function is, so I think this is a minor enough use case to not worry about. From tjol at tjol.eu Sat Jun 17 17:42:01 2017 From: tjol at tjol.eu (Thomas Jollans) Date: Sat, 17 Jun 2017 23:42:01 +0200 Subject: [Python-ideas] Make functools.singledispatch register method return original function. 
In-Reply-To: References: Message-ID: <37140edc-d6b4-4f77-a2b0-21363cc8d5d0@tjol.eu> On 17/06/17 23:27, Mital Ashok via Python-ideas wrote: > [snip] > So I'm suggesting that @function.register for single dispatch > functions returns the same function, so you would end up with > something like: > > @singledispatch > def fun(arg, verbose=True): > if verbose: > print("Let me just say,", end=" ") > print(arg) > > @fun.register(int) > def fun(arg, verbose=True): > if verbose: > print("Strength in numbers, eh?", end=" ") > print(arg) In principle, I like it! However... > > And to get back the old behaviour, where you can get the function > being decorated, just call it afterwards: A backwards incompatible change like this is not going to happen. In actual fact, a couple of uses of the current behaviour are in the docs at https://docs.python.org/3/library/functools.html#functools.singledispatch : > The register() attribute returns the undecorated function which > enables decorator stacking, pickling, as well as creating unit tests > for each variant independently: > > >>> > >>> @fun.register(float) > ... @fun.register(Decimal) > ... def fun_num(arg, verbose=False): > ... if verbose: > ... print("Half of your number:", end=" ") > ... print(arg / 2) > ... > >>> fun_num is fun > False Perhaps it makes sense to add a new method to singledispatch that has the behaviour you're suggesting, just with a different name? -- Thomas From ncoghlan at gmail.com Sat Jun 17 22:24:21 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 18 Jun 2017 12:24:21 +1000 Subject: [Python-ideas] Make functools.singledispatch register method return original function. 
In-Reply-To: References: Message-ID: On 18 June 2017 at 07:27, Mital Ashok via Python-ideas wrote: > Right now, an example for single dispatch would be: > > from functools import singledispatch > > @singledispatch > def fun(arg, verbose=True): > if verbose: > print("Let me just say,", end=" ") > print(arg) > > @fun.register(int) > def _(arg, verbose=True): > if verbose: > print("Strength in numbers, eh?", end=" ") > print(arg) > > @fun.register(list) > def _(arg, verbose=True): > if verbose: > print("Enumerate this:") > for i, elem in enumerate(arg): > print(i, elem) > > But this makes a useless _ function, that should either be deleted or > ignored. Don't do that, give the overloads meaningful names that your test suite can then use to check that they do the right thing independently of the dispatch process. Even if you're not doing unit testing at that level, the names will also show up in exception tracebacks and other forms of introspection (e.g. process state dumps), and seeing descriptive names like "_fun_for_int" and "_fun_for_list" is *significantly* more informative than seeing multiple distinct functions all called "_" or "__". Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From barry at barrys-emacs.org Sun Jun 18 14:21:21 2017 From: barry at barrys-emacs.org (Barry Scott) Date: Sun, 18 Jun 2017 19:21:21 +0100 Subject: [Python-ideas] ImportError raised for a circular import In-Reply-To: References: <6A4A2E1B-73D3-46D1-A27A-925772214284@barrys-emacs.org> Message-ID: > On 14 Jun 2017, at 07:33, Nick Coghlan wrote: > > On 14 June 2017 at 13:02, Mahmoud Hashemi wrote: >> That would be amazing! If there's anything I can do to help make that >> happen, please let me know. 
It'll almost certainly save that much time for >> me alone down the line, anyway :) > > The `IMPORT_FROM` opcode's error handling would probably be the best > place to start poking around: > https://github.com/python/cpython/blob/master/Python/ceval.c#L5055 > > If you can prove the concept there, that would: > > 1. Directly handle the "from x import y" and "import x.y as name" cases > 2. Provide a starting point for factoring out a "report missing module > attribute" helper that could be shared with ModuleType > > As an example of querying _frozen_importlib state from C code, I'd > point to https://github.com/python/cpython/blob/master/Python/import.c#L478 I had thought that the solution would be in the .py implementation of the import machinery, not in the core C code. I was going to simply keep track of the names of the modules that are being imported and raise an exception if an import attempted to import a module that had not completed being imported. It seems from a quick look at the code that this would be practical. Are you saying that there is a subtle point about import and detection of cycles that means the work must be done in C? Barry > > Cheers, > Nick. > > P.S. I also double checked that ImportError & AttributeError have > compatible binary layouts, so dual inheritance from them works :) > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From barry at barrys-emacs.org Sun Jun 18 14:10:52 2017 From: barry at barrys-emacs.org (Barry Scott) Date: Sun, 18 Jun 2017 19:10:52 +0100 Subject: [Python-ideas] Restore the __members__ behavior to python3 for C extension writers In-Reply-To: References: Message-ID: > On 16 Jun 2017, at 09:46, Nick Coghlan wrote: > > On 16 June 2017 at 07:44, Barry Scott wrote: >> But I need the result of __dir__ for my object not its base. Then I need to >> add in the list of member attributes that are missing because python >> itself has no knowledge of them they are accessed via getattr(). > > The C code: > > dir_result = PyObject_CallMethod(base_type, "__dir__", "O", self); > > is roughly equivalent to the Python code: > > dir_result = BaseType.__dir__(self) > > That is, it's calling the base type's __dir__ method, but it's still > using the subclass *instance*. > > It's the same pattern people use to call a base type's __getattr__ or > __getattribute__ for the subclass implementation of those methods, > just without multiple inheritance support (since calling super() from > C is painful). Let me show you the problem with an example. Here is an example run of the PyCXX Demo/Python3/simple.cxx code. : 19:01:15 ~/wc/svn/PyCXX : [1] barry at Expanse $ PYTHONPATH=obj python3.6 Python 3.6.0 (v3.6.0:41df79263a11, Dec 22 2016, 17:23:13) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import simple sizeof(int) 4 sizeof(long) 8 sizeof(Py_hash_t) 8 sizeof(Py_ssize_t) 8 >>> dir(simple) ['SimpleError', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'decode_test', 'derived_class_test', 'encode_test', 'func', 'func_with_callback', 'func_with_callback_catch_simple_error', 'make_instance', 'new_style_class', 'old_style_class', 'var'] >>> n=simple.new_style_class() new_style_class c'tor Called with 0 normal arguments.
and with 0 keyword arguments: >>> dir(n) ['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'func_keyword', 'func_noargs', 'func_noargs_raise_exception', 'func_varargs', 'func_varargs_call_member'] >>> n.value 'default value' Notice that 'value' is not in the list of strings returned from dir(n). That omission is because Python does not know about 'value'. The code does this: Py::Object getattro( const Py::String &name_ ) { std::string name( name_.as_std_string( "utf-8" ) ); if( name == "value" ) { return m_value; } else { return genericGetAttro( name_ ); } } Where getattro is called (indirectly) from tp_getattro. In Python 2 I can tell Python that 'value' exists because I provide a value of __members__. What is the way to tell Python about 'value' in the Python 3 world? Barry > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > From mahmoud at hatnote.com Sun Jun 18 14:47:11 2017 From: mahmoud at hatnote.com (Mahmoud Hashemi) Date: Sun, 18 Jun 2017 11:47:11 -0700 Subject: [Python-ideas] ImportError raised for a circular import In-Reply-To: References: <6A4A2E1B-73D3-46D1-A27A-925772214284@barrys-emacs.org> Message-ID: Barry, that kind of circular import is actually fine in many (if not most) cases. Modules are immediately created and importable, then incrementally populated. The problem arises when you try to do something with contents of the module that have not been populated, usually manifesting in the AttributeError above.
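To make that concrete, here is a minimal self-contained sketch (the module names "alpha" and "beta" are made up for the demo, and are written to a temp directory at runtime) of a circular import that works precisely because each module object lands in sys.modules before its body finishes executing:

```python
import os
import sys
import tempfile
import textwrap

pkg_dir = tempfile.mkdtemp()

with open(os.path.join(pkg_dir, "alpha.py"), "w") as f:
    f.write(textwrap.dedent("""\
        import beta           # beta, in turn, imports alpha

        def ping():
            return "alpha"
    """))

with open(os.path.join(pkg_dir, "beta.py"), "w") as f:
    f.write(textwrap.dedent("""\
        # alpha is mid-import here, but its (partial) module object is
        # already in sys.modules, so this import succeeds.  Calling
        # alpha.ping() at *this* point would raise AttributeError,
        # because ping has not been defined yet.
        import alpha

        def pong():
            return "beta"
    """))

sys.path.insert(0, pkg_dir)
import alpha

print(alpha.ping(), alpha.beta.pong())  # alpha beta
```

Had beta.py instead done `from alpha import ping` at import time, the partially initialized alpha would not yet have `ping`, and the import would fail.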
If you'd like to test this yourself, I've made a tiny demo with a little bit of documentation: https://gist.github.com/mahmoud/32fd056a3d4d1cd03a4e8aeff6b5ee70 Long story short, circular imports can be a code smell, but they're by no means universally an error condition. :) On Sun, Jun 18, 2017 at 11:21 AM, Barry Scott wrote: > > On 14 Jun 2017, at 07:33, Nick Coghlan wrote: > > On 14 June 2017 at 13:02, Mahmoud Hashemi wrote: > > That would be amazing! If there's anything I can do to help make that > happen, please let me know. It'll almost certainly save that much time for > me alone down the line, anyway :) > > > The `IMPORT_FROM` opcode's error handling would probably be the best > place to start poking around: > https://github.com/python/cpython/blob/master/Python/ceval.c#L5055 > > If you can prove the concept there, that would: > > 1. Directly handle the "from x import y" and "import x.y as name" cases > 2. Provide a starting point for factoring out a "report missing module > attribute" helper that could be shared with ModuleType > > As an example of querying _frozen_importlib state from C code, I'd > point to https://github.com/python/cpython/blob/master/Python/ > import.c#L478 > > > I had thought that the solution would be in the .py implementation of the > import > machinery not in the core C code. > > I was going to simply keep track of the names of the modules that are > being imported > and raise an exception if an import attempted to import a module that had > not completed > being imported. It seems from a quick loom at the code that this would be > practical. > > Are you saying that there is a subtle point about import and detection of > cycles that > means the work must be done in C? > > Barry > > > > > > > Cheers, > Nick. > > P.S. 
I also double checked that ImportError & AttributeError have > compatible binary layouts, so dual inheritance from them works :) > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at barrys-emacs.org Sun Jun 18 14:42:25 2017 From: barry at barrys-emacs.org (Barry Scott) Date: Sun, 18 Jun 2017 19:42:25 +0100 Subject: [Python-ideas] ImportError raised for a circular import In-Reply-To: References: <6A4A2E1B-73D3-46D1-A27A-925772214284@barrys-emacs.org> Message-ID: > On 18 Jun 2017, at 19:21, Barry Scott wrote: > >> >> On 14 Jun 2017, at 07:33, Nick Coghlan > wrote: >> >> On 14 June 2017 at 13:02, Mahmoud Hashemi > wrote: >>> That would be amazing! If there's anything I can do to help make that >>> happen, please let me know. It'll almost certainly save that much time for >>> me alone down the line, anyway :) >> >> The `IMPORT_FROM` opcode's error handling would probably be the best >> place to start poking around: >> https://github.com/python/cpython/blob/master/Python/ceval.c#L5055 >> >> If you can prove the concept there, that would: >> >> 1. Directly handle the "from x import y" and "import x.y as name" cases >> 2. Provide a starting point for factoring out a "report missing module >> attribute" helper that could be shared with ModuleType >> >> As an example of querying _frozen_importlib state from C code, I'd >> point to https://github.com/python/cpython/blob/master/Python/import.c#L478 > > I had thought that the solution would be in the .py implementation of the import > machinery not in the core C code. 
> > I was going to simply keep track of the names of the modules that are being imported > and raise an exception if an import attempted to import a module that had not completed > being imported. It seems from a quick look at the code that this would be > practical. > > Are you saying that there is a subtle point about import and detection of cycles that > means the work must be done in C? It seemed that PyImport_ImportModuleLevelObject() always calls out to the interp->importlib. For example: value = _PyObject_CallMethodIdObjArgs(interp->importlib, &PyId__lock_unlock_module, abs_name, NULL); Where interp->importlib is the frozen importlib.py code, I thought. I'd assumed that I would need to change the importlib.py code and build that as the frozen version to implement this. Barry > > Barry > > > > > >> >> Cheers, >> Nick. >> >> P.S. I also double checked that ImportError & AttributeError have >> compatible binary layouts, so dual inheritance from them works :) >> >> -- >> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed...
URL: From phd at phdru.name Sun Jun 18 14:58:20 2017 From: phd at phdru.name (Oleg Broytman) Date: Sun, 18 Jun 2017 20:58:20 +0200 Subject: [Python-ideas] ImportError raised for a circular import In-Reply-To: References: <6A4A2E1B-73D3-46D1-A27A-925772214284@barrys-emacs.org> Message-ID: <20170618185820.GA3194@phdru.name> On Sun, Jun 18, 2017 at 07:21:21PM +0100, Barry Scott wrote: > I was going to simply keep track of the names of the modules that are being imported > and raise an exception if an import attempted to import a module that had not completed > being imported. Please don't do that. In SQLObject, the module dbconnection imports connection modules to run registration code in their __init__; connection modules import the name DBAPI from dbconnection. To avoid problems with circular imports, dbconnection imports connection modules at the very end. I'm afraid your plan, if implemented, would prevent me from doing that. See the code. dbconnection.py: https://github.com/sqlobject/sqlobject/blob/master/sqlobject/dbconnection.py#L1099 Connection modules (just a few examples): https://github.com/sqlobject/sqlobject/blob/master/sqlobject/mysql/mysqlconnection.py#L5 https://github.com/sqlobject/sqlobject/blob/master/sqlobject/postgres/pgconnection.py#L7 Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From alireza.rafiei94 at gmail.com Sun Jun 18 17:38:20 2017 From: alireza.rafiei94 at gmail.com (Alireza Rafiei) Date: Sun, 18 Jun 2017 14:38:20 -0700 Subject: [Python-ideas] Continuation of `__name__` or a builtin function for general name getting Message-ID: Hi all, I'm not sure whether this idea has been discussed before or not, so I apologize in advance if that's the case.
Consider the behavior: >>> f = lambda: True > >>> f.__name__ > '<lambda>' > >>> x = f > >>> x.__name__ > '<lambda>' I'm arguing the behavior above is too brittle/limited and, considering that the name of the attribute is `__name__`, not entirely consistent with Python's AST. Consider: >>> f = lambda: True > >>> x = f At the first line, an ast.Assign would be created whose target is an ast.Name whose `id` is `f`. At the second line, an ast.Assign would be created whose target is an ast.Name whose `id` is `x`. However, as you can see the `__name__` attribute returns '<lambda>' in both cases (just like it was defined https://docs.python.org/3/library/stdtypes.html#definition.__name__), whereas I think either it should have returned '<lambda>' and 'x' or a new function/attribute should exist that does so and more. For example, consider: >>> x_1 = 1 > >>> x_2 = 1 > >>> x_3 = 1 > >>> x_4 = x_1 > >>> for i in [x_1, x_2, x_3, x_4]: > >>> print(i) > 1 > 1 > 1 > 1 Now assume such a function exists and is called `name`. Then: >>> name(1) > '1' > >>> name("Something") > "Something" > >>> name(x_1) > 'x_1' > >>> name(x_4) > 'x_4' > >>> name(x_5) > 'x_5' # Or an Exception! > >>> def itername(collection): > >>> for i in map(lambda x: name(x), collection): > >>> yield i > >>> > >>> for i in [x_1, x_2, x_3, x_4]: > >>> print(i, name(i)) > 1, 'i' > 1, 'i' > 1, 'i' > 1, 'i' > >>> for i in itername([x_1, x_2, x_3, x_4]): > >>> print(i) > 'x_1' > 'x_2' > 'x_3' > 'x_4' I think the above example gives an idea of the behavior I'm proposing. I can implement it in a hacky way by parsing the block into ast and going through all the nodes that have an `id` or a `name` or `asname` etc, but this behavior wouldn't be pretty for run-time cases e.g. `itername` in the example above. Anyway I'd appreciate any input. P.S. I must confess that I can't think of a single case where having this function is the only way to do the job, if the job is sufficiently different from the specification of said function.
It'd be only for the sake of convenience, and continuation of what already was there with `__name__`. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Sun Jun 18 17:46:09 2017 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 19 Jun 2017 07:46:09 +1000 Subject: [Python-ideas] Continuation of `__name__` or a builtin function for general name getting In-Reply-To: References: Message-ID: On Mon, Jun 19, 2017 at 7:38 AM, Alireza Rafiei wrote: > I'm not sure whether this idea has been discussed before or not, so I > apologize in advance if that's the case. > > Consider the behavior: > >> >>> f = lambda: True >> >>> f.__name__ >> '<lambda>' >> >>> x = f >> >>> x.__name__ >> '<lambda>' > > > I'm arguing the behavior above is too brittle/limited and, considering that > the name of the attribute is `__name__`, not entirely consistent with > Python's AST. Consider: > >> >>> f = lambda: True >> >>> x = f > > > At the first line, an ast.Assign would be created whose target is an > ast.Name whose `id` is `f`. > At the second line, an ast.Assign would be created whose target is an > ast.Name whose `id` is `x`. The __name__ of a function has nothing to do with an Assign node, which is simply assigning a value to something. For instance, if you do: >>> f = "hello" you wouldn't expect the string "hello" to have a __name__ - it's just a string. And a lambda function normally won't be assigned to anything. You use lambda when there isn't any name: >>> do_stuff(lambda q: q * 2 + 1) and you use def when you want to assign it to a name: >>> def f(): return True By the time the Assign operation gets performed, the function object - with all of its attributes, including __name__ - has been completely created. I'm not sure what your proposal would do to these kinds of situations, but it shouldn't be modifying the assigned object.
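To spell that out with a quick check (nothing about the function object changes when it is re-bound):

```python
def f():
    return True

x = f                # plain assignment: no new object is created
print(x is f)        # True
print(x.__name__)    # 'f' -- the name baked in at definition time

g = lambda: True
h = g
print(h.__name__)    # '<lambda>' -- likewise unchanged by assignment
```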
ChrisA From alireza.rafiei94 at gmail.com Sun Jun 18 18:16:07 2017 From: alireza.rafiei94 at gmail.com (Alireza Rafiei) Date: Sun, 18 Jun 2017 15:16:07 -0700 Subject: [Python-ideas] Continuation of `__name__` or a builtin function for general name getting In-Reply-To: References: Message-ID: The __name__ of a function has nothing to do with an Assign node, > which is simply assigning a value to something. For instance, if you > do: > >>> f = "hello" > you wouldn't expect the string "hello" to have a __name__ - it's just > a string. And a lambda function normally won't be assigned to > anything. You use lambda when there isn't any name: > >>> do_stuff(lambda q: q * 2 + 1) > and you use def when you want to assign it to a name: > >>> def f(): return True > By the time the Assign operation gets performed, the function object - > with all of its attributes, including __name__ - has been completely > created. I'm not sure what your proposal would do to these kinds of > situations, but it shouldn't be modifying the assigned object. I guess I should have framed it as a `quote` for python. You're absolutely right that it shouldn't be modifying the assigned object and it doesn't. I mentioned the Assign to say that in `x = f`, `x` has name as well, however `x.__name__` returns the name of `f` and not `x`. As for the `f = "hello"`, the value of the name "f" would be "hello" and the value of the name "hello" would be "hello". My proposal is to either change the behavior of `__name__` or have something similar that acts globally for all objects and types to get a quote-like behavior, provided that the operands of quotes are atomic. On Sun, Jun 18, 2017 at 2:46 PM, Chris Angelico wrote: > On Mon, Jun 19, 2017 at 7:38 AM, Alireza Rafiei > wrote: > > I'm not sure whether this idea has been discussed before or not, so I > > apologize in advanced if that's the case. 
> > > > Consider the behavior: > > > >> >>> f = lambda: True > >> >>> f.__name__ > >> '' > >> >>> x = f > >> >>> x.__name__ > >> '' > > > > > > I'm arguing the behavior above is too brittle/limited and, considering > that > > the name of the attribute is `__name__`, not entirely consistent with > > Python's AST. Consider: > > > >> >>> f = lambda: True > >> >>> x = f > > > > > > At the first line, an ast.Assign would be created whose target is an > > ast.Name whose `id` is `f`. > > At the second line, an ast.Assign would be created whose target is an > > ast.Name whose `id` is `x`. > > The __name__ of a function has nothing to do with an Assign node, > which is simply assigning a value to something. For instance, if you > do: > > >>> f = "hello" > > you wouldn't expect the string "hello" to have a __name__ - it's just > a string. And a lambda function normally won't be assigned to > anything. You use lambda when there isn't any name: > > >>> do_stuff(lambda q: q * 2 + 1) > > and you use def when you want to assign it to a name: > > >>> def f(): return True > > By the time the Assign operation gets performed, the function object - > with all of its attributes, including __name__ - has been completely > created. I'm not sure what your proposal would do to these kinds of > situations, but it shouldn't be modifying the assigned object. > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From python at mrabarnett.plus.com Sun Jun 18 18:24:17 2017 From: python at mrabarnett.plus.com (MRAB) Date: Sun, 18 Jun 2017 23:24:17 +0100 Subject: [Python-ideas] Continuation of `__name__` or a builtin function for general name getting In-Reply-To: References: Message-ID: <2838ad17-73b1-8fcc-2372-275d6c258e95@mrabarnett.plus.com> On 2017-06-18 22:38, Alireza Rafiei wrote: > Hi all, > > I'm not sure whether this idea has been discussed before or not, so I > apologize in advanced if that's the case. > > Consider the behavior: > > >>> f = lambda: True > >>> f.__name__ > '' > >>> x = f > >>> x.__name__ > '' > > > I'm arguing the behavior above is too brittle/limited and, considering > that the name of the attribute is `__name__`, not entirely consistent > with Python's AST. Consider: > > >>> f = lambda: True > >>> x = f > > > At the first line, an ast.Assign would be created whose target is an > ast.Name whose `id` is `f`. > At the second line, an ast.Assign would be created whose target is an > ast.Name whose `id` is `x`. > However, as you can see `__name__` special method returns 'lambda' in > both cases (just like it was defined > https://docs.python.org/3/library/stdtypes.html#definition.__name__), > whereas I think either it should have returned '' and 'x' or a > new function/attribute should exist that does so and more. > > For example, consider: > > >>> x_1 = 1 > >>> x_2 = 1 > >>> x_3 = 1 > >>> x_4 = x_1 > >>> for i in [x_1, x_2, x_3, x_4]: > >>> print(i) > 1 > 1 > 1 > 1 > > > Now assume such a function exist and is called `name`. Then: > > >>> name(1) > '1' > >>> name("Something") > "Something" > >>> name(x_1) > 'x_1' > >>> name(x_4) > 'x_4' > >>> name(x_5) > 'x_5' # Or an Exception! 
> >>> def itername(collection): > >>> for i in map(lambda x: name(x), collection): > >>> yield i > >>> > >>> for i in [x_1, x_2, x_3, x_4]: > >>> print(i, name(i)) > 1, 'i' > 1, 'i' > 1, 'i' > 1, 'i' > >>> for i in itername([x_1, x_2, x_3, x_4]): > >>> print(i) > 'x_1' > 'x_2' > 'x_3' > 'x_4' > [snip] That's not correct. Look at the definition of 'itername'. The lambda returns the result of name(x), which is 'x'. Therefore, the correct result is: 'x' 'x' 'x' 'x' From rosuav at gmail.com Sun Jun 18 18:27:47 2017 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 19 Jun 2017 08:27:47 +1000 Subject: [Python-ideas] Continuation of `__name__` or a builtin function for general name getting In-Reply-To: References: Message-ID: On Mon, Jun 19, 2017 at 8:16 AM, Alireza Rafiei wrote: > I guess I should have framed it as a `quote` for python. You're absolutely > right that it shouldn't be modifying the assigned object and it doesn't. I > mentioned the Assign to say that in `x = f`, `x` has name as well, however > `x.__name__` returns the name of `f` and not `x`. > > As for the `f = "hello"`, the value of the name "f" would be "hello" and the > value of the name "hello" would be "hello". > > My proposal is to either change the behavior of `__name__` or have something > similar that acts globally for all objects and types to get a quote-like > behavior, provided that the operands of quotes are atomic. Hmm. So... after x = f, f.__name__ would be different from x.__name__? ChrisA From alireza.rafiei94 at gmail.com Sun Jun 18 18:47:27 2017 From: alireza.rafiei94 at gmail.com (Alireza Rafiei) Date: Sun, 18 Jun 2017 15:47:27 -0700 Subject: [Python-ideas] Continuation of `__name__` or a builtin function for general name getting In-Reply-To: <2838ad17-73b1-8fcc-2372-275d6c258e95@mrabarnett.plus.com> References: <2838ad17-73b1-8fcc-2372-275d6c258e95@mrabarnett.plus.com> Message-ID: > > [snip] > That's not correct. > Look at the definition of 'itername'. 
The lambda returns the result of
> name(x), which is 'x'.
>
> Therefore, the correct result is:
>
> 'x'
> 'x'
> 'x'
> 'x'

You're absolutely right! It should be changed to:

> >>> def itername(collection):
> >>>     for i in map(name, collection):
> >>>         yield i

------

> Hmm. So... after x = f, f.__name__ would be different from x.__name__?

Yes.

------

I should have written my initial email more carefully! There's another
mistake:

> However, as you can see, the `__name__` attribute returns '<lambda>' in
> both cases (just like it was defined at
> https://docs.python.org/3/library/stdtypes.html#definition.__name__),
> whereas I think either it should have returned '<lambda>' and 'x' or a new
> function/attribute should exist that does so and more.

It'd be inconsistent to say x.__name__ should return 'x' and f.__name__
should return '<lambda>'. To avoid further confusion of `name` the function
with `__name__`, I take back what I said about the `__name__` attribute.
However, I can't think of an inconsistency with a builtin function called
`name` or `quote` or the like that behaves as described previously
(name(x) would be 'x', name(f) would be 'f', name("hello") would be
"hello", etc.) and I still appreciate any input on it.

On Sun, Jun 18, 2017 at 3:24 PM, MRAB wrote:

> On 2017-06-18 22:38, Alireza Rafiei wrote:
>> Hi all,
>>
>> I'm not sure whether this idea has been discussed before or not, so I
>> apologize in advance if that's the case.
>>
>> Consider the behavior:
>>
>> >>> f = lambda: True
>> >>> f.__name__
>> '<lambda>'
>> >>> x = f
>> >>> x.__name__
>> '<lambda>'
>>
>> I'm arguing the behavior above is too brittle/limited and, considering
>> that the name of the attribute is `__name__`, not entirely consistent with
>> Python's AST. Consider:
>>
>> >>> f = lambda: True
>> >>> x = f
>>
>> At the first line, an ast.Assign would be created whose target is an
>> ast.Name whose `id` is `f`.
>> At the second line, an ast.Assign would be created whose target is an
>> ast.Name whose `id` is `x`.
>> However, as you can see `__name__` special method returns 'lambda' in >> both cases (just like it was defined https://docs.python.org/3/libr >> ary/stdtypes.html#definition.__name__), whereas I think either it should >> have returned '' and 'x' or a new function/attribute should exist >> that does so and more. >> >> For example, consider: >> >> >>> x_1 = 1 >> >>> x_2 = 1 >> >>> x_3 = 1 >> >>> x_4 = x_1 >> >>> for i in [x_1, x_2, x_3, x_4]: >> >>> print(i) >> 1 >> 1 >> 1 >> 1 >> >> >> Now assume such a function exist and is called `name`. Then: >> >> >>> name(1) >> '1' >> >>> name("Something") >> "Something" >> >>> name(x_1) >> 'x_1' >> >>> name(x_4) >> 'x_4' >> >>> name(x_5) >> 'x_5' # Or an Exception! >> >>> def itername(collection): >> >>> for i in map(lambda x: name(x), collection): >> >>> yield i >> >>> >> >>> for i in [x_1, x_2, x_3, x_4]: >> >>> print(i, name(i)) >> 1, 'i' >> 1, 'i' >> 1, 'i' >> 1, 'i' >> >>> for i in itername([x_1, x_2, x_3, x_4]): >> >>> print(i) >> 'x_1' >> 'x_2' >> 'x_3' >> 'x_4' >> >> [snip] > > That's not correct. > > Look at the definition of 'itername'. The lambda returns the result of > name(x), which is 'x'. > > Therefore, the correct result is: > > 'x' > 'x' > 'x' > 'x' > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Sun Jun 18 19:00:21 2017 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 19 Jun 2017 09:00:21 +1000 Subject: [Python-ideas] Continuation of `__name__` or a builtin function for general name getting In-Reply-To: References: <2838ad17-73b1-8fcc-2372-275d6c258e95@mrabarnett.plus.com> Message-ID: On Mon, Jun 19, 2017 at 8:47 AM, Alireza Rafiei wrote: >> Hmm. So... after x = f, f.__name__ would be different from x.__name__? > > > Yes. 
There's a couple of major problems with that. The first is that, in
Python, there is *absolutely no difference* between accessing an
object via one form and via another. For example:

>>> x = object()  # or anything else
>>> y = [x, x, x]
>>> z = x
>>> q = {"spam": x}
>>> probably = globals()["x"]

You can access y[0], z, q["spam"], and (most likely) probably, and
they're all the same thing. Exactly the same object. So there's no way
to do attribute access on that object and get different results.

The second problem is that the current behaviour is extremely
important. One such place is with function decorators, which
frequently need to know the name of the function being worked on:

def command(func):
    parser.add_parser(func.__name__)
    ...

@command
def spaminate():
    ...

Inside the decorator, "func.__name__" has to be the name of the
function being decorated ("spaminate"), *not* "func". The function has
an identity and a canonical name. Disrupting that would cause major
difficulties for these kinds of decorators.

ChrisA

From alireza.rafiei94 at gmail.com  Sun Jun 18 20:07:16 2017
From: alireza.rafiei94 at gmail.com (Alireza Rafiei)
Date: Sun, 18 Jun 2017 17:07:16 -0700
Subject: [Python-ideas] Continuation of `__name__` or a builtin function for general name getting
In-Reply-To:
References: <2838ad17-73b1-8fcc-2372-275d6c258e95@mrabarnett.plus.com>
Message-ID:

Thanks for the explanation!

On Sun, Jun 18, 2017 at 4:00 PM, Chris Angelico wrote:

> On Mon, Jun 19, 2017 at 8:47 AM, Alireza Rafiei wrote:
> >> Hmm. So... after x = f, f.__name__ would be different from x.__name__?
> >
> > Yes.
>
> There's a couple of major problems with that. The first is that, in
> Python, there is *absolutely no difference* between accessing an
> object via one form and via another.
For example:
>
> >>> x = object()  # or anything else
> >>> y = [x, x, x]
> >>> z = x
> >>> q = {"spam": x}
> >>> probably = globals()["x"]
>
> You can access y[0], z, q["spam"], and (most likely) probably, and
> they're all the same thing. Exactly the same object. So there's no way
> to do attribute access on that object and get different results.
>
> The second problem is that the current behaviour is extremely
> important. One such place is with function decorators, which
> frequently need to know the name of the function being worked on:
>
> def command(func):
>     parser.add_parser(func.__name__)
>     ...
>
> @command
> def spaminate():
>     ...
>
> Inside the decorator, "func.__name__" has to be the name of the
> function being decorated ("spaminate"), *not* "func". The function has
> an identity and a canonical name. Disrupting that would cause major
> difficulties for these kinds of decorators.
>
> ChrisA

From mertz at gnosis.cx  Sun Jun 18 21:47:43 2017
From: mertz at gnosis.cx (David Mertz)
Date: Sun, 18 Jun 2017 18:47:43 -0700
Subject: [Python-ideas] Run length encoding
In-Reply-To:
References: <593CC97E.4000901@canterbury.ac.nz>
Message-ID:

As an only semi-joke, I have created a module on GH that meets the needs
of this discussion (using the spellings I think are most elegant):

https://github.com/DavidMertz/RLE

On Sun, Jun 11, 2017 at 1:53 AM, Serhiy Storchaka wrote:

> 11.06.17 09:17, Neal Fultz writes:
>
>> * other people have been false positive and wanted a SQL-type group
>> by, but got burned
>> * hence the warnings in the docs.
>
> This wouldn't help if people don't read the docs.
>> Also, if someone rewrote zip in pure python, would many people actually
>> notice a slowdown vs network latency, disk IO, etc?
>
> Definitely yes.
>
>> RLE is a building block just like bisect.
>
> This is a very specific building block. And if ZIP compression were
> rewritten in pure Python, it wouldn't use a general RLE implementation.
>
> FYI, there are multiple compression methods supported in ZIP files, but
> the zipfile module does not implement all of them. In particular, simple
> RLE-based methods are not implemented (they are almost never used in the
> real world now). I suppose that if the zipfile module implemented these
> algorithms, it wouldn't use any general RLE implementation.
>
> https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT

--
Keeping medicines from the bloodstreams of the sick; food from the bellies
of the hungry; books from the hands of the uneducated; technology from the
underdeveloped; and putting advocates of freedom in prisons. Intellectual
property is to the 21st century what the slave trade was to the 16th.

From tjol at tjol.eu  Mon Jun 19 03:37:44 2017
From: tjol at tjol.eu (Thomas Jollans)
Date: Mon, 19 Jun 2017 09:37:44 +0200
Subject: [Python-ideas] Restore the __members__ behavior to python3 for C extension writers
In-Reply-To:
References:
Message-ID: <8a0238b5-0bab-e813-bc0e-f20ac3cf22e8@tjol.eu>

On 2017-06-18 20:10, Barry Scott wrote:
> What is the way to tell python about 'value' in the python3 world?

Implement a __dir__ method. This should call the superclass' (e.g.
object's) __dir__ and add whatever it is you want to add to the list.
In general I'd recommend using properties for this kind of thing rather than messing with __getattr__, __setattr__ and __dir__, especially in pure Python. I'm no expert on the C API, but I think this should be possible by setting PyTypeObject.tp_getset https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_getset -- Thomas From ncoghlan at gmail.com Mon Jun 19 08:36:04 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 19 Jun 2017 22:36:04 +1000 Subject: [Python-ideas] ImportError raised for a circular import In-Reply-To: References: <6A4A2E1B-73D3-46D1-A27A-925772214284@barrys-emacs.org> Message-ID: On 19 June 2017 at 04:47, Mahmoud Hashemi wrote: > Barry, that kind of circular import is actually fine in many (if not most) > cases. Modules are immediately created and importable, thenincrementally > populated. The problem arises when you try to do something with contents of > the module that have not been populated, usually manifesting in the > AttributeError above. > > If you'd like to test this yourself, I've made a tiny demo with a little bit > of documentation: > https://gist.github.com/mahmoud/32fd056a3d4d1cd03a4e8aeff6b5ee70 > > Long story short, circular imports can be a code smell, but they're by no > means universally an error condition. :) Indeed, and this is why we've been begrudgingly giving ground and ensuring that the cases that *can* be made to work actually succeed in practice. We'll let code linters complain about those cycles, rather than having the interpreter continue to get confused :) However, permitting resolvable circular imports means that for the remaining cases that are genuinely irresolvably broken, any custom detection logic needs to live in some combination of the module attribute lookup code and the IMPORT_FROM opcode implementation, rather than being able to isolate the changes to the import machinery itself. Cheers, Nick. 
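[An editorial aside: Mahmoud's gist is not reproduced in this archive, but
the "resolvable cycle" behaviour Nick describes can be sketched with two
throwaway modules. The module names `a_mod`/`b_mod` and the temp-dir setup
are invented for this demo:]

```python
import pathlib
import sys
import tempfile
import textwrap

tmp = pathlib.Path(tempfile.mkdtemp())
(tmp / "a_mod.py").write_text(textwrap.dedent("""
    import b_mod            # circular, but only the module object is bound
    VALUE = "a"
    def use_b():
        return b_mod.VALUE  # looked up lazily, after b_mod is populated
"""))
(tmp / "b_mod.py").write_text(textwrap.dedent("""
    import a_mod            # gets the partially initialised module object
    VALUE = "b"
    def use_a():
        return a_mod.VALUE
"""))
sys.path.insert(0, str(tmp))

import a_mod
print(a_mod.use_b())        # the cycle resolves fine at call time
print(a_mod.b_mod.use_a())
```

Changing `b_mod`'s first line to `from a_mod import VALUE` raises
ImportError instead, since `VALUE` does not exist yet in the
partially-populated module — that is the genuinely broken case the thread
distinguishes from benign cycles.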
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Mon Jun 19 08:56:32 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 19 Jun 2017 22:56:32 +1000 Subject: [Python-ideas] Restore the __members__ behavior to python3 for C extension writers In-Reply-To: References: Message-ID: On 19 June 2017 at 04:10, Barry Scott wrote: > The code does this: > > Py::Object getattro( const Py::String &name_ ) > { > std::string name( name_.as_std_string( "utf-8" ) ); > > if( name == "value" ) > { > return m_value; > } > else > { > return genericGetAttro( name_ ); > } > } > > Where getattro is called (indirectly) from tp_getattro. > > In the python 2 I can tell python that 'value' exists because I provide a value of __members__. > > What is the way to tell python about 'value' in the python3 world? OK, I think I may understand the confusion now. As Thomas noted, the preferred way of informing Python of data attributes for types implemented in C is to ask the interpreter to automatically create the appropriate descriptor objects by setting the `tp_members` slot on the C level *type*, rather than setting `__members__` on the instance: https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_members That approach also works for new-style classes in Python 2: https://docs.python.org/2/c-api/typeobj.html#c.PyTypeObject.tp_members I believe this is actually an old-/new-style class difference, so the relevant Python 3 change is the fact that the old-style approach simply isn't available any more. So if you use tp_members and tp_getset to request the creation of suitable descriptors, then the interpreter will automatically take care of populating the results of `dir()` correctly. However, if you're genuinely dynamically adding attributes in `__getattr__`, then you're going to need to add code to report them from `__dir__` as well. Cheers, Nick. 
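[At the Python level, the combination Thomas and Nick describe — dynamic
attributes served by `__getattr__` and advertised via `__dir__` — looks
roughly like this. `Wrapped` is a toy stand-in, not Barry's actual
extension class:]

```python
class Wrapped:
    _DYNAMIC = ("value",)  # names served dynamically by __getattr__ below

    def __init__(self, value):
        self._value = value

    def __getattr__(self, name):
        # only called when normal attribute lookup fails
        if name == "value":
            return self._value
        raise AttributeError(name)

    def __dir__(self):
        # default dir() contents plus the dynamically served names
        return sorted(set(super().__dir__()) | set(self._DYNAMIC))

w = Wrapped(17)
print(w.value)            # 17
print("value" in dir(w))  # True
```

For a type implemented in C, `tp_members`/`tp_getset` achieve the same
effect without any `__dir__` override, since the interpreter creates real
descriptors and `dir()` finds them on the type.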
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From jjmaldonis at gmail.com Mon Jun 19 17:06:56 2017 From: jjmaldonis at gmail.com (Jason Maldonis) Date: Mon, 19 Jun 2017 16:06:56 -0500 Subject: [Python-ideas] the error that raises an AttributeError should be passed to __getattr__ Message-ID: Hi everyone, A while back I had a conversation with some folks over on python-list. I was having issues implementing error handling of `AttributeError`s using `__getattr__`. My problem is that it is currently impossible for a `__getattr__` in Python to know which method raised the `AttributeError` that was caught by `__getattr__` if there are nested methods. For example, we cannot tell the difference between `A.x` not existing (which would raise an AttributeError) and some attribute inside `A.x` not existing (which also raises an AttributeError). This is evident from the stack trace that gets printed to screen, but `__getattr__` doesn't get that stack trace. I propose that the error that triggers an `AttributeError` should get passed to `__getattr__` (if `__getattr__` exists of course). Then, when handling errors, users could dig into the problematic error if they so desire. What do you think? Best, Jason -------------- next part -------------- An HTML attachment was scrubbed... 
From george at fischhof.hu  Mon Jun 19 17:09:12 2017
From: george at fischhof.hu (George Fischhof)
Date: Mon, 19 Jun 2017 23:09:12 +0200
Subject: [Python-ideas] socket module: plain stuples vs named tuples
In-Reply-To: <07d5cb54-e6f4-7f2d-7995-d6c01c94401f@thomas-guettler.de>
References: <07d5cb54-e6f4-7f2d-7995-d6c01c94401f@thomas-guettler.de>
Message-ID:

+1 ;-)

For example, to get the IP addresses of all interfaces we have to use the
third (indexed as 2) member of "something"; it would be much more beautiful
to get ip_address:

import socket
for ip in socket.gethostbyname_ex(socket.gethostname())[2]:
    print(ip)

BR,
George

2017-06-13 22:13 GMT+02:00 Thomas Güttler:

> AFAIK the socket module returns plain tuples in Python3:
>
> https://docs.python.org/3/library/socket.html
>
> Why not use named tuples?
>
> Regards,
> Thomas Güttler
>
> --
> I am looking for feedback for my personal programming guidelines:
> https://github.com/guettli/programming-guidelines

From victor.stinner at gmail.com  Mon Jun 19 17:24:22 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 19 Jun 2017 23:24:22 +0200
Subject: [Python-ideas] socket module: plain stuples vs named tuples
In-Reply-To: <07d5cb54-e6f4-7f2d-7995-d6c01c94401f@thomas-guettler.de>
References: <07d5cb54-e6f4-7f2d-7995-d6c01c94401f@thomas-guettler.de>
Message-ID:

Hi,

2017-06-13 22:13 GMT+02:00 Thomas Güttler:

> AFAIK the socket module returns plain tuples in Python3:
>
> https://docs.python.org/3/library/socket.html
>
> Why not use named tuples?
For technical reasons: the socket module is mostly implemented in the
C language, and defining a "named tuple" in C requires implementing a
"sequence" type, which takes much more code than creating a plain tuple.

In short, creating a tuple is as simple as Py_BuildValue("OO", item1, item2).

Creating a "sequence" type requires something like 50 lines of code,
maybe more, I don't know exactly.

Victor

From guido at python.org  Mon Jun 19 17:58:50 2017
From: guido at python.org (Guido van Rossum)
Date: Mon, 19 Jun 2017 14:58:50 -0700
Subject: [Python-ideas] socket module: plain stuples vs named tuples
In-Reply-To:
References: <07d5cb54-e6f4-7f2d-7995-d6c01c94401f@thomas-guettler.de>
Message-ID:

There are examples in timemodule.c which went through a similar conversion
from plain tuples to (sort-of) named tuples. I agree that upgrading the
tuples returned by the socket module to named tuples would be nice, but
it's a low-priority project. Maybe someone interested can create a PR?
(First open an issue stating that you're interested; point to this email
from me to prevent that some other core dev just closes it again.)

On Mon, Jun 19, 2017 at 2:24 PM, Victor Stinner wrote:

> Hi,
>
> 2017-06-13 22:13 GMT+02:00 Thomas Güttler:
> > AFAIK the socket module returns plain tuples in Python3:
> >
> > https://docs.python.org/3/library/socket.html
> >
> > Why not use named tuples?
>
> For technical reasons: the socket module is mostly implemented in the
> C language, and defining a "named tuple" in C requires implementing a
> "sequence" type, which takes much more code than creating a plain tuple.
>
> In short, creating a tuple is as simple as Py_BuildValue("OO", item1,
> item2).
>
> Creating a "sequence" type requires something like 50 lines of code,
> maybe more, I don't know exactly.
>
> Victor

From victor.stinner at gmail.com  Mon Jun 19 18:27:53 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 20 Jun 2017 00:27:53 +0200
Subject: [Python-ideas] socket module: plain stuples vs named tuples
In-Reply-To:
References: <07d5cb54-e6f4-7f2d-7995-d6c01c94401f@thomas-guettler.de>
Message-ID:

Oh, about the cost of writing C code: we started to enhance the socket
module in socket.py but keep the C code unchanged. I am thinking of the
support for enums. Some C functions are wrapped in Python.

Victor

Le 19 juin 2017 11:59 PM, "Guido van Rossum" a écrit :

> There are examples in timemodule.c which went through a similar conversion
> from plain tuples to (sort-of) named tuples. I agree that upgrading the
> tuples returned by the socket module to named tuples would be nice, but
> it's a low-priority project. Maybe someone interested can create a PR?
> (First open an issue stating that you're interested; point to this email
> from me to prevent that some other core dev just closes it again.)
>
> On Mon, Jun 19, 2017 at 2:24 PM, Victor Stinner wrote:
>
>> Hi,
>>
>> 2017-06-13 22:13 GMT+02:00 Thomas Güttler:
>> > AFAIK the socket module returns plain tuples in Python3:
>> >
>> > https://docs.python.org/3/library/socket.html
>> >
>> > Why not use named tuples?
>>
>> For technical reasons: the socket module is mostly implemented in the
>> C language, and defining a "named tuple" in C requires implementing a
>> "sequence" type, which takes much more code than creating a plain tuple.
>>
>> In short, creating a tuple is as simple as Py_BuildValue("OO", item1,
>> item2).
>>
>> Creating a "sequence" type requires something like 50 lines of code,
>> maybe more, I don't know exactly.
>>
>> Victor

From python at lucidity.plus.com  Mon Jun 19 19:47:20 2017
From: python at lucidity.plus.com (Erik)
Date: Tue, 20 Jun 2017 00:47:20 +0100
Subject: [Python-ideas] Run length encoding
In-Reply-To:
References: <593CC97E.4000901@canterbury.ac.nz>
Message-ID: <8512d16d-4b76-7558-aefa-039443b7d439@lucidity.plus.com>

On 19/06/17 02:47, David Mertz wrote:
> As an only semi-joke, I have created a module on GH that meets the needs
> of this discussion (using the spellings I think are most elegant):
>
> https://github.com/DavidMertz/RLE

It's a shame you have to build that list when encoding. I tried to work
out a way to get the number of items in an iterable without having to
capture all the values (on the understanding that if the iterable is
already an iterator, it would be consumed).

The best I came up with so far (not general purpose, but it works in
this scenario) is:

from itertools import groupby
from operator import countOf

def rle_encode(it):
    return ((k, countOf(g, k)) for k, g in groupby(it))

In your test code, this speeds things up quite a bit over building the
list, but that's presumably only because both groupby() and countOf()
will use the standard class comparison operator methods, which in the
case of ints will short-circuit with a C-level pointer comparison first.
For user-defined classes with complicated comparison methods, getting
the length of the group by comparing the items will probably be worse.

Is there a better way of implementing a general-purpose "ilen()"?
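[An editorial aside: the usual answer to the closing question is the
consume-into-a-zero-length-deque recipe — the same one the third-party
more-itertools package ships as `ilen()`. A sketch, applied to Erik's
`rle_encode`; it counts without ever comparing items, sidestepping the
expensive-`__eq__` concern:]

```python
from collections import deque
from itertools import count, groupby

def ilen(iterable):
    """Number of items in *iterable*, consumed without storing them."""
    counter = count()
    deque(zip(iterable, counter), maxlen=0)  # exhaust the iterator at C speed
    return next(counter)  # count() has advanced once per item

def rle_encode(it):
    return ((k, ilen(g)) for k, g in groupby(it))

print(list(rle_encode("aaabbc")))  # [('a', 3), ('b', 2), ('c', 1)]
```

The zero-length deque discards each `(item, index)` pair immediately, so
memory stays constant no matter how long the run is.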
I tried a couple of other things, but they all required at least one
lambda function and slowed things down by about 50% compared to the
list-building version.

(I agree this is sort of a joke, but it's still an interesting puzzle ...).

Regards, E.

From steve at pearwood.info  Mon Jun 19 20:18:19 2017
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 20 Jun 2017 10:18:19 +1000
Subject: [Python-ideas] the error that raises an AttributeError should be passed to __getattr__
In-Reply-To:
References:
Message-ID: <20170620001819.GK3149@ando.pearwood.info>

On Mon, Jun 19, 2017 at 04:06:56PM -0500, Jason Maldonis wrote:

> Hi everyone,
>
> A while back I had a conversation with some folks over on python-list. I
> was having issues implementing error handling of `AttributeError`s using
> `__getattr__`.
[...]
> For example, we cannot tell the difference between `A.x` not existing
> (which would raise an AttributeError) and some attribute inside `A.x` not
> existing (which also raises an AttributeError).

I didn't understand what you were talking about here at first. If you
write something like A.x.y where y doesn't exist, it's A.x.__getattr__
that is called, not A.__getattr__. But I went and looked at the thread
in Python-Ideas and discovered that you're talking about the case where
A.x is a descriptor, not an ordinary attribute, and the descriptor leaks
AttributeError.

Apparently you heavily use properties, and __getattr__, and find that
the two don't interact well together when the property getters and
setters themselves raise AttributeError. I think that's relevant
information that helps explain the problem you are hoping to fix.

So I *think* this demonstrates the problem:

class A(object):
    eggs = "text"
    def __getattr__(self, name):
        if name == 'cheese':
            return "cheddar"
        raise AttributeError('%s missing' % name)
    @property
    def spam(self):
        return self.eggs.uper()  # Oops.
a = A()
a.spam

Which gives us:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 6, in __getattr__
AttributeError: spam missing

But you go on to say that:

> This is evident from the
> stack trace that gets printed to screen, but `__getattr__` doesn't get that
> stack trace.

I can't reproduce that! As you can see from the above, the stack trace
doesn't say anything about the actual missing attribute 'uper'. So I
must admit I don't actually understand the problem you are hoping to
solve. It seems to be different from my understanding of it.

> I propose that the error that triggers an `AttributeError` should get
> passed to `__getattr__` (if `__getattr__` exists of course). Then, when
> handling errors, users could dig into the problematic error if they so
> desire.

What precisely will be passed to __getattr__? The exception instance?
The full traceback object? The name of the missing attribute? Something
else? It is hard to really judge this proposal without more detail.

I think the most natural thing to pass would be the exception instance,
but AttributeError instances don't record the missing attribute name
directly (as far as I can tell). Given:

try:
    ''.foo
except AttributeError as e:
    print(e.???)

there's nothing in e we can inspect to get the name of the missing
attribute, 'foo'. (As far as I can see.) We must parse the error message
itself, which we really shouldn't do, because the error message is not
part of the exception API and could change at any time.

So... what precisely should be passed to __getattr__, and what exactly
are you going to do with it?

Having said that, there's another problem: adding this feature (whatever
it actually is) to __getattr__ will break every existing class that uses
__getattr__.
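[A historical footnote from the editor: the introspection Steven says is
missing ("there's nothing in e we can inspect") was eventually added —
since Python 3.10, AttributeError raised by the attribute machinery
carries `name` and `obj` attributes. The sketch below drops `__getattr__`
from Steven's demo, because a hand-raised AttributeError only gets
`name`/`obj` if they are passed explicitly as keyword arguments; the
version check keeps the sketch harmless on older interpreters:]

```python
import sys

class A:
    eggs = "text"

    @property
    def spam(self):
        return self.eggs.uper()  # same typo as in the demo above

try:
    A().spam
except AttributeError as e:
    if sys.version_info >= (3, 10):
        # the failing lookup on the string "text" recorded both fields
        print(e.name, repr(e.obj))
```

On 3.10+ this prints the missing attribute name 'uper' and the object it
was looked up on, which is exactly the information the proposal wanted to
expose.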
The problem is that everyone who writes a __getattr__ method writes it like this:

def __getattr__(self, name):

not:

def __getattr__(self, name, error):

so the class will break when the method receives two arguments (excluding self) but only has one parameter.

*If* we go down this track, it would probably require a __future__ import for at least one release, probably more:

- in 3.7, use `from __future__ import extra_getattr_argument`
- in 3.8, deprecate the single-argument form of __getattr__
- in 3.9 or 4.0 no longer require the __future__ import.

That's a fairly long and heavy process, and will be quite annoying to those writing cross-version code using __getattr__, but it can be done. But only if it actually helps solve the problem. I'm not convinced that it does. It comes down to the question of what this second argument is, and how do you expect to use it?

--
Steve

From rosuav at gmail.com  Mon Jun 19 21:30:08 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 20 Jun 2017 11:30:08 +1000
Subject: [Python-ideas] the error that raises an AttributeError should be passed to __getattr__
In-Reply-To: <20170620001819.GK3149@ando.pearwood.info>
References: <20170620001819.GK3149@ando.pearwood.info>
Message-ID: 

On Tue, Jun 20, 2017 at 10:18 AM, Steven D'Aprano wrote:
> Apparently you heavily use properties, and __getattr__, and find that
> the two don't interact well together when the property getters and
> setters themselves raise AttributeError. I think that's relevant
> information that helps explain the problem you are hoping to fix.
>
> So I *think* this demonstrates the problem:
>
> class A(object):
>     eggs = "text"
>     def __getattr__(self, name):
>         if name == 'cheese':
>             return "cheddar"
>         raise AttributeError('%s missing' % name)
>     @property
>     def spam(self):
>         return self.eggs.uper()  # Oops.

I'm quoting Steven's post, but I'm addressing the OP. One good solution to this is a "guard point" around your property functions.
import functools

def noleak(*exc):
    def deco(func):
        @functools.wraps(func)
        def wrapper(*a, **kw):
            try:
                return func(*a, **kw)
            except exc:
                raise RuntimeError
        return wrapper
    return deco

@property
@noleak(AttributeError)
def spam(self):
    return self.eggs.uper()

In fact, you could make this into a self-wrapping system if you like:

def property(func, *, _=property):
    return _(noleak(AttributeError)(func))

Now, all your @property functions will be guarded: any AttributeErrors they raise will actually bubble as RuntimeErrors instead. Making this work with setters and deleters is left as an exercise for the reader.

ChrisA

From rosuav at gmail.com  Mon Jun 19 21:31:34 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 20 Jun 2017 11:31:34 +1000
Subject: [Python-ideas] the error that raises an AttributeError should be passed to __getattr__
In-Reply-To: <20170620001819.GK3149@ando.pearwood.info>
References: <20170620001819.GK3149@ando.pearwood.info>
Message-ID: 

On Tue, Jun 20, 2017 at 10:18 AM, Steven D'Aprano wrote:
> Having said that, there's another problem: adding this feature (whatever
> it actually is) to __getattr__ will break every existing class that uses
> __getattr__. The problem is that everyone who writes a __getattr__
> method writes it like this:
>
> def __getattr__(self, name):
>
> not:
>
> def __getattr__(self, name, error):
>
> so the class will break when the method receives two arguments
> (excluding self) but only has one parameter.

Why not just write cross-version-compatible code as

def __getattr__(self, name, error=None):

? Is there something special about getattr?

ChrisA

From songofacandy at gmail.com  Mon Jun 19 22:05:55 2017
From: songofacandy at gmail.com (INADA Naoki)
Date: Tue, 20 Jun 2017 11:05:55 +0900
Subject: [Python-ideas] socket module: plain stuples vs named tuples
In-Reply-To: 
References: <07d5cb54-e6f4-7f2d-7995-d6c01c94401f@thomas-guettler.de>
Message-ID: 

Namedtuple in Python makes startup time slow.
So I'm very conservative about converting tuple to namedtuple in Python.

INADA Naoki

On Tue, Jun 20, 2017 at 7:27 AM, Victor Stinner wrote:
> Oh, about the cost of writing C code, we started to enhance the socket
> module in socket.py but keep the C code unchanged. I am thinking about the
> support of enums. Some C functions are wrapped in Python.
>
> Victor
>
> Le 19 juin 2017 11:59 PM, "Guido van Rossum" a écrit :
>>
>> There are examples in timemodule.c which went through a similar conversion
>> from plain tuples to (sort-of) named tuples. I agree that upgrading the
>> tuples returned by the socket module to named tuples would be nice, but it's
>> a low priority project. Maybe someone interested can create a PR? (First
>> open an issue stating that you're interested; point to this email from me to
>> prevent that some other core dev just closes it again.)
>>
>> On Mon, Jun 19, 2017 at 2:24 PM, Victor Stinner
>> wrote:
>>>
>>> Hi,
>>>
>>> 2017-06-13 22:13 GMT+02:00 Thomas Güttler :
>>> > AFAIK the socket module returns plain tuples in Python3:
>>> >
>>> > https://docs.python.org/3/library/socket.html
>>> >
>>> > Why not use named tuples?
>>>
>>> For technical reasons: the socket module is mostly implemented in the
>>> C language, and defining a "named tuple" in C requires implementing a
>>> "sequence" type, which requires much more code than creating a tuple.
>>>
>>> In short, creating a tuple is as simple as Py_BuildValue("OO", item1,
>>> item2).
>>>
>>> Creating a "sequence" type requires something like 50 lines of code,
>>> maybe more, I don't know exactly.
>>> >>> Victor >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From ethan at stoneleaf.us Mon Jun 19 22:12:23 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 19 Jun 2017 19:12:23 -0700 Subject: [Python-ideas] the error that raises an AttributeError should be passed to __getattr__ In-Reply-To: References: <20170620001819.GK3149@ando.pearwood.info> Message-ID: <59488487.7020607@stoneleaf.us> On 06/19/2017 06:31 PM, Chris Angelico wrote: > On Tue, Jun 20, 2017 at 10:18 AM, Steven D'Aprano wrote: >> Having said that, there's another problem: adding this feature (whatever >> it actually is) to __getattr__ will break every existing class that uses >> __getattr__. The problem is that everyone who writes a __getattr__ >> method writes it like this: >> >> def __getattr__(self, name): >> >> not: >> >> def __getattr__(self, name, error): >> >> so the class will break when the method receives two arguments >> (excluding self) but only has one parameter. > > Why not just write cross-version-compatible code as > > def __getattr__(self, name, error=None): > > ? Is there something special about getattr? The point was existing code would fail until that change was made. And a lot of existing code uses __getattr__. 
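Ethan's point about breakage can be sketched concretely. The two-argument calls below only simulate the hypothetical new protocol by invoking __getattr__ directly; Python itself still passes a single name argument today:

```python
# Sketch: today's __getattr__ takes one argument besides self, so a
# protocol that passed a second argument would break every existing
# implementation with a TypeError.
class Legacy:
    def __getattr__(self, name):
        return name.upper()

obj = Legacy()
print(obj.spam)  # the normal one-argument protocol still works

# Simulate a hypothetical two-argument protocol call:
try:
    type(obj).__getattr__(obj, "spam", AttributeError("spam"))
except TypeError as e:
    print("breaks:", e)

# Chris's forward-compatible spelling survives both call shapes:
class Ready:
    def __getattr__(self, name, error=None):
        return name.upper()

r = Ready()
print(r.spam)
print(type(r).__getattr__(r, "spam", AttributeError("spam")))
```

The simulation calls the method through the class on purpose; it is only a stand-in for what the interpreter would do under the proposed protocol.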
--
~Ethan~

From steve at pearwood.info  Mon Jun 19 22:26:44 2017
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 20 Jun 2017 12:26:44 +1000
Subject: [Python-ideas] the error that raises an AttributeError should be passed to __getattr__
In-Reply-To: 
References: <20170620001819.GK3149@ando.pearwood.info>
Message-ID: <20170620022643.GM3149@ando.pearwood.info>

On Tue, Jun 20, 2017 at 11:31:34AM +1000, Chris Angelico wrote:

> Why not just write cross-version-compatible code as
>
> def __getattr__(self, name, error=None):
>
> ? Is there something special about getattr?

You've still got to write it in the first place. That's a pain, especially since (1) it doesn't do you any good before 3.7 if not later, and (2) even if this error parameter is useful (which is yet to be established), it's a pretty specialised use. Most of the time, you already know the name that failed (it's the one being looked up).

Perhaps a better approach is to prevent descriptors from leaking AttributeError in the first place? Change the protocol so that if descriptor.__get__ raises AttributeError, it is caught and re-raised as RuntimeError, similar to StopIteration and generators.

Or maybe we decide that it's actually a feature, not a problem, for an AttributeError inside self.attr.__get__ to look like self.attr is missing. I don't know.

(Also everything I said applies to __setattr__ and __delattr__ as well.)
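One historical footnote to Steven's aside that AttributeError doesn't record the missing name: that changed in later releases (Python 3.10 added `name` and `obj` attributes to AttributeError), so the inspection he asks for became possible without parsing the message. A quick check, guarded so it also runs on older versions:

```python
# On Python 3.10+ the exception records the failing name and the object;
# on older versions those attributes simply don't exist.
try:
    ''.foo
except AttributeError as e:
    err = e

print(getattr(err, 'name', '<not recorded on this version>'))
print(getattr(err, 'obj', '<not recorded on this version>'))
```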
-- Steve From ethan at stoneleaf.us Mon Jun 19 22:36:09 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 19 Jun 2017 19:36:09 -0700 Subject: [Python-ideas] the error that raises an AttributeError should be passed to __getattr__ In-Reply-To: <20170620022643.GM3149@ando.pearwood.info> References: <20170620001819.GK3149@ando.pearwood.info> <20170620022643.GM3149@ando.pearwood.info> Message-ID: <59488A19.1010604@stoneleaf.us> On 06/19/2017 07:26 PM, Steven D'Aprano wrote: > Or maybe we decide that it's actually a feature, not a problem, for an > AttributeError inside self.attr.__get__ to look like self.attr is > missing. It's a feature. It's why Enum classes can have members named 'value' and 'name'. -- ~Ethan~ From steve at pearwood.info Mon Jun 19 22:44:12 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 20 Jun 2017 12:44:12 +1000 Subject: [Python-ideas] the error that raises an AttributeError should be passed to __getattr__ In-Reply-To: <59488A19.1010604@stoneleaf.us> References: <20170620001819.GK3149@ando.pearwood.info> <20170620022643.GM3149@ando.pearwood.info> <59488A19.1010604@stoneleaf.us> Message-ID: <20170620024411.GN3149@ando.pearwood.info> On Mon, Jun 19, 2017 at 07:36:09PM -0700, Ethan Furman wrote: > On 06/19/2017 07:26 PM, Steven D'Aprano wrote: > > >Or maybe we decide that it's actually a feature, not a problem, for an > >AttributeError inside self.attr.__get__ to look like self.attr is > >missing. > > It's a feature. It's why Enum classes can have members named 'value' and > 'name'. Can you explain further? What's special about value and name? 
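For reference, the behaviour Ethan is describing can be demonstrated directly; an Enum really can have members spelled name and value, and both class-level and instance-level access keep working:

```python
from enum import Enum

class Field(Enum):
    name = 1
    value = 2

print(Field.name)        # the member called 'name', not the descriptor
print(Field.value)       # the member called 'value'
print(Field.name.value)  # 1 -- instance access hits the descriptor normally
print(Field.value.name)  # 'value'
```

Class access falls through to the member lookup, while instance access goes to the descriptor, which is exactly the two-way routing discussed below.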
-- Steve From ncoghlan at gmail.com Mon Jun 19 23:04:45 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 20 Jun 2017 13:04:45 +1000 Subject: [Python-ideas] socket module: plain stuples vs named tuples In-Reply-To: References: <07d5cb54-e6f4-7f2d-7995-d6c01c94401f@thomas-guettler.de> Message-ID: On 20 June 2017 at 12:05, INADA Naoki wrote: > Namedtuple in Python make startup time slow. > So I'm very conservative to convert tuple to namedtuple in Python. Aye, I don't think a Python level wrapper would be the right way to go here - while namedtuple is designed to be as cheap as normal tuples in *use*, the same can't be said for the impact on startup time. As context for anyone not familiar with the time module precedent that Guido mentioned, we have a C level `PyStructSequence` that provides some of the most essential namedtuple features, but not all of them: https://github.com/python/cpython/blob/master/Objects/structseq.c So there's potentially a case to be made for: 1. Including the struct sequence header from "Python.h" and making it part of the stable ABI 2. Documenting it in the C API reference The main argument against doing so is that the initialisation API for it is pretty weird by the standards of the rest of the Python C API - it's mainly designed for use as a C level type *factory*, rather than intended to be used directly. That's also why there isn't a Python level API for it - it's designed around accepting C level structs as inputs, which doesn't translate well to pure Python code (whereas the collections.namedtuple API works well in pure Python code, but doesn't translate well to C code). Cheers, Nick. 
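For readers who haven't met PyStructSequence from the Python side: time.struct_time (from the timemodule.c precedent mentioned above) is one of these C-level "sort-of" named tuples, and the missing namedtuple extras are easy to see:

```python
import time

t = time.localtime()         # a PyStructSequence instance
print(type(t))               # <class 'time.struct_time'>
print(t.tm_year == t[0])     # named access and indexing agree
print(isinstance(t, tuple))  # True: struct sequences are real tuples
# But only "sort-of" named tuples: no _fields, no _replace.
print(hasattr(t, '_fields'), hasattr(t, '_replace'))
```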
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rosuav at gmail.com Mon Jun 19 23:10:54 2017 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 20 Jun 2017 13:10:54 +1000 Subject: [Python-ideas] the error that raises an AttributeError should be passed to __getattr__ In-Reply-To: <20170620022643.GM3149@ando.pearwood.info> References: <20170620001819.GK3149@ando.pearwood.info> <20170620022643.GM3149@ando.pearwood.info> Message-ID: On Tue, Jun 20, 2017 at 12:26 PM, Steven D'Aprano wrote: > On Tue, Jun 20, 2017 at 11:31:34AM +1000, Chris Angelico wrote: > >> Why not just write cross-version-compatible code as >> >> def __getattr__(self, name, error=None): >> >> ? Is there something special about getattr? > > You've still got to write it in the first place. That's a pain, > especially since (1) it doesn't do you any good before 3.7 if not later, > and (2) even if this error parameter is useful (which is yet to be > established), it's a pretty specialised use. Most of the time, you > already know the name that failed (its the one being looked up). Gotcha, yep. I was just confused by your two-parter that made it look like it would be hard (or impossible) to write code that would work on both 3.6 and the new protocol. > Perhaps a better approach is to prevent descriptors from leaking > AttributeError in the first place? Change the protocol so that if > descriptor.__get__ raises AttributeError, it is caught and re-raised as > RuntimeError, similar to StopIteration and generators. This can't be done globally, because that's how a descriptor can be made conditional (it raises AttributeError to say "this attribute does not, in fact, exist"). But it's easy enough - and safe enough - to do it just for your own module, where you know in advance that any AttributeError is a leak. 
The way generators and StopIteration interact was more easily fixed, because generators have two legitimate ways to emit data (yield and return), but there's no easy way for a magic method to say "I don't have anything to return" other than an exception.

Well, that's not strictly true. In JavaScript, they don't raise StopIteration from iterators - they always return a pair of values ("done" and the actual value, where "done" is either false for a yield or true for a StopIteration). That complicates the normal case but it does make the unusual case a bit easier. Also, it's utterly and fundamentally incompatible with the current system, so it'd have to be a brand new competing protocol.

ChrisA

From jjmaldonis at gmail.com  Mon Jun 19 23:26:21 2017
From: jjmaldonis at gmail.com (Jason Maldonis)
Date: Mon, 19 Jun 2017 22:26:21 -0500
Subject: [Python-ideas] the error that raises an AttributeError should be passed to __getattr__
In-Reply-To: 
References: <20170620001819.GK3149@ando.pearwood.info> <20170620022643.GM3149@ando.pearwood.info>
Message-ID: 

First, I apologize for the poor post. Your corrections were exactly correct: This is only relevant in the context of properties/descriptors, and the property swallows the error message and it isn't printed to screen. I should not be typing without testing.

> So... what precisely should be passed to __getattr__, and what exactly are you going to do with it?

I'd say the error instance that caused __getattr__ to be invoked should be passed into __getattr__, but I don't have a strong opinion on that. I'll assume that's true for the rest of this post, however.

To clarify my mistakes in my first post, your example illustrates what I wanted to show:

class A(object):
    eggs = "text"
    def __getattr__(self, name):
        if name == 'cheese':
            return "cheddar"
        raise AttributeError('%s missing' % name)
    @property
    def spam(self):
        return self.eggs.uper()  # Oops.
a = A()
a.spam

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 6, in __getattr__
AttributeError: spam missing

This swallows the AttributeError from `eggs.uper()` and it isn't available. Even if it were available, I see your point that it may not be especially useful. In one iteration of my code I was looking through the stack trace using the traceback module to find the error I wanted, but I quickly decided that was a bad idea because I couldn't reliably find the error. However, with the full error, it would (I think) be trivial to find the relevant error in the stack trace. With the full stack trace, I would hope that you could properly do any error handling you wanted.

However, if the error was available in __getattr__, we could at least `raise from` so that the error isn't completely swallowed. I.e. your example would be slightly modified like this:

class A(object):
    eggs = "text"
    def __getattr__(self, name, error):
        if name == 'cheese':
            return "cheddar"
        raise AttributeError('%s missing' % name) from error

... which I think is useful.

> That's a fairly long and heavy process, and will be quite annoying to
> those writing cross-version code using __getattr__, but it can be done.

This is another thing I hadn't considered, and I have no problem saying this non-backwards-compatible change just isn't worth it. Maybe if there are multiple small updates to error handling it could be worth it (which I believe I read was something the devs care quite a bit about atm), but I don't think that this change is a huge deal.

> One good solution to this is a "guard point" around your property functions.

I am using a very similar decorator (modified from the one you gave me a month or so ago) in my code now, and functionally it works great. There is something that feels a bit off / like a hack to me though, and maybe it's that I am simply "renaming" the error from one I can't handle (due to name conflicts) to one I can.
But the modification is just a name change -- if I can handle the RuntimeError correctly, I feel like I should have just been able to handle the original AttributeError correctly (because in practice they should be raising an error in response to the exact same problem). That said, your decorator works great and gives me the functionality I needed.

> > It's a feature. It's why Enum classes can have members named 'value' and
> > 'name'.

> Can you explain further? What's special about value and name?

I'm also very curious about this. I've read the Enum code a few times but it hasn't clicked yet.

> > Perhaps a better approach is to prevent descriptors from leaking
> > AttributeError in the first place? Change the protocol so that if
> > descriptor.__get__ raises AttributeError, it is caught and re-raised as
> > RuntimeError, similar to StopIteration and generators.

> This can't be done globally, because that's how a descriptor can be
> made conditional (it raises AttributeError to say "this attribute does
> not, in fact, exist"). But it's easy enough - and safe enough - to do
> it just for your own module, where you know in advance that any
> AttributeError is a leak.

I have a pretty large codebase with a few different functionalities, so I'm hesitant to say "any AttributeError is a leak" in an object that might affect / be affected by other packages/functionalities. One other thing I really like about the `noleak` decorator is that I can change the AttributeError to a custom MyCustomError, which allows me to handle MyCustomError precisely where it should be handled. Also, maybe I'm just not fully understanding the locality of re-raising another error from descriptor.__get__ because I haven't completely wrapped my head around it yet.
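The custom-error variant Jason describes might look like the sketch below; PropertyErrorLeak is an illustrative stand-in for his MyCustomError, not anything from his real code:

```python
import functools

class PropertyErrorLeak(Exception):
    """Illustrative custom error for AttributeErrors leaking out of properties."""

def noleak(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except AttributeError as e:
            # Re-raise under a distinct type so __getattr__ never fires,
            # and chain the original so no information is swallowed.
            raise PropertyErrorLeak(str(e)) from e
    return wrapper

class A:
    eggs = "text"

    def __getattr__(self, name):
        raise AttributeError('%s missing' % name)

    @property
    @noleak
    def spam(self):
        return self.eggs.uper()  # the typo from Steven's example

try:
    A().spam
except PropertyErrorLeak as e:
    print("caught the real error:", e)
```

Because the re-raised exception is no longer an AttributeError, attribute lookup never falls back to __getattr__, and the handler sees the real failure (with the original chained via `__cause__`) instead of "spam missing".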
On Mon, Jun 19, 2017 at 10:10 PM, Chris Angelico wrote: > On Tue, Jun 20, 2017 at 12:26 PM, Steven D'Aprano > wrote: > > On Tue, Jun 20, 2017 at 11:31:34AM +1000, Chris Angelico wrote: > > > >> Why not just write cross-version-compatible code as > >> > >> def __getattr__(self, name, error=None): > >> > >> ? Is there something special about getattr? > > > > You've still got to write it in the first place. That's a pain, > > especially since (1) it doesn't do you any good before 3.7 if not later, > > and (2) even if this error parameter is useful (which is yet to be > > established), it's a pretty specialised use. Most of the time, you > > already know the name that failed (its the one being looked up). > > Gotcha, yep. I was just confused by your two-parter that made it look > like it would be hard (or impossible) to write code that would work on > both 3.6 and the new protocol. > > > Perhaps a better approach is to prevent descriptors from leaking > > AttributeError in the first place? Change the protocol so that if > > descriptor.__get__ raises AttributeError, it is caught and re-raised as > > RuntimeError, similar to StopIteration and generators. > > This can't be done globally, because that's how a descriptor can be > made conditional (it raises AttributeError to say "this attribute does > not, in fact, exist"). But it's easy enough - and safe enough - to do > it just for your own module, where you know in advance that any > AttributeError is a leak. The way generators and StopIteration > interact was more easily fixed, because generators have two legitimate > ways to emit data (yield and return), but there's no easy way for a > magic method to say "I don't have anything to return" other than an > exception. > > Well, that's not strictly true. In JavaScript, they don't raise > StopIteration from iterators - they always return a pair of values > ("done" and the actual value, where "done" is either false for a yield > or true for a StopIteration). 
That complicates the normal case but it > does make the unusual case a bit easier. Also, it's utterly and > fundamentally incompatible with the current system, so it'd have to be > a brand new competing protocol. > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Mon Jun 19 23:39:01 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 19 Jun 2017 20:39:01 -0700 Subject: [Python-ideas] the error that raises an AttributeError should be passed to __getattr__ In-Reply-To: <20170620024411.GN3149@ando.pearwood.info> References: <20170620001819.GK3149@ando.pearwood.info> <20170620022643.GM3149@ando.pearwood.info> <59488A19.1010604@stoneleaf.us> <20170620024411.GN3149@ando.pearwood.info> Message-ID: <594898D5.7060003@stoneleaf.us> On 06/19/2017 07:44 PM, Steven D'Aprano wrote: > On Mon, Jun 19, 2017 at 07:36:09PM -0700, Ethan Furman wrote: >> On 06/19/2017 07:26 PM, Steven D'Aprano wrote: >>> Or maybe we decide that it's actually a feature, not a problem, for an >>> AttributeError inside self.attr.__get__ to look like self.attr is >>> missing. >> >> It's a feature. It's why Enum classes can have members named 'value' and >> 'name'. > > Can you explain further? What's special about value and name? value and name are attributes of every Enum member; to be specific, they are descriptors, and so live in the class namespace. Enum members also live in the class namespace, so how do you get both a member name value and the value descriptor to both live in the class namespace? Easy. ;) Have the "value" and "name" descriptors check to see if they are being called on an instance, or an the class. 
If called on an instance they behave normally, returning the "name" or "value" data; but if called on the class the descriptor raises AttributeError, which causes Python to try the Enum class' __getattr__ method, which can find the member and return it... or raise AttributeError again if there is no "name" or "value" member. Here's the docstring from the types.DynamicClassAttribute in question: class DynamicClassAttribute: """Route attribute access on a class to __getattr__. This is a descriptor, used to define attributes that act differently when accessed through an instance and through a class. Instance access remains normal, but access to an attribute through a class will be routed to the class's __getattr__ method; this is done by raising AttributeError. This allows one to have properties active on an instance, and have virtual attributes on the class with the same name (see Enum for an example). """ -- ~Ethan~ From victor.stinner at gmail.com Tue Jun 20 07:12:33 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 20 Jun 2017 13:12:33 +0200 Subject: [Python-ideas] socket module: plain stuples vs named tuples In-Reply-To: References: <07d5cb54-e6f4-7f2d-7995-d6c01c94401f@thomas-guettler.de> Message-ID: 2017-06-20 4:05 GMT+02:00 INADA Naoki : > Namedtuple in Python make startup time slow. > So I'm very conservative to convert tuple to namedtuple in Python. > INADA Naoki While we are talking about startup time, I would be curious of seeing the overhead (python startup time, when importing socket) of the enums added to socket.py ;-) "import enum" added to Lib/re.py was a regression causing a slowdown in the "python_startup" benchmark, but it was related to the site module importing "re" in Python is running in a virtual environment. This very specific use case was fixed by not using the re module in the site module. 
Victor From barry at barrys-emacs.org Tue Jun 20 15:54:19 2017 From: barry at barrys-emacs.org (Barry Scott) Date: Tue, 20 Jun 2017 20:54:19 +0100 Subject: [Python-ideas] Restore the __members__ behavior to python3 for C extension writers In-Reply-To: References: Message-ID: <9A9E548B-B897-4998-9583-F693CC9495AD@barrys-emacs.org> > On 19 Jun 2017, at 13:56, Nick Coghlan wrote: > > On 19 June 2017 at 04:10, Barry Scott wrote: >> The code does this: >> >> Py::Object getattro( const Py::String &name_ ) >> { >> std::string name( name_.as_std_string( "utf-8" ) ); >> >> if( name == "value" ) >> { >> return m_value; >> } >> else >> { >> return genericGetAttro( name_ ); >> } >> } >> >> Where getattro is called (indirectly) from tp_getattro. >> >> In the python 2 I can tell python that 'value' exists because I provide a value of __members__. >> >> What is the way to tell python about 'value' in the python3 world? > > OK, I think I may understand the confusion now. > > As Thomas noted, the preferred way of informing Python of data > attributes for types implemented in C is to ask the interpreter to > automatically create the appropriate descriptor objects by setting the > `tp_members` slot on the C level *type*, rather than setting > `__members__` on the instance: > https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_members > > That approach also works for new-style classes in Python 2: > https://docs.python.org/2/c-api/typeobj.html#c.PyTypeObject.tp_members I'll see if I can use this to convert existing code that depends on __members__ and report back. > > I believe this is actually an old-/new-style class difference, so the > relevant Python 3 change is the fact that the old-style approach > simply isn't available any more. > > So if you use tp_members and tp_getset to request the creation of > suitable descriptors, then the interpreter will automatically take > care of populating the results of `dir()` correctly. 
However, if > you're genuinely dynamically adding attributes in `__getattr__`, then > you're going to need to add code to report them from `__dir__` as > well. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at barrys-emacs.org Tue Jun 20 16:14:38 2017 From: barry at barrys-emacs.org (Barry Scott) Date: Tue, 20 Jun 2017 21:14:38 +0100 Subject: [Python-ideas] Restore the __members__ behavior to python3 for C extension writers In-Reply-To: References: Message-ID: > On 19 Jun 2017, at 13:56, Nick Coghlan wrote: > > On 19 June 2017 at 04:10, Barry Scott wrote: >> The code does this: >> >> Py::Object getattro( const Py::String &name_ ) >> { >> std::string name( name_.as_std_string( "utf-8" ) ); >> >> if( name == "value" ) >> { >> return m_value; >> } >> else >> { >> return genericGetAttro( name_ ); >> } >> } >> >> Where getattro is called (indirectly) from tp_getattro. >> >> In the python 2 I can tell python that 'value' exists because I provide a value of __members__. >> >> What is the way to tell python about 'value' in the python3 world? > > OK, I think I may understand the confusion now. > > As Thomas noted, the preferred way of informing Python of data > attributes for types implemented in C is to ask the interpreter to > automatically create the appropriate descriptor objects by setting the > `tp_members` slot on the C level *type*, rather than setting > `__members__` on the instance: > https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_members > That approach also works for new-style classes in Python 2: > https://docs.python.org/2/c-api/typeobj.html#c.PyTypeObject.tp_members This is not useful as the values in my use cases are typically taken from a C or C++ objects that holds the master copy. The python class is a facade that is forwarding to a embedded object typically. And the values return might switch type. 
Being a PyString or None for example. > > I believe this is actually an old-/new-style class difference, so the > relevant Python 3 change is the fact that the old-style approach > simply isn't available any more. > > So if you use tp_members and tp_getset to request the creation of > suitable descriptors, then the interpreter will automatically take > care of populating the results of `dir()` correctly. However, if > you're genuinely dynamically adding attributes in `__getattr__`, then > you're going to need to add code to report them from `__dir__` as > well. tp_getset might work. I'll have to a largest block of time experiment with it and think about C++ API for it. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > -------------- next part -------------- An HTML attachment was scrubbed... URL: From contact at brice.xyz Wed Jun 21 04:31:05 2017 From: contact at brice.xyz (Brice PARENT) Date: Wed, 21 Jun 2017 10:31:05 +0200 Subject: [Python-ideas] Language proposal: variable assignment in functional context In-Reply-To: <20170617102659.GI3149@ando.pearwood.info> References: <20170617002754.GH3149@ando.pearwood.info> <279b1f21-0e16-465e-7e7b-1c3693c77141@mail.de> <20170617102659.GI3149@ando.pearwood.info> Message-ID: <7a2dcc59-f4a7-a801-9d16-d38991ec1496@brice.xyz> I might not have understood it completely, but I think the use cases would probably better be splitted into two categories, each with a simple solution (simple in usage at least): *When we just want a tiny scope for a variable:* Syntax: with [assignment]: # use of the variable # assigned variable is now out of scope Examples: with b = a + 1: y = b + 2 # we can use y here, but not b or with delta = lambda a, b, c: b**2 - 4 * a * c: x1 = (- b - math.sqrt(delta(a, b, c))) / (2 * a) x2 = (- b + math.sqrt(delta(a, b, c))) / (2 * a) # delta func is now out of scope and has been destroyed We don't keep unnecessarily some variables, as well as we don't risk any collision 
with outer scopes (and we preserve readability by not declaring a function for that). It would probably mean the assignment operator should behave differently than it does now which could have unexpected (to me) implications. It would have to support both __enter__ and __exit__ methods, but I don't know if this makes any sense. I don't know if with a + 1 as b: would make a better sense or be a side-effect or special cases hell. *When we want to simplify a comprehension:* (although it would probably help in many other situations) Syntax: prepare_iterable(sequence, *functions) which creates a new iterable containing tuples like (element, return_of_function_1, return_of_function_2, ...) Examples: [m(spam)[eggs] for _, m inprepare_iterable(sequence, lambda obj: obj[0].field.method) if m] or, outside of a comprehension: sequence = [0, 1, 5] prepare_iterable(sequence, lambda o: o * 3, lambda o: o + 1) # -> [(0, 0, 1), (1, 3, 2), (5, 15, 6)] The "prepare_iterable" method name might or might not be the right word to use. But English not being my mother language, I'm not the right person to discuss this... It would be a function instead of a method shared by all iterables to be able to yield the elements instead of processing the hole set of data right from the start. This function should probably belong to the standard library but probably not in the general namespace. -- Brice Le 17/06/17 ? 12:27, Steven D'Aprano a ?crit : > On Sat, Jun 17, 2017 at 09:03:54AM +0200, Sven R. Kunze wrote: >> On 17.06.2017 02:27, Steven D'Aprano wrote: >>> I think this is somewhat similar to a suggestion of Nick Coghlan's. One >>> possible syntax as a statement might be: >>> >>> y = b + 2 given: >>> b = a + 1 >> Just to get this right:this proposal is about reversing the order of >> chaining expressions? > Partly. Did you read the PEP? 
> > https://www.python.org/dev/peps/pep-3150/ > > I quote: > > The primary motivation is to enable a more declarative style of > programming, where the operation to be performed is presented to the > reader first, and the details of the necessary subcalculations are > presented in the following indented suite. > [...] > A secondary motivation is to simplify interim calculations in module > and class level code without polluting the resulting namespaces. > > It is not *just* about reversing the order, it is also about avoiding > polluting the current namespace (global, or class) with unnecessary > temporary variables. This puts the emphasis on the important part of the > expression, not the temporary/implementation variables: > > page = header + body + footer where: > header = ... > body = ... > footer = ... > > There is prior art: the "where" and "let" clauses in Haskell, as well as > mathematics, where it is very common to defer the definition of > temporary variables until after they are used. > > >> Instead of: >> >> b = a + 1 >> c = b + 2 >> >> we could write it in reverse order: >> >> c = b + 2 given/for: >> b = a + 1 > > Right. But of course such a trivial example doesn't demonstrate any > benefit. This might be a better example. > > Imagine you have this code, where the regular expression and the custom > sort function are used in one place only. Because they're only used > *once*, we don't really need them to be top-level global names, but > currently we have little choice. > > regex = re.compile(r'.*?(\d*).*') > > def custom_sort(string): > mo = regex.match(string) > ... 
some implementation > return key > > # Later > results = sorted(some_strings, key=custom_sort) > > # Optional > del custom_sort, regex > > > Here we get the order of definitions backwards: the thing we actually > care about, results = sorted(...), is defined last, and mere > implementation details are given priority as top-level names that > either hang around forever, or need to be explicitly deleted. > > Some sort of "where" clause could allow: > > results = sorted(some_strings, key=custom_sort) where: > regex = re.compile(r'.*?(\d*).*') > > def custom_sort(string): > mo = regex.match(string) > ... some implementation > return key > > > If this syntax was introduced, editors would soon allow you to fold the > "where" block and hide it. The custom_sort and regex names would be > local to the where block and the results = ... line. > > Another important use-case is comprehensions, where we often have to > repeat ourselves: > > [obj[0].field.method(spam)[eggs] for obj in sequence if obj[0].field.method] > > One work around: > > [m(spam)[eggs] for m in [obj[0].field.method for obj in sequence] if m] > > But perhaps we could do something like: > > [m(spam)[eggs] for obj in sequence where m = obj[0].field.method if m] > > or something similar. > > > >> If so, I don't know if it just complicates the language with a feature >> which does not save writing nor reading > It helps to save reading, by pushing less-important implementation > details of an expression into an inner block where it is easy to ignore > them. Even if you don't have an editor which does code folding, it is > easy to skip over an indented block and just read the header line, > ignoring the implementation. We already do this with classes, functions, > even loops: > > class K: > ... implementation of K > > def func(arg): > ... implementation of func > > for x in seq: > ... implementation of loop body > > page = header + body + footer where: > ... 
implementation of page > > > As a general rule, any two lines at the same level of indentation are > read as being of equal importance. When we care about the implementation > details, we "drop down" into the lower indentation block. But when > skimming the code for a high-level overview, we skip the details of > indented blocks and focus only on the current level: > > class K: > def func(arg): > for x in seq: > page = header + body + footer where: > > > (That's why editors often provide code folding, to hide the details of > an indented block. But even without that feature, we can do it in our > own head, although not as effectively.) > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsbueno at python.org.br Wed Jun 21 05:29:10 2017 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Wed, 21 Jun 2017 11:29:10 +0200 Subject: [Python-ideas] Language proposal: variable assignment in functional context In-Reply-To: <7a2dcc59-f4a7-a801-9d16-d38991ec1496@brice.xyz> References: <20170617002754.GH3149@ando.pearwood.info> <279b1f21-0e16-465e-7e7b-1c3693c77141@mail.de> <20170617102659.GI3149@ando.pearwood.info> <7a2dcc59-f4a7-a801-9d16-d38991ec1496@brice.xyz> Message-ID: I could emulate the "where" semantics as described by Steven using the class statement and eval : I guess it is useful if someone want to try refactoring a piece of "real world" code, and see if it really feels better - then we could try to push for the "where" syntax, which I kinda like: (This emulation requires the final expression to be either a string to be eval'ed, or a lambda function - but then some namespace retrieval and parameter matching would require a bit more code on the metaclass): class Where(type): def __new__(metacls, name, bases, namespace, *, expr=''): return eval(expr, namespace) def sqroot(n): class roots(expr="[(-b + r)/ (2 * a) for r in (+ delta **0.5, - delta ** 0.5) ]", metaclass=Where): a, b, c = n delta = b ** 2 - 4 * a * c return 
roots On 21 June 2017 at 10:31, Brice PARENT wrote: > I might not have understood it completely, but I think the use cases would > probably better be splitted into two categories, each with a simple > solution (simple in usage at least): > > *When we just want a tiny scope for a variable:* > > Syntax: > > with [assignment]: > # use of the variable > > # assigned variable is now out of scope > > Examples: > > with b = a + 1: > y = b + 2 > > # we can use y here, but not b > > or > > with delta = lambda a, b, c: b**2 - 4 * a * c: > x1 = (- b - math.sqrt(delta(a, b, c))) / (2 * a) > x2 = (- b + math.sqrt(delta(a, b, c))) / (2 * a) > > # delta func is now out of scope and has been destroyed > > We don't keep unnecessarily some variables, as well as we don't risk any > collision with outer scopes (and we preserve readability by not declaring a > function for that). > > It would probably mean the assignment operator should behave differently > than it does now which could have unexpected (to me) implications. It would > have to support both __enter__ and __exit__ methods, but I don't know if > this makes any sense. I don't know if with a + 1 as b: would make a > better sense or be a side-effect or special cases hell. > > *When we want to simplify a comprehension:* > > (although it would probably help in many other situations) > > Syntax: > > prepare_iterable(sequence, *functions) > > which creates a new iterable containing tuples like (element, > return_of_function_1, return_of_function_2, ...) > > Examples: > > [m(spam)[eggs] for _, m in prepare_iterable(sequence, lambda obj: obj[0].field.method) if m] > > or, outside of a comprehension: > > sequence = [0, 1, 5] > prepare_iterable(sequence, lambda o: o * 3, lambda o: o + 1) > # -> [(0, 0, 1), (1, 3, 2), (5, 15, 6)] > The "prepare_iterable" method name might or might not be the right word > to use. But English not being my mother language, I'm not the right person > to discuss this... 
> It would be a function instead of a method shared by all iterables to be > able to yield the elements instead of processing the hole set of data right > from the start. > This function should probably belong to the standard library but probably > not in the general namespace. > > -- Brice > > Le 17/06/17 ? 12:27, Steven D'Aprano a ?crit : > > On Sat, Jun 17, 2017 at 09:03:54AM +0200, Sven R. Kunze wrote: > > On 17.06.2017 02:27, Steven D'Aprano wrote: > > I think this is somewhat similar to a suggestion of Nick Coghlan's. One > possible syntax as a statement might be: > > y = b + 2 given: > b = a + 1 > > Just to get this right:this proposal is about reversing the order of > chaining expressions? > > Partly. Did you read the PEP? > https://www.python.org/dev/peps/pep-3150/ > > I quote: > > The primary motivation is to enable a more declarative style of > programming, where the operation to be performed is presented to the > reader first, and the details of the necessary subcalculations are > presented in the following indented suite. > [...] > A secondary motivation is to simplify interim calculations in module > and class level code without polluting the resulting namespaces. > > It is not *just* about reversing the order, it is also about avoiding > polluting the current namespace (global, or class) with unnecessary > temporary variables. This puts the emphasis on the important part of the > expression, not the temporary/implementation variables: > > page = header + body + footer where: > header = ... > body = ... > footer = ... > > There is prior art: the "where" and "let" clauses in Haskell, as well as > mathematics, where it is very common to defer the definition of > temporary variables until after they are used. > > > > Instead of: > > b = a + 1 > c = b + 2 > > we could write it in reverse order: > > c = b + 2 given/for: > b = a + 1 > > > Right. But of course such a trivial example doesn't demonstrate any > benefit. This might be a better example. 
> > Imagine you have this code, where the regular expression and the custom > sort function are used in one place only. Because they're only used > *once*, we don't really need them to be top-level global names, but > currently we have little choice. > > regex = re.compile(r'.*?(\d*).*') > > def custom_sort(string): > mo = regex.match(string) > ... some implementation > return key > > # Later > results = sorted(some_strings, key=custom_sort) > > # Optional > del custom_sort, regex > > > Here we get the order of definitions backwards: the thing we actually > care about, results = sorted(...), is defined last, and mere > implementation details are given priority as top-level names that > either hang around forever, or need to be explicitly deleted. > > Some sort of "where" clause could allow: > > results = sorted(some_strings, key=custom_sort) where: > regex = re.compile(r'.*?(\d*).*') > > def custom_sort(string): > mo = regex.match(string) > ... some implementation > return key > > > If this syntax was introduced, editors would soon allow you to fold the > "where" block and hide it. The custom_sort and regex names would be > local to the where block and the results = ... line. > > Another important use-case is comprehensions, where we often have to > repeat ourselves: > > [obj[0].field.method(spam)[eggs] for obj in sequence if obj[0].field.method] > > One work around: > > [m(spam)[eggs] for m in [obj[0].field.method for obj in sequence] if m] > > But perhaps we could do something like: > > [m(spam)[eggs] for obj in sequence where m = obj[0].field.method if m] > > or something similar. > > > > > If so, I don't know if it just complicates the language with a feature > which does not save writing nor reading > > It helps to save reading, by pushing less-important implementation > details of an expression into an inner block where it is easy to ignore > them. 
Even if you don't have an editor which does code folding, it is > easy to skip over an indented block and just read the header line, > ignoring the implementation. We already do this with classes, functions, > even loops: > > class K: > ... implementation of K > > def func(arg): > ... implementation of func > > for x in seq: > ... implementation of loop body > > page = header + body + footer where: > ... implementation of page > > > As a general rule, any two lines at the same level of indentation are > read as being of equal importance. When we care about the implementation > details, we "drop down" into the lower indentation block. But when > skimming the code for a high-level overview, we skip the details of > indented blocks and focus only on the current level: > > class K: > def func(arg): > for x in seq: > page = header + body + footer where: > > > (That's why editors often provide code folding, to hide the details of > an indented block. But even without that feature, we can do it in our > own head, although not as effectively.) > > > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lele at metapensiero.it Wed Jun 21 06:09:40 2017 From: lele at metapensiero.it (Lele Gaifax) Date: Wed, 21 Jun 2017 12:09:40 +0200 Subject: [Python-ideas] Language proposal: variable assignment in functional context References: <20170617002754.GH3149@ando.pearwood.info> <279b1f21-0e16-465e-7e7b-1c3693c77141@mail.de> <20170617102659.GI3149@ando.pearwood.info> <7a2dcc59-f4a7-a801-9d16-d38991ec1496@brice.xyz> Message-ID: <87o9thabyz.fsf@metapensiero.it> Brice PARENT writes: > Examples: > > with b = a + 1: > y = b + 2 I don't think that could work, because the with "arguments" should be expressions, not statements. 
However, IIRC someone already suggested the alternative with a+1 as b: y = b + 2 but that clashes with the ordinary "context manager" syntax. It's a pity "exec" is now a plain function, instead of a keyword as it was in Py2, as that could allow exec: y = b + 2 with: # or even "in:" b = a + 1 ciao, lele. -- nickname: Lele Gaifax | real: Emanuele Gaifas | lele at metapensiero.it | "When I live off what I thought yesterday, I will start to fear those who copy me." -- Fortunato Depero, 1929. From guettliml at thomas-guettler.de Thu Jun 22 11:25:26 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Thu, 22 Jun 2017 17:25:26 +0200 Subject: [Python-ideas] socket module: plain tuples vs named tuples - Thank you In-Reply-To: References: <07d5cb54-e6f4-7f2d-7995-d6c01c94401f@thomas-guettler.de> Message-ID: <500887d3-f2db-8563-ee5d-a22bab7e56e6@thomas-guettler.de> Thank you! I am happy that Guido is open to a pull request ... There were +1 votes, too (and some concern about Python startup time). I have stopped coding in my spare time, since my children are more important at the moment ... if someone wants to try it, go ahead and implement named tuples for the socket standard library - that would be great. Just for the record, I came here because of this feature request: https://github.com/giampaolo/psutil/issues/928 Regards, Thomas Güttler PS: For some strange reason I received only some of the mails in this thread. But I could find the whole thread in the archive. On 20.06.2017 at 04:05, INADA Naoki wrote: > Namedtuple in Python makes startup time slow. > So I'm very conservative about converting tuples to namedtuples in Python. > INADA Naoki > > > On Tue, Jun 20, 2017 at 7:27 AM, Victor Stinner > wrote: >> Oh, about the cost of writing C code, we started to enhance the socket >> module in socket.py but kept the C code unchanged. I am thinking about >> support for enums. Some C functions are wrapped in Python.
>> >> Victor >> >> On 19 June 2017 at 11:59 PM, "Guido van Rossum" wrote: >>> >>> There are examples in timemodule.c which went through a similar conversion >>> from plain tuples to (sort-of) named tuples. I agree that upgrading the >>> tuples returned by the socket module to named tuples would be nice, but it's >>> a low priority project. Maybe someone interested can create a PR? (First >>> open an issue stating that you're interested; point to this email from me to >>> prevent that some other core dev just closes it again.) >>> >>> On Mon, Jun 19, 2017 at 2:24 PM, Victor Stinner >>> wrote: >>>> >>>> Hi, >>>> >>>> 2017-06-13 22:13 GMT+02:00 Thomas Güttler : >>>>> AFAIK the socket module returns plain tuples in Python3: >>>>> >>>>> https://docs.python.org/3/library/socket.html >>>>> >>>>> Why not use named tuples? >>>> >>>> For technical reasons: the socket module is mostly implemented in the >>>> C language, and defining a "named tuple" in C requires implementing a >>>> "sequence" type, which requires much more code than creating a tuple. >>>> >>>> In short, creating a tuple is as simple as Py_BuildValue("OO", item1, >>>> item2). >>>> >>>> Creating a "sequence" type requires something like 50 lines of code, >>>> maybe more, I don't know exactly.
>>>> >>>> Victor >>> >>> -- >>> --Guido van Rossum (python.org/~guido) -- Thomas Guettler http://www.thomas-guettler.de/ From srkunze at mail.de Thu Jun 22 15:17:14 2017 From: srkunze at mail.de (Sven R. Kunze) Date: Thu, 22 Jun 2017 21:17:14 +0200 Subject: [Python-ideas] Language proposal: variable assignment in functional context In-Reply-To: References: <20170617002754.GH3149@ando.pearwood.info> <279b1f21-0e16-465e-7e7b-1c3693c77141@mail.de> Message-ID: <33d14764-ee11-52be-11c4-de032a43200e@mail.de> On 17.06.2017 17:51, Nick Coghlan wrote: > You've pretty much hit on why that PEP's been deferred for ~5 years or > so - I'm waiting to see use cases where we can genuinely say "this > would be so much easier and more readable if we had a given > construct!" :) With this PEP accepted, we would have 3 ways of doing 1 thing (imperative programming): 1) flat inline execution namespace 2) structured inline execution namespace 3) named code used in 1) or 2) Is that simple enough for Python? Just one side thought: internally, we have a guideline which says: "please reduce the number of indentations" -> less else, less if, less while, etc. The goal is more compact, less complex code. Our code is already complex enough due to its sheer amount.
"given" would fall under the same rule here: keep it flat and simple; otherwise, give it a name. > Then asyncio (and variants like > curio and trio) came along and asked the question: what if we built on > the concepts explored by Twisted's inlineDeferred's, and instead made > it easier to write asynchronous code without explicitly constructing > callback chains? Does it relate? I can imagine having both "given" and "async given". > However, in my own work, > having to come up with a sensible name for the encapsulated operation > generally comes with a readability benefit as well, so... Well said. In the end (when it comes to professional code), you need to test those little things anyway. So addressing them is necessary. In interactive code, well, honestly, I don't care so much about spoiling namespaces and using variables names such as 'a' or 'bb' is quite common to try things out. @Steven Good post, thanks for explaining it. :) Might be too much for the simple Python I know and value but hey if it helps. Maybe it will enable a lot of cool stuff and we cannot imagine right now just because David Beazley cannot try it out in practice. Regards, Sven From srkunze at mail.de Thu Jun 22 16:30:57 2017 From: srkunze at mail.de (Sven R. Kunze) Date: Thu, 22 Jun 2017 22:30:57 +0200 Subject: [Python-ideas] Improving Catching Exceptions Message-ID: <8293197a-4582-5971-e0f7-76a4e897cbb0@mail.de> Hi folks, just one note I'd like to dump here. We usually teach our newbies to catch exceptions as narrowly as possible, i.e. MyModel.DoesNotExist instead of a plain Exception. This works out quite well for now but the number of examples continue to grow where it's not enough. There are at least three examples I can name off the top of my head: 1) nested StopIteration - PEP 479 2) nested ImportError 3) nested AttributeError 1) is clear. 
2) usually can be dealt with by applying the following pattern: try: import user except ImportError: import sys if sys.exc_info()[2].tb_next: raise Chris showed how to deal with 3). Catching nested exception is not what people want many times. Am I the only one getting the impression that there's a common theme here? Regards, Sven From steve at pearwood.info Thu Jun 22 16:55:33 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 23 Jun 2017 06:55:33 +1000 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: <8293197a-4582-5971-e0f7-76a4e897cbb0@mail.de> References: <8293197a-4582-5971-e0f7-76a4e897cbb0@mail.de> Message-ID: <20170622205532.GP3149@ando.pearwood.info> On Thu, Jun 22, 2017 at 10:30:57PM +0200, Sven R. Kunze wrote: > Hi folks, > > just one note I'd like to dump here. > > We usually teach our newbies to catch exceptions as narrowly as > possible, i.e. MyModel.DoesNotExist instead of a plain Exception. This > works out quite well for now but the number of examples continue to grow > where it's not enough. (1) Under what circumstances is it not enough? (2) Is that list growing? (3) You seem to be implying that "catch narrow exceptions" is bad advice and we should catch Exception instead. How does that help? > There are at least three examples I can name off the top of my head: > 1) nested StopIteration - PEP 479 StopIteration and generators have been around a long time, since Python 2.2 I think, so this is not new. To the extent this was a problem, it is fixed now. > 2) nested ImportError > 3) nested AttributeError Both of those have been around since Python 1.x days, so not new either. If the list is growing, can you give some more recent examples? > 1) is clear. 2) usually can be dealt with by applying the following pattern: > > try: > import user > except ImportError: > import sys > if sys.exc_info()[2].tb_next: > raise I've never needed to write something like that for ImportError. 
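For readers unfamiliar with the tb_next pattern quoted above: it distinguishes "the module itself was not found" (the traceback has a single frame, the import statement) from "the module was found but raised ImportError while executing" (the traceback has further frames). A self-contained sketch of that distinction, using a deliberately missing module name (the function name is illustrative):

```python
import sys

def shallow_import_failed():
    """Return True if the import fails at the import statement itself.

    A nested ImportError -- one raised inside a module that *was* found,
    where the traceback has a tb_next frame -- is re-raised instead of
    being treated as "module missing". This packages the pattern quoted
    above.
    """
    try:
        import _module_that_does_not_exist_  # deliberately missing
    except ImportError:
        tb = sys.exc_info()[2]
        if tb.tb_next:   # failure came from inside a found module
            raise
        return True      # the module itself is missing
    return False
```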
It seems like an anti-pattern to me: sometimes it will silently swallow the exception, and now `user` will remain undefined, a landmine waiting to explode (with NameError) in your code. > Chris showed how to deal with 3). Catching nested exception is not what > people want many times. Isn't it? Why not? Can you explain further? > Am I the only one getting the impression that there's a common theme here? I don't know what common theme you see. I can't see one. Do you actually have a proposal? -- Steve From cs at zip.com.au Thu Jun 22 19:29:23 2017 From: cs at zip.com.au (Cameron Simpson) Date: Fri, 23 Jun 2017 09:29:23 +1000 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: <20170622205532.GP3149@ando.pearwood.info> References: <20170622205532.GP3149@ando.pearwood.info> Message-ID: <20170622232923.GA48632@cskk.homeip.net> On 23Jun2017 06:55, Steven D'Aprano wrote: >On Thu, Jun 22, 2017 at 10:30:57PM +0200, Sven R. Kunze wrote: >> We usually teach our newbies to catch exceptions as narrowly as >> possible, i.e. MyModel.DoesNotExist instead of a plain Exception. This >> works out quite well for now but the number of examples continue to grow >> where it's not enough. > >(1) Under what circumstances is it not enough? I believe that he means that it isn't precise enough. In particular, "nested exceptions" to me, from his use cases, means exceptions thrown from within functions called at the top level. I want this control too sometimes. Consider: try: foo(bah[5]) except IndexError as e: ... infer that there is no bah[5] ... Of course, it is possible that bah[5] existed and that foo() raised an IndexError of its own. One might intend some sane handling of a missing bah[5] but instead silently conceal the IndexError from foo() by mishandling it as a missing bah[5]. 
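The rearrangement Cameron mentions is conventionally spelled with an else clause, which keeps an IndexError raised inside foo() out of the handler; a sketch using the names from his example (the None return is a placeholder for "infer that there is no bah[5]"):

```python
def lookup_and_call(bah, foo):
    """Call foo(bah[5]), treating only a missing bah[5] as "no element".

    An IndexError raised inside foo() itself propagates unchanged,
    because the foo() call sits in the else clause, outside the try.
    """
    try:
        item = bah[5]
    except IndexError:
        return None  # placeholder: infer that there is no bah[5]
    else:
        return foo(item)
```

This handles the single-lookup case cleanly; Cameron's point stands that chaining several such lookups this way quickly gets fiddly.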
Naturally one can rearrange this code to call foo() outside that try/except, but that degree of control often leads to quite fiddly looking code with the core flow obscured by many tiny try/excepts. One can easily want, instead, some kind of "shallow except", which would catch exceptions only if they were directly raised from the surface code; such a construct would catch the IndexError from a missing bah[5] in the example above, but _not_ catch an IndexError raised from deeper code such within the foo() function. Something equivalent to: try: foo(bah[5]) except IndexError as e: if e.__traceback__ not directly from the try..except lines: raise ... infer that there is no bah[5] ... There doesn't seem to be a concise way to write that. It might not even be feasible at all, as one doesn't have a way to identify the line(s) within the try/except in a form that one can recognise in a traceback. I can imagine wanting to write something like this: try: foo(bah[5]) except shallow IndexError as e: ... deduce that there is no bah[5] ... Note that one can then deduce the missing bah[5] instead of inferring it. Obviously the actual syntax above is a nonstarter, but something that succinct and direct would be very handy. The nested exception issue actually bites me regularly, almost always with properties. The property system appears designed to allow one to make "conditional" properties, which appear to exist only in some circumstances. I wrote one of them just the other day, along the lines of: @property def target(self): if len(self.targets) == 1: return self.targets[0] raise AttributeError('only exists when this has exactly one target') However, more commonly I end up hiding coding errors with @property, particularly nasty when the coding error is deep in some nested call. 
Here is a nondeep example based on the above: @property def target(self): if len(self.targgets) == 1: return self.targets[0] raise AttributeError('only exists when this has exactly one target') Here I have misspelt ".targets" as ".targgets". And quietly the .target property is simply missing, and a caller may then infer, incorrectly, things about the number of targets. What I, as the coder, actually wanted was for the errant .targgets reference to trigger something different from Attribute error, something akin to a NameError. (Obviously it _is_ a missing attribute and that is what AttributeError is for, but within a property that is ... unhelpful.) This is so common that I actually keep around a special hack: def prop(func): ''' The builtin @property decorator lets internal AttributeErrors escape. While that can support properties that appear to exist conditionally, in practice this is almost never what I want, and it masks deeper errors. Hence this wrapper for @property that transmutes internal AttributeErrors into RuntimeErrors. ''' def wrapper(*a, **kw): try: return func(*a, **kw) except AttributeError as e: e2 = RuntimeError("inner function %s raised %s" % (func, e)) if sys.version_info[0] >= 3: try: eval('raise e2 from e', globals(), locals()) except: # FIXME: why does this raise a SyntaxError? raise e else: raise e2 return property(wrapper) and often define properties like this: from cs.py.func import prop ....... @prop def target(self): if len(self.targgets) == 1: return self.targets[0] raise AttributeError('only exists when this has exactly one target') Same shape, better semantics from a debugging point of view. This is just one example where "nested" exceptions can be a misleading behavioural symptom. >> Chris showed how to deal with 3). Catching nested exception is not what >> people want many times. > >Isn't it? Why not? Can you explain further? 
I hope this real world example shows why the scenario is real, and that my discussion shows why for me at least it would be handy to _easily_ catch the "shallow" exception only. Cheers, Cameron Simpson From dirn at dirnonline.com Thu Jun 22 19:47:08 2017 From: dirn at dirnonline.com (Andy Dirnberger) Date: Thu, 22 Jun 2017 19:47:08 -0400 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: <20170622232923.GA48632@cskk.homeip.net> References: <20170622205532.GP3149@ando.pearwood.info> <20170622232923.GA48632@cskk.homeip.net> Message-ID: > On Jun 22, 2017, at 7:29 PM, Cameron Simpson wrote: > >> On 23Jun2017 06:55, Steven D'Aprano wrote: >>> On Thu, Jun 22, 2017 at 10:30:57PM +0200, Sven R. Kunze wrote: >>> We usually teach our newbies to catch exceptions as narrowly as >>> possible, i.e. MyModel.DoesNotExist instead of a plain Exception. This >>> works out quite well for now but the number of examples continue to grow >>> where it's not enough. >> >> (1) Under what circumstances is it not enough? > > I believe that he means that it isn't precise enough. In particular, "nested exceptions" to me, from his use cases, means exceptions thrown from within functions called at the top level. I want this control too sometimes. > > Consider: > > try: > foo(bah[5]) > except IndexError as e: > ... infer that there is no bah[5] ... > > Of course, it is possible that bah[5] existed and that foo() raised an IndexError of its own. One might intend some sane handling of a missing bah[5] but instead silently conceal the IndexError from foo() by mishandling it as a missing bah[5]. > > Naturally one can rearrange this code to call foo() outside that try/except, but that degree of control often leads to quite fiddly looking code with the core flow obscured by many tiny try/excepts. 
> > One can easily want, instead, some kind of "shallow except", which would catch exceptions only if they were directly raised from the surface code; such a construct would catch the IndexError from a missing bah[5] in the example above, but _not_ catch an IndexError raised from deeper code such within the foo() function. > > Something equivalent to: > > try: > foo(bah[5]) > except IndexError as e: > if e.__traceback__ not directly from the try..except lines: > raise > ... infer that there is no bah[5] ... > > There doesn't seem to be a concise way to write that. It might not even be feasible at all, as one doesn't have a way to identify the line(s) within the try/except in a form that one can recognise in a traceback. How about something like this? try: val = bah[5] except IndexError: # handle your expected exception here else: foo(val) > > > Cheers, > Cameron Simpson > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From thektulu.pp at gmail.com Thu Jun 22 20:21:41 2017 From: thektulu.pp at gmail.com (=?UTF-8?B?TWljaGHFgiDFu3Vrb3dza2k=?=) Date: Fri, 23 Jun 2017 02:21:41 +0200 Subject: [Python-ideas] [Python-Dev] Language proposal: variable assignment in functional context In-Reply-To: <20170617002754.GH3149@ando.pearwood.info> References: <20170617002754.GH3149@ando.pearwood.info> Message-ID: I've implemented a PoC of `where` expression some time ago. https://github.com/thektulu/cpython/commit/9e669d63d292a639eb6ba2ecea3ed2c0c23f2636 just compile and have fun. 2017-06-17 2:27 GMT+02:00 Steven D'Aprano : > Welcome Robert. My response below. > > Follow-ups to Python-Ideas, thanks. You'll need to subscribe to see any > further discussion. 
> > > On Fri, Jun 16, 2017 at 11:32:19AM +0000, Robert Vanden Eynde wrote: > > > In a nutshell, I would like to be able to write: > > y = (b+2 for b = a + 1) > > I think this is somewhat similar to a suggestion of Nick Coghlan's. One > possible syntax as a statement might be: > > y = b + 2 given: > b = a + 1 > > > https://www.python.org/dev/peps/pep-3150/ > > In mathematics, I might write: > > y = b + 2 where b = a + 1 > > although of course I wouldn't do so for anything so simple. Here's a > better example, the quadratic formula: > > x = (-b ± √Δ) / (2a) > > where Δ = b² - 4ac > > although even there I'd usually write Δ in place. > > > > Python already has the "functional if", lambdas, list comprehensions, > > but not simple assignment in a functional style. > > I think you mean "if *expression*" rather than "functional if". The term > "functional" in programming usually refers to a particular paradigm: > > https://en.wikipedia.org/wiki/Functional_programming > > > -- > Steve From ncoghlan at gmail.com Thu Jun 22 21:48:10 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 23 Jun 2017 11:48:10 +1000 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: <20170622232923.GA48632@cskk.homeip.net> References: <20170622205532.GP3149@ando.pearwood.info> <20170622232923.GA48632@cskk.homeip.net> Message-ID: On 23 June 2017 at 09:29, Cameron Simpson wrote: > This is so common that I actually keep around a special hack: > > def prop(func): > ''' The builtin @property decorator lets internal AttributeErrors > escape.
> While that can support properties > that appear to exist > conditionally, > in practice this is almost never what I want, and it masks deeper > errors. > Hence this wrapper for @property that transmutes internal > AttributeErrors > into RuntimeErrors. > ''' > def wrapper(*a, **kw): > try: > return func(*a, **kw) > except AttributeError as e: > e2 = RuntimeError("inner function %s raised %s" % (func, e)) > if sys.version_info[0] >= 3: > # 'raise ... from ...' is a statement, not an expression, > # so it needs exec() rather than eval() > exec('raise e2 from e', globals(), locals()) > else: > raise e2 > return property(wrapper) Slight tangent, but I do sometimes wonder if adding a decorator factory like the following to functools might be useful: def raise_when_returned(return_exc): def decorator(f): @wraps(f) def wrapper(*args, **kwds): try: result = f(*args, **kwds) except return_exc as unexpected_exc: msg = "inner function {} raised {}".format(f, unexpected_exc) raise RuntimeError(msg) from unexpected_exc if isinstance(result, return_exc): raise result return result return wrapper return decorator It's essentially a generalisation of PEP 479 to arbitrary exception types, since it lets you mark a particular exception type as being communicated back to the wrapper via the return channel rather than as a regular exception: def with_traceback(exc): try: raise exc except BaseException as caught_exc: return caught_exc @property @raise_when_returned(AttributeError) def target(self): if len(self.targets) == 1: return self.targets[0] return with_traceback(AttributeError('only exists when this has exactly one target')) The part I don't like about that approach is the fact that you need to mess about with the exception internals to get a halfway decent traceback on the AttributeError.
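For concreteness, the pieces above can be assembled into a self-contained sketch. The `Node` class and its attribute names here are invented purely for illustration; only `raise_when_returned` and `with_traceback` come from the message itself:

```python
from functools import wraps

def raise_when_returned(return_exc):
    # Exceptions of type return_exc are expected to be *returned* by f;
    # the same type actually *raised* inside f is treated as a bug.
    def decorator(f):
        @wraps(f)
        def wrapper(*args, **kwds):
            try:
                result = f(*args, **kwds)
            except return_exc as unexpected_exc:
                msg = "inner function {} raised {}".format(f, unexpected_exc)
                raise RuntimeError(msg) from unexpected_exc
            if isinstance(result, return_exc):
                raise result
            return result
        return wrapper
    return decorator

def with_traceback(exc):
    # Raising and immediately catching populates exc.__traceback__,
    # so the eventual re-raise carries a usable traceback.
    try:
        raise exc
    except BaseException as caught_exc:
        return caught_exc

class Node:
    # Invented example class: a node with zero or more targets.
    def __init__(self, targets):
        self.targets = targets

    @property
    @raise_when_returned(AttributeError)
    def target(self):
        if len(self.targets) == 1:
            return self.targets[0]
        return with_traceback(
            AttributeError('only exists when this has exactly one target'))
```

With this in place, `Node(['a']).target` yields `'a'`, `Node([]).target` raises the deliberate AttributeError, and a genuine bug inside the getter (say, touching a misspelled attribute) surfaces as a RuntimeError instead of being mistaken for a missing property.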
The main alternative would be to add a "convert_exception" context manager in contextlib, so you could write the example property as: @property def target(self): with convert_exception(AttributeError): if len(self.targets) == 1: return self.targets[0] raise AttributeError('only exists when this has exactly one target') Where "convert_exception" would be something like: @contextmanager # i.e. contextlib.contextmanager def convert_exception(exc_type): """Prevents the given exception type from escaping a region of code by converting it to RuntimeError""" if not issubclass(exc_type, Exception): raise TypeError("Only Exception subclasses can be flagged as unexpected") try: yield except exc_type as unexpected_exc: new_exc = RuntimeError("Unexpected exception") raise new_exc from unexpected_exc The API for this could potentially be made more flexible to allow easy substitution of lookup errors with attribute errors and vice-versa (e.g. via additional keyword-only parameters) To bring the tangent back closer to Sven's original point, there are probably also some parts of the import system (such as executing the body of a found module) where the case can be made that we should be converting ImportError to RuntimeError, rather than letting the ImportError escape (with essentially the same rationale as PEP 479). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From cs at zip.com.au Thu Jun 22 21:02:56 2017 From: cs at zip.com.au (Cameron Simpson) Date: Fri, 23 Jun 2017 11:02:56 +1000 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: References: Message-ID: <20170623010256.GA90317@cskk.homeip.net> On 22Jun2017 19:47, Andy Dirnberger wrote: >> On Jun 22, 2017, at 7:29 PM, Cameron Simpson wrote: >> try: >> foo(bah[5]) >> except IndexError as e: >> ... infer that there is no bah[5] ... >> >> Of course, it is possible that bah[5] existed and that foo() raised an IndexError of its own.
One might intend some sane handling of a missing bah[5] but instead silently conceal the IndexError from foo() by mishandling it as a missing bah[5]. >> >> Naturally one can rearrange this code to call foo() outside that try/except, but that degree of control often leads to quite fiddly looking code with the core flow obscured by many tiny try/excepts. [...] > >How about something like this? > > try: > val = bah[5] > except IndexError: > # handle your expected exception here > else: > foo(val) That is the kind of refactor to which I alluded in the paragraph above. Doing that a lot tends to obscure the core logic of the code, hence the desire for something more succinct requiring less internal code fiddling. Cheers, Cameron Simpson From python at mrabarnett.plus.com Thu Jun 22 21:55:29 2017 From: python at mrabarnett.plus.com (MRAB) Date: Fri, 23 Jun 2017 02:55:29 +0100 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: <20170622232923.GA48632@cskk.homeip.net> References: <20170622205532.GP3149@ando.pearwood.info> <20170622232923.GA48632@cskk.homeip.net> Message-ID: On 2017-06-23 00:29, Cameron Simpson wrote: > On 23Jun2017 06:55, Steven D'Aprano wrote: >>On Thu, Jun 22, 2017 at 10:30:57PM +0200, Sven R. Kunze wrote: >>> We usually teach our newbies to catch exceptions as narrowly as >>> possible, i.e. MyModel.DoesNotExist instead of a plain Exception. This >>> works out quite well for now but the number of examples continue to grow >>> where it's not enough. >> >>(1) Under what circumstances is it not enough? > > I believe that he means that it isn't precise enough. In particular, "nested > exceptions" to me, from his use cases, means exceptions thrown from within > functions called at the top level. I want this control too sometimes. > > Consider: > > try: > foo(bah[5]) > except IndexError as e: > ... infer that there is no bah[5] ... > > Of course, it is possible that bah[5] existed and that foo() raised an > IndexError of its own. 
One might intend some sane handling of a missing bah[5] > but instead silently conceal the IndexError from foo() by mishandling it as a > missing bah[5]. > > Naturally one can rearrange this code to call foo() outside that try/except, > but that degree of control often leads to quite fiddly looking code with the > core flow obscured by many tiny try/excepts. > > One can easily want, instead, some kind of "shallow except", which would catch > exceptions only if they were directly raised from the surface code; such a > construct would catch the IndexError from a missing bah[5] in the example > above, but _not_ catch an IndexError raised from deeper code such within the > foo() function. > > Something equivalent to: > > try: > foo(bah[5]) > except IndexError as e: > if e.__traceback__ not directly from the try..except lines: > raise > ... infer that there is no bah[5] ... > > There doesn't seem to be a concise way to write that. It might not even be > feasible at all, as one doesn't have a way to identify the line(s) within the > try/except in a form that one can recognise in a traceback. > [snip] Increment a counter on every function call and record it on the exception, perhaps? If the exception's call count == the current call count, it was raised in this function. From stephanh42 at gmail.com Fri Jun 23 06:23:24 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Fri, 23 Jun 2017 12:23:24 +0200 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: References: <20170622205532.GP3149@ando.pearwood.info> <20170622232923.GA48632@cskk.homeip.net> Message-ID: Hi Andy, What you propose is essentially the "try .. catch .. in" construct as described for Standard ML in: https://pdfs.semanticscholar.org/b24a/60f84b296482769bb6752feeb3d93ba6aee8.pdf Something similar for Clojure is at: https://github.com/rufoa/try-let So clearly this is something more people have struggled with. 
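A rough approximation of such a "shallow except" is possible in current Python by looking at the depth of the exception's traceback: an exception raised directly by the guarded line has no Python frames below the current one. This is only a heuristic sketch, not a robust recipe:

```python
def raised_at_surface(exc):
    # The traceback of an exception raised directly in the try block has
    # a single frame (C-level calls such as list indexing add no Python
    # frames). A call into another Python function adds frames, so
    # tb_next is then populated.
    tb = exc.__traceback__
    return tb is not None and tb.tb_next is None

def foo(x):
    return [10, 20][x]   # may raise its own IndexError internally

bah = [0, 1, 2]

try:
    foo(bah[5])          # bah[5] fails at the surface
except IndexError as e:
    assert raised_at_surface(e)

try:
    foo(2)               # foo() fails internally: [10, 20][2]
except IndexError as e:
    assert not raised_at_surface(e)
```

It misfires in both directions, though: a deep failure inside C code adds no frame and still looks "surface-raised", while a surface subscript on a class with a Python-level `__getitem__` adds a frame of its own. So it cannot substitute for a real language construct.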
The paper above goes into deep detail on the practical and (proof-)theoretical advantages of such a construct. Stephan 2017-06-23 1:47 GMT+02:00 Andy Dirnberger : > > >> On Jun 22, 2017, at 7:29 PM, Cameron Simpson wrote: >> >>> On 23Jun2017 06:55, Steven D'Aprano wrote: >>>> On Thu, Jun 22, 2017 at 10:30:57PM +0200, Sven R. Kunze wrote: >>>> We usually teach our newbies to catch exceptions as narrowly as >>>> possible, i.e. MyModel.DoesNotExist instead of a plain Exception. This >>>> works out quite well for now but the number of examples continue to grow >>>> where it's not enough. >>> >>> (1) Under what circumstances is it not enough? >> >> I believe that he means that it isn't precise enough. In particular, "nested exceptions" to me, from his use cases, means exceptions thrown from within functions called at the top level. I want this control too sometimes. >> >> Consider: >> >> try: >> foo(bah[5]) >> except IndexError as e: >> ... infer that there is no bah[5] ... >> >> Of course, it is possible that bah[5] existed and that foo() raised an IndexError of its own. One might intend some sane handling of a missing bah[5] but instead silently conceal the IndexError from foo() by mishandling it as a missing bah[5]. >> >> Naturally one can rearrange this code to call foo() outside that try/except, but that degree of control often leads to quite fiddly looking code with the core flow obscured by many tiny try/excepts. >> >> One can easily want, instead, some kind of "shallow except", which would catch exceptions only if they were directly raised from the surface code; such a construct would catch the IndexError from a missing bah[5] in the example above, but _not_ catch an IndexError raised from deeper code such within the foo() function. >> >> Something equivalent to: >> >> try: >> foo(bah[5]) >> except IndexError as e: >> if e.__traceback__ not directly from the try..except lines: >> raise >> ... infer that there is no bah[5] ... 
>> >> There doesn't seem to be a concise way to write that. It might not even be feasible at all, as one doesn't have a way to identify the line(s) within the try/except in a form that one can recognise in a traceback. > > How about something like this? > > try: > val = bah[5] > except IndexError: > # handle your expected exception here > else: > foo(val) >> >> >> Cheers, >> Cameron Simpson >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From srkunze at mail.de Fri Jun 23 10:20:28 2017 From: srkunze at mail.de (Sven R. Kunze) Date: Fri, 23 Jun 2017 16:20:28 +0200 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: <20170623010256.GA90317@cskk.homeip.net> References: <20170623010256.GA90317@cskk.homeip.net> Message-ID: <472a4ced-be48-5a00-ec85-f948239c7955@mail.de> On 23.06.2017 03:02, Cameron Simpson wrote: > >> How about something like this? >> >> try: >> val = bah[5] >> except IndexError: >> # handle your expected exception here >> else: >> foo(val) > > That is the kind of refactor to which I alluded in the paragraph > above. Doing that a lot tends to obscure the core logic of the code, > hence the desire for something more succinct requiring less internal > code fiddling. And depending on how complex bha.__getitem__ is, it can raise IndexError unintentionally as well. So, rewriting the outer code doesn't even help then. :-( Regards, Sven -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From p.f.moore at gmail.com Fri Jun 23 10:59:09 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 23 Jun 2017 15:59:09 +0100 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: <472a4ced-be48-5a00-ec85-f948239c7955@mail.de> References: <20170623010256.GA90317@cskk.homeip.net> <472a4ced-be48-5a00-ec85-f948239c7955@mail.de> Message-ID: On 23 June 2017 at 15:20, Sven R. Kunze wrote: > On 23.06.2017 03:02, Cameron Simpson wrote: > > > How about something like this? > > try: > val = bah[5] > except IndexError: > # handle your expected exception here > else: > foo(val) > > > That is the kind of refactor to which I alluded in the paragraph above. > Doing that a lot tends to obscure the core logic of the code, hence the > desire for something more succinct requiring less internal code fiddling. > > > And depending on how complex bha.__getitem__ is, it can raise IndexError > unintentionally as well. So, rewriting the outer code doesn't even help > then. :-( At this point, it becomes unclear to me what constitutes an "intentional" IndexError, as opposed to an "unintentional" one, at least in any sense that can actually be implemented. I appreciate that you want IndexError to mean "there is no 5th element in bah". But if bah has a __getitem__ that raises IndexError for any reason other than that, then the __getitem__ implementation has a bug. And while it might be nice to be able to continue working properly even when the code you're executing has bugs, I think it's a bit optimistic to hope for :-) On the other hand, I do see the point that insisting on finer and finer grained exception handling ultimately ends up with unreadable code. 
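The "unintentional" case is easy to reproduce with a toy container; the class below is hypothetical, written only to show that from the caller's side the two IndexErrors are indistinguishable:

```python
class Flaky:
    """Toy sequence whose __getitem__ has an internal bug (illustrative only)."""
    def __init__(self, items):
        self._items = items
        self._weights = []            # bug: never populated

    def __getitem__(self, i):
        item = self._items[i]         # the "intentional" IndexError
        weight = self._weights[i]     # the bug leaks an IndexError too
        return (item, weight)

bah = Flaky(['a', 'b', 'c'])
try:
    bah[0]                # element 0 exists, but...
except IndexError:
    # ...the internal bug raises the same exception type, so a handler
    # here wrongly concludes that bah[0] is missing.
    pass
```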
But it's not a problem I'd expect to see much in real life code (where code is either not written that defensively, because either there's context that allows the coder to make assumptions that objects will behave reasonably sanely, or the code gets refactored to put the exception handling in a function, or something like that). Paul From dirn at dirnonline.com Fri Jun 23 11:09:30 2017 From: dirn at dirnonline.com (Andy Dirnberger) Date: Fri, 23 Jun 2017 11:09:30 -0400 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: References: <20170622205532.GP3149@ando.pearwood.info> <20170622232923.GA48632@cskk.homeip.net> Message-ID: Hi Stephan, On Fri, Jun 23, 2017 at 6:23 AM, Stephan Houben wrote: > Hi Andy, > > What you propose is essentially the "try .. catch .. in" construct as > described for Standard ML in: > ?It's not really a proposal. It's existing syntax. I was suggesting a way to implement the example that would catch an IndexError raised by accessing elements in bah but not those raised by foo. > > https://pdfs.semanticscholar.org/b24a/60f84b296482769bb6752feeb3d93b > a6aee8.pdf > > Something similar for Clojure is at: > https://github.com/rufoa/try-let > > So clearly this is something more people have struggled with. > The paper above goes into deep detail on the practical and > (proof-)theoretical > advantages of such a construct. > > Stephan > ?A?ndy > > 2017-06-23 1:47 GMT+02:00 Andy Dirnberger : > > > > > >> On Jun 22, 2017, at 7:29 PM, Cameron Simpson wrote: > >> > >>> On 23Jun2017 06:55, Steven D'Aprano wrote: > >>>> On Thu, Jun 22, 2017 at 10:30:57PM +0200, Sven R. Kunze wrote: > >>>> We usually teach our newbies to catch exceptions as narrowly as > >>>> possible, i.e. MyModel.DoesNotExist instead of a plain Exception. This > >>>> works out quite well for now but the number of examples continue to > grow > >>>> where it's not enough. > >>> > >>> (1) Under what circumstances is it not enough? 
> >> > >> I believe that he means that it isn't precise enough. In particular, > "nested exceptions" to me, from his use cases, means exceptions thrown from > within functions called at the top level. I want this control too sometimes. > >> > >> Consider: > >> > >> try: > >> foo(bah[5]) > >> except IndexError as e: > >> ... infer that there is no bah[5] ... > >> > >> Of course, it is possible that bah[5] existed and that foo() raised an > IndexError of its own. One might intend some sane handling of a missing > bah[5] but instead silently conceal the IndexError from foo() by > mishandling it as a missing bah[5]. > >> > >> Naturally one can rearrange this code to call foo() outside that > try/except, but that degree of control often leads to quite fiddly looking > code with the core flow obscured by many tiny try/excepts. > >> > >> One can easily want, instead, some kind of "shallow except", which > would catch exceptions only if they were directly raised from the surface > code; such a construct would catch the IndexError from a missing bah[5] in > the example above, but _not_ catch an IndexError raised from deeper code > such within the foo() function. > >> > >> Something equivalent to: > >> > >> try: > >> foo(bah[5]) > >> except IndexError as e: > >> if e.__traceback__ not directly from the try..except lines: > >> raise > >> ... infer that there is no bah[5] ... > >> > >> There doesn't seem to be a concise way to write that. It might not even > be feasible at all, as one doesn't have a way to identify the line(s) > within the try/except in a form that one can recognise in a traceback. > > > > How about something like this? 
> > > > try: > > val = bah[5] > > except IndexError: > > # handle your expected exception here > > else: > > foo(val) > >> > >> > >> Cheers, > >> Cameron Simpson > >> _______________________________________________ > >> Python-ideas mailing list > >> Python-ideas at python.org > >> https://mail.python.org/mailman/listinfo/python-ideas > >> Code of Conduct: http://python.org/psf/codeofconduct/ > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Fri Jun 23 12:49:19 2017 From: brett at python.org (Brett Cannon) Date: Fri, 23 Jun 2017 16:49:19 +0000 Subject: [Python-ideas] be upfront if you aren't willing to implement your own idea (was: socket module: plain stuples vs named tuples - Thank you) In-Reply-To: <500887d3-f2db-8563-ee5d-a22bab7e56e6@thomas-guettler.de> References: <07d5cb54-e6f4-7f2d-7995-d6c01c94401f@thomas-guettler.de> <500887d3-f2db-8563-ee5d-a22bab7e56e6@thomas-guettler.de> Message-ID: Everyone, please be upfront when proposing any ideas if you refuse to implement your own idea yourself. It's implicit that if you have an idea to discuss here that you are serious enough about it to see it happen, so if that's not the case then do say so in your first email (obviously if your circumstances change during the discussion then that's understandable). Otherwise people will spend what little spare time they have helping you think through your idea, and then find out that the discussion will more than likely end up leading to no change because the most motivated person behind the discussion isn't motivated enough to actually enact the change. And if you lack knowledge in how to implement the idea or a certain area of expertise, please be upfront about that as well. 
We have had instances here where ideas have gone as far as PEPs, only to find out that the OP didn't know C, which was a critical requirement for implementing the idea, and so the idea just fell to the wayside and hasn't gone anywhere. It's totally reasonable to ask for help, but once again, please be upfront that you will need it to have any chance of seeing your idea come to fruition. To be perfectly frank, I personally find it misleading to not be told upfront that you know you will need help (if you learn later because you didn't know e.g. C would be required, that's different, but once you do learn then once again be upfront about it). Otherwise I personally feel like I was tricked into a discussion under false pretenses that the OP was motivated enough to put the effort in to see their idea come to be. Had I known to begin with that no one was actually stepping forward to make this change happen, I would have skipped the thread and spent the time I put in following the discussion on something more productive like reviewing a pull request. On Thu, 22 Jun 2017 at 08:26 Thomas Güttler wrote: > thank you! I am happy that Guido is open to a pull request ... There > were +1 votes, too (and some concern about python > startup time). > > > I stopped coding in spare time, since my children are more important at > the moment .. if someone wants to try it ... go > ahead and implement named tuples for the socket standard library - would > be great. > > Just for the record, I came here because of this feature request: > > https://github.com/giampaolo/psutil/issues/928 > > Regards, > Thomas Güttler > > PS: For some strange reasons I received only some mails of this thread. But I could > find the whole thread in the archive. > > On 20.06.2017 at 04:05, INADA Naoki wrote: > > Namedtuple in Python makes startup time slow. > > So I'm very conservative about converting tuple to namedtuple in Python.
> > INADA Naoki > > > > > > On Tue, Jun 20, 2017 at 7:27 AM, Victor Stinner > > wrote: > >> Oh, about the cost of writing C code, we started to enhance the socket > >> module in socket.py but keep the C code unchanged. I am thinking about the > >> support of enums. Some C functions are wrapped in Python. > >> > >> Victor > >> > >> On 19 June 2017 at 11:59 PM, "Guido van Rossum" > wrote: > >>> > >>> There are examples in timemodule.c which went through a similar > conversion > >>> from plain tuples to (sort-of) named tuples. I agree that upgrading the > >>> tuples returned by the socket module to named tuples would be nice, > but it's > >>> a low priority project. Maybe someone interested can create a PR? > (First > >>> open an issue stating that you're interested; point to this email from > me to > >>> prevent that some other core dev just closes it again.) > >>> > >>> On Mon, Jun 19, 2017 at 2:24 PM, Victor Stinner < > victor.stinner at gmail.com> > >>> wrote: > >>>> > >>>> Hi, > >>>> > >>>> 2017-06-13 22:13 GMT+02:00 Thomas Güttler < > guettliml at thomas-guettler.de>: > >>>>> AFAIK the socket module returns plain tuples in Python3: > >>>>> > >>>>> https://docs.python.org/3/library/socket.html > >>>>> > >>>>> Why not use named tuples? > >>>> > >>>> For technical reasons: the socket module is mostly implemented in the > >>>> C language, and defining a "named tuple" in C requires implementing a > >>>> "sequence" type, which requires much more code than creating a tuple. > >>>> > >>>> In short, creating a tuple is as simple as Py_BuildValue("OO", item1, > >>>> item2). > >>>> > >>>> Creating a "sequence" type requires something like 50 lines of code, > >>>> maybe more, I don't know exactly.
> >>>> > >>>> Victor > >>>> _______________________________________________ > >>>> Python-ideas mailing list > >>>> Python-ideas at python.org > >>>> https://mail.python.org/mailman/listinfo/python-ideas > >>>> Code of Conduct: http://python.org/psf/codeofconduct/ > >>> > >>> > >>> > >>> > >>> -- > >>> --Guido van Rossum (python.org/~guido) > >> > >> > >> _______________________________________________ > >> Python-ideas mailing list > >> Python-ideas at python.org > >> https://mail.python.org/mailman/listinfo/python-ideas > >> Code of Conduct: http://python.org/psf/codeofconduct/ > >> > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > -- > Thomas Guettler http://www.thomas-guettler.de/ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brenbarn at brenbarn.net Fri Jun 23 14:28:08 2017 From: brenbarn at brenbarn.net (Brendan Barnwell) Date: Fri, 23 Jun 2017 11:28:08 -0700 Subject: [Python-ideas] be upfront if you aren't willing to implement your own idea In-Reply-To: References: <07d5cb54-e6f4-7f2d-7995-d6c01c94401f@thomas-guettler.de> <500887d3-f2db-8563-ee5d-a22bab7e56e6@thomas-guettler.de> Message-ID: <594D5DB8.7090400@brenbarn.net> On 2017-06-23 09:49, Brett Cannon wrote: > Everyone, please be upfront when proposing any ideas if you refuse to > implement your own idea yourself. 
It's implicit that if you have an idea > to discuss here that you are serious enough about it to see it happen, > so if that's not the case then do say so in your first email (obviously > if your circumstances change during the discussion then that's > understandable). Otherwise people will spend what little spare time they > have helping you think through your idea, and then find out that the > discussion will more than likely end up leading to no change because the > most motivated person behind the discussion isn't motivated enough to > actually enact the change. > > And if you lack knowledge in how to implement the idea or a certain area > of expertise, please be upfront about that as well. We have had > instances here where ideas have gone as far as PEPs to only find out the > OP didn't know C which was a critical requirement to implementing the > idea, and so the idea just fell to the wayside and hasn't gone anywhere. > It's totally reasonable to ask for help, but once again, please be > upfront that you will need it to have any chance of seeing your idea > come to fruition. > > To be perfectly frank, I personally find it misleading to not be told > upfront that you know you will need help (if you learn later because you > didn't know e.g. C would be required, that's different, but once you do > learn then once again be upfront about it). Otherwise I personally feel > like I was tricked into a discussion under false pretenses that the OP > was motivated enough to put the effort in to see their idea come to be. > Had I known to begin with that no one was actually stepping forward to > make this change happen I would have skipped the thread and spent the > time I put in following the discussion into something more productive > like reviewing a pull request. That is a reasonable position, but I think if that's really how this list is supposed to work then it'd be good to state those requirements more explicitly in the list description. 
Right now the description (https://mail.python.org/mailman/listinfo/python-ideas) just says the list is for "discussion of speculative language ideas for Python". There is no hint that any particular technical qualifications are required other than having used Python enough to have an idea about how to improve it. I also don't think such a requirement is obvious even from reading the list traffic (since I've rarely seen anyone explicitly state their inability to implement, as you suggest, although it does sometimes come up later, as in this case). No doubt this leads to the occasional cockamamie proposal but I think it also allows discussion of useful ideas that otherwise might never be raised. Also, the description does mention that at some point ideas might get moved on to python-dev; although it's not explicit about how this works, I think that creates a vague impression that thinking about how or whether you can implement an idea might be something for a later stage. That said, I don't personally agree with your position here. My impression of discussion on this list is that a good deal of it doesn't really have to do with implementation at all. It has to do with the proposal itself in terms of how it would feel to use it, hashing out what its semantics would be, what the benefits would be for code readability, what confusion it might create etc. --- in short, discussion from the perspective of people who USE Python, not people who implement Python. I think that's good discussion to have even if the proposal eventually stalls because no one with the right skills has the time or inclination to implement it. It would be a shame for all such discussion to get nipped in the bud just because the person with the original proposal doesn't know C or whatever. 
Also, because, as you say, some people don't know what would be needed to implement their ideas, requiring this kind of disclosure might perversely muffle discussion from people who know enough to know they don't know how to implement their idea, while still allowing all the ideas from people who don't even know whether they know how to implement their idea --- and the latter are probably more likely to fall into the cockamamie category. I realize you're not proposing that all such discussion be stopped entirely, just that it be tagged as I-can't-implement-this-myself at the outset. However, your last paragraph suggests to me that the effect might be similar. You seem to be saying that (some of) those who do know how to implement stuff would like to be able to ignore discussion from anyone who doesn't know how to implement stuff. That's certainly anyone's prerogative, but I think it would be a shame if this resulted in a bifurcation of the list in which ideas can't reach the attention of people who could implement them unless they're proposed by someone who could do so themselves. To me, that would somewhat blur the distinction between python-ideas and python-dev, and potentially chill discussion of "mid-level" ideas proposed by people who know enough to have a potentially useful idea, but not enough to bring it to fruition. We presumably don't want a situation where a person with some amount of knowledge thinks "This might be a good idea. . . but I don't know how to implement it, so if I bring it up on the list the more knowledgeable people will ignore it, oh well, I guess I won't" --- while a person with no knowledge blithely jumps in with "Golly everyone I have this great idea!" (I don't mean to say that is directly what you're proposing, but it is the evolution that came to my mind when I read your comment.) 
So to put it succinctly, as someone who's found discussion on this list interesting and valuable, I think there is value in having discussion about "what would Python be like if this idea were implemented" even if we never get very far with "how would we implement this idea in Python". And I would find it unfortunate if discussion of the former were prematurely restricted by worries about the latter. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown From stephanh42 at gmail.com Fri Jun 23 14:30:07 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Fri, 23 Jun 2017 20:30:07 +0200 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: References: <20170622205532.GP3149@ando.pearwood.info> <20170622232923.GA48632@cskk.homeip.net> Message-ID: 2017-06-23 17:09 GMT+02:00 Andy Dirnberger : > It's not really a proposal. It's existing syntax. Wow! I have been using Python since 1.5.2 and I never knew this. This is not Guido's famous time machine in action, by any chance? Guess there's some code to refactor using this construct now... Stephan 2017-06-23 17:09 GMT+02:00 Andy Dirnberger : > Hi Stephan, > > On Fri, Jun 23, 2017 at 6:23 AM, Stephan Houben > wrote: >> >> Hi Andy, >> >> What you propose is essentially the "try .. catch .. in" construct as >> described for Standard ML in: > > > It's not really a proposal. It's existing syntax. I was suggesting a way to > implement the example that would catch an IndexError raised by accessing > elements in bah but not those raised by foo. > > >> >> >> >> https://pdfs.semanticscholar.org/b24a/60f84b296482769bb6752feeb3d93ba6aee8.pdf >> >> Something similar for Clojure is at: >> https://github.com/rufoa/try-let >> >> So clearly this is something more people have struggled with. >> The paper above goes into deep detail on the practical and >> (proof-)theoretical >> advantages of such a construct. 
>> >> Stephan > > > Andy > > >> >> >> 2017-06-23 1:47 GMT+02:00 Andy Dirnberger : >> > >> > >> >> On Jun 22, 2017, at 7:29 PM, Cameron Simpson wrote: >> >> >> >>> On 23Jun2017 06:55, Steven D'Aprano wrote: >> >>>> On Thu, Jun 22, 2017 at 10:30:57PM +0200, Sven R. Kunze wrote: >> >>>> We usually teach our newbies to catch exceptions as narrowly as >> >>>> possible, i.e. MyModel.DoesNotExist instead of a plain Exception. >> >>>> This >> >>>> works out quite well for now but the number of examples continue to >> >>>> grow >> >>>> where it's not enough. >> >>> >> >>> (1) Under what circumstances is it not enough? >> >> >> >> I believe that he means that it isn't precise enough. In particular, >> >> "nested exceptions" to me, from his use cases, means exceptions thrown from >> >> within functions called at the top level. I want this control too sometimes. >> >> >> >> Consider: >> >> >> >> try: >> >> foo(bah[5]) >> >> except IndexError as e: >> >> ... infer that there is no bah[5] ... >> >> >> >> Of course, it is possible that bah[5] existed and that foo() raised an >> >> IndexError of its own. One might intend some sane handling of a missing >> >> bah[5] but instead silently conceal the IndexError from foo() by mishandling >> >> it as a missing bah[5]. >> >> >> >> Naturally one can rearrange this code to call foo() outside that >> >> try/except, but that degree of control often leads to quite fiddly looking >> >> code with the core flow obscured by many tiny try/excepts. >> >> >> >> One can easily want, instead, some kind of "shallow except", which >> >> would catch exceptions only if they were directly raised from the surface >> >> code; such a construct would catch the IndexError from a missing bah[5] in >> >> the example above, but _not_ catch an IndexError raised from deeper code >> >> such within the foo() function. 
>> >> >> >> Something equivalent to: >> >> >> >> try: >> >> foo(bah[5]) >> >> except IndexError as e: >> >> if e.__traceback__ not directly from the try..except lines: >> >> raise >> >> ... infer that there is no bah[5] ... >> >> >> >> There doesn't seem to be a concise way to write that. It might not even >> >> be feasible at all, as one doesn't have a way to identify the line(s) within >> >> the try/except in a form that one can recognise in a traceback. >> > >> > How about something like this? >> > >> > try: >> > val = bah[5] >> > except IndexError: >> > # handle your expected exception here >> > else: >> > foo(val) >> >> >> >> >> >> Cheers, >> >> Cameron Simpson >> >> _______________________________________________ >> >> Python-ideas mailing list >> >> Python-ideas at python.org >> >> https://mail.python.org/mailman/listinfo/python-ideas >> >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > _______________________________________________ >> > Python-ideas mailing list >> > Python-ideas at python.org >> > https://mail.python.org/mailman/listinfo/python-ideas >> > Code of Conduct: http://python.org/psf/codeofconduct/ > > From guido at python.org Fri Jun 23 14:39:13 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 23 Jun 2017 11:39:13 -0700 Subject: [Python-ideas] be upfront if you aren't willing to implement your own idea In-Reply-To: <594D5DB8.7090400@brenbarn.net> References: <07d5cb54-e6f4-7f2d-7995-d6c01c94401f@thomas-guettler.de> <500887d3-f2db-8563-ee5d-a22bab7e56e6@thomas-guettler.de> <594D5DB8.7090400@brenbarn.net> Message-ID: "to put it succinctly" -- IMO we shouldn't discuss features without giving thought to their implementation. On Fri, Jun 23, 2017 at 11:28 AM, Brendan Barnwell wrote: > On 2017-06-23 09:49, Brett Cannon wrote: > >> Everyone, please be upfront when proposing any ideas if you refuse to >> implement your own idea yourself. 
It's implicit that if you have an idea >> to discuss here that you are serious enough about it to see it happen, >> so if that's not the case then do say so in your first email (obviously >> if your circumstances change during the discussion then that's >> understandable). Otherwise people will spend what little spare time they >> have helping you think through your idea, and then find out that the >> discussion will more than likely end up leading to no change because the >> most motivated person behind the discussion isn't motivated enough to >> actually enact the change. >> >> And if you lack knowledge in how to implement the idea or a certain area >> of expertise, please be upfront about that as well. We have had >> instances here where ideas have gone as far as PEPs to only find out the >> OP didn't know C which was a critical requirement to implementing the >> idea, and so the idea just fell to the wayside and hasn't gone anywhere. >> It's totally reasonable to ask for help, but once again, please be >> upfront that you will need it to have any chance of seeing your idea >> come to fruition. >> >> To be perfectly frank, I personally find it misleading to not be told >> upfront that you know you will need help (if you learn later because you >> didn't know e.g. C would be required, that's different, but once you do >> learn then once again be upfront about it). Otherwise I personally feel >> like I was tricked into a discussion under false pretenses that the OP >> was motivated enough to put the effort in to see their idea come to be. >> Had I known to begin with that no one was actually stepping forward to >> make this change happen I would have skipped the thread and spent the >> time I put in following the discussion into something more productive >> like reviewing a pull request. 
>> > > That is a reasonable position, but I think if that's really how > this list is supposed to work then it'd be good to state those requirements > more explicitly in the list description. Right now the description ( > https://mail.python.org/mailman/listinfo/python-ideas) just says the list > is for "discussion of speculative language ideas for Python". There is no > hint that any particular technical qualifications are required other than > having used Python enough to have an idea about how to improve it. I also > don't think such a requirement is obvious even from reading the list > traffic (since I've rarely seen anyone explicitly state their inability to > implement, as you suggest, although it does sometimes come up later, as in > this case). No doubt this leads to the occasional cockamamie proposal but > I think it also allows discussion of useful ideas that otherwise might > never be raised. Also, the description does mention that at some point > ideas might get moved on to python-dev; although it's not explicit about > how this works, I think that creates a vague impression that thinking about > how or whether you can implement an idea might be something for a later > stage. > > That said, I don't personally agree with your position here. My > impression of discussion on this list is that a good deal of it doesn't > really have to do with implementation at all. It has to do with the > proposal itself in terms of how it would feel to use it, hashing out what > its semantics would be, what the benefits would be for code readability, > what confusion it might create etc. --- in short, discussion from the > perspective of people who USE Python, not people who implement Python. I > think that's good discussion to have even if the proposal eventually stalls > because no one with the right skills has the time or inclination to > implement it. 
It would be a shame for all such discussion to get nipped in > the bud just because the person with the original proposal doesn't know C > or whatever. Also, because, as you say, some people don't know what would > be needed to implement their ideas, requiring this kind of disclosure might > perversely muffle discussion from people who know enough to know they don't > know how to implement their idea, while still allowing all the ideas from > people who don't even know whether they know how to implement their idea > --- and the latter are probably more likely to fall into the cockamamie > category. > > I realize you're not proposing that all such discussion be stopped > entirely, just that it be tagged as I-can't-implement-this-myself at the > outset. However, your last paragraph suggests to me that the effect might > be similar. You seem to be saying that (some of) those who do know how to > implement stuff would like to be able to ignore discussion from anyone who > doesn't know how to implement stuff. That's certainly anyone's > prerogative, but I think it would be a shame if this resulted in a > bifurcation of the list in which ideas can't reach the attention of people > who could implement them unless they're proposed by someone who could do so > themselves. To me, that would somewhat blur the distinction between > python-ideas and python-dev, and potentially chill discussion of > "mid-level" ideas proposed by people who know enough to have a potentially > useful idea, but not enough to bring it to fruition. We presumably don't > want a situation where a person with some amount of knowledge thinks "This > might be a good idea. . . but I don't know how to implement it, so if I > bring it up on the list the more knowledgeable people will ignore it, oh > well, I guess I won't" --- while a person with no knowledge blithely jumps > in with "Golly everyone I have this great idea!" 
(I don't mean to say that > is directly what you're proposing, but it is the evolution that came to my > mind when I read your comment.) > > So to put it succinctly, as someone who's found discussion on this > list interesting and valuable, I think there is value in having discussion > about "what would Python be like if this idea were implemented" even if we > never get very far with "how would we implement this idea in Python". And > I would find it unfortunate if discussion of the former were prematurely > restricted by worries about the latter. > > -- > Brendan Barnwell > "Do not follow where the path may lead. Go, instead, where there is no > path, and leave a trail." > --author unknown > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Fri Jun 23 15:02:10 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 24 Jun 2017 05:02:10 +1000 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: <20170622232923.GA48632@cskk.homeip.net> References: <20170622205532.GP3149@ando.pearwood.info> <20170622232923.GA48632@cskk.homeip.net> Message-ID: <20170623190209.GQ3149@ando.pearwood.info> On Fri, Jun 23, 2017 at 09:29:23AM +1000, Cameron Simpson wrote: > On 23Jun2017 06:55, Steven D'Aprano wrote: > >On Thu, Jun 22, 2017 at 10:30:57PM +0200, Sven R. Kunze wrote: > >>We usually teach our newbies to catch exceptions as narrowly as > >>possible, i.e. MyModel.DoesNotExist instead of a plain Exception. This > >>works out quite well for now but the number of examples continue to grow > >>where it's not enough. > > > >(1) Under what circumstances is it not enough? > > I believe that he means that it isn't precise enough. 
In particular, > "nested exceptions" to me, from his use cases, means exceptions thrown from > within functions called at the top level. I want this control too sometimes. But why teach it to newbies? Sven explicitly mentions teaching beginners. If we are talking about advanced features for experts, that's one thing, but it's another if we're talking about Python 101 taught to beginners and newbies. Do we really need to be teaching beginners how to deal with circular imports beyond "don't do it"? > Consider: > > try: > foo(bah[5]) > except IndexError as e: > ... infer that there is no bah[5] ... > > Of course, it is possible that bah[5] existed and that foo() raised an > IndexError of its own. One might intend some sane handling of a missing > bah[5] but instead silently conceal the IndexError from foo() by > mishandling it as a missing bah[5]. Indeed -- if both foo and bah[5] can raise IndexError when the coder believes that only bah[5] can, then the above code is simply buggy. On the other hand, if the author is correct that foo cannot raise IndexError, then the code as given is fine. > Naturally one can rearrange this code to call foo() outside that > try/except, but that degree of control often leads to quite fiddly looking > code with the core flow obscured by many tiny try/excepts. Sadly, that is often the nature of real code, as opposed to "toy" or textbook code that demonstrates an algorithm as cleanly as possible. It's been said that for every line of code in the textbook, the function needs ten lines in production. > One can easily want, instead, some kind of "shallow except", which would > catch exceptions only if they were directly raised from the surface code; > such a construct would catch the IndexError from a missing bah[5] in the > example above, but _not_ catch an IndexError raised from deeper code such > within the foo() function. 
I think the concept of a "shallow exception" is ill-defined, and to the 
degree that it is defined, it is *dangerous*: a bug magnet waiting to 
strike.

What do you mean by "directly raised from the surface code"? Why is 
bah[5] "surface code" but foo(x) is not? Both call a function (or 
method).

But worse, it seems that the idea of "shallow" or "deep" depends on 
*implementation details* of where the exception comes from. For example, 
changing from a recursive function to a while loop might change the 
exception from "50 function calls deep" to "1 function deep".

What makes bah[5] "shallow"? For all you know, it calls a chain of a 
dozen __getitem__ methods, due to inheritance or proxying, before the 
exception is actually raised. Or it might call just a single __getitem__ 
method, but the method's implementation puts the error checking into a 
helper method:

    def __getitem__(self, n):
        self._validate(n)  # may raise IndexError
        ...

How many function calls are shallow, versus deep?

This concept of a shallow exception is, it seems to me, a bug magnet. It 
is superficially attractive, but then you realise that:

    try:
        spam[5]
    except shallow IndexError:
        ...

will behave differently depending on how spam is implemented, even if 
the interface (raises IndexError) is identical.

It seems to me that this concept is trying to let us substitute some 
sort of undefined but mechanical measurement of "shallowness" for 
actually understanding what our code does. I don't think this can work.

It would be awesome if there were some way for our language to Do What 
We Mean instead of What We Say. And then we can grow a money tree, and 
have a magic plum-pudding that stays the same size no matter how many 
slices we eat, and electricity so cheap the power company pays you to 
use it... *wink*

> The nested exception issue actually bites me regularly, almost always with 
> properties.
[...]
> However, more commonly I end up hiding coding errors with @property, 
> particularly nasty when the coding error is deep in some nested call. Here 
> is a nondeep example based on the above: 
> 
>    @property 
>    def target(self): 
>        if len(self.targgets) == 1: 
>            return self.targets[0] 
>        raise AttributeError('only exists when this has exactly one target')

The obvious solution to this is to learn to spell correctly :-)

Actually, a linter probably would have picked up that typo. But I do see 
that the issue is more than just typos.

[...]

>    try: 
>        eval('raise e2 from e', globals(), locals()) 
>    except: 
>        # FIXME: why does this raise a SyntaxError? 

Because "raise" is a statement, not an expression. You need exec().


-- 
Steve

From p.f.moore at gmail.com  Fri Jun 23 16:09:25 2017
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 23 Jun 2017 21:09:25 +0100
Subject: [Python-ideas] be upfront if you aren't willing to implement
 your own idea
In-Reply-To: <594D5DB8.7090400@brenbarn.net>
References: <07d5cb54-e6f4-7f2d-7995-d6c01c94401f@thomas-guettler.de>
 <500887d3-f2db-8563-ee5d-a22bab7e56e6@thomas-guettler.de>
 <594D5DB8.7090400@brenbarn.net>
Message-ID: 

On 23 June 2017 at 19:28, Brendan Barnwell wrote:
> So to put it succinctly, as someone who's found discussion on this list
> interesting and valuable, I think there is value in having discussion about
> "what would Python be like if this idea were implemented" even if we never
> get very far with "how would we implement this idea in Python". And I would
> find it unfortunate if discussion of the former were prematurely restricted
> by worries about the latter.

No-one is proposing otherwise, just that people are open when starting
a discussion as to whether they anticipate being able to follow
through with an implementation if the idea meets with approval, or if
they are simply making a suggestion that they hope someone else will
take up.
That's not too much to ask, nor does it in any way stifle reasonable discussion (it may discourage people who want to *deliberately* give the impression that they will do the work, but actually have no intention of doing so - but I hope there's no-one like that here and if there were, I'm happy with discouraging them). So I'm +1 on Brett's request. Paul From carl.input at gmail.com Fri Jun 23 18:03:23 2017 From: carl.input at gmail.com (Carl Smith) Date: Fri, 23 Jun 2017 23:03:23 +0100 Subject: [Python-ideas] be upfront if you aren't willing to implement your own idea In-Reply-To: References: <07d5cb54-e6f4-7f2d-7995-d6c01c94401f@thomas-guettler.de> <500887d3-f2db-8563-ee5d-a22bab7e56e6@thomas-guettler.de> <594D5DB8.7090400@brenbarn.net> Message-ID: +1 I'm quite active in the CoffeeScript community, but am also on a ton of medication that ultimately means I won't implement much of what I suggest doing, but the core devs understand the situation well enough to respond accordingly. It really does help when people know what they can reasonably expect from others, and it doesn't take much to let them know. None of this has ever prevented me from being involved. It just prevents me from wasting other people's time. -- Carl -- Carl Smith carl.input at gmail.com On 23 June 2017 at 21:09, Paul Moore wrote: > On 23 June 2017 at 19:28, Brendan Barnwell wrote: > > So to put it succinctly, as someone who's found discussion on this list > > interesting and valuable, I think there is value in having discussion > about > > "what would Python be like if this idea were implemented" even if we > never > > get very far with "how would we implement this idea in Python". And I > would > > find it unfortunate if discussion of the former were prematurely > restricted > > by worries about the latter. 
> > No-one is proposing otherwise, just that people are open when starting > a discussion as to whether they anticipate being able to follow > through with an implementation if the idea meets with approval, or if > they are simply making a suggestion that they hope someone else will > take up. That's not too much to ask, nor does it in any way stifle > reasonable discussion (it may discourage people who want to > *deliberately* give the impression that they will do the work, but > actually have no intention of doing so - but I hope there's no-one > like that here and if there were, I'm happy with discouraging them). > > So I'm +1 on Brett's request. > Paul > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cs at zip.com.au Fri Jun 23 18:56:03 2017 From: cs at zip.com.au (Cameron Simpson) Date: Sat, 24 Jun 2017 08:56:03 +1000 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: <20170623190209.GQ3149@ando.pearwood.info> References: <20170623190209.GQ3149@ando.pearwood.info> Message-ID: <20170623225603.GA27276@cskk.homeip.net> On 24Jun2017 05:02, Steven D'Aprano wrote: >On Fri, Jun 23, 2017 at 09:29:23AM +1000, Cameron Simpson wrote: >> On 23Jun2017 06:55, Steven D'Aprano wrote: >> >On Thu, Jun 22, 2017 at 10:30:57PM +0200, Sven R. Kunze wrote: >> >>We usually teach our newbies to catch exceptions as narrowly as >> >>possible, i.e. MyModel.DoesNotExist instead of a plain Exception. This >> >>works out quite well for now but the number of examples continue to grow >> >>where it's not enough. >> > >> >(1) Under what circumstances is it not enough? >> >> I believe that he means that it isn't precise enough. 
In particular, >> "nested exceptions" to me, from his use cases, means exceptions thrown from >> within functions called at the top level. I want this control too sometimes. > >But why teach it to newbies? Sven explicitly mentions teaching >beginners. If we are talking about advanced features for experts, that's >one thing, but it's another if we're talking about Python 101 taught to >beginners and newbies. It depends. Explaining that exceptions from called code can be mishandled by a naive try/except is something newbies need to learn to avoid common pitfalls with exceptions, and a real world situation that must be kept in mind when acting on caught exceptions. >Do we really need to be teaching beginners how to deal with circular >imports beyond "don't do it"? Sven's example is with import. The situation is more general. [... snip basic example of simple code where IndexError can arise from multiple causes ...] >> Naturally one can rearrange this code to call foo() outside that >> try/except, but that degree of control often leads to quite fiddly looking >> code with the core flow obscured by many tiny try/excepts. > >Sadly, that is often the nature of real code, as opposed to "toy" or >textbook code that demonstrates an algorithm as cleanly as possible. >It's been said that for every line of code in the textbook, the >function needs ten lines in production. But not always so. And the reason for many language constructs and idioms is explicitly to combat what would otherwise need 10 lines of code (or 100 to do it "right" with corner cases), obscuring the core task and reducing readability/maintainability. So "Sadly, that is often the nature of real code" is not itself an argument against this idea. 
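For concreteness, the check I have in mind can be approximated today by inspecting the exception's traceback. This is only a sketch — `is_shallow` is an invented helper name, not proposed syntax:

```python
def is_shallow(exc):
    # An exception raised directly by the guarded expression (e.g. C-level
    # list indexing) has no Python frames below the frame containing the
    # try/except, so its traceback has exactly one entry.
    tb = exc.__traceback__
    return tb is not None and tb.tb_next is None

def foo(x):
    return [][x]   # raises its own IndexError, one frame deeper

bah = [1, 2, 3]

try:
    foo(bah[5])
except IndexError as e:
    if not is_shallow(e):
        raise                # came from inside foo(), not from bah[5]
    print("no bah[5]")       # prints: no bah[5]
```

The test is purely mechanical: a duck-typed container whose __getitem__ is written in Python adds a frame and so looks "deep", so the sketch really only suits builtin types.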
>> One can easily want, instead, some kind of "shallow except", which would 
>> catch exceptions only if they were directly raised from the surface code; 
>> such a construct would catch the IndexError from a missing bah[5] in the 
>> example above, but _not_ catch an IndexError raised from deeper code such 
>> within the foo() function.
>
>I think the concept of a "shallow exception" is ill-defined, and to the
>degree that it is defined, it is *dangerous*: a bug magnet waiting to
>strike.
>
>What do you mean by "directly raised from the surface code"? Why is
>bah[5] "surface code" but foo(x) is not? Both call a function (or
>method).
[...]

I've replied to Paul Moore and suggested this definition as implementable and 
often useful:

  A shallow catch would effectively need to mean "the exception's 
  uppermost traceback frame refers to one of the program lines 
  in the try/except suite". Which would work well for lists and 
  other builtin types. And might be insufficient for a duck-type 
  with python-coded dunder methods.

The target here is not to perform magic but to have a useful tool to identify 
exceptions that arise fairly directly from the adjacent code and not what it 
calls. Without writing cumbersome and fragile boilerplate to dig into an 
exception's traceback.

Cheers,
Cameron Simpson

From cs at zip.com.au  Fri Jun 23 19:00:47 2017
From: cs at zip.com.au (Cameron Simpson)
Date: Sat, 24 Jun 2017 09:00:47 +1000
Subject: [Python-ideas] Improving Catching Exceptions
In-Reply-To: 
References: 
Message-ID: <20170623230047.GA40143@cskk.homeip.net>

On 23Jun2017 11:48, Nick Coghlan wrote:
>On 23 June 2017 at 09:29, Cameron Simpson wrote:
>> This is so common that I actually keep around a special hack:
>>
>>  def prop(func):
>>    ''' The builtin @property decorator lets internal AttributeErrors
>>        escape.
>>        While that can support properties that appear to exist
>>        conditionally,
>>        in practice this is almost never what I want, and it masks deeper
>>        errors.
>> Hence this wrapper for @property that transmutes internal >> AttributeErrors >> into RuntimeErrors. >> ''' >> def wrapper(*a, **kw): >> try: >> return func(*a, **kw) >> except AttributeError as e: >> e2 = RuntimeError("inner function %s raised %s" % (func, e)) >> if sys.version_info[0] >= 3: >> try: >> eval('raise e2 from e', globals(), locals()) >> except: >> # FIXME: why does this raise a SyntaxError? >> raise e >> else: >> raise e2 >> return property(wrapper) > >Slight tangent, but I do sometimes wonder if adding a decorator >factory like the following to functools might be useful: > > def raise_when_returned(return_exc): > def decorator(f): > @wraps(f) > def wrapper(*args, **kwds): > try: > result = f(*args, **kwds) > except selective_exc as unexpected_exc: > msg = "inner function {} raised {}".format(f, >unexpected_exc) > raise RuntimeError(msg) from unexpected_exc > if isinstance(result, return_exc): > raise result > return result > >It's essentially a generalisation of PEP 479 to arbitrary exception >types, since it lets you mark a particular exception type as being >communicated back to the wrapper via the return channel rather than as >a regular exception: > > def with_traceback(exc): > try: > raise exc > except BaseException as caught_exc: > return caught_exc > > @property > @raise_when_returned(AttributeError) > def target(self): > if len(self.targets) == 1: > return self.targets[0] > return with_traceback(AttributeError('only exists when this >has exactly one target')) Funnily enough I have an @transmute decorator which serves just this purpose. It doesn't see as much use as I might imagine, but that is partially because my function predates "raise ... from", which meant that it loses the stack trace from the transmuted exception, impeding debugging. I need to revisit it with that in mind. So yes, your proposed decorator has supporting real world use cases in my world. 
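In passing: the draft above won't run quite as posted — it refers to `selective_exc` where `return_exc` appears to be meant, and the `return wrapper` / `return decorator` lines are missing. A runnable rendering of the same idea (the `halve` demo function is invented purely for illustration):

```python
from functools import wraps

def raise_when_returned(return_exc):
    """Raise return_exc instances *returned* by the wrapped function;
    transmute return_exc instances *raised* inside it into RuntimeError."""
    def decorator(f):
        @wraps(f)
        def wrapper(*args, **kwds):
            try:
                result = f(*args, **kwds)
            except return_exc as unexpected_exc:
                msg = "inner function {} raised {}".format(f, unexpected_exc)
                raise RuntimeError(msg) from unexpected_exc
            if isinstance(result, return_exc):
                raise result
            return result
        return wrapper
    return decorator

# Invented demo: return the exception to raise it; raising it is "unexpected".
@raise_when_returned(ValueError)
def halve(n):
    if n % 2:
        return ValueError('cannot halve odd number %d' % n)
    return n // 2
```

With this, halve(4) gives 2 and halve(3) raises the returned ValueError, while a ValueError raised accidentally inside halve surfaces as a RuntimeError chained to the original.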
Cheers, Cameron Simpson From brett at python.org Fri Jun 23 19:22:11 2017 From: brett at python.org (Brett Cannon) Date: Fri, 23 Jun 2017 23:22:11 +0000 Subject: [Python-ideas] be upfront if you aren't willing to implement your own idea (was: socket module: plain stuples vs named tuples - Thank you) In-Reply-To: References: <07d5cb54-e6f4-7f2d-7995-d6c01c94401f@thomas-guettler.de> <500887d3-f2db-8563-ee5d-a22bab7e56e6@thomas-guettler.de> Message-ID: It has been brought to my attention that some people found this email as sounding rather angry. I was frustrated (and there is more to this specific issue than what everyone is seeing publicly), but I didn't meant for it to come off as angry, and for that I apologize. On Fri, 23 Jun 2017 at 09:49 Brett Cannon wrote: > Everyone, please be upfront when proposing any ideas if you refuse to > implement your own idea yourself. It's implicit that if you have an idea to > discuss here that you are serious enough about it to see it happen, so if > that's not the case then do say so in your first email (obviously if your > circumstances change during the discussion then that's understandable). > Otherwise people will spend what little spare time they have helping you > think through your idea, and then find out that the discussion will more > than likely end up leading to no change because the most motivated person > behind the discussion isn't motivated enough to actually enact the change. > > And if you lack knowledge in how to implement the idea or a certain area > of expertise, please be upfront about that as well. We have had instances > here where ideas have gone as far as PEPs to only find out the OP didn't > know C which was a critical requirement to implementing the idea, and so > the idea just fell to the wayside and hasn't gone anywhere. It's totally > reasonable to ask for help, but once again, please be upfront that you will > need it to have any chance of seeing your idea come to fruition. 
> > To be perfectly frank, I personally find it misleading to not be told > upfront that you know you will need help (if you learn later because you > didn't know e.g. C would be required, that's different, but once you do > learn then once again be upfront about it). Otherwise I personally feel > like I was tricked into a discussion under false pretenses that the OP was > motivated enough to put the effort in to see their idea come to be. Had I > known to begin with that no one was actually stepping forward to make this > change happen I would have skipped the thread and spent the time I put in > following the discussion into something more productive like reviewing a > pull request. > > On Thu, 22 Jun 2017 at 08:26 Thomas G?ttler > wrote: > >> thank you! I am happy that Guido is open for a pull request ... There >> were +1 votes, too (and some concern about python >> startup time). >> >> >> I stopped coding in spare time, since my children are more important at >> the moment .. if some wants to try it ... go >> ahead and implement named tuples for the socket standard library - would >> be great. >> >> Just for the records, I came here because of this feature request: >> >> https://github.com/giampaolo/psutil/issues/928 >> >> Regards, >> Thomas G?ttler >> >> PS: For some strange reasons I received only some mails of this thread. >> But I could >> find the whole thread in the archive. >> >> Am 20.06.2017 um 04:05 schrieb INADA Naoki: >> > Namedtuple in Python make startup time slow. >> > So I'm very conservative to convert tuple to namedtuple in Python. >> > INADA Naoki >> > >> > >> > On Tue, Jun 20, 2017 at 7:27 AM, Victor Stinner >> > wrote: >> >> Oh, about the cost of writing C code, we started to enhance the socket >> >> module in socket.py but keep the C code unchanged. I am thinking to the >> >> support of enums. Some C functions are wrapped in Python. 
>> >>
>> >> Victor
>> >>
>> >> Le 19 juin 2017 11:59 PM, "Guido van Rossum" a
>> écrit :
>> >>>
>> >>> There are examples in timemodule.c which went through a similar
>> conversion
>> >>> from plain tuples to (sort-of) named tuples. I agree that upgrading
>> the
>> >>> tuples returned by the socket module to named tuples would be nice,
>> but it's
>> >>> a low priority project. Maybe someone interested can create a PR?
>> (First
>> >>> open an issue stating that you're interested; point to this email
>> from me to
>> >>> prevent that some other core dev just closes it again.)
>> >>>
>> >>> On Mon, Jun 19, 2017 at 2:24 PM, Victor Stinner <
>> victor.stinner at gmail.com>
>> >>> wrote:
>> >>>>
>> >>>> Hi,
>> >>>>
>> >>>> 2017-06-13 22:13 GMT+02:00 Thomas Güttler <
>> guettliml at thomas-guettler.de>:
>> >>>>> AFAIK the socket module returns plain tuples in Python3:
>> >>>>>
>> >>>>> https://docs.python.org/3/library/socket.html
>> >>>>>
>> >>>>> Why not use named tuples?
>> >>>>
>> >>>> For technical reasons: the socket module is mostly implemented in the
>> >>>> C language, and defining a "named tuple" in C requires implementing a
>> >>>> "sequence" type, which requires much more code than creating a tuple.
>> >>>>
>> >>>> In short, creating a tuple is as simple as Py_BuildValue("OO", item1,
>> >>>> item2).
>> >>>>
>> >>>> Creating a "sequence" type requires something like 50 lines of code,
>> >>>> maybe more, I don't know exactly.
>> >>>> >> >>>> Victor >> >>>> _______________________________________________ >> >>>> Python-ideas mailing list >> >>>> Python-ideas at python.org >> >>>> https://mail.python.org/mailman/listinfo/python-ideas >> >>>> Code of Conduct: http://python.org/psf/codeofconduct/ >> >>> >> >>> >> >>> >> >>> >> >>> -- >> >>> --Guido van Rossum (python.org/~guido) >> >> >> >> >> >> _______________________________________________ >> >> Python-ideas mailing list >> >> Python-ideas at python.org >> >> https://mail.python.org/mailman/listinfo/python-ideas >> >> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> >> > _______________________________________________ >> > Python-ideas mailing list >> > Python-ideas at python.org >> > https://mail.python.org/mailman/listinfo/python-ideas >> > Code of Conduct: http://python.org/psf/codeofconduct/ >> > > > >> >> -- >> Thomas Guettler http://www.thomas-guettler.de/ >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Fri Jun 23 19:37:51 2017 From: python at mrabarnett.plus.com (MRAB) Date: Sat, 24 Jun 2017 00:37:51 +0100 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: <20170623225603.GA27276@cskk.homeip.net> References: <20170623190209.GQ3149@ando.pearwood.info> <20170623225603.GA27276@cskk.homeip.net> Message-ID: On 2017-06-23 23:56, Cameron Simpson wrote: > On 24Jun2017 05:02, Steven D'Aprano wrote: [snip] >>I think the concept of a "shallow exception" is ill-defined, and to the >>degree that it is defined, it is *dangerous*: a bug magnet waiting to >>strike. >> >>What do you mean by "directly raised from the surface code"? Why is >>bah[5] "surface code" but foo(x) is not? But call a function (or >>method). > [...] 
>
> I've replied to Paul Moore and suggested this definition as implementable and 
> often useful:
>
>    A shallow catch would effectively need to mean "the exception's 
>    uppermost traceback frame refers to one of the program lines 
>    in the try/except suite". Which would work well for lists and 
>    other builtin types. And might be insufficient for a duck-type 
>    with python-coded dunder methods.
>
> The target here is not to perform magic but to have a useful tool to identify 
> exceptions that arise fairly directly from the adjacent code and not what it 
> calls. Without writing cumbersome and fragile boilerplate to dig into an 
> exception's traceback.
>
I think a "shallow exception" would be one that's part of a defined API, 
as distinct from one that is an artifact of the implementation, a leak 
in the abstraction.

It's like when "raise ... from None" was introduced to help in those 
cases where you want to replace an exception that's a detail of the 
(current) internal implementation with one that's intended for the user.

From cs at zip.com.au  Fri Jun 23 18:43:44 2017
From: cs at zip.com.au (Cameron Simpson)
Date: Sat, 24 Jun 2017 08:43:44 +1000
Subject: [Python-ideas] Improving Catching Exceptions
In-Reply-To: 
References: 
Message-ID: <20170623224344.GA23657@cskk.homeip.net>

On 23Jun2017 20:30, Stephan Houben wrote:
>2017-06-23 17:09 GMT+02:00 Andy Dirnberger :
>> It's not really a proposal. It's existing syntax.
>
>Wow! I have been using Python since 1.5.2 and I never knew this.
>This is not Guido's famous time machine in action, by any chance?
>Guess there's some code to refactor using this construct now...

Alas, no. It is existing syntax in Standard ML, not in Python.
Cheers, Cameron Simpson From cs at zip.com.au Fri Jun 23 18:38:33 2017 From: cs at zip.com.au (Cameron Simpson) Date: Sat, 24 Jun 2017 08:38:33 +1000 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: References: Message-ID: <20170623223833.GA10981@cskk.homeip.net> On 23Jun2017 15:59, Paul Moore wrote: >On 23 June 2017 at 15:20, Sven R. Kunze wrote: >> On 23.06.2017 03:02, Cameron Simpson wrote: >> How about something like this?
>>
>> try:
>>     val = bah[5]
>> except IndexError:
>>     # handle your expected exception here
>> else:
>>     foo(val)
>>
>> That is the kind of refactor to which I alluded in the paragraph above. >> Doing that a lot tends to obscure the core logic of the code, hence the >> desire for something more succinct requiring less internal code fiddling. >> >> And depending on how complex bah.__getitem__ is, it can raise IndexError >> unintentionally as well. So, rewriting the outer code doesn't even help >> then. :-( >At this point, it becomes unclear to me what constitutes an >"intentional" IndexError, as opposed to an "unintentional" one, at >least in any sense that can actually be implemented. While I agree that an object with its own __getitem__ would look "deep", what I was actually suggesting as a possibility was a "shallow" except catch, not some magic "intentional" semantic. A shallow catch would effectively need to mean "the exception's uppermost traceback frame refers to one of the program lines in the try/except suite". Which would work well for lists and other builtin types. And might be insufficient for a duck-type with python-coded dunder methods. [...snip...] >On the other hand, I do see the point that insisting on finer and >finer grained exception handling ultimately ends up with unreadable >code.
But it's not a problem I'd expect to see much in real life code >(where code is either not written that defensively, because >there's context that allows the coder to make assumptions that objects >will behave reasonably sanely, or the code gets refactored to put the >exception handling in a function, or something like that). Sure, there are many circumstances where a succinct "shallow catch" might not be useful. But there are also plenty of circumstances where one would like just this flavour of precision. Cheers, Cameron Simpson From greg.ewing at canterbury.ac.nz Fri Jun 23 21:21:17 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 24 Jun 2017 13:21:17 +1200 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: <20170623224344.GA23657@cskk.homeip.net> References: <20170623224344.GA23657@cskk.homeip.net> Message-ID: <594DBE8D.4000307@canterbury.ac.nz> Cameron Simpson wrote: > Alas, no. It is existing syntax in Standard ML, not in Python. But Python doesn't need it, because try-except-else covers the same thing. -- Greg From greg.ewing at canterbury.ac.nz Fri Jun 23 21:02:55 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 24 Jun 2017 13:02:55 +1200 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: <20170622232923.GA48632@cskk.homeip.net> References: <20170622205532.GP3149@ando.pearwood.info> <20170622232923.GA48632@cskk.homeip.net> Message-ID: <594DBA3F.1010505@canterbury.ac.nz> Cameron Simpson wrote:
> try:
>     foo(bah[5])
> except IndexError as e:
>     ... infer that there is no bah[5] ...
>
> One can easily want, instead, some kind of "shallow except", which would
> catch exceptions only if they were directly raised from the surface
> code;
The problem I see with that is how to define what counts as "surface code". If the __getitem__ method of bah is written in Python, I don't see how you could tell that an IndexError raised by it should be caught, but one raised by foo() shouldn't.
In any case, this doesn't address the issue raised by the OP, which in this example is that if the implementation of bah.__getitem__ calls something else that raises an IndexError, there's no easy way to distinguish that from one raised by bah.__getitem__ itself. I don't see any way to solve that by messing around with different try-except constructs. It can only be addressed from within bah.__getitem__ itself, by having it catch any incidental IndexErrors and turn them into a different exception. -- Greg From greg.ewing at canterbury.ac.nz Fri Jun 23 21:14:43 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 24 Jun 2017 13:14:43 +1200 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: <20170623223833.GA10981@cskk.homeip.net> References: <20170623223833.GA10981@cskk.homeip.net> Message-ID: <594DBD03.6030103@canterbury.ac.nz> Cameron Simpson wrote: > A shallow catch would effectively need to mean "the exception's uppermost > traceback frame refers to one of the program lines in the try/except > suite". Which would work well for lists and other builtin types. And > might be insufficient for a duck-type with python-coded dunder methods. I think it would be a very bad idea to have a language construct that only works for built-in types. It would be far too unpredictable and fragile.
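For what it's worth, the "shallow catch" heuristic can be prototyped in today's Python by inspecting the exception's traceback: an exception raised directly by the lines in a guarded block has no traceback frames below the block's own frame. The sketch below is purely illustrative (the class name and API are invented here), and the caveat above stands: anything with Python-coded dunder methods adds frames and so looks "deep":

```python
class ShallowCatch:
    """Suppress `exc_type` only if it was raised directly by the lines
    inside the `with` block, i.e. the traceback has no deeper frames.
    An exception raised inside a function *called* from the block still
    propagates."""

    def __init__(self, exc_type):
        self.exc_type = exc_type
        self.caught = None  # holds the suppressed exception, if any

    def __enter__(self):
        return self

    def __exit__(self, etype, exc, tb):
        if etype is None or not issubclass(etype, self.exc_type):
            return False  # no exception, or not the type we care about
        if tb.tb_next is None:  # raised in the block's own frame: shallow
            self.caught = exc
            return True  # suppress it
        return False  # raised somewhere deeper: let it propagate
```

A bare `bah[5]` on a plain list fails in the block's own frame (list.__getitem__ raises from C code and adds no Python frame), so it counts as shallow; an IndexError raised inside a function called from the block adds a deeper frame and propagates unchanged.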
-- Greg From brett at python.org Fri Jun 23 20:31:06 2017 From: brett at python.org (Brett Cannon) Date: Sat, 24 Jun 2017 00:31:06 +0000 Subject: [Python-ideas] be upfront if you aren't willing to implement your own idea In-Reply-To: References: <07d5cb54-e6f4-7f2d-7995-d6c01c94401f@thomas-guettler.de> <500887d3-f2db-8563-ee5d-a22bab7e56e6@thomas-guettler.de> <594D5DB8.7090400@brenbarn.net> Message-ID: On Fri, 23 Jun 2017 at 13:10 Paul Moore wrote: > On 23 June 2017 at 19:28, Brendan Barnwell wrote: > > So to put it succinctly, as someone who's found discussion on this list > > interesting and valuable, I think there is value in having discussion > about > > "what would Python be like if this idea were implemented" even if we > never > > get very far with "how would we implement this idea in Python". And I > would > > find it unfortunate if discussion of the former were prematurely > restricted > > by worries about the latter. > > No-one is proposing otherwise, just that people are open when starting > a discussion as to whether they anticipate being able to follow > through with an implementation if the idea meets with approval, or if > they are simply making a suggestion that they hope someone else will > take up. That's not too much to ask, nor does it in any way stifle > reasonable discussion (it may discourage people who want to > *deliberately* give the impression that they will do the work, but > actually have no intention of doing so - but I hope there's no-one > like that here and if there were, I'm happy with discouraging them). > > So I'm +1 on Brett's request. +1 to what Paul and Guido said: people are welcome to have hypothetical discussions here as long as they are upfront that it is hypothetical, but I personally choose to ignore all discussions that don't involve discussing the implementation of said idea (hence why learning that at the end is frustrating for those of us who are trying to be pragmatic with our time). 
Sorry if that wasn't clear enough in my original email where the distinction between "Brett as list admin" and "Brett as list participant" started and ended. From steve at pearwood.info Sat Jun 24 06:03:26 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 24 Jun 2017 20:03:26 +1000 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: <594DBA3F.1010505@canterbury.ac.nz> References: <20170622205532.GP3149@ando.pearwood.info> <20170622232923.GA48632@cskk.homeip.net> <594DBA3F.1010505@canterbury.ac.nz> Message-ID: <20170624100326.GT3149@ando.pearwood.info> On Sat, Jun 24, 2017 at 01:02:55PM +1200, Greg Ewing wrote: > In any case, this doesn't address the issue raised by the OP, > which in this example is that if the implementation of > bah.__getitem__ calls something else that raises an IndexError, > there's no easy way to distinguish that from one raised by > bah.__getitem__ itself. I'm not convinced that's a meaningful distinction to make in general. Consider the difference between these two classes:

class X:
    def __getitem__(self, n):
        if n < 0:
            n += len(self)
        if not 0 <= n < len(self):
            raise IndexError
        ...

class Y:
    def __getitem__(self, n):
        self._validate(n)
        ...
    def _validate(self, n):
        if n < 0:
            n += len(self)
        if not 0 <= n < len(self):
            raise IndexError

The difference is a mere difference of refactoring. Why should one of them be treated as "bah.__getitem__ raises itself" versus "bah.__getitem__ calls something which raises"? That's just an implementation detail. I think we're over-generalizing this problem. There's two actual issues here, and we shouldn't conflate them as the same problem: (1) People write buggy code based on invalid assumptions of what can and can't raise. E.g.:

try:
    foo(baz[5])
except IndexError:
    ...  # assume baz[5] failed (but maybe foo can raise too?)
(2) There's a *specific* problem with property where a bug in your getter or setter that raises AttributeError will be masked, appearing as if the property itself doesn't exist. In the case of (1), there's nothing Python the language can do to fix that. The solution is to write better code. Question your assumptions. Think carefully about your pre-conditions and post-conditions and invariants. Plan ahead. Read the documentation of foo before assuming it won't raise. In other words, be a better programmer. If only it were that easy :-( (Aside: I've been thinking for a long time that design by contract is a very useful feature to have. It should be possible to set a contract that states that this function won't raise a certain exception, and if it does, treat it as a runtime error. But this is still at a very early point in my thinking.) Python libraries rarely give a definitive list of what exceptions functions can raise, so unless you wrote it yourself and know exactly what it can and cannot do, defensive coding suggests that you assume any function call might raise any exception at all:

try:
    item = baz[5]
except IndexError:
    ...  # assume baz[5] failed
else:
    foo(item)

Can we fix that? Well, maybe we should re-consider the rejection of PEP 463 (exception-catching expressions). https://www.python.org/dev/peps/pep-0463/ Maybe we need a better way to assert that a certain function won't raise a particular exception:

try:
    item = bah[5]
without IndexError:
    foo(item)
except IndexError:
    ...  # assume baz[5] failed

(But how is that different from try...except...else?) In the case of (2), the masking of bugs inside property getters if they happen to raise AttributeError, I think the std lib can help with that. Perhaps a context manager or decorator (or both) that converts one exception to another?

@property
@bounce_exception(AttributeError, RuntimeError)
def spam(self):
    ...

Now spam.__get__ cannot raise AttributeError; if it does, it will be converted to RuntimeError.
If you need finer control over the code that is guarded, use the context manager form:

@property
def spam(self):
    with bounce_exception(AttributeError, RuntimeError):
        # guarded
        if condition:
            ...
    # not guarded
    raise AttributeError("property spam doesn't exist yet")

-- Steve From greg.ewing at canterbury.ac.nz Sat Jun 24 08:31:40 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 25 Jun 2017 00:31:40 +1200 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: <20170624100326.GT3149@ando.pearwood.info> References: <20170622205532.GP3149@ando.pearwood.info> <20170622232923.GA48632@cskk.homeip.net> <594DBA3F.1010505@canterbury.ac.nz> <20170624100326.GT3149@ando.pearwood.info> Message-ID: <594E5BAC.5050809@canterbury.ac.nz> Steven D'Aprano wrote:
> class X:
>     def __getitem__(self, n):
>         if n < 0:
>             n += len(self)
>         if not 0 <= n < len(self):
>             raise IndexError
>         ...
>
> class Y:
>     def __getitem__(self, n):
>         self._validate(n)
>         ...
>     def _validate(self, n):
>         if n < 0:
>             n += len(self)
>         if not 0 <= n < len(self):
>             raise IndexError
>
> Why should one of
> them be treated as "bah.__getitem__ raises itself" versus
> "bah.__getitem__ calls something which raises"?
They shouldn't be treated differently -- they're both legitimate ways for __getitem__ to signal that the item doesn't exist. What *should* be treated differently is if an IndexError occurs incidentally from something else that __getitem__ does. In other words, Y's __getitem__ should be written something like

def __getitem__(self, n):
    self.validate(n)
    # If we get an IndexError from here on, it's a bug
    try:
        # work out the result and return it
    except IndexError as e:
        raise RuntimeError from e

> I think we're over-generalizing this problem. There's two actual issues > here, and we shouldn't conflate them as the same problem: > > (1) People write buggy code based on invalid assumptions of what can and > can't raise.
E.g.: > > (2) There's a *specific* problem with property where a bug in your > getter or setter that raises AttributeError will be masked, appearing as > if the property itself doesn't exist. Agreed. Case 1 can usually be handled by rewriting the code so as to make the scope of exception catching as narrow as possible. Case 2 needs to be addressed within the method concerned on a case-by-case basis. If there's a general principle there, it's something like this: If you're writing a method that uses an exception as part of its protocol, you should catch any incidental occurrences of the same exception and reraise it as a different exception. I don't think there's anything more the Python language could do to help with either of those. > (Aside: I've been thinking for a long time that design by contract is a > very useful feature to have. It should be possibly to set a contract > that states that this function won't raise a certain exception, and if > it does, treat it as a runtime error. But this is still at a very early > point in my thinking.) That sounds dangerously similar to Java's checked exceptions, which have turned out to be a huge nuisance and not very helpful. > Maybe we need a better way to assert that a certain function won't raise > a particular exception:
>
> try:
>     item = bah[5]
> without IndexError:
>     foo(item)
> except IndexError:
>     ...  # assume baz[5] failed
>
> (But how is that different from try...except...else?)
It's no different, if I understand what it's supposed to mean correctly.
> @property
> @bounce_exception(AttributeError, RuntimeError)
> def spam(self):
>     ...
In the case of property getters, it seems to me you're almost always going to want that functionality, so maybe it should be incorporated into the property decorator itself. The only cases where you wouldn't want it would be if your property dynamically figures out whether it exists or not, and in those rare cases you would just have to write your own descriptor class.
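For reference, the `bounce_exception` helper discussed above can be prototyped in a few lines on top of `contextlib.ContextDecorator`, which makes a single class usable both as a decorator and as a context manager. This is only an illustrative sketch of the idea, not an existing stdlib API (the `Demo` class is invented here):

```python
from contextlib import ContextDecorator

class bounce_exception(ContextDecorator):
    """Re-raise exceptions of type `src` as `dst`, chaining the original
    so the real cause stays visible in the traceback."""

    def __init__(self, src, dst):
        self.src = src
        self.dst = dst

    def __enter__(self):
        return self

    def __exit__(self, etype, exc, tb):
        if etype is not None and issubclass(etype, self.src):
            raise self.dst(str(exc) or etype.__name__) from exc
        return False


class Demo:
    @property
    @bounce_exception(AttributeError, RuntimeError)
    def spam(self):
        return self.no_such_attribute  # a typo-style bug in the getter
```

Accessing `Demo().spam` then fails loudly with a RuntimeError chained to the original AttributeError, instead of the bug being masked as a missing attribute by hasattr() or a __getattr__ fallback.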
-- Greg From ncoghlan at gmail.com Sat Jun 24 09:45:25 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 24 Jun 2017 23:45:25 +1000 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: <594E5BAC.5050809@canterbury.ac.nz> References: <20170622205532.GP3149@ando.pearwood.info> <20170622232923.GA48632@cskk.homeip.net> <594DBA3F.1010505@canterbury.ac.nz> <20170624100326.GT3149@ando.pearwood.info> <594E5BAC.5050809@canterbury.ac.nz> Message-ID: On 24 June 2017 at 22:31, Greg Ewing wrote: > Steven D'Aprano wrote: >> I think we're over-generalizing this problem. There's two actual issues >> here, and we shouldn't conflate them as the same problem: >> >> (1) People write buggy code based on invalid assumptions of what can and >> can't raise. E.g.: >> >> (2) There's a *specific* problem with property where a bug in your getter >> or setter that raises AttributeError will be masked, appearing as if the >> property itself doesn't exist. > > Agreed. > > Case 1 can usually be handled by rewriting the code so as to > make the scope of exception catching as narrow as possible. > > Case 2 needs to be addressed within the method concerned on a > case-by-case basis. If there's a general principle there, it's > something like this: If you're writing a method that uses > an exception as part of its protocol, you should catch any > incidental occurrences of the same exception and reraise it > as a different exception. > > I don't think there's anything more the Python language could do > to help with either of those. While I used to think that, I'm no longer sure it's true, as it seems to me that a `contextlib.convert_exception` context manager could help with both of them.
(That's technically the standard library helping, rather than the language per se, but it's still taking a currently obscure implementation pattern and making it more obvious by giving it a name) So if we assume that existed, and converted the given exception to RuntimeError (the same way PEP 479 does for StopIteration), we'd be able to write magic methods and properties in the following style:

def __getitem__(self, n):
    self.validate(n)
    with contextlib.convert_exception(IndexError):
        # If we get an IndexError in here, it's a bug
        return self._getitem(n)

That idiom then works the same way regardless of how far away your code is from the exception handler you're trying to bypass - you could just as easily put it inside the `self._getitem(n)` helper method instead. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From k7hoven at gmail.com Sat Jun 24 15:42:19 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Sat, 24 Jun 2017 22:42:19 +0300 Subject: [Python-ideas] Runtime types vs static types Message-ID: There has been some discussion here and there concerning the differences between runtime types and static types (mypy etc.). What I write below is not really an idea or proposal---just a perspective, or a topic that people may want to discuss. Since the discussion on this is currently very fuzzy and scattered and not really happening either AFAICT (I've probably missed many discussions, though). Anyway, I thought I'd give it a shot: Clearly, there needs to be some sort of distinction between runtime classes/types and static types, because static types can be more precise than Python's dynamic runtime semantics. For example, Iterable[int] is an iterable that contains integers. For a static type checker, it is clear what this means. But at runtime, it may be impossible to figure out whether an iterable is really of this type without consuming the whole iterable and checking whether each yielded element is an integer.
Even that is not possible if the iterable is infinite. Even Sequence[int] is problematic, because checking the types of all elements of the sequence could take a long time. Since things like isinstance(it, Iterable[int]) cannot guarantee a proper answer, one easily arrives at the conclusion that static types and runtime classes are just two separate things and that one cannot require that all types support something like isinstance at runtime. On the other hand, there are many runtime things that can or could be done using (type) annotations, for example: Multidispatch (example with hypothetical syntax below):

@overload
def concatenate(parts: Iterable[str]) -> str:
    return "".join(parts)

@overload
def concatenate(parts: Iterable[bytes]) -> bytes:
    return b"".join(parts)

@overload
def concatenate(parts: Iterable[Iterable]) -> Iterable:
    return itertools.chain(*parts)

or runtime type checking:

@check_types
def load_from_file(filename: Union[os.PathLike, str, bytes]):
    with open(filename) as f:
        return do_stuff_with(f.read())

which would automatically give a nice error message if, say, a file object is given as argument instead of a path to a file. However useful (and efficient) these things might be, the runtime type checks are problematic, as discussed above. Furthermore, other differences between runtime and static typing may emerge (or have emerged), which will complicate the matter further. For instance, the runtime __annotations__ of classes, modules and functions may in some cases contain something completely different from what a type checker thinks the type should be. These and other incompatibilities between runtime and static typing will create two (or more) different kinds of type-annotated Python: runtime-oriented Python and Python with static type checking.
These may be incompatible in both directions: a static type checker may complain about code that is perfectly valid for the runtime folks, and code written for static type checking may not be able to use new Python techniques that make use of type hints at runtime. There may not even be a fully functional subset of the two "languages". Different libraries will adhere to different standards and will not be compatible with each other. The split will be much worse and more difficult to understand than Python 2 vs 3, people around the world will suffer like never before, and programming in Python will become a very complicated mess. One way of solving the problem would be that type annotations are only a static concept, like with stubs or comment-based type annotations. This would also be nice from a memory and performance perspective, as evaluating and storing the annotations would not occupy memory (although both issues and some more might be nicely solved by making the annotations lazily evaluated). However, leaving out runtime effects of type annotations is not the approach taken, and runtime introspection of annotations seems to have some promising applications as well. And for many cases, the traditional Python class actually acts very nicely as both the runtime and static type. So if type annotations will be both for runtime and for static checking, how to make everything work for both static and runtime typing? Since a writer of a library does not know what the type hints will be used for by the library users, it is very important that there is only one way of making type annotations which will work regardless of what the annotations are used for in the end. This will also make it much easier to learn Python typing. Regarding runtime types and isinstance, let's look at the Iterable[int] example. For this case, there are a few options:

1) Don't implement isinstance

This is problematic for runtime uses of annotations.
2) isinstance([1, '2', 'three'], Iterable[int]) returns True

This is in fact now the case. This is ok for many runtime situations, but lacks precision compared to the static version. One may want to distinguish between Iterable[int] and Iterable[str] at runtime (e.g. the multidispatch example above).

3) Check as much as you can at runtime

There could be something like Reiterable, which means the object is not consumed by iterating over it, so one could actually check if all elements are instances of int. This would be useful in some situations, but not available for every object. Furthermore, the check could take an arbitrary amount of time so it is not really suitable for things like multidispatch or some matching constructs etc., where the performance overhead of the type check is really important.

4) Do a deeper check than in (2) but trust the annotations

For example, an instance of a class that has a method like

def __iter__(self) -> Iterator[int]:
    some code

could be identified as Iterable[int] at runtime, even if it is not guaranteed that all elements are really integers. On the other hand, an object returned by

def get_ints() -> Iterable[int]:
    some code

does not know its own annotations, so the check is difficult to do at runtime. And of course, there may not be annotations available.

5) Something else?

And what about PEP544 (protocols), which is being drafted? The PEP seems to aim for having type objects that represent duck-typing protocols/interfaces. Checking whether a protocol is implemented by an object or type is clearly a useful thing to do at runtime, but it is not really clear if isinstance would be a guaranteed feature for PEP544 Protocols. So one question is, is it possible to draw the lines between what works with isinstance and what doesn't, and between what details are checked by isinstance and what aren't?
-- Or should isinstance be reserved for a more limited purpose, and add another check function, say `implements(...)`, which would perhaps guarantee some answer for all combinations of object and type? I'll stop here---this email is probably already much longer than a single email should be ;) -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + From lucas.wiman at gmail.com Sat Jun 24 16:30:30 2017 From: lucas.wiman at gmail.com (Lucas Wiman) Date: Sat, 24 Jun 2017 13:30:30 -0700 Subject: [Python-ideas] Runtime types vs static types In-Reply-To: References: Message-ID: > > And what about PEP544 (protocols), which is being drafted? The PEP seems > to aim for having type objects that represent duck-typing > protocols/interfaces. Checking whether a protocol is implemented by an > object or type is clearly a useful thing to do at runtime, but it is not > really clear if isinstance would be a guaranteed feature for PEP544 > Protocols. > > So one question is, is it possible to draw the lines between what works > with isinstance and what doesn't, and between what details are checked by > isinstance and what aren't? -- Or should isinstance be reserved for a more > limited purpose, and add another check function, say `implements(...)`, > which would perhaps guarantee some answer for all combinations of object > and type? > I'm guessing to implement PEP 544, many of the `__instancecheck__` and `__subclasscheck__` methods in `typing.py` would need to be updated to check the `__annotations__` of the class of the object it's passed against its own definition (covered in this section of the PEP). I've been somewhat surprised that many of the `__instancecheck__` implementations do not work at runtime, even when the implementation would be trivial (e.g. for `Union`), or would not have subtle edge cases due to immutability (e.g.
for `Tuple`, which cannot be used for checking parameterized instances). This seems like counterintuitive behavior that would be straightforward to fix, unless there are subtleties & edge cases I'm missing. If people are amenable to updating those cases, I'd be interested in submitting a patch to that effect. Best, Lucas On Sat, Jun 24, 2017 at 12:42 PM, Koos Zevenhoven wrote: > There has been some discussion here and there concerning the differences > between runtime types and static types (mypy etc.). [...snip...]
>
> I'll stop here---this email is probably already much longer than a single
> email should be ;)
>
> -- Koos
>
>
> --
> + Koos Zevenhoven + http://twitter.com/k7hoven +
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tim.peters at gmail.com  Sat Jun 24 17:29:28 2017
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 24 Jun 2017 16:29:28 -0500
Subject: [Python-ideas] Reducing collisions in small dicts/sets
Message-ID:

Short course: the average number of probes needed when searching small
dicts/sets can be reduced, in both successful ("found") and failing
("not found") cases. But I'm not going to pursue this. This is a brain
dump for someone who's willing to endure the interminable pain of arguing
about benchmarks ;-)

Background: http://bugs.python.org/issue30671 raised some questions about
how dict collisions are handled. While the analysis there didn't make
sense to me, I wrote enough code to dig into it. As detailed in that bug
report, the current implementation appeared to meet the theoretical
performance of "uniform hashing", meaning there was no room left for
improvement.

However, that missed something: the simple expressions for expected probes
under uniform hashing are upper bounds, and while they're excellent
approximations for modest load factors in sizable tables, for small tables
they're significantly overstated. For example, for a table with 5 items in
8 slots, the load factor is a = 5/8 = 0.625, and

    avg probes when found = log(1/(1-a))/a = 1.57
              when not found = 1/(1-a) = 2.67

However, exact analysis gives 1.34 and 2.25 instead. The current code
achieves the upper bounds, but not the exact values.
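The approximate figures quoted above can be checked directly (a quick editorial sketch, nothing more):

```python
import math

a = 5 / 8  # load factor: 5 items in 8 slots
found = math.log(1 / (1 - a)) / a  # expected probes, successful search
not_found = 1 / (1 - a)            # expected probes, failing search
print(round(found, 2), round(not_found, 2))  # 1.57 2.67
```

These are the uniform-hashing upper bounds; the exact small-table averages (1.34 and 2.25) come from the `avgs`/`avgf` functions in the attached script.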
As a sanity check, a painfully slow implementation of uniform hashing does
achieve the exact values.

Code for all this is attached, in a framework that allows you to easily
plug in any probe sequence strategy. The current strategy is implemented
by generator "current". There are also implementations of "linear"
probing, "quadratic" probing, "pre28201" probing (the current strategy
before bug 28201 repaired an error in its coding), "uniform" probing,
and ... "double". The last is one form of "double hashing" that gets very
close to "uniform". Its loop guts are significantly cheaper than the
current scheme, just 1 addition and 1 mask. However, it requires a
non-trivial modulus to get started, and that's expensive. Is there a
cheaper way to get close to "uniform"? I don't know - this was just the
best I came up with so far.

Does it matter? See above ;-)

If you want to pursue this, take these as given:

1. The first probe must be simply the last `nbits` bits of the hash code.
The speed of the first probe is supremely important, that's the fastest
possible first probe, and it guarantees no collisions at all for a dict
indexed by a contiguous range of integers (an important use case).

2. The probe sequence must (at least eventually) depend on every bit in
the hash code. Else it's waaay too easy to stumble into quadratic-time
behavior for "bad" sets of keys, even by accident.

Have fun :-)
-------------- next part --------------
MIN_ELTS = 100_000

M64 = (1 << 64) - 1
def phash(obj, M=M64):  # hash(obj) as uint64
    return hash(obj) & M

# Probers: generate sequence of table indices to look at,
# in table of size 2**nbits, for object with uint64 hash code h.

def linear(h, nbits):
    mask = (1 << nbits) - 1
    i = h & mask
    while True:
        yield i
        i = (i + 1) & mask

# offsets of 0, 1, 3, 6, 10, 15, ...
# this permutes the index range when the table size is a power of 2
def quadratic(h, nbits):
    mask = (1 << nbits) - 1
    i = h & mask
    inc = 1
    while True:
        yield i
        i = (i + inc) & mask
        inc += 1

def pre28201(h, nbits):
    mask = (1 << nbits) - 1
    i = h & mask
    while True:
        yield i
        i = (5*i + h + 1) & mask
        h >>= 5

def current(h, nbits):
    mask = (1 << nbits) - 1
    i = h & mask
    while True:
        yield i
        h >>= 5
        i = (5*i + h + 1) & mask

# One version of "double hashing". The increment between probes is
# fixed, but varies across objects. This does very well! Note that the
# increment needs to be relatively prime to the table size so that all
# possible indices are generated. Because our tables have power-of-2
# sizes, we merely need to ensure the increment is odd.
# Using `h % mask` is akin to "casting out 9's" in decimal: it's as if
# we broke the hash code into nbits-wide chunks from the right, then
# added them, then repeated that procedure until only one "digit"
# remains. All bits in the hash code affect the result.
# While mod is expensive, a successful search usually gets out on the
# first try, & then the lookup can return before the mod completes.
def double(h, nbits):
    mask = (1 << nbits) - 1
    i = h & mask
    yield i
    inc = (h % mask) | 1  # force it odd
    while True:
        i = (i + inc) & mask
        yield i

# The theoretical "gold standard": generate a random permutation of the
# table indices for each object. We can't actually do that, but
# Python's PRNG gets close enough that there's no practical difference.
def uniform(h, nbits):
    from random import seed, randrange
    seed(h)
    n = 1 << nbits
    seen = set()
    while True:
        assert len(seen) < n
        while True:
            i = randrange(n)
            if i not in seen:
                break
        seen.add(i)
        yield i

def spray(nbits, objs, cs, prober, *, used=None, shift=5):
    building = used is None
    nslots = 1 << nbits
    mask = nslots - 1
    if building:
        used = [0] * nslots
    assert len(used) == nslots
    for o in objs:
        n = 1
        for i in prober(phash(o), nbits):
            if used[i]:
                n += 1
            else:
                break
        if building:
            used[i] = 1
        cs[n] += 1
    return used

def objgen(i=1):
    while True:
        yield str(i)
        i += 1

# Average probes for a failing search; e.g.,
# 100 slots; 3 occupied
# 1: 97/100
# 2: 3/100 * 97/99
# 3: 3/100 * 2/99 * 97/98
# 4: 3/100 * 2/99 * 1/98 * 97/97
#
# `total` slots, `filled` occupied
# probability `p` probes will be needed, 1 <= p <= filled+1
# p-1 collisions followed by success:
# ff(filled, p-1) / ff(total, p-1) * (total - filled) / (total - (p-1))
# where `ff` is the falling factorial.
def avgf(total, filled):
    assert 0 <= filled < total
    ffn = float(filled)
    ffd = float(total)
    tmf = ffd - ffn
    result = 0.0
    ffpartial = 1.0
    ppartial = 0.0
    for p in range(1, filled + 2):
        thisp = ffpartial * tmf / (total - (p-1))
        ppartial += thisp
        result += thisp * p
        ffpartial *= ffn / ffd
        ffn -= 1.0
        ffd -= 1.0
    assert abs(ppartial - 1.0) < 1e-14, ppartial
    return result

# Average probes for a successful search. Alas, this takes time
# quadratic in `filled`.
def avgs(total, filled):
    assert 0 < filled < total
    return sum(avgf(total, f) for f in range(filled)) / filled

def pstats(ns):
    total = sum(ns.values())
    small = min(ns)
    print(f"min {small}:{ns[small]/total:.2%} "
          f"max {max(ns)} "
          f"mean {sum(i * j for i, j in ns.items())/total:.2f} ")

def drive(nbits):
    from collections import defaultdict
    from itertools import islice
    import math
    import sys
    nslots = 1 << nbits
    dlen = nslots * 2 // 3
    assert (sys.getsizeof({i: i for i in range(dlen)})
            < sys.getsizeof({i: i for i in range(dlen + 1)}))
    alpha = dlen / nslots  # actual load factor of max dict
    ntodo = (MIN_ELTS + dlen - 1) // dlen
    print()
    print("bits", nbits,
          f"nslots {nslots:,} dlen {dlen:,} alpha {alpha:.2f} "
          f"# built {ntodo:,}")
    print(f"theoretical avg probes for uniform hashing "
          f"when found {math.log(1/(1-alpha))/alpha:.2f} "
          f"not found {1/(1-alpha):.2f}")
    print(" crisp ", end="")
    if nbits > 12:
        print("... skipping (slow!)")
    else:
        print(f"when found {avgs(nslots, dlen):.2f} "
              f"not found {avgf(nslots, dlen):.2f}")
    for prober in (linear, quadratic, pre28201, current, double, uniform):
        print(" prober", prober.__name__)
        objs = objgen()
        good = defaultdict(int)
        bad = defaultdict(int)
        for _ in range(ntodo):
            used = spray(nbits, islice(objs, dlen), good, prober)
            assert sum(used) == dlen
            spray(nbits, islice(objs, dlen), bad, prober, used=used)
        print(" " * 8 + "found ", end="")
        pstats(good)
        print(" " * 8 + "fail ", end="")
        pstats(bad)

for bits in range(3, 23):
    drive(bits)

From markusmeskanen at gmail.com  Sun Jun 25 07:58:10 2017
From: markusmeskanen at gmail.com (Markus Meskanen)
Date: Sun, 25 Jun 2017 14:58:10 +0300
Subject: [Python-ideas] A suggestion for a do...while loop
Message-ID:

I'm a huge fan of the do...while loop in other languages, and it would
often be useful in Python too, when doing stuff like:

while True:
    password = input()
    if password == ...:
        break

I've seen the pep 315 which got rejected, and I believe these two
suggestions were mostly focused on:

1. do ...
while <condition>:
       <loop body>

2. do:
       <loop body>
   while <condition>

But both were rejected for valid reasons:

1. It makes little sense to have the while at the top, since it might need
to use variables from within the body. That's the whole point of the
do...while loop

2. There's no other syntax like this in Python, where you'd need a closing
unindentation. Also it's using the existing "while" keyword with a
different syntax than the current one

What I'd like to suggest is a different approach, with an existing syntax
found in functions. At first it might sound silly, but I believe it makes
sense after a while:

do:
    <loop body>

Now bear with me a second, this is similar to a function's

def <name>(<args>):
    <function body>

And similar to how you can exit functions with `return`, you could also
exit the `do` loop with some keyword, such as `until`. This makes it very
similar to how `return` behaves in a function:

do:
    password = input()
    until password == secret_password

Now before you say "but return can be anywhere in the function, and there
can be multiple of them", I suggest we do the same for "until". It would be
like a break, but with a condition:

do:
    password = input('Password: ')
    until password == secret_password

    # This line only gets printed if until failed
    print('Invalid password, try again!')

print('You have successfully signed in!')

The keywords are obviously a subject to change, I've also been tinkering
with repeat/until, do/breakif, repeat/breakon, etc.

Any thoughts, is this complete madness?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From storchaka at gmail.com  Sun Jun 25 08:10:30 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sun, 25 Jun 2017 15:10:30 +0300
Subject: [Python-ideas] A suggestion for a do...while loop
In-Reply-To:
References:
Message-ID:

25.06.17 14:58, Markus Meskanen wrote:
> I'm a huge fan of the do...while loop in other languages, and it would
> often be useful in Python too, when doing stuff like:
>
> while True:
>     password = input()
>     if password == ...:
>         break

In this particular case you could write:

for password in iter(input, secret_password):
    ...

In more complex cases you can either write more complex generator or just
use conditional break. There is nothing wrong with this.

From lucas.bourneuf at laposte.net  Sun Jun 25 08:06:54 2017
From: lucas.bourneuf at laposte.net (lucas)
Date: Sun, 25 Jun 2017 14:06:54 +0200
Subject: [Python-ideas] + operator on generators
Message-ID: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net>

Hello !

I often use generators, and itertools.chain on them.
What about providing something like the following:

a = (n for n in range(2))
b = (n for n in range(2, 4))
tuple(a + b)  # -> 0 1 2 3

This, from a user's point of view, is just as how the __add__ operator
works on lists and tuples. Making generators work the same way could be a
great way to avoid calls to itertools.chain everywhere, and to limit the
differences between generators and other "linear" collections.

I do not know exactly how to implement that (i'm not that good at C, nor
CPython source itself), but by seeing the sources, i imagine that i could
do something like the list_concat function at Objects/listobject.c:473,
but in the Objects/genobject.c file, where instead of copying elements i'm
creating and initializing a new chainobject as described at
Modules/itertoolsmodule.c:1792.

(In pure python, the implementation would be something like
`def __add__(self, othr): return itertools.chain(self, othr)`)

Best regards,
--lucas
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL:

From storchaka at gmail.com  Sun Jun 25 08:51:10 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sun, 25 Jun 2017 15:51:10 +0300
Subject: [Python-ideas] + operator on generators
In-Reply-To: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net>
References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net>
Message-ID:

25.06.17 15:06, lucas via Python-ideas wrote:
> I often use generators, and itertools.chain on them.
> > I do not know exactly how to implement that (i'm not that good at C, nor > CPython source itself), but by seeing the sources, > i imagine that i could do something like the list_concat function at > Objects/listobject.c:473, but in the Objects/genobject.c file, > where instead of copying elements i'm creating and initializing a new > chainobject as described at Modules/itertoolsmodule.c:1792. > > (In pure python, the implementation would be something like `def > __add__(self, othr): return itertools.chain(self, othr)`) It would be weird if the addition is only supported for instances of the generator class, but not for other iterators. Why (n for n in range(2)) + (n for n in range(2, 4)) works, but iter(range(2)) + iter(range(2, 4)) and iter([0, 1]) + iter((2, 3)) don't? itertools.chain() supports arbitrary iterators. Therefore you will need to implement the __add__ method for *all* iterators in the world. However itertools.chain() accepts not just *iterators*. It works with *iterables*. Therefore you will need to implement the __add__ method also for all iterables in the world. But __add__ already is implemented for list and tuple, and many other sequences, and your definition conflicts with this. From k7hoven at gmail.com Sun Jun 25 09:10:47 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Sun, 25 Jun 2017 16:10:47 +0300 Subject: [Python-ideas] Runtime types vs static types In-Reply-To: References: Message-ID: On Sat, Jun 24, 2017 at 11:30 PM, Lucas Wiman wrote: > ? > On Sat, Jun 24, 2017 at 12:42 PM, Koos Zevenhoven > wrote: > >> There has been some discussion here and there concerning the differences >> between runtime types and static types (mypy etc.). What I write below is >> not really an idea or proposal---just a perspective, or a topic that people >> may want to discuss. Since the discussion on this is currently very fuzzy >> and scattered and not really happening either AFAICT (I've probably missed >> many discussions, though). 
Anyway, I thought I'd give it a shot: >> >> ?[...]? > Regarding runtime types and isinstance, let's look at the Iterable[int] >> example. For this case, there are a few options: >> >> 1) Don't implement isinstance >> >> This is problematic for runtime uses of annotations. >> >> 2) isinstance([1, '2', 'three'], Iterable[int]) returns True >> >> This is in fact now the case. This is ok for many runtime situations, but >> lacks precision compared to the static version. One may want to distinguish >> between Iterable[int] and Iterable[str] at runtime (e.g. the multidispatch >> example above). >> >> 3) Check as much as you can at runtime >> >> There could be something like Reiterable, which means the object is not >> consumed by iterating over it, so one could actually check if all elements >> are instances of int. This would be useful in some situations, but not >> available for every object. Furthermore, the check could take an arbitrary >> amount of time so it is not really suitable for things like multidispatch >> or some matching constructs etc., where the performance overhead of the >> type check is really important. >> >> 4) Do a deeper check than in (2) but trust the annotations >> >> For example, an instance of a class that has a method like >> >> def __iter__(self) -> Iterator[int]: >> some code >> >> could be identified as Iterable[int] at runtime, even if it is not >> guaranteed that all elements are really integers. >> >> On the other hand, an object returned by >> >> def get_ints() -> Iterable[int]: >> some code >> >> does not know its own annotations, so the check is difficult to do at >> runtime. And of course, there may not be annotations available. >> >> 5) Something else? >> >> >> And what about PEP544 (protocols), which is being drafted? The PEP seems >> to aim for having type objects that represent duck-typing >> protocols/interfaces. 
Checking whether a protocol is implemented by an >> object or type is clearly a useful thing to do at runtime, but it is not >> really clear if isinstance would be a guaranteed feature for PEP544 >> Protocols. >> >> So one question is, is it possible to draw the lines between what works >> with isinstance and what doesn't, and between what details are checked by >> isinstance and what aren't? -- Or should insinstance be reserved for a more >> limited purpose, and add another check function, say `implements(...)`, >> which would perhaps guarantee some answer for all combinations of object >> and type? >> > >> > I'm guessing to implement PEP 544, many of the `__instancecheck__` and > `__subclasscheck__` methods in `typing.py` would need to be updated to > check the `__annotations__` of the class of the object it's passed against > its own definition, (covered in this section > > of the PEP). > > ?I may have missed something, but I believe PEP544 is ?not suggesting that annotations would have any effect on isinstance. Instead, isinstance would by default not work. > I've been somewhat surprised that many of the `__instancecheck__` > implementations do not work at runtime, even when the implementation would > be trivial (e.g. for `Union`), or would not have subtle edge cases due to > immutability (e.g. for `Tuple`, which cannot be used for checking > parameterized instances). This seems like counterintuitive behavior that > would be straightforward to fix, unless there are subtleties & edge cases > I'm missing. > > ?Tuple is an interesting case, because for small tuples (say 2- or 3-tuples), it makes perfect sense to check the types of all elements for some runtime purposes.? Regarding Union, I believe the current situation has a lot to do with the fact that the relation between type annotations and runtime behavior hasn't really settled yet. If people are amenable to updating those cases, I'd be interested in > submitting a patch to that effect. 
> > ?Thanks for letting us know. (There may not be an instant decision on this particular case, though, but who knows :) -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From rymg19 at gmail.com Sun Jun 25 09:21:18 2017 From: rymg19 at gmail.com (rymg19 at gmail.com) Date: Sun, 25 Jun 2017 06:21:18 -0700 Subject: [Python-ideas] Runtime types vs static types In-Reply-To: <> <> References: <> <> Message-ID: For some background on the removal of __instancecheck__, check the linked issues here: https://github.com/python/typing/issues/135 -- Ryan (????) Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone elsehttp://refi64.com On Jun 25, 2017 at 8:11 AM, > wrote: On Sat, Jun 24, 2017 at 11:30 PM, Lucas Wiman wrote: > ? > On Sat, Jun 24, 2017 at 12:42 PM, Koos Zevenhoven > wrote: > >> There has been some discussion here and there concerning the differences >> between runtime types and static types (mypy etc.). What I write below is >> not really an idea or proposal---just a perspective, or a topic that people >> may want to discuss. Since the discussion on this is currently very fuzzy >> and scattered and not really happening either AFAICT (I've probably missed >> many discussions, though). Anyway, I thought I'd give it a shot: >> >> ?[...]? > Regarding runtime types and isinstance, let's look at the Iterable[int] >> example. For this case, there are a few options: >> >> 1) Don't implement isinstance >> >> This is problematic for runtime uses of annotations. >> >> 2) isinstance([1, '2', 'three'], Iterable[int]) returns True >> >> This is in fact now the case. This is ok for many runtime situations, but >> lacks precision compared to the static version. One may want to distinguish >> between Iterable[int] and Iterable[str] at runtime (e.g. the multidispatch >> example above). 
>> >> 3) Check as much as you can at runtime >> >> There could be something like Reiterable, which means the object is not >> consumed by iterating over it, so one could actually check if all elements >> are instances of int. This would be useful in some situations, but not >> available for every object. Furthermore, the check could take an arbitrary >> amount of time so it is not really suitable for things like multidispatch >> or some matching constructs etc., where the performance overhead of the >> type check is really important. >> >> 4) Do a deeper check than in (2) but trust the annotations >> >> For example, an instance of a class that has a method like >> >> def __iter__(self) -> Iterator[int]: >> some code >> >> could be identified as Iterable[int] at runtime, even if it is not >> guaranteed that all elements are really integers. >> >> On the other hand, an object returned by >> >> def get_ints() -> Iterable[int]: >> some code >> >> does not know its own annotations, so the check is difficult to do at >> runtime. And of course, there may not be annotations available. >> >> 5) Something else? >> >> >> And what about PEP544 (protocols), which is being drafted? The PEP seems >> to aim for having type objects that represent duck-typing >> protocols/interfaces. Checking whether a protocol is implemented by an >> object or type is clearly a useful thing to do at runtime, but it is not >> really clear if isinstance would be a guaranteed feature for PEP544 >> Protocols. >> >> So one question is, is it possible to draw the lines between what works >> with isinstance and what doesn't, and between what details are checked by >> isinstance and what aren't? -- Or should insinstance be reserved for a more >> limited purpose, and add another check function, say `implements(...)`, >> which would perhaps guarantee some answer for all combinations of object >> and type? 
>> > >> > I'm guessing to implement PEP 544, many of the `__instancecheck__` and > `__subclasscheck__` methods in `typing.py` would need to be updated to > check the `__annotations__` of the class of the object it's passed against > its own definition, (covered in this section > > of the PEP). > > ?I may have missed something, but I believe PEP544 is ?not suggesting that annotations would have any effect on isinstance. Instead, isinstance would by default not work. > I've been somewhat surprised that many of the `__instancecheck__` > implementations do not work at runtime, even when the implementation would > be trivial (e.g. for `Union`), or would not have subtle edge cases due to > immutability (e.g. for `Tuple`, which cannot be used for checking > parameterized instances). This seems like counterintuitive behavior that > would be straightforward to fix, unless there are subtleties & edge cases > I'm missing. > > ?Tuple is an interesting case, because for small tuples (say 2- or 3-tuples), it makes perfect sense to check the types of all elements for some runtime purposes.? Regarding Union, I believe the current situation has a lot to do with the fact that the relation between type annotations and runtime behavior hasn't really settled yet. If people are amenable to updating those cases, I'd be interested in > submitting a patch to that effect. > > ?Thanks for letting us know. (There may not be an instant decision on this particular case, though, but who knows :) -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From toddrjen at gmail.com Sun Jun 25 09:25:58 2017 From: toddrjen at gmail.com (Todd) Date: Sun, 25 Jun 2017 09:25:58 -0400 Subject: [Python-ideas] A suggestion for a do...while loop In-Reply-To: References: Message-ID: On Jun 25, 2017 07:58, "Markus Meskanen" wrote: I'm a huge fan of the do...while loop in other languages, and it would often be useful in Python too, when doing stuff like: while True: password = input() if password == ...: break I've seen the pep 315 which got rejected, and I believe these two suggestions were mostly focused on: 1. do ... while : 2. do: while But both were rejected for valid reasons: 1. It makes little sense to have the while at the top, since it might need to use variables from within the body. That's the whole point of the do...while loop 2. There's no other syntax like this in Python, where you'd need a closing unindentation. Also it's using the existing "while" keyword with a different syntax than the current one What I'd like to suggest is a different approach, with an existing syntax found in functions. At first it might sound silly, but I belive it makes sense after a while: do: Now bare with me a second, this is similar to function's def
: And similar to how you can exit functions with `return`, you could also exit the `do` loop with some keyword, such as `until`. This makes it very similar to how `return` behaves in a function: do: password = input() until password == secret_password Now before you say "but return can be anywhere in the function, and there can be multiple of them", I suggest we do the same for "until". It would be like a break, but with a condition: do: password = input('Password: ') until password == secret_password # This line only gets printed if until failed print('Invalid password, try again!') print('You have successfully signed in!') The keywords are obviously a subject to change, I've also been tinkering with repeat/until, do/breakif, repeat/breakon, etc. Any thoughts, is this complete madness? The barrier for adding a new keyword is extremely high since anyone using the name in their code will have their code break. If we were going to do something like this, I would prefer to use existing keywords. Perhaps break if condition And we could make "while:" (with no condition) act as "while True:" But I think the benefit of either of these changes is minimal compared to what we already have. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephanh42 at gmail.com Sun Jun 25 10:04:47 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Sun, 25 Jun 2017 16:04:47 +0200 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> Message-ID: I would like to add that for example numpy ndarrays are iterables, but they have an __add__ with completely different semantics, namely element-wise ( numerical) addition. So this proposal would conflict with existing libraries with iterable objects. Stephan Op 25 jun. 2017 2:51 p.m. schreef "Serhiy Storchaka" : > 25.06.17 15:06, lucas via Python-ideas ????: > >> I often use generators, and itertools.chain on them. 
>> What about providing something like the following: >> >> a = (n for n in range(2)) >> b = (n for n in range(2, 4)) >> tuple(a + b) # -> 0 1 2 3 >> >> This, from user point of view, is just as how the >> __add__ operator works on lists and tuples. >> Making generators works the same way could be a great way to avoid calls >> to itertools.chain everywhere, and to limits the differences between >> generators and other "linear" collections. >> >> I do not know exactly how to implement that (i'm not that good at C, nor >> CPython source itself), but by seeing the sources, >> i imagine that i could do something like the list_concat function at >> Objects/listobject.c:473, but in the Objects/genobject.c file, >> where instead of copying elements i'm creating and initializing a new >> chainobject as described at Modules/itertoolsmodule.c:1792. >> >> (In pure python, the implementation would be something like `def >> __add__(self, othr): return itertools.chain(self, othr)`) >> > > It would be weird if the addition is only supported for instances of the > generator class, but not for other iterators. Why (n for n in range(2)) + > (n for n in range(2, 4)) works, but iter(range(2)) + iter(range(2, 4)) and > iter([0, 1]) + iter((2, 3)) don't? itertools.chain() supports arbitrary > iterators. Therefore you will need to implement the __add__ method for > *all* iterators in the world. > > However itertools.chain() accepts not just *iterators*. It works with > *iterables*. Therefore you will need to implement the __add__ method also > for all iterables in the world. But __add__ already is implemented for list > and tuple, and many other sequences, and your definition conflicts with > this. 
> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lucas.wiman at gmail.com Sun Jun 25 12:13:44 2017 From: lucas.wiman at gmail.com (Lucas Wiman) Date: Sun, 25 Jun 2017 09:13:44 -0700 Subject: [Python-ideas] Runtime types vs static types In-Reply-To: References: Message-ID: > > For some background on the removal of __instancecheck__, check the linked > issues here: > Thanks for the reference (the most relevant discussion starts here ). That said, I think I totally disagree with the underlying philosophy of throwing away a useful and intuitive feature (having `is_instance(foo, Union[Bar, Baz])` just work as you'd naively expect) in the name of making sure that people *understand* there's a distinction between types and classes. This seems opposed to the "zen" of python that there should be exactly one obvious way to do it, since (1) there isn't a way to do it without a third party library, and (2) the obvious way to do it is with `isinstance` and `issubclass`. Indeed, the current implementation makes it somewhat nonobvious even how to implement this functionality yourself in a third-party library (see this gist ). One of the first things I did when playing around with the `typing` module was to fire up the REPL, and try runtime typechecks: >>> from typing import * >>> isinstance(0, Union[int, float]) Traceback (most recent call last): File "", line 1, in File "/Users/lucaswiman/.pyenv/versions/3.6/lib/python3.6/typing.py", line 767, in __instancecheck__ raise TypeError("Unions cannot be used with isinstance().") TypeError: Unions cannot be used with isinstance(). I think the natural reaction of naive users of the library is "That's annoying. Why? 
What is this library good for?", not "Ah, I've sagely learned a valuable lesson about the subtle-and-important-though-unmentioned distinction between types and classes!" The restriction against runtime type checking makes `typing` pretty much *only* useful when used with the external library `mypy` (or when writing a library with the same purpose as `mypy`), which is a pretty unusual situation for a standard library module. Mark Shannon's example also specifically does not apply to the types I'm thinking of for the reasons I mentioned: > For example, > List[int] and List[str] and mutually incompatible types, yet > isinstance([], List[int]) and isinstance([], List[str)) > both return true. > > There is no corresponding objection for `Union`; I can't think of any* inconsistencies or runtime type changes that would result from defining `_Union.__instancecheck__` as `any(isinstance(obj, t) for t in self.__args__`. For `Tuple`, it's true that `()` would be an instance of `Tuple[X, ...]` for all types X. However, the objection for the `List` case (IIUC; extrapolating slightly) is that the type of the object could change depending on what's added to it. That's not true for tuples since they're immutable, so it's not *inconsistent* to say that `()` is an instance of `Tuple[int, ...]` and `Tuple[str, ...]`, it's just applying a sensible definition to the base case of an empty tuple. That said, it sounds like the decision has already been made, and this would be quite useful functionality to have in *some* form. What do people think about implementing `__contains__` (for `__instancecheck__`) and `__lt__` (for `__subclasscheck__`) for these cases? Then there would still be a convenient syntax for doing runtime type checking/analysis, but wouldn't violate Mark Shannon's objections. Best, Lucas * Counterexamples welcomed, of course! 
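A sketch of the proposed `Union` semantics as a plain helper (hypothetical name; the actual `_Union.__instancecheck__` currently raises instead, as shown in the traceback above):

```python
def union_isinstance(obj, types):
    # The check being suggested for Union at runtime: an object is an
    # instance of Union[...] iff it is an instance of any of the
    # union's arguments.
    return any(isinstance(obj, t) for t in types)

print(union_isinstance(0, (int, float)))    # True
print(union_isinstance('x', (int, float)))  # False
```

This is exactly the "naive" behavior described above, with no mutability edge cases of the `List[int]` kind.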
On Sun, Jun 25, 2017 at 6:21 AM, rymg19 at gmail.com wrote: > For some background on the removal of __instancecheck__, check the linked > issues here: > > > https://github.com/python/typing/issues/135 > > > -- > Ryan (????) > Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone elsehttp://refi64.com > > On Jun 25, 2017 at 8:11 AM, > wrote: > > On Sat, Jun 24, 2017 at 11:30 PM, Lucas Wiman > wrote: > >> ? >> On Sat, Jun 24, 2017 at 12:42 PM, Koos Zevenhoven >> wrote: >> >>> There has been some discussion here and there concerning the differences >>> between runtime types and static types (mypy etc.). What I write below is >>> not really an idea or proposal---just a perspective, or a topic that people >>> may want to discuss. Since the discussion on this is currently very fuzzy >>> and scattered and not really happening either AFAICT (I've probably missed >>> many discussions, though). Anyway, I thought I'd give it a shot: >>> >>> > ?[...]? > > > >> Regarding runtime types and isinstance, let's look at the Iterable[int] >>> example. For this case, there are a few options: >>> >>> 1) Don't implement isinstance >>> >>> This is problematic for runtime uses of annotations. >>> >>> 2) isinstance([1, '2', 'three'], Iterable[int]) returns True >>> >>> This is in fact now the case. This is ok for many runtime situations, >>> but lacks precision compared to the static version. One may want to >>> distinguish between Iterable[int] and Iterable[str] at runtime (e.g. the >>> multidispatch example above). >>> >>> 3) Check as much as you can at runtime >>> >>> There could be something like Reiterable, which means the object is not >>> consumed by iterating over it, so one could actually check if all elements >>> are instances of int. This would be useful in some situations, but not >>> available for every object. 
Furthermore, the check could take an arbitrary >>> amount of time so it is not really suitable for things like multidispatch >>> or some matching constructs etc., where the performance overhead of the >>> type check is really important. >>> >>> 4) Do a deeper check than in (2) but trust the annotations >>> >>> For example, an instance of a class that has a method like >>> >>> def __iter__(self) -> Iterator[int]: >>> some code >>> >>> could be identified as Iterable[int] at runtime, even if it is not >>> guaranteed that all elements are really integers. >>> >>> On the other hand, an object returned by >>> >>> def get_ints() -> Iterable[int]: >>> some code >>> >>> does not know its own annotations, so the check is difficult to do at >>> runtime. And of course, there may not be annotations available. >>> >>> 5) Something else? >>> >>> >>> And what about PEP544 (protocols), which is being drafted? The PEP seems >>> to aim for having type objects that represent duck-typing >>> protocols/interfaces. Checking whether a protocol is implemented by an >>> object or type is clearly a useful thing to do at runtime, but it is not >>> really clear if isinstance would be a guaranteed feature for PEP544 >>> Protocols. >>> >>> So one question is, is it possible to draw the lines between what works >>> with isinstance and what doesn't, and between what details are checked by >>> isinstance and what aren't? -- Or should insinstance be reserved for a more >>> limited purpose, and add another check function, say `implements(...)`, >>> which would perhaps guarantee some answer for all combinations of object >>> and type? >>> >> >>> >> I'm guessing to implement PEP 544, many of the `__instancecheck__` and >> `__subclasscheck__` methods in `typing.py` would need to be updated to >> check the `__annotations__` of the class of the object it's passed against >> its own definition, (covered in this section >> >> of the PEP). 
>> >> > ?I may have missed something, but I believe PEP544 is ?not suggesting that > annotations would have any effect on isinstance. Instead, isinstance would > by default not work. > > > >> I've been somewhat surprised that many of the `__instancecheck__` >> implementations do not work at runtime, even when the implementation would >> be trivial (e.g. for `Union`), or would not have subtle edge cases due to >> immutability (e.g. for `Tuple`, which cannot be used for checking >> parameterized instances). This seems like counterintuitive behavior that >> would be straightforward to fix, unless there are subtleties & edge cases >> I'm missing. >> >> > ?Tuple is an interesting case, because for small tuples (say 2- or > 3-tuples), it makes perfect sense to check the types of all elements for > some runtime purposes.? Regarding Union, I believe the current situation > has a lot to do with the fact that the relation between type annotations > and runtime behavior hasn't really settled yet. > > > If people are amenable to updating those cases, I'd be interested in >> submitting a patch to that effect. >> >> > ?Thanks for letting us know. (There may not be an instant decision on this > particular case, though, but who knows :) > > -- Koos > > > -- > + Koos Zevenhoven + http://twitter.com/k7hoven + > _______________________________________________ Python-ideas mailing list > Python-ideas at python.org https://mail.python.org/ > mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/ > codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From k7hoven at gmail.com Sun Jun 25 13:11:56 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Sun, 25 Jun 2017 20:11:56 +0300 Subject: [Python-ideas] Runtime types vs static types In-Reply-To: References: Message-ID: On Sun, Jun 25, 2017 at 7:13 PM, Lucas Wiman wrote: [...] 
> >>> from typing import * > >>> isinstance(0, Union[int, float]) > Traceback (most recent call last): > File "", line 1, in > File "/Users/lucaswiman/.pyenv/versions/3.6/lib/python3.6/typing.py", > line 767, in __instancecheck__ > raise TypeError("Unions cannot be used with isinstance().") > TypeError: Unions cannot be used with isinstance(). > > I think the natural reaction of naive users of the library is "That's > annoying. Why? What is this library good for?", not "Ah, I've sagely > learned a valuable lesson about the subtle-and-important-though-unmentioned > distinction between types and classes!" The restriction against runtime > type checking makes `typing` pretty much *only* useful when used with the > external library `mypy` (or when writing a library with the same purpose as > `mypy`), which is a pretty unusual situation for a standard library module. > > Mark Shannon's example also specifically does not apply to the types I'm > thinking of for the reasons I mentioned: > >> For example, >> List[int] and List[str] and mutually incompatible types, yet >> isinstance([], List[int]) and isinstance([], List[str)) >> both return true. >> >> There is no corresponding objection for `Union`; I can't think of any* > inconsistencies or runtime type changes that would result from defining > `_Union.__instancecheck__` as `any(isinstance(obj, t) for t in > self.__args__`. > ?One thing is that, as long as not all types support isinstance, then also isinstance(obj, Union[x, y, z])? will fail for some x, y, z. From some perspective, that may not be an issue, but on the other hand, it may invite people to think that all types do support isinstance. For `Tuple`, it's true that `()` would be an instance of `Tuple[X, ...]` > for all types X. However, the objection for the `List` case (IIUC; > extrapolating slightly) is that the type of the object could change > depending on what's added to it. 
That's not true for tuples since they're > immutable, so it's not *inconsistent* to say that `()` is an instance of > `Tuple[int, ...]` and `Tuple[str, ...]`, it's just applying a sensible > definition to the base case of an empty tuple. > > ?Yes, but then `isinstance(tuple(range(1000000)), Tuple[int, ...])` would be a difficult case. I think it all comes down to how much isinstance should pretend to be able to do with *types* (as opposed to only classes). Maybe isinstance is not the future of runtime type checking ;). Or should isinstance have a third return value like `Possibly`? -- Koos -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sun Jun 25 13:21:51 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 26 Jun 2017 03:21:51 +1000 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: References: <20170622205532.GP3149@ando.pearwood.info> <20170622232923.GA48632@cskk.homeip.net> <594DBA3F.1010505@canterbury.ac.nz> <20170624100326.GT3149@ando.pearwood.info> <594E5BAC.5050809@canterbury.ac.nz> Message-ID: <20170625172149.GU3149@ando.pearwood.info> On Sat, Jun 24, 2017 at 11:45:25PM +1000, Nick Coghlan wrote: > While I used to think that, I'm no longer sure it's true, as it seems > to me that a `contextlib.convert_exception` context manager could help > with both of them. Here is a recipe for such a context manager which is also useable as a decorator: https://code.activestate.com/recipes/580808-guard-against-an-exception-in-the-wrong-place/ or just https://code.activestate.com/recipes/580808 It should work with Python 2.6 through 3.6 and later. 
try:
    with exception_guard(ZeroDivisionError):
        1/0  # raises ZeroDivisionError
except RuntimeError:
    print('ZeroDivisionError replaced by RuntimeError')


-- 
Steve

From k7hoven at gmail.com  Sun Jun 25 13:44:10 2017
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Sun, 25 Jun 2017 20:44:10 +0300
Subject: [Python-ideas] + operator on generators
In-Reply-To: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net>
References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net>
Message-ID: 

On Sun, Jun 25, 2017 at 3:06 PM, lucas via Python-ideas <
python-ideas at python.org> wrote:

> I often use generators, and itertools.chain on them.
> What about providing something like the following:
>
>     a = (n for n in range(2))
>     b = (n for n in range(2, 4))
>     tuple(a + b)  # -> 0 1 2 3
>
> This, from the user's point of view, is just how the
> __add__ operator works on lists and tuples.
> Making generators work the same way could be a great way to avoid calls
> to itertools.chain everywhere, and to limit the differences between
> generators and other "linear" collections.
>

I think a convenient syntax for chaining iterables and sequences would be
very useful in Python 3, because there has been a shift from using lists
by default to using views to dict keys and values, range objects, etc.
Having to add an import for a basic operation that used to just work with
the + operator feels like a regression to many.

It's not really clear if you will be able to implement this, but if you
can find a syntax that gets accepted, I think using the same type as
itertools.chain might be a good starting point, although the docs should
not promise to return that exact type, so that support for __getitem__
etc. could be added in the future for cases where the chained iterables
are Sequences.
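[The pure-Python version lucas sketches in the quoted message below (`def __add__(self, othr): return itertools.chain(self, othr)`) can be tried today with a small wrapper class. `ChainableIter` is a hypothetical name for illustration; the actual proposal would add `__add__` to the generator type itself:]

```python
import itertools

class ChainableIter:
    # Wraps any iterable so that + chains it with another iterable,
    # mirroring how __add__ concatenates lists and tuples.
    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        # Note: like a generator, this is single-use once consumed.
        return self._it

    def __add__(self, other):
        return ChainableIter(itertools.chain(self._it, other))

a = ChainableIter(n for n in range(2))
b = (n for n in range(2, 4))
print(tuple(a + b))  # -> (0, 1, 2, 3)
```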
-- Koos > I do not know exactly how to implement that (i'm not that good at C, nor > CPython source itself), but by seeing the sources, > i imagine that i could do something like the list_concat function at > Objects/listobject.c:473, but in the Objects/genobject.c file, > where instead of copying elements i'm creating and initializing a new > chainobject as described at Modules/itertoolsmodule.c:1792. > > (In pure python, the implementation would be something like `def > __add__(self, othr): return itertools.chain(self, othr)`) > > Best regards, > --lucas > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From lucas.wiman at gmail.com Sun Jun 25 13:47:46 2017 From: lucas.wiman at gmail.com (Lucas Wiman) Date: Sun, 25 Jun 2017 10:47:46 -0700 Subject: [Python-ideas] Runtime types vs static types In-Reply-To: References: Message-ID: > ?Yes, but then `isinstance(tuple(range(1000000)), Tuple[int, ...])` would be a difficult case. Yes, many methods on million-element tuples are slow. :-) Python usually chooses the more intuitive behavior over the faster behavior when there is a choice. IIUC, the objections don't have anything to do with speed. - Lucas On Sun, Jun 25, 2017 at 10:11 AM, Koos Zevenhoven wrote: > On Sun, Jun 25, 2017 at 7:13 PM, Lucas Wiman > wrote: > [...] > >> >>> from typing import * >> >>> isinstance(0, Union[int, float]) >> Traceback (most recent call last): >> File "", line 1, in >> File "/Users/lucaswiman/.pyenv/versions/3.6/lib/python3.6/typing.py", >> line 767, in __instancecheck__ >> raise TypeError("Unions cannot be used with isinstance().") >> TypeError: Unions cannot be used with isinstance(). 
>> >> I think the natural reaction of naive users of the library is "That's >> annoying. Why? What is this library good for?", not "Ah, I've sagely >> learned a valuable lesson about the subtle-and-important-though-unmentioned >> distinction between types and classes!" The restriction against runtime >> type checking makes `typing` pretty much *only* useful when used with >> the external library `mypy` (or when writing a library with the same >> purpose as `mypy`), which is a pretty unusual situation for a standard >> library module. >> >> Mark Shannon's example also specifically does not apply to the types I'm >> thinking of for the reasons I mentioned: >> >>> For example, >>> List[int] and List[str] and mutually incompatible types, yet >>> isinstance([], List[int]) and isinstance([], List[str)) >>> both return true. >>> >>> There is no corresponding objection for `Union`; I can't think of any* >> inconsistencies or runtime type changes that would result from defining >> `_Union.__instancecheck__` as `any(isinstance(obj, t) for t in >> self.__args__`. >> > > ?One thing is that, as long as not all types support isinstance, then also > isinstance(obj, Union[x, y, z])? will fail for some x, y, z. From some > perspective, that may not be an issue, but on the other hand, it may invite > people to think that all types do support isinstance. > > For `Tuple`, it's true that `()` would be an instance of `Tuple[X, ...]` >> for all types X. However, the objection for the `List` case (IIUC; >> extrapolating slightly) is that the type of the object could change >> depending on what's added to it. That's not true for tuples since they're >> immutable, so it's not *inconsistent* to say that `()` is an instance of >> `Tuple[int, ...]` and `Tuple[str, ...]`, it's just applying a sensible >> definition to the base case of an empty tuple. >> >> > ?Yes, but then `isinstance(tuple(range(1000000)), Tuple[int, ...])` would > be a difficult case. 
> > I think it all comes down to how much isinstance should pretend to be able > to do with *types* (as opposed to only classes). Maybe isinstance is not > the future of runtime type checking ;). Or should isinstance have a third > return value like `Possibly`? > > -- Koos > -------------- next part -------------- An HTML attachment was scrubbed... URL: From k7hoven at gmail.com Sun Jun 25 14:13:25 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Sun, 25 Jun 2017 21:13:25 +0300 Subject: [Python-ideas] Runtime types vs static types In-Reply-To: References: Message-ID: On Sun, Jun 25, 2017 at 8:47 PM, Lucas Wiman wrote: > > ?Yes, but then `isinstance(tuple(range(1000000)), Tuple[int, ...])` > would be a difficult case. > > Yes, many methods on million-element tuples are slow. :-) > > Python usually chooses the more intuitive behavior over the faster > behavior when there is a choice. IIUC, the objections don't have anything > to do with speed. > > ?Sure, performance cannot dictate everything. But if you look at, for example, that multidispatch example I wrote in the OP, it would not be wise to do that expensive check every time. Also, people don't expect an isinstance check to be expensive. But looking at annotations, it is often possible to get the desired answer without checking all elements in a tuple (or especially without consuming generators etc. to check the types). -- Koos ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From lucas.wiman at gmail.com Sun Jun 25 14:39:16 2017 From: lucas.wiman at gmail.com (Lucas Wiman) Date: Sun, 25 Jun 2017 11:39:16 -0700 Subject: [Python-ideas] Runtime types vs static types In-Reply-To: References: Message-ID: After rereading Koos' OP, I can now see that he was referring to a different kind of runtime type checking than I am interested in. 
There was a distinction I was unaware the core typing devs make between "typing-style types" and "classes" that I'll discuss further in the typing repo itself. Apologies for the digression. - Lucas On Sun, Jun 25, 2017 at 11:13 AM, Koos Zevenhoven wrote: > On Sun, Jun 25, 2017 at 8:47 PM, Lucas Wiman > wrote: > >> > ?Yes, but then `isinstance(tuple(range(1000000)), Tuple[int, ...])` >> would be a difficult case. >> >> Yes, many methods on million-element tuples are slow. :-) >> >> Python usually chooses the more intuitive behavior over the faster >> behavior when there is a choice. IIUC, the objections don't have anything >> to do with speed. >> >> > ?Sure, performance cannot dictate everything. But if you look at, for > example, that multidispatch example I wrote in the OP, it would not be wise > to do that expensive check every time. Also, people don't expect an > isinstance check to be expensive. But looking at annotations, it is often > possible to get the desired answer without checking all elements in a tuple > (or especially without consuming generators etc. to check the types). > > -- Koos > ? > -------------- next part -------------- An HTML attachment was scrubbed... URL: From danilo.bellini at gmail.com Sun Jun 25 14:55:00 2017 From: danilo.bellini at gmail.com (Danilo J. S. Bellini) Date: Sun, 25 Jun 2017 15:55:00 -0300 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> Message-ID: On Sun, Jun 25, 2017 at 3:06 PM, lucas via Python-ideas < python-ideas at python.org> wrote: > I often use generators, and itertools.chain on them. > What about providing something like the following: > > a = (n for n in range(2)) > b = (n for n in range(2, 4)) > tuple(a + b) # -> 0 1 2 3 AudioLazy does that: https://github.com/danilobellini/audiolazy -- Danilo J. S. Bellini --------------- "*It is not our business to set up prohibitions, but to arrive at conventions.*" (R. 
Carnap)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tim.peters at gmail.com  Sun Jun 25 18:10:41 2017
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 25 Jun 2017 17:10:41 -0500
Subject: [Python-ideas] Reducing collisions in small dicts/sets
In-Reply-To: 
References: 
Message-ID: 

Two bits of new info: first, it's possible to get the performance of
"double" without division, at least via this way:

"""
# Double hashing using Fibonacci multiplication for the increment.  This
# does about as well as `double`, but doesn't require division.
#
# The multiplicative constant depends on the word size W, and is the
# nearest odd integer to 2**W/((1 + sqrt(5))/2).  So for a 64-bit box:
#
# >>> 2**64 / ((1 + decimal.getcontext().sqrt(5))/2)
# Decimal('11400714819323198485.95161059')
#
# For a 32-bit box, it's 2654435769.
#
# The result is the high-order `nbits` bits of the low-order W bits of
# the product.  In C, the "& M" part isn't needed (unsigned * in C
# returns only the low-order bits to begin with).
#
# Annoyance:  I don't think Python dicts store `nbits`, just 2**nbits.

def dfib(h, nbits, M=M64):
    mask = (1 << nbits) - 1
    i = h & mask
    yield i
    inc = (((h * 11400714819323198485) & M) >> (64 - nbits)) | 1
    while True:
        i = (i + inc) & mask
        yield i
"""

Second, the program I posted uses string objects as test cases.  The
current string hash acts like a good-quality pseudo-random number
generator: change a character in the string, and the hash typically
changes "a lot".

It's also important to look at hashes with common patterns, because,
e.g., all "sufficiently small" integers are their _own_ hash codes
(e.g., hash(3) == 3).  Indeed, long ago Python changed its dict
implementation to avoid quadratic-time behavior for unfortunate sets of
real-life integer keys.

Example: change `objgen()` to `yield i << 12` instead of `yield str(i)`.
Then we generate integers all of whose final 12 bits are zeroes.
For all sufficiently small tables, then, these ints _all_ map to the same
initial table slot (since the initial probe is taken from the last `nbits`
bits).  The collision resolution scheme is needed to prevent disaster.

Here's a chunk of output from that, for dicts of size 2,730:

"""
bits 12 nslots 4,096 dlen 2,730 alpha 0.67 # built 37
theoretical avg probes for uniform hashing when found 1.65 not found 3.00
crisp when found 1.65 not found 3.00
prober linear
    found min 1:0.04% max 2730 mean 1365.50
    fail  min 2731:100.00% max 2731 mean 2731.00
prober quadratic
    found min 1:0.04% max 2730 mean 1365.50
    fail  min 2731:100.00% max 2731 mean 2731.00
prober pre28201
    found min 1:0.04% max 61 mean 6.17
    fail  min 5:28.94% max 68 mean 9.78
prober current
    found min 1:0.04% max 58 mean 5.17
    fail  min 4:29.30% max 70 mean 8.62
prober double
    found min 1:0.04% max 5 mean 2.73
    fail  min 2:41.47% max 9 mean 3.94
prober dfib
    found min 1:0.04% max 9 mean 2.53
    fail  min 2:10.87% max 17 mean 4.48
prober uniform
    found min 1:66.52% max 21 mean 1.65
    fail  min 1:33.41% max 30 mean 2.99
"""

It's a worst-case disaster for linear and quadratic probing.  The
`current` scheme does fine, but `double` and `dfib` do significantly
better on all measures shown.  `uniform` is oblivious, still achieving
its theoretical average-case performance.

That last isn't surprising "in theory", since it theoretically picks a
probe sequence uniformly at random from the set of all table slot
permutations.  It's perhaps a bit surprising that this approximate
_implementation_ also achieves it.  But `random.seed(h)` does a
relatively enormous amount of work to spray the bits of `h` all over the
Twister's internal state, and then shuffle them around.

In any case, the "double hashing" variants ("double" and "dfib") look
very promising, doing better than the current scheme for both small
tables and pathological key sets.
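[A quick property check of the `dfib` prober from the message above: because the increment is forced odd (`| 1`) and the table size is a power of two, the first `2**nbits` probes visit every slot exactly once, so the sequence can never cycle without covering the whole table. A standalone sketch, restating `dfib` and a plausible `M64` so the snippet runs on its own:]

```python
from itertools import islice

M64 = (1 << 64) - 1  # assumption: 64-bit word mask, matching `M` in the post

def dfib(h, nbits, M=M64):
    # Fibonacci double-hashing prober, reflowed from Tim Peters' code above.
    mask = (1 << nbits) - 1
    i = h & mask
    yield i
    inc = (((h * 11400714819323198485) & M) >> (64 - nbits)) | 1
    while True:
        i = (i + inc) & mask
        yield i

# With nbits=6 (a 64-slot table) and a pathological hash whose low bits are
# all zero, the first 64 probes still form a permutation of all 64 slots.
probes = list(islice(dfib(123456789 << 12, 6), 64))
print(sorted(probes) == list(range(64)))  # True
```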
From mikhailwas at gmail.com  Sun Jun 25 19:09:02 2017
From: mikhailwas at gmail.com (Mikhail V)
Date: Mon, 26 Jun 2017 01:09:02 +0200
Subject: [Python-ideas] Allow function to return multiple values
Message-ID: 

joannah nanjekye wrote:

> [...]
>
> Today I was writing an example snippet for the book and needed to write a
> function that returns two values something like this:
>
> def return_multiplevalues(num1, num2):
>     return num1, num2
>
> I noticed that this actually returns a tuple of the values which I did not
> want in the first place. I wanted python to return two values in their own
> types so I can work with them as they are but here I was stuck with working
> around a tuple.

It was quite puzzling at first what the actual idea was, but I can
probably guess why this question came up for you. It seems to me (I am
just intuitively guessing) that you were about to write a procedure which
operates on global variables. If so, you should use the keyword "global"
for that. E.g. if you want to work with variables defined in another part
of the code, you can simply do it:

x = 0
y = 0
def move():
    global x, y
    x = x + 10
    y = y + 20
move()
print(x, y)

This function will change x and y (global variables in this case). Note
that without the line with the "global" statement this will not work.
Another typical usage is initialising variables inside a procedure:

def init_coordinates():
    global x, y
    x = 0
    y = 0
init_coordinates()
print(x, y)

So for some reason it seemed to me that you are trying to do something
like that.

> My proposal is we provide a way of functions returning multiple values.
> This has been implemented in languages like Go and I have found many cases
> where I needed and used such a functionality. I wish for this convenience
> in python so that I don't have to suffer going around a tuple.

So if using globals as in the above examples, you certainly don't have to
suffer going around a tuple.
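[For comparison with the global-variable pattern above: the conventional alternative is to return both values and unpack them at the call site, which is what the OP's snippet already does — the comma in the return statement builds a tuple, and multiple assignment unpacks it again:]

```python
def move(x, y):
    # Returning two values: the comma packs them into a tuple.
    return x + 10, y + 20

# Multiple assignment unpacks the returned tuple back into two names,
# so no global state is needed.
x, y = move(0, 0)
print(x, y)  # 10 20
```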
Mikhail From rymg19 at gmail.com Sun Jun 25 19:14:18 2017 From: rymg19 at gmail.com (rymg19 at gmail.com) Date: Sun, 25 Jun 2017 16:14:18 -0700 Subject: [Python-ideas] Allow function to return multiple values In-Reply-To: <> Message-ID: IIRC I'm pretty sure the OP just didn't know about the existence of tuple unpacking and the ability to use that to return multiple values. -- Ryan (????) Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone elsehttp://refi64.com On Jun 25, 2017 at 6:09 PM, > wrote: joannah nanjekye wrote: > [...] > >Today I was writing an example snippet for the book and needed to write a >function that returns two values something like this: > >def return_multiplevalues(num1, num2): > return num1, num2 > > I noticed that this actually returns a tuple of the values which I did not >want in the first place.I wanted python to return two values in their own >types so I can work with them as they are but here I was stuck with working >around a tuple. It was quite puzzling at first what was the actual idea but probably I can guess why this question came up by you. It seems to me (I am just intuitively guessing that) that you were about to write a procedure which operates on global variables. If so you should use the keyword "global" for that. E.g. if you want to work with the variables defined in other part of the code you can simply do it: x = 0 y = 0 def move(): global x, y x = x + 10 y = y + 20 move() print (x,y) This function will change x and y (global variables in this case). Note that without the line with the "global" statement this will not work. Another typical usage is initialising variables inside a procedure: def init_coordinates(): global x,y x=0 y=0 init_coordinates() print (x,y) So for some reason it seemed to me that you are trying to do something like that. >My proposal is we provide a way of functions returning multiple values. 
>This has been implemented in languages like Go and I have found many cases >where I needed and used such a functionality. I wish for this convenience >in python so that I don't have to suffer going around a tuple. So if using globals as in the above examples you certinly don't have to suffer going around a tuple. Mikhail _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From rob.cliffe at btinternet.com Sun Jun 25 20:06:43 2017 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Mon, 26 Jun 2017 01:06:43 +0100 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: <20170624100326.GT3149@ando.pearwood.info> References: <20170622205532.GP3149@ando.pearwood.info> <20170622232923.GA48632@cskk.homeip.net> <594DBA3F.1010505@canterbury.ac.nz> <20170624100326.GT3149@ando.pearwood.info> Message-ID: On 24/06/2017 11:03, Steven D'Aprano wrote: > On Sat, Jun 24, 2017 at 01:02:55PM +1200, Greg Ewing wrote: > >> In any case, this doesn't address the issue raised by the OP, >> which in this example is that if the implementation of >> bah.__getitem__ calls something else that raises an IndexError, >> there's no easy way to distinguish that from one raised by >> bah.__getitem__ itself. > I'm not convinced that's a meaningful distinction to make in general. > Consider the difference between these two classes: > > class X: > def __getitem__(self, n): > if n < 0: > n += len(self) > if not 0 <= n < len(self): > raise IndexError > ... > > class Y: > def __getitem__(self, n): > self._validate(n) > ... > def _validate(self, n): > if n < 0: > n += len(self) > if not 0 <= n < len(self): > raise IndexError > > > The difference is a mere difference of refactoring. 
Why should one of > them be treated as "bah.__getitem__ raises itself" versus > "bah.__getitem__ calls something which raises"? That's just an > implementation detail. > > I think we're over-generalizing this problem. There's two actual issues > here, and we shouldn't conflate them as the same problem: > > (1) People write buggy code based on invalid assumptions of what can and > can't raise. E.g.: > > try: > foo(baz[5]) > except IndexError: > ... # assume baz[5] failed (but maybe foo can raise too?) > > > (2) There's a *specific* problem with property where a bug in your > getter or setter that raises AttributeError will be masked, appearing as > if the property itself doesn't exist. > > > In the case of (1), there's nothing Python the language can do to fix > that. The solution is to write better code. Question your assumptions. > Think carefully about your pre-conditions and post-conditions and > invariants. Plan ahead. Read the documentation of foo before assuming > it won't raise. In other words, be a better programmer. > > If only it were that easy :-( > > (Aside: I've been thinking for a long time that design by contract is a > very useful feature to have. It should be possibly to set a contract > that states that this function won't raise a certain exception, and if > it does, treat it as a runtime error. But this is still at a very early > point in my thinking.) > > Python libraries rarely give a definitive list of what exceptions > functions can raise, so unless you wrote it yourself and know exactly > what it can and cannot do, defensive coding suggests that you assume any > function call might raise any exception at all: > > try: > item = baz[5] > except IndexError: > ... # assume baz[5] failed > else: > foo(item) > > > Can we fix that? Well, maybe we should re-consider the rejection of PEP > 463 (exception-catching expressions). 
> > https://www.python.org/dev/peps/pep-0463/ I'm all in favour of that :-) but I don't see how it helps in this example: try: item = (baz[5] except IndexError: SomeSentinelValue) if item == SomeSentinelValue: ... # assume baz[5] failed else: foo(item) is clunkier than the original version. Or am I missing something? Only if the normal and exceptional cases could be handled the same way would it help: foo(baz[5] except IndexError: 0) Rob Cliffe > > > Maybe we need a better way to assert that a certain function won't raise > a particular exception: > > try: > item = bah[5] > without IndexError: > foo(item) > except IndexError: > ... # assume baz[5] failed > > (But how is that different from try...except...else?) > > > > In the case of (2), the masking of bugs inside property getters if they > happen to raise AttributeError, I think the std lib can help with that. > Perhaps a context manager or decorator (or both) that converts one > exception to another? > > @property > @bounce_exception(AttributeError, RuntimeError) > def spam(self): > ... > > > Now spam.__get__ cannot raise AttributeError, if it does, it will be > converted to RuntimeError. If you need finer control over the code that > is guarded use the context manager form: > > @property > def spam(self): > with bounce_exception(AttributeError, RuntimeError): > # guarded > if condition: > ... > # not guarded > raise AttributeError('property spam doesn't exist yet') > > > From rob.cliffe at btinternet.com Sun Jun 25 20:25:20 2017 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Mon, 26 Jun 2017 01:25:20 +0100 Subject: [Python-ideas] A suggestion for a do...while loop In-Reply-To: References: Message-ID: On 25/06/2017 12:58, Markus Meskanen wrote: > I'm a huge fan of the do...while loop in other languages, and it would > often be useful in Python too, when doing stuff like: > > while True: > password = input() > if password == ...: > break > > [...]I suggest [...] 
> do:
>     password = input('Password: ')
> until password == secret_password
>
> # This line only gets printed if until failed
> print('Invalid password, try again!')

I don't see any significant advantage in providing an extra Way To Do It. Granted, the "while True" idiom is an idiosyncrasy, but it is frequently used and IMHO intuitive and easy to get used to. Your suggestion doesn't even save a line of code, given that you can write:

while True:
    password = input('Password:')
    if password == secret_password: break
    print('Invalid password, try again!')

Regards
Rob Cliffe

From mikhailwas at gmail.com Sun Jun 25 21:20:21 2017
From: mikhailwas at gmail.com (Mikhail V)
Date: Mon, 26 Jun 2017 03:20:21 +0200
Subject: [Python-ideas] Allow function to return multiple values
In-Reply-To: References: Message-ID:

On 26 June 2017 at 01:14, rymg19 at gmail.com wrote:
> IIRC I'm pretty sure the OP just didn't know about the existence of tuple
> unpacking and the ability to use that to return multiple values.
>

Can be so, though it was not quite clear. The original OP's example function included the same variables as input and output, and the phrasing "I noticed that this actually returns a tuple of the values which I did not want in the first place" can indicate that my theory is also valid. And it reminded me of times starting with Python and wondering why I can't simply write something like:

def move(x,y):
    x = x + 10
    y = y + 20
move(x,y)

Instead of this:

def move(x,y):
    x1 = x + 10
    y1 = y + 20
    return x1,y1
x,y = move(x,y)

So probably there was some correlation between this and the OP's ideas, IDK.

> On Jun 25, 2017 at 6:09 PM, wrote:
> > joannah nanjekye wrote:
> >
> >> [...]
> >> Today I was writing an example snippet for the book and needed to write a
> >> function that returns two values something like this:
> >>
> >> def return_multiplevalues(num1, num2):
> >>     return num1, num2
> >>
> >> I noticed that this actually returns a tuple of the values which I did not
> >> want in the first place. I wanted python to return two values in their own
> >> types so I can work with them as they are but here I was stuck with working
> >> around a tuple.
> >
> > It was quite puzzling at first what the actual idea was, but probably I
> > can guess why this question came up for you.
> > It seems to me (I am just intuitively guessing that) that you were about to
> > write a procedure which operates on global variables.
> > If so, you should use the keyword "global" for that.
> > E.g. if you want to work with the variables defined in another
> > part of the code, you can simply do it:
> >
> > x = 0
> > y = 0
> > def move():
> >     global x, y
> >     x = x + 10
> >     y = y + 20
> > move()
> > print (x,y)
> >
> > This function will change x and y (global variables in this case).
> > Note that without the line with the "global" statement this will not work.
> > Another typical usage is initialising variables inside a procedure:
> >
> > def init_coordinates():
> >     global x,y
> >     x=0
> >     y=0
> > init_coordinates()
> > print (x,y)
> >
> >
> > So for some reason it seemed to me that you are trying to do
> > something like that.
> >
> >> My proposal is we provide a way of functions returning multiple values.
> >> This has been implemented in languages like Go and I have found many cases
> >> where I needed and used such a functionality. I wish for this convenience
> >> in python so that I don't have to suffer going around a tuple.
> >
> > So if using globals as in the above examples, you certainly don't
> > have to suffer going around a tuple.
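The two styles contrasted above, mutating globals versus returning and unpacking a tuple, can be put side by side in a short sketch; the function names here are illustrative only, not from the thread:

```python
x, y = 0, 0

def move_globals():
    # Mutates module-level names in place; needs the global statement.
    global x, y
    x = x + 10
    y = y + 20

def move_tuple(x, y):
    # Returns fresh values; the caller unpacks the tuple.
    return x + 10, y + 20

move_globals()
print(x, y)              # 10 20

x, y = move_tuple(x, y)
print(x, y)              # 20 40
```

The tuple-returning style is generally preferred because it keeps the data flow explicit and the function reusable, which is the point the earlier replies in this thread make.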
> > > > > > > > Mikhail > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > From wes.turner at gmail.com Sun Jun 25 21:47:42 2017 From: wes.turner at gmail.com (Wes Turner) Date: Sun, 25 Jun 2017 20:47:42 -0500 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> Message-ID: On Sunday, June 25, 2017, Danilo J. S. Bellini wrote: > On Sun, Jun 25, 2017 at 3:06 PM, lucas via Python-ideas < > python-ideas at python.org > > wrote: > >> I often use generators, and itertools.chain on them. >> What about providing something like the following: >> >> a = (n for n in range(2)) >> b = (n for n in range(2, 4)) >> tuple(a + b) # -> 0 1 2 3 > > > AudioLazy does that: https://github.com/danilobellini/audiolazy > - http://toolz.readthedocs.io/en/latest/api.html#toolz.itertoolz.concat and concatv - https://github.com/kachayev/fn.py#streams-and-infinite-sequences-declaration - Stream() << obj > > > -- > Danilo J. S. Bellini > --------------- > "*It is not our business to set up prohibitions, but to arrive at > conventions.*" (R. Carnap) > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From steve at pearwood.info Sun Jun 25 23:23:36 2017
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 26 Jun 2017 13:23:36 +1000
Subject: [Python-ideas] + operator on generators
In-Reply-To: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net>
References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net>
Message-ID: <20170626032336.GV3149@ando.pearwood.info>

On Sun, Jun 25, 2017 at 02:06:54PM +0200, lucas via Python-ideas wrote:
> What about providing something like the following:
>
> a = (n for n in range(2))
> b = (n for n in range(2, 4))
> tuple(a + b) # -> 0 1 2 3

As Serhiy points out, this is going to conflict with existing use of the + operator for string and sequence concatenation.

I have a counter-proposal: introduce the iterator chaining operator "&":

    iterable & iterable --> itertools.chain(iterable, iterable)

The reason I choose & rather than + is that & is less likely to conflict with any existing string/sequence types. None of the built-in or std lib sequences that I can think of support the & operator. Also, & is used for (string?) concatenation in some languages, such as VB.Net, some BASIC dialects, Hypertalk, AppleScript, and Ada. Iterator chaining is more like concatenation than (numeric) addition.

However, the & operator is already used for bitwise-AND. Under my proposal that behaviour will continue, and will take priority over chaining. Currently, the & operator does something similar to (but significantly more complex than) this:

# simplified pseudo-code of existing behaviour
if hasattr(x, '__and__'):
    return x.__and__(y)
elif hasattr(y, '__rand__'):
    return y.__rand__(x)
else:
    raise TypeError

The key is to insert the new behaviour after the existing __(r)and__ code, just before TypeError is raised:

attempt existing __(r)and__ behaviour
if and only if that fails to apply:
    return itertools.chain(iter(x), iter(y))

So classes that define a __(r)and__ method will keep their existing behaviour.
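This fallback can be prototyped today in pure Python with a small wrapper class, without touching the operator machinery. The `chainable` helper below is purely hypothetical, not an existing API:

```python
import itertools

class chainable:
    """Wrap an iterable so that & chains it with another iterable.

    Rough prototype of the proposed fallback: real __and__/__rand__
    implementations (e.g. set intersection) would still win, because
    Python consults them before this wrapper is ever reached.
    """

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self._it

    def __and__(self, other):
        return chainable(itertools.chain(self._it, iter(other)))

    def __rand__(self, other):
        # Reached when the left operand defines no __and__ of its own.
        return chainable(itertools.chain(iter(other), self._it))

a = (n for n in range(2))
b = (n for n in range(2, 4))
print(list(chainable(a) & b))            # [0, 1, 2, 3]
print(list([1, 2] & chainable((3, 4))))  # [1, 2, 3, 4]
```

Real __and__/__rand__ implementations still take priority here, exactly as the proposal requires.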
This implies that we cannot use & to chain sets and frozen sets, since they already define __(r)and__. This has an easy work-around: just call iter() on the set first.

Applying & to objects which don't define __(r)and__ and aren't iterable will continue to raise TypeError, just as it does now.

The only backwards incompatibility this proposal introduces is for any code which relies on `iterable & iterable` to raise TypeError. Frankly I can't imagine that there is any such code, outside of the Python test suite, but if there is, and people think it is worth it, we could make this a __future__ import. But I think that's overkill.

The downside to this proposal is that it adds some conceptual complexity to Python operators. Apart from `is` and `is not`, all Python operators call one or more dunder methods. This is (as far as I know) the first operator which has fall-back functionality if the dunder methods aren't defined.

Up to now, I've talked about & chaining being equivalent to the itertools.chain function. That glosses over one difference which needs to be mentioned. The chain function currently doesn't attempt to iterate over its arguments until needed:

py> x = itertools.chain("a", 1, "c")
py> next(x)
'a'
py> next(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable

Any proposal to change this behaviour for the itertools.chain function should be kept separate from this one. But for the & chaining operator, I think that behaviour must change: if we have an operand that is neither iterable nor defines __(r)and__, the & operator should fail early:

    [1, 2, 3] & None

should raise TypeError immediately, unlike itertools.chain().
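The early-failure behaviour argued for above is easy to sketch as a helper: apply iter() to every operand up front, while keeping element production itself lazy. The name `strict_chain` is made up for illustration:

```python
import itertools

def strict_chain(*iterables):
    # iter() is applied eagerly, so a non-iterable operand raises
    # TypeError immediately (matching the proposed & semantics),
    # while the elements are still yielded lazily.
    return itertools.chain.from_iterable([iter(it) for it in iterables])

print(list(strict_chain("ab", [1, 2])))   # ['a', 'b', 1, 2]

try:
    strict_chain([1, 2, 3], None)         # fails here, not at next()
except TypeError:
    print("rejected non-iterable up front")
```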
-- Steve From ncoghlan at gmail.com Mon Jun 26 00:41:03 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 26 Jun 2017 14:41:03 +1000 Subject: [Python-ideas] A suggestion for a do...while loop In-Reply-To: References: Message-ID: On 26 June 2017 at 10:25, Rob Cliffe wrote: > > > On 25/06/2017 12:58, Markus Meskanen wrote: >> >> I'm a huge fan of the do...while loop in other languages, and it would >> often be useful in Python too, when doing stuff like: >> >> while True: >> password = input() >> if password == ...: >> break >> >> [...]I suggest [...] >> >> do: >> password = input('Password: ') >> until password == secret_password >> >> # This line only gets printed if until failed >> print('Invalid password, try again!') >> >> > I don't see any significant advantage in providing an extra Way To Do It. > Granted, the "while True" idiom is an idiosyncrasy, but it is frequently > used and IMHO intuitive and easy to get used to. Your suggestion doesn't > even save a line of code, given that you can write: > > while True: > password = input('Password:') > if password == secret_password: break > print('Invalid password, try again!') Right, this is the key challenge for do-while loops in Python: can you come up with something that's significantly clearer than the current "while True/if/break" pattern? The biggest weakness of that idiom is that it isn't really explicit in the header line - there's nothing about "while True:" that directly tells the reader "This loop is expected to exit via a break statement". 
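For comparison with the status quo, the "exit via break" intent can also be made explicit today by factoring the loop into a tiny helper function; `until` here is purely illustrative, not a proposal from the thread:

```python
def until(predicate, produce):
    # A do/until in miniature: produce() runs at least once, and the
    # loop ends the first time predicate(result) holds.
    while True:
        value = produce()
        if predicate(value):
            return value

# Simulated user input: wrong guesses, then the right password.
attempts = iter(["hunter2", "letmein", "s3cret"])
password = until(lambda p: p == "s3cret", lambda: next(attempts))
print(password)  # s3cret
```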
If we wanted to allow that to be expressed literally, we could probably special case the "while not break" keyword sequence as a do loop: while not break: # Setup if condition: break # Loop continuation That more explicit declaration of intent ("The code in the loop body will conditionally break out of this loop") would allow a couple of things: - the compiler could warn that an else clause attached to such a loop will never execute (technically it could do that for any expression that resolves to `True` as a constant) - code linters could check the loop body for a break statement and complain if they didn't see one Logically, it's exactly the same as writing "while True:", but whereas that spelling suggests "infinite loop", the "while not break:" spelling would more directly suggest "terminated inside the loop body via a break statement" Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tjreedy at udel.edu Mon Jun 26 02:08:11 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 26 Jun 2017 02:08:11 -0400 Subject: [Python-ideas] A suggestion for a do...while loop In-Reply-To: References: Message-ID: On 6/26/2017 12:41 AM, Nick Coghlan wrote: > On 26 June 2017 at 10:25, Rob Cliffe wrote: >> >> >> On 25/06/2017 12:58, Markus Meskanen wrote: >>> >>> I'm a huge fan of the do...while loop in other languages, and it would >>> often be useful in Python too, when doing stuff like: >>> >>> while True: >>> password = input() >>> if password == ...: >>> break >>> >>> [...]I suggest [...] >>> >>> do: >>> password = input('Password: ') >>> until password == secret_password >>> >>> # This line only gets printed if until failed >>> print('Invalid password, try again!') >>> >>> >> I don't see any significant advantage in providing an extra Way To Do It. >> Granted, the "while True" idiom is an idiosyncrasy, but it is frequently >> used and IMHO intuitive and easy to get used to. 
Your suggestion doesn't >> even save a line of code, given that you can write: >> >> while True: >> password = input('Password:') >> if password == secret_password: break >> print('Invalid password, try again!') > > Right, this is the key challenge for do-while loops in Python: can you > come up with something that's significantly clearer than the current > "while True/if/break" pattern? > > The biggest weakness of that idiom is that it isn't really explicit in > the header line - there's nothing about "while True:" that directly > tells the reader "This loop is expected to exit via a break > statement". > > If we wanted to allow that to be expressed literally, we could > probably special case the "while not break" keyword sequence as a do > loop: > > while not break: > # Setup > if condition: break > # Loop continuation We would then also need 'while not return:' > That more explicit declaration of intent ("The code in the loop body > will conditionally break out of this loop") would allow a couple of > things: > > - the compiler could warn that an else clause attached to such a loop > will never execute (technically it could do that for any expression > that resolves to `True` as a constant) > - code linters could check the loop body for a break statement and > complain if they didn't see one > > Logically, it's exactly the same as writing "while True:", but whereas > that spelling suggests "infinite loop", the "while not break:" > spelling would more directly suggest "terminated inside the loop body > via a break statement" -- Terry Jan Reedy From rosuav at gmail.com Mon Jun 26 02:14:27 2017 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 26 Jun 2017 16:14:27 +1000 Subject: [Python-ideas] A suggestion for a do...while loop In-Reply-To: References: Message-ID: On Mon, Jun 26, 2017 at 2:41 PM, Nick Coghlan wrote: > If we wanted to allow that to be expressed literally, we could > probably special case the "while not break" keyword sequence as a do > loop: > > 
while not break: > # Setup > if condition: break > # Loop continuation > > That more explicit declaration of intent ("The code in the loop body > will conditionally break out of this loop") would allow a couple of > things: > > - the compiler could warn that an else clause attached to such a loop > will never execute (technically it could do that for any expression > that resolves to `True` as a constant) > - code linters could check the loop body for a break statement and > complain if they didn't see one > > Logically, it's exactly the same as writing "while True:", but whereas > that spelling suggests "infinite loop", the "while not break:" > spelling would more directly suggest "terminated inside the loop body > via a break statement" > What I like doing is writing these loops with a string literal as the "condition". It compiles to the same bytecode as 'while True' does, and then you can say what you like in the string. (An empty string would be like 'while False', but there's no point doing that anyway.) So, for example: while "password not correct": password = input('Password:') if password == secret_password: break print('Invalid password, try again!') ChrisA From rosuav at gmail.com Mon Jun 26 02:14:54 2017 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 26 Jun 2017 16:14:54 +1000 Subject: [Python-ideas] A suggestion for a do...while loop In-Reply-To: References: Message-ID: On Mon, Jun 26, 2017 at 4:08 PM, Terry Reedy wrote: >> If we wanted to allow that to be expressed literally, we could >> probably special case the "while not break" keyword sequence as a do >> loop: >> >> while not break: >> # Setup >> if condition: break >> # Loop continuation > > > We would then also need 'while not return:' And for completeness, "while not throw:". ChrisA From jsbueno at python.org.br Mon Jun 26 04:25:46 2017 From: jsbueno at python.org.br (Joao S. O. 
Bueno) Date: Mon, 26 Jun 2017 10:25:46 +0200 Subject: [Python-ideas] A suggestion for a do...while loop In-Reply-To: References: Message-ID: and "while not except" :-/ maybe we just stick with "while True' and put forward a documenting PEP advising linter packages to look for ways of getting out of the loop. On 26 June 2017 at 08:08, Terry Reedy wrote: > On 6/26/2017 12:41 AM, Nick Coghlan wrote: > >> On 26 June 2017 at 10:25, Rob Cliffe wrote: >> >>> >>> >>> On 25/06/2017 12:58, Markus Meskanen wrote: >>> >>>> >>>> I'm a huge fan of the do...while loop in other languages, and it would >>>> often be useful in Python too, when doing stuff like: >>>> >>>> while True: >>>> password = input() >>>> if password == ...: >>>> break >>>> >>>> [...]I suggest [...] >>>> >>>> do: >>>> password = input('Password: ') >>>> until password == secret_password >>>> >>>> # This line only gets printed if until failed >>>> print('Invalid password, try again!') >>>> >>>> >>>> I don't see any significant advantage in providing an extra Way To Do >>> It. >>> Granted, the "while True" idiom is an idiosyncrasy, but it is frequently >>> used and IMHO intuitive and easy to get used to. Your suggestion doesn't >>> even save a line of code, given that you can write: >>> >>> while True: >>> password = input('Password:') >>> if password == secret_password: break >>> print('Invalid password, try again!') >>> >> >> Right, this is the key challenge for do-while loops in Python: can you >> come up with something that's significantly clearer than the current >> "while True/if/break" pattern? >> >> The biggest weakness of that idiom is that it isn't really explicit in >> the header line - there's nothing about "while True:" that directly >> tells the reader "This loop is expected to exit via a break >> statement". 
>> >> If we wanted to allow that to be expressed literally, we could >> probably special case the "while not break" keyword sequence as a do >> loop: >> >> while not break: >> # Setup >> if condition: break >> # Loop continuation >> > > We would then also need 'while not return:' > > That more explicit declaration of intent ("The code in the loop body >> will conditionally break out of this loop") would allow a couple of >> things: >> >> - the compiler could warn that an else clause attached to such a loop >> will never execute (technically it could do that for any expression >> that resolves to `True` as a constant) >> - code linters could check the loop body for a break statement and >> complain if they didn't see one >> >> Logically, it's exactly the same as writing "while True:", but whereas >> that spelling suggests "infinite loop", the "while not break:" >> spelling would more directly suggest "terminated inside the loop body >> via a break statement" >> > > > > > -- > Terry Jan Reedy > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Mon Jun 26 05:22:18 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 26 Jun 2017 21:22:18 +1200 Subject: [Python-ideas] A suggestion for a do...while loop In-Reply-To: References: Message-ID: <5950D24A.9090705@canterbury.ac.nz> Chris Angelico wrote: > And for completeness, "while not throw:". And just to completely confuse everyone, "while not pass". 
:-) -- Greg From rosuav at gmail.com Mon Jun 26 06:34:41 2017 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 26 Jun 2017 20:34:41 +1000 Subject: [Python-ideas] A suggestion for a do...while loop In-Reply-To: <5950D24A.9090705@canterbury.ac.nz> References: <5950D24A.9090705@canterbury.ac.nz> Message-ID: On Mon, Jun 26, 2017 at 7:22 PM, Greg Ewing wrote: > And just to completely confuse everyone, "while not pass". :-) while "gandalf": ChrisA From jsbueno at python.org.br Mon Jun 26 06:47:02 2017 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Mon, 26 Jun 2017 12:47:02 +0200 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> Message-ID: On 25 June 2017 at 20:55, Danilo J. S. Bellini wrote: > On Sun, Jun 25, 2017 at 3:06 PM, lucas via Python-ideas < > python-ideas at python.org> wrote: > >> I often use generators, and itertools.chain on them. >> What about providing something like the following: >> >> a = (n for n in range(2)) >> b = (n for n in range(2, 4)) >> tuple(a + b) # -> 0 1 2 3 > > > You know you can do `tuple(*a, *b)` , right? The problem with the "*" notation is that it will actually render the iterable contents eagerly - unlike something that would just chain them. But for creating tuples, it just works. > AudioLazy does that: https://github.com/danilobellini/audiolazy > > -- > Danilo J. S. Bellini > --------------- > "*It is not our business to set up prohibitions, but to arrive at > conventions.*" (R. Carnap) > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From toddrjen at gmail.com Mon Jun 26 09:15:43 2017
From: toddrjen at gmail.com (Todd)
Date: Mon, 26 Jun 2017 09:15:43 -0400
Subject: [Python-ideas] A suggestion for a do...while loop
In-Reply-To: References: Message-ID:

On Jun 26, 2017 2:15 AM, "Chris Angelico" wrote:

On Mon, Jun 26, 2017 at 4:08 PM, Terry Reedy wrote:
>> If we wanted to allow that to be expressed literally, we could
>> probably special case the "while not break" keyword sequence as a do
>> loop:
>>
>> while not break:
>>     # Setup
>>     if condition: break
>>     # Loop continuation
>
> We would then also need 'while not return:'

And for completeness, "while not throw:"

All these situations could be handled by making a "while:" with no condition act as "while True:". But they could also be handled by updating pep8 to make "while True:" the recommended infinite loop syntax and make linters smarter about this (if they aren't already).
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From storchaka at gmail.com Mon Jun 26 09:53:15 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Mon, 26 Jun 2017 16:53:15 +0300
Subject: [Python-ideas] + operator on generators
In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net>
Message-ID:

26.06.17 13:47, Joao S. O. Bueno wrote:
> On 25 June 2017 at 20:55, Danilo J. S. Bellini wrote:
> > On Sun, Jun 25, 2017 at 3:06 PM, lucas via Python-ideas wrote:
> > I often use generators, and itertools.chain on them.
> > What about providing something like the following:
> >
> > a = (n for n in range(2))
> > b = (n for n in range(2, 4))
> > tuple(a + b) # -> 0 1 2 3
> >
> > You know you can do `tuple(*a, *b)`, right?
>
> The problem with the "*" notation is that it will actually render the
> iterable contents eagerly - unlike something that would just chain them.
> But for creating tuples, it just works.

Even the tuple constructor is not needed.
>>> *a, *b
(0, 1, 2, 3)

From srkunze at mail.de Mon Jun 26 12:17:52 2017
From: srkunze at mail.de (Sven R. Kunze)
Date: Mon, 26 Jun 2017 18:17:52 +0200
Subject: [Python-ideas] + operator on generators
In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net>
Message-ID:

Personally, I find syntactic sugar for concatenating iterators would come in handy. The purpose of iterators and generators is performance and efficiency. So, lowering the bar of using them is a good idea IMO. Also, hopping back and forth between a generator/iterator-based solution and, say, a list-based/materialized solution would become a lot easier.

On 25.06.2017 16:04, Stephan Houben wrote:
> I would like to add that for example numpy ndarrays are iterables, but
> they have an __add__ with completely different semantics, namely
> element-wise (numerical) addition.
>
> So this proposal would conflict with existing libraries with iterable
> objects.

I don't see a conflict.

> On 25 Jun 2017 2:51 p.m., "Serhiy Storchaka" wrote:
>
> It would be weird if the addition is only supported for instances
> of the generator class, but not for other iterators. Why (n for n
> in range(2)) + (n for n in range(2, 4)) works, but iter(range(2))
> + iter(range(2, 4)) and iter([0, 1]) + iter((2, 3)) don't?
> itertools.chain() supports arbitrary iterators. Therefore you will
> need to implement the __add__ method for *all* iterators in the world.

I don't think it's necessary to start with *all* iterators in the world. So, adding iterators and/or generators should be possible without any problems. It's a start and could already help a lot, if I have my use-cases right.

> However itertools.chain() accepts not just *iterators*. It works
> with *iterables*. Therefore you will need to implement the __add__
> method also for all iterables in the world. But __add__ already is
> implemented for list and tuple, and many other sequences, and your
> definition conflicts with this.
> As above, I don't see a conflict.

Regards,
Sven
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From srkunze at mail.de Mon Jun 26 12:29:25 2017
From: srkunze at mail.de (Sven R. Kunze)
Date: Mon, 26 Jun 2017 18:29:25 +0200
Subject: [Python-ideas] Improving Catching Exceptions
In-Reply-To: References: <20170623190209.GQ3149@ando.pearwood.info> <20170623225603.GA27276@cskk.homeip.net>
Message-ID:

On 24.06.2017 01:37, MRAB wrote:
> I think a "shallow exception" would be one that's part of a defined
> API, as distinct from one that is an artifact of the implementation, a
> leak in the abstraction.

I like the "shallow exception" idea most. It's simple and it covers most if not all issues. You also hit the nail on the head by pointing to leaking abstractions.

> It's like when "raise ... from None" was introduced to help in those
> cases where you want to replace an exception that's a detail of the
> (current) internal implementation with one that's intended for the user.

Regards,
Sven

PS: This "shallow exception" proposal could help e.g. Django improve its template system. Here it's the other way round: the exception handling is done by Django, and depending on the exception it will fall back to a different attribute access method. I can remember us implementing such a method which accidentally raised a caught exception, which we then never saw. Debugging this was a mess and took quite some time.

From srkunze at mail.de Mon Jun 26 12:43:54 2017
From: srkunze at mail.de (Sven R. Kunze)
Date: Mon, 26 Jun 2017 18:43:54 +0200
Subject: [Python-ideas] [Python-Dev] Language proposal: variable assignment in functional context
In-Reply-To: References: <20170617002754.GH3149@ando.pearwood.info>
Message-ID: <21b62304-50f9-7dc4-f26e-2566ac08b956@mail.de>

Cool idea, Michał. I hope there's at least somebody willing to try it out in practice.

On 23.06.2017 02:21, Michał Łukowski wrote:
> I've implemented a PoC of `where` expression some time ago.
> > https://github.com/thektulu/cpython/commit/9e669d63d292a639eb6ba2ecea3ed2c0c23f2636
> >
> > just compile and have fun.
> >
> > 2017-06-17 2:27 GMT+02:00 Steven D'Aprano:
> >
> > Welcome Robert. My response below.
> >
> > Follow-ups to Python-Ideas, thanks. You'll need to subscribe to see any
> > further discussion.
> >
> > On Fri, Jun 16, 2017 at 11:32:19AM +0000, Robert Vanden Eynde wrote:
> >
> > > In a nutshell, I would like to be able to write:
> > > y = (b+2 for b = a + 1)
> >
> > I think this is somewhat similar to a suggestion of Nick Coghlan's. One
> > possible syntax as a statement might be:
> >
> > y = b + 2 given:
> >     b = a + 1
> >
> > https://www.python.org/dev/peps/pep-3150/
> >
> > In mathematics, I might write:
> >
> > y = b + 2 where b = a + 1
> >
> > although of course I wouldn't do so for anything so simple. Here's a
> > better example, the quadratic formula:
> >
> >     x = (-b ± √Δ) / (2a)
> >
> > where Δ = b² - 4ac
> >
> > although even there I'd usually write Δ in place.
> >
> > > Python already has the "functional if", lambdas, list comprehension,
> > > but not simple assignment functional style.
> >
> > I think you mean "if *expression*" rather than "functional if". The term
> > "functional" in programming usually refers to a particular paradigm:
> >
> > https://en.wikipedia.org/wiki/Functional_programming
> >
> > --
> > Steve

_______________________________________________
Python-ideas mailing list
Python-ideas at python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From mertz at gnosis.cx Mon Jun 26 12:55:19 2017
From: mertz at gnosis.cx (David Mertz)
Date: Mon, 26 Jun 2017 09:55:19 -0700
Subject: [Python-ideas] + operator on generators
In-Reply-To: <20170626032336.GV3149@ando.pearwood.info>
References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info>
Message-ID:

On Sun, Jun 25, 2017 at 8:23 PM, Steven D'Aprano wrote:
> On Sun, Jun 25, 2017 at 02:06:54PM +0200, lucas via Python-ideas wrote:
>
> I have a counter-proposal: introduce the iterator chaining operator "&":
>
> iterable & iterable --> itertools.chain(iterable, iterable)

In [1]: import numpy as np
In [2]: import itertools
In [3]: a, b = np.array([1,2,3]), np.array([4,5,6])
In [4]: a & b
Out[4]: array([0, 0, 2])
In [5]: a + b
Out[5]: array([5, 7, 9])
In [6]: list(itertools.chain(a, b))
Out[6]: [1, 2, 3, 4, 5, 6]

These are all distinct, useful, and well-defined behaviors.

--
Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
-------------- next part --------------
An HTML attachment was scrubbed...
For example, in a 3-bit table (8 slots), suppose the initial index is 1, and there's a collision. The 5*i+1 recurrence _would_ visit slot 6 next, but we also shift in 5 new bits and add them. If those bits are "random", there's a 1-in-8 chance that we'll visit index 1 _again_. If we don't, and have another collision, shifting in the next 5 bits has a 2-in-8 chance of repeating one of the two slots we already visited. And so on. Here using the `str(i)` generator (which yields random-ish hash codes): """ bits 3 nslots 8 dlen 5 alpha 0.62 # built 20,000 ... prober current found min 1:74.80% max 15 mean 1.42 fail min 1:37.62% max 18 mean 2.66 """ So despite that only 5 slots are filled, in at least one case it took 15 probes to find an existing key, and 18 probes to realize a key was missing. In the other schemes, it takes at most 5 probes for success and 6 probes for failure. This is worse for 64-bit hash codes than for 32-bit ones, because we can loop around twice as often before `perturb` permanently becomes 0 (at which point the pure 5*i+1 recurrence visits each slot at most once). The larger the table, the less likely a repeat due to `perturb` becomes. For example, suppose we have an 8-bit table and again visit index 1 first. We may visit any index in range(6, 6+32) next (depending on the 5 fresh bits shifted in), but repeating 1 is _not_ a possibility. Why do the double hashing methods do better for the `i << 12` generator? In any case where the hash codes have "a lot" of trailing bits in common, they all map to the same table index at first, and the probe sequence remains the same for all until the loop shifts `perturb` far enough right that the rightmost differing bits finally show up in the addition. In the double hashing methods, _all_ the bits of the hash code affect the value of `inc` computed before the loop starts, so they have a good chance of differing already on the second probe. 
This is again potentially worse for the current scheme with 64-bit hash codes, since the sheer number of common trailing bits _can_ be up to 63. However, the double-hashing schemes have pathological cases too, that the current scheme avoids. The first I tried was a `yield i * 1023` generator. These are spectacularly _good_ values for all schemes except `uniform` for successful searches, because i*1023 = j*1023 (mod 2**k) implies i=j (mod 2**k) That is, there are no first-probe collisions at all across a contiguous range of `i` with no more than 2**k values. But the `double` scheme for a 10-bit table degenerates to linear probing in this case, because inc = (h % mask) | 1 # force it odd is always 1 when h is divisible by 1023 (== mask for a 10-bit table). This is terrible for a failing search; e.g., for a 20-bit table (note that 2**20-1 is divisible by 1023): """ bits 20 nslots 1,048,576 dlen 699,050 alpha 0.67 # built 1 theoretical avg probes for uniform hashing when found 1.65 not found 3.00 ... prober current found min 1:100.00% max 1 mean 1.00 fail min 1:33.33% max 34 mean 3.04 prober double found min 1:100.00% max 1 mean 1.00 fail min 1:33.33% max 699049 mean 1867.51 prober dfib found min 1:100.00% max 1 mean 1.00 fail min 1:33.33% max 427625 mean 8.09 prober uniform found min 1:66.65% max 24 mean 1.65 fail min 1:33.35% max 35 mean 3.00 """ While that's a failing-search disaster for `double`, it's also bad for `dfib` (& I don't have a slick explanation for that). So where does that leave us? I don't know. All schemes have good & bad points ;-) I haven't yet thought of a cheap way to compute an `inc` for double-hashing that isn't vulnerable to bad behavior for _some_ easily constructed set of int keys. If you forget "cheap", it's easy; e.g., random.seed(h) inc = random.choice(range(1, mask + 1, 2)) Regardless, I'll attach the current version of the code. 
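Tim's small-table claim is easy to reproduce in a few lines. The sketch below re-declares his `current` and `double` probers (the same definitions as in the attached script) and shows hash 32 revisiting slot 0 within four probes of a 3-bit table, while an odd-increment double hash cannot repeat a slot before touching all eight:

```python
from itertools import islice

def current(h, nbits):
    # CPython-style probing: perturb (the shifted hash) feeds the recurrence
    mask = (1 << nbits) - 1
    i = h & mask
    while True:
        yield i
        h >>= 5
        i = (5*i + h + 1) & mask

def double(h, nbits):
    # double hashing: a fixed odd increment derived from all hash bits
    mask = (1 << nbits) - 1
    i = h & mask
    yield i
    inc = (h % mask) | 1  # force it odd
    while True:
        i = (i + inc) & mask
        yield i

# hash 32 has a bit above the 3-bit mask, so the perturb term bends the
# pure 5*i+1 recurrence and slot 0 is visited twice within four probes:
probes = list(islice(current(32, 3), 4))
print(probes)  # [0, 2, 3, 0] -- slot 0 revisited

# the odd increment makes the first 8 probes a permutation of the table:
print(sorted(islice(double(32, 3), 8)))  # all 8 slots, no repeats
```

The guarantee for `double` holds for any hash: an odd step is invertible mod a power of 2, so the first 2**nbits probes are always distinct.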
-------------- next part --------------
MIN_ELTS = 100_000

M64 = (1 << 64) - 1

def phash(obj, M=M64): # hash(obj) as uint64
    return hash(obj) & M

# Probers: generate sequence of table indices to look at,
# in table of size 2**nbits, for object with uint64 hash code h.

def linear(h, nbits):
    mask = (1 << nbits) - 1
    i = h & mask
    while True:
        yield i
        i = (i + 1) & mask

# offsets of 0, 1, 3, 6, 10, 15, ...
# this permutes the index range when the table size is a power of 2
def quadratic(h, nbits):
    mask = (1 << nbits) - 1
    i = h & mask
    inc = 1
    while True:
        yield i
        i = (i + inc) & mask
        inc += 1

def pre28201(h, nbits):
    mask = (1 << nbits) - 1
    i = h & mask
    while True:
        yield i
        i = (5*i + h + 1) & mask
        h >>= 5

def current(h, nbits):
    mask = (1 << nbits) - 1
    i = h & mask
    while True:
        yield i
        h >>= 5
        i = (5*i + h + 1) & mask

# One version of "double hashing". The increment between probes is
# fixed, but varies across objects. This does very well! Note that the
# increment needs to be relatively prime to the table size so that all
# possible indices are generated. Because our tables have power-of-2
# sizes, we merely need to ensure the increment is odd.
# Using `h % mask` is akin to "casting out 9's" in decimal: it's as if
# we broke the hash code into nbits-wide chunks from the right, then
# added them, then repeated that procedure until only one "digit"
# remains. All bits in the hash code affect the result.
# While mod is expensive, a successful search usually gets out on the
# first try, & then the lookup can return before the mod completes.
def double(h, nbits):
    mask = (1 << nbits) - 1
    i = h & mask
    yield i
    inc = (h % mask) | 1 # force it odd
    while True:
        i = (i + inc) & mask
        yield i

# Double hashing using Fibonacci multiplication for the increment. This
# does about as well as `double`, but doesn't require division.
#
# The multiplicative constant depends on the word size W, and is the
# nearest odd integer to 2**W/((1 + sqrt(5))/2). So for a 64-bit box:
#
# >>> 2**64 / ((1 + decimal.getcontext().sqrt(5))/2)
# Decimal('11400714819323198485.95161059')
#
# For a 32-bit box, it's 2654435769.
#
# The result is the high-order `nbits` bits of the low-order W bits of
# the product. In C, the "& M" part isn't needed (unsigned * in C
# returns only the low-order bits to begin with).
#
# Annoyance: I don't think Python dicts store `nbits`, just 2**nbits.
def dfib(h, nbits, M=M64):
    mask = (1 << nbits) - 1
    i = h & mask
    yield i
    inc = (((h * 11400714819323198485) & M) >> (64 - nbits)) | 1
    while True:
        i = (i + inc) & mask
        yield i

# The theoretical "gold standard": generate a random permutation of the
# table indices for each object. We can't actually do that, but
# Python's PRNG gets close enough that there's no practical difference.
def uniform(h, nbits):
    from random import seed, randrange
    seed(h)
    n = 1 << nbits
    seen = set()
    while True:
        assert len(seen) < n
        while True:
            i = randrange(n)
            if i not in seen:
                break
        seen.add(i)
        yield i

def spray(nbits, objs, cs, prober, *, used=None, shift=5):
    building = used is None
    nslots = 1 << nbits
    if building:
        used = [0] * nslots
    assert len(used) == nslots
    for o in objs:
        n = 1
        for i in prober(phash(o), nbits):
            if used[i]:
                n += 1
            else:
                break
        if building:
            used[i] = 1
        cs[n] += 1
    return used

def objgen(i=1):
    while True:
        yield str(i)
        #yield i << 12
        # i*1023 gives a unique first probe, but is deadly
        # on failing searches for `double` (especially) and
        # `dfib`.
        #yield i * 1023
        i += 1

# Average probes for a failing search; e.g.,
# 100 slots; 3 occupied
# 1: 97/100
# 2: 3/100 * 97/99
# 3: 3/100 * 2/99 * 97/98
# 4: 3/100 * 2/99 * 1/98 * 97/97
#
# `total` slots, `filled` occupied
# probability `p` probes will be needed, 1 <= p <= filled+1
# p-1 collisions followed by success:
# ff(filled, p-1) / ff(total, p-1) * (total - filled) / (total - (p-1))
# where `ff` is the falling factorial.
def avgf(total, filled):
    assert 0 <= filled < total
    ffn = float(filled)
    ffd = float(total)
    tmf = ffd - ffn
    result = 0.0
    ffpartial = 1.0
    ppartial = 0.0
    for p in range(1, filled + 2):
        thisp = ffpartial * tmf / (total - (p-1))
        ppartial += thisp
        result += thisp * p
        ffpartial *= ffn / ffd
        ffn -= 1.0
        ffd -= 1.0
    assert abs(ppartial - 1.0) < 1e-14, ppartial
    return result

# Average probes for a successful search. Alas, this takes time
# quadratic in `filled`.
def avgs(total, filled):
    assert 0 < filled < total
    return sum(avgf(total, f) for f in range(filled)) / filled

def pstats(ns):
    total = sum(ns.values())
    small = min(ns)
    print(f"min {small}:{ns[small]/total:.2%} "
          f"max {max(ns)} "
          f"mean {sum(i * j for i, j in ns.items())/total:.2f} ")

def drive(nbits):
    from collections import defaultdict
    from itertools import islice
    import math
    import sys

    nslots = 1 << nbits
    dlen = nslots * 2 // 3
    assert (sys.getsizeof({i: i for i in range(dlen)})
            < sys.getsizeof({i: i for i in range(dlen + 1)}))
    alpha = dlen / nslots # actual load factor of max dict
    ntodo = (MIN_ELTS + dlen - 1) // dlen
    print()
    print("bits", nbits,
          f"nslots {nslots:,} dlen {dlen:,} alpha {alpha:.2f} "
          f"# built {ntodo:,}")
    print(f"theoretical avg probes for uniform hashing "
          f"when found {math.log(1/(1-alpha))/alpha:.2f} "
          f"not found {1/(1-alpha):.2f}")
    print("    crisp ", end="")
    if nbits > 12:
        print("... skipping (slow!)")
    else:
        print(f"when found {avgs(nslots, dlen):.2f} "
              f"not found {avgf(nslots, dlen):.2f}")
    for prober in (linear,
                   quadratic,
                   pre28201,
                   current,
                   double,
                   dfib,
                   uniform,
                   ):
        print("    prober", prober.__name__)
        objs = objgen()
        good = defaultdict(int)
        bad = defaultdict(int)
        for _ in range(ntodo):
            used = spray(nbits, islice(objs, dlen), good, prober)
            assert sum(used) == dlen
            spray(nbits, islice(objs, nslots), bad, prober, used=used)
        print(" " * 8 + "found ", end="")
        pstats(good)
        print(" " * 8 + "fail ", end="")
        pstats(bad)

for bits in range(3, 23):
    drive(bits)

From python at mrabarnett.plus.com Mon Jun 26 14:51:54 2017 From: python at mrabarnett.plus.com (MRAB) Date: Mon, 26 Jun 2017 19:51:54 +0100 Subject: [Python-ideas] Reducing collisions in small dicts/sets In-Reply-To: References: Message-ID: On 2017-06-26 19:21, Tim Peters wrote:

> Some explanations and cautions. An advantage of sticking with pure
> Python instead of C is that spare moments are devoted to investigating
> the problem instead of thrashing with micro-optimizations ;-)
>
> Why does the current scheme suffer for small tables? With hindsight
> it's pretty obvious: it can visit table slots more than once (the
> other schemes cannot), and while the odds of that happening are tiny
> in large tables, they're high for small tables.
>
[snip]
> So where does that leave us? I don't know. All schemes have good &
> bad points ;-) I haven't yet thought of a cheap way to compute an
> `inc` for double-hashing that isn't vulnerable to bad behavior for
> _some_ easily constructed set of int keys. If you forget "cheap",
> it's easy; e.g.,
>
> random.seed(h)
> inc = random.choice(range(1, mask + 1, 2))
>
> Regardless, I'll attach the current version of the code.
>
If the current scheme suffers only for small tables, couldn't you use an alternative scheme only for small tables?
From tim.peters at gmail.com Mon Jun 26 15:09:35 2017 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 26 Jun 2017 14:09:35 -0500 Subject: [Python-ideas] Reducing collisions in small dicts/sets In-Reply-To: References: Message-ID: [MRAB]

> If the current scheme suffers only for small tables, couldn't you use an
> alternative scheme only for small tables?

Sure. But whether that's desirable partly depends on timing actual C code. Try it ;-) For maintenance sanity, it's obviously better to have only one scheme to wrestle with.

Note that "may visit a slot more than once" isn't the only thing in play, just one of the seemingly important things. For example, the current scheme requires 3 adds, 2 shifts, and a mask on each loop iteration (or 1 add and 1 shift of those could be replaced by 1 multiplication). The double-hashing schemes only require 1 add and 1 mask per iteration. In cases of collision, that difference is probably swamped by waiting for cache misses.

But, as I said in the first msg:

"""
This is a brain dump for someone who's willing to endure the
interminable pain of arguing about benchmarks ;-)
"""

From mikhailwas at gmail.com Mon Jun 26 16:20:22 2017 From: mikhailwas at gmail.com (Mikhail V) Date: Mon, 26 Jun 2017 22:20:22 +0200 Subject: [Python-ideas] A suggestion for a do...while loop Message-ID:

>All these situations could be handled by making a "while:" with no
>condition act as "while True:"
>But they could also be handled by updating pep8 to make "while True:" the
>recommended infinite loop syntax and make linters smarter about this (if
>they aren't already).

There was a big related discussion on Python-list in April (subject "Looping"). IMHO the cleanest way to denote an infinite loop would be the statement "loop:" Without introducing a new keyword I think the optimal would be just "while:"

I don't like "while True:" simply because it does not make enough visual distinction with the "while condition:" statement. E.g. I can have

    while True:
        ...
    while Blue:
        ....

Which adds some extra brain load. So if there was an explicit "while:" or "loop:" I would update for it globally in my projects.

Mikhail

From k7hoven at gmail.com Mon Jun 26 16:26:48 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Mon, 26 Jun 2017 23:26:48 +0300 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> Message-ID: On Mon, Jun 26, 2017 at 4:53 PM, Serhiy Storchaka wrote:

> 26.06.17 13:47, Joao S. O. Bueno wrote:
>
>> On 25 June 2017 at 20:55, Danilo J. S. Bellini wrote:
>>
>> On Sun, Jun 25, 2017 at 3:06 PM, lucas via Python-ideas wrote:
>>
>> I often use generators, and itertools.chain on them.
>> What about providing something like the following:
>>
>> a = (n for n in range(2))
>> b = (n for n in range(2, 4))
>> tuple(a + b) # -> 0 1 2 3
>>
>>
>> You know you can do `tuple(*a, *b)` , right?
>>
>> The problem with the "*" notation is that it will actually render the
>> iterable
>> contents eagerly - unlike something that would just chain them.
>> But for creating tuples, it just works.
>>
>
> Even the tuple constructor is not needed.
>
> >>> *a, *b
> (0, 1, 2, 3)
>

And you can also do

def a_and_b():
    yield from a
    yield from b

c = a_and_b()  # iterable that yields 0, 1, 2, 3

I sometimes wish there was something like

c from:
    yield from a
    yield from b

...or to get a list:

c as list from:
    yield from a
    yield from b

...or a sum:

c as sum from:
    yield from a
    yield from b

These would be great for avoiding crazy oneliner generator expressions. They would also be equivalent to things like:

@list
@from
def c():
    yield from a
    yield from b

@sum
@from
def c():
    yield from a
    yield from b

the above, given:

def from(genfunc):
    return genfunc()

Except of course `from` is a keyword and it should probably just be `call`.

But this still doesn't naturally extend to allow indexing and slicing, like c[2] and c[1:3], for the case where the concatenated iterables are Sequences.
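For the Sequence case Koos mentions, indexing and slicing over concatenated sequences can be had today with a small wrapper. `ChainView` below is invented here as a sketch, not an existing class; slicing is done eagerly for simplicity:

```python
from collections.abc import Sequence

class ChainView(Sequence):
    """Read-only view of several sequences concatenated (name invented here)."""

    def __init__(self, *seqs):
        self._seqs = seqs

    def __len__(self):
        return sum(len(s) for s in self._seqs)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # eager (not lazy) slicing, good enough for a sketch
            return [self[i] for i in range(*index.indices(len(self)))]
        if index < 0:
            index += len(self)
        # walk the underlying sequences until the index falls inside one
        for seq in self._seqs:
            if index < len(seq):
                return seq[index]
            index -= len(seq)
        raise IndexError("ChainView index out of range")

c = ChainView([0, 1], (2, 3))
print(len(c), c[2], c[1:3])  # 4 2 [1, 2]
```

Because it subclasses the `Sequence` ABC, iteration, `in`, `index` and `count` come for free from the mixin methods.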
-- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Mon Jun 26 16:30:07 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 26 Jun 2017 13:30:07 -0700 Subject: [Python-ideas] A suggestion for a do...while loop In-Reply-To: References: Message-ID: <59516ECF.2080406@stoneleaf.us> On 06/26/2017 01:20 PM, Mikhail V wrote: > I dont't like "while True:" simply because it does not make enough > visual distinction with the "while condition:" statement. My "while True:" loops look something like: while "": -- ~Ethan~ From wes.turner at gmail.com Mon Jun 26 17:37:55 2017 From: wes.turner at gmail.com (Wes Turner) Date: Mon, 26 Jun 2017 16:37:55 -0500 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> Message-ID: On Sunday, June 25, 2017, Wes Turner wrote: > > > On Sunday, June 25, 2017, Danilo J. S. Bellini > wrote: > >> On Sun, Jun 25, 2017 at 3:06 PM, lucas via Python-ideas < >> python-ideas at python.org> wrote: >> >>> I often use generators, and itertools.chain on them. >>> What about providing something like the following: >>> >>> a = (n for n in range(2)) >>> b = (n for n in range(2, 4)) >>> tuple(a + b) # -> 0 1 2 3 >> >> >> AudioLazy does that: https://github.com/danilobellini/audiolazy >> > > - http://toolz.readthedocs.io/en/latest/api.html#toolz.itertoolz.concat > and concatv > > - https://github.com/kachayev/fn.py#streams-and-infinite- > sequences-declaration > - Stream() << obj > << is __lshift__() <<= is __ilshift__() https://docs.python.org/2/library/operator.html Do Stream() and __lshift__() from fn.py not solve here? > > >> >> >> -- >> Danilo J. S. Bellini >> --------------- >> "*It is not our business to set up prohibitions, but to arrive at >> conventions.*" (R. Carnap) >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cs at zip.com.au Mon Jun 26 18:12:29 2017 From: cs at zip.com.au (Cameron Simpson) Date: Tue, 27 Jun 2017 08:12:29 +1000 Subject: [Python-ideas] A suggestion for a do...while loop In-Reply-To: <59516ECF.2080406@stoneleaf.us> References: <59516ECF.2080406@stoneleaf.us> Message-ID: <20170626221229.GA33905@cskk.homeip.net> On 26Jun2017 13:30, Ethan Furman wrote:

>On 06/26/2017 01:20 PM, Mikhail V wrote:
>>I don't like "while True:" simply because it does not make enough
>>visual distinction with the "while condition:" statement.
>
>My "while True:" loops look something like:
>
>    while "":

O_o Nice!

Cheers, Cameron Simpson

From tim.peters at gmail.com Mon Jun 26 23:18:04 2017 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 26 Jun 2017 22:18:04 -0500 Subject: [Python-ideas] Reducing collisions in small dicts/sets In-Reply-To: References: Message-ID: [Tim]

>... I haven't yet thought of a cheap way to compute an
> `inc` for double-hashing that isn't vulnerable to bad behavior for
> _some_ easily constructed set of int keys. If you forget "cheap",
> it's easy; e.g.,
>
> random.seed(h)
> inc = random.choice(range(1, mask + 1, 2))

Heh. I always tell people not to believe what they think: run code to make sure! So I did ;-)

def duni(h, nbits):
    from random import seed, choice
    mask = (1 << nbits) - 1
    i = h & mask
    yield i
    seed(h)
    inc = choice(range(1, mask + 1, 2))
    while True:
        i = (i + inc) & mask
        yield i

On the terrible-for-good-reasons-for-failing-`double`-1023*i-searches case, it turned out `duni` did just as bad as `dfib`:

"""
bits 20 nslots 1,048,576 dlen 699,050 alpha 0.67 # built 1
theoretical avg probes for uniform hashing when found 1.65 not found 3.00
...
    crisp ... skipping (slow!)
    prober current
        found min 1:100.00% max 1 mean 1.00
        fail min 1:33.33% max 34 mean 3.04
    prober double
        found min 1:100.00% max 1 mean 1.00
        fail min 1:33.33% max 699049 mean 1867.51
    prober dfib
        found min 1:100.00% max 1 mean 1.00
        fail min 1:33.33% max 427625 mean 8.09
    prober duni
        found min 1:100.00% max 1 mean 1.00
        fail min 1:33.33% max 433846 mean 9.84
    prober uniform
        found min 1:66.65% max 24 mean 1.65
        fail min 1:33.35% max 35 mean 3.00
"""

Yikes! That really surprised me. So - are integers of the form 1023*i so magical they also manage to provoke random.seed() into amazingly regular behavior? Or is there something about the whole idea of "double hashing" that leaves it vulnerable no matter how "random" an odd increment it picks?

It's not the former. More code showed that a great number of distinct `inc` values are in fact used. And double hashing is widely & successfully used in other contexts, so it's not a problem with the idea _in general_.

What does appear to be the case: it doesn't always play well with taking the last `nbits` bits as the initial table index. In other double-hashing contexts, they strive to pick a pseudo-random initial table index too. Why this matters: for any odd integer N,

    N*i = N*j (mod 2**k)

if and only if

    i = j (mod 2**k)

So, when integers are their own hash codes, for any `i` all the values in range(i*N, i*N + N*2**k, N) hash to unique slots in a table with 2**k slots (actually any table with 2**m slots for any m >= k). In particular, mod 2**k they map to the slots

    i*N + 0*N
    i*N + 1*N
    i*N + 2*N
    i*N + 3*N
    ...

So, on a failing search for an integer of the form j*N, whenever double-hashing picks an increment that happens to be a multiple of N, it can bump into a huge chain of collisions, because double-hashing also uses a fixed increment between probes. And this is so for any odd N.

For example, using a "yield i * 11" generator:

"""
bits 20 nslots 1,048,576 dlen 699,050 alpha 0.67 # built 1
theoretical avg probes for uniform hashing when found 1.65 not found 3.00
...
    prober current
        found min 1:100.00% max 1 mean 1.00
        fail min 1:33.33% max 34 mean 3.00
    prober double
        found min 1:100.00% max 1 mean 1.00
        fail min 1:33.33% max 667274 mean 34.10
    prober dfib
        found min 1:100.00% max 1 mean 1.00
        fail min 1:33.33% max 539865 mean 9.30
    prober duni
        found min 1:100.00% max 1 mean 1.00
        fail min 1:33.33% max 418562 mean 8.65
    prober uniform
        found min 1:66.63% max 25 mean 1.65
        fail min 1:33.31% max 32 mean 3.00
"""

All three double-hashing variants have horrible max collision chains, for the reasons just explained. In "normal" double-hashing contexts, the initial table indices are scrambled; it's my fault they follow a (mod 2**k) arithmetic progression in Python. Which fault I gladly accept ;-) It's valuable in practice.

But, so long as that stays, it kills the double-hashing idea for me: while it's great for random-ish hash codes, the worst cases are horribly bad and very easily provoked (even by accident).
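The 1023*i failure mode can be reproduced with the `double` prober alone. The sketch below mirrors Tim's 10-bit setup (683 filled slots, about a 2/3 load factor, like a full dict); since 1023 equals the table mask, every such key yields inc == 1 and a failing search crawls through the filled run one slot at a time:

```python
def double(h, nbits):
    # double hashing with a fixed odd increment, as in the earlier post
    mask = (1 << nbits) - 1
    i = h & mask
    yield i
    inc = (h % mask) | 1  # (1023*i) % 1023 == 0, so inc is always 1 here
    while True:
        i = (i + inc) & mask
        yield i

nbits = 10
mask = (1 << nbits) - 1
# fill ~2/3 of a 1024-slot table with keys of the form 1023*i
filled = {(1023 * i) & mask for i in range(683)}

# failing search for an absent key with the same hash form: it lands
# inside the filled run and degenerates to linear probing
probes = 0
for slot in double(1023 * 1706, nbits):
    probes += 1
    if slot not in filled:
        break
print(probes)  # hundreds of probes for a single failing lookup
```

Here `filled` is {0} plus slots 342..1023, and the search starts at slot 342, so it steps through the entire occupied run before finding a free slot.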
From steve at pearwood.info Tue Jun 27 03:12:46 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 27 Jun 2017 17:12:46 +1000 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> Message-ID: <20170627071245.GY3149@ando.pearwood.info> On Mon, Jun 26, 2017 at 09:55:19AM -0700, David Mertz wrote: > On Sun, Jun 25, 2017 at 8:23 PM, Steven D'Aprano > wrote: > > > On Sun, Jun 25, 2017 at 02:06:54PM +0200, lucas via Python-ideas wrote: > > > > I have a counter-proposal: introduce the iterator chaining operator "&": > > > > iterable & iterable --> itertools.chain(iterable, iterable) > > > > In [1]: import numpy as np > In [2]: import itertools > In [3]: a, b = np.array([1,2,3]), np.array([4,5,6]) > In [4]: a & b > Out[4]: array([0, 0, 2]) > In [5]: a + b > Out[5]: array([5, 7, 9]) > In [6]: list(itertools.chain(a, b)) > Out[6]: [1, 2, 3, 4, 5, 6] > > These are all distinct, useful, and well-defined behaviors. Um... yes? I don't understand what point you are making. Did you read all of my post? I know it was long, but if you stopped reading at the point you replied, you might not realise that my proposal keeps the existing bitwise-AND behaviour of & and so the numpy array behaviour won't change. TL;DR - keep the existing __and__ and __rand__ behaviour; - if they are not defined, and both operands x, y are iterable, return chain(x, y); - raise TypeError for operands which neither support __(r)and__ nor are iterable. I think that chaining iterators is common enough and important enough in Python 3 to deserve an operator. While lists are still important, a lot of things which were lists are now lazily generated iterators, and we often need to concatenate them. itertools.chain() is less convenient than it should be. If we decide that chaining deserves an operator, it shouldn't be + because that clashes with existing sequence addition. 
& has the advantage that it means "concatenation" in some other languages, it means "and" in English which can be read as "add or concatenate", and it is probably unsupported by most iterables. I didn't think of numpy arrays as an exception (I was mostly thinking of sets) but I don't think people chain numpy arrays together very often. If they do, it's easy enough to call iter() first.

-- Steve

From mertz at gnosis.cx Tue Jun 27 03:47:40 2017 From: mertz at gnosis.cx (David Mertz) Date: Tue, 27 Jun 2017 00:47:40 -0700 Subject: [Python-ideas] + operator on generators In-Reply-To: <20170627071245.GY3149@ando.pearwood.info> References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <20170627071245.GY3149@ando.pearwood.info> Message-ID: On Tue, Jun 27, 2017 at 12:12 AM, Steven D'Aprano wrote:

> > > I have a counter-proposal: introduce the iterator chaining operator
> "&":
> > > iterable & iterable --> itertools.chain(iterable, iterable)
> >
> > In [1]: import numpy as np
> > In [2]: import itertools
> > In [3]: a, b = np.array([1,2,3]), np.array([4,5,6])
> > In [4]: a & b
> > Out[4]: array([0, 0, 2])
> > In [5]: a + b
> > Out[5]: array([5, 7, 9])
> > In [6]: list(itertools.chain(a, b))
> > Out[6]: [1, 2, 3, 4, 5, 6]
> >
> > These are all distinct, useful, and well-defined behaviors.
>
> - keep the existing __and__ and __rand__ behaviour;
> - if they are not defined, and both operands x, y are iterable,
> return chain(x, y);
>

I understand. But it invites confusion about just what the `&` operator will do for a given iterable. For NumPy itself, you don't really want to spell `chain(a, b)` so much. But you CAN, they are iterables. The idiomatic way is:

>>> np.concatenate((a,b))
array([1, 2, 3, 4, 5, 6])

However, for "any old iterable" it feels very strange to need to inspect the .__and__() and .__rand__() methods of the things on both sides before you know WHAT operator it is.
Maybe if you are confident a and b are exactly NumPy arrays it is obvious, but what about: from some_array_library import a from other_array_library import b What do you think `a & b` will do under your proposal? Yes, I understand it's deterministic... but it's far from *obvious*. This isn't even doing something pathological like defining both `.__iter__()` and `.__and__()` on the same class... which honestly, isn't even all that pathological; I can imagine real-world use cases. I think that chaining iterators is common enough and important enough in > Python 3 to deserve an operator. While lists are still important, a lot > of things which were lists are now lazily generated iterators, and we > often need to concatenate them. itertools.chain() is less convenient > than it should be. > I actually completely agree! I just wish I could think of a good character that doesn't have some very different meaning in other well-known contexts (even among iterables). Some straw men: both = a ? b both = a ? b Either of those look pretty nice to me, but neither is easy to enter on most keyboards. I think I wouldn't mind `&` if it only worked on iteraTORS. But then it loses many of the use cases. I'd like this, after all: for i in range(10)?[20,19,18]?itertools.count(100): if i>N: break ... -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From greg.ewing at canterbury.ac.nz Tue Jun 27 04:40:23 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 27 Jun 2017 20:40:23 +1200 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <20170627071245.GY3149@ando.pearwood.info> Message-ID: <595219F7.7030804@canterbury.ac.nz> David Mertz wrote: > I just wish I could think of a good > character that doesn't have some very different meaning in other > well-known contexts (even among iterables). (a;b) Should be unambiguous as long as the parens are required. -- Greg From stephanh42 at gmail.com Tue Jun 27 05:01:32 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Tue, 27 Jun 2017 11:01:32 +0200 Subject: [Python-ideas] + operator on generators In-Reply-To: <595219F7.7030804@canterbury.ac.nz> References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <20170627071245.GY3149@ando.pearwood.info> <595219F7.7030804@canterbury.ac.nz> Message-ID: Hi all, Is "itertools.chain" actually that common? Sufficiently common to warrant its own syntax? In my experience, "enumerate" is far more common among the iterable operations. And that doesn't have special syntax either. A minimal proposal would be to promote "chain" to builtins. Stephan 2017-06-27 10:40 GMT+02:00 Greg Ewing : > David Mertz wrote: >> >> I just wish I could think of a good character that doesn't have some very >> different meaning in other well-known contexts (even among iterables). > > > (a;b) > > Should be unambiguous as long as the parens are required. 
> > -- > Greg > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From steve at pearwood.info Tue Jun 27 06:22:19 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 27 Jun 2017 20:22:19 +1000 Subject: [Python-ideas] + operator on generators In-Reply-To: <595219F7.7030804@canterbury.ac.nz> References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <20170627071245.GY3149@ando.pearwood.info> <595219F7.7030804@canterbury.ac.nz> Message-ID: <20170627102219.GZ3149@ando.pearwood.info> On Tue, Jun 27, 2017 at 08:40:23PM +1200, Greg Ewing wrote: > David Mertz wrote: > >I just wish I could think of a good > >character that doesn't have some very different meaning in other > >well-known contexts (even among iterables). > > (a;b) > > Should be unambiguous as long as the parens are required. Except to the human reader, who can be forgiven for thinking "What the fudge is that semicolon doing there???" (or even less polite). I don't know of any language that uses semi-colon as an operator. That looks like a bug magnet to me. Consider what happens when (not if) you write (a,b) instead, or when you write items = (x; y) and it happens to succeed because x and y are iterable. To be perfectly honest, and no offence is intended, this suggestion seems so wacky to me that I'm not sure if you intended for it to be taken seriously or not. 
-- Steve

From steve at pearwood.info Tue Jun 27 06:38:38 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 27 Jun 2017 20:38:38 +1000 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <20170627071245.GY3149@ando.pearwood.info> <595219F7.7030804@canterbury.ac.nz> Message-ID: <20170627103837.GA3149@ando.pearwood.info> On Tue, Jun 27, 2017 at 11:01:32AM +0200, Stephan Houben wrote:

> Hi all,
>
> Is "itertools.chain" actually that common?
> Sufficiently common to warrant its own syntax?

I think it's much more common than (say) sequence repetition:

    a = [None]*5

which has had an operator for a long, long time.

> In my experience, "enumerate" is far more common
> among the iterable operations.
> And that doesn't have special syntax either.

True. But enumerate is a built-in, and nearly always used in a single context:

    for i, x in enumerate(sequence):
        ...

A stranger to Python could almost be forgiven for thinking that enumerate is part of the for-loop syntax.

In contrast, chaining (while not as common as, say, numeric addition) happens in variable contexts: in expressions, as arguments to function calls, etc.

It is absolutely true that this proposal brings nothing new to the language that cannot already be done. It's syntactic sugar. So I guess the value of it depends on whether or not you chain iterables enough that you would rather use an operator than a function.

> A minimal proposal would be to promote "chain" to builtins.

Better than nothing, I suppose.
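Short of a language change, the operator spelling can already be approximated in user code. `ichain` below is an invented wrapper, not anything from the stdlib: it wraps one iterable and overloads & (and +) to mean lazy chaining, delegating to itertools.chain underneath:

```python
from itertools import chain

class ichain:
    """Invented here: wrap an iterable so & (and +) mean lazy chaining."""

    def __init__(self, iterable):
        self._it = iterable

    def __iter__(self):
        return iter(self._it)

    def __and__(self, other):
        # self & other, keeping the result chainable
        return ichain(chain(self._it, other))

    __add__ = __and__

    def __rand__(self, other):
        # plain_iterable & ichain(...) also works
        return ichain(chain(other, self._it))

    __radd__ = __rand__

gen = (n * n for n in range(3))
print(list(ichain(gen) & [100] & (200,)))  # [0, 1, 4, 100, 200]
```

Only the left-most operand needs wrapping, since `__rand__` picks up the mixed case; the chaining itself stays lazy until the result is consumed.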
-- Steve

From steve at pearwood.info Tue Jun 27 07:10:22 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 27 Jun 2017 21:10:22 +1000 Subject: [Python-ideas] + operator on generators In-Reply-To: <20170626032336.GV3149@ando.pearwood.info> References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> Message-ID: <20170627111022.GB3149@ando.pearwood.info> On Mon, Jun 26, 2017 at 01:23:36PM +1000, Steven D'Aprano wrote:

> The downside to this proposal is that it adds some conceptual complexity
> to Python operators. Apart from `is` and `is not`, all Python operators
> call one or more dunder methods. This is (as far as I know) the first
> operator which has fall-back functionality if the dunder methods aren't
> defined.

I remembered there is a precedent here. The == operator tries __eq__ before falling back on object identity, at least in Python 2.

py> getattr(object(), '__eq__')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'object' object has no attribute '__eq__'
py> object() == object()
False

-- Steve

From stephanh42 at gmail.com Tue Jun 27 07:32:05 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Tue, 27 Jun 2017 13:32:05 +0200 Subject: [Python-ideas] + operator on generators In-Reply-To: <20170627103837.GA3149@ando.pearwood.info> References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <20170627071245.GY3149@ando.pearwood.info> <595219F7.7030804@canterbury.ac.nz> <20170627103837.GA3149@ando.pearwood.info> Message-ID: Hi Steven,

To put this into perspective, I did some greps on Sagemath, being the largest Python project I have installed on this machine (1955 .py files).

Occurrences:
enumerate: 922
zip: 585
itertools.product: 67
itertools.combinations: 18
itertools.islice: 17
chain: 14 (with or without itertools. prefix)

This seems to confirm my gut feeling that "chain" just isn't that common an operation; even among itertools functions, product, combinations and islice are more common.

Based on this I would say there is little justification to even put "chain" in builtins, let alone to give it dedicated syntax.

I also note that * for repetition is only supported for a few iterables (list, tuple), incidentally the same ones which support + for sequence chaining.

Stephan

2017-06-27 12:38 GMT+02:00 Steven D'Aprano :
> On Tue, Jun 27, 2017 at 11:01:32AM +0200, Stephan Houben wrote:
>> Hi all,
>>
>> Is "itertools.chain" actually that common?
>> Sufficiently common to warrant its own syntax?
>
> I think it's much more common than (say) sequence repetition:
>
> a = [None]*5
>
> which has had an operator for a long, long time.
>
>> In my experience, "enumerate" is far more common
>> among the iterable operations.
>> And that doesn't have special syntax either.
>
> True. But enumerate is a built-in, and nearly always used in a single
> context:
>
> for i, x in enumerate(sequence):
> ...
>
> A stranger to Python could almost be forgiven for thinking that
> enumerate is part of the for-loop syntax.
>
> In contrast, chaining (while not as common as, say, numeric addition)
> happens in variable contexts: in expressions, as arguments to function
> calls, etc.
>
> It is absolutely true that this proposal brings nothing new to the
> language that cannot already be done. It's syntactic sugar. So I guess
> the value of it depends on whether or not you chain iterables enough
> that you would rather use an operator than a function.
>
>> A minimal proposal would be to promote "chain" to builtins.
>
> Better than nothing, I suppose.
> > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From ncoghlan at gmail.com Tue Jun 27 07:41:41 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 27 Jun 2017 21:41:41 +1000 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: References: <20170623190209.GQ3149@ando.pearwood.info> <20170623225603.GA27276@cskk.homeip.net> Message-ID: On 27 June 2017 at 02:29, Sven R. Kunze wrote: > On 24.06.2017 01:37, MRAB wrote: >> >> I think a "shallow exception" would be one that's part of a defined API, >> as distinct from one that is an artifact of the implementation, a leak in >> the abstraction. > > I like the "shallow exception" idea most. It's simple and it covers most if > not all issues. You also hit the nail with pointing to leaking abstractions. The shallow exception notion breaks a fairly fundamental refactoring principle in Python: you should be able to replace an arbitrary expression with a subfunction or subgenerator that produces the same result without any of the surrounding code being able to tell the difference. By contrast, Steven's exception_guard recipe just takes the existing "raise X from Y" feature, and makes it available as a context manager and function decorator. Cheers, Nick. 
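For reference, a minimal sketch of what such an exception_guard could look like (Steven's actual recipe, posted earlier in the thread, may differ in name and detail; this is only a guess at the shape):

```python
from contextlib import contextmanager

@contextmanager
def exception_guard(exc_type, replacement=RuntimeError):
    # Catch exc_type escaping the guarded block and re-raise it as
    # `replacement`, keeping the original as __cause__ via "raise X from Y".
    try:
        yield
    except exc_type as err:
        raise replacement(str(err)) from err

# An ImportError escaping the guarded block is converted, so an outer
# handler must catch RuntimeError (and can still inspect __cause__):
try:
    with exception_guard(ImportError):
        raise ImportError("broken import")
except RuntimeError as err:
    print(type(err.__cause__).__name__)  # ImportError
```

Because objects produced by @contextmanager are also usable as function decorators (since Python 3.2), the same object covers both of the uses Nick mentions.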
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Tue Jun 27 07:44:44 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 27 Jun 2017 21:44:44 +1000 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <20170627071245.GY3149@ando.pearwood.info> Message-ID: <20170627114443.GC3149@ando.pearwood.info> TL;DR If people really object to & doing double-duty as bitwise-AND and chaining, there's always && as a possibility. On Tue, Jun 27, 2017 at 12:47:40AM -0700, David Mertz wrote: > > - keep the existing __and__ and __rand__ behaviour; > > - if they are not defined, and both operands x, y are iterable, > > return chain(x, y); > > > > I understand. But it invites confusion about just what the `&` operator > will do for a given iterable. With operator overloading, that's a risk for any operator, and true for everything except literals. What will `x * y` do? How about `y * x`? They're not even guaranteed to call the same dunder method. But this is a problem in theory far more than in practice. In practice, you know what types you are expecting, and if you don't get them, you either explicitly raise an exception, or wait for something to fail. "Consenting adults" applies, and we often put responsibility back on the caller to do the right thing. If you expect to use & on iterable arguments, it is reasonable to put the responsibility on the caller to only provide "sensible" iterables, not something wacky like an infinite generator (unless your code is explicitly documented as able to handle them) or one that overrides __(r)and__. Or you could check for it yourself, if you don't trust the argument:

if hasattr(arg, '__and__') or hasattr(arg, '__rand__'):
    raise Something

But that strikes me as overkill.
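For concreteness, the fallback order being proposed here -- dunder methods first, chaining only if neither operand handles `&` -- can be modelled as a plain function. This is only a sketch of the semantics under discussion, not an implementation of the operator (real binary-operator dispatch also has subclass-precedence rules this ignores):

```python
from itertools import chain

def amp(x, y):
    """Model the proposed '&': try __and__/__rand__ first, then fall
    back to iterator chaining if both operands are iterable."""
    result = NotImplemented
    if hasattr(type(x), '__and__'):
        result = type(x).__and__(x, y)
    if result is NotImplemented and hasattr(type(y), '__rand__'):
        result = type(y).__rand__(y, x)
    if result is NotImplemented:
        try:
            return chain(iter(x), iter(y))   # the proposed new fallback
        except TypeError:
            raise TypeError("unsupported operand type(s) for &") from None
    return result

print(amp(6, 3))                  # 2 -- int.__and__ still wins
print(amp({1, 2}, {2, 3}))        # {2} -- set intersection preserved
print(list(amp([1, 2], (3, 4))))  # [1, 2, 3, 4] -- fallback chaining
```

Note that existing meanings of `&` (integer bitwise-AND, set intersection) are untouched; only pairs where neither side implements the operator fall through to chaining.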
You don't normally check for dunders before using an operator, and we already have operators that can return different types depending on the operands:

% can mean modulo division or string interpolation
* can mean sequence repetition or numeric multiplication
+ can mean numeric addition or sequence concatenation

Why is

& can mean iterable chaining or bitwise-AND

uniquely confusing? > For NumPy itself, you don't really want to > spell `chain(a, b)` so much. But you CAN, they are iterables. The > idiomatic way is: > > >>> np.concatenate((a,b)) > array([1, 2, 3, 4, 5, 6]) That's not really chaining, per se -- it is concatenating two arrays to create a third array. (Probably by copying the array elements.) If you wanted to chain a numpy array, you would either use itertools.chain directly, or call iter(myarray) before using the & operator. > However, for "any old iterable" it feels very strange to need to inspect > the .__and__() and .__rand__ () methods of the things on both sides before > you know WHAT operator it is. Do you inspect the dunder methods of objects before calling + or * or & currently? Why would you need to start now? Since there's no way of peering into an object's dunder methods and seeing what they do (short of reading the source code), you always have an element of trust and hope whenever you call an operator on anything but a literal.
All the risk already exists as soon as you allow method overloading. All we're doing is saying "try the method overloading first, and if that isn't supported, try iterable chaining second before raising TypeError". > This isn't even doing > something pathological like defining both `.__iter__()` and `.__and__()` on > the same class... which honestly, isn't even all that pathological; I can > imagine real-world use cases. Indeed -- numpy arrays probably do that, as do sets and frozensets. I didn't say it was pathological, I said it was uncommon. > I think that chaining iterators is common enough and important enough in > > Python 3 to deserve an operator. While lists are still important, a lot > > of things which were lists are now lazily generated iterators, and we > > often need to concatenate them. itertools.chain() is less convenient > > than it should be. > > > > I actually completely agree! I just wish I could think of a good character > that doesn't have some very different meaning in other well-known contexts > (even among iterables). There's always && for iterator chaining. -- Steve From steve at pearwood.info Tue Jun 27 07:48:21 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 27 Jun 2017 21:48:21 +1000 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <20170627071245.GY3149@ando.pearwood.info> <595219F7.7030804@canterbury.ac.nz> <20170627103837.GA3149@ando.pearwood.info> Message-ID: <20170627114821.GD3149@ando.pearwood.info> On Tue, Jun 27, 2017 at 01:32:05PM +0200, Stephan Houben wrote: > Hi Steven, > > To put this into perspective, I did some greps on Sagemath, > being the largest Python project I have installed on this machine > (1955 .py files). And one which is especially focused on numerical processing, not really the sort of thing that does much iterator chaining.
That's hardly a fair test -- we know there are applications where chaining is not important at all. It's the applications where it *is* important that we should be looking at. -- Steve From stephanh42 at gmail.com Tue Jun 27 09:53:14 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Tue, 27 Jun 2017 15:53:14 +0200 Subject: [Python-ideas] + operator on generators In-Reply-To: <20170627114821.GD3149@ando.pearwood.info> References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <20170627071245.GY3149@ando.pearwood.info> <595219F7.7030804@canterbury.ac.nz> <20170627103837.GA3149@ando.pearwood.info> <20170627114821.GD3149@ando.pearwood.info> Message-ID: > It's the applications where it *is* important that > we should be looking at. Um, yes, but given our relative positions in this debate, the onus is not really on *me* to demonstrate such an application, right? That would just confuse everybody ;-) (FWIW, Sagemath is not mostly "numerical processing", it is mostly *symbolic* calculations and involves a lot of complex algorithms and datastructures, including sequences.) Stephan 2017-06-27 13:48 GMT+02:00 Steven D'Aprano : > On Tue, Jun 27, 2017 at 01:32:05PM +0200, Stephan Houben wrote: >> Hi Steven, >> >> To put this into perspective, I did some greps on Sagemath, >> being the largest Python project I have installed on this machine >> (1955 .py files). > > And one which is especially focused on numerical processing, not > really the sort of thing that does much iterator chaining. That's > hardly a fair test -- we know there are applications where chaining is > not important at all. It's the applications where it *is* important that > we should be looking at.
> > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From joshua.morton13 at gmail.com Tue Jun 27 12:03:32 2017 From: joshua.morton13 at gmail.com (Joshua Morton) Date: Tue, 27 Jun 2017 16:03:32 +0000 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <20170627071245.GY3149@ando.pearwood.info> <595219F7.7030804@canterbury.ac.nz> <20170627103837.GA3149@ando.pearwood.info> <20170627114821.GD3149@ando.pearwood.info> Message-ID: Just another syntactical suggestion: the binary ++ operator is used as concat in various contexts in various languages, and is probably less likely to confuse people as being either a logical or binary &. On Tue, Jun 27, 2017 at 6:53 AM Stephan Houben wrote: > > Its the applications where it *is* important that > > we should be looking at. > > Um, yes, but given our relative positions in this debate, > the onus is not really on *me* to demonstrate such an application, right? > That would just confuse everbody ;-) > > (FWIW, Sagemath is not mostly "numerical processing", it is mostly > *symbolic* calculations and involves a lot of complex algorithms and > datastructures, including sequences.) > > Stephan > > 2017-06-27 13:48 GMT+02:00 Steven D'Aprano : > > On Tue, Jun 27, 2017 at 01:32:05PM +0200, Stephan Houben wrote: > >> Hi Steven, > >> > >> To put this into perspective, I did some greps on Sagemath, > >> being the largest Python project I have installed on this machine > >> (1955 .py files). > > > > And one which is especially focused on numerical processing, not > > really the sort of thing that does a much iterator chaining. That's > > hardly a fair test -- we know there are applications where chaining is > > not important at all. 
It's the applications where it *is* important that > > we should be looking at. > > > > > > -- > > Steve > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephanh42 at gmail.com Tue Jun 27 13:06:13 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Tue, 27 Jun 2017 19:06:13 +0200 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <20170627071245.GY3149@ando.pearwood.info> <595219F7.7030804@canterbury.ac.nz> <20170627103837.GA3149@ando.pearwood.info> <20170627114821.GD3149@ando.pearwood.info> Message-ID: Unfortunately this is existing syntax:

a++b

is parsed as

a+(+b)

Stephan On 27 Jun 2017 at 6:03 p.m., "Joshua Morton" wrote: Just another syntactical suggestion: the binary ++ operator is used as concat in various contexts in various languages, and is probably less likely to confuse people as being either a logical or binary &. On Tue, Jun 27, 2017 at 6:53 AM Stephan Houben wrote: > > It's the applications where it *is* important that > > we should be looking at. > > Um, yes, but given our relative positions in this debate, > the onus is not really on *me* to demonstrate such an application, right? > That would just confuse everybody ;-) > > (FWIW, Sagemath is not mostly "numerical processing", it is mostly > *symbolic* calculations and involves a lot of complex algorithms and > datastructures, including sequences.)
> > Stephan > > 2017-06-27 13:48 GMT+02:00 Steven D'Aprano : > > On Tue, Jun 27, 2017 at 01:32:05PM +0200, Stephan Houben wrote: > >> Hi Steven, > >> > >> To put this into perspective, I did some greps on Sagemath, > >> being the largest Python project I have installed on this machine > >> (1955 .py files). > > > > And one which is especially focused on numerical processing, not > > really the sort of thing that does a much iterator chaining. That's > > hardly a fair test -- we know there are applications where chaining is > > not important at all. Its the applications where it *is* important that > > we should be looking at. > > > > > > -- > > Steve > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From joshua.morton13 at gmail.com Tue Jun 27 13:14:39 2017 From: joshua.morton13 at gmail.com (Joshua Morton) Date: Tue, 27 Jun 2017 17:14:39 +0000 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <20170627071245.GY3149@ando.pearwood.info> <595219F7.7030804@canterbury.ac.nz> <20170627103837.GA3149@ando.pearwood.info> <20170627114821.GD3149@ando.pearwood.info> Message-ID: Argh, you're correct. Thanks for the catch. On Tue, Jun 27, 2017 at 10:06 AM Stephan Houben wrote: > Unfortunately this is existing syntax: > > a++b > is parsed as > a+(+b) > > Stephan > > > Op 27 jun. 2017 6:03 p.m. 
schreef "Joshua Morton" < > joshua.morton13 at gmail.com>: > > Just another syntactical suggestion: the binary ++ operator is used as > concat in various contexts in various languages, and is probably less > likely to confuse people as being either a logical or binary &. > > On Tue, Jun 27, 2017 at 6:53 AM Stephan Houben > wrote: > >> > Its the applications where it *is* important that >> > we should be looking at. >> >> Um, yes, but given our relative positions in this debate, >> the onus is not really on *me* to demonstrate such an application, right? >> That would just confuse everbody ;-) >> >> (FWIW, Sagemath is not mostly "numerical processing", it is mostly >> *symbolic* calculations and involves a lot of complex algorithms and >> datastructures, including sequences.) >> >> Stephan >> >> 2017-06-27 13:48 GMT+02:00 Steven D'Aprano : >> > On Tue, Jun 27, 2017 at 01:32:05PM +0200, Stephan Houben wrote: >> >> Hi Steven, >> >> >> >> To put this into perspective, I did some greps on Sagemath, >> >> being the largest Python project I have installed on this machine >> >> (1955 .py files). >> > >> > And one which is especially focused on numerical processing, not >> > really the sort of thing that does a much iterator chaining. That's >> > hardly a fair test -- we know there are applications where chaining is >> > not important at all. Its the applications where it *is* important that >> > we should be looking at. 
>> > >> > >> > -- >> > Steve >> > _______________________________________________ >> > Python-ideas mailing list >> > Python-ideas at python.org >> > https://mail.python.org/mailman/listinfo/python-ideas >> > Code of Conduct: http://python.org/psf/codeofconduct/ >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brenbarn at brenbarn.net Tue Jun 27 14:17:44 2017 From: brenbarn at brenbarn.net (Brendan Barnwell) Date: Tue, 27 Jun 2017 11:17:44 -0700 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <20170627071245.GY3149@ando.pearwood.info> Message-ID: <5952A148.4040900@brenbarn.net> On 2017-06-27 00:47, David Mertz wrote: > Maybe if you are confident a and b are exactly NumPy arrays it is > obvious, but what about: > > from some_array_library import a > from other_array_library import b > > What do you think `a & b` will do under your proposal? Yes, I understand > it's deterministic... but it's far from *obvious*. This isn't even > doing something pathological like defining both `.__iter__()` and > `.__and__()` on the same class... which honestly, isn't even all that > pathological; I can imagine real-world use cases. Hmmm, is the proposal really meant to include behavior that's global and non-overridable? My understanding was that the proposal would effectively be like defining a default __and__ (or whatever) on some basic iterator types. Individual iterables (or iterators) could still define their own magic methods to define their own behavior. Because of that, I wouldn't expect it to be obvious what would happen in your case.
If I import types from two random libraries, I can't expect to know what any operator does on them without reading their docs. Also because of that, I think it might be a bit much to expect this new concat operator to work on ALL iterables/iterators. Nothing else really works that way; types have to define their own operator behavior. Iterators can be instances of any class that defines a __next__, so I don't see how we could support the magic-concat-everything operator without interfering there. So. . . wouldn't it be somewhat more reasonable to define this concat operator only on actual generators, and perhaps instances of the common iterator types returned from zip, enumerate, etc.? Someone earlier in the thread said that would be "weird" because it would be less generic than itertools.chain, but it seems to me it would cover most of the needed use cases (including the one that was initially given as a motivating example). Also, this generator.__add__ could be smart about handling other iterables on the right of the operator, so you could do

(x for x in blah) + [1, 2, 3] + "hello"

. . .and, as long as you started with a regular generator, it could work, by having the magic method return a new instance of a type that also has this behavior. This would be a bit odd because I think most existing Python types don't try to do this kind of thing (subsuming many different types of right-operands). But I think it would be useful. In the interim, it could be played with by just making a class that implements the __add__ (or whatever), so you could do

cool_iterator(x for x in blah) + [1, 2, 3] + "hello"

. . . and just wrapping the leftmost operand would be enough to give you nice syntax for chaining all the rest. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail."
--author unknown From mertz at gnosis.cx Tue Jun 27 15:27:25 2017 From: mertz at gnosis.cx (David Mertz) Date: Tue, 27 Jun 2017 12:27:25 -0700 Subject: [Python-ideas] + operator on generators In-Reply-To: <20170627114443.GC3149@ando.pearwood.info> References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <20170627071245.GY3149@ando.pearwood.info> <20170627114443.GC3149@ando.pearwood.info> Message-ID: On Tue, Jun 27, 2017 at 4:44 AM, Steven D'Aprano wrote: > But that strikes me as overkill. You don't normally check for dunders > before using an operator, and we already have operators that can return > different types depending on the operands: > > % can mean modulo division or string interpolation > * can mean sequence repetition or numeric multiplication > + can mean numeric addition or sequence concatenation > > Why is > > & can mean iterable chaining or bitwise-AND > I don't think it's "uniquely confusing." Just more so than the other examples you give. For example, I might write functions like these (untested):

def modulo1(i: int, j: int) -> int:
    return i % j

def modulo2(s: str, t: tuple) -> str:
    return s % t

And similar examples for `*` and `+`. When I try to write this:

def ampersand(x: Iterable, y: Iterable) -> Iterable:
    return x & y

More ambiguity exists. The type signature works for both Numpy arrays and generators (under the proposed language feature), but the function does something different... in a way that is "more different" than I'd expect. That said, I like the idea of having iterators that act magically to fold in general iterables after an .__and__() or .__add__() as proposed by Brendan down-thread. Without any language change we could have:

chainable(x for x in blah) + [1, 2, 3] + "hello"

And I would like a language change that made a number of common iterable objects "chainable" without the wrapper. This wrapper could of course be used as a decorator too. E.g.
generator comprehensions, things returned by itertools functions, range(), enumerate(), zip(), etc. This wouldn't promise that EVERY iterable or iterator had that "chainable" behavior, but it would cover 90% of the use cases. And I wouldn't find it confusing because the leftmost object would be the one determining the behavior, which feels more intuitive and predictable. I don't hate `&&`, but I think this approach makes more sense. -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Tue Jun 27 15:49:15 2017 From: srkunze at mail.de (Sven R. Kunze) Date: Tue, 27 Jun 2017 21:49:15 +0200 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: References: <20170623190209.GQ3149@ando.pearwood.info> <20170623225603.GA27276@cskk.homeip.net> Message-ID: On 27.06.2017 13:41, Nick Coghlan wrote: > The shallow exception notion breaks a fairly fundamental refactoring > principle in Python: you should be able to replace an arbitrary > expression with a subfunction or subgenerator that produces the same > result without any of the surrounding code being able to tell the > difference. I would agree with you here but this "refactoring principle in Python" doesn't work for control flow. Just look at "return", "break", "continue" etc. Exceptions are another way of handling control flow. So, this doesn't apply here IMO. > By contrast, Steven's exception_guard recipe just takes the existing > "raise X from Y" feature, and makes it available as a context manager > and function decorator. 
I don't see how this helps differentiate shallow and nested exceptions such as:

try:
    with exception_guard(ImportError):
        import myspeciallib
except RuntimeError:  # catches shallow and nested ones
    import fallbacks.MySpecialLib as myspeciallib

Regards, Sven PS: this has nothing to do with cyclic imports. It can be a misconfiguration of the system which fails nested imports. In those cases, we fall back silently. From rosuav at gmail.com Tue Jun 27 16:03:34 2017 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 28 Jun 2017 06:03:34 +1000 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: References: <20170623190209.GQ3149@ando.pearwood.info> <20170623225603.GA27276@cskk.homeip.net> Message-ID: On Wed, Jun 28, 2017 at 5:49 AM, Sven R. Kunze wrote: > On 27.06.2017 13:41, Nick Coghlan wrote: >> >> The shallow exception notion breaks a fairly fundamental refactoring >> principle in Python: you should be able to replace an arbitrary >> expression with a subfunction or subgenerator that produces the same >> result without any of the surrounding code being able to tell the >> difference. > > > I would agree with you here but this "refactoring principle in Python" > doesn't work for control flow. > > Just look at "return", "break", "continue" etc. Exceptions are another way > of handling control flow. So, this doesn't apply here IMO. The ability to safely refactor control flow is part of why 'yield from' exists, and why PEP 479 changed how StopIteration bubbles. Local control flow is hard to refactor, but exceptions are global control flow, and most certainly CAN be refactored safely. ChrisA From srkunze at mail.de Tue Jun 27 16:29:51 2017 From: srkunze at mail.de (Sven R.
Kunze) Date: Tue, 27 Jun 2017 22:29:51 +0200 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <20170627071245.GY3149@ando.pearwood.info> <20170627114443.GC3149@ando.pearwood.info> Message-ID: <92543572-ea2a-bd54-a068-c7743fb67623@mail.de> On 27.06.2017 21:27, David Mertz wrote: > And I would like a language change that made a number of common > iterable objects "chainable" without the wrapper. This wrapper could > of course be used as a decorator too. > > E.g. generator comprehensions, things returned by itertools functions, > range(), enumerate(), zip(), etc. This wouldn't promise that EVERY > iterable or iterator had that "chainable" behavior, but it would cover > 90% of the use cases. And I wouldn't find it confusing because the > leftmost object would be the one determining the behavior, which feels > more intuitive and predictable. I think that most people in favor of this proposal agree with you here. Let's start with something simple which can be extended bit by bit to cover more and more use-cases. I for one would also include right-handed iterators/generators because I definitely know of real-world usage (just recently), but that can wait until you feel comfortable with it as well. If it's a language change, I would like the plus operator to be it, as it integrates well with lists, such as

generator + (list1 + list2)

It can sometimes be necessary to group things up like this. I would find using a different operator symbol here confusing. Plus looks like concat to me. Regards, Sven -------------- next part -------------- An HTML attachment was scrubbed...
URL: From python-ideas at mgmiller.net Tue Jun 27 16:57:56 2017 From: python-ideas at mgmiller.net (Mike Miller) Date: Tue, 27 Jun 2017 13:57:56 -0700 Subject: [Python-ideas] + operator on generators In-Reply-To: <20170626032336.GV3149@ando.pearwood.info> References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> Message-ID: <5a60b858-688c-538c-00b5-a0c8ab73bbaf@mgmiller.net> On 2017-06-25 20:23, Steven D'Aprano wrote: > I have a counter-proposal: introduce the iterator chaining operator "&": > > iterable & iterable --> itertools.chain(iterable, iterable) > I like this suggestion. Here's another color that might be less controversial:

iterable3 = iterable1.chain(iterable2)

Perhaps more obvious than &, easier to use than "from itertools import chain...". -Mike From mertz at gnosis.cx Tue Jun 27 17:02:52 2017 From: mertz at gnosis.cx (David Mertz) Date: Tue, 27 Jun 2017 14:02:52 -0700 Subject: [Python-ideas] + operator on generators In-Reply-To: <5a60b858-688c-538c-00b5-a0c8ab73bbaf@mgmiller.net> References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <5a60b858-688c-538c-00b5-a0c8ab73bbaf@mgmiller.net> Message-ID: On Tue, Jun 27, 2017 at 1:57 PM, Mike Miller wrote: > I like this suggestion. Here's another color that might be less > controversial: > > iterable3 = iterable1.chain(iterable2) > How do you chain it1, it2, it3, etc? I guess `it1.chain(it2.chain(it3))` ... but that starts to become distinctly less readable IMO. I'd much rather spell `chain(it1, it2, it3)`. -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From brenbarn at brenbarn.net Tue Jun 27 17:05:40 2017 From: brenbarn at brenbarn.net (Brendan Barnwell) Date: Tue, 27 Jun 2017 14:05:40 -0700 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <5a60b858-688c-538c-00b5-a0c8ab73bbaf@mgmiller.net> Message-ID: <5952C8A4.5050207@brenbarn.net> On 2017-06-27 14:02, David Mertz wrote: > On Tue, Jun 27, 2017 at 1:57 PM, Mike Miller > wrote: > > I like this suggestion. Here's another color that might be less > controversial: > > iterable3 = iterable1.chain(iterable2) > > > How do you chain it1, it2, it3, etc? > > I guess `it1.chain(it2.chain(it3)))` ... but that starts to become > distinctly less readable IMO. I'd much rather spell `chain(it1, it2, it3)`. Even if this "chain" only took one argument, you could do it1.chain(it2).chain(it3). But I don't see why it couldn't take multiple arguments as you suggest. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown From python-ideas at mgmiller.net Tue Jun 27 17:10:49 2017 From: python-ideas at mgmiller.net (Mike Miller) Date: Tue, 27 Jun 2017 14:10:49 -0700 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <5a60b858-688c-538c-00b5-a0c8ab73bbaf@mgmiller.net> Message-ID: On 2017-06-27 14:02, David Mertz wrote: > iterable3 = iterable1.chain(iterable2) > > > How do you chain it1, it2, it3, etc? Why not: iterable5 = iterable1.chain(iterable2, iterable3, iterable4) ? i.e. Couldn't a class method do this with itertools.chain() under the hood? 
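It could. A sketch of such a wrapper (the class name `chainable` is invented here; Brendan's `cool_iterator` earlier in the thread is the same idea), with itertools.chain doing the work and a `+` spelling thrown in:

```python
from itertools import chain as _chain

class chainable:
    """Iterator wrapper adding a .chain() method and '+' support,
    deferring to itertools.chain under the hood."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def chain(self, *others):
        # Accept any number of iterables, like itertools.chain itself.
        return chainable(_chain(self._it, *others))

    __add__ = chain  # wrap the leftmost operand once, then just use '+'

print(list(chainable(range(2)).chain([2, 3], "ab")))   # [0, 1, 2, 3, 'a', 'b']
print(list(chainable(x * x for x in range(3)) + [9]))  # [0, 1, 4, 9]
```

Because `chain()`/`__add__` return another `chainable`, calls and `+` both compose left to right without nesting.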
From python-ideas at mgmiller.net Tue Jun 27 17:13:29 2017 From: python-ideas at mgmiller.net (Mike Miller) Date: Tue, 27 Jun 2017 14:13:29 -0700 Subject: [Python-ideas] + operator on generators In-Reply-To: <5952C8A4.5050207@brenbarn.net> References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <5a60b858-688c-538c-00b5-a0c8ab73bbaf@mgmiller.net> <5952C8A4.5050207@brenbarn.net> Message-ID: <6af7e196-0053-8f2f-4812-4872ebcea7c7@mgmiller.net> On 2017-06-27 14:05, Brendan Barnwell wrote: > Even if this "chain" only took one argument, you could do > it1.chain(it2).chain(it3). But I don't see why it couldn't take multiple > arguments as you suggest. > Right, and as I forgot to mention, making it a built-in is an uphill battle with higher backward compatibility concerns. From wes.turner at gmail.com Tue Jun 27 19:34:11 2017 From: wes.turner at gmail.com (Wes Turner) Date: Tue, 27 Jun 2017 18:34:11 -0500 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> Message-ID: On Monday, June 26, 2017, Wes Turner wrote: > > > On Sunday, June 25, 2017, Wes Turner > wrote: > >> >> >> On Sunday, June 25, 2017, Danilo J. S. Bellini >> wrote: >> >>> On Sun, Jun 25, 2017 at 3:06 PM, lucas via Python-ideas < >>> python-ideas at python.org> wrote: >>> >>>> I often use generators, and itertools.chain on them. 
>>>> What about providing something like the following:
>>>>
>>>> a = (n for n in range(2))
>>>> b = (n for n in range(2, 4))
>>>> tuple(a + b) # -> 0 1 2 3
>>>
>>> AudioLazy does that: https://github.com/danilobellini/audiolazy
>>
>> - http://toolz.readthedocs.io/en/latest/api.html#toolz.itertoolz.concat
>> and concatv
>> - https://github.com/kachayev/fn.py#streams-and-infinite-sequences-declaration
>> - Stream() << obj
>
> << is __lshift__()
> <<= is __ilshift__()
>
> https://docs.python.org/2/library/operator.html
>
> Do Stream() and __lshift__() from fn.py not solve this here?

In this syntax example, iter1 is mutated before iteration:

iter_x = Stream(iter1) << iter2 << iter3

iter1 <<= iter5 # IDK if fn.py yet has <<=

list(iter1)
list(iter_x)

>>> --
>>> Danilo J. S. Bellini
>>> ---------------
>>> "*It is not our business to set up prohibitions, but to arrive at
>>> conventions.*" (R. Carnap)

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From mistersheik at gmail.com Tue Jun 27 21:48:05 2017
From: mistersheik at gmail.com (Neil Girdhar)
Date: Tue, 27 Jun 2017 18:48:05 -0700 (PDT)
Subject: [Python-ideas] Dictionary destructing and unpacking.
In-Reply-To: References: <5938EF88.2040408@canterbury.ac.nz> <22840.63866.935987.467598@turnbull.sk.tsukuba.ac.jp>
Message-ID: <41034ce5-1d8d-4eb0-b497-f2d12ba8b586@googlegroups.com>

By the way, this is already in the CPython source code if I remember correctly from when I worked on PEP 448. The code for dict unpacking is merely blocked.

I like this syntax from a purity standpoint, but I don't think I would use it that much.
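Until anything like the blocked dict-unpacking support is actually exposed as syntax, the effect can be approximated with a small helper function; this is only a sketch (the `unpack` name and signature are invented here), and unlike real syntax it offers no compile-time checking:

```python
def unpack(mapping, *keys):
    """Hypothetical helper approximating dict destructuring:
    returns the values for *keys plus a dict of the remaining
    items, without mutating the original mapping."""
    rest = dict(mapping)
    values = [rest.pop(k) for k in keys]  # KeyError if a key is missing
    return (*values, rest)

foo = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
bar, baz, rest = unpack(foo, 'a', 'b')
print(bar, baz)  # -> 1 2
print(rest)      # -> {'c': 3, 'd': 4}
print(foo)       # original dict left intact
```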
On Thursday, June 8, 2017 at 3:18:21 PM UTC-4, Nick Badger wrote: > > Well, it's not deliberately not destructive, but I'd be more in favor of > dict unpacking-assignment if it were spelled more like this: > > >>> foo = {'a': 1, 'b': 2, 'c': 3, 'd': 4} > >>> {'a': bar, 'b': baz, **rest} = foo > >>> bar > 1 > >>> baz > 2 > >>> rest > {'c': 3, 'd': 4} > >>> foo > {'a': 1, 'b': 2, 'c': 3, 'd': 4} > > That also takes care of the ordering issue, and any ambiguity about "am I > unpacking the keys, the values, or both?", at the cost of a bit more > typing. However, I'm a bit on the fence about this syntax as well: it's > pretty easily confused with dictionary creation. Maybe the same thing but > without the brackets? > > Just a thought I had this morning. > Nick > > > Nick Badger > https://www.nickbadger.com > > 2017-06-08 7:00 GMT-07:00 Nick Coghlan >: > >> On 8 June 2017 at 17:49, Paul Moore > >> wrote: >> > On 8 June 2017 at 08:15, Stephen J. Turnbull >> > > wrote: >> >> If you like this feature, and wish it were in Python, I genuinely wish >> >> you good luck getting it in. My point is just that in precisely that >> >> use case I wouldn't be passing dictionaries that need destructuring >> >> around. I believe that to be the case for most Pythonistas. >> >> (Although several have posted in favor of some way to destructure >> >> dictionaries, typically those in favor of the status quo don't speak >> >> up until it looks like there will be a change.) >> > >> > The most common use case I find for this is when dealing with JSON (as >> > someone else pointed out). But that's a definite case of dealing with >> > data in a format that's "unnatural" for Python (by definition, JSON is >> > "natural" for JavaScript). While having better support for working >> > with JSON would be nice, I typically find myself wishing for better >> > JSON handling libraries (ones that deal better with mappings with >> > known keys) than for language features. 
But of course, I could write >> > such a library myself, if it mattered sufficiently to me - and it >> > never seems *that* important :-) >> >> Aye, I've had good experiences with using JSL to define JSON schemas >> for ad hoc JSON data structures that didn't already have them: >> https://jsl.readthedocs.io/en/latest/ >> >> And then, if you really wanted to, something like JSON Schema Objects >> provides automated destructuring and validation based on those >> schemas: >> https://python-jsonschema-objects.readthedocs.io/en/latest/Introduction.html >> >> However, it really isn't an ad hoc scripting friendly way to go - it's >> an "I'm writing a tested-and-formally-released application and want to >> strictly manage the data processing boundaries between components" >> style solution. >> >> pandas.read_json is pretty nice >> ( >> https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_json.html >> ), >> but would be a heavy dependency to bring in *just* for JSON -> >> DataFrame conversions. >> >> For myself, the things I mainly miss are: >> >> * getitem/setitem/delitem counterparts to getattr/setattr/delattr >> * getattrs and getitems builtins for retrieving multiple attributes or >> items in a single call (with the default value for missing results >> moved to a keyword-only argument) >> >> Now, these aren't hard to write yourself (and you can even use >> operator.attrgetter and operator.itemgetter as part of building them), >> but it's a sufficiently irritating niggle not to have them at my >> fingertips whenever they'd be convenient that I'll often end up >> writing out the long form equivalents instead. >> >> Are these necessary? Clearly not (although we did decide >> operator.itemgetter and operator.attrgetter were important enough to >> add for use with the map() and filter() builtins and other itertools). >> >> Is it a source of irritation that they're not there? Absolutely, at >> least for me. >> >> Cheers, >> Nick. >> >> P.S. 
Just clearly not irritating enough for me to actually put a patch >> together and push for a final decision one way or the other regarding >> adding them ;) >> >> -- >> Nick Coghlan | ncog... at gmail.com | Brisbane, >> Australia >> _______________________________________________ >> Python-ideas mailing list >> Python... at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Jun 27 22:25:12 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 28 Jun 2017 12:25:12 +1000 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: References: <20170623190209.GQ3149@ando.pearwood.info> <20170623225603.GA27276@cskk.homeip.net> Message-ID: On 28 June 2017 at 06:03, Chris Angelico wrote: > On Wed, Jun 28, 2017 at 5:49 AM, Sven R. Kunze wrote: >> I would agree with you here but this "refactoring principle in Python" >> doesn't work for control flow. >> >> Just look at "return", "break", "continue" etc. Exceptions are another way >> of handling control flow. So, this doesn't apply here IMO. > > The ability to safely refactor control flow is part of why 'yield > from' exists, and why PEP 479 changed how StopIteration bubbles. Local > control flow is hard to refactor, but exceptions are global control > flow, and most certainly CAN be refactored safely. And PEP 479 establishes a precedent for how we handle the cases where we decide we *don't* want a particular exception type to propagate normally: create a boundary on the stack that converts the otherwise ambiguous exception type to RuntimeError. 
While generator functions now do that implicitly for StopIteration, and "raise X from Y" lets people write suitable exception handlers themselves, we don't offer an easy way to do it with a context manager (with statement as stack boundary), asynchronous context manager (async with statement as stack boundary), or a function decorator (execution frame as stack boundary). So while I prefer "contextlib.convert_exception" as the name (rather than the "exception_guard" Steven used in his recipe), I'd definitely be open to a bugs.python.org RFE and a PR against contextlib to add such a construct to Python 3.7. We'd have a few specific API details to work out (e.g. whether or not to accept arbitrary conversion functions as conversion targets in addition to accepting exception types and iterables of exception types, whether or not to allow "None" as the conversion target to get the same behaviour as "contextlib.suppress"), but I'm already sold on the general concept. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Tue Jun 27 22:47:21 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 28 Jun 2017 12:47:21 +1000 Subject: [Python-ideas] + operator on generators In-Reply-To: <6af7e196-0053-8f2f-4812-4872ebcea7c7@mgmiller.net> References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <5a60b858-688c-538c-00b5-a0c8ab73bbaf@mgmiller.net> <5952C8A4.5050207@brenbarn.net> <6af7e196-0053-8f2f-4812-4872ebcea7c7@mgmiller.net> Message-ID: On 28 June 2017 at 07:13, Mike Miller wrote: > > On 2017-06-27 14:05, Brendan Barnwell wrote: >> >> Even if this "chain" only took one argument, you could do >> it1.chain(it2).chain(it3). But I don't see why it couldn't take multiple >> arguments as you suggest. > > Right, and as I forgot to mention, making it a built-in is an uphill battle > with higher backward compatibility concerns. 
While I haven't been following this thread closely, I'd like to note
that arguing for a "chain()" builtin has the virtue that it would just be
arguing for the promotion of the existing itertools.chain function
into the builtin namespace.

Such an approach has a lot to recommend it:

1. It has precedent, in that Python 3's map(), filter(), and zip(),
are essentially Python 2's itertools.imap(), ifilter(), and izip()
2. There's no need for a naming or semantics debate, as we'd just be
promoting an established standard library API into the builtin
namespace
3. Preserving compatibility with older versions is straightforward:
just do an unconditional "from itertools import chain"
4. As an added bonus, we'd also get "chain.from_iterable" as a builtin API

So it would be good to have a short PEP that argued that since
chaining arbitrary iterables is at least as important as mapping,
filtering, and zipping them, itertools.chain should be added to the
builtin namespace in 3.7+ (but no, I'm not volunteering to write that
myself).

As a *separate* discussion, folks could then also argue for the
addition of a `__lshift__` operator implementation specifically for
iterator chains that lets you write:

full_chain = chain(it1) << it2 << it3 # Incrementally create new chains
full_chain <<= it4 # Extend an existing chain

I'd be surprised if such a proposal got accepted for 3.7, but it would
make a good follow-up discussion for 3.8 (assuming chain() made it
into the 3.7 builtins).

Cheers, Nick.
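The '<<' spelling is easy to experiment with today via a thin wrapper; purely a sketch (this class is not the real itertools.chain, and the operator choice is exactly what would be up for debate):

```python
from itertools import chain as _chain

class chain:
    """Hypothetical chain type supporting the proposed '<<'
    spelling; a thin wrapper over itertools.chain."""

    def __init__(self, *iterables):
        self._it = _chain(*iterables)

    def __iter__(self):
        return self._it

    def __lshift__(self, other):
        # chain(it1) << it2 returns a new, longer chain
        return chain(self._it, other)

full_chain = chain([1, 2]) << (n for n in (3, 4)) << [5]
full_chain <<= [6]          # '<<=' falls back to '<<' plus rebinding
print(list(full_chain))     # -> [1, 2, 3, 4, 5, 6]
```

Since no `__ilshift__` is defined, the augmented form simply rebinds the name to a new chain, which matches the "extend an existing chain" usage above.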
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rosuav at gmail.com Tue Jun 27 23:16:04 2017 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 28 Jun 2017 13:16:04 +1000 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: References: <20170623190209.GQ3149@ando.pearwood.info> <20170623225603.GA27276@cskk.homeip.net> Message-ID: On Wed, Jun 28, 2017 at 12:25 PM, Nick Coghlan wrote: > While generator functions now do that implicitly for StopIteration, > and "raise X from Y" lets people write suitable exception handlers > themselves, we don't offer an easy way to do it with a context manager > (with statement as stack boundary), asynchronous context manager > (async with statement as stack boundary), or a function decorator > (execution frame as stack boundary). > > So while I prefer "contextlib.convert_exception" as the name (rather > than the "exception_guard" Steven used in his recipe), I'd definitely > be open to a bugs.python.org RFE and a PR against contextlib to add > such a construct to Python 3.7. > > We'd have a few specific API details to work out (e.g. whether or not > to accept arbitrary conversion functions as conversion targets in > addition to accepting exception types and iterables of exception > types, whether or not to allow "None" as the conversion target to get > the same behaviour as "contextlib.suppress"), but I'm already sold on > the general concept. I agree broadly, but I'm sure there'll be the usual ton of bikeshedding about the details. The idea behind this decorator, AIUI, is a declaration that "a FooException coming out of here is a bug", and if I were looking for that, I'd look for something about the function leaking an exception, or preventing exceptions. So maybe convert_exception will work, but definitely have a docs reference from contextlib.suppress to this ("if exceptions of this type would indicate code bugs, consider convert_exception instead"). 
In my testing, I've called it "no_leak" or some variant thereon,
though that's a shorthand that wouldn't suit the stdlib.

ChrisA

From tjreedy at udel.edu Wed Jun 28 00:30:15 2017
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 28 Jun 2017 00:30:15 -0400
Subject: [Python-ideas] + operator on generators
In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <5a60b858-688c-538c-00b5-a0c8ab73bbaf@mgmiller.net> <5952C8A4.5050207@brenbarn.net> <6af7e196-0053-8f2f-4812-4872ebcea7c7@mgmiller.net>
Message-ID:

On 6/27/2017 10:47 PM, Nick Coghlan wrote:
> While I haven't been following this thread closely, I'd like to note
> that arguing for a "chain()" builtin has the virtue that would just be
> arguing for the promotion of the existing itertools.chain function
> into the builtin namespace.
>
> Such an approach has a lot to recommend it:
>
> 1. It has precedent, in that Python 3's map(), filter(), and zip(),
> are essentially Python 2's itertools.imap(), ifilter(), and izip()
> 2. There's no need for a naming or semantics debate, as we'd just be
> promoting an established standard library API into the builtin
> namespace

A counter-argument is that there are other itertools that deserve promotion, by usage, even more. But we need to see comparisons from more than one limited corpus.

On the other hand, there might be a theory argument that chain is somehow more basic, akin to map, etc, in a way that others are not.

> 3. Preserving compatibility with older versions is straightforward:
> just do an unconditional "from itertools import chain"
> 4. As an added bonus, we'd also get "chain.from_iterable" as a builtin API
>
> So it would be good to have a short PEP that argued that since
> chaining arbitrary iterables is at least as important as mapping,
> filtering, and zipping them, itertools.chain should be added to the
> builtin namespace in 3.7+ (but no, I'm not volunteering to write that
> myself).
> > As a *separate* discussion, folks could then also argue for the > additional of a `__lshift__` operator implementation specifically to > iterator chains that let you write: > > full_chain = chain(it1) << it2 << it3 # Incrementally create new chains > full_chain <<= it4 # Extend an existing chain > > I'd be surprised if such a proposal got accepted for 3.7, but it would > make a good follow-up discussion for 3.8 (assuming chain() made it > into the 3.7 builtins). -- Terry Jan Reedy From ncoghlan at gmail.com Wed Jun 28 02:00:18 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 28 Jun 2017 16:00:18 +1000 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: References: <20170623190209.GQ3149@ando.pearwood.info> <20170623225603.GA27276@cskk.homeip.net> Message-ID: On 28 June 2017 at 13:16, Chris Angelico wrote: > On Wed, Jun 28, 2017 at 12:25 PM, Nick Coghlan wrote: >> While generator functions now do that implicitly for StopIteration, >> and "raise X from Y" lets people write suitable exception handlers >> themselves, we don't offer an easy way to do it with a context manager >> (with statement as stack boundary), asynchronous context manager >> (async with statement as stack boundary), or a function decorator >> (execution frame as stack boundary). >> >> So while I prefer "contextlib.convert_exception" as the name (rather >> than the "exception_guard" Steven used in his recipe), I'd definitely >> be open to a bugs.python.org RFE and a PR against contextlib to add >> such a construct to Python 3.7. >> >> We'd have a few specific API details to work out (e.g. whether or not >> to accept arbitrary conversion functions as conversion targets in >> addition to accepting exception types and iterables of exception >> types, whether or not to allow "None" as the conversion target to get >> the same behaviour as "contextlib.suppress"), but I'm already sold on >> the general concept. 
> > I agree broadly, but I'm sure there'll be the usual ton of > bikeshedding about the details. The idea behind this decorator, AIUI, > is a declaration that "a FooException coming out of here is a bug", > and if I were looking for that, I'd look for something about the > function leaking an exception, or preventing exceptions. So maybe > convert_exception will work, but definitely have a docs reference from > contextlib.suppress to this ("if exceptions of this type would > indicate code bugs, consider convert_exception instead"). In my > testing, I've called it "no_leak" or some variant thereon, though > that's a shorthand that wouldn't suit the stdlib. Right, and I'd like us to keep in mind the KeyError -> AttributeError (and vice-versa) use case as well. Similar to ExitStack, it would be appropriate to make some additions to the "recipes" section in the docs that covered things like "Keep AttributeError from being suppressed in a property implementation". Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Jun 28 02:06:41 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 28 Jun 2017 16:06:41 +1000 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <5a60b858-688c-538c-00b5-a0c8ab73bbaf@mgmiller.net> <5952C8A4.5050207@brenbarn.net> <6af7e196-0053-8f2f-4812-4872ebcea7c7@mgmiller.net> Message-ID: On 28 June 2017 at 14:30, Terry Reedy wrote: > On 6/27/2017 10:47 PM, Nick Coghlan wrote: >> Such an approach has a lot to recommend it: >> >> 1. It has precedent, in that Python 3's map(), filter(), and zip(), >> are essentially Python 2's itertools.imap(), ifilter(), and izip() >> 2. 
There's no need for a naming or semantics debate, as we'd just be >> promoting an established standard library API into the builtin >> namespace > > A counter-argument is that there are other itertools that deserve promotion, > by usage, even more. But we need to see comparisons from more that one > limited corpus. > > On the other hand, there might be a theory argument that chain is somehow > more basic, akin to map, etc, in a way that others are not. The main rationale I see is the one that kicked off the most recent discussion, which is that in Python 2, you could readily chain the output of map(), filter(), zip(), range(), dict.keys(), dict.values(), dict.items(), etc together with "+", simply because they all returned concrete lists. In Python 3, you can't do that as easily anymore, since they all return either iterators or computed containers that don't support "+". While there are good reasons not to implement "+" on those iterators and custom containers, we *can* fairly easily restore builtin concatentation support for their outputs, and we can do it in a way that's friendly to all implementations that already provide the itertools.chain API. Cheers, Nick. 
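The restriction described here is easy to demonstrate: in Python 3 these builders return iterators and views that reject '+', while itertools.chain happily concatenates any mix of them (a minimal illustration, not an argument about performance):

```python
from itertools import chain

doubled = map(lambda n: 2 * n, range(3))  # an iterator in Python 3
try:
    doubled + range(3)  # this spelling worked on Python 2's lists
except TypeError as err:
    print("no '+' support:", err)

# chain() restores the old convenience for any mix of iterables:
d = {'a': 1, 'b': 2}
combined = chain(map(lambda n: 2 * n, range(3)), d.keys(), range(2))
print(list(combined))  # -> [0, 2, 4, 'a', 'b', 0, 1]
```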
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Wed Jun 28 04:54:01 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 28 Jun 2017 09:54:01 +0100 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <5a60b858-688c-538c-00b5-a0c8ab73bbaf@mgmiller.net> <5952C8A4.5050207@brenbarn.net> <6af7e196-0053-8f2f-4812-4872ebcea7c7@mgmiller.net> Message-ID: On 28 June 2017 at 05:30, Terry Reedy wrote: > On 6/27/2017 10:47 PM, Nick Coghlan wrote: > >> While I haven't been following this thread closely, I'd like to note >> that arguing for a "chain()" builtin has the virtue that would just be >> arguing for the promotion of the existing itertools.chain function >> into the builtin namespace. >> >> Such an approach has a lot to recommend it: >> >> 1. It has precedent, in that Python 3's map(), filter(), and zip(), >> are essentially Python 2's itertools.imap(), ifilter(), and izip() >> 2. There's no need for a naming or semantics debate, as we'd just be >> promoting an established standard library API into the builtin >> namespace > > > A counter-argument is that there are other itertools that deserve promotion, > by usage, even more. But we need to see comparisons from more that one > limited corpus. Indeed. I don't recall *ever* using itertools.chain myself. I'd be interested in seeing some usage stats to support this proposal. As an example, I see 8 uses of itertools.chain in pip and its various vendored packages, as opposed to around 30 uses of map (plus however many list comprehensions are used in place of maps). On a very brief scan, it looks like the various other itertools are used less than chain, but with only 8 uses of chain, it's not really possible to read anything more into the relative frequencies. 
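A rough way to gather such numbers across more than one codebase is a short ast walk; a sketch only (it counts call sites by name, so it misses aliased imports like "from itertools import chain as c" and over-counts unrelated functions that happen to share a name):

```python
import ast

def count_calls(source, names=('chain', 'map', 'filter')):
    """Count call sites per name in one module's source.
    Matches bare names and attribute calls like itertools.chain."""
    counts = dict.fromkeys(names, 0)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            name = getattr(func, 'id', None) or getattr(func, 'attr', None)
            if name in counts:
                counts[name] += 1
    return counts

src = """
from itertools import chain
x = list(chain([1], [2]))
y = map(str, x)
"""
print(count_calls(src))  # -> {'chain': 1, 'map': 1, 'filter': 0}
```

Run over each file of a project (e.g. via pathlib.Path.rglob), this gives per-corpus counts comparable to the pip numbers above.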
Paul From ncoghlan at gmail.com Wed Jun 28 05:09:41 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 28 Jun 2017 19:09:41 +1000 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <5a60b858-688c-538c-00b5-a0c8ab73bbaf@mgmiller.net> <5952C8A4.5050207@brenbarn.net> <6af7e196-0053-8f2f-4812-4872ebcea7c7@mgmiller.net> Message-ID: On 28 June 2017 at 18:54, Paul Moore wrote: > On 28 June 2017 at 05:30, Terry Reedy wrote: >> A counter-argument is that there are other itertools that deserve promotion, >> by usage, even more. But we need to see comparisons from more that one >> limited corpus. > > Indeed. I don't recall *ever* using itertools.chain myself. I'd be > interested in seeing some usage stats to support this proposal. As an > example, I see 8 uses of itertools.chain in pip and its various > vendored packages, as opposed to around 30 uses of map (plus however > many list comprehensions are used in place of maps). On a very brief > scan, it looks like the various other itertools are used less than > chain, but with only 8 uses of chain, it's not really possible to read > anything more into the relative frequencies. The other thing to look for would be list() and list.extend() calls. I know I use those quite a bit in combination with str.join, where I don't actually *need* a list, it's just currently the most convenient way to accumulate all the components I'm planning to combine. And if you're converting from Python 2 code, then adding a few list() calls in critical places in order to keep the obj.extend() calls working is likely to be easier in many cases than switching over to using itertools.chain. 
For simple cases, that's fine (since a list of direct references will be lower overhead than accumulating a chain of short iterables), but without builtin support for iterator concatenation, it's currently a nonlocal refactoring (to add the "from itertools import chain" at the top of the file) to switch completely to the "pervasive iterators" model when folks actually do want to do that.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From srkunze at mail.de Wed Jun 28 07:35:00 2017
From: srkunze at mail.de (Sven R. Kunze)
Date: Wed, 28 Jun 2017 13:35:00 +0200
Subject: [Python-ideas] + operator on generators
In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <5a60b858-688c-538c-00b5-a0c8ab73bbaf@mgmiller.net> <5952C8A4.5050207@brenbarn.net> <6af7e196-0053-8f2f-4812-4872ebcea7c7@mgmiller.net>
Message-ID: <964dd489-430e-fe9e-4649-b36805fa51d7@mail.de>

On 28.06.2017 11:09, Nick Coghlan wrote:
> The other thing to look for would be list() and list.extend() calls. I
> know I use those quite a bit in combination with str.join, where I
> don't actually *need* a list, it's just currently the most convenient
> way to accumulate all the components I'm planning to combine. And if
> you're converting from Python 2 code, then adding a few list() calls
> in critical places in order to keep the obj.extend() calls working is
> likely to be easier in many cases than switching over to using
> itertools.chain.

This is exactly the reason why I also doubt that Stephan's Sagemath stats tell us anything beyond "chain isn't used that much". Iterators are only a nice-to-have if you work with simple lists of up to 1000 items; current hardware can absorb that cost for you. There are simply more readable ways of "chaining sequences" in many cases, even if you are already on Python 3.

In the end, lists and the "+" operator are the best way of "doing sequences".
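Whether materialized lists really are "best" is at least measurable; a rough micro-benchmark sketch (absolute numbers depend entirely on list sizes and the machine, so this decides nothing by itself):

```python
import timeit

setup = "from itertools import chain; a = list(range(1000)); b = list(range(1000))"

# Consume both spellings the same way so only the
# concatenation strategy differs.
t_plus = timeit.timeit("for x in a + b: pass", setup=setup, number=1000)
t_chain = timeit.timeit("for x in chain(a, b): pass", setup=setup, number=1000)

print(f"a + b:       {t_plus:.4f}s")
print(f"chain(a, b): {t_chain:.4f}s")
```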
Regards, Sven

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From erik.m.bray at gmail.com Wed Jun 28 07:40:50 2017
From: erik.m.bray at gmail.com (Erik Bray)
Date: Wed, 28 Jun 2017 13:40:50 +0200
Subject: [Python-ideas] Asynchronous exception handling around with/try statement borders
Message-ID:

Hi folks,

I normally wouldn't bring something like this up here, except I think that there is a possibility of something to be done--a language documentation clarification if nothing else, though possibly an actual code change as well.

I've been having an argument with a colleague over the last couple of days about the proper order of statements when setting up a try/finally to perform cleanup of some action. On some level we're both being stubborn, I think, and I'm not looking for a resolution as to who's right or wrong, or I wouldn't bring it to this list in the first place.

The original argument was over setting and later restoring os.environ, but we ended up arguing over threading.Lock.acquire/release, which I think is a more interesting example of the problem, and he did raise a good point that I do want to bring up.

My colleague's contention is that given

lock = threading.Lock()

this is simply *wrong*:

lock.acquire()
try:
    do_something()
finally:
    lock.release()

whereas this is okay:

with lock:
    do_something()

Ignoring other details of how threading.Lock is actually implemented, and assuming that Lock.__enter__ calls acquire() and Lock.__exit__ calls release(), then as far as I've known ever since Python 2.5 first came out these two examples are semantically *equivalent*, and I can't find any way of reading PEP 343 or the Python language reference that would suggest otherwise.

However, there *is* a difference, and it has to do with how signals are handled, particularly w.r.t.
context managers implemented in C (hence we are talking CPython specifically):

If Lock.__enter__ is a pure Python method (even if it calls some C methods), and a SIGINT is handled during execution of that method, then in almost all cases a KeyboardInterrupt exception will be raised from within Lock.__enter__--this means the suite under the with: statement is never evaluated, and Lock.__exit__ is never called. You can be fairly sure the KeyboardInterrupt will be raised from somewhere within a pure Python Lock.__enter__ because there will usually be at least one remaining opcode to be evaluated, such as RETURN_VALUE. Because of how delayed execution of signal handlers is implemented in the pyeval main loop, this means the signal handler for SIGINT will be called *before* RETURN_VALUE, resulting in the KeyboardInterrupt exception being raised. Standard stuff.

However, if Lock.__enter__ is a PyCFunction things are quite different. If you look at how the SETUP_WITH opcode is implemented, it first calls the __enter__ method with _PyObject_CallNoArg. If this returns NULL (i.e. an exception occurred in __enter__) then "goto error" is executed and the exception is raised. However, if it returns non-NULL, the finally block is set up with PyFrame_BlockSetup and execution proceeds to the next opcode. At this point a potentially waiting SIGINT is handled, resulting in KeyboardInterrupt being raised while inside the with statement's suite, so the finally block, and hence Lock.__exit__, are entered.

Long story short, because Lock.__enter__ is a C function, assuming that it succeeds normally then

with lock:
    do_something()

always guarantees that Lock.__exit__ will be called if a SIGINT was handled inside Lock.__enter__, whereas with

lock.acquire()
try:
    ...
finally:
    lock.release()
So the end result is that the lock is held and never released after the KeyboardInterrupt (whether or not it's handled somehow). Whereas, again, if Lock.__enter__ is a pure Python function there's less likely to be any difference (though I don't think the possibility can be ruled out entirely). At the very least I think this quirk of CPython should be mentioned somewhere (since in all other cases the semantic meaning of the "with:" statement is clear). However, I think it might be possible to gain more consistency between these cases if pending signals are checked/handled after any direct call to PyCFunction from within the ceval loop. Sorry for the tl;dr; any thoughts? From srkunze at mail.de Wed Jun 28 07:48:16 2017 From: srkunze at mail.de (Sven R. Kunze) Date: Wed, 28 Jun 2017 13:48:16 +0200 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: References: <20170623190209.GQ3149@ando.pearwood.info> <20170623225603.GA27276@cskk.homeip.net> Message-ID: On 28.06.2017 08:00, Nick Coghlan wrote: > Right, and I'd like us to keep in mind the KeyError -> AttributeError > (and vice-versa) use case as well. Similar to ExitStack, it would be > appropriate to make some additions to the "recipes" section in the > docs that covered things like "Keep AttributeError from being > suppressed in a property implementation". As it was snipped away, let me ask again: I don't see how this helps differentiating shallow and nested exceptions such as: try: with exception_guard(ImportError): import myspeciallib except RuntimeError: # catches shallow and nested ones import fallbacks.MySpecialLib as myspeciallib At least in my tests, exception_guard works this way and I don't see any improvements to current behavior. Moreover, I am somewhat skeptical that using this recipe will really improve the situation. It's a lot of code where users don't have any stdlib support. I furthermore doubt that all Python coders will now wrap their properties using the guard. 
So, using these properties, we will see almost no improvement. I still don't see it as the responsibility of the coder of the property to guard against anything. Nobody is forced to catch exceptions when using a property.

If that's the "best" outcome, I will stick to https://stackoverflow.com/questions/20459166/how-to-catch-an-importerror-non-recursively because 1) Google finds it for me and 2) we don't have to maintain 100 lines of code ourselves.

Regards, Sven

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From k7hoven at gmail.com Wed Jun 28 08:00:55 2017
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Wed, 28 Jun 2017 15:00:55 +0300
Subject: [Python-ideas] + operator on generators
In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <5a60b858-688c-538c-00b5-a0c8ab73bbaf@mgmiller.net> <5952C8A4.5050207@brenbarn.net> <6af7e196-0053-8f2f-4812-4872ebcea7c7@mgmiller.net>
Message-ID:

On Wed, Jun 28, 2017 at 11:54 AM, Paul Moore wrote:
> Indeed. I don't recall *ever* using itertools.chain myself.

? In fact, me neither. Or maybe a couple of times. For such a basic task, it feels more natural to write a generator function, or even turn it into a list, if you can be sure that the 'unnecessary' lists will be small and that the code won't be a performance bottleneck.

To reiterate on this some more: One of the nice things about Python 3 is (or could be) the efficiency of not making unnecessary lists by default. But for the programmer/beginner it's not nearly as convenient with the views as it is with lists. Beginners quickly need to learn about Also generators are really nice, and chaining them is just as useful/necessary as extending or concatenating lists.

Chaining generators with other iterables is certainly useful, but when all the parts of the chained object are iterable but not sequences, that seems like an invitation to use list() at some point in the code.
So whatever the outcome of this discussion (I hope there is one, whether it is by adding iterator-related builtins or something more sophisticated), it should probably take into account possible future ways of dealing with some kind of "lazy lists". However, I'm actually not sure the syntax of chaining generators/iterables should necessarily be the same as for chaining arbitrary sequences. The programmer needs to be well aware of whether the resulting object is a Sequence or 'just' a generator. -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Wed Jun 28 08:18:23 2017 From: srkunze at mail.de (Sven R. Kunze) Date: Wed, 28 Jun 2017 14:18:23 +0200 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <5a60b858-688c-538c-00b5-a0c8ab73bbaf@mgmiller.net> <5952C8A4.5050207@brenbarn.net> <6af7e196-0053-8f2f-4812-4872ebcea7c7@mgmiller.net> Message-ID: On 28.06.2017 14:00, Koos Zevenhoven wrote: > The programmer needs to be well aware of whether the resulting object > is a Sequence or 'just' a generator. Could you elaborate more on **why**? Regards, Sven PS: I consider this proposal to be like allowing adding floats and ints together. If I don't know if there was a float in the sum, don't know if my result will be a float. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Wed Jun 28 08:26:08 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 28 Jun 2017 22:26:08 +1000 Subject: [Python-ideas] Asynchronous exception handling around with/try statement borders In-Reply-To: References: Message-ID: On 28 June 2017 at 21:40, Erik Bray wrote: > My colleague's contention is that given > > lock = threading.Lock() > > this is simply *wrong*: > > lock.acquire() > try: > do_something() > finally: > lock.release() > > whereas this is okay: > > with lock: > do_something() Technically both are slightly racy with respect to async signals (e.g. KeyboardInterrupt), but the with statement form is less exposed to the problem (since it does more of its work in single opcodes). Nathaniel Smith posted a good write-up of the technical details to the issue tracker based on his work with trio: https://bugs.python.org/issue29988 Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From erik.m.bray at gmail.com Wed Jun 28 09:09:13 2017 From: erik.m.bray at gmail.com (Erik Bray) Date: Wed, 28 Jun 2017 15:09:13 +0200 Subject: [Python-ideas] Asynchronous exception handling around with/try statement borders In-Reply-To: References: Message-ID: On Wed, Jun 28, 2017 at 2:26 PM, Nick Coghlan wrote: > On 28 June 2017 at 21:40, Erik Bray wrote: >> My colleague's contention is that given >> >> lock = threading.Lock() >> >> this is simply *wrong*: >> >> lock.acquire() >> try: >> do_something() >> finally: >> lock.release() >> >> whereas this is okay: >> >> with lock: >> do_something() > > Technically both are slightly racy with respect to async signals (e.g. > KeyboardInterrupt), but the with statement form is less exposed to the > problem (since it does more of its work in single opcodes). > > Nathaniel Smith posted a good write-up of the technical details to the > issue tracker based on his work with trio: > https://bugs.python.org/issue29988 Interesting; thanks for pointing this out. 
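For ordinary, synchronous exceptions raised in the body, both spellings do guarantee the release; the race being discussed is only about an asynchronous KeyboardInterrupt arriving between the acquire and the start of the protected region. A small runnable sketch of the part that is *not* in dispute (the racy window itself is timing-dependent and not reproduced here):

```python
import threading

lock = threading.Lock()

def locked_try_finally():
    lock.acquire()
    try:
        raise ValueError("boom")
    finally:
        lock.release()

def locked_with():
    with lock:
        raise ValueError("boom")

for fn in (locked_try_finally, locked_with):
    try:
        fn()
    except ValueError:
        pass
    # Both forms release the lock when the *body* raises; the open question
    # above concerns only the window before the try/with protection exists.
    assert not lock.locked()
```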
Part of me felt like this has to have come up before but my searching didn't bring this up somehow (and even then it's only a couple months old itself). I didn't think about the possible race condition before WITH_CLEANUP_START, but obviously that's a possibility as well. Anyways since this is already acknowledged as a real bug I guess any further followup can happen on the issue tracker. Thanks, Erik From greg.ewing at canterbury.ac.nz Wed Jun 28 09:19:25 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 29 Jun 2017 01:19:25 +1200 Subject: [Python-ideas] Asynchronous exception handling around with/try statement borders In-Reply-To: References: Message-ID: <5953ACDD.3040806@canterbury.ac.nz> Erik Bray wrote: > At this point a potentially > waiting SIGINT is handled, resulting in KeyboardInterrupt being raised > while inside the with statement's suite, and finally block, and hence > Lock.__exit__ are entered. Seems to me this is the behaviour you *want* in this case, otherwise the lock can be acquired and never released. It's disconcerting that it seems to be very difficult to get that behaviour with a pure Python implementation. > I think it might be possible to > gain more consistency between these cases if pending signals are > checked/handled after any direct call to PyCFunction from within the > ceval loop. IMO that would be going in the wrong direction by making the C case just as broken as the Python case. Instead, I would ask what needs to be done to make this work correctly in the Python case as well as the C case. I don't think it's even possible to write Python code that does this correctly at the moment. What's needed is a way to temporarily mask delivery of asynchronous exceptions for a region of code, but unless I've missed something, no such facility is currently provided. What would such a facility look like? 
One possibility would be to model it on the sigsetmask() system call, so there would be a function such as

    mask_async_signals(bool)

that turns delivery of async signals on or off. However, I don't think that would work. To fix the locking case, what we need to do is mask async signals during the locking operation, and only unmask them once the lock has been acquired. We might write a context manager with an __enter__ method like this:

    def __enter__(self):
        mask_async_signals(True)
        try:
            self.acquire()
        finally:
            mask_async_signals(False)

But then we have the same problem again -- if a KeyboardInterrupt occurs after mask_async_signals(False) but before __enter__ returns, the lock won't get released. Another approach would be to provide a context manager such as

    async_signals_masked(bool)

Then the whole locking operation could be written as

    with async_signals_masked(True):
        lock.acquire()
        try:
            with async_signals_masked(False):
                # do stuff here
        finally:
            lock.release()

Now there's no possibility for a KeyboardInterrupt to be delivered until we're safely inside the body, but we've lost the ability to capture the pattern in the form of a context manager. The only way out of this I can think of at the moment is to make the above pattern part of the context manager protocol itself. In other words, async exceptions are always masked while the __enter__ and __exit__ methods are executing, and unmasked while the body is executing. -- Greg From ncoghlan at gmail.com Wed Jun 28 09:26:02 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 28 Jun 2017 23:26:02 +1000 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: References: <20170623190209.GQ3149@ando.pearwood.info> <20170623225603.GA27276@cskk.homeip.net> Message-ID: On 28 June 2017 at 21:48, Sven R.
Kunze wrote: > As it was snipped away, let me ask again: > > I don't see how this helps differentiating shallow and nested exceptions > such as: > > try: > with exception_guard(ImportError): > import myspeciallib > except RuntimeError: # catches shallow and nested ones > import fallbacks.MySpecialLib as myspeciallib There are two answers to that: 1. In 3.3+ you can just catch ImportError, check "exc.name", and re-raise if it's not for the module you care about 2. There's a reasonable case to be made that importlib should include an ImportError -> RuntimeError conversion around the call to loader.exec_module (in the same spirit as PEP 479). That way, in: try: import myspeciallib except ImportError: import fallbacks.MySpecialLib as myspeciallib any caught ImportError would relate to "myspeciallib", while uncaught ImportErrors arising from *executing* "myspeciallib" will be converted to RuntimeError, with the original ImportError as their __cause__. So it would make sense to file an RFE against 3.7 proposing that behavioural change (we couldn't reasonably do anything like that with the old `load_module()` importer API, as raising ImportError was how that API signalled "I can't load that". We don't have that problem with `exec_module()`). > At least in my tests, exception_guard works this way and I don't see any > improvements to current behavior. Moreover, I am somewhat skeptical that > using this recipe will really improve the situation. It's a lot of code > where users don't have any stdlib support. I furthermore doubt that all > Python coders will now wrap their properties using the guard. So, using > these properties we will have almost no improvement. I still don't see it as > the responsibility of coder of the property to guard against anything. > Nobody is forced to catch exceptions when using a property. 
Honestly, if folks are trying to write complex Python code without using at least "pylint -E" as a static check for typos in attribute names (regardless of whether those lines get executed or not), then inadvertently hiding AttributeError in property and __getattr__ implementations is likely to be the least of their problems. So pylint's structural checks, type analysis tools like MyPy, or more advanced IDEs like PyCharm are typically going to be a better option for folks wanting to guard against bugs in their *own* code than adding defensive code purely as a cross-check on their own work. The cases I'm interested in are the ones where you're either developing some kind of framework and you need to code that framework defensively to guard against unexpected failures in the components you're executing (e.g. exec_module() in the PEP 451 import protocol), or else you're needing to adapt between two different kinds of exception reporting protocol (e.g. KeyError to AttributeError and vice-versa). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From erik.m.bray at gmail.com Wed Jun 28 09:30:03 2017 From: erik.m.bray at gmail.com (Erik Bray) Date: Wed, 28 Jun 2017 15:30:03 +0200 Subject: [Python-ideas] Asynchronous exception handling around with/try statement borders In-Reply-To: References: Message-ID: On Wed, Jun 28, 2017 at 3:09 PM, Erik Bray wrote: > On Wed, Jun 28, 2017 at 2:26 PM, Nick Coghlan wrote: >> On 28 June 2017 at 21:40, Erik Bray wrote: >>> My colleague's contention is that given >>> >>> lock = threading.Lock() >>> >>> this is simply *wrong*: >>> >>> lock.acquire() >>> try: >>> do_something() >>> finally: >>> lock.release() >>> >>> whereas this is okay: >>> >>> with lock: >>> do_something() >> >> Technically both are slightly racy with respect to async signals (e.g. >> KeyboardInterrupt), but the with statement form is less exposed to the >> problem (since it does more of its work in single opcodes). 
>> >> Nathaniel Smith posted a good write-up of the technical details to the >> issue tracker based on his work with trio: >> https://bugs.python.org/issue29988 > > Interesting; thanks for pointing this out. Part of me felt like this > has to have come up before but my searching didn't bring this up > somehow (and even then it's only a couple months old itself). > > I didn't think about the possible race condition before > WITH_CLEANUP_START, but obviously that's a possibility as well. > Anyways since this is already acknowledged as a real bug I guess any > further followup can happen on the issue tracker. On second thought, maybe there is a case to be made w.r.t. making a documentation change about the semantics of the `with` statement: The old-style syntax cannot make any guarantees about atomicity w.r.t. async events. That is, there's no way syntactically in Python to declare that no exception will be raised between "lock.acquire()" and the setup of the "try/finally" blocks. However, if issue-29988 were *fixed* somehow (and I'm not convinced it can't be fixed in the limited case of `with` statements) then there really would be a major semantic difference of the `with` statement in that it does support this invariant. Then the question is whether that difference should be made a requirement of the language (probably too onerous a requirement?), or just a feature of CPython (which should still be documented one way or the other IMO).
Erik From erik.m.bray at gmail.com Wed Jun 28 09:40:05 2017 From: erik.m.bray at gmail.com (Erik Bray) Date: Wed, 28 Jun 2017 15:40:05 +0200 Subject: [Python-ideas] Asynchronous exception handling around with/try statement borders In-Reply-To: <5953ACDD.3040806@canterbury.ac.nz> References: <5953ACDD.3040806@canterbury.ac.nz> Message-ID: On Wed, Jun 28, 2017 at 3:19 PM, Greg Ewing wrote: > Erik Bray wrote: >> >> At this point a potentially >> waiting SIGINT is handled, resulting in KeyboardInterrupt being raised >> while inside the with statement's suite, and finally block, and hence >> Lock.__exit__ are entered. > > > Seems to me this is the behaviour you *want* in this case, > otherwise the lock can be acquired and never released. > It's disconcerting that it seems to be very difficult to > get that behaviour with a pure Python implementation. I think normally you're right--this is the behavior you would *want*, but not the behavior that's consistent with how Python implements the `with` statement, all else being equal. Though it's still not entirely fair either because if Lock.__enter__ were pure Python somehow, it's possible the exception would be raised either before or after the lock is actually marked as "acquired", whereas in the C implementation acquisition of the lock will always succeed (assuming the lock was free, and no other exceptional conditions) before the signal handler is executed. >> I think it might be possible to >> gain more consistency between these cases if pending signals are >> checked/handled after any direct call to PyCFunction from within the >> ceval loop. > > > IMO that would be going in the wrong direction by making > the C case just as broken as the Python case. > > Instead, I would ask what needs to be done to make this > work correctly in the Python case as well as the C case. 
You have a point there, but at the same time the Python case, while "broken" insofar as it can lead to broken code, seems correct from the Pythonic perspective. The other possibility would be to actually change the semantics of the `with` statement. Or as you mention below, a way to temporarily mask signals... > I don't think it's even possible to write Python code that > does this correctly at the moment. What's needed is a > way to temporarily mask delivery of asynchronous exceptions > for a region of code, but unless I've missed something, > no such facility is currently provided. > > What would such a facility look like? One possibility > would be to model it on the sigsetmask() system call, so > there would be a function such as > > mask_async_signals(bool) > > that turns delivery of async signals on or off. > > However, I don't think that would work. To fix the locking > case, what we need to do is mask async signals during the > locking operation, and only unmask them once the lock has > been acquired. We might write a context manager with an > __enter__ method like this: > > def __enter__(self): > mask_async_signals(True) > try: > self.acquire() > finally: > mask_async_signals(False) > > But then we have the same problem again -- if a Keyboard > Interrupt occurs after mask_async_signals(False) but > before __enter__ returns, the lock won't get released. Exactly. > Another approach would be to provide a context manager > such as > > async_signals_masked(bool) > > Then the whole locking operation could be written as > > with async_signals_masked(True): > lock.acquire() > try: > with async_signals_masked(False): > # do stuff here > finally: > lock.release() > > Now there's no possibility for a KeyboardInterrupt to > be delivered until we're safely inside the body, but we've > lost the ability to capture the pattern in the form of > a context manager. 
> > The only way out of this I can think of at the moment is > to make the above pattern part of the context manager > protocol itself. In other words, async exceptions are > always masked while the __enter__ and __exit__ methods > are executing, and unmasked while the body is executing. I think so too. That's more or less in line with Nick's idea on njs's issue (https://bugs.python.org/issue29988) of an ATOMIC_UNTIL opcode. That's just one implementation possibility. My question would be whether to make that a language-level requirement of the context manager protocol, or just something CPython does... Thanks, Erik From k7hoven at gmail.com Wed Jun 28 10:01:47 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Wed, 28 Jun 2017 17:01:47 +0300 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <5a60b858-688c-538c-00b5-a0c8ab73bbaf@mgmiller.net> <5952C8A4.5050207@brenbarn.net> <6af7e196-0053-8f2f-4812-4872ebcea7c7@mgmiller.net> Message-ID: On Wed, Jun 28, 2017 at 3:18 PM, Sven R. Kunze wrote: > On 28.06.2017 14:00, Koos Zevenhoven wrote: > > The programmer needs to be well aware of whether the resulting object is a > Sequence or 'just' a generator. > > > Could you elaborate more on **why**? > > For a moment, I was wondering what the double emphasis was for, but then I realized you are simply calling `statement.__why__()` directly instead of the recommended `spoiler(statement)`. But sure, I just got on vacation and I even found a power extension cord to use my laptop at the pool, so what else would I do ;). It all depends on what you need to do with the result of the concatenation. When all you need is something to iterate over, a generator-like thingy is fine. But when you need something for indexing and slicing or len etc., you want to be sure that that is what you're getting.
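One way to "be sure that that is what you're getting" at runtime is an explicit ABC check; a minimal sketch (the helper name is made up for illustration):

```python
from collections.abc import Sequence

def require_sequence(obj):
    """Fail fast if handed a one-shot iterator instead of a real sequence."""
    if not isinstance(obj, Sequence):
        raise TypeError(f"expected a sequence, got {type(obj).__name__}")
    return obj

require_sequence([1, 2, 3])          # lists support len(), indexing, slicing
gen = (x * x for x in range(3))      # generators do not
try:
    require_sequence(gen)
except TypeError as exc:
    print(exc)  # expected a sequence, got generator
```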
But maybe someone passed you an argument that is not a sequence, or you forgot if a function returns a sequence or a generator. In that case, you want the error right away, instead of from some completely different piece of code somewhere that thinks it's getting a sequence. I don't think Python should depend on a static type checker to catch the error early. After all, type hints are optional. > PS: I consider this proposal to be like allowing adding floats and ints > together. If I don't know if there was a float in the sum, don't know if my > result will be a float. Not to say that the float/int case is never problematic, but the situation is still different. Often when a float makes any sense, you can work with either floats or ints and it doesn't really matter. But if you specifically *need* an int, you usually don't call functions that return floats. But if you do use division etc., you probably need to think about floor/ceil/closest anyway. And yes, there have probably been Python 2->3 porting bugs where / division was not appropriately replaced with //. But regarding containers, it often makes just as much sense for a function to return a generator as it does to return a sequence. The name or purpose of a function may give no hint about whether an iterable or sequence is returned, and you can't expect everyone to prefix their function names with iter_ and seq_ etc. And it's not just function return values, it's also arguments passed into your function. -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed...
URL: From steve at pearwood.info Wed Jun 28 11:51:51 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 29 Jun 2017 01:51:51 +1000 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: References: <20170623190209.GQ3149@ando.pearwood.info> <20170623225603.GA27276@cskk.homeip.net> Message-ID: <20170628155151.GG3149@ando.pearwood.info> On Wed, Jun 28, 2017 at 12:25:12PM +1000, Nick Coghlan wrote: [...] > So while I prefer "contextlib.convert_exception" as the name (rather > than the "exception_guard" Steven used in his recipe), I'd definitely > be open to a bugs.python.org RFE and a PR against contextlib to add > such a construct to Python 3.7. http://bugs.python.org/issue30792 -- Steve From steve at pearwood.info Wed Jun 28 12:16:13 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 29 Jun 2017 02:16:13 +1000 Subject: [Python-ideas] + operator on generators In-Reply-To: <708d6e27-3b1c-f5c4-bd53-b9ce807d2410@mgmiller.net> References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <708d6e27-3b1c-f5c4-bd53-b9ce807d2410@mgmiller.net> Message-ID: <20170628161613.GH3149@ando.pearwood.info> On Tue, Jun 27, 2017 at 01:53:37PM -0700, Mike Miller wrote: > > On 2017-06-25 20:23, Steven D'Aprano wrote: > >I have a counter-proposal: introduce the iterator chaining operator "&": > > > > iterable & iterable --> itertools.chain(iterable, iterable) > > > > I like this suggestion. Here's another color that might be less > controversial: > > iterable3 = iterable1.chain(iterable2) That requires every iterable class to add its own reimplementation of chain, or else it will surprisingly not be chainable -- or at least it *sometimes* won't be chainable. chain(iterable1, iterable2) would be more acceptable. The reason why a function would be better here than a method is explained in the FAQ for why len() is a function. The itertools chain function accepts *any* iterable. 
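That generality is easy to demonstrate: chain() happily consumes any mix of iterables, sequence or not:

```python
from itertools import chain

# A list, a generator and a string, chained lazily into one iterator.
result = chain([1, 2], (x * 10 for x in range(2)), "ab")
assert list(result) == [1, 2, 0, 10, "a", "b"]

# Dicts (iterating their keys) and ranges work just as well.
assert list(chain({"k": 1}, range(2))) == ["k", 0, 1]
```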
For this proposal to make sense, we can do no less. It isn't acceptable to:

- only support a subset of iterables (unless there's an easy work-around);
- expect everyone to add a chain() method to their iterable classes;
- change the iterator protocol to require extra methods;
- conflict with the + operator that already works with sequences;
- choose an arbitrary operator that doesn't have at least some association with concatenation or chaining or addition (e.g. "?" would be unacceptable).

Ideally, we should also be able to avoid conflicting with other operators as well, but given that there's only a small set of ASCII symbols to choose from, we should at least consider using an existing operator so long as:

- we don't break existing code by changing the existing behaviour;
- any clashes between the new and old behaviour should be uncommon (which is why + is unacceptable: using + on lists, tuples and strings is too common);
- in any clash, the existing behaviour has priority over new behaviour (to avoid breaking existing code);
- and there's an easy work-around to get the new behaviour (say, call iter() on the object first).

-- Steve From steve at pearwood.info Wed Jun 28 12:28:52 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 29 Jun 2017 02:28:52 +1000 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <5a60b858-688c-538c-00b5-a0c8ab73bbaf@mgmiller.net> <5952C8A4.5050207@brenbarn.net> <6af7e196-0053-8f2f-4812-4872ebcea7c7@mgmiller.net> Message-ID: <20170628162852.GI3149@ando.pearwood.info> On Wed, Jun 28, 2017 at 12:47:21PM +1000, Nick Coghlan wrote: > While I haven't been following this thread closely, I'd like to note > that arguing for a "chain()" builtin has the virtue that it would just be > arguing for the promotion of the existing itertools.chain function > into the builtin namespace.
Yes, given the difficulties in choosing an operator, I'm leaning towards this approach now: promote itertools.chain to a builtin, and re-visit the operator suggestions in 3.8 or 3.9. -- Steve From srkunze at mail.de Wed Jun 28 14:01:05 2017 From: srkunze at mail.de (Sven R. Kunze) Date: Wed, 28 Jun 2017 20:01:05 +0200 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <5a60b858-688c-538c-00b5-a0c8ab73bbaf@mgmiller.net> <5952C8A4.5050207@brenbarn.net> <6af7e196-0053-8f2f-4812-4872ebcea7c7@mgmiller.net> Message-ID: <080ba01a-a809-77c3-b804-56645f349b43@mail.de> On 28.06.2017 16:01, Koos Zevenhoven wrote: > For a moment, I was wondering what the double emphasis was for, but > then I realized you are simply calling `statement.__why__()` directly > instead of the recommended `spoiler(statement)`. Doing this for years now. Sometimes, when 'statement.__why__()' returns None, 'spoiler(statement)' returns some thought-terminating cliché. ;) > But sure, I just got on vacation and I even found a power extension > cord to use my laptop at the pool, so what else would I do ;). Good man. Today, a colleague of mine showed me a mobile mini-keyboard with a phone bracket (not even a dock). So, having his 7'' smartphone, he can work from his vacations and answer emails as well. ;) Cheap notebook replacement, if you don't prefer large screens and keyboards. :D > It all depends on what you need to do with the result of the > concatenation. When all you need is something to iterate over, a > generator-like thingy is fine. But when you need something for > indexing and slicing or len etc., you want to be sure that that is > what you're getting. But maybe someone passed you an argument that is > not a sequence, or you forgot if a function returns a sequence or a > generator.
In that case, you want the error right away, instead of > from some completely different piece of code somewhere that thinks > it's getting a sequence. I don't think Python should depend on a > static type checker to catch the error early. After all, type hints > are optional. I understand that. In the end, I remember people on this mailing-list recommending me to use "list(...)" to make sure you got one in your hands. I remember this being necessary in the conversion process from Python2 to 3. The pattern is already here. > PS: I consider this proposal to be like allowing adding floats and > ints together. If I don't know if there was a float in the sum, > don't know if my result will be a float. > > > Not to say that the float/int case is never problematic, but the > situation is still different. Often when a float makes any sense, you > can work with either floats or ints and it doesn't really matter. But > if you specifically *need* an int, you usually don't call functions > that return floats. But if you do use division etc., you probably need > to think about floor/ceil/closest anyway. And yes, there have probably > been Python 2->3 porting bugs where / division was not appropriately > replaced with //. Division is one thing, numeric input parameters from unknown sources is another. In this regard, calling "int(...)" or "list(...)" follows the same scheme IMO. > But regarding containers, it often makes just as much sense for a > function to return a generator as it does to return a sequence. The > name or purpose of a function may give no hint about whether an > iterable or sequence is returned, and you can't expect everyone to > prefix their function names with iter_ and seq_ etc. And it's not just > function return values, it's also arguments passed into your function. Exactly. Neither do I want those prefixes. And I can tell you they aren't necessary in practice at all. Just my 2 cents on this: At work, we heavily rely on Django.
Django provides a so-called QuerySet type, its db-result abstraction. Among those querysets, our functions return lists and sets with no indication whatsoever of what type it may be. It works quite well and we didn't have any issues with that. If we need a list, we wrap it with "list(...)". It's as simple as that. The valid concern, that it could be confusing which type the return value might have, is usually an abstract one. I can tell you that in practice it's not really an issue to talk about. Regards, Sven -------------- next part -------------- An HTML attachment was scrubbed... URL: From brenbarn at brenbarn.net Wed Jun 28 14:14:04 2017 From: brenbarn at brenbarn.net (Brendan Barnwell) Date: Wed, 28 Jun 2017 11:14:04 -0700 Subject: [Python-ideas] + operator on generators In-Reply-To: <20170628161613.GH3149@ando.pearwood.info> References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <708d6e27-3b1c-f5c4-bd53-b9ce807d2410@mgmiller.net> <20170628161613.GH3149@ando.pearwood.info> Message-ID: <5953F1EC.1080407@brenbarn.net> On 2017-06-28 09:16, Steven D'Aprano wrote: > On Tue, Jun 27, 2017 at 01:53:37PM -0700, Mike Miller wrote: >> > >> >On 2017-06-25 20:23, Steven D'Aprano wrote: >>> > >I have a counter-proposal: introduce the iterator chaining operator "&": >>> > > >>> > > iterable & iterable --> itertools.chain(iterable, iterable) >>> > > >> > >> >I like this suggestion. Here's another color that might be less >> >controversial: >> > >> > iterable3 = iterable1.chain(iterable2) > That requires every iterable class to add its own reimplementation of > chain, or else it will surprisingly not be chainable -- or at least it > *sometimes* won't be chainable. > > chain(iterable1, iterable2) would be more acceptable. The reason why a > function would be better here than a method is explained in the FAQ for > why len() is a function.
I still think a good middle ground would be to have such a function, but have the return type of that function be an iterator that provides a .chain method or (better) defines __add__ to allow adding it to other iterables. Then a single call to "chain" (or whatever the global function was called) would be enough to give you a nice readable syntax if you later want to chain other iterables on. This behavior would not need to be stipulated for any other kinds of iterators; "chain" would just be a function that converts any iterable into a nicely chainable one, similar to how pathlib.Path converts a string into a nicely manipulable Path object that allows various handy path operations. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown From k7hoven at gmail.com Wed Jun 28 14:37:39 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Wed, 28 Jun 2017 21:37:39 +0300 Subject: [Python-ideas] + operator on generators In-Reply-To: <080ba01a-a809-77c3-b804-56645f349b43@mail.de> References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <5a60b858-688c-538c-00b5-a0c8ab73bbaf@mgmiller.net> <5952C8A4.5050207@brenbarn.net> <6af7e196-0053-8f2f-4812-4872ebcea7c7@mgmiller.net> <080ba01a-a809-77c3-b804-56645f349b43@mail.de> Message-ID: On Wed, Jun 28, 2017 at 9:01 PM, Sven R. Kunze wrote: > Good man. Today, a colleague of mine showed me a mobile mini-keyboard with > a phone bracket (not even a dock). So, having his 7'' smartphone, he can > work from his vacations and answer emails as well. ;) Cheap notebook > replacement, if you don't prefer large screens and keyboards. :D > > ?Oh, I've been very close to getting one of those. But then I should probably get a pair of glasses too ;). ? > It all depends on what you need to do with the result of the > concatenation. When all you need is something to iterate over, a > generator-like thingy is fine. 
But when you need something for indexing and > slicing or len etc., you want to be sure that that is what you're getting. > But maybe someone passed you an argument that is not a sequence, or you > forgot if a function returns a sequence or a generator. In that case, you > want the error right away, instead of from some completely different piece > of code somewhere that thinks it's getting a sequence. I don't think Python > should depend on a static type checker to catch the error early. After all, > type hints are optional. > > > I understand that. In the end, I remember people on this mailing-list > recommending me to use "list(...)" to make sure you got one in your hands. > I remember this being necessary in the conversion process from Python2 to > 3. The pattern is already here. > > ?That pattern annoys people and negates the benefits of views and generators.? > PS: I consider this proposal to be like allowing adding floats and ints >> together. If I don't know if there was a float in the sum, don't know if my >> result will be a float. >> > > ?Not to say that the float/int case is never problematic, but the > situation is still different. Often when a float makes any sense, you can > work with either floats or ints and it doesn't really matter. But if you > specifically *need* an int, you usually don't call functions that return > floats. But if you do use division etc., you probably need to think about > floor/ceil/closest anyway. And yes, there have probably been Python 2->3 > porting bugs where / division was not appropriately replaced with //. > > > Division is one thing, numeric input parameters from unknown sources is > another. In this regard, calling "int(...)" or "list(...)" follows the same > scheme IMO. > > ?Sure, but you may want to turn your unknown sources into something predictable as soon as possible. You'll need to deal with the errors in the input anyway. Returning sequences vs generators is a different matter.? 
You don't want to turn generators into lists if you don't have to. > [...] Just my 2 cents on this: > At work, we heavily rely on Django. Django provides a so-called QuerySet > type, its db-result abstraction. Among those querysets, our functions > return lists and sets with no indication of whatsoever type it may be. It > works quite well and we didn't have any issues with that. If we need a list, > we wrap it with "list(...)". It's as simple as that. The valid concern, > that it could be confusing which type the return value might have, is > usually an abstract one. I can tell you that in practice it's not really an > issue to talk about. > Very often one doesn't really need a list, but just something that has indexing, slicing and/or len(). Wrapping things with list() can be ok, but uses memory and is O(n). Generating lists from all kinds of iterables all the time is just a whole bunch of unnecessary overhead. But yes, it happens, because that's the convenient way of doing it now. That's like going back to Python 2, but with additional calls to list() required. Maybe you're lucky that your iterables are small and not a bottleneck and/or you just don't feel guilty every time you call list() where you shouldn't have to ;). -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Jun 28 15:14:26 2017 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 28 Jun 2017 12:14:26 -0700 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: References: <20170623190209.GQ3149@ando.pearwood.info> <20170623225603.GA27276@cskk.homeip.net> Message-ID: On Jun 28, 2017 6:26 AM, "Nick Coghlan" wrote: On 28 June 2017 at 21:48, Sven R.
Kunze wrote: > As it was snipped away, let me ask again: > > I don't see how this helps differentiating shallow and nested exceptions > such as: > > try: > with exception_guard(ImportError): > import myspeciallib > except RuntimeError: # catches shallow and nested ones > import fallbacks.MySpecialLib as myspeciallib There are two answers to that: 1. In 3.3+ you can just catch ImportError, check "exc.name", and re-raise if it's not for the module you care about 2. There's a reasonable case to be made that importlib should include an ImportError -> RuntimeError conversion around the call to loader.exec_module (in the same spirit as PEP 479). That way, in: try: import myspeciallib except ImportError: import fallbacks.MySpecialLib as myspeciallib any caught ImportError would relate to "myspeciallib", while uncaught ImportErrors arising from *executing* "myspeciallib" will be converted to RuntimeError, with the original ImportError as their __cause__. So it would make sense to file an RFE against 3.7 proposing that behavioural change (we couldn't reasonably do anything like that with the old `load_module()` importer API, as raising ImportError was how that API signalled "I can't load that". We don't have that problem with `exec_module()`). What about modules that want to raise ImportError to indicate that they aren't available on the current system, perhaps because some of their dependencies are missing? For example, 'import ssl' should raise an ImportError if 'ssl.py' is present but '_ssl.so' is missing; the existence of '_ssl.so' is an internal implementation detail. And perhaps 'import trio.ssl' should raise ImportError if 'ssl' is missing. (Historically not all systems have openssl available, so this is a common situation where existing libraries contain ImportError guards.) With PEP 479 there was a different and better way to generate a StopIteration if you wanted one (just 'return'). 
Here I'm afraid existing projects might actually be relying on the implicit exception leakage in significant numbers :-/ More generally, my impression is that this is one of the reasons why exceptions have fallen out of favor in more recent languages. They're certainly workable, and python's certainly not going to change now, but they do have these awkward aspects that weren't as clear 20 years ago and that now we have to live with. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Wed Jun 28 17:14:14 2017 From: wes.turner at gmail.com (Wes Turner) Date: Wed, 28 Jun 2017 16:14:14 -0500 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> Message-ID: On Tuesday, June 27, 2017, Wes Turner wrote: > > > On Monday, June 26, 2017, Wes Turner > wrote: > >> >> >> On Sunday, June 25, 2017, Wes Turner wrote: >> >>> >>> >>> On Sunday, June 25, 2017, Danilo J. S. Bellini >>> wrote: >>> >>>> On Sun, Jun 25, 2017 at 3:06 PM, lucas via Python-ideas < >>>> python-ideas at python.org> wrote: >>>> >>>>> I often use generators, and itertools.chain on them. >>>>> What about providing something like the following: >>>>> >>>>> a = (n for n in range(2)) >>>>> b = (n for n in range(2, 4)) >>>>> tuple(a + b) # -> 0 1 2 3 >>>> >>>> >>>> AudioLazy does that: https://github.com/danilobellini/audiolazy >>>> >>> >>> - http://toolz.readthedocs.io/en/latest/api.html#toolz.itertoolz.concat >>> and concatv >>> >> """ We use chain.from_iterable rather than chain(*seqs) so that seqs can be a generator. """ ... ``chain.from_iterable()`` > >>> - https://github.com/kachayev/fn.py#streams-and-infinite-seque >>> nces-declaration >>> - Stream() << obj >>> >> >> << is __lshift__() >> <<= is __ilshift__() >> >> https://docs.python.org/2/library/operator.html >> >> Do Stream() and __lshift__() from fn.py not solve here? 
>> > In this syntax example, iter1 is mutated before iteration: > > iter_x = Stream(iter1) << iter2 << iter3 > iter1 <<= iter5 # IDK if fn.py yet has <<= > list(iter1) > list(iter_x) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Wed Jun 28 18:10:47 2017 From: srkunze at mail.de (Sven R. Kunze) Date: Thu, 29 Jun 2017 00:10:47 +0200 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: References: <20170623190209.GQ3149@ando.pearwood.info> <20170623225603.GA27276@cskk.homeip.net> Message-ID: On 28.06.2017 15:26, Nick Coghlan wrote: > 1. In 3.3+ you can just catch ImportError, check "exc.name", and > re-raise if it's not for the module you care about I see, didn't know that one. I gave it a try and it's not 100% the behavior I expected, but one could work around it if the valid package structure is known; I'm not completely certain. "from foo.bar.baz import abc" can result in "exc.name" being one of "foo", "foo.bar" or "foo.bar.baz". Not perfect but sort of doable. > 2. There's a reasonable case to be made that importlib should include > an ImportError -> RuntimeError conversion around the call to > loader.exec_module (in the same spirit as PEP 479). That way, in: > > try: > import myspeciallib > except ImportError: > import fallbacks.MySpecialLib as myspeciallib > > any caught ImportError would relate to "myspeciallib", while uncaught > ImportErrors arising from *executing* "myspeciallib" will be converted > to RuntimeError, with the original ImportError as their __cause__. Generally changing the behavior for ImportError doesn't sound like it would work for all projects out there. For fallback imports, I am on your side; that's a real use case which can be solved by changing the behavior of ImportErrors. But for all imports? I don't know if that's a good idea. > [People should use tools, guard against bugs and try to avoid mistakes.] Sure, but I don't see how this can help, if I use third-party code.
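The exc.name check from answer 1 can be wrapped in a small helper. This is only a sketch (the helper name is made up), but it shows the 3.3+ behavior Nick refers to:

```python
import importlib


def import_or_none(name):
    """Return the named module, or None only if *that* module is missing.

    An ImportError raised while executing the module's own imports has a
    different exc.name, so it is re-raised instead of being swallowed.
    """
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        if exc.name != name:
            raise  # a dependency failed to import; don't hide that bug
        return None


assert import_or_none("math") is not None
assert import_or_none("hopefully_no_such_module") is None
```

As Sven notes above, for dotted imports exc.name may also be a parent package, so a real guard may need to compare against several candidate names.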
The cases I described in the original post were simple cases where we catch too many exceptions. So, I don't even have the chance to see the error, to file a bug report, to issue a pull request, etc. etc. > The cases I'm interested in are the ones where you're either > developing some kind of framework and you need to code that framework > defensively to guard against unexpected failures in the components > you're executing (e.g. exec_module() in the PEP 451 import protocol), > or else you're needing to adapt between two different kinds of > exception reporting protocol (e.g. KeyError to AttributeError and > vice-versa). I am unsure what you mean by those abstract words "framework" and "components". But let me state it in different words: there are *raisers* and *catchers* which do the respective thing with exceptions. If you control the code on both sides, things are easy to change. Pre-condition: you know about the bug in the first place, which is hard when you catch too much. If you control the raiser only, it doesn't help to say: "don't make mistakes, configure systems right, code better, etc." People will make mistakes, systems will be misconfigured, linters don't find everything, etc. If you control the catcher only, you definitely want to narrow down the set of caught exceptions as far as possible. This was the original intent of this thread IIRC. This way you help to discover bugs in the raising code. An additional benefit: your catching code reacts only to the right exceptions. One word about frameworks here. Django, for instance, is on both sides. The template engine is mostly on the catchers' side, whereas the database layer is on the raisers' side. I get the feeling that the solutions presented here are way too complicated and error-prone. My opinion on this topic still is that catching exceptions is not mandatory. Nobody is forced to do it, and it's even better to let exceptions bubble up to make bugs visible.
If one still needs to catch them, he should only catch those he really, really needs to catch and nothing more. If this cannot be improved sensibly, well, so be it, although I still find the argument presented against "catching shallow exceptions" a bit too abstract compared to the practical benefit. Maybe there's a better solution, maybe not. Regards, Sven -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Wed Jun 28 18:22:38 2017 From: srkunze at mail.de (Sven R. Kunze) Date: Thu, 29 Jun 2017 00:22:38 +0200 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: References: <20170623190209.GQ3149@ando.pearwood.info> <20170623225603.GA27276@cskk.homeip.net> Message-ID: <89e7081d-e91a-2f14-f293-06835c61f88f@mail.de> On 28.06.2017 21:14, Nathaniel Smith wrote: > With PEP 479 there was a different and better way to generate a > StopIteration if you wanted one (just 'return'). Here I'm afraid > existing projects might actually be relying on the implicit exception > leakage in significant numbers :-/ My concern as well. > More generally, my impression is that this is one of the reasons why > exceptions have fallen out of favor in more recent languages. They're > certainly workable, and python's certainly not going to change now, > but they do have these awkward aspects that weren't as clear 20 years > ago and that now we have to live with. I am quite satisfied with the ability of exceptions to expose bugs as quickly and clearly as possible. I still think we can improve on the catching side a little bit to narrow down the relevant exceptions. Other than that, I would be interested to hear what system you have in mind. What alternative (borrowed from more recent languages) can you imagine? Checking return values like C or golang? No ability to catch them at all? How to handle bugs in the context of UI applications where a crash in front of the user should be avoided?
Regards, Sven From srkunze at mail.de Wed Jun 28 18:29:44 2017 From: srkunze at mail.de (Sven R. Kunze) Date: Thu, 29 Jun 2017 00:29:44 +0200 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <5a60b858-688c-538c-00b5-a0c8ab73bbaf@mgmiller.net> <5952C8A4.5050207@brenbarn.net> <6af7e196-0053-8f2f-4812-4872ebcea7c7@mgmiller.net> <080ba01a-a809-77c3-b804-56645f349b43@mail.de> Message-ID: <274725c4-0062-92d0-0bc4-80e08255f75d@mail.de> On 28.06.2017 20:37, Koos Zevenhoven wrote: > Oh, I've been very close to getting one of those. But then I should > probably get a pair of glasses too ;). :D > ??That pattern annoys people and negates the benefits of views and > generators.? Sure, that's why I am in favor of this proposal. It would remove the necessity to do that in various places. :) > > Sure, but you may want to turn your unknown sources into something > predictable as soon as possible. You'll need to deal with the errors > in the input anyway. That's a good point. > Very often one doesn't really need a list, but just something that has > indexing, slicing and/or len(). Wrapping things with list() can be ok, > but uses memory and is O(n). Generating lists from all kinds of > iterables all the time is just a whole bunch of unnecessary overhead. > But yes, it happens, because that's the convenient way of doing it > now. That's like going back to Python 2, but with additional calls to > list() required. Maybe you're lucky that your iterables are small and > not a bottle neck and/or you just don't feel guilty every time you > call list() where you shouldn't have to ;). Yep, exactly. That's why I like an easier way of concating them with no bells and whistles. Preferably like lists today. ;) Cheers, Sven -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From greg.ewing at canterbury.ac.nz Wed Jun 28 19:33:13 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 29 Jun 2017 11:33:13 +1200 Subject: [Python-ideas] Asynchronous exception handling around with/try statement borders In-Reply-To: References: <5953ACDD.3040806@canterbury.ac.nz> Message-ID: <59543CB9.3040609@canterbury.ac.nz> Erik Bray wrote: > My question would be to > make that a language-level requirement of the context manager > protocol, or just something CPython does... I think it should be a language-level requirement, otherwise it's not much use. Note that it's different from some existing CPython-only behaviour such as refcounting, because it's possible to code around those things on other implementations that don't provide the same guarantees, but here there's *no* way to code around it. At the very least, it should be a documented guarantee in CPython, not just something left "up to the implementation". -- Greg From njs at pobox.com Wed Jun 28 20:03:48 2017 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 28 Jun 2017 17:03:48 -0700 Subject: [Python-ideas] Asynchronous exception handling around with/try statement borders In-Reply-To: <5953ACDD.3040806@canterbury.ac.nz> References: <5953ACDD.3040806@canterbury.ac.nz> Message-ID: On Wed, Jun 28, 2017 at 6:19 AM, Greg Ewing wrote: > Erik Bray wrote: >> >> At this point a potentially >> waiting SIGINT is handled, resulting in KeyboardInterrupt being raised >> while inside the with statement's suite, and finally block, and hence >> Lock.__exit__ are entered. > > > Seems to me this is the behaviour you *want* in this case, > otherwise the lock can be acquired and never released. > It's disconcerting that it seems to be very difficult to > get that behaviour with a pure Python implementation. 
I agree :-) >> I think it might be possible to >> gain more consistency between these cases if pending signals are >> checked/handled after any direct call to PyCFunction from within the >> ceval loop. > > > IMO that would be going in the wrong direction by making > the C case just as broken as the Python case. > > Instead, I would ask what needs to be done to make this > work correctly in the Python case as well as the C case. > > I don't think it's even possible to write Python code that > does this correctly at the moment. What's needed is a > way to temporarily mask delivery of asynchronous exceptions > for a region of code, but unless I've missed something, > no such facility is currently provided. It's *almost* possible in some cases, by installing a specialized signal handler which introspects the stack to see if we're in one of these delicate regions of code. See: https://vorpus.org/blog/control-c-handling-in-python-and-trio/#how-trio-handles-control-c The "almost" is because currently, I have a function decorator that marks certain functions as needing protection against async signals, which works by injecting a magic local variable into the function when it starts. The problem is that this can't be done atomically, so if you have an __exit__ method like: def __exit__(self, *args): # XX _this_function_is_protected = True self.release() then it's possible for a signal to arrive at the point marked "XX", and then your lock never gets released. One solution would be: https://bugs.python.org/issue12857 I've also been considering gross things like keeping a global WeakSet of the code objects for all functions that have been registered for async signal protection. However, trio's solution looks a bit different than what you'd need for a general python program, because the general strategy is that if a signal arrives at a bad moment, we don't delay it (we can't!); instead, we make a note to deliver it later. 
For an async framework like trio this is fine, and in fact we need the "deliver the signal later" facility anyway, because we need to handle the case where a signal arrives while the event loop is polling for I/O and there isn't any active task to deliver the signal to anyway. For a generic solution in the interpreter, then I agree that it'd probably make more sense to have a way to delay running the signal handler until an appropriate moment. > What would such a facility look like? One possibility > would be to model it on the sigsetmask() system call, so > there would be a function such as > > mask_async_signals(bool) > > that turns delivery of async signals on or off. > > However, I don't think that would work. To fix the locking > case, what we need to do is mask async signals during the > locking operation, and only unmask them once the lock has > been acquired. We might write a context manager with an > __enter__ method like this: > > def __enter__(self): > mask_async_signals(True) > try: > self.acquire() > finally: > mask_async_signals(False) > > But then we have the same problem again -- if a Keyboard > Interrupt occurs after mask_async_signals(False) but > before __enter__ returns, the lock won't get released. > > Another approach would be to provide a context manager > such as > > async_signals_masked(bool) > > Then the whole locking operation could be written as > > with async_signals_masked(True): > lock.acquire() > try: > with async_signals_masked(False): > # do stuff here > finally: > lock.release() > > Now there's no possibility for a KeyboardInterrupt to > be delivered until we're safely inside the body, but we've > lost the ability to capture the pattern in the form of > a context manager. If async_signals_masked is implemented in C and can be used as a decorator, then you could do: class MyLock: @async_signals_masked(True) def __enter__(self): ... @async_signals_masked(True) def __exit__(self, *exc_info): ... 
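For synchronous pure-Python code, the closest available approximation to such masking is deferring the Python-level handler rather than truly blocking the signal; a sketch (which has exactly the race window discussed above, since the handler swap itself is not atomic):

```python
import signal
from contextlib import contextmanager


@contextmanager
def defer_sigint():
    """Queue SIGINT during the block and re-deliver it on exit.

    A sketch only: unlike a real C-level mask, a KeyboardInterrupt can
    still slip in before signal.signal() takes effect, which is the
    race this thread is about.
    """
    pending = []
    previous = signal.signal(
        signal.SIGINT, lambda signum, frame: pending.append(signum)
    )
    try:
        yield
    finally:
        signal.signal(signal.SIGINT, previous)
        if pending:
            # Deliver the deferred interrupt now that the block is done.
            raise KeyboardInterrupt
```

Like signal.signal itself, this only works in the main thread, and it defers only the Python-level handler; it is not a substitute for the interpreter-level support being discussed.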
However, there's still a problem: in async code, you can yield out of an async-signals-masked section, either because you have an async context manager: @async_signals_masked(True) async def __aexit__(self, *exc_info): ... or because you're using the context manager directly to protect some delicate code: async def acquire(self): # Make sure KeyboardInterrupt can't leave us with a half-taken lock with async_signals_masked(True): if self._already_held: await self._lock_released_event ... So what should this async_signals_masked state do when we yield out from under it? If it's a thread-local, then the masking state will "leak" into other async function callstacks (or similar for regular generators), which is pretty bad. But it can't be just frame-local data either, because we want the masking state to be inherited by any subroutines we call from inside the masked block. This is why trio uses the stack walking trick: it means that when you use 'yield' to switch callstacks, the async signal masking state gets switched too, automatically and atomically. So maybe a better way would be to do something more like what trio does. For example, we could have a flag on a function frame that says "this frame (and the code in it) should not be interrupted", and then in the bytecode loop when a signal arrives, walk up the call stack to see if any of these flags are set before running the Python-level signal handler. There are some efficiency and complexity questions here, but maybe it's not too bad (signals are only received rarely, and maybe there are some tricks to reduce the overhead). > The only way out of this I can think of at the moment is > to make the above pattern part of the context manager > protocol itself. In other words, async exceptions are > always masked while the __enter__ and __exit__ methods > are executing, and unmasked while the body is executing.
This would make me nervous, because context managers are used for all kinds of things, and only some of them involve delicate resource manipulation. Masking async exceptions is a trade-off: if you do it at the wrong place, then you can end up with a program that refuses to respond to control-C, which is pretty frustrating. There are also some rather nasty cases, like I think Popen.__exit__ might block waiting for SIGCHLD to be delivered? And anyway, you still have to solve the problem of how you communicate this state to subroutines called by __(a)enter__ and __(a)exit__, but not let it leak when you yield. Once you solve that I think you have 95% of the machinery you need to make this user-controllable. -n -- Nathaniel J. Smith -- https://vorpus.org From ncoghlan at gmail.com Wed Jun 28 22:57:23 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 29 Jun 2017 12:57:23 +1000 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: References: <20170623190209.GQ3149@ando.pearwood.info> <20170623225603.GA27276@cskk.homeip.net> Message-ID: On 29 June 2017 at 05:14, Nathaniel Smith wrote: > What about modules that want to raise ImportError to indicate that they > aren't available on the current system, perhaps because some of their > dependencies are missing? For example, 'import ssl' should raise an > ImportError if 'ssl.py' is present but '_ssl.so' is missing; the existence > of '_ssl.so' is an internal implementation detail. And perhaps 'import > trio.ssl' should raise ImportError if 'ssl' is missing. (Historically not > all systems have openssl available, so this is a common situation where > existing libraries contain ImportError guards.) > > With PEP 479 there was a different and better way to generate a > StopIteration if you wanted one (just 'return'). 
Here I'm afraid existing > projects might actually be relying on the implicit exception leakage in > significant numbers :-/ Hence "it may be worth filing an RFE so we can discuss the implications", rather than "we should definitely do it". The kinds of cases you cite will already fail for import guards that check for "exc.name == 'the_module_being_imported'" though, so I'd be OK with requiring modules that actually wanted that behaviour to do: try: import other_module except ImportError as exc: raise ImportError("other_module is missing", name=__name__, path=__file__) from exc The guard around exec_module() would then look something like: try: loader.exec_module(mod) except ImportError as exc: if exc.name == mod.__name__: raise msg = f"Failed to import '{exc.name}' from '{mod.__name__}'" raise RuntimeError(msg) from exc Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From greg.ewing at canterbury.ac.nz Thu Jun 29 00:14:31 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 29 Jun 2017 16:14:31 +1200 Subject: [Python-ideas] Asynchronous exception handling around with/try statement borders In-Reply-To: References: <5953ACDD.3040806@canterbury.ac.nz> Message-ID: <59547EA7.6010201@canterbury.ac.nz> Nathaniel Smith wrote: > So what should this async_signals_masked state do when we yield out > from under it? If it's a thread-local, then the masking state will > "leak" into other async function callstacks (or similar for regular > generators), which is pretty bad. But it can't be just frame-local > data either, because we want the masking state to be inherited by and > subroutines we call from inside the masked block. That should be easy enough, shouldn't it? When entering a new frame, copy the mask state from the calling frame. 
> For example, we could have a flag on a function frame that says > "this frame (and the code in it) should not be interrupted", and then > in the bytecode loop when a signal arrives, walk up the call stack to > see if any of these flags are set before running the Python-level > signal handler. That would make it impossible to temporarily unmask async signals in a region where they're masked. An example of a situation where you might want to do that is in an implementation of lock.acquire(). If the thread needs to block while waiting for the lock to become available, you probably want to allow ctrl-C to interrupt the thread while it's blocked. > This would make me nervous, because context managers are used for all > kinds of things, and only some of them involve delicate resource > manipulation. The next step I had in mind was to extend the context manager protocol so that the context manager can indicate whether it wants async signals masked, so it would only happen for things like lock that really need it. -- Greg From njs at pobox.com Thu Jun 29 01:05:33 2017 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 28 Jun 2017 22:05:33 -0700 Subject: [Python-ideas] Asynchronous exception handling around with/try statement borders In-Reply-To: <59547EA7.6010201@canterbury.ac.nz> References: <5953ACDD.3040806@canterbury.ac.nz> <59547EA7.6010201@canterbury.ac.nz> Message-ID: On Wed, Jun 28, 2017 at 9:14 PM, Greg Ewing wrote: > Nathaniel Smith wrote: >> >> So what should this async_signals_masked state do when we yield out >> from under it? If it's a thread-local, then the masking state will >> "leak" into other async function callstacks (or similar for regular >> generators), which is pretty bad. But it can't be just frame-local >> data either, because we want the masking state to be inherited by and >> subroutines we call from inside the masked block. > > > That should be easy enough, shouldn't it? 
When entering a new > frame, copy the mask state from the calling frame. Right, that approach would be semantically equivalent to the walking-the-call-stack approach, just it shifts some of the cost around (it makes signals cheaper by making each function call a tiny bit more expensive). >> For example, we could have a flag on a function frame that says >> "this frame (and the code in it) should not be interrupted", and then >> in the bytecode loop when a signal arrives, walk up the call stack to >> see if any of these flags are set before running the Python-level >> signal handler. > > > That would make it impossible to temporarily unmask async > signals in a region where they're masked. > > An example of a situation where you might want to do that is > in an implementation of lock.acquire(). If the thread needs to > block while waiting for the lock to become available, you > probably want to allow ctrl-C to interrupt the thread while > it's blocked. So trio actually does allow this kind of nesting -- for any given frame the special flag can be unset, set to True, or set to False, and the signal handler walks up until it finds the first one frame where it's set and uses that. So I guess the equivalent would be two flags, or a little enum field in the frame object, or something like that. >> This would make me nervous, because context managers are used for all >> kinds of things, and only some of them involve delicate resource >> manipulation. > > > The next step I had in mind was to extend the context manager > protocol so that the context manager can indicate whether it > wants async signals masked, so it would only happen for things > like lock that really need it. A magic (implemented in C) decorator like @async_signals_masked I think would be the simplest way to do this extension. 
(Or maybe better call it @no_signal_delivery, because it would have to block all running of signal handlers; the interpreter doesn't know whether a signal handler will raise an exception until it calls it.) -n -- Nathaniel J. Smith -- https://vorpus.org From greg.ewing at canterbury.ac.nz Thu Jun 29 05:18:11 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 29 Jun 2017 21:18:11 +1200 Subject: [Python-ideas] Asynchronous exception handling around with/try statement borders In-Reply-To: References: <5953ACDD.3040806@canterbury.ac.nz> <59547EA7.6010201@canterbury.ac.nz> Message-ID: <5954C5D3.90800@canterbury.ac.nz> Nathaniel Smith wrote: > A magic (implemented in C) decorator like @async_signals_masked I > think would be the simplest way to do this extension. I don't have a good feeling about that approach. While implementing the decorator in C might be good enough in CPython to ensure no window of opportunity exists to leak a signal, that might not be true in other Python implementations. -- Greg From levkivskyi at gmail.com Thu Jun 29 05:40:29 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Thu, 29 Jun 2017 11:40:29 +0200 Subject: [Python-ideas] Runtime types vs static types In-Reply-To: References: Message-ID: Sorry, I was not able to completely digest the OP, but I think there are some points I need to clarify. 1. Distinction between runtime classes and static types is quite sane and a simple idea. A runtime class is something associated with an actual object, while static type is something associated with an AST node. Mixing them would be misleading, since they "live in parallel planes". Although it is true that there is a type that corresponds to every runtime class. 2. Currently isinstance(obj, List[int]) fails with TypeError, ditto for issubclass and for user defined generic classes: class C(Generic[T]): ... 
isinstance(obj, C) # works, returns only True or False isinstance(obj, C[int]) # TypeError issubclass(cls, C) # works issubclass(cls, C[int]) # raises TypeError 3. User defined protocols will by default raise TypeError with isinstance(), but the user can opt-in (using @runtime decorator) for the same behavior as normal generics, this is how typing.Iterable currently works: class MyIter: def __iter__(self): return [42] isinstance(MyIter(), Iterable) # True isinstance(MyIter(), Iterable[int]) # TypeError class A(Protocol[T]): x: T isinstance(obj, A) # TypeError @runtime class B(Protocol[T]): y: T isinstance(obj, B) # True or False depending on whether 'obj' has attribute 'y' isinstance(obj, B[int]) # Still TypeError -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From k7hoven at gmail.com Thu Jun 29 09:30:21 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Thu, 29 Jun 2017 16:30:21 +0300 Subject: [Python-ideas] Runtime types vs static types In-Reply-To: References: Message-ID: On Thu, Jun 29, 2017 at 12:40 PM, Ivan Levkivskyi wrote: > Sorry, I was not able to completely digest the OP, but I think there are > some points I need to clarify. > > 1. Distinction between runtime classes and static types is quite sane and > a simple idea. > A runtime class is something associated with an actual object, > while static type is something associated with an AST node. > Here, I'm more concerned about *types* at runtime vs *types* for static checking. Some of these may be normal classes, but those are not the problematic ones. > Mixing them would be misleading, since they "live in parallel planes". > Although it is true that there is a type that corresponds to every runtime > class. > It's not clear to me what you mean by mixing them. They are already partly mixed, right? I think you are speaking from the static-checker point of view, where there are only types, and runtime behavior is completely separate (at least in some sense). This view works especially well when types are in comments or stubs.
This view works especially well when types are in comments or stubs. But when the types are also present at runtime, I don't think this view is completely realistic. Some of the objects that represent types are regular classes, and some of them may be only types (like Union[str, bytes] or Sequence[int]), but not normal Python classes that you instantiate. Even if they represent types, not classes, they exist at runtime, and there should at least *exist* a well-defined answer to whether an object is an 'instance' of a given type. (Not sure if 'instance' is the word that should be used here) Ignoring that *types* are also a runtime concept seems dangerous to me. > 2. Currently isinstance(obj, List[int]) fails with TypeError, ditto for > issubclass and > for user defined generic classes: > > class C(Generic[T]): > ... > > isinstance(obj, C) # works, returns only True or False > isinstance(obj, C[int]) # TypeError > issubclass(cls, C) # works > issubclass(cls, C[int]) # raises TypeError > > I suppose that's the best that isinstance can do in these cases. But I'm not sure if isinstance and issubclass should try to do as much as reasonable, or if they should just handle the normal classes, and let some new function, say implements(), take care of the other *types*. Anyway, this whole concept of two 'parallel universes' is problematic, because the universes overlap in at least two different ways. -- Koos 3.
User defined protocols will by default raise TypeError with isinstance(), > but the user can opt-in (using @runtime decorator) for the same behavior > as normal generics, > this is how typing.Iterable currently works: > > class MyIter: > def __iter__(self): > return iter([42]) > > isinstance(MyIter(), Iterable) # True > isinstance(MyIter(), Iterable[int]) # TypeError > > class A(Protocol[T]): > x: T > isinstance(obj, A) # TypeError > > @runtime > class B(Protocol[T]): > y: T > isinstance(obj, B) # True or False depending on whether 'obj' has > attribute 'y' > isinstance(obj, B[int]) # Still TypeError > > -- > Ivan > > > -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From zuo at chopin.edu.pl Thu Jun 29 18:57:42 2017 From: zuo at chopin.edu.pl (Jan Kaliszewski) Date: Fri, 30 Jun 2017 00:57:42 +0200 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <20170627071245.GY3149@ando.pearwood.info> <595219F7.7030804@canterbury.ac.nz> Message-ID: <20170630005742.3ec21b38@grzmot> 2017-06-27 Stephan Houben dixit: > Is "itertools.chain" actually that common? > Sufficiently common to warrant its own syntax? Please note that this can be turned around: maybe these operations are not as common as they could be precisely because of the burden of importing them from a separate module -- after all, we are talking about a very simple operation, so using lists and `+` just wins because of our (programmers') laziness. :-) Cheers.
*j From zuo at chopin.edu.pl Thu Jun 29 19:09:51 2017 From: zuo at chopin.edu.pl (Jan Kaliszewski) Date: Fri, 30 Jun 2017 01:09:51 +0200 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> Message-ID: <20170630010951.3fc7d17b@grzmot> 2017-06-25 Serhiy Storchaka dixit: > 25.06.17 15:06, lucas via Python-ideas wrote: > > I often use generators, and itertools.chain on them. > > What about providing something like the following: > > > > a = (n for n in range(2)) > > b = (n for n in range(2, 4)) > > tuple(a + b) # -> 0 1 2 3 [...] > It would be weird if the addition is only supported for instances of > the generator class, but not for other iterators. Why (n for n in > range(2)) > + (n for n in range(2, 4)) works, but iter(range(2)) + iter(range(2, > 4)) and iter([0, 1]) + iter((2, 3)) don't? itertools.chain() supports > arbitrary iterators. Therefore you will need to implement the __add__ > method for *all* iterators in the world. > > However itertools.chain() accepts not just *iterators*. [...] But implementation of the OP's proposal does not need to be based on __add__ at all. It could be based on extending the current behaviour of the `+` operator itself. Now this behavior is (roughly): try left side's __add__, if failed try right side's __radd__, if failed raise TypeError. New behavior could be (again: roughly): try left side's __add__, if failed try right side's __radd__, if failed try __iter__ of both sides and chain them (creating a new iterator [*]), if failed raise TypeError. And similarly, for `+=`: try __iadd__..., try __add__..., try __iter__..., raise TypeError. Cheers. *j [*] Preferably using the existing `yield from` mechanism -- because, in case of generators, it would provide a way to combine ("concatenate") *generators*, preserving the semantics of all their __next__(), send(), throw() niceties...
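Jan's proposed `+` fallback can be approximated at user level today with a small wrapper class. This is only a rough sketch of the chaining semantics (the name `chainable` is hypothetical, not part of any proposal), not an interpreter change:

```python
from itertools import chain

class chainable:
    """Hypothetical wrapper sketching the proposed fallback:
    where plain '+' would fail, chain the two iterables instead."""
    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self._it

    def __add__(self, other):
        # self first, then the right-hand operand
        return chainable(chain(self._it, iter(other)))

    def __radd__(self, other):
        # left-hand operand first, then self
        return chainable(chain(iter(other), self._it))

a = chainable(n for n in range(2))
b = (n for n in range(2, 4))
print(tuple(a + b))  # (0, 1, 2, 3)
```

The real proposal would build this fallback into the interpreter's binary-operator dispatch; the wrapper only shows that the chaining semantics themselves are straightforward.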
From fakedme+py at gmail.com Thu Jun 29 19:33:12 2017 From: fakedme+py at gmail.com (Soni L.) Date: Thu, 29 Jun 2017 20:33:12 -0300 Subject: [Python-ideas] Python 4: Concatenation Message-ID: <15beb1ca-79e1-c257-864c-8155d549ec10@gmail.com> Step 1. get rid of + for strings, lists, etc. (string/list concatenation is not addition) Step 2. add concatenation operator for strings, lists, and basically anything that can be iterated. effectively an overloadable itertools.chain. (list cat list = new list, not iterator, but effectively makes itertools.chain useless.) Step 3. add decimal concatenation operator for numbers: 2 cat 3 == 23, 22 cat 33 = 2233, etc. if you need bitwise concatenation, you're already in bitwise "hack" land so do it yourself. (no idea why bitwise is considered hacky as I use it all the time, but oh well) Step 4. make it into python 4, since it breaks backwards compatibility. From rosuav at gmail.com Thu Jun 29 19:45:20 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 30 Jun 2017 09:45:20 +1000 Subject: [Python-ideas] Python 4: Concatenation In-Reply-To: <15beb1ca-79e1-c257-864c-8155d549ec10@gmail.com> References: <15beb1ca-79e1-c257-864c-8155d549ec10@gmail.com> Message-ID: On Fri, Jun 30, 2017 at 9:33 AM, Soni L. wrote: > Step 1. get rid of + for strings, lists, etc. (string/list concatenation is > not addition) > > Step 2. add concatenation operator for strings, lists, and basically > anything that can be iterated. effectively an overloadable itertools.chain. > (list cat list = new list, not iterator, but effectively makes > itertools.chain useless.) > > Step 3. add decimal concatenation operator for numbers: 2 cat 3 == 23, 22 > cat 33 = 2233, etc. if you need bitwise concatenation, you're already in > bitwise "hack" land so do it yourself. (no idea why bitwise is considered > hacky as I use it all the time, but oh well) Nope. Practicality beats purity. 
Something like this exists in REXX ("+" means addition, and "||" means concatenation), and it doesn't help (though it's necessary there as REXX doesn't have distinct data types for strings and numbers). String concatenation might as well be addition. It's close enough and it makes perfect sense. Get a bunch of people together and ask them this: "If 5+6 means 11, then what does 'hello' + 'world' mean?". Most of them will assume it means concatenation. ChrisA From steve at pearwood.info Thu Jun 29 20:48:49 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 30 Jun 2017 10:48:49 +1000 Subject: [Python-ideas] Python 4: Concatenation In-Reply-To: <15beb1ca-79e1-c257-864c-8155d549ec10@gmail.com> References: <15beb1ca-79e1-c257-864c-8155d549ec10@gmail.com> Message-ID: <20170630004847.GJ3149@ando.pearwood.info> On Thu, Jun 29, 2017 at 08:33:12PM -0300, Soni L. wrote: > Step 1. get rid of + for strings, lists, etc. (string/list concatenation > is not addition) I agree that using + for concatenation is sub-optimal, & is a better choice, but we're stuck with it. And honestly it's not *that* big a deal that I would break backwards compatibility for this. Fixing the "problem" is more of a pain than just living with it. > Step 2. add concatenation operator for strings, lists, and basically > anything that can be iterated. effectively an overloadable > itertools.chain. (list cat list = new list, not iterator, but > effectively makes itertools.chain useless.) Chaining is not concatenation. Being able to concatenate two strings (or two tuples, two lists) and get an actual string rather than a chained iterator is a good thing. word = (stem + suffix).upper() Being able to chain arbitrary iterables and get an iterator is also a good thing: chain(astring, alist, atuple) If we had a chaining operator, it too would have to accept arbitrary iterables and return an iterator. > Step 3. add decimal concatenation operator for numbers: 2 cat 3 == 23, When would you need that? 
What use-case for concatenation of numbers is there, and why is it important enough to use an operator instead of a custom function? The second part is the most critical -- I'm sure there are uses for concatenating digits to get integers, although I can't think of any right now -- but ASCII operators are in short supply, why are we using one for such a specialised and rarely used function? Things would be different if we had a dedicated concatenation operator, then we could allow things like 1 & '1' returns '11' say but we don't and I don't expect that allowing this is important enough to force the backwards compatibility break. > 22 cat 33 = 2233, etc. if you need bitwise concatenation, you're already > in bitwise "hack" land so do it yourself. (no idea why bitwise is > considered hacky as I use it all the time, but oh well) > > Step 4. make it into python 4, since it breaks backwards compatibility. Python 4 will not be a major backwards incompatible version like Python 3 was. It will be just a regular evolutionary (rather than revolutionary) upgrade from 3.9. When I want to talk about major backwards incompatibilities, I talk about "Python 5000", by analogy to "Python 3000". -- Steve From rymg19 at gmail.com Thu Jun 29 22:01:00 2017 From: rymg19 at gmail.com (rymg19 at gmail.com) Date: Thu, 29 Jun 2017 22:01:00 -0400 Subject: [Python-ideas] Python 4: Concatenation In-Reply-To: <15beb1ca-79e1-c257-864c-8155d549ec10@gmail.com> Message-ID: I feel like this would literally break the world for almost no real benefit... -- Ryan (????) Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else http://refi64.com On Jun 29, 2017 at 6:33 PM, > wrote: Step 1. get rid of + for strings, lists, etc. (string/list concatenation is not addition) Step 2. add concatenation operator for strings, lists, and basically anything that can be iterated. effectively an overloadable itertools.chain.
(list cat list = new list, not iterator, but effectively makes itertools.chain useless.) Step 3. add decimal concatenation operator for numbers: 2 cat 3 == 23, 22 cat 33 = 2233, etc. if you need bitwise concatenation, you're already in bitwise "hack" land so do it yourself. (no idea why bitwise is considered hacky as I use it all the time, but oh well) Step 4. make it into python 4, since it breaks backwards compatibility. _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From fakedme+py at gmail.com Thu Jun 29 22:14:46 2017 From: fakedme+py at gmail.com (Soni L.) Date: Thu, 29 Jun 2017 23:14:46 -0300 Subject: [Python-ideas] Python 4: Concatenation In-Reply-To: <20170630004847.GJ3149@ando.pearwood.info> References: <15beb1ca-79e1-c257-864c-8155d549ec10@gmail.com> <20170630004847.GJ3149@ando.pearwood.info> Message-ID: On 2017-06-29 09:48 PM, Steven D'Aprano wrote: > On Thu, Jun 29, 2017 at 08:33:12PM -0300, Soni L. wrote: > >> Step 1. get rid of + for strings, lists, etc. (string/list concatenation >> is not addition) > I agree that using + for concatenation is sub-optimal, & is a better > choice, but we're stuck with it. And honestly it's not *that* big a deal > that I would break backwards compatibility for this. Fixing the > "problem" is more of a pain than just living with it. > > >> Step 2. add concatenation operator for strings, lists, and basically >> anything that can be iterated. effectively an overloadable >> itertools.chain. (list cat list = new list, not iterator, but >> effectively makes itertools.chain useless.) > Chaining is not concatenation. > > Being able to concatenate two strings (or two tuples, two lists) and get an actual > string rather than a chained iterator is a good thing. 
> > word = (stem + suffix).upper() > > Being able to chain arbitrary iterables and get an iterator is also a > good thing: > > chain(astring, alist, atuple) > > If we had a chaining operator, it too would have to accept arbitrary > iterables and return an iterator. astring cat alist is undefined for string (since strings are very specific about types), so it would return a list. alist cat atuple would return a list, because the list comes first. This is *EFFECTIVELY* equivalent to chaining, since iterating the results of these concatenations produces the *exact* same results as iterating their chainings. (And don't say "performance" - CPython has a GIL, and Python makes many convenience-over-performance tradeoffs like this.) > > >> Step 3. add decimal concatenation operator for numbers: 2 cat 3 == 23, > When would you need that? What use-case for concatenation of numbers is > there, and why is it important enough to use an operator instead of a > custom function? > > The second part is the most critical -- I'm sure there are uses for > concatenating digits to get integers, although I can't think of any > right now -- but ASCII operators are in short supply, why are we using > one for such a specialised and rarely used function? > > Things would be different if we had a dedicated concatenation operator, > then we could allow things like > > 1 & '1' returns '11' > > say but we don't and I don't expect that allowing this is important > enough to force the backwards compatibility break. Since we'd have a concatenation operator, why not extend them to integers? No reason not to, really. In practice tho, it would never be used. This was never about integers, even if I did mention them. > > >> 22 cat 33 = 2233, etc. if you need bitwise concatenation, you're already >> in bitwise "hack" land so do it yourself. (no idea why bitwise is >> considered hacky as I use it all the time, but oh well) >> >> Step 4. make it into python 4, since it breaks backwards compatibility. 
> Python 4 will not be a major backwards incompatible version like Python > 3 was. It will be just a regular evolutionary (rather than > revolutionary) upgrade from 3.9. > > When I want to talk about major backwards incompatibilities, I talk > about "Python 5000", by analogy to "Python 3000". > > > This isn't a *major* backwards incompatibility. Unlike with unicode/strings, a dumb static analysis program can trivially replace + with the concatenation operator, whatever that may be. Technically, nothing forces us to remove + from strings and such and the itertools stuff - we could just make them deprecated in python 4, and remove them in python 5. (PS: I don't propose using literally "cat" for concatenation. That was just a placeholder.) From rosuav at gmail.com Fri Jun 30 00:00:22 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 30 Jun 2017 14:00:22 +1000 Subject: [Python-ideas] Python 4: Concatenation In-Reply-To: References: <15beb1ca-79e1-c257-864c-8155d549ec10@gmail.com> <20170630004847.GJ3149@ando.pearwood.info> Message-ID: On Fri, Jun 30, 2017 at 12:14 PM, Soni L. wrote: > This isn't a *major* backwards incompatibility. Unlike with unicode/strings, > a dumb static analysis program can trivially replace + with the > concatenation operator, whatever that may be. Technically, nothing forces us > to remove + from strings and such and the itertools stuff - we could just > make them deprecated in python 4, and remove them in python 5. It wouldn't be quite that trivial though. If all you do is replace "+" with "&", you've broken anything that uses numeric addition. Since this is a semantic change, you can't defer it to run-time (the way a JIT compiler like PyPy could), and you can't afford to have it be "mostly right but might have edge cases" like something based on type hints would be. So there'd be some human work involved, as with the bytes/text distinction. 
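The dispatch problem is easy to see in miniature: the same `+` means addition or concatenation depending only on the run-time types, so a source-to-source rewriter has nothing static to go on. A minimal illustration (not code from the thread):

```python
def combine(a, b):
    # Nothing in this source says whether '+' is numeric addition
    # or sequence concatenation; only the run-time types decide.
    return a + b

print(combine(2, 3))        # 5 -- addition
print(combine("2", "3"))    # '23' -- concatenation
print(combine([2], [3]))    # [2, 3] -- concatenation again
```

A tool that blindly rewrote the `+` in `combine` to a concatenation operator would break every numeric caller, and vice versa.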
What you're wanting to do is take one operator ("+") and split it into two roles (addition and concatenation). That means getting into the programmer's head, so it can't be completely automated. Even if it CAN be fully automated, though, what would you gain? You've made str+str no longer valid - to what end? Here's a counter-proposal: Start with your step 2, and create a new __concat__ magic method and corresponding operator. Then str gets a special case:

    class str(str):  # let's pretend
        def __concat__(self, other):
            return self + str(other)

And tuple gets a special case:

    class tuple(tuple):  # pretend again
        def __concat__(self, other):
            return *self, *other

And maybe a few others (list, set, possibly dict). For everything else, object() will handle them:

    class object(object):  # mind-bending
        def __concat__(self, other):
            return itertools.chain(iter(self), iter(other))

Since this isn't *changing* the meaning of anything, it's backwards compatible. You gain an explicit concatenation operator, the default case is handled by Python's standard mechanisms, the special cases are handled by Python's standard mechanisms, and it's all exactly what people would expect. Then the use of '+' to concatenate strings can be deprecated without removal (or, more likely, kept fully supported by the language but deprecated in style guides), and you've mostly achieved what you sought. Your challenge: Find a suitable operator to use. It wants to be ASCII, and it has to be illegal syntax in current Python versions. It doesn't have to be a single character, but it should be short (two is okay, three is the number thou shalt stop at, four thou shalt not count, and five is right out) and easily typed, since string concatenation is incredibly common. It should ideally evoke "concatenation", but that isn't strictly necessary (the link between "@" and "matrix multiplication" is tenuous at best). Good luck.
:) For my part, I'm -0.5 on my own counter-proposal, but that's a fair slab better than the -1000 that I am on the version that breaks backward compatibility for minimal real gain. ChrisA From electro.nnn at gmail.com Fri Jun 30 01:07:58 2017 From: electro.nnn at gmail.com (electron) Date: Fri, 30 Jun 2017 09:37:58 +0430 Subject: [Python-ideas] Python 4: Concatenation Message-ID: ---------- Forwarded message ---------- From: Chris Angelico Date: Fri, Jun 30, 2017 at 9:12 AM Subject: Re: [Python-ideas] Python 4: Concatenation To: electron On Fri, Jun 30, 2017 at 2:38 PM, electron wrote: > On Fri, Jun 30, 2017 at 8:30 AM, Chris Angelico wrote: >> >> Your challenge: Find a suitable operator to use. It wants to be ASCII, >> and it has to be illegal syntax in current Python versions. It doesn't >> have to be a single character, but it should be short (two is okay, >> three is the number thou shalt stop at, four thou shalt not count, and >> five is right out) and easily typed, since string concatenation is >> incredibly common. It should ideally evoke "concatenation", but that >> isn't strictly necessary (the link between "@" and "matrix >> multiplication" is tenuous at best). > > > Just curious, does double dot `..` (also used in Lua) meet those conditions? I think so, but someone else may know of a way it'd be syntactically ambiguous or otherwise unsuitable. You'll do better to say that to the list. ChrisA -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Jun 30 01:45:15 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 30 Jun 2017 15:45:15 +1000 Subject: [Python-ideas] Python 4: Concatenation In-Reply-To: <15beb1ca-79e1-c257-864c-8155d549ec10@gmail.com> References: <15beb1ca-79e1-c257-864c-8155d549ec10@gmail.com> Message-ID: On 30 June 2017 at 09:33, Soni L. wrote: > Step 4. make it into python 4, since it breaks backwards compatibility. 
If a Python 4.0 ever happens, it will abide by the usual feature release compatibility restrictions (i.e. anything that it drops will have gone through programmatic deprecation in preceding 3.x releases). This means there won't be any abrupt changes in syntax or semantics the way there were for the 3.0 transition. http://www.curiousefficiency.org/posts/2014/08/python-4000.html goes into more detail on that topic (although some time after I wrote that article, we decided that there probably *will* just be a 3.10, rather than switching the numbering to 4.0) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Fri Jun 30 02:04:17 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 30 Jun 2017 16:04:17 +1000 Subject: [Python-ideas] Python 4: Concatenation In-Reply-To: References: <15beb1ca-79e1-c257-864c-8155d549ec10@gmail.com> <20170630004847.GJ3149@ando.pearwood.info> Message-ID: <20170630060417.GK3149@ando.pearwood.info> On Thu, Jun 29, 2017 at 11:14:46PM -0300, Soni L. wrote: > astring cat alist is undefined for string (since strings are very > specific about types), so it would return a list. > > alist cat atuple would return a list, because the list comes first. This would be strongly unacceptable to me. If iterating over the items was the *only* thing people ever did with sequences, that *might* be acceptable, but it isn't. We do lots of other things, depending on what the sequence is: - we sort lists, append to them, slice them, etc; - we convert strings to uppercase, search them, etc. It is important to know that if you concatenate something to a string, it will either give a string, or noisily fail, rather than silently convert to a different type that doesn't support string operations. Now admittedly that rule can be broken by third-party classes (since they can overload operators to do anything), but that's more of a problem in theory than in practice. 
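The type-preserving behaviour of `+` versus the type-erasing behaviour of chaining can be shown side by side, using the stdlib `itertools.chain` as the chaining tool:

```python
from itertools import chain

stem, suffix = "concat", "enation"

# Concatenation preserves the operand type, so str methods still apply:
word = (stem + suffix).upper()
print(word)  # CONCATENATION

# Chaining returns a generic iterator; the string-ness is gone:
chained = chain(stem, suffix)
print(type(chained).__name__)    # chain -- there is no .upper() here
print("".join(chained).upper())  # must materialise a str before using str methods
```

With `+`, a type error surfaces immediately and noisily; with chaining, the result silently stops supporting the operations the caller expected.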
Your suggestion would make it a problem for builtins as well as (badly-behaved?) third-party classes. For when you don't care about the type, you just want it to be an iterator, that's where chaining is useful, and whether it is a chain function or a chain operator, it should accept any iterable and return an iterator. Concatenation is not the same as general chaining, although they are related. Concatenation should return the same type as its operands. Chaining can just return an arbitrary iterator. > (And don't say "performance" - CPython has a GIL, and Python makes > many convenience-over-performance tradeoffs like this.) Are you aware that CPython doesn't just have a GIL because the core devs think it would be funny to slow the language down? The GIL actually makes CPython faster: so far, all attempts to remove the GIL have made CPython slower. So your "performance" tradeoff goes the other way: without the GIL, Python code would be slower. (Perhaps the Gilectomy will change that in the future, but at the moment, it is fair to say that the GIL is an optimization that makes Python faster, not slower.) > Since we'd have a concatenation operator, why not extend them to > integers? No reason not to, really. That's the wrong question. Never ask "why not add this to the language?", the right question is "why should we add this?". We don't just add bloat and confusing, useless features to the language because nobody can think of a reason not to. Features have a cost: they cost developer effort to program and maintain, they cost effort to maintain the tests and documentation and to fix bugs, they cost users effort to learn about them and deal with them. Every feature has to pay its own way: the benefits have to outweigh the costs. 
-- Steve From cory at lukasa.co.uk Fri Jun 30 04:47:22 2017 From: cory at lukasa.co.uk (Cory Benfield) Date: Fri, 30 Jun 2017 09:47:22 +0100 Subject: [Python-ideas] Python 4: Concatenation In-Reply-To: References: <15beb1ca-79e1-c257-864c-8155d549ec10@gmail.com> <20170630004847.GJ3149@ando.pearwood.info> Message-ID: <07D713F3-2A4D-4076-8C1A-B1B7AEAAB201@lukasa.co.uk> > On 30 Jun 2017, at 03:14, Soni L. wrote: > > This isn't a *major* backwards incompatibility. Unlike with unicode/strings, a dumb static analysis program can trivially replace + with the concatenation operator, whatever that may be. Technically, nothing forces us to remove + from strings and such and the itertools stuff - we could just make them deprecated in python 4, and remove them in python 5. No it can't, not unless you're defining concatenation as identical to numeric addition (which I saw in your original post you are not). For example:

    def mymethod(a, b):
        return a + b

What should the static analysis program do here? Naturally, it's unclear. The only way to be even remotely sure in the current Python world where type hinting is optional and gradual is to do what PyPy does, which is to run the entire program and JIT it, and even then PyPy puts in guards to confirm that it doesn't get caught out if and when an assumption is wrong. So yes, I'd say this is at least as bad as the unicode/bytes divide in terms of static analysis: unless you make type hinting mandatory for any function including the symbol '+', there is no automatic transformation that can be made here. Cory -------------- next part -------------- An HTML attachment was scrubbed...
URL: From victor.stinner at gmail.com Fri Jun 30 07:33:38 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 30 Jun 2017 13:33:38 +0200 Subject: [Python-ideas] Python 4: Concatenation In-Reply-To: <15beb1ca-79e1-c257-864c-8155d549ec10@gmail.com> References: <15beb1ca-79e1-c257-864c-8155d549ec10@gmail.com> Message-ID: 2017-06-30 1:33 GMT+02:00 Soni L. : > Step 3. add decimal concatenation operator for numbers: 2 cat 3 == 23, 22 > cat 33 = 2233, etc. if you need bitwise concatenation, you're already in > bitwise "hack" land so do it yourself. (no idea why bitwise is considered > hacky as I use it all the time, but oh well) I *never* needed "2 cat 3 == 23". Strange operator :-) Victor From jw14896.2014 at my.bristol.ac.uk Fri Jun 30 07:51:26 2017 From: jw14896.2014 at my.bristol.ac.uk (Jamie Willis) Date: Fri, 30 Jun 2017 12:51:26 +0100 Subject: [Python-ideas] Python 4: Concatenation In-Reply-To: References: <15beb1ca-79e1-c257-864c-8155d549ec10@gmail.com> Message-ID: Just as an aside, if a concatenation operator *was* included, a suitable operator would be "++"; this is the concatenation operator in languages like Haskell (for strings) and the majority of Scala cases. Alternatively "<>" is an option, being the monoidal append operator in Haskell, which retains a certain similarity. I suggest these purely for their accepted usage, which means they should be easier to recognize. Jamie On 30 Jun 2017 12:35 pm, "Victor Stinner" wrote: > 2017-06-30 1:33 GMT+02:00 Soni L. : > > Step 3. add decimal concatenation operator for numbers: 2 cat 3 == 23, 22 > > cat 33 = 2233, etc. if you need bitwise concatenation, you're already in > > bitwise "hack" land so do it yourself. (no idea why bitwise is considered > > hacky as I use it all the time, but oh well) > > I *never* needed "2 cat 3 == 23".
Strange operator :-) > > Victor > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From erik.m.bray at gmail.com Fri Jun 30 08:22:22 2017 From: erik.m.bray at gmail.com (Erik Bray) Date: Fri, 30 Jun 2017 14:22:22 +0200 Subject: [Python-ideas] + operator on generators In-Reply-To: <20170630010951.3fc7d17b@grzmot> References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170630010951.3fc7d17b@grzmot> Message-ID: On Fri, Jun 30, 2017 at 1:09 AM, Jan Kaliszewski wrote: > 2017-06-25 Serhiy Storchaka dixit: > >> 25.06.17 15:06, lucas via Python-ideas ????: > >> > I often use generators, and itertools.chain on them. >> > What about providing something like the following: >> > >> > a = (n for n in range(2)) >> > b = (n for n in range(2, 4)) >> > tuple(a + b) # -> 0 1 2 3 > [...] >> It would be weird if the addition is only supported for instances of >> the generator class, but not for other iterators. Why (n for n in >> range(2)) >> + (n for n in range(2, 4)) works, but iter(range(2)) + iter(range(2, >> 4)) and iter([0, 1]) + iter((2, 3)) don't? itertools.chain() supports >> arbitrary iterators. Therefore you will need to implement the __add__ >> method for *all* iterators in the world. >> >> However itertools.chain() accepts not just *iterators*. > [...] > > But implementation of the OP's proposal does not need to be based on > __add__ at all. It could be based on extending the current behaviour of > the `+` operator itself. > > Now this behavior is (roughly): try left side's __add__, if failed try > right side's __radd__, if failed raise TypeError. 
> > New behavior could be (again: roughly): try left side's __add__, if > failed try right side's __radd__, if failed try __iter__ of both sides > and chain them (creating a new iterator?), if failed raise TypeError. > > And similarly, for `+=`: try __iadd__..., try __add__..., try > __iter__..., raise TypeError. I actually really like this proposal, in additional to the original proposal of using '+' to chain generators--I don't think it necessarily needs to be extended to *all* iterables. But this proposal goes one better. I just have to wonder what kind of strange unexpected bugs would result. For example now you could add a list to a string: >>> list(['a', 'b', 'c'] + 'def') ['a', 'b', 'c', 'd', 'e', 'f'] Personally, I really like this and find it natural. But it will break anything expecting this to be a TypeError. From steve at pearwood.info Fri Jun 30 08:43:34 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 30 Jun 2017 22:43:34 +1000 Subject: [Python-ideas] Python 4: Concatenation In-Reply-To: References: <15beb1ca-79e1-c257-864c-8155d549ec10@gmail.com> Message-ID: <20170630124334.GL3149@ando.pearwood.info> On Fri, Jun 30, 2017 at 12:51:26PM +0100, Jamie Willis wrote: > Just as an aside, if a concatenation operator *was* included, a suitable > operator would be "++", As mentioned earlier in this thread, that is not possible in Python as syntactically `x ++ y` would be parsed as `x + (+y)` (the plus binary operator followed by the plus unary operator). > this is the concatenation operator in languages > like Haskell (for strings) and the majority of Scala cases. Alternatively > "<>" is an alternative, being the monoidal append operator in Haskell, > which retains a certain similarly. "<>" is familiar to many people as "not equal" in various programming languages, including older versions of Python. 
I'm not entirely sure what connection "<>" has to append, it seems pretty arbitrary to me, although in fairness nearly all operators are arbitrary symbols if you go back far enough. -- Steve From srkunze at mail.de Fri Jun 30 09:10:08 2017 From: srkunze at mail.de (Sven R. Kunze) Date: Fri, 30 Jun 2017 15:10:08 +0200 Subject: [Python-ideas] Python 4: Concatenation In-Reply-To: References: <15beb1ca-79e1-c257-864c-8155d549ec10@gmail.com> Message-ID: On 30.06.2017 13:51, Jamie Willis wrote: > Just as an aside, if a concatenation operator *was* included, a > suitable operator would be "++", this is the concatenation operator in > languages like Haskell (for strings) and the majority of Scala cases. > Alternatively "<>" is an alternative, being the monoidal append > operator in Haskell, which retains a certain similarly. I suggest > these purely for their accepted usage, which means they should be more > reasonable to identify. '+' is the perfect concat operator. I love Python for this feature. Regards, Sven -------------- next part -------------- An HTML attachment was scrubbed... URL: From phd at phdru.name Fri Jun 30 09:24:24 2017 From: phd at phdru.name (Oleg Broytman) Date: Fri, 30 Jun 2017 15:24:24 +0200 Subject: [Python-ideas] Python 4: Concatenation In-Reply-To: References: <15beb1ca-79e1-c257-864c-8155d549ec10@gmail.com> Message-ID: <20170630132424.GA16285@phdru.name> On Fri, Jun 30, 2017 at 03:10:08PM +0200, "Sven R. Kunze" wrote: > '+' is the perfect concat operator. I love Python for this feature. +1 from me > Regards, > Sven Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. 
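Steven's parsing point from earlier in the thread can be verified directly with the `ast` module: `x ++ y` is already legal Python today, parsed as addition with a unary plus, which is why `++` is not available as a new operator:

```python
import ast

node = ast.parse("x ++ y", mode="eval").body
# A BinOp(Add) whose right operand is UnaryOp(UAdd) -- i.e. x + (+y)
assert isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add)
assert isinstance(node.right, ast.UnaryOp) and isinstance(node.right.op, ast.UAdd)

print(3 ++ 4)  # 7 -- ordinary arithmetic, not concatenation
```

Any candidate operator therefore has to be a token sequence that is a syntax error in current Python, which rules out `++` outright.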
From clint.hepner at gmail.com Fri Jun 30 10:14:47 2017 From: clint.hepner at gmail.com (Clint Hepner) Date: Fri, 30 Jun 2017 10:14:47 -0400 Subject: [Python-ideas] Python 4: Concatenation In-Reply-To: <20170630124334.GL3149@ando.pearwood.info> References: <15beb1ca-79e1-c257-864c-8155d549ec10@gmail.com> <20170630124334.GL3149@ando.pearwood.info> Message-ID: <4225C639-DD80-44B2-A375-55D00CB3D7E1@gmail.com> > On Jun 30, 2017, at 8:43 AM, Steven D'Aprano wrote: > >> On Fri, Jun 30, 2017 at 12:51:26PM +0100, Jamie Willis wrote: >> >> Alternatively >> "<>" is an alternative, being the monoidal append operator in Haskell, >> which retains a certain similarity. > > "<>" is familiar to many people as "not equal" in various programming > languages, including older versions of Python. I'm not entirely sure > what connection "<>" has to append, it seems pretty arbitrary to me, > although in fairness nearly all operators are arbitrary symbols if you > go back far enough. > Even in Haskell, <> relies on the context of other operators like <*>, <$>, <+>, <|>, etc. to suggest a sort of generic, minimal binary operator. That meaning wouldn't translate well to other languages. Clint From fakedme+py at gmail.com Fri Jun 30 10:39:48 2017 From: fakedme+py at gmail.com (Soni L.) Date: Fri, 30 Jun 2017 11:39:48 -0300 Subject: [Python-ideas] Python 4: Concatenation In-Reply-To: <20170630124334.GL3149@ando.pearwood.info> References: <15beb1ca-79e1-c257-864c-8155d549ec10@gmail.com> <20170630124334.GL3149@ando.pearwood.info> Message-ID: On 2017-06-30 09:43 AM, Steven D'Aprano wrote: > On Fri, Jun 30, 2017 at 12:51:26PM +0100, Jamie Willis wrote: > >> Just as an aside, if a concatenation operator *was* included, a suitable >> operator would be "++", > As mentioned earlier in this thread, that is not possible in Python as > syntactically `x ++ y` would be parsed as `x + (+y)` (the plus binary > operator followed by the plus unary operator). 
> >> this is the concatenation operator in languages >> like Haskell (for strings) and the majority of Scala cases. Alternatively >> "<>" is an alternative, being the monoidal append operator in Haskell, >> which retains a certain similarity. > "<>" is familiar to many people as "not equal" in various programming > languages, including older versions of Python. I'm not entirely sure > what connection "<>" has to append, it seems pretty arbitrary to me, > although in fairness nearly all operators are arbitrary symbols if you > go back far enough. > > || is the mathematical notation for concatenation, which just so happens to be available in Python, even if it might be confused with short-circuiting `or`. From rosuav at gmail.com Fri Jun 30 10:46:47 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 1 Jul 2017 00:46:47 +1000 Subject: [Python-ideas] Python 4: Concatenation In-Reply-To: References: <15beb1ca-79e1-c257-864c-8155d549ec10@gmail.com> <20170630124334.GL3149@ando.pearwood.info> Message-ID: On Sat, Jul 1, 2017 at 12:39 AM, Soni L. wrote: > || is the mathematical notation for concatenation, which just so happens to > be available in Python, even if it might be confused with short-circuiting > `or`. Also used in REXX. But the short-circuiting 'or' is not overridable. You'd have to use bitwise or instead. ChrisA From fakedme+py at gmail.com Fri Jun 30 11:09:52 2017 From: fakedme+py at gmail.com (Soni L.) Date: Fri, 30 Jun 2017 12:09:52 -0300 Subject: [Python-ideas] Bytecode JIT Message-ID: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> CPython should get a tracing JIT that turns slow bytecode into fast bytecode. A JIT doesn't have to produce machine code. bytecode-to-bytecode compilation is still compilation. bytecode-to-bytecode compilation works on iOS, and doesn't require deviating from C. 
(This "internal bytecode" should do things like know that 2 variables necessarily hold integers, doing just "x = y + z" in C in an IADD instruction as opposed to all those middle-of-function typechecks and overhead. You can typecheck once at the start of the function and run separate traces on that. Since this "internal bytecode" is extremely unsafe, it should be considered an implementation detail and never exposed to external code.) From phd at phdru.name Fri Jun 30 11:15:30 2017 From: phd at phdru.name (Oleg Broytman) Date: Fri, 30 Jun 2017 17:15:30 +0200 Subject: [Python-ideas] CPython should get... In-Reply-To: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> References: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> Message-ID: <20170630151530.GA23663@phdru.name> On Fri, Jun 30, 2017 at 12:09:52PM -0300, "Soni L." wrote: > CPython should get a You're welcome to create one. Go on, send your pull requests! Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From steve at pearwood.info Fri Jun 30 11:20:39 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 1 Jul 2017 01:20:39 +1000 Subject: [Python-ideas] Bytecode JIT In-Reply-To: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> References: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> Message-ID: <20170630152039.GM3149@ando.pearwood.info> On Fri, Jun 30, 2017 at 12:09:52PM -0300, Soni L. wrote: > CPython should get a tracing JIT that turns slow bytecode into fast > bytecode. Are you volunteering to do the work? -- Steve From mertz at gnosis.cx Fri Jun 30 11:50:45 2017 From: mertz at gnosis.cx (David Mertz) Date: Fri, 30 Jun 2017 08:50:45 -0700 Subject: [Python-ideas] Bytecode JIT In-Reply-To: <20170630152039.GM3149@ando.pearwood.info> References: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> <20170630152039.GM3149@ando.pearwood.info> Message-ID: PyPy does basically this. So does the tentative project Pyjion. 
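The bytecode-to-bytecode specialization Soni describes can be illustrated with a toy instruction set; this is only a sketch of the idea, not CPython's actual bytecode or any real JIT machinery:

```python
# Toy "bytecode": (opcode, dest, src1, src2) tuples. A generic ADD goes
# through dispatch every time; INT_ADD assumes both operands are ints and
# skips the checks. A specializer pass rewrites ADD -> INT_ADD once the
# operand types have been observed, mirroring the "typecheck once, then
# run a specialized trace" idea from the proposal.

def interpret(code, env):
    for op, dst, a, b in code:
        x, y = env[a], env[b]
        if op == "ADD":
            # generic path: runtime dispatch on every execution
            # (stands in for PyNumber_Add and its type checks)
            if hasattr(x, "__add__") or hasattr(y, "__radd__"):
                env[dst] = x + y
            else:
                raise TypeError("unsupported operands")
        elif op == "INT_ADD":
            # specialized path: a guard already proved both are ints,
            # so no per-instruction type checks are needed here
            env[dst] = x + y
    return env

def specialize(code, env):
    """Guard pass: if the observed values are ints, emit INT_ADD."""
    out = []
    for op, dst, a, b in code:
        if op == "ADD" and type(env[a]) is int and type(env[b]) is int:
            out.append(("INT_ADD", dst, a, b))
        else:
            out.append((op, dst, a, b))  # guard failed: keep generic bytecode
    return out

code = [("ADD", "z", "x", "y")]
fast = specialize(code, {"x": 1, "y": 2})
assert fast == [("INT_ADD", "z", "x", "y")]
assert interpret(fast, {"x": 1, "y": 2})["z"] == 3
```

A real implementation would also need deoptimization (falling back to the generic code when a guard stops holding), which is the hard part the replies below discuss.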
Also Numba, but on a per-function basis. It's not a bad idea, and one that currently exists with varying degrees of refinement in several projects. I may have forgotten a few others. I suppose Brython in a sense. This is very unlikely to make it into core CPython because code simplicity is one of its goals. But all those other projects would welcome your help. On Jun 30, 2017 8:22 AM, "Steven D'Aprano" wrote: On Fri, Jun 30, 2017 at 12:09:52PM -0300, Soni L. wrote: > CPython should get a tracing JIT that turns slow bytecode into fast > bytecode. Are you volunteering to do the work? -- Steve _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From k7hoven at gmail.com Fri Jun 30 13:37:12 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Fri, 30 Jun 2017 20:37:12 +0300 Subject: [Python-ideas] CPython should get... In-Reply-To: <20170630151530.GA23663@phdru.name> References: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> <20170630151530.GA23663@phdru.name> Message-ID: On Jun 30, 2017 5:16 PM, "Oleg Broytman" wrote: On Fri, Jun 30, 2017 at 12:09:52PM -0300, "Soni L." wrote: > CPython should get a You're welcome to create one. Go on, send your pull requests! But if you are planning to do that, it is still a good idea to ask for feedback here first. That will increase the chances of acceptance by a lot. Also, it doesn't necessarily need to be your own idea :) -- Koos -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From k7hoven at gmail.com Fri Jun 30 14:02:45 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Fri, 30 Jun 2017 21:02:45 +0300 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170630010951.3fc7d17b@grzmot> Message-ID: On Jun 30, 2017 2:23 PM, "Erik Bray" wrote: I actually really like this proposal, in addition to the original proposal of using '+' to chain generators--I don't think it necessarily needs to be extended to *all* iterables. But this proposal goes one better. I just have to wonder what kind of strange unexpected bugs would result. For example, now you could add a list to a string: >>> list(['a', 'b', 'c'] + 'def') ['a', 'b', 'c', 'd', 'e', 'f'] Personally, I really like this and find it natural. But it will break anything expecting this to be a TypeError. Note that you can already do: [*iterable1, *iterable2] Or like in your example: >>> [*['a', 'b', 'c'], *'def'] ['a', 'b', 'c', 'd', 'e', 'f'] At least I think you can do that in 3.6 ;) -- Koos (mobile) -------------- next part -------------- An HTML attachment was scrubbed... URL: From breamoreboy at yahoo.co.uk Fri Jun 30 14:51:45 2017 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Fri, 30 Jun 2017 19:51:45 +0100 Subject: [Python-ideas] Bytecode JIT In-Reply-To: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> References: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> Message-ID: On 30/06/2017 16:09, Soni L. wrote: > CPython should get a tracing JIT that turns slow bytecode into fast > bytecode. > > A JIT doesn't have to produce machine code. bytecode-to-bytecode > compilation is still compilation. bytecode-to-bytecode compilation works > on iOS, and doesn't require deviating from C. 
> > (This "internal bytecode" should do things like know that 2 variables > necessarily hold integers, doing just "x = y + z" in C in an IADD > instruction as opposed to all those middle-of-function typechecks and > overhead. You can typecheck once at the start of the function and run > separate traces on that. Since this "internal bytecode" is extremely > unsafe, it should be considered an implementation detail and never > exposed to external code.) > Patches are always welcome. When do you intend delivering yours? -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence --- This email has been checked for viruses by AVG. http://www.avg.com From tjreedy at udel.edu Fri Jun 30 16:59:49 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 30 Jun 2017 16:59:49 -0400 Subject: [Python-ideas] Python 4: Concatenation In-Reply-To: <20170630132424.GA16285@phdru.name> References: <15beb1ca-79e1-c257-864c-8155d549ec10@gmail.com> <20170630132424.GA16285@phdru.name> Message-ID: On 6/30/2017 9:24 AM, Oleg Broytman wrote: > On Fri, Jun 30, 2017 at 03:10:08PM +0200, "Sven R. Kunze" wrote: >> '+' is the perfect concat operator. I love Python for this feature. > > +1 from me and me. I think extending it to chain iterators is an intriguing idea. It would not be the first time syntax was implemented with more than one special method. When the boolean value of an object is needed, first .__bool__, then .__len__ are used. Iter() first tries .__iter__, then .__getitem__. When counts are expressed in their original unary notation, addition is concatenation. If one thinks of a sequence as a unary representation of its length*, then concatenation is addition. *This is a version of the mathematical idea of cardinal number. Whether intentionally or by accident, or perhaps, whether by analysis or intuition, I think Guido got this one right. 
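The cardinal-number view above corresponds to a concrete invariant that already holds for Python's sequence types:

```python
# Concatenating sequences adds their lengths, just as adding numbers in
# unary notation concatenates their tally marks.
for a, b in [([0] * 3, [0] * 2), ("abc", "de"), ((1, 2), (3,))]:
    assert len(a + b) == len(a) + len(b)

# In unary, the number n is a sequence of n marks; + is then both
# addition and concatenation at once.
three, two = [1] * 3, [1] * 2
assert sum(three + two) == sum(three) + sum(two) == 5
```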
-- Terry Jan Reedy From victor.stinner at gmail.com Fri Jun 30 18:17:59 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Sat, 1 Jul 2017 00:17:59 +0200 Subject: [Python-ideas] Bytecode JIT In-Reply-To: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> References: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> Message-ID: 2017-06-30 17:09 GMT+02:00 Soni L. : > CPython should get a tracing JIT that turns slow bytecode into fast > bytecode. > > A JIT doesn't have to produce machine code. bytecode-to-bytecode compilation > is still compilation. bytecode-to-bytecode compilation works on iOS, and > doesn't require deviating from C. Optimizations require to make assumptions on the code, and deoptimize if an assumption becomes wrong. I call these things "guards". If I understood correctly, PyPy is able to deoptimize a function in the middle of the function, while executing it. In my FAT Python project, I tried something simpler: add guards at the function entry point, and decide at the entry which version of the code should be run (FAT Python allows to have more than 2 versions of the code for the same function). I described my implementation in the PEP 510: https://www.python.org/dev/peps/pep-0510/ I agree that you *can* emit more efficient bytecode using assumptions. But I'm not sure that the best speedup will be worth it. For example, if your maximum speedup is 20% but the JIT compiler increases the startup time and uses more memory, I'm not sure that users will use it. The design will restrict indirectly the maximum speed. At the bytecode level, you cannot specialize bytecode for 1+2 (x+y with x=1 and y=2) for example. The BINARY_ADD instruction calls PyNumber_Add(), but a previous experience showed that the dispatch inside PyNumber_Add() to reach long_add() is expensive. I'm trying to find a solution to not make CPython 20% faster, but 2x faster. 
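The entry-point guards described above can be sketched in pure Python; the names below are illustrative only, since the real PEP 510 mechanism specializes code objects at the C level:

```python
def with_guard(guard, fast_version):
    """Pick a specialized version at function entry; fall back otherwise."""
    def decorator(original):
        def wrapper(*args):
            if guard(*args):                # does the assumption still hold?
                return fast_version(*args)  # specialized code
            return original(*args)          # deoptimized: original code
        return wrapper
    return decorator

def all_ints(*args):
    return all(type(a) is int for a in args)

@with_guard(all_ints, fast_version=lambda x, y: x + y)  # imagine a faster body
def add(x, y):
    return x + y

assert add(1, 2) == 3          # guard holds: specialized path
assert add("a", "b") == "ab"   # guard fails: original path
```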
See my talk at the recent Python Language Summit (at Pycon US): https://github.com/haypo/conf/raw/master/2017-PyconUS/summit.pdf https://lwn.net/Articles/723949/ My mid-term/long-term plan for FAT Python is to support multiple optimizers, and allow developers to choose between bytecode ("Python" code) and machine code ("C" code). For example, work on an optimizer reusing Cython rather than writing a new compiler from scratch. My current optimizer works at the AST level and emits more efficient bytecode by rewriting the AST. But another major design choice in FAT Python is to run the optimizer ahead-of-time (AoT), rather than just-in-time (JIT). Maybe it will not work. We will see :-) I suggest you to take a look at my notes to make CPython faster: http://faster-cpython.readthedocs.io/ FAT Python homepage: http://faster-cpython.readthedocs.io/fat_python.html -- You may also be interested by my Pycon US talk about CPython optimization in 3.5, 3.6 and 3.7: https://lwn.net/Articles/725114/ Victor From fakedme+py at gmail.com Fri Jun 30 18:30:59 2017 From: fakedme+py at gmail.com (Soni L.) Date: Fri, 30 Jun 2017 19:30:59 -0300 Subject: [Python-ideas] Bytecode JIT In-Reply-To: References: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> Message-ID: On 2017-06-30 07:17 PM, Victor Stinner wrote: > 2017-06-30 17:09 GMT+02:00 Soni L. : >> CPython should get a tracing JIT that turns slow bytecode into fast >> bytecode. >> >> A JIT doesn't have to produce machine code. bytecode-to-bytecode compilation >> is still compilation. bytecode-to-bytecode compilation works on iOS, and >> doesn't require deviating from C. > Optimizations require to make assumptions on the code, and deoptimize > if an assumption becomes wrong. I call these things "guards". If I > understood correctly, PyPy is able to deoptimize a function in the > middle of the function, while executing it. 
In my FAT Python project, > I tried something simpler: add guards at the function entry point, and > decide at the entry which version of the code should be run (FAT > Python allows to have more than 2 versions of the code for the same > function). > > I described my implementation in the PEP 510: > https://www.python.org/dev/peps/pep-0510/ > > I agree that you *can* emit more efficient bytecode using assumptions. > But I'm not sure that the best speedup will be worth it. For example, > if your maximum speedup is 20% but the JIT compiler increases the > startup time and uses more memory, I'm not sure that users will use > it. The design will restrict indirectly the maximum speed. > > At the bytecode level, you cannot specialize bytecode for 1+2 (x+y > with x=1 and y=2) for example. The BINARY_ADD instruction calls > PyNumber_Add(), but a previous experience showed that the dispatch > inside PyNumber_Add() to reach long_add() is expensive. If you can assert that the sum(s) never overflow an int, you can avoid hitting long_add() entirely, and avoid all the checks around it. IADD would be IADD as opposed to NADD because it would add two ints specifically, not two numbers. And it would do no overflow checks because the JIT already told it no overflow can happen. > > I'm trying to find a solution to not make CPython 20% faster, but 2x > faster. See my talk at the recent Python Language Summit (at Pycon > US): > https://github.com/haypo/conf/raw/master/2017-PyconUS/summit.pdf > https://lwn.net/Articles/723949/ > > My mid-term/long-term plan for FAT Python is to support multiple > optimizers, and allow developers to choose between bytecode ("Python" > code) and machine code ("C" code). For example, work on an optimizer > reusing Cython rather than writing a new compiler from scratch. My > current optimizer works at the AST level and emits more efficient > bytecode by rewriting the AST. 
> > But another major design choice in FAT Python is to run the optimizer > ahead-of-time (AoT), rather than just-in-time (JIT). Maybe it will not > work. We will see :-) > > I suggest you to take a look at my notes to make CPython faster: > http://faster-cpython.readthedocs.io/ > > FAT Python homepage: > http://faster-cpython.readthedocs.io/fat_python.html > > -- > > You may also be interested by my Pycon US talk about CPython > optimization in 3.5, 3.6 and 3.7: > https://lwn.net/Articles/725114/ > > Victor From cs at zip.com.au Fri Jun 30 19:39:12 2017 From: cs at zip.com.au (Cameron Simpson) Date: Sat, 1 Jul 2017 09:39:12 +1000 Subject: [Python-ideas] + operator on generators In-Reply-To: References: Message-ID: <20170630233912.GA24971@cskk.homeip.net> On 28Jun2017 09:54, Paul Moore wrote: >On 28 June 2017 at 05:30, Terry Reedy wrote: >> On 6/27/2017 10:47 PM, Nick Coghlan wrote: >> >>> While I haven't been following this thread closely, I'd like to note >>> that arguing for a "chain()" builtin has the virtue that would just be >>> arguing for the promotion of the existing itertools.chain function >>> into the builtin namespace. >>> >>> Such an approach has a lot to recommend it: >>> >>> 1. It has precedent, in that Python 3's map(), filter(), and zip(), >>> are essentially Python 2's itertools.imap(), ifilter(), and izip() >>> 2. There's no need for a naming or semantics debate, as we'd just be >>> promoting an established standard library API into the builtin >>> namespace >> >> >> A counter-argument is that there are other itertools that deserve promotion, >> by usage, even more. But we need to see comparisons from more that one >> limited corpus. > >Indeed. I don't recall *ever* using itertools.chain myself. I'd be >interested in seeing some usage stats to support this proposal. 
As an >example, I see 8 uses of itertools.chain in pip and its various >vendored packages, as opposed to around 30 uses of map (plus however >many list comprehensions are used in place of maps). On a very brief >scan, it looks like the various other itertools are used less than >chain, but with only 8 uses of chain, it's not really possible to read >anything more into the relative frequencies. I don't use it often, but when I do it is very handy. While I'm not arguing for making it a builtin on the basis of my own use (though I've no objections either), a quick grep shows: My maildb kit uses chain to assemble multiple related header values: *chain( msg.get_all(hdr, []) for hdr in ('to', 'cc', 'bcc', 'resent-to', 'resent-cc') ) Two examples where I use it to insert items in front of an iterable: chunks = chain( [data], chunks ) blocks = indirect_blocks(chain( ( topblock, nexttopblock ), blocks )) Neither of these is amenable to list rephrasings because the tail iterables ("chunks" and "blocks") are of unknown and potentially large size. And a few other cases whose uses are harder to succinctly describe, but generally "iterable flattening". So it is uncommon for me, but very useful when I want it. Just some (small) data points. Cheers, Cameron Simpson From cs at zip.com.au Fri Jun 30 20:17:40 2017 From: cs at zip.com.au (Cameron Simpson) Date: Sat, 1 Jul 2017 10:17:40 +1000 Subject: [Python-ideas] + operator on generators In-Reply-To: References: Message-ID: <20170701001740.GA74841@cskk.homeip.net> On 26Jun2017 23:26, Koos Zevenhoven wrote: >I sometimes wish there was something like > >c from: > yield from a > yield from b Nice. >...or to get a list: > >c as list from: > yield from a > yield from b > >...or a sum: > >c as sum from: > yield from a > yield from b > >These would be great for avoiding crazy oneliner generator expressions. Also nice, but for me a nonstarter because it breaks the existing Python idiom that "... 
as foo" means to bind the name "foo" as the expression on the left. Such as with import, except. So +1 for the form, -1 for the particular keyword. Cheers, Cameron Simpson Trust the computer industry to shorten Year 2000 to Y2K. It was this thinking that caused the problem in the first place. - Mark Ovens From mertz at gnosis.cx Fri Jun 30 21:33:33 2017 From: mertz at gnosis.cx (David Mertz) Date: Fri, 30 Jun 2017 18:33:33 -0700 Subject: [Python-ideas] + operator on generators In-Reply-To: <20170701001740.GA74841@cskk.homeip.net> References: <20170701001740.GA74841@cskk.homeip.net> Message-ID: We HAVE spellings for these things: c from: >> yield from a >> yield from b >> > c = chain(a, b) > c as list from: >> yield from a >> yield from b >> > c = list(chain(a, b)) > c as sum from: >> yield from a >> yield from b >> > c = sum(chain(a, b)) Those really are not "crazy generator expressions." -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: