From stefan_ml at behnel.de  Tue Jan  1 08:39:11 2019
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 1 Jan 2019 14:39:11 +0100
Subject: [Python-ideas] No need to add a regex pattern literal
In-Reply-To: 
References: <20f68a19-dd5d-b5cf-dbd0-3ec1a6181138@163.com>
 <20181231122316.28d49afc@fsol>
 <264c35f0-ccfd-7326-931c-ce5aa098709c@python.org>
Message-ID: 

Ma Lin schrieb am 31.12.18 um 14:02:
> On 18-12-31 19:47, Antoine Pitrou wrote:
>> The complaint is that the global cache is still too costly. See measurements in https://bugs.python.org/issue35559
>
> In this issue, using a global variable `_has_non_base16_digits` [1] will accelerate 30%. Is the re module's internal cache [2] so bad?
>
> If we rewrite the re module's cache in C and use a custom data structure, maybe we will get a small speedup.
>
> [1] `_has_non_base16_digits` in PR11287
> [1] https://github.com/python/cpython/pull/11287/files
>
> [2] re module's internal cache code:
> [2] https://github.com/python/cpython/blob/master/Lib/re.py#L268-L295
>
> _cache = {}  # ordered!
> _MAXCACHE = 512
> def _compile(pattern, flags):
>     # internal: compile pattern
>     if isinstance(flags, RegexFlag):
>         flags = flags.value
>     try:
>         return _cache[type(pattern), pattern, flags]
>     except KeyError:
>         pass
>     ...

I wouldn't be surprised if the slowest part here was the isinstance() check. Maybe the RegexFlag class could implement "__hash__()" as "return hash(self.value)"?

Stefan
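(A minimal sketch of what that suggestion could look like — this is not the actual re module code, and the flag values here are illustrative:)

    import enum

    class RegexFlag(enum.IntFlag):
        IGNORECASE = 2
        MULTILINE = 8

        def __hash__(self):
            # Hash the underlying int so cache-key hashing stays cheap.
            return hash(self.value)

Since IntFlag members already compare equal to their plain int values, a matching __hash__ would let a flags object hit the same dict slot as the int, without the isinstance()/.value conversion in _compile().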
From malincns at 163.com  Tue Jan  1 21:51:13 2019
From: malincns at 163.com (Ma Lin)
Date: Wed, 2 Jan 2019 10:51:13 +0800
Subject: [Python-ideas] No need to add a regex pattern literal
In-Reply-To: 
References: <20f68a19-dd5d-b5cf-dbd0-3ec1a6181138@163.com>
 <20181231122316.28d49afc@fsol>
 <264c35f0-ccfd-7326-931c-ce5aa098709c@python.org>
Message-ID: <17b8b959-812e-d02b-fee2-19abf3123ad9@163.com>

On 19-1-1 21:39, Stefan Behnel wrote:
> I wouldn't be surprised if the slowest part here was the isinstance() check. Maybe the RegexFlag class could implement "__hash__()" as "return hash(self.value)"?

Apply this patch:

 def _compile(pattern, flags):
     # internal: compile pattern
-    if isinstance(flags, RegexFlag):
-        flags = flags.value
+    try:
+        flags = int(flags)
+    except:
+        pass
     try:
         return _cache[type(pattern), pattern, flags]
     except KeyError:

Then run this benchmark on my Raspberry Pi 3B:

    import perf
    runner = perf.Runner()
    runner.timeit(name="compile_re",
                  stmt="re.compile(b'[^0-9A-F]')",
                  setup="import re")

Mean +- std dev: [a] 7.71 us +- 0.09 us -> [b] 6.74 us +- 0.10 us: 1.14x faster (-13%)

Looks great.

From steelman at post.pl  Fri Jan  4 09:57:53 2019
From: steelman at post.pl (Łukasz Stelmach)
Date: Fri, 4 Jan 2019 15:57:53 +0100 (CET)
Subject: [Python-ideas] Fixed point format for numbers with locale based separators
Message-ID: <1541732215.575168.1546613873762@poczta.home.pl>

Hi,

I would like to present two pull requests[1][2] implementing fixed point presentation of numbers and ask for comments. The first is mine. I learnt about the second after publishing mine.

The only format using the decimal separator from locale data for float/complex/decimal numbers at the moment is "n", which behaves like "g". The drawback of these formats, which I would like to overcome, is the inability to print numbers ranging over more than one order of magnitude with the same number of decimal digits without "manually" (with some additional custom code) adjusting the precision. The other option is to "manually" replace the "." printed by "f" with a locale decimal separator. Neither of these options is appealing to me.

Formatting 1.23456789 * n (LC_ALL=pl_PL.UTF-8)

| n | ".2f"    | ".3n"    |
|---+----------+----------|
| 1 | 1.23     | 1,23     |
| 2 | 12.35    | 12,3     |
| 3 | 123.46   | 123      |
| 4 | 1234.57  | 1,23e+03 |
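(For reference, a small sketch that reproduces the table above; it assumes the pl_PL.UTF-8 locale is installed and that each row multiplies the base value by a further power of ten:)

    import locale

    locale.setlocale(locale.LC_ALL, "pl_PL.UTF-8")
    for n in range(1, 5):
        x = 1.23456789 * 10 ** (n - 1)
        # ".2f" keeps two decimal digits but ignores the locale;
        # ".3n" localizes, but keeps 3 significant digits, not 3 decimals.
        print(format(x, ".2f"), format(x, ".3n"))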
In the application I want to create I am going to present users with numbers ranging up to 3 orders of magnitude, and I (my users) want them to be presented consistently with regards to the number of decimal digits AND I want to conform to the rules of the languages of my users. And I would like to avoid the exponent notation by all means.

I can't say much about James Emerton's implementation or his intentions, but please take a look at our patches and give your comments so either of us or together we can implement this feature.

PS. In theory both implementations could be merged, because James chose "l" to use the LC_MONETARY category and I chose "m" to use LC_NUMERIC.

[1] https://github.com/python/cpython/pull/11405
[2] https://github.com/python/cpython/pull/8612
-- 
Miłego dnia,
Łukasz Stelmach

From steve at pearwood.info  Fri Jan  4 11:56:01 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 5 Jan 2019 03:56:01 +1100
Subject: [Python-ideas] Fixed point format for numbers with locale based separators
In-Reply-To: <1541732215.575168.1546613873762@poczta.home.pl>
References: <1541732215.575168.1546613873762@poczta.home.pl>
Message-ID: <20190104165600.GR13616@ando.pearwood.info>

On Fri, Jan 04, 2019 at 03:57:53PM +0100, Łukasz Stelmach wrote:
> Hi,
>
> I would like to present two pull requests[1][2] implementing fixed point presentation of numbers and ask for comments. The first is mine. I learnt about the second after publishing mine.

Before I look at the implementation, can you explain the functional requirements please? In other words, what is the new feature you hope to have accepted? Explain the intention and the API (the interface). The implementation is the least important part :-)

[...]
> Formatting 1.23456789 * n (LC_ALL=pl_PL.UTF-8)
> | n | ".2f"    | ".3n"    |
> |---+----------+----------|
> | 1 | 1.23     | 1,23     |
> | 2 | 12.35    | 12,3     |
> | 3 | 123.46   | 123      |
> | 4 | 1234.57  | 1,23e+03 |

I'm afraid I cannot work out what that table means. You say "Formatting 1.23... * n" (multiplying by n) but the results shown aren't multiplied by n=2, n=3, n=4 as the table suggests.

Can you show what Python code you expect will produce the expected output?

Thank you.

-- 
Steve

From abedillon at gmail.com  Fri Jan  4 14:01:51 2019
From: abedillon at gmail.com (Abe Dillon)
Date: Fri, 4 Jan 2019 13:01:51 -0600
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
Message-ID: 

I keep coming back to this great video about coding style, and one point in particular rings true to me: ALL_CAPS_IS_OBNOXIOUS

It destroys the visual flow of code and for what? To signify a global, constant, or Enum? Is that really so important? I don't think so. I think the all caps style has out-lived its usefulness and needs to go the way of the dodo.

The last time I saw all caps used appropriately was in a YouTube comment where some guy was ranting about the communist Jewish banker conspiracy to control the world. In that case, all caps clearly communicated to me that the person was a frothing lunatic (though I find the idea of communist bankers intriguing).

Currently PEP-8 prescribes all caps for constants and uses the all cap variable "FILES" as an example in a different section. It also appears to be the de-facto standard for enums (based on the documentation)

I don't think it's necessary to make any breaking changes. Just pep-8 and (of less importance) spurious documentation examples.

From rymg19 at gmail.com  Fri Jan  4 14:13:30 2019
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Fri, 4 Jan 2019 13:13:30 -0600
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: 
References: 
Message-ID: 

IMO it's good to have some sort of visual differentiation between constants and normal values. A lot of camelCase-based style guides use a k prefix (e.g. kMyValue), but Python doesn't use it (other than PascalCase for classes). If there were to be an alternative to ALL_CAPS for constants, I guess maybe it'd also be PascalCase?

That being said, Dart 2 has dropped ALL_CAPS constants from its style guide, and although everyone's survived just fine, I do somewhat miss being able to immediately be able to see where something is coming from solely from the case.

Side note: it seems like ALL_CAPS kind of came from macros being used for constants in C and persisted.

-- 
Ryan (????)
Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else
https://refi64.com/

On Fri, Jan 4, 2019, 1:02 PM Abe Dillon wrote:
> I keep coming back to this great video about coding style, and one point in particular rings true to me: ALL_CAPS_IS_OBNOXIOUS
>
> It destroys the visual flow of code and for what? To signify a global, constant, or Enum? Is that really so important? I don't think so. I think the all caps style has out-lived its usefulness and needs to go the way of the dodo.
>
> The last time I saw all caps used appropriately was in a YouTube comment where some guy was ranting about the communist Jewish banker conspiracy to control the world. In that case, all caps clearly communicated to me that the person was a frothing lunatic (though I find the idea of communist bankers intriguing).
>
> Currently PEP-8 prescribes all caps for constants and uses the all cap variable "FILES" as an example in a different section. It also appears to be the de-facto standard for enums (based on the documentation)
>
> I don't think it's necessary to make any breaking changes. Just pep-8 and (of less importance) spurious documentation examples.

From boxed at killingar.net  Fri Jan  4 15:55:09 2019
From: boxed at killingar.net (Anders Hovmöller)
Date: Fri, 4 Jan 2019 21:55:09 +0100
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: 
References: 
Message-ID: 

So you're saying we should prefer a future where it's an inconsistent mess? I agree with you that the C standard is ugly but it's more important to have a standard than what that standard is. And we do have a strong standard today.
-1 from me

/ Anders

> On 4 Jan 2019, at 20:01, Abe Dillon wrote:
>
> I keep coming back to this great video about coding style, and one point in particular rings true to me: ALL_CAPS_IS_OBNOXIOUS
>
> It destroys the visual flow of code and for what? To signify a global, constant, or Enum? Is that really so important? I don't think so. I think the all caps style has out-lived its usefulness and needs to go the way of the dodo.
>
> The last time I saw all caps used appropriately was in a YouTube comment where some guy was ranting about the communist Jewish banker conspiracy to control the world. In that case, all caps clearly communicated to me that the person was a frothing lunatic (though I find the idea of communist bankers intriguing).
>
> Currently PEP-8 prescribes all caps for constants and uses the all cap variable "FILES" as an example in a different section. It also appears to be the de-facto standard for enums (based on the documentation)
>
> I don't think it's necessary to make any breaking changes. Just pep-8 and (of less importance) spurious documentation examples.

From bernardo at bernardosulzbach.com  Fri Jan  4 16:06:15 2019
From: bernardo at bernardosulzbach.com (Bernardo Sulzbach)
Date: Fri, 4 Jan 2019 19:06:15 -0200
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: 
References: 
Message-ID: 

I disagree. Changing this in the PEP will make an absurd amount of code which is PEP-8 compliant no longer so. Also, the refactoring may not always be trivial as the lowercase names may already be in use.

I'd suggest violating PEP-8 instead of trying to change it.

From abedillon at gmail.com  Fri Jan  4 17:58:51 2019
From: abedillon at gmail.com (Abe Dillon)
Date: Fri, 4 Jan 2019 16:58:51 -0600
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: 
References: 
Message-ID: 

Do you not have/use syntax highlighting? If not, why not? There's a right and wrong tool for everything. In the case of visually differentiating various kinds of code entities, the IDE is the right tool, all caps is not.

On Fri, Jan 4, 2019 at 1:13 PM Ryan Gonzalez wrote:
> IMO it's good to have some sort of visual differentiation between constants and normal values. A lot of camelCase-based style guides use a k prefix (e.g. kMyValue), but Python doesn't use it (other than PascalCase for classes). If there were to be an alternative to ALL_CAPS for constants, I guess maybe it'd also be PascalCase?
>
> That being said, Dart 2 has dropped ALL_CAPS constants from its style guide, and although everyone's survived just fine, I do somewhat miss being able to immediately be able to see where something is coming from solely from the case.
>
> Side note: it seems like ALL_CAPS kind of came from macros being used for constants in C and persisted.
>
> -- 
> Ryan (????)
> Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else
> https://refi64.com/
>
> On Fri, Jan 4, 2019, 1:02 PM Abe Dillon wrote:
>> I keep coming back to this great video about coding style, and one point in particular rings true to me: ALL_CAPS_IS_OBNOXIOUS
>>
>> It destroys the visual flow of code and for what?
>> To signify a global, constant, or Enum? Is that really so important? I don't think so. I think the all caps style has out-lived its usefulness and needs to go the way of the dodo.
>>
>> The last time I saw all caps used appropriately was in a YouTube comment where some guy was ranting about the communist Jewish banker conspiracy to control the world. In that case, all caps clearly communicated to me that the person was a frothing lunatic (though I find the idea of communist bankers intriguing).
>>
>> Currently PEP-8 prescribes all caps for constants and uses the all cap variable "FILES" as an example in a different section. It also appears to be the de-facto standard for enums (based on the documentation)
>>
>> I don't think it's necessary to make any breaking changes. Just pep-8 and (of less importance) spurious documentation examples.

From rosuav at gmail.com  Fri Jan  4 18:02:38 2019
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 5 Jan 2019 10:02:38 +1100
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: 
References: 
Message-ID: 

On Sat, Jan 5, 2019 at 9:59 AM Abe Dillon wrote:
>
> Do you not have/use syntax highlighting? If not, why not? There's a right and wrong tool for everything. In the case of visually differentiating various kinds of code entities, the IDE is the right tool, all caps is not.
>

All-caps is a signal to the human or the IDE that this is never going to be mutated or rebound. How else do you convey that information? How does the IDE know this doesn't ever get changed?

ChrisA

From boxed at killingar.net  Fri Jan  4 18:05:03 2019
From: boxed at killingar.net (Anders Hovmöller)
Date: Sat, 5 Jan 2019 00:05:03 +0100
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: 
References: 
Message-ID: <24E737BA-38C6-48C5-AC26-47855103F20C@killingar.net>

> Do you not have/use syntax highlighting? If not, why not? There's a right and wrong tool for everything. In the case of visually differentiating various kinds of code entities, the IDE is the right tool, all caps is not.

This is an argument against:

- the line length limit (because the IDE should just soft break lines in a super nice way)
- explicit "self." (swift does this with syntax highlighting for example)
- CamelCase for classes/types (actually python does a bad job here anyway with int, str, datetime, etc)

I'm not saying I disagree but we should be aware that this is the argument.

(and as ChrisA rightly points out it's not fully applicable to constants anyway)

/ Anders

From abedillon at gmail.com  Fri Jan  4 18:10:10 2019
From: abedillon at gmail.com (Abe Dillon)
Date: Fri, 4 Jan 2019 17:10:10 -0600
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: 
References: 
Message-ID: 

> So you're saying we should prefer a future where it's an inconsistent mess?

No. And please don't straw man. It's a very annoying argumentative tactic. I prefer a future where all caps aren't used. I understand that the change I propose won't magically transport us there, but I don't think it justifies encouraging all caps. As it is, the mix of all caps, camel-case, and snake-case IS an inconsistent visual mess.
Discouraging all caps will only result in a diminishing occurrence of all caps.

> it's more important to have a standard than what that standard is. And we do have a strong standard today.

I understand that there's a barrier to change, but there's also a circular logic to resisting change because adhering to a standard is good.

How bad would it really be to remove the line about constants being all caps from PEP-8?

On Fri, Jan 4, 2019 at 2:55 PM Anders Hovmöller wrote:
> So you're saying we should prefer a future where it's an inconsistent mess? I agree with you that the C standard is ugly but it's more important to have a standard than what that standard is. And we do have a strong standard today.
>
> -1 from me
>
> / Anders
>
> On 4 Jan 2019, at 20:01, Abe Dillon wrote:
>
> I keep coming back to this great video about coding style, and one point in particular rings true to me: ALL_CAPS_IS_OBNOXIOUS
>
> It destroys the visual flow of code and for what? To signify a global, constant, or Enum? Is that really so important? I don't think so. I think the all caps style has out-lived its usefulness and needs to go the way of the dodo.
>
> The last time I saw all caps used appropriately was in a YouTube comment where some guy was ranting about the communist Jewish banker conspiracy to control the world. In that case, all caps clearly communicated to me that the person was a frothing lunatic (though I find the idea of communist bankers intriguing).
>
> Currently PEP-8 prescribes all caps for constants and uses the all cap variable "FILES" as an example in a different section. It also appears to be the de-facto standard for enums (based on the documentation)
>
> I don't think it's necessary to make any breaking changes. Just pep-8 and (of less importance) spurious documentation examples.

From boxed at killingar.net  Fri Jan  4 18:14:54 2019
From: boxed at killingar.net (Anders Hovmöller)
Date: Sat, 5 Jan 2019 00:14:54 +0100
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: 
References: 
Message-ID: <78B7F104-11E5-425A-8822-30493CDBCEAA@killingar.net>

>> So you're saying we should prefer a future where it's an inconsistent mess?
> No. And please don't straw man. It's a very annoying argumentative tactic. I prefer a future where all caps aren't used. I understand that the change I propose won't magically transport us there, but I don't think it justifies encouraging all caps. As it is, the mix of all caps, camel-case, and snake-case IS an inconsistent visual mess. Discouraging all caps will only result in a diminishing occurrence of all caps.

You mean it's already an inconsistent mess? Hmm, maybe. I'd like to see some stats or something. You might be right!

>> it's more important to have a standard than what that standard is. And we do have a strong standard today.
> I understand that there's a barrier to change, but there's also a circular logic to resisting change because adhering to a standard is good.

Agreed. It's a tradeoff with the amount of time we spend in the ugly place between standards. Maybe it's not so much time, or maybe it's a minor annoyance and so it doesn't matter how long we are in it.
/ Anders

From rosuav at gmail.com  Fri Jan  4 18:15:06 2019
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 5 Jan 2019 10:15:06 +1100
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: 
References: 
Message-ID: 

On Sat, Jan 5, 2019 at 10:10 AM Abe Dillon wrote:
>>
>> So you're saying we should prefer a future where it's an inconsistent mess?
>
> No. And please don't straw man. It's a very annoying argumentative tactic. I prefer a future where all caps aren't used. I understand that the change I propose won't magically transport us there, but I don't think it justifies encouraging all caps. As it is, the mix of all caps, camel-case, and snake-case IS an inconsistent visual mess. Discouraging all caps will only result in a diminishing occurrence of all caps.
>
>> it's more important to have a standard than what that standard is. And we do have a strong standard today.
>
> I understand that there's a barrier to change, but there's also a circular logic to resisting change because adhering to a standard is good.
>
> How bad would it really be to remove the line about constants being all caps from PEP-8?

How do you propose, instead, for the constantness of something to be indicated?

ChrisA

From abedillon at gmail.com  Fri Jan  4 18:29:23 2019
From: abedillon at gmail.com (Abe Dillon)
Date: Fri, 4 Jan 2019 17:29:23 -0600
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: 
References: 
Message-ID: 

> How do you propose, instead, for the constantness of something to be indicated?

That's a good question. I honestly don't use constants all that much, I like to move such things out to config files. For a constant like math.pi, it's never been caps, yet people know it's not a great idea to change it. There are a lot of tools to indicate constantness:

1) provide a property to access an otherwise _plz_dont_touch_variable (see the sketch below)
2) Use an Enum
3) Use documentation to say: treat this as constant
4) Rely upon consenting adults to not change variables outside of scope. It's weird to manipulate math.pi because it's in a separate module.
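(A rough sketch of what option 1 can look like; the class and attribute names here are invented for illustration:)

    class Settings:
        def __init__(self):
            self._plz_dont_touch_variable = 5

        @property
        def max_retries(self):
            # Readable from outside, but there is no setter, so
            # `Settings().max_retries = 10` raises AttributeError.
            return self._plz_dont_touch_variable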
I stopped using all caps a long time ago and it just hasn't created a problem because manipulating global variables without knowing what they are is such a bad idea to begin with.

On Fri, Jan 4, 2019 at 5:16 PM Chris Angelico wrote:
> On Sat, Jan 5, 2019 at 10:10 AM Abe Dillon wrote:
> >> So you're saying we should prefer a future where it's an inconsistent mess?
> >
> > No. And please don't straw man. It's a very annoying argumentative tactic. I prefer a future where all caps aren't used. I understand that the change I propose won't magically transport us there, but I don't think it justifies encouraging all caps. As it is, the mix of all caps, camel-case, and snake-case IS an inconsistent visual mess. Discouraging all caps will only result in a diminishing occurrence of all caps.
> >
> >> it's more important to have a standard than what that standard is. And we do have a strong standard today.
> >
> > I understand that there's a barrier to change, but there's also a circular logic to resisting change because adhering to a standard is good.
> >
> > How bad would it really be to remove the line about constants being all caps from PEP-8?
>
> How do you propose, instead, for the constantness of something to be indicated?
>
> ChrisA

From rosuav at gmail.com  Fri Jan  4 18:41:04 2019
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 5 Jan 2019 10:41:04 +1100
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: 
References: 
Message-ID: 

On Sat, Jan 5, 2019 at 10:29 AM Abe Dillon wrote:
>>
>> How do you propose, instead, for the constantness of something to be indicated?
>
> That's a good question. I honestly don't use constants all that much...

Just to be clear here: you're trying to say that the ALL_CAPS_NAME convention is unnecessary, but you don't use constants. That kinda weakens your argument a bit :)

> 1) provide a property to access an otherwise _plz_dont_touch_variable
> 2) Use an Enum
> 3) Use documentation to say: treat this as constant
> 4) Rely upon consenting adults to not change variables outside of scope. It's weird to manipulate math.pi because it's in a separate module.
>
> I stopped using all caps a long time ago and it just hasn't created a problem because manipulating global variables without knowing what they are is such a bad idea to begin with.

The whole point of the all-caps globals is to tell you a lot about what they are. For instance, I will often use a module-level constant for a file or path name; within the module, it is deliberately treated as a constant, but if you import the module somewhere else, you could reassign it before calling any functions in the module, and they'll all use the changed path.

We use well-chosen variable names to avoid needing special properties or documentation to explain how something is to be used. It's far better to distinguish between "thing" and "things" than to have to say "thing" and "array_of_thing". The whole "consenting adults" policy has to be built on clear indications, and the name of something is a vital part of that.

It's up to you whether you actually use the all-caps convention in your own code or not, but IMO it is an extremely useful convention to have in the toolbox, and should be kept.

ChrisA

From abedillon at gmail.com  Fri Jan  4 18:43:58 2019
From: abedillon at gmail.com (Abe Dillon)
Date: Fri, 4 Jan 2019 17:43:58 -0600
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: 
References: 
Message-ID: 

> Changing this in the PEP will make an absurd amount of code which is PEP-8 compliant no longer so.

It depends on how we change it. There's plenty of room to compromise between explicitly mandating all caps and explicitly forbidding all caps. We could simply remove the section that says to use all caps for constants. We could replace that section with a statement that while it's discouraged, it doesn't violate pep8 when used for constants. We could change some of the documentation (especially around enums) to show non-all-caps style is acceptable. etc.

On Fri, Jan 4, 2019 at 3:06 PM Bernardo Sulzbach wrote:
> I disagree. Changing this in the PEP will make an absurd amount of code which is PEP-8 compliant no longer so. Also, the refactoring may not always be trivial as the lowercase names may already be in use.
>
> I'd suggest violating PEP-8 instead of trying to change it.
From fuzzyman at gmail.com  Fri Jan  4 19:03:50 2019
From: fuzzyman at gmail.com (Michael Foord)
Date: Sat, 5 Jan 2019 00:03:50 +0000
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: 
References: 
Message-ID: 

On Fri, 4 Jan 2019 at 19:02, Abe Dillon wrote:
> I keep coming back to this great video about coding style, and one point in particular rings true to me: ALL_CAPS_IS_OBNOXIOUS

I really like the convention. It's nice and clear and absolutely everyone knows what it means.

Michael

> It destroys the visual flow of code and for what? To signify a global, constant, or Enum? Is that really so important? I don't think so. I think the all caps style has out-lived its usefulness and needs to go the way of the dodo.
>
> The last time I saw all caps used appropriately was in a YouTube comment where some guy was ranting about the communist Jewish banker conspiracy to control the world. In that case, all caps clearly communicated to me that the person was a frothing lunatic (though I find the idea of communist bankers intriguing).
>
> Currently PEP-8 prescribes all caps for constants and uses the all cap variable "FILES" as an example in a different section. It also appears to be the de-facto standard for enums (based on the documentation)
>
> I don't think it's necessary to make any breaking changes. Just pep-8 and (of less importance) spurious documentation examples.

-- 
http://www.michaelfoord.co.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing
http://www.sqlite.org/different.html

From abedillon at gmail.com  Fri Jan  4 19:09:44 2019
From: abedillon at gmail.com (Abe Dillon)
Date: Fri, 4 Jan 2019 18:09:44 -0600
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: 
References: 
Message-ID: 

> Just to be clear here: you're trying to say that the ALL_CAPS_NAME convention is unnecessary, but you don't use constants. That kinda weakens your argument a bit :)

Just to be clear, I'm currently working in a Java shop where the code-styles are to use all caps for constants, enums, and class-level variables. Most classes in our code base declare a class-level "LOG" or "LOGGER" object. I've found that completely unnecessary. A simple "log" works just fine. I've never been tempted to write over it. It would be impossible in Java anyway since the guidelines are to declare everything "final" (yes... shoot me) anyway. I helped one of the higher-ups in the company write a Python script and he found the lack of straight-jacket harnesses extremely distressing. How could you write code without "final"? or "private"? or type checking?! You have to use consistent white space?!?!

It's all Stockholm syndrome.

> The whole point of the all-caps globals is to tell you a lot about what they are.

A lot? The only thing it canonically tells you is to not modify it, which is my default assumption with any variable I didn't declare myself and also impossible to do in the case of enums.
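(A quick sketch of that last point, for reference — enum members really can't be rebound:)

    from enum import Enum

    class Color(Enum):
        red = 1

    Color.red = 2  # raises AttributeError: cannot reassign members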
>> I will often use a module-level constant for a file or path name; within the module, it is deliberately treated as a constant, but if you import the module somewhere else, you could reassign it before calling any functions in the module, and they'll all use the changed path.

Do you communicate that API through the variable name alone? How so? I would assume any module-level variables are not to be modified unless there was documentation stating otherwise. You really don't need an obnoxious shouty convention to tell people not to change things.

> It's up to you whether you actually use the all-caps convention in your own code or not, but IMO it is an extremely useful convention to have in the toolbox, and should be kept.

My boss would say:

> It's up to you whether you actually use final in your own code or not, but IMO it is an extremely useful tool to have in the toolbox, and should be kept. (and also you have to use it because it's in the style guide)

On Fri, Jan 4, 2019 at 5:41 PM Chris Angelico wrote:
> On Sat, Jan 5, 2019 at 10:29 AM Abe Dillon wrote:
> >> How do you propose, instead, for the constantness of something to be indicated?
> >
> > That's a good question. I honestly don't use constants all that much...
>
> Just to be clear here: you're trying to say that the ALL_CAPS_NAME convention is unnecessary, but you don't use constants. That kinda weakens your argument a bit :)
>
> > 1) provide a property to access an otherwise _plz_dont_touch_variable
> > 2) Use an Enum
> > 3) Use documentation to say: treat this as constant
> > 4) Rely upon consenting adults to not change variables outside of scope. It's weird to manipulate math.pi because it's in a separate module.
> >
> > I stopped using all caps a long time ago and it just hasn't created a problem because manipulating global variables without knowing what they are is such a bad idea to begin with.
>
> The whole point of the all-caps globals is to tell you a lot about what they are. For instance, I will often use a module-level constant for a file or path name; within the module, it is deliberately treated as a constant, but if you import the module somewhere else, you could reassign it before calling any functions in the module, and they'll all use the changed path.
>
> We use well-chosen variable names to avoid needing special properties or documentation to explain how something is to be used. It's far better to distinguish between "thing" and "things" than to have to say "thing" and "array_of_thing". The whole "consenting adults" policy has to be built on clear indications, and the name of something is a vital part of that.
>
> It's up to you whether you actually use the all-caps convention in your own code or not, but IMO it is an extremely useful convention to have in the toolbox, and should be kept.
>
> ChrisA
From cs at cskk.id.au  Fri Jan  4 19:12:00 2019
From: cs at cskk.id.au (Cameron Simpson)
Date: Sat, 5 Jan 2019 11:12:00 +1100
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: 
References: 
Message-ID: <20190105001200.GA32199@cskk.homeip.net>

On 04Jan2019 17:29, Abe Dillon wrote:
>> How do you propose, instead, for the constantness of something to be indicated?
>
> That's a good question. I honestly don't use constants all that much, I like to move such things out to config files.

I like there to be some default behaviour in the absence of config files.

Cheers,
Cameron Simpson

From cs at cskk.id.au  Fri Jan  4 19:13:26 2019
From: cs at cskk.id.au (Cameron Simpson)
Date: Sat, 5 Jan 2019 11:13:26 +1100
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: 
References: 
Message-ID: <20190105001326.GA40122@cskk.homeip.net>

On 05Jan2019 00:03, Michael Foord wrote:
> On Fri, 4 Jan 2019 at 19:02, Abe Dillon wrote:
>> I keep coming back to this great video about coding style, and one point in particular rings true to me: ALL_CAPS_IS_OBNOXIOUS
>
> I really like the convention. It's nice and clear and absolutely everyone knows what it means.

Me too.

Cheers,
Cameron Simpson

From abedillon at gmail.com  Fri Jan  4 19:15:11 2019
From: abedillon at gmail.com (Abe Dillon)
Date: Fri, 4 Jan 2019 18:15:11 -0600
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: 
References: 
Message-ID: 

Sure everyone knows what it means, but its meaning is essentially useless because the default assumption when you encounter a variable you don't know is that you shouldn't overwrite it. If you found a module-level variable in Pandas named cucumber, would you immediately assume you can write whatever value you want to pandas.cucumber because it isn't in all caps?

On Fri, Jan 4, 2019 at 6:04 PM Michael Foord wrote:
> On Fri, 4 Jan 2019 at 19:02, Abe Dillon wrote:
>> I keep coming back to this great video about coding style, and one point in particular rings true to me: ALL_CAPS_IS_OBNOXIOUS
>
> I really like the convention. It's nice and clear and absolutely everyone knows what it means.
>
> Michael
>
>> It destroys the visual flow of code and for what? To signify a global, constant, or Enum? Is that really so important? I don't think so. I think the all caps style has out-lived its usefulness and needs to go the way of the dodo.
>>
>> The last time I saw all caps used appropriately was in a YouTube comment where some guy was ranting about the communist Jewish banker conspiracy to control the world. In that case, all caps clearly communicated to me that the person was a frothing lunatic (though I find the idea of communist bankers intriguing).
>>
>> Currently PEP-8 prescribes all caps for constants and uses the all cap variable "FILES" as an example in a different section. It also appears to be the de-facto standard for enums (based on the documentation)
>>
>> I don't think it's necessary to make any breaking changes. Just pep-8 and (of less importance) spurious documentation examples.
>
> -- 
> http://www.michaelfoord.co.uk/
>
> May you do good and not evil
> May you find forgiveness for yourself and forgive others
> May you share freely, never taking more than you give.
> -- the sqlite blessing
> http://www.sqlite.org/different.html

From rosuav at gmail.com  Fri Jan  4 19:19:36 2019
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 5 Jan 2019 11:19:36 +1100
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: 
References: 
Message-ID: 

On Sat, Jan 5, 2019 at 11:09 AM Abe Dillon wrote:
>>
>> Just to be clear here: you're trying to say that the ALL_CAPS_NAME convention is unnecessary, but you don't use constants. That kinda weakens your argument a bit :)
>
> Just to be clear, I'm currently working in a Java shop where the code-styles are to use all caps for constants, enums, and class-level variables. Most classes in our code base declare a class-level "LOG" or "LOGGER" object. I've found that completely unnecessary. A simple "log" works just fine. I've never been tempted to write over it. It would be impossible in Java anyway since the guidelines are to declare everything "final" (yes... shoot me) anyway. I helped one of the higher-ups in the company write a Python script and he found the lack of straight-jacket harnesses extremely distressing. How could you write code without "final"? or "private"? or type checking?! You have to use consistent white space?!?!
>
> It's all Stockholm syndrome.

The fact that the all-caps convention is used differently (even wrongly) in your current Java employment doesn't really impact this.

>> The whole point of the all-caps globals is to tell you a lot about what they are.
>
> A lot? The only thing it canonically tells you is to not modify it which is my default assumption with any variable I didn't declare myself and also impossible to do in the case of enums.
>
>> I will often use a module-level constant for a file or path name; within the module, it is deliberately treated as a constant, but if you import the module somewhere else, you could reassign it before calling any functions in the module, and they'll all use the changed path.
>
> Do you communicate that API through the variable name alone? How so? I would assume any module-level variables are not to be modified unless there was documentation stating otherwise. You really don't need an obnoxious shouty convention to tell people not to change things.

Yeah. By naming it in all caps. That's exactly how that's communicated. Example:

https://github.com/Rosuav/shed/blob/master/emotify.py#L6

By default, it's calculated from __file__, but if an external caller wants to change this, it's most welcome to. Since it's in all-caps, you don't have to worry about it being changed or mutated during the normal course of operation.

>> It's up to you whether you actually use the all-caps convention in your own code or not, but IMO it is an extremely useful convention to have in the toolbox, and should be kept.
>
> My boss would say:
>> It's up to you whether you actually use final in your own code or not, but IMO it is an extremely useful tool to have in the toolbox, and should be kept. (and also you have to use it because it's in the style guide)

Well, then, no, that's not "it's up to you". Something mandated by a style guide is not a tool in your toolbox, it's a requirement of the project. But if you leave that part out, then yes, 'final' becomes a tool that you may use if you wish, or ignore if you wish. (Personally, I would ignore that one, with the exception of "public static final" as an incantation for "class-level constant".)
ChrisA

From abedillon at gmail.com  Fri Jan  4 19:58:40 2019
From: abedillon at gmail.com (Abe Dillon)
Date: Fri, 4 Jan 2019 18:58:40 -0600
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: 
References: 
Message-ID: 

>>> I will often use a module-level constant for a file or path name; within the module, it is deliberately treated as a constant, but if you import the module somewhere else, you could reassign it before calling any functions in the module, and they'll all use the changed path.
>>
>> Do you communicate that API through the variable name alone? How so? I would assume any module-level variables are not to be modified unless there was documentation stating otherwise. You really don't need an obnoxious shouty convention to tell people not to change things.
>
> Yeah. By naming it in all caps. That's exactly how that's communicated. Example:
> https://github.com/Rosuav/shed/blob/master/emotify.py#L6
> By default, it's calculated from __file__, but if an external caller wants to change this, it's most welcome to. Since it's in all-caps, you don't have to worry about it being changed or mutated during the normal course of operation.

You made EMOTE_PATH all caps signifying it's a constant, then added a comment saying that it "Can be overwritten prior to calling get_emote_list".

So not only was your naming convention, on its own, insufficient to explain the API of your code, but you're completely violating PEP-8 because you're using the constancy convention on something that's *not* constant. That's 100% counterintuitive. You subvert the *one thing* that all-caps is supposed to communicate.

My whole point is that the thing that all-caps communicates: "you shouldn't change this", is the default assumption for pretty-much every variable you encounter, so it's really not that helpful. If you snooped around and found that the pandas module had a variable called "cucumber" that doesn't appear in the docs at all, would you assume that it's kosher to assign a value to pandas.cucumber?

> Well, then, no, that's not "it's up to you". Something mandated by a style guide is not a tool in your toolbox

Yes, and PEP-8 being the de-facto style guide for Python is the starting place for many mandatory Python style guides, which is why I'm arguing to remove the all caps BS from it.

> But if you leave that part out, then yes, 'final' becomes a tool that you may use if you wish, or ignore if you wish.

Do you think we should add "final", "private", etc. to Python? The point was that such things are clearly not necessary, yet if you suggest removing them, you'll get the exact same "might as well keep it in the toolbox" reaction or worse.

The code style guides at my work aren't unique, they've been recycled for decades. It's like daylight savings time. People have some vague notion that it might have been a good idea in the past and now we just keep going through the motions even though it makes no sense. You just can't shake the notion that somehow, typing in all caps is a great way to convey information in code, when it's an extremely obnoxious means of conveying information in any other written form.

On Fri, Jan 4, 2019 at 6:23 PM Chris Angelico wrote:
> On Sat, Jan 5, 2019 at 11:09 AM Abe Dillon wrote:
> >>
> >> Just to be clear here: you're trying to say that the ALL_CAPS_NAME convention is unnecessary, but you don't use constants.
> >> That kinda weakens your argument a bit :)
> >
> > Just to be clear, I'm currently working in a Java shop where the code-styles are to use all caps for constants, enums, and class-level variables. Most classes in our code base declare a class-level "LOG" or "LOGGER" object. I've found that completely unnecessary. A simple "log" works just fine. I've never been tempted to write over it. It would be impossible in Java anyway since the guidelines are to declare everything "final" (yes... shoot me) anyway. I helped one of the higher-ups in the company write a Python script and he found the lack of straight-jacket harnesses extremely distressing. How could you write code without "final"? or "private"? or type checking?! You have to use consistent white space?!?!
> >
> > It's all Stockholm syndrome.
>
> The fact that the all-caps convention is used differently (even wrongly) in your current Java employment doesn't really impact this.
>
> >> The whole point of the all-caps globals is to tell you a lot about what they are.
> >
> > A lot? The only thing it canonically tells you is to not modify it which is my default assumption with any variable I didn't declare myself and also impossible to do in the case of enums.
> >
> >> I will often use a module-level constant for a file or path name; within the module, it is deliberately treated as a constant, but if you import the module somewhere else, you could reassign it before calling any functions in the module, and they'll all use the changed path.
> >
> > Do you communicate that API through the variable name alone? How so? I would assume any module-level variables are not to be modified unless there was documentation stating otherwise. You really don't need an obnoxious shouty convention to tell people not to change things.
>
> Yeah. By naming it in all caps. That's exactly how that's communicated. Example:
> https://github.com/Rosuav/shed/blob/master/emotify.py#L6
> By default, it's calculated from __file__, but if an external caller wants to change this, it's most welcome to. Since it's in all-caps, you don't have to worry about it being changed or mutated during the normal course of operation.
>
> >> It's up to you whether you actually use the all-caps convention in your own code or not, but IMO it is an extremely useful convention to have in the toolbox, and should be kept.
> >
> > My boss would say:
> >>
> >> It's up to you whether you actually use final in your own code or not, but IMO it is an extremely useful tool to have in the toolbox, and should be kept. (and also you have to use it because it's in the style guide)
>
> Well, then, no, that's not "it's up to you". Something mandated by a style guide is not a tool in your toolbox, it's a requirement of the project. But if you leave that part out, then yes, 'final' becomes a tool that you may use if you wish, or ignore if you wish. (Personally, I would ignore that one, with the exception of "public static final" as an incantation for "class-level constant".)
>
> ChrisA
From noahs2003 at gmail.com  Fri Jan  4 20:09:39 2019
From: noahs2003 at gmail.com (Noah Simon)
Date: Fri, 4 Jan 2019 20:09:39 -0500
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: 
References: 
Message-ID: 

Looking at the arguments regarding rendering existing code non-PEP-8-compliant, I think if we were to make this change it should be made in Python 4.0, or whatever the next backwards-incompatible version will be.

However, personally I disagree with the fundamental assertion that ALL_CAPS is always ugly. In my humble opinion, when used sparingly (as it is according to PEP 8), all-caps names make for a good distinguisher from regular_variables or ClassNames. Since Python doesn't have a "const" or "final" keyword like many strongly typed languages, IDEs would otherwise have a hard time distinguishing between constants and variables. (Although it's up for debate whether it's important to developers if this disambiguation is present.)

Sent from my iPhone

On Jan 4, 2019, at 7:58 PM, Abe Dillon wrote:
>>> I will often use a module-level constant for a file or path name; within the module, it is deliberately treated as a constant, but if you import the module somewhere else, you could reassign it before calling any functions in the module, and they'll all use the changed path.
>>
>> Do you communicate that API through the variable name alone? How so? I would assume any module-level variables are not to be modified unless there was documentation stating otherwise. You really don't need an obnoxious shouty convention to tell people not to change things.
>
>> Yeah. By naming it in all caps. That's exactly how that's communicated. Example:
>> https://github.com/Rosuav/shed/blob/master/emotify.py#L6
>> By default, it's calculated from __file__, but if an external caller wants to change this, it's most welcome to. Since it's in all-caps, you don't have to worry about it being changed or mutated during the normal course of operation.
>
> You made EMOTE_PATH all caps signifying it's a constant, then added a comment saying that it "Can be overwritten prior to calling get_emote_list".
>
> So not only was your naming convention, on its own, insufficient to explain the API of your code, but you're completely violating PEP-8 because you're using the constancy convention on something that's not constant. That's 100% counterintuitive. You subvert the one thing that all-caps is supposed to communicate.
>
> My whole point is that the thing that all-caps communicates: "you shouldn't change this", is the default assumption for pretty-much every variable you encounter, so it's really not that helpful. If you snooped around and found that the pandas module had a variable called "cucumber" that doesn't appear in the docs at all, would you assume that it's kosher to assign a value to pandas.cucumber?
>
>> Well, then, no, that's not "it's up to you". Something mandated by a style guide is not a tool in your toolbox
>
> Yes, and PEP-8 being the de-facto style guide for Python is the starting place for many mandatory Python style guides, which is why I'm arguing to remove the all caps BS from it.
>
>> But if you leave that part out, then yes, 'final' becomes a tool that you may use if you wish, or ignore if you wish.
>
> Do you think we should add "final", "private", etc. to Python? The point was that such things are clearly not necessary, yet if you suggest removing them, you'll get the exact same "might as well keep it in the toolbox" reaction or worse.
>
> The code style guides at my work aren't unique, they've been recycled for decades. It's like daylight savings time. People have some vague notion that it might have been a good idea in the past and now we just keep going through the motions even though it makes no sense. You just can't shake the notion that somehow, typing in all caps is a great way to convey information in code, when it's an extremely obnoxious means of conveying information in any other written form.
>
> On Fri, Jan 4, 2019 at 6:23 PM Chris Angelico wrote:
>> On Sat, Jan 5, 2019 at 11:09 AM Abe Dillon wrote:
>> >>
>> >> Just to be clear here: you're trying to say that the ALL_CAPS_NAME convention is unnecessary, but you don't use constants. That kinda weakens your argument a bit :)
>> >
>> > Just to be clear, I'm currently working in a Java shop where the code-styles are to use all caps for constants, enums, and class-level variables. Most classes in our code base declare a class-level "LOG" or "LOGGER" object. I've found that completely unnecessary. A simple "log" works just fine. I've never been tempted to write over it. It would be impossible in Java anyway since the guidelines are to declare everything "final" (yes... shoot me) anyway. I helped one of the higher-ups in the company write a Python script and he found the lack of straight-jacket harnesses extremely distressing. How could you write code without "final"? or "private"? or type checking?! You have to use consistent white space?!?!
>> >
>> > It's all Stockholm syndrome.
>>
>> The fact that the all-caps convention is used differently (even wrongly) in your current Java employment doesn't really impact this.
>>
>> >> The whole point of the all-caps globals is to tell you a lot about what they are.
>> >
>> > A lot? The only thing it canonically tells you is to not modify it which is my default assumption with any variable I didn't declare myself and also impossible to do in the case of enums.
>> >
>> >> I will often use a module-level constant for a file or path name; within the module, it is deliberately treated as a constant, but if you import the module somewhere else, you could reassign it before calling any functions in the module, and they'll all use the changed path.
>> >
>> > Do you communicate that API through the variable name alone? How so? I would assume any module-level variables are not to be modified unless there was documentation stating otherwise. You really don't need an obnoxious shouty convention to tell people not to change things.
>>
>> Yeah. By naming it in all caps. That's exactly how that's communicated. Example:
>> https://github.com/Rosuav/shed/blob/master/emotify.py#L6
>> By default, it's calculated from __file__, but if an external caller wants to change this, it's most welcome to. Since it's in all-caps, you don't have to worry about it being changed or mutated during the normal course of operation.
>>
>> >> It's up to you whether you actually use the all-caps convention in your own code or not, but IMO it is an extremely useful convention to have in the toolbox, and should be kept.
>> >
>> > My boss would say:
>> >>
>> >> It's up to you whether you actually use final in your own code or not, but IMO it is an extremely useful tool to have in the toolbox, and should be kept. (and also you have to use it because it's in the style guide)
>>
>> Well, then, no, that's not "it's up to you".
>> Something mandated by a style guide is not a tool in your toolbox, it's a requirement of the project. But if you leave that part out, then yes, 'final' becomes a tool that you may use if you wish, or ignore if you wish. (Personally, I would ignore that one, with the exception of "public static final" as an incantation for "class-level constant".)
>>
>> ChrisA

From rosuav at gmail.com  Fri Jan  4 20:15:42 2019
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 5 Jan 2019 12:15:42 +1100
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: 
References: 
Message-ID: 

On Sat, Jan 5, 2019 at 11:58 AM Abe Dillon wrote:
> You made EMOTE_PATH all caps signifying it's a constant, then added a comment saying that it "Can be overwritten prior to calling get_emote_list".
>
> So not only was your naming convention, on its own, insufficient to explain the API of your code, but you're completely violating PEP-8 because you're using the constancy convention on something that's not constant. That's 100% counterintuitive. You subvert the one thing that all-caps is supposed to communicate.
>

If "don't change this externally" is the default, why would we have a naming convention meaning "don't change this externally"? I think you're misunderstanding the way that module constants are used in Python. They CAN be overridden. That is part of the point. All-caps does *not* signify "don't override this".

ChrisA
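(A minimal sketch of the pattern being described — the module and file names here are invented for illustration, not Chris's actual emotify.py:)

    # emotes.py
    import os

    # Computed default; an importer may rebind this before calling
    # get_emote_list(), and every function in the module sees the change.
    EMOTE_PATH = os.path.normpath(os.path.join(os.path.dirname(__file__), "emotes"))

    def get_emote_list():
        # EMOTE_PATH is looked up at call time, so an override takes effect.
        return sorted(os.listdir(EMOTE_PATH))

A caller that wants a different path would do:

    import emotes
    emotes.EMOTE_PATH = "/tmp/test-emotes"  # rebind before use
    names = emotes.get_emote_list()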
If it's not all caps, is it fair game to override it? Can you re-write your emote code so that it's clear that the emote-path variable is part of the API *without* any documentation? How much does all caps really communicate?

On Fri, Jan 4, 2019 at 7:16 PM Chris Angelico wrote:
> On Sat, Jan 5, 2019 at 11:58 AM Abe Dillon wrote:
> > You made EMOTE_PATH all caps signifying it's a constant, then added a
> > comment saying that it "Can be overwritten prior to calling get_emote_list".
> >
> > So not only was your naming convention, on its own, insufficient to
> > explain the API of your code, but you're completely violating PEP-8 because
> > you're using the constancy convention on something that's not constant.
> > That's 100% counterintuitive. You subvert the one thing that all-caps is
> > supposed to communicate.
> >
> If "don't change this externally" is the default, why would we have a
> naming convention meaning "don't change this externally"? I think
> you're misunderstanding the way that module constants are used in
> Python. They CAN be overridden. That is part of the point. All-caps
> does *not* signify "don't override this".
>
> ChrisA
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rosuav at gmail.com Fri Jan 4 21:16:06 2019
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 5 Jan 2019 13:16:06 +1100
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: References: Message-ID:

On Sat, Jan 5, 2019 at 1:11 PM Abe Dillon wrote:
>>
>> All-caps does *not* signify "don't override this".
>
> PEP-8 specifically says to use all caps to signify constants. What exactly do you think the word "constant" means? I know very well Python doesn't support actual constants and that anything can be overridden.
>

At this point, we're way off python-ideas territory. Might be time to reopen the subject on python-list - what IS a constant, who's allowed to change them, etc, etc, etc.

ChrisA

From mertz at gnosis.cx Fri Jan 4 21:45:17 2019
From: mertz at gnosis.cx (David Mertz)
Date: Fri, 4 Jan 2019 21:45:17 -0500
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: References: Message-ID:

Like everyone other than Abe in this thread, I find judicious use of CONSTANTS to be highly readable and useful.

Yes, there is a little wiggle room about just how constant a constant has to be since Python doesn't have a straightforward way to create real constants. Very rarely I might change a value named in all caps. But the distinction between a value intended as fixed and one I merely probably won't change is worth marking typographically.... Especially since there's no actual Python semantics enforcing it.

On Fri, Jan 4, 2019, 2:02 PM Abe Dillon wrote:
> I keep coming back to this great video about
> coding style, and one point in particular rings true to me:
> ALL_CAPS_IS_OBNOXIOUS
>
> It destroys the visual flow of code and for what? To signify a global,
> constant, or Enum? Is that really so important? I don't think so. I think
> the all caps style has out-lived its usefulness and needs to go the way of
> the dodo.
>
> The last time I saw all caps used appropriately was in a YouTube comment
> where some guy was ranting about the communist Jewish banker conspiracy to
> control the world.
> In that case, all caps clearly communicated to me that
> the person was a frothing lunatic (though I find the idea of communist
> bankers intriguing).
>
> Currently PEP-8 prescribes all caps for constants
> and uses the all
> cap variable "FILES" as an example in a different section.
> It
> also appears to be the defacto-standard for enums (based on the
> documentation
> )
>
> I don't think it's necessary to make any breaking changes. Just pep-8 and
> (of less importance) spurious documentation examples.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info Fri Jan 4 22:41:43 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 5 Jan 2019 14:41:43 +1100
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: References: Message-ID: <20190105034143.GS13616@ando.pearwood.info>

On Fri, Jan 04, 2019 at 01:01:51PM -0600, Abe Dillon wrote:
> I keep coming back to this great video about
> coding style, and one point in particular rings true to me:
> ALL_CAPS_IS_OBNOXIOUS
>
> It destroys the visual flow of code

Does it? This claim doesn't ring true to me. To me, "visual flow of code" is the way it flows down and across the page, not the shape of the individual words. To me, long lines spoil the visual flow of code (especially if they are long enough that I have to scroll horizontally to see the end). To me, sudden blocks of unexpected indentation spoil the visual flow of code. (Fortunately, this is rare in Python.)

I've looked over code in the standard library, my own code, and third-party libraries, and I don't see that the choice of name disrupts the flow of code, whether it is written in CamelCase or lowercase or ALLCAPS or even rAnSOmenOTecAse. (Although I admit that last one is quite hard to read...)

I have a bunch of code written in RPL for the HP-48GX calculator, and the convention there is that nearly everything is written in allcaps. Here's an equivalent function to Python's divmod():

<< DUP2 DIV SWAP OVER * ROT SWAP - >>

The flow is fine if you know how to read reverse Polish notation ("Yoda speak"). It flows from left to right, and top down, same as English. Only the word order is different. The flow would be precisely the same if it were written like this:

<< dup2 div swap over * rot swap - >>

Where RPL does suffer from the lack of visual flow is the lack of structure to the code. In Python terms, it would be as if we wrote:

def function():
if condition:
for x in sequence:
do_this()
do_that()
endfor
else:
do_something_else()
endif

Ouch.

The bottom line is, I don't agree that the visual flow of code is negatively affected, or affected at all, by the shape of individual words in the code.

> and for what? To signify a global,
> constant, or Enum? Is that really so important? I don't think so.

I think the convention is useful, of moderate importance, and I think Python code would be ever-so-slightly harder to understand without it.

I rarely, if ever, use allcaps for constants defined and used in a single function, but that's because my functions are typically short enough that you can fit the entire function on screen at once and tell that the name is defined once and never re-bound, hence a constant.
Where the naming convention really makes sense is for module-level constants, where the initial binding is typically separated from the eventual use by a lot of time and space. There, I think it is useful to have a simple naming convention to distinguish between variables and constants. When I see this in the middle of a function:

def foo():
    ...
    process(spam, FILENAME, eggs, ...)
    ...

I can immediately tell that unlike spam and eggs, FILENAME ought to be a global constant, which is a valuable hint that I can probably find the value of FILENAME by looking at the top of the module, and not worry about it being rebound anywhere else.

So yes, having a naming convention for constants is useful. And FILENAME is much better than cfilename or kfilename or constant_filename_please_dont_rebind_ok_thx *wink*

What naming convention would you suggest for distinguishing between constants and variables?

I suppose one might argue that we don't need to care about the semantics of which names are variables and which are constants. In fairness, we cope quite well with modules, classes and functions being effectively constants and yet written in non-allcaps. But on the other hand, we generally can recognise modules, classes and functions by name and usage. We rarely say "process(module)", but we might say "process(module.something)". Classes have their own naming convention. So the analogy between global constants which don't use the allcaps convention (namely modules, classes and functions) and global constants which do is fairly weak. We can (usually) accurately recognise modules, classes and functions from context, but we can't do the same for constants.

> Currently PEP-8 prescribes all caps for constants
> and uses the all cap
> variable "FILES" as an example in a different section.
> It
> also appears to be the defacto-standard for enums (based on the
> documentation
> )

That's because the typical use for enums is as constants. If I had a *variable* which merely held an enum, I wouldn't use allcaps:

# No! Don't do this!
for MYENUM in list_of_enums:
    if condition(MYENUM):
        MYENUM = something_else()
    process(MYENUM)

-- 
Steve

From steve at pearwood.info Fri Jan 4 22:51:39 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 5 Jan 2019 14:51:39 +1100
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: References: Message-ID: <20190105035139.GT13616@ando.pearwood.info>

On Fri, Jan 04, 2019 at 06:15:11PM -0600, Abe Dillon wrote:
> Sure everyone knows what it means, but its meaning is essentially useless
> because the default assumption when you encounter a variable you don't
> know, is that you shouldn't overwrite it. If you found a module-level
> variable in Pandas named cucumber, would you immediately assume you can
> write whatever value you want to pandas.cucumber because it isn't in all
> caps?

Code is read more often than it is written. The important question is rarely "can I overwrite random variables?" but is usually "what is the value of this variable right now?".

If I see a call like:

function(cucumber)

I may have no idea what cucumber holds, or where to find its binding. I may have to work hard to determine (1) where it is initially bound; (2) whether or not it has been, or could be, rebound to something else; (3) and what value it holds at the time it is passed to the function.
If cucumber is a *variable*, that's kind of unavoidable, it's part of what makes programming so ~~frustrating~~ fun and why we have debuggers, tracing and the venerable old technique of inserting print() calls to find out what's going on in our code.

But if I see:

function(CUCUMBER)

that tells me that in idiomatic Python code, CUCUMBER is a constant bound close to the top of the module, and never rebound.

-- 
Steve

From tritium-list at sdamon.com Fri Jan 4 23:04:07 2019
From: tritium-list at sdamon.com (Alex Walters)
Date: Fri, 4 Jan 2019 23:04:07 -0500
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: References: Message-ID: <02f801d4a4ab$b4f0cd70$1ed26850$@sdamon.com>

> -----Original Message-----
> From: Python-ideas list=sdamon.com at python.org> On Behalf Of Abe Dillon
> Sent: Friday, January 4, 2019 2:02 PM
> To: Python-Ideas
> Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
>
> I keep coming back to this great video about
> coding style, and one point in particular rings true to me:
> ALL_CAPS_IS_OBNOXIOUS
>

Then by all means, adopt some other convention for your own project's style guides. PEP-8 is not some dictatorial decree from on high that all python be written one way and one way only - heck, even the standard library ignores PEP-8 when convenient, and the committers are loth to merge patches that only correct style issues.

> It destroys the visual flow of code and for what? To signify a global, constant,
> or Enum? Is that really so important? I don't think so. I think the all caps style
> has out-lived its usefulness and needs to go the way of the dodo.
>

I disagree. They look fine to me, and don't break the flow of reading code for me.

> The last time I saw all caps used appropriately was in a YouTube comment
> where some guy was ranting about the communist Jewish banker conspiracy
> to control the world. In that case, all caps clearly communicated to me that
> the person was a frothing lunatic (though I find the idea of communist
> bankers intriguing).

Last I looked, you could not create variables in YouTube comments, so I don't see how that is germane.

>
> Currently PEP-8 prescribes all caps for constants
> and uses the all
> cap variable "FILES" as an example in a different section.
> commas> It also appears to be the defacto-standard for enums (based on
> the documentation an-enum> )
>
> I don't think it's necessary to make any breaking changes. Just pep-8 and (of
> less importance) spurious documentation examples.

Again, you are free to use whatever style guide you choose - an act which in itself is compliant with PEP-8.

From steve at pearwood.info Fri Jan 4 23:20:53 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 5 Jan 2019 15:20:53 +1100
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: References: Message-ID: <20190105042053.GU13616@ando.pearwood.info>

On Fri, Jan 04, 2019 at 01:01:51PM -0600, Abe Dillon wrote:
> I keep coming back to this great video

I've just watched it, it's a bit slow to start but I agree with Abe that it is a great video. (And not just because the speaker agrees with me about 80 columns :-)

I don't agree with everything he says, but even where I disagree it is great food for thought.

I *strongly* suggest people watch the video, although you might find (as I did) that the main lessons of it are that many common Java idioms exist to work around poor language design, and that IDEs rot the brain.
*semi-wink*

Coming back to the ALLCAPS question, the speaker makes an excellent point that in Java, you don't need a naming convention for constants because the compiler will give an error if you try to write to a constant. But we don't have that in Python. Even if you run a linter that will warn on rebinding of constants, you still need a way to tell the linter that it is a constant.

The speaker also points out that in programming, we only have a very few mechanisms for communicating the meaning of our code:

- names;
- code structure;
- spacing (indentation, grouping).

Code structure is set by the language and there's not much we can do about it (unless you're using a language like FORTH where you can create your own flow control structures). So in practice we only have naming and spacing.

That's an excellent point, but he has missed one more:

* naming conventions.

In Python, we use leading and trailing underscores to give strong hints about usage:

_spam      # private implementation detail
__spam     # same, but with name mangling
__spam__   # overload an operator or other special meaning
spam_      # avoid name clashes with builtins

We typically use CamelCase for classes, making it easy to distinguish classes from instances, modules and functions. And we use ALLCAPS for constants. If that's not needed in Java (I have my doubts...) we should also remember the speaker's very good advice that just because something is needed (or not needed) in language X, doesn't mean that language Y should copy it.

-- 
Steve

From levkivskyi at gmail.com Sat Jan 5 03:34:21 2019
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Sat, 5 Jan 2019 08:34:21 +0000
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: References: Message-ID:

On Sat, 5 Jan 2019 at 02:46, David Mertz wrote:
> Like everyone other than Abe in this thread, I find judicious use of
> CONSTANTS to be highly readable and useful.
>
> Yes, there is a little wiggle room about just how constant a constant has
> to be since Python doesn't have a straightforward way to create real
> constants. Very rarely I might change a value named in all caps. But the
> distinction between a value intended as fixed and one I merely probably
> won't change is worth marking typographically.... Especially since there's
> no actual Python semantics enforcing it.
>

There is. Mypy supports final names, final methods and whatnot
https://mypy.readthedocs.io/en/latest/final_attrs.html

Anyway I don't see a problem in using CAPS for constants, finally it is just a style guide, Python will work even with

class sTYLISH_oNE:
    ...

-- Ivan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steelman at post.pl Sat Jan 5 05:00:52 2019
From: steelman at post.pl (Łukasz Stelmach)
Date: Sat, 05 Jan 2019 10:00:52 +0000
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: References: Message-ID: <861B9B92-E5E9-4B5C-9260-82FA27AEA00A@post.pl>

On January 5, 2019 12:09:44 AM UTC, Abe Dillon wrote:
>> The whole point of the all-caps globals is to tell you a lot about what
>> they are.
>
> A lot? The only thing it canonically tells you is to not modify it

It also tells anyone, who is browsing the code, which values come from a spec of some kind (e.g. network protocols, file formats etc.)

> which is my default assumption with any variable I didn't declare myself

It has been quite some time since I have learnt to avoid arguments like "I can do it, so can everyone else", because they are false.
I like ALL_CAPS and other typographic clues, because I do a lot of integration work and I don't use an IDE to browse every file I need to read.
-- 
Łukasz Stelmach, on the go

From barry at barrys-emacs.org Sat Jan 5 04:32:16 2019
From: barry at barrys-emacs.org (Barry Scott)
Date: Sat, 05 Jan 2019 09:32:16 +0000
Subject: [Python-ideas] Fixed point format for numbers with locale based separators
In-Reply-To: <1541732215.575168.1546613873762@poczta.home.pl>
References: <1541732215.575168.1546613873762@poczta.home.pl>
Message-ID: <3958640.cMbgjVjVHv@varric.chelsea.private>

On Friday, 4 January 2019 14:57:53 GMT Łukasz Stelmach wrote:
> Hi,
>
> I would like to present two pull requests[1][2] implementing fixed point
> presentation of numbers and ask for comments. The first is mine. I
> learnt about the second after publishing mine.
>
> The only format using decimal separator from locale data for
> float/complex/decimal numbers at the moment is "n" which behaves like
> "g". The drawback of these formats, I would like to overcome, is the
> inability to print numbers ranging more than one order of magnitude with
> the same number of decimal digits without "manually" (with some additional
> custom code) adjusting precision. The other option is to "manually"
> replace "." as printed by "f" with a local decimal separator. Neither of
> these options is appealing to me.
>
> Formatting 1.23456789 * n (LC_ALL=pl_PL.UTF-8)
>
> | n | ".2f"    | ".3n"    |
> |---+----------+----------|
> | 1 | 1.23     | 1,23     |
> | 2 | 12.35    | 12,3     |
> | 3 | 123.46   | 123      |
> | 4 | 1234.57  | 1,23e+03 |

Can you use locale.format_string() to solve this?

I used this to test:

import locale

n = 1.23456789
for order in range(5):
    m = n * (10**order)
    for lang in ('en_GB.utf8', 'pl_PL.utf8'):
        locale.setlocale(locale.LC_ALL, lang)
        print( 'python %%.2f in %s: %.2f' % (lang, m) )
        print( locale.format_string('locale %%.2f in %s: %.2f', (lang, m), grouping=True) )
    print()

Which outputs:

python %.2f in en_GB.utf8: 1.23
locale %.2f in en_GB.utf8: 1.23
python %.2f in pl_PL.utf8: 1.23
locale %.2f in pl_PL.utf8: 1,23

python %.2f in en_GB.utf8: 12.35
locale %.2f in en_GB.utf8: 12.35
python %.2f in pl_PL.utf8: 12.35
locale %.2f in pl_PL.utf8: 12,35

python %.2f in en_GB.utf8: 123.46
locale %.2f in en_GB.utf8: 123.46
python %.2f in pl_PL.utf8: 123.46
locale %.2f in pl_PL.utf8: 123,46

python %.2f in en_GB.utf8: 1234.57
locale %.2f in en_GB.utf8: 1,234.57
python %.2f in pl_PL.utf8: 1234.57
locale %.2f in pl_PL.utf8: 1 234,57

python %.2f in en_GB.utf8: 12345.68
locale %.2f in en_GB.utf8: 12,345.68
python %.2f in pl_PL.utf8: 12345.68
locale %.2f in pl_PL.utf8: 12 345,68

Barry

From p.f.moore at gmail.com Sat Jan 5 05:14:47 2019
From: p.f.moore at gmail.com (Paul Moore)
Date: Sat, 5 Jan 2019 10:14:47 +0000
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: References: Message-ID:

On Sat, 5 Jan 2019 at 00:05, Michael Foord wrote:
>
> On Fri, 4 Jan 2019 at 19:02, Abe Dillon wrote:
>>
>> I keep coming back to this great video about coding style, and one point in particular rings true to me: ALL_CAPS_IS_OBNOXIOUS
>
> I really like the convention. It's nice and clear and absolutely everyone knows what it means.

So do I. And I just looked at the linked video, and notice that it's originally about Java. There's no obvious reason to me that Java conventions should transfer to Python. Java has "public static final" to declare actual constants (yay Java terseness! :-)) and has no global scope (so the all-caps is at class level, not global).
-1 on this proposal. With all the "style checker" tools out there, even if it's only an optional suggestion, it'll end up getting mandated all over the place...

Paul

From arj.python at gmail.com Sat Jan 5 05:47:36 2019
From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer)
Date: Sat, 5 Jan 2019 14:47:36 +0400
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: References: Message-ID:

Knowing this mailing list, I think there were not enough reasons for the no-caps case. While waiting for the emoji era, I think it stands good.

Abdur-Rahmaan Janhangeer
http://www.pythonmembers.club | https://github.com/Abdur-rahmaanJ
Mauritius
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From uamr567 at sina.com Sat Jan 5 07:52:16 2019
From: uamr567 at sina.com (Moon sun)
Date: Sat, 05 Jan 2019 20:52:16 +0800
Subject: [Python-ideas] Make the @contextmanager of contextlib to be a real contextmanager
Message-ID: <20190105125216.12E492D00095@webmail.sinamail.sina.com.cn>

As we know, when we import the module 'contextlib', we can use the decorator '@contextmanager' and the keyword 'yield' to make an instance of the class '_GeneratorContextManager' in the 'contextlib' module. Then we can use it like:

with 'instance' as 'xx':
    'code block'
    pass

But there is a little bug: when the code block raises an error, the instance cannot run the code which comes after the keyword 'yield'. So I try to repair this bug: I add a method '_next()' in the class '_GeneratorContextManager', and then insert it into the method '__exit__'. Then we can make the instance behave like a real context manager. You can see the concrete content in the attachment.

Comments: the method '_next()' is at line 79, the method '__exit__' at line 97.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: _contextlib.py
Type: text/x-python
Size: 13696 bytes
Desc: not available
URL: 

From storchaka at gmail.com Sat Jan 5 09:45:34 2019
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sat, 5 Jan 2019 16:45:34 +0200
Subject: [Python-ideas] Make the @contextmanager of contextlib to be a real contextmanager
In-Reply-To: <20190105125216.12E492D00095@webmail.sinamail.sina.com.cn>
References: <20190105125216.12E492D00095@webmail.sinamail.sina.com.cn>
Message-ID:

05.01.19 14:52, Moon sun wrote:
> As we know, when we import the module 'contextlib', we can use the
> decorator '@contextmanager' and the keyword 'yield' to make an instance of
> the class '_GeneratorContextManager' in the 'contextlib' module. Then we
> can use it like:
>
> with 'instance' as 'xx':
>     'code block'
>     pass
>
> But there is a little bug: when the code block raises an error, the
> instance cannot run the code which comes after the keyword 'yield'.

This is not a bug. Consider the following example:

@contextmanager
def cm():
    try:
        yield
    except BaseException as err:
        print('Fail:', err)
        raise
    else:
        print('Success')

with cm():
    1/0

What result would you expect? Test it with the stdlib implementation and with your implementation.
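For reference, the conventional way to guarantee that cleanup always runs is to put it in a finally clause inside the generator itself. A minimal sketch (managed() and the open() call are only for illustration):

from contextlib import contextmanager

@contextmanager
def managed(name):
    f = open(name)
    try:
        yield f
    finally:
        # runs whether the with-block finished normally or raised
        f.close()

This is the same try/finally pattern the contextlib documentation shows for cleanup.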
From steelman at post.pl Sat Jan 5 10:41:20 2019
From: steelman at post.pl (Łukasz Stelmach)
Date: Sat, 05 Jan 2019 16:41:20 +0100
Subject: [Python-ideas] Fixed point format for numbers with locale based separators
In-Reply-To: <20190104165600.GR13616@ando.pearwood.info> (Steven D'Aprano's message of "Sat, 5 Jan 2019 03:56:01 +1100")
References: <1541732215.575168.1546613873762@poczta.home.pl> <20190104165600.GR13616@ando.pearwood.info>
Message-ID: <87y37z9knz.fsf%steelman@post.pl>

Steven D'Aprano writes:
> On Fri, Jan 04, 2019 at 03:57:53PM +0100, Łukasz Stelmach wrote:
>> Hi,
>>
>> I would like to present two pull requests[1][2] implementing fixed point
>> presentation of numbers and ask for comments. The first is mine. I
>> learnt about the second after publishing mine.
>
> Before I look at the implementation, can you explain the functional
> requirements please?

As I stated in the original message below the table:

>> In the application I want to create I am going to present users numbers
>> ranging up to 3 orders of magnitude and I (my users) want them to be
>> presented consistently with regards to number of decimal digits AND I
>> want to conform to rules of languages of my users. And I would like to
>> avoid the exponent notation by all means.

The pint[1] library I use implements formatting of physical quantities using the format()/__format__ code. As far as I can tell my patch for Python is shorter and more straightforward than a patch for pint to use locale.format().

Because the "g" based "n" formatter has been present since the advanced string formatting was described in PEP-3101, I think it is necessary to add the "m" formatter based on "f". The advanced string formatting facility in Python is very convenient and programmers shouldn't be forced to use locale.format() like this

"The total length of {} sticks is {} meters.".format(n_sticks, locale.format(".2f", l_sticks))

instead of

"The total length of {} sticks is {:.2f} meters.".format(n_sticks, l_sticks)

> In other words, what is the new feature you hope to have excepted?
> Explain the intention and the API (the interface). The implementation is
> the least important part :-)

I wish to add a new formatter "m" for float/complex/decimal numbers, which behaves like the existing "f", but uses the decimal separator from the locale database. There is the "n" formatter which behaves like "g" but it does not fit my needs.

> [...]
>> Formatting 1.23456789 * n (LC_ALL=pl_PL.UTF-8)
>> | n | ".2f"    | ".3n"    |
>> |---+----------+----------|
>> | 1 | 1.23     | 1,23     |
>> | 2 | 12.35    | 12,3     |
>> | 3 | 123.46   | 123      |
>> | 4 | 1234.57  | 1,23e+03 |
>
> I'm afraid I cannot work out what that table means. You say "Formatting
> 1.23... * n" (multiplying by n) but the results shown aren't multiplied
> by n=2, n=3, n=4 as the table suggests.
>
> Can you show what Python code you expect will produce the expected
> output?

for n in range(1,5):
    print("| {} | {:8.2f} | {:8.3n} |".format(n, 1.23456789 * 10**n, 1.23456789 * 10**n))

[1] http://pint.readthedocs.io/
-- 
It was very nice. --- Rurku. --- ... >Łukasz< --- It's good that you listen to me.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 617 bytes
Desc: not available
URL: 

From python-ideas at mgmiller.net Sat Jan 5 13:21:41 2019
From: python-ideas at mgmiller.net (Mike Miller)
Date: Sat, 5 Jan 2019 10:21:41 -0800
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: References: Message-ID: <6211e654-f0a2-ae39-3f7b-edf38aa3c334@mgmiller.net>

On 1/5/19 12:34 AM, Ivan Levkivskyi wrote:
> There is. Mypy supports final names, final methods and whatnot
> https://mypy.readthedocs.io/en/latest/final_attrs.html

Believe this^^ is the best answer, unfortunately buried. Use typing hints for constants and tools that support them, and all-caps is no longer needed.

An additional sentence that mentions this as an alternative in PEP8 sounds useful.
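For instance, a minimal sketch of what that looks like (note: Final comes from typing_extensions for use with mypy, not from the stdlib typing module, and MAX_RETRIES is just an illustrative name):

from typing_extensions import Final

MAX_RETRIES: Final = 5

MAX_RETRIES = 6  # mypy: error: Cannot assign to final name "MAX_RETRIES"

Nothing is enforced at runtime; the constancy is checked only by the type checker.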
-Mike

From steelman at post.pl Sat Jan 5 15:03:06 2019
From: steelman at post.pl (Łukasz Stelmach)
Date: Sat, 05 Jan 2019 21:03:06 +0100
Subject: [Python-ideas] Fixed point format for numbers with locale based separators
In-Reply-To: <3958640.cMbgjVjVHv@varric.chelsea.private> (Barry Scott's message of "Sat, 05 Jan 2019 09:32:16 +0000")
References: <1541732215.575168.1546613873762@poczta.home.pl> <3958640.cMbgjVjVHv@varric.chelsea.private>
Message-ID: <87r2dqan45.fsf%steelman@post.pl>

Barry Scott writes:
> On Friday, 4 January 2019 14:57:53 GMT Łukasz Stelmach wrote:
>>
>> I would like to present two pull requests[1][2] implementing fixed point
>> presentation of numbers and ask for comments. The first is mine. I
>> learnt about the second after publishing mine.
>>
>> The only format using decimal separator from locale data for
>> float/complex/decimal numbers at the moment is "n" which behaves like
>> "g". The drawback of these formats, I would like to overcome, is the
>> inability to print numbers ranging more than one order of magnitude with
>> the same number of decimal digits without "manually" (with some additional
>> custom code) adjusting precision. The other option is to "manually"
>> replace "." as printed by "f" with a local decimal separator. Neither of
>> these options is appealing to me.
>>
>> Formatting 1.23456789 * n (LC_ALL=pl_PL.UTF-8)
>>
>> | n | ".2f"    | ".3n"    |
>> |---+----------+----------|
>> | 1 | 1.23     | 1,23     |
>> | 2 | 12.35    | 12,3     |
>> | 3 | 123.46   | 123      |
>> | 4 | 1234.57  | 1,23e+03 |
>
> Can you use locale.format_string() to solve this?

I am afraid I can't. I am using a library called pint[1] in my project. It allows me to choose how its objects are formatted but it uses format() internally. It adds some custom extensions to format strings which, as far as I can tell, makes it hard if not impossible to patch it to use locale.format_string(). But this is rather an excuse. I think I had this problem some time ago and I got away with locale.format_string() then, but honestly I think format()/string.format/__format__ should support a locale-aware "f" just like there is "n" that behaves like "g".

[1] http://pint.readthedocs.io/
-- 
It was very nice. --- Rurku. --- ... >Łukasz< --- It's good that you listen to me.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 617 bytes
Desc: not available
URL: 

From simon.bordeyne at gmail.com Sat Jan 5 19:38:33 2019
From: simon.bordeyne at gmail.com (Simon)
Date: Sun, 6 Jan 2019 01:38:33 +0100
Subject: [Python-ideas] Possible PEP regarding the use of the continue keyword in try/except blocks
In-Reply-To: References: Message-ID:

I was writing some python code earlier, and I noticed that in a code that looks somewhat like this one:

try:
    i = int("string")
    print("continued on")
    j = int(9.0)
except ValueError as e:
    print(e)

>>> "invalid literal for int() with base 10: 'string'"

this code will handle the exception, but the code in the try block will not continue.

I propose to be able to use the continue keyword to continue the execution of the try block even when an error is handled. The above could then be changed to:

try:
    i = int("string")
    print("continued on")
    j = int(9.0)
except ValueError as e:
    print(e)
    continue

>>> "invalid literal for int() with base 10: 'string'"
>>> "continued on"
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From Richard at Damon-Family.org Sat Jan 5 19:49:30 2019
From: Richard at Damon-Family.org (Richard Damon)
Date: Sat, 5 Jan 2019 19:49:30 -0500
Subject: [Python-ideas] Possible PEP regarding the use of the continue keyword in try/except blocks
In-Reply-To: References: Message-ID:

On 1/5/19 7:38 PM, Simon wrote:
>
> I was writing some python code earlier, and I noticed that in a code
> that looks somewhat like this one:
>
> try:
>     i = int("string")
>     print("continued on")
>     j = int(9.0)
> except ValueError as e:
>     print(e)
>
> >>> "invalid literal for int() with base 10: 'string'"
>
> this code will handle the exception, but the code in the try block
> will not continue.
>
> I propose to be able to use the continue keyword to continue the
> execution of the try block even when an error is handled. The above
> could then be changed to:
>
> try:
>     i = int("string")
>     print("continued on")
>     j = int(9.0)
> except ValueError as e:
>     print(e)
>     continue
>
> >>> "invalid literal for int() with base 10: 'string'"
> >>> "continued on"
>

How would you tell it where to continue? Why would it be the next statement? If you want that then you just need to do it like:

try:
    i = int("string")
except ValueError as e:
    print(e)
print("continued on")
j = int(9.0)

i.e. the try block is the program segment that either executes successfully, or your exception routine recovers from the error and sets things up to continue from there.

-- 
Richard Damon

From 2QdxY4RzWzUUiLuE at potatochowder.com Sat Jan 5 17:53:09 2019
From: 2QdxY4RzWzUUiLuE at potatochowder.com (Dan Sommers)
Date: Sat, 5 Jan 2019 16:53:09 -0600
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: <6211e654-f0a2-ae39-3f7b-edf38aa3c334@mgmiller.net>
References: <6211e654-f0a2-ae39-3f7b-edf38aa3c334@mgmiller.net>
Message-ID:

On 1/5/19 12:21 PM, Mike Miller wrote:
> On 1/5/19 12:34 AM, Ivan Levkivskyi wrote:
> > There is. Mypy supports final names, final methods and whatnot
> > https://mypy.readthedocs.io/en/latest/final_attrs.html
>
> Believe this^^ is the best answer, unfortunately buried.

The type hinting is physically separate (often in a different module) from the usage. If I'm looking at some code that uses the constant, the type hint is somewhere else.
> Use typing hints for constants and tools that support them, and
> all-caps is no longer needed.

Requiring a tool/IDE to highlight this attribute is a step backwards. Can your email client find the type hint when it's in some other python module? Will proponents of using type hints to provide this information also type them into answers on Stack Overflow?

Dan

From eric at trueblade.com Sat Jan 5 19:48:27 2019
From: eric at trueblade.com (Eric V. Smith)
Date: Sat, 5 Jan 2019 19:48:27 -0500
Subject: [Python-ideas] Fixed point format for numbers with locale based separators
In-Reply-To: <87r2dqan45.fsf%steelman@post.pl>
References: <1541732215.575168.1546613873762@poczta.home.pl> <3958640.cMbgjVjVHv@varric.chelsea.private> <87r2dqan45.fsf%steelman@post.pl>
Message-ID: <7797ca18-ab81-7bd7-7912-1df2305fd23f@trueblade.com>

On 1/5/2019 3:03 PM, Łukasz Stelmach wrote:
> Barry Scott writes:
>> On Friday, 4 January 2019 14:57:53 GMT Łukasz Stelmach wrote:
>>>
>>> I would like to present two pull requests[1][2] implementing fixed point
>>> presentation of numbers and ask for comments. The first is mine. I
>>> learnt about the second after publishing mine.
>>>
>>> The only format using decimal separator from locale data for
>>> float/complex/decimal numbers at the moment is "n" which behaves like
>>> "g". The drawback of these formats, I would like to overcome, is the
>>> inability to print numbers ranging more than one order of magnitude with
>>> the same number of decimal digits without "manually" (with some additional
>>> custom code) adjusting precision. The other option is to "manually"
>>> replace "." as printed by "f" with a local decimal separator. Neither of
>>> these options is appealing to me.
>>>
>>> Formatting 1.23456789 * n (LC_ALL=pl_PL.UTF-8)
>>>
>>> | n | ".2f"    | ".3n"    |
>>> |---+----------+----------|
>>> | 1 | 1.23     | 1,23     |
>>> | 2 | 12.35    | 12,3     |
>>> | 3 | 123.46   | 123      |
>>> | 4 | 1234.57  | 1,23e+03 |
>>
>> Can you use locale.format_string() to solve this?
>
> I am afraid I can't. I am using a library called pint[1] in my
> project. It allows me to choose how its objects are formatted but it uses
> format() internally. It adds some custom extensions to format strings
> which, as far as I can tell, makes it hard if not impossible to patch it
> to use locale.format_string(). But this is rather an excuse.

I do think that this is a compelling use case for "f" style locale-aware formatting. I support adding it in some format or another (pun intended).

My only concern is how to paint the bike shed. Should we just use another format spec "type" character instead of "f", as the two linked issues propose? Or maybe use an additional "alternate form" style character, so that we could use different locale options, either now or in the future?

https://bugs.python.org/issue33731 is similar to https://bugs.python.org/issue34311 but proposes using LC_MONETARY instead of LC_NUMERIC. I'm not suggesting we solve every possible problem here, but we at least shouldn't paint ourselves into a corner and instead allow a future where we could expand things, if needed, and without using up tons of format spec "type" characters for every permutation of "type" plus LC_MONETARY or LC_NUMERIC.

Here's a straw man:

The current specification for the format spec is:

[[fill]align][sign][#][0][width][grouping_option][.precision][type]

Let's say we change it to:

[[fill]align][sign][#][*|$][0][width][grouping_option][.precision][type]

(I think that's unambiguous, but I'd have to think it through some more)

Let's call the new [*|$] character the "locale character".

If the locale character is "*", use locale-aware formatting for the given "type", with LC_NUMERIC. So, "*g" would be equivalent to the existing "n", and "*f" would give you the current "f" formatting, except using LC_NUMERIC for the decimal point. If the locale character is "$" use locale-aware LC_MONETARY. So then we could use "$g", "$f", etc.

These locale characters would also work with int, so "*d" would make "n" obsolete (but I'm not proposing to remove it).

These should also work with these "type" values for floats: '%', 'f', 'F', 'g', 'G', 'e', 'E', and None (as defined in the docs to mean a missing "type", not a real None value).

I don't know if there are any cases where '#' alternate form would be used with '*' or '$'. If not, then maybe we could make the format spec be the slightly simpler:

[[fill]align][sign][#|*|$][0][width][grouping_option][.precision][type]

But it's probably worth keeping '#' orthogonal to the locale character. Maybe someday we'll want to use them together.

The locale character should be supported in the numeric types that support the default format spec mini-language: int, float, decimal, and complex, at least. I'd have to grep for others.

I think that for format spec "type" values where it doesn't make sense, using these new locale characters would raise ValueError. For example, since "b" output can never be locale-aware, "*b" would be an error, much like ",b" is currently an error.

I'm not married to '*' for LC_NUMERIC, although I think '$' makes sense for LC_MONETARY.

Again, this is just a straw man proposal that would require fleshing out. I think it might also require a PEP, but it would be as simple as PEP 378 for adding comma grouping formatting. Somewhere to memorialize the decision and how we got there, including rejected alternate proposals, would be a good thing.
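For what it's worth, the LC_NUMERIC half of "*f" can be roughly emulated today with locale.localeconv() -- a sketch only; format_locale_f is a made-up helper, not a real or proposed API:

import locale

def format_locale_f(value, precision=2):
    # Format with the plain "f" type, then swap in the decimal
    # separator reported by the current LC_NUMERIC locale.
    s = format(value, '.{}f'.format(precision))
    return s.replace('.', locale.localeconv()['decimal_point'])

After locale.setlocale(locale.LC_ALL, 'pl_PL.UTF-8'), format_locale_f(1234.5678) gives '1234,57'. A real implementation would of course live in the format spec machinery itself.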
Eric

From uamr567 at sina.com Sun Jan 6 00:12:39 2019
From: uamr567 at sina.com (Moon sun)
Date: Sun, 06 Jan 2019 13:12:39 +0800
Subject: [Python-ideas] Re: Python-ideas Digest, Vol 146, Issue 13
Message-ID: <20190106051239.A45B118C008B@webmail.sinamail.sina.com.cn>

Thanks for your reply. But the answer is not what I expect; I will show you some examples to explain what result I expect:

@contextmanager
def cm():
    print('open file')
    yield
    print('close file')

with cm():
    1/0

If I use a context manager, I expect it can help me to close the file anytime, even when an error is raised. But if I define a function with @contextmanager like the example I have shown you, it will never print('close file').

I can only modify it like this:

@contextmanager
def cm():
    try:
        print('open file')
        yield
    except Exception as e:
        print('Error', e)
    finally:
        print('close file')

It is not friendly for us to use it, so I modified the contextlib to fix it; you can catch it from the e-mail attachment. It's in line 79 and line 97.
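For comparison, a plain class-based context manager already gives this behaviour without changing contextlib, because __exit__ always runs, even when the body raises -- a minimal sketch:

class cm:
    def __enter__(self):
        print('open file')

    def __exit__(self, exc_type, exc, tb):
        print('close file')  # runs even though the body raised
        return False         # returning False does not suppress the exception

with cm():
    1/0

This prints 'open file' and 'close file' and then lets the ZeroDivisionError propagate.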
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: _contextlib.py
Type: text/x-python
Size: 14088 bytes
Desc: not available
URL: 

From njs at pobox.com Sun Jan 6 00:23:40 2019
From: njs at pobox.com (Nathaniel Smith)
Date: Sat, 5 Jan 2019 21:23:40 -0800
Subject: [Python-ideas] Re: Python-ideas Digest, Vol 146, Issue 13
In-Reply-To: <20190106051239.A45B118C008B@webmail.sinamail.sina.com.cn>
References: <20190106051239.A45B118C008B@webmail.sinamail.sina.com.cn>
Message-ID:

On Sat, Jan 5, 2019 at 9:13 PM Moon sun wrote:
>
> Thanks for your reply. But the answer is not what I expect; I will show you some examples to explain what result I expect:
>
> @contextmanager
> def cm():
>     print('open file')
>     yield
>     print('close file')
>
> with cm():
>     1/0
>
> If I use a context manager, I expect it can help me to close the file anytime, even when an error is raised. But if I define a function with @contextmanager like the example I have shown you, it will never print('close file').
>
> I can only modify it like this:
>
> @contextmanager
> def cm():
>     try:
>         print('open file')
>         yield
>     except Exception as e:
>         print('Error', e)
>     finally:
>         print('close file')
>
> It is not friendly for us to use it, so I modified the contextlib to fix it; you can catch it from the e-mail attachment. It's in line 79 and line 97.

This is intentional, and can't be changed without breaking lots of code.
With your version, there's no way for the context manager to catch or modify the exception, which is a common use case. For example, here's a context manager I wrote recently:

@contextmanager
def catch_and_log(exctype):
    try:
        yield
    except exctype:
        log.exception(...)

This can't be done using your version.

Of course you can have your own version of @contextmanager that works however you prefer.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From steve at pearwood.info Sun Jan 6 00:31:34 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 6 Jan 2019 16:31:34 +1100
Subject: [Python-ideas] Possible PEP regarding the use of the continue keyword in try/except blocks
In-Reply-To: References: Message-ID: <20190106053133.GW13616@ando.pearwood.info>

On Sun, Jan 06, 2019 at 01:38:33AM +0100, Simon wrote:
> I propose to be able to use the continue keyword to continue the execution
> of the try block even when an error is handled. The above could then be
> changed to:
>
> try:
>     i = int("string")
>     print("continued on")
>     j = int(9.0)
> except ValueError as e:
>     print(e)
>     continue
>
> >>> "invalid literal for int() with base 10: 'string'"
> >>> "continued on"

That's literally what the except clause is intended for, not the body of the try block.

try:
    i = int("string")
except ValueError as e:
    print(e)
    print("continued on")
    j = int(9.0)

Putting the error handler in the try block is a serious problem for two reasons:

(1) What if an error *doesn't* occur? The error handler occurs anyway:

try:
    i = int("1234")
    # This succeeds.
    # and then immediately runs the error handler... oops!
    print("continued on")
    j = int(9.0)
except ValueError as e:
    print(e)
    continue

(2) And what happens if the error handler raises an error? The exception is raised, the except clause is called, the continue statement jumps back to the same block of code that just failed and will fail again. And again. And again. Forever.

-- 
Steve

From robertc at robertcollins.net Sun Jan 6 01:15:13 2019
From: robertc at robertcollins.net (Robert Collins)
Date: Sun, 6 Jan 2019 19:15:13 +1300
Subject: [Python-ideas] Possible PEP regarding the use of the continue keyword in try/except blocks
In-Reply-To: References: Message-ID:

On Sun., 6 Jan. 2019, 13:39 Simon wrote:
> I was writing some python code earlier, and I noticed that in a code that
> looks somewhat like this one:
>
> try:
>     i = int("string")
>     print("continued on")
>     j = int(9.0)
> except ValueError as e:
>     print(e)
>
> >>> "invalid literal for int() with base 10: 'string'"
>
> this code will handle the exception, but the code in the try block will
> not continue.
>
> I propose to be able to use the continue keyword to continue the execution
> of the try block even when an error is handled. The above could then be
> changed to:

In terms of implementation, I think continue would be problematic:

while True:
    try:
        x = foo()
        return x
    except:
        continue

is already valid code. You'd need some way of disambiguating, either a keyword or parameter to continue. Both of which would require a very big benefit for us to do, given the ecosystem impact that such things have.

> try:
>     i = int("string")
>     print("continued on")
>     j = int(9.0)
> except ValueError as e:
>     print(e)
>     continue
>
> >>> "invalid literal for int() with base 10: 'string'"
> >>> "continued on"

Exception handling is not internally line orientated, so this proposed resume functionality doesn't map exactly.
But if the following works in the same way as what you envision:

def handle(f, *args):
    try:
        return f(*args)
    except ValueError as e:
        print(e)

i = handle(int, "string")
handle(print, "continued on")
j = handle(int, 9.0)

Then I have to say I'm not sure what you are trying to solve. Is it the verbosity? Is it the flow control?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From amber.yust at gmail.com Sun Jan 6 03:07:31 2019
From: amber.yust at gmail.com (Amber Yust)
Date: Sun, 6 Jan 2019 00:07:31 -0800
Subject: [Python-ideas] Possible PEP regarding the use of the continue keyword in try/except blocks
In-Reply-To: References: Message-ID:

On Sat, Jan 5, 2019 at 4:39 PM Simon wrote:
> I propose to be able to use the continue keyword to continue the execution
> of the try block even when an error is handled. The above could then be
> changed to:
>
> try:
>     i = int("string")
>     print("continued on")
>     j = int(9.0)
> except ValueError as e:
>     print(e)
>     continue
>
> >>> "invalid literal for int() with base 10: 'string'"
> >>> "continued on"

There is already a much simpler way of doing this:

try:
    i = int("string")
except ValueError as e:
    print(e)
print("continued on")
j = int(9.0)

The point of the 'try' block is to encapsulate the code you want to *stop* executing if an exception is raised. If you want code to be run regardless of whether an exception is raised, move it past the try-except.

~Amber
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From skrah at bytereef.org Sun Jan 6 16:22:00 2019
From: skrah at bytereef.org (Stefan Krah)
Date: Sun, 6 Jan 2019 22:22:00 +0100
Subject: [Python-ideas] Fixed point format for numbers with locale based separators
Message-ID: <20190106212200.GA12395@bytereef.org>

Eric V. Smith wrote:
> If the locale character is "*", use locale-aware formatting for the
> given "type", with LC_NUMERIC. So, "*g" would be equivalent to the
> existing "n", and "*f" would give you the current "f" formatting, except
> using LC_NUMERIC for the decimal point. If the locale character is "$"
> use locale-aware LC_MONETARY. So then we could use "$g", "$f", etc.
> These locale characters would also work with int, so "*d" would make "n"
> obsolete (but I'm not proposing to remove it).

+1. I also think it's best to have a modifier and cover all cases.

> But it's probably worth keeping '#' orthogonal to the locale character.
> Maybe someday we'll want to use them together.

Yes, somehow it feels right to keep them separate, even if we never use them together.

> I think it might also require a PEP, but it would be as simple as
> PEP 378 for adding comma grouping formatting. Somewhere to memorialize
> the decision and how we got there, including rejected alternate
> proposals, would be a good thing.

It would be nice (if anyone wants to do the work), but your proposal is already perfect for me. I like the "*" and "$" choices.

Stefan Krah

From chris.barker at noaa.gov Sun Jan 6 17:29:35 2019
From: chris.barker at noaa.gov (Chris Barker)
Date: Sun, 6 Jan 2019 14:29:35 -0800
Subject: [Python-ideas] Possible PEP regarding the use of the continue keyword in try/except blocks
In-Reply-To: References: Message-ID:

> There is already a much simpler way of doing this:
>
> try:
>     i = int("string")
> except ValueError as e:
>     print(e)
> print("continued on")
> j = int(9.0)
>
> The point of the 'try' block is to encapsulate the code you want to *stop*
> executing if an exception is raised.
> If you want code to be run regardless
> of whether an exception is raised, move it past the try-except.
>

To be fair, I suspect the issue was there were two calls to int() there that might raise a ValueError, and the OP wanted to catch them with one except, so you would need to do something like:

try:
    i = int("string")
except ValueError as e:
    print(e)
print("continued on")
try:
    j = int(9.0)
except ValueError as e:
    print(e)

Which can seem a bit verbose, but in fact, there are a number of ways one might want to proceed with/without an error, and the current except, finally, else options cover them all in a clearly defined way.

-CHB

> ~Amber
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info Sun Jan 6 19:27:22 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 7 Jan 2019 11:27:22 +1100
Subject: [Python-ideas] NAN handling in the statistics module
Message-ID: <20190107002722.GA13616@ando.pearwood.info>

Bug #33084 reports that the statistics library calculates median and other stats wrongly if the data contains NANs. Worse, the result depends on the initial placement of the NAN:

py> from statistics import median
py> NAN = float('nan')
py> median([NAN, 1, 2, 3, 4])
2
py> median([1, 2, 3, 4, NAN])
3

See the bug report for more detail:

https://bugs.python.org/issue33084

The caller can always filter NANs out of their own data, but following the lead of some other stats packages, I propose a standard way for the statistics module to do so. I hope this will be uncontroversial (he says, optimistically...) but just in case, here is some prior art:

(1) Nearly all R stats functions take a "na.rm" argument which defaults to False; if True, NA and NAN values will be stripped.

(2) The scipy.stats.ttest_ind function takes a "nan_policy" argument which specifies what to do if a NAN is seen in the data.

https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html

(3) At least some Matlab functions, such as mean(), take an optional flag that determines whether to ignore NANs or include them.

https://au.mathworks.com/help/matlab/ref/mean.html#bt5b82t-1-nanflag

I propose adding a "nan_policy" keyword-only parameter to the relevant statistics functions (mean, median, variance etc), and defining the following policies:

IGNORE: quietly ignore all NANs
FAIL: raise an exception if any NAN is seen in the data
PASS: pass NANs through unchanged (the default)
RETURN: return a NAN if any NAN is seen in the data
WARN: ignore all NANs but raise a warning if one is seen

PASS is equivalent to saying that you, the caller, have taken full responsibility for filtering out NANs and there's no need for the function to slow down processing by doing so again. Either that, or you want the current implementation-dependent behaviour.

FAIL is equivalent to treating all NANs as "signalling NANs". The presence of a NAN is an error.

RETURN is equivalent to "NAN poisoning" -- the presence of a NAN in a calculation causes it to return a NAN, allowing NANs to propagate through multiple calculations.

IGNORE and WARN are the same, except IGNORE is silent and WARN raises a warning.
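To make the semantics concrete, here is a rough sketch of a helper the affected functions might call first (illustrative only -- this is not the proposed implementation, and _coerce_nans is a made-up name):

import math
import warnings

def _coerce_nans(data, nan_policy='pass'):
    def _isnan(x):
        return isinstance(x, float) and math.isnan(x)
    data = list(data)
    if nan_policy == 'pass' or not any(_isnan(x) for x in data):
        return data
    if nan_policy == 'fail':
        raise ValueError('NAN found in data')
    if nan_policy == 'return':
        return [math.nan]  # caller short-circuits and returns a NAN
    if nan_policy == 'warn':
        warnings.warn('NAN found in data')
    # both 'ignore' and 'warn' strip the NANs before computing
    return [x for x in data if not _isnan(x)]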
IGNORE and WARN are the same, except IGNORE is silent and WARN raises a
warning.

Questions:

- does anyone have any serious objections to this?

- what do you think of the names for the policies?

- are there any additional policies that you would like to see?
  (if so, please give use-cases)

- are you happy with the default?

Bike-shed away!

-- 
Steve

From mertz at gnosis.cx Sun Jan 6 19:46:03 2019
From: mertz at gnosis.cx (David Mertz)
Date: Sun, 6 Jan 2019 19:46:03 -0500
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To: <20190107002722.GA13616@ando.pearwood.info>
References: <20190107002722.GA13616@ando.pearwood.info>
Message-ID:

Would these policies be named as strings or with an enum? Following
Pandas, we'd probably support both. I won't bikeshed the names, but they
seem to cover desired behaviors.

On Sun, Jan 6, 2019, 7:28 PM Steven D'Aprano wrote:

> Bug #33084 reports that the statistics library calculates median and
> other stats wrongly if the data contains NANs. Worse, the result depends
> on the initial placement of the NAN:
>
> py> from statistics import median
> py> NAN = float('nan')
> py> median([NAN, 1, 2, 3, 4])
> 2
> py> median([1, 2, 3, 4, NAN])
> 3
>
> See the bug report for more detail:
>
> https://bugs.python.org/issue33084
>
> The caller can always filter NANs out of their own data, but following
> the lead of some other stats packages, I propose a standard way for the
> statistics module to do so. I hope this will be uncontroversial (he
> says, optimistically...) but just in case, here is some prior art:
>
> (1) Nearly all R stats functions take a "na.rm" argument which defaults
> to False; if True, NA and NAN values will be stripped.
>
> (2) The scipy.stats.ttest_ind function takes a "nan_policy" argument
> which specifies what to do if a NAN is seen in the data.
>
> https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html
>
> (3) At least some Matlab functions, such as mean(), take an optional
> flag that determines whether to ignore NANs or include them.
>
> https://au.mathworks.com/help/matlab/ref/mean.html#bt5b82t-1-nanflag
>
> I propose adding a "nan_policy" keyword-only parameter to the relevant
> statistics functions (mean, median, variance etc), and defining the
> following policies:
>
> IGNORE: quietly ignore all NANs
> FAIL: raise an exception if any NAN is seen in the data
> PASS: pass NANs through unchanged (the default)
> RETURN: return a NAN if any NAN is seen in the data
> WARN: ignore all NANs but raise a warning if one is seen
>
> PASS is equivalent to saying that you, the caller, have taken full
> responsibility for filtering out NANs and there's no need for the
> function to slow down processing by doing so again. Either that, or you
> want the current implementation-dependent behaviour.
>
> FAIL is equivalent to treating all NANs as "signalling NANs". The
> presence of a NAN is an error.
>
> RETURN is equivalent to "NAN poisoning" -- the presence of a NAN in a
> calculation causes it to return a NAN, allowing NANs to propagate
> through multiple calculations.
>
> IGNORE and WARN are the same, except IGNORE is silent and WARN raises a
> warning.
>
> Questions:
>
> - does anyone have any serious objections to this?
>
> - what do you think of the names for the policies?
>
> - are there any additional policies that you would like to see?
> (if so, please give use-cases)
>
> - are you happy with the default?
>
> Bike-shed away!
>
> --
> Steve

From steve at pearwood.info Sun Jan 6 21:19:31 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 7 Jan 2019 13:19:31 +1100
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To:
References: <20190107002722.GA13616@ando.pearwood.info>
Message-ID: <20190107021930.GB13616@ando.pearwood.info>

On Sun, Jan 06, 2019 at 07:46:03PM -0500, David Mertz wrote:

> Would these policies be named as strings or with an enum? Following
> Pandas, we'd probably support both.

Sure, I can support both.

> I won't bikeshed the names, but they seem to
> cover desired behaviors.

Good to hear.

-- 
Steve

From mertz at gnosis.cx Sun Jan 6 21:46:51 2019
From: mertz at gnosis.cx (David Mertz)
Date: Sun, 6 Jan 2019 21:46:51 -0500
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To: <20190107002722.GA13616@ando.pearwood.info>
References: <20190107002722.GA13616@ando.pearwood.info>
Message-ID:

I have to say though that the existing behavior of
`statistics.median[_low|_high|]` is SURPRISING if not outright wrong. It
is the behavior in existing Python, but it is very strange.

The implementation simply does whatever `sorted()` does, which is an
implementation detail. In particular, NaN's being neither less than nor
greater than any floating point number, just stay where they are during
sorting. But that's a particular feature of TimSort. Yes, we are
guaranteed that sorts are stable; and we have rules about which things can
and cannot be compared for inequality at all. But beyond that, I do not
think Python ever promised that NaNs would remain in the same positions
after sorting if some other position was stable under a different sorting
algorithm.

So in the incredibly unlikely event I invent a DavidSort that behaves
better than TimSort, is stable, and compares only the same Python objects
as current CPython, a future version could use this algorithm without
breaking promises... even if NaN's sometimes sorted differently than in
TimSort. For that matter, some new implementation could use my
not-nearly-as-good DavidSort, and while being slower, would still be
compliant.

Relying on that for the result of `median()` feels strange to me. It
feels strange as the default behavior, but that's the status quo. But it
feels even stranger that there are not at least options to deal with NaNs
in more of the signaling or poisoning ways that every other numeric
library does.

On Sun, Jan 6, 2019 at 7:28 PM Steven D'Aprano wrote:

> Bug #33084 reports that the statistics library calculates median and
> other stats wrongly if the data contains NANs. Worse, the result depends
> on the initial placement of the NAN:
>
> py> from statistics import median
> py> NAN = float('nan')
> py> median([NAN, 1, 2, 3, 4])
> 2
> py> median([1, 2, 3, 4, NAN])
> 3
>
> See the bug report for more detail:
>
> https://bugs.python.org/issue33084
>
> The caller can always filter NANs out of their own data, but following
> the lead of some other stats packages, I propose a standard way for the
> statistics module to do so. I hope this will be uncontroversial (he
> says, optimistically...)
> but just in case, here is some prior art:
>
> (1) Nearly all R stats functions take a "na.rm" argument which defaults
> to False; if True, NA and NAN values will be stripped.
>
> (2) The scipy.stats.ttest_ind function takes a "nan_policy" argument
> which specifies what to do if a NAN is seen in the data.
>
> https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html
>
> (3) At least some Matlab functions, such as mean(), take an optional
> flag that determines whether to ignore NANs or include them.
>
> https://au.mathworks.com/help/matlab/ref/mean.html#bt5b82t-1-nanflag
>
> I propose adding a "nan_policy" keyword-only parameter to the relevant
> statistics functions (mean, median, variance etc), and defining the
> following policies:
>
> IGNORE: quietly ignore all NANs
> FAIL: raise an exception if any NAN is seen in the data
> PASS: pass NANs through unchanged (the default)
> RETURN: return a NAN if any NAN is seen in the data
> WARN: ignore all NANs but raise a warning if one is seen
>
> PASS is equivalent to saying that you, the caller, have taken full
> responsibility for filtering out NANs and there's no need for the
> function to slow down processing by doing so again. Either that, or you
> want the current implementation-dependent behaviour.
>
> FAIL is equivalent to treating all NANs as "signalling NANs". The
> presence of a NAN is an error.
>
> RETURN is equivalent to "NAN poisoning" -- the presence of a NAN in a
> calculation causes it to return a NAN, allowing NANs to propagate
> through multiple calculations.
>
> IGNORE and WARN are the same, except IGNORE is silent and WARN raises a
> warning.
>
> Questions:
>
> - does anyone have any serious objections to this?
>
> - what do you think of the names for the policies?
>
> - are there any additional policies that you would like to see?
> (if so, please give use-cases)
>
> - are you happy with the default?
>
> Bike-shed away!
>
> --
> Steve

-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons. Intellectual property is
to the 21st century what the slave trade was to the 16th.

From simon.bordeyne at gmail.com Sun Jan 6 21:54:07 2019
From: simon.bordeyne at gmail.com (simon.bordeyne)
Date: Mon, 07 Jan 2019 03:54:07 +0100
Subject: [Python-ideas] Possible PEP regarding the use of the continue keyword in try/except blocks
In-Reply-To:
Message-ID: <5c32bf53.1c69fb81.81d27.4efc@mx.google.com>

I knew that you can just chain try/except blocks, and it's how I do it
now, but the example I provided wasn't very realistic.

Take for example the initialization of a class from a config file, config
file which may or may not have certain keys in it. With many keys, it is
very inconvenient to chain try/except blocks to handle every possible
case. Having the continue keyword would prove useful to put several
error-prone lines of code into a single try block, and have the execution
continue as normal at the statement after the statement that errored out.

Sent from my Samsung Galaxy smartphone.
-------- Original message --------
From: Amber Yust
Date: 06/01/2019 09:07 (GMT+01:00)
To: Simon
Cc: python-ideas at python.org
Subject: Re: [Python-ideas] Possible PEP regarding the use of the continue keyword in try/except blocks

On Sat, Jan 5, 2019 at 4:39 PM Simon wrote:

> I propose to be able to use the continue keyword to continue the execution
> of the try block even when an error is handled. The above could then be
> changed to:
>
>     try:
>         i = int("string")
>         print("continued on")
>         j = int(9.0)
>     except ValueError as e:
>         print(e)
>         continue
>
> >>> "invalid literal for int() with base 10: 'string'"
> >>> "continued on"

There is already a much simpler way of doing this:

    try:
        i = int("string")
    except ValueError as e:
        print(e)
    print("continued on")
    j = int(9.0)

The point of the 'try' block is to encapsulate the code you want to *stop*
executing if an exception is raised. If you want code to be run regardless
of whether an exception is raised, move it past the try-except.

~Amber

From Richard at Damon-Family.org Sun Jan 6 22:06:19 2019
From: Richard at Damon-Family.org (Richard Damon)
Date: Sun, 6 Jan 2019 22:06:19 -0500
Subject: [Python-ideas] Possible PEP regarding the use of the continue keyword in try/except blocks
In-Reply-To: <5c32bf53.1c69fb81.81d27.4efc@mx.google.com>
References: <5c32bf53.1c69fb81.81d27.4efc@mx.google.com>
Message-ID: <3398afd9-b807-25a3-89a6-8f2e9e1ff8f5@Damon-Family.org>

On 1/6/19 9:54 PM, simon.bordeyne wrote:
> I knew that you can just chain try/except blocks, and it's how I do it
> now, but the example I provided wasn't very realistic.
>
> Take for example the initialization of a class from a config file,
> config file which may or may not have certain keys in it. With many
> keys, it is very inconvenient to chain try/except blocks to handle
> every possible case. Having the continue keyword would prove useful to
> put several error-prone lines of code into a single try block, and
> have the execution continue as normal at the statement after the
> statement that errored out.

For something like reading options from a config file, I would use a call
that specifies the key and a value to use if the key isn't present, and
inside that function I might use a try to handle any exception caused
when processing the key, and it could return the default.

For your case, it is hard to imagine what you could put in the except
block to handle the error, as you have no idea which key threw the error,
so you have no idea which key needs to be fixed.

Also, what happens if the exception is thrown inside a function that is
called, do you return to the next line of that function, or the next line
after the function call?

What happens if the exception happens inside a loop (that is inside the
try)? Do you just go to the next instruction in the loop and continue
looping?
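To be concrete, the kind of per-key helper I mean is something like this
rough sketch (untested; the names, and the `cfg` mapping, are made up):

    def get_option(config, key, default, convert=lambda v: v):
        # Fetch one key, convert it, and fall back to the default if the
        # key is missing or its value can't be processed.
        try:
            return convert(config[key])
        except (KeyError, ValueError, TypeError):
            return default

    # Each key names its own fallback, so there is no ambiguity about
    # which key failed:
    timeout = get_option(cfg, "timeout", 30.0, float)
    retries = get_option(cfg, "retries", 3, int)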
-- 
Richard Damon

From shoyer at gmail.com Sun Jan 6 22:40:32 2019
From: shoyer at gmail.com (Stephan Hoyer)
Date: Sun, 6 Jan 2019 19:40:32 -0800
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To: <20190107002722.GA13616@ando.pearwood.info>
References: <20190107002722.GA13616@ando.pearwood.info>
Message-ID:

On Sun, Jan 6, 2019 at 4:27 PM Steven D'Aprano wrote:

> I propose adding a "nan_policy" keyword-only parameter to the relevant
> statistics functions (mean, median, variance etc), and defining the
> following policies:
>
> IGNORE: quietly ignore all NANs
> FAIL: raise an exception if any NAN is seen in the data
> PASS: pass NANs through unchanged (the default)
> RETURN: return a NAN if any NAN is seen in the data
> WARN: ignore all NANs but raise a warning if one is seen

I don't think PASS should be the default behavior, and I'm not sure it
would be productive to actually implement all of these options.

For reference, NumPy and pandas (the two most popular packages for data
analytics in Python) support two of these modes:
- RETURN (numpy.mean() and skipna=False for pandas)
- IGNORE (numpy.nanmean() and skipna=True for pandas)

RETURN is the default behavior for NumPy; IGNORE is the default for pandas.

I'm pretty sure RETURN is the right default behavior for Python's standard
library and anything else should be considered a bug. It safely propagates
NaNs, along the lines of IEEE float behavior.

I'm not sure what the use cases are for PASS, FAIL, or WARN, none of which
are supported by NumPy or pandas:

- PASS is a license to return silently incorrect results, in return for
very marginal performance benefits. This seems at odds with the intended
focus of the statistics module on correctness over speed. Returning
incorrect statistics should not be considered a feature that needs to be
maintained.

- FAIL would make sense if statistics functions could introduce *new* NaN
values. But as far as I can tell, statistics functions already raise
StatisticsError in these cases (e.g., if zero data points are provided).
If users are concerned about accidentally propagating NaNs, they should be
encouraged to check for NaNs at the entry points of their code.

- WARN is even less useful than FAIL. Seriously, who likes warnings? NumPy
uses this approach in array operations that produce NaNs (e.g., when
dividing by zero), because *some* but not all results may be valid. But
statistics functions return scalars.

I'm not even entirely sure it makes sense to add the IGNORE option, or at
least to add it only for NaN. None is also a reasonable sentinel for a
missing value in Python, and user defined types (e.g., pandas.NaT) also
fall in this category. It seems a little strange to single NaN out in
particular.

From tim.peters at gmail.com Sun Jan 6 22:41:08 2019
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 6 Jan 2019 21:41:08 -0600
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To:
References: <20190107002722.GA13616@ando.pearwood.info>
Message-ID:

[David Mertz ]
> I have to say though that the existing behavior of
> `statistics.median[_low|_high|]` is SURPRISING if not outright wrong.
> It is the behavior in existing Python, but it is very strange.
>
> The implementation simply does whatever `sorted()` does, which is an
> implementation detail. In particular, NaN's being neither less than nor
> greater than any floating point number, just stay where they are during
> sorting.
I expect you inferred that from staring at a handful of examples, but
it's illusion. Python's sort uses only __lt__ comparisons, and if those
don't implement a total ordering then _nothing_ is defined about sort's
result (beyond that it's some permutation of the original list). There's
nothing special about NaNs in this. For example, if you sort a list of
sets, then "<" means subset inclusion, which doesn't define a total
ordering among sets in general either (unless for every pair of sets in a
specific list, one is a proper subset of the other - in which case the
list of sets will be sorted in order of increasing cardinality).

> But that's a particular feature of TimSort. Yes, we are guaranteed that
> sorts are stable; and we have rules about which things can and cannot be
> compared for inequality at all. But beyond that, I do not think Python
> ever promised that NaNs would remain in the same positions after sorting

We don't promise it, and it's not true. For example,

>>> import math
>>> nan = math.nan
>>> xs = [0, 1, 2, 4, nan, 5, 3]
>>> sorted(xs)
[0, 1, 2, 3, 4, nan, 5]

The NaN happened to move "one place to the right" there. There's no point
to analyzing "why" - it's purely an accident deriving from the pattern of
__lt__ outcomes the internals happened to invoke. FYI, it goes like so:

    is 1 < 0? No, so the first two are already sorted.
    is 2 < 1? No, so the first three are already sorted.
    is 4 < 2? No, so the first four are already sorted.
    is nan < 4? No, so the first five are already sorted.
    is 5 < nan? No, so the first six are already sorted.
    is 3 < 5? Yes!

At that point a binary insertion is used to move 3 into place. And none
of timsort's "fancy" parts even come into play for lists so small. The
patterns of comparisons the fancy parts invoke can be much more involved.
At no point does the algorithm have any idea that there are NaNs in the
list - it only looks at boolean __lt__ outcomes.

So, certainly, if you want median to be predictable in the presence of
NaNs, sort's behavior in the presence of NaNs can't be relied on in any
respect.

>>> sorted([6, 5, nan, 4, 3, 2, 1])
[1, 2, 3, 4, 5, 6, nan]
>>> sorted([9, 9, 9, 9, 9, 9, nan, 1, 2, 3, 4, 5, 6])
[9, 9, 9, 9, 9, 9, nan, 1, 2, 3, 4, 5, 6]

From mertz at gnosis.cx Sun Jan 6 23:05:39 2019
From: mertz at gnosis.cx (David Mertz)
Date: Sun, 6 Jan 2019 23:05:39 -0500
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To:
References: <20190107002722.GA13616@ando.pearwood.info>
Message-ID:

[... apologies if this is a dup, got a bounce ...]

> [David Mertz ]
>> I have to say though that the existing behavior of
>> `statistics.median[_low|_high|]` is SURPRISING if not outright wrong.
>> It is the behavior in existing Python, but it is very strange.
>>
>> The implementation simply does whatever `sorted()` does, which is an
>> implementation detail. In particular, NaN's being neither less than nor
>> greater than any floating point number, just stay where they are during
>> sorting.
>
> I expect you inferred that from staring at a handful of examples, but
> it's illusion. Python's sort uses only __lt__ comparisons, and if
> those don't implement a total ordering then _nothing_ is defined about
> sort's result (beyond that it's some permutation of the original
> list).

Thanks Tim for clarifying. Is it even the case that sorts are STABLE in
the face of non-total orderings under __lt__? A couple quick examples
don't refute that, but what I tried was not very thorough, nor did I
think much about TimSort itself.
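(For instance, one quick check of the kind I mean -- here the NaN simply
stays put, and a second pass changes nothing:

    >>> import math
    >>> nan = math.nan
    >>> sorted([3, nan, 1])
    [3, nan, 1]
    >>> sorted(sorted([3, nan, 1]))
    [3, nan, 1]

But a couple of such spot checks obviously prove nothing in general.)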
> So, certainly, if you want median to be predictable in the presence of
> NaNs, sort's behavior in the presence of NaNs can't be relied on in
> any respect.

Playing with Tim's examples, this suggests that statistics.median() is
simply outright WRONG. I can think of absolutely no way to characterize
these as reasonable results:

Python 3.7.1 | packaged by conda-forge | (default, Nov 13 2018, 09:50:42)

In [4]: statistics.median([9, 9, 9, nan, 1, 2, 3, 4, 5])
Out[4]: 1

In [5]: statistics.median([9, 9, 9, nan, 1, 2, 3, 4])
Out[5]: nan

From tim.peters at gmail.com Sun Jan 6 23:09:03 2019
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 6 Jan 2019 22:09:03 -0600
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To:
References: <20190107002722.GA13616@ando.pearwood.info>
Message-ID:

[David Mertz ]
> Thanks Tim for clarifying. Is it even the case that sorts are STABLE in
> the face of non-total orderings under __lt__? A couple quick examples
> don't refute that, but what I tried was not very thorough, nor did I
> think much about TimSort itself.

I'm not clear on what "stable" could mean in the absence of a total
ordering. Not only does sort not assume __lt__ is a total ordering, it
doesn't assume it's transitive, or even deterministic. We really can't
assume anything about potentially user-defined functions.

What sort does guarantee is that the result list is some permutation of
the input list, regardless of how insanely __lt__ may behave. If __lt__
sanely defines a deterministic total order, then "stable" and "sorted"
are guaranteed too, with their obvious meanings.

From mertz at gnosis.cx Sun Jan 6 23:16:58 2019
From: mertz at gnosis.cx (David Mertz)
Date: Sun, 6 Jan 2019 23:16:58 -0500
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To:
References: <20190107002722.GA13616@ando.pearwood.info>
Message-ID:

OK, let me be more precise. Obviously if the implementation in a class is:

    class Foo:
        def __lt__(self, other):
            return random.random() < 0.5

Then we aren't going to rely on much.

* If comparison of any two items in a list (under __lt__) is
deterministic, is the resulting sort order deterministic? (Pretty sure
this is a yes)

* If the pairwise comparisons are deterministic, is sorting idempotent?

This statement is certainly false:

* If two items are equal, and pairwise inequality is deterministic,
exchanging the items does not affect the sorting of other items in the
list.

On Sun, Jan 6, 2019 at 11:09 PM Tim Peters wrote:

> [David Mertz ]
> > Thanks Tim for clarifying. Is it even the case that sorts are STABLE in
> > the face of non-total orderings under __lt__? A couple quick examples
> > don't refute that, but what I tried was not very thorough, nor did I
> > think much about TimSort itself.
>
> I'm not clear on what "stable" could mean in the absence of a total
> ordering. Not only does sort not assume __lt__ is a total ordering,
> it doesn't assume it's transitive, or even deterministic. We really
> can't assume anything about potentially user-defined functions.
>
> What sort does guarantee is that the result list is some permutation
> of the input list, regardless of how insanely __lt__ may behave. If
> __lt__ sanely defines a deterministic total order, then "stable" and
> "sorted" are guaranteed too, with their obvious meanings.
>

-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons. Intellectual property is
to the 21st century what the slave trade was to the 16th.

From mertz at gnosis.cx Sun Jan 6 23:27:46 2019
From: mertz at gnosis.cx (David Mertz)
Date: Sun, 6 Jan 2019 23:27:46 -0500
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To:
References: <20190107002722.GA13616@ando.pearwood.info>
Message-ID:

> This statement is certainly false:
>
> * If two items are equal, and pairwise inequality is deterministic,
> exchanging the items does not affect the sorting of other items in the
> list.

Just to demonstrate this obviousness:

>>> sorted([9, 9, 9, b, 1, 2, 3, a])
[1, 2, 3, A, B, 9, 9, 9]
>>> sorted([9, 9, 9, a, 1, 2, 3, b])
[B, 9, 9, 9, A, 1, 2, 3]
>>> a == b
True

The classes involved are:

    class A:
        def __lt__(self, other):
            return False
        __gt__ = __lt__
        def __eq__(self, other):
            return True
        def __repr__(self):
            return self.__class__.__name__

    class B(A):
        def __lt__(self, other):
            return True
        __gt__ = __lt__

I do not think these are useful, but __lt__ is deterministic here.

-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons. Intellectual property is
to the 21st century what the slave trade was to the 16th.

From rosuav at gmail.com Sun Jan 6 23:29:08 2019
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 7 Jan 2019 15:29:08 +1100
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To:
References: <20190107002722.GA13616@ando.pearwood.info>
Message-ID:

On Mon, Jan 7, 2019 at 3:19 PM David Mertz wrote:
>
> OK, let me be more precise. Obviously if the implementation in a class
> is:
>
>     class Foo:
>         def __lt__(self, other):
>             return random.random() < 0.5
>
> Then we aren't going to rely on much.
>
> * If comparison of any two items in a list (under __lt__) is
> deterministic, is the resulting sort order deterministic? (Pretty sure
> this is a yes)

If you guarantee that exactly one of "x < y" and "y < x" is true for any
given pair of values from the list, and further guarantee that if x < y
and y < z then x < z, you have a total order. Without those two
guarantees, you could have deterministic comparisons (eg "nan < 5" is
always false, but so is "5 < nan"), but there's no way to truly put the
elements "in order". Defining __lt__ as "rock < paper", "paper <
scissors", "scissors < rock" means that you can't guarantee the sort
order, nor determinism.

Are those guarantees safe for your purposes? If so, sort() is, AIUI,
guaranteed to behave sanely.

ChrisA

From tim.peters at gmail.com Sun Jan 6 23:46:14 2019
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 6 Jan 2019 22:46:14 -0600
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To:
References: <20190107002722.GA13616@ando.pearwood.info>
Message-ID:

[David Mertz ]
> OK, let me be more precise. Obviously if the implementation in a class
> is:
>
>     class Foo:
>         def __lt__(self, other):
>             return random.random() < 0.5
>
> Then we aren't going to rely on much.
>
> * If comparison of any two items in a list (under __lt__) is
> deterministic, is the resulting sort order deterministic? (Pretty sure
> this is a yes)

Yes, but not defined unless __lt__ also defines a total ordering.

> * If the pairwise comparisons are deterministic, is sorting idempotent?

Not necessarily. For example, the 2-element list here swaps its elements
every time `.sort()` is invoked, because the second element always claims
it's "less than" the first element, regardless of which order they're in:

    class RelentlesslyTiny:
        def __init__(self, name):
            self.name = name
        def __repr__(self):
            return self.name
        def __lt__(self, other):
            return self is not other

    a = RelentlesslyTiny("A")
    b = RelentlesslyTiny("B")
    xs = [a, b]
    print(xs)
    xs.sort()
    print("after sorting once", xs)
    xs.sort()
    print("after sorting twice", xs)

    [A, B]
    after sorting once [B, A]
    after sorting twice [A, B]

> This statement is certainly false:
>
> * If two items are equal, and pairwise inequality is deterministic,
> exchanging the items does not affect the sorting of other items in the
> list.

What I said at the start ;-) The only thing .sort() always guarantees
regardless of how goofy __lt__ may be is that the result list will be
some permutation of the input list. This is so even if __lt__ raises an
uncaught exception, killing the sort mid-stream.

From steve at pearwood.info Mon Jan 7 01:26:30 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 7 Jan 2019 17:26:30 +1100
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To:
References: <20190107002722.GA13616@ando.pearwood.info>
Message-ID: <20190107062630.GC13616@ando.pearwood.info>

On Sun, Jan 06, 2019 at 10:52:47PM -0500, David Mertz wrote:

> Playing with Tim's examples, this suggests that statistics.median() is
> simply outright WRONG. I can think of absolutely no way to characterize
> these as reasonable results:
>
> Python 3.7.1 | packaged by conda-forge | (default, Nov 13 2018, 09:50:42)
> In [4]: statistics.median([9, 9, 9, nan, 1, 2, 3, 4, 5])
> Out[4]: 1
> In [5]: statistics.median([9, 9, 9, nan, 1, 2, 3, 4])
> Out[5]: nan

The second is possibly correct if one thinks that the median of a list
containing NAN should return NAN -- but it's only correct by accident,
not design.

As I wrote on the bug tracker:

"I agree that the current implementation-dependent behaviour when there
are NANs in the data is troublesome."

The only reason why I don't call it a bug is that median() makes no
promises about NANs at all, any more than it makes promises about the
median of a list of sets or any other values which don't define a total
order. help(median) says:

    Return the median (middle value) of numeric data.

By definition, data containing Not A Number values isn't numeric :-)

I'm not opposed to documenting this better. Patches welcome :-)

There are at least three correct behaviours in the face of data
containing NANs: propagate a NAN result, fail fast with an exception, or
treat NANs as missing data that can be ignored. Only the caller can
decide which is the right policy for their data set.

Aside: the IEEE-754 standard provides both signalling and quiet NANs.
It is hard and unreliable to generate signalling float NANs in Python,
but we can do it with Decimal:

py> from statistics import median
py> from decimal import Decimal
py> median([1, 3, 4, Decimal("sNAN"), 2])
Traceback (most recent call last):
  File "", line 1, in
  File "/usr/local/lib/python3.5/statistics.py", line 349, in median
    data = sorted(data)
decimal.InvalidOperation: []

In principle, one ought to be able to construct float signalling NANs
too, but unfortunately that's platform dependent:

https://mail.python.org/pipermail/python-dev/2018-November/155713.html

Back to the topic at hand: I agree that median() does "the wrong thing"
when NANs are involved, but there is no one "right thing" that we can do
in its place. People disagree as to whether NANs should propagate, or
raise, or be treated as missing data, and I see good arguments for all
three.

-- 
Steve

From mertz at gnosis.cx Mon Jan 7 01:34:47 2019
From: mertz at gnosis.cx (David Mertz)
Date: Mon, 7 Jan 2019 01:34:47 -0500
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To: <20190107062630.GC13616@ando.pearwood.info>
References: <20190107002722.GA13616@ando.pearwood.info>
 <20190107062630.GC13616@ando.pearwood.info>
Message-ID:

On Mon, Jan 7, 2019 at 1:27 AM Steven D'Aprano wrote:

> > In [4]: statistics.median([9, 9, 9, nan, 1, 2, 3, 4, 5])
> > Out[4]: 1
> > In [5]: statistics.median([9, 9, 9, nan, 1, 2, 3, 4])
> > Out[5]: nan
>
> The second is possibly correct if one thinks that the median of a list
> containing NAN should return NAN -- but it's only correct by accident,
> not design.

Exactly... in the second example, the nan just happens to wind up "in the
middle" of the sorted() list. The fact that it is the return value has
nothing to do with propagating the nan (if it did, I think it would be a
reasonable answer). I contrived the examples to get these... the first
answer, which is the "most wrong number", is also selected for the same
reason that a nan is "near the middle."

> I'm not opposed to documenting this better. Patches welcome :-)

I'll provide a suggested patch on the bug. It will simply be a wholly
different implementation of median and friends.

> There are at least three correct behaviours in the face of data
> containing NANs: propagate a NAN result, fail fast with an exception, or
> treat NANs as missing data that can be ignored. Only the caller can
> decide which is the right policy for their data set.

I'm not sure that raising right away is necessary as an option. That
feels like something a user could catch at the end when they get a NaN
result. But those seem reasonable as three options.

-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons. Intellectual property is
to the 21st century what the slave trade was to the 16th.

From elazarg at gmail.com Mon Jan 7 01:58:33 2019
From: elazarg at gmail.com (Elazar)
Date: Mon, 7 Jan 2019 08:58:33 +0200
Subject: [Python-ideas] Possible PEP regarding the use of the continue keyword in try/except blocks
In-Reply-To: <3398afd9-b807-25a3-89a6-8f2e9e1ff8f5@Damon-Family.org>
References: <5c32bf53.1c69fb81.81d27.4efc@mx.google.com>
 <3398afd9-b807-25a3-89a6-8f2e9e1ff8f5@Damon-Family.org>
Message-ID:

I think the main issue is this: exception handling is already problematic
with its nonlocal transfer of control.
Making it bidirectional makes code even more difficult to understand.
State will change "under your feet" without any syntactic clue. In "The
Design and Evolution of C++" Bjarne Stroustrup quotes an engineer working
on a large system using such a feature extensively; they ended up having
to rewrite every single occurrence of it because it introduced a huge
amount of bugs. This is the reason C++ does not support resume semantics.
Newer languages go in the direction of avoiding exceptions altogether,
not adding more intricate control flow directives.

This proposal is basically about introducing goto to the language.

Elazar

On Mon, Jan 7, 2019, 5:07 Richard Damon <Richard at damon-family.org>
wrote:

> On 1/6/19 9:54 PM, simon.bordeyne wrote:
> > I knew that you can just chain try/except blocks, and it's how I do it
> > now, but the example I provided wasn't very realistic.
> >
> > Take for example the initialization of a class from a config file,
> > config file which may or may not have certain keys in it. With many
> > keys, it is very inconvenient to chain try/except blocks to handle
> > every possible case. Having the continue keyword would prove useful to
> > put several error-prone lines of code into a single try block, and
> > have the execution continue as normal at the statement after the
> > statement that errored out.
>
> For something like reading options from a config file, I would use a
> call that specifies the key and a value to use if the key isn't present,
> and inside that function I might use a try to handle any exception
> caused when processing the key, and it could return the default.
>
> For your case, it is hard to imagine what you could put in the except
> block to handle the error, as you have no idea which key threw the
> error, so you have no idea which key needs to be fixed.
>
> Also, what happens if the exception is thrown inside a function that is
> called, do you return to the next line of that function, or the next
> line after the function call?
>
> What happens if the exception happens inside a loop (that is inside the
> try)? Do you just go to the next instruction in the loop and continue
> looping?
>
> --
> Richard Damon

From steve at pearwood.info Mon Jan 7 02:05:26 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 7 Jan 2019 18:05:26 +1100
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To:
References: <20190107002722.GA13616@ando.pearwood.info>
Message-ID: <20190107070526.GD13616@ando.pearwood.info>

On Sun, Jan 06, 2019 at 07:40:32PM -0800, Stephan Hoyer wrote:

> On Sun, Jan 6, 2019 at 4:27 PM Steven D'Aprano wrote:
>
> > I propose adding a "nan_policy" keyword-only parameter to the relevant
> > statistics functions (mean, median, variance etc), and defining the
> > following policies:
> >
> > IGNORE: quietly ignore all NANs
> > FAIL: raise an exception if any NAN is seen in the data
> > PASS: pass NANs through unchanged (the default)
> > RETURN: return a NAN if any NAN is seen in the data
> > WARN: ignore all NANs but raise a warning if one is seen
>
> I don't think PASS should be the default behavior, and I'm not sure it
> would be productive to actually implement all of these options.
I'm not wedded to the idea that the default ought to be the current
behaviour. If there is a strong argument for one of the others, I'm
listening.

> For reference, NumPy and pandas (the two most popular packages for data
> analytics in Python) support two of these modes:
> - RETURN (numpy.mean() and skipna=False for pandas)
> - IGNORE (numpy.nanmean() and skipna=True for pandas)
>
> RETURN is the default behavior for NumPy; IGNORE is the default for
> pandas.
>
> I'm pretty sure RETURN is the right default behavior for Python's
> standard library and anything else should be considered a bug. It safely
> propagates NaNs, along the lines of IEEE float behavior.

How would you answer those who say that the right behaviour is not to
propagate unwanted NANs, but to fail fast and raise an exception?

> I'm not sure what the use cases are for PASS, FAIL, or WARN, none of
> which are supported by NumPy or pandas:
> - PASS is a license to return silently incorrect results, in return for
> very marginal performance benefits.

By my (very rough) preliminary testing, the cost of checking for NANs
doubles the cost of calculating the median, and increases the cost of
calculating the mean() by 25%.

I'm not trying to compete with statistics libraries written in C for
speed, but that doesn't mean I don't care about performance at all. The
statistics library is already slower than I like and I don't want to slow
it down further for the common case (numeric data with no NANs) for the
sake of the uncommon case (data with NANs).

But I hear you about the "return silently incorrect results" part.
Fortunately, I think that only applies to sort-based functions like
median(). mean() etc ought to propagate NANs with any reasonable
implementation, but I'm reluctant to make that a guarantee in case I come
up with some unreasonable implementation :-)

> This seems at odds with the intended
> focus of the statistics module on correctness over speed. Returning
> incorrect statistics should not be considered a feature that needs to be
> maintained.

It is only incorrect because the data violates the documented requirement
that it be *numeric data*, and the undocumented requirement that the
numbers have a total order. (So complex numbers are out.) I admit that
the docs could be improved, but there are no guarantees made about NANs.

This doesn't mean I don't want to improve the situation! Far from it,
hence this discussion.

> - FAIL would make sense if statistics functions could introduce *new*
> NaN values. But as far as I can tell, statistics functions already raise
> StatisticsError in these cases (e.g., if zero data points are provided).
> If users are concerned about accidentally propagating NaNs, they should
> be encouraged to check for NaNs at the entry points of their code.

As far as I can tell, there are two kinds of people when it comes to
NANs: those who think that signalling NANs are a waste of time and NANs
should always propagate, and those who hate NANs and wish that they would
always signal (raise an exception). I'm not going to get into an argument
about who is right or who is wrong.

> - WARN is even less useful than FAIL. Seriously, who likes warnings?

Me :-)

> NumPy uses this approach in array operations that produce NaNs (e.g.,
> when dividing by zero), because *some* but not all results may be valid.
> But statistics functions return scalars.
>
> I'm not even entirely sure it makes sense to add the IGNORE option, or
> at least to add it only for NaN.
> None is also a reasonable sentinel for a
> missing value in Python, and user defined types (e.g., pandas.NaT) also
> fall in this category. It seems a little strange to single NaN out in
> particular.

I am considering adding support for a dedicated "missing" value, whether
it is None or a special sentinel. But one thing at a time. Ignoring NANs
is moderately common in other statistics libraries, and although I
personally feel that NANs shouldn't be used for missing values, I know
many people do so.

-- 
Steve

From njs at pobox.com Mon Jan 7 02:31:30 2019
From: njs at pobox.com (Nathaniel Smith)
Date: Sun, 6 Jan 2019 23:31:30 -0800
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To: <20190107070526.GD13616@ando.pearwood.info>
References: <20190107002722.GA13616@ando.pearwood.info>
 <20190107070526.GD13616@ando.pearwood.info>
Message-ID:

On Sun, Jan 6, 2019 at 11:06 PM Steven D'Aprano wrote:
> I'm not wedded to the idea that the default ought to be the current
> behaviour. If there is a strong argument for one of the others, I'm
> listening.

"Errors should never pass silently"? Silently returning nonsensical
results is hard to defend as a default behavior IMO :-)

> How would you answer those who say that the right behaviour is not to
> propagate unwanted NANs, but to fail fast and raise an exception?

Both seem defensible a priori, but every other mathematical operation in
Python propagates NaNs instead of raising an exception. Is there
something unusual about median that would justify giving it unusual
behavior?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From boxed at killingar.net Mon Jan 7 03:10:34 2019
From: boxed at killingar.net (=?utf-8?Q?Anders_Hovm=C3=B6ller?=)
Date: Mon, 7 Jan 2019 09:10:34 +0100
Subject: [Python-ideas] Possible PEP regarding the use of the continue keyword in try/except blocks
In-Reply-To:
References: <5c32bf53.1c69fb81.81d27.4efc@mx.google.com>
 <3398afd9-b807-25a3-89a6-8f2e9e1ff8f5@Damon-Family.org>
Message-ID:

> This proposal is basically about introducing goto to the language.

A bit hyperbolic but I agree that it has the same problem as goto. But
the specific suggested solution is not something we should be restricted
so rigidly to in this discussion. One could for example see another
solution to the same problem:

    with supress_raise(TypeError, ValueError):
        do_the_things()

I have no idea how to actually implement this though and it's also a bad
idea but I think we should first find the best idea to solve the
underlying pain point then talk about rejecting or supporting that.

/ Anders

From steve at pearwood.info Mon Jan 7 03:09:54 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 7 Jan 2019 19:09:54 +1100
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To:
References: <20190107002722.GA13616@ando.pearwood.info>
 <20190107070526.GD13616@ando.pearwood.info>
Message-ID: <20190107080953.GE13616@ando.pearwood.info>

(By the way, I'm not outright disagreeing with you, I'm trying to weigh
up the pros and cons of your position. You've given me a lot to think
about. More below.)

On Sun, Jan 06, 2019 at 11:31:30PM -0800, Nathaniel Smith wrote:

> On Sun, Jan 6, 2019 at 11:06 PM Steven D'Aprano wrote:
> > I'm not wedded to the idea that the default ought to be the current
> > behaviour. If there is a strong argument for one of the others, I'm
> > listening.
>
> "Errors should never pass silently"?
> Silently returning nonsensical
> results is hard to defend as a default behavior IMO :-)

If you violate the assumptions of the function, just about everything can
in principle return nonsensical results. True, most of the time you have
to work hard at it:

    class MyList(list):
        def __len__(self):
            return random.randint(0, sys.maxint)

but it isn't unreasonable to document the assumptions of a function, and
if the caller violates those assumptions, Garbage In Garbage Out applies.

E.g. bisect requires that your list is sorted in ascending order. If it
isn't, the results you get are nonsensical.

py> data = [8, 6, 4, 2, 0]
py> bisect.bisect(data, 1)
0

That's not a bug in bisect, that's a bug in the caller's code, and it
isn't bisect's responsibility to fix it.

Although it could be documented better, that's the current situation with
NANs and median(). Data with NANs don't have a total ordering, and total
ordering is the unstated assumption behind the idea of a median or middle
value. So all bets are off.

> > How would you answer those who say that the right behaviour is not to
> > propagate unwanted NANs, but to fail fast and raise an exception?
>
> Both seem defensible a priori, but every other mathematical operation
> in Python propagates NaNs instead of raising an exception. Is there
> something unusual about median that would justify giving it unusual
> behavior?

Well, not everything...

py> NAN/0
Traceback (most recent call last):
  File "", line 1, in
ZeroDivisionError: float division by zero

There may be others. But I'm not sure that "everything else does it" is a
strong justification. It is *a* justification, since consistency is good,
but consistency does not necessarily outweigh other concerns.

One possible argument for making PASS the default, even if that means
implementation-dependent behaviour with NANs, is that in the absence of a
clear preference for FAIL or RETURN, at least PASS is backwards
compatible.

You might shoot yourself in the foot, but at least you know it's the same
foot you shot yourself in using the previous version *wink*

-- 
Steve

From solipsis at pitrou.net Mon Jan 7 03:24:08 2019
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 7 Jan 2019 09:24:08 +0100
Subject: [Python-ideas] NAN handling in the statistics module
References: <20190107002722.GA13616@ando.pearwood.info>
Message-ID: <20190107092408.6e7b2353@fsol>

On Sun, 6 Jan 2019 19:40:32 -0800
Stephan Hoyer wrote:

> On Sun, Jan 6, 2019 at 4:27 PM Steven D'Aprano wrote:
>
> > I propose adding a "nan_policy" keyword-only parameter to the relevant
> > statistics functions (mean, median, variance etc), and defining the
> > following policies:
> >
> > IGNORE: quietly ignore all NANs
> > FAIL: raise an exception if any NAN is seen in the data
> > PASS: pass NANs through unchanged (the default)
> > RETURN: return a NAN if any NAN is seen in the data
> > WARN: ignore all NANs but raise a warning if one is seen
>
> I don't think PASS should be the default behavior, and I'm not sure it
> would be productive to actually implement all of these options.
>
> For reference, NumPy and pandas (the two most popular packages for data
> analytics in Python) support two of these modes:
> - RETURN (numpy.mean() and skipna=False for pandas)
> - IGNORE (numpy.nanmean() and skipna=True for pandas)
>
> RETURN is the default behavior for NumPy; IGNORE is the default for
> pandas.

I agree with Stephan that RETURN and IGNORE are the only useful modes of
operation here.

Regards

Antoine.
From steelman at post.pl Mon Jan 7 03:58:39 2019
From: steelman at post.pl (=?UTF-8?Q?=C5=81ukasz_Stelmach?=)
Date: Mon, 7 Jan 2019 09:58:39 +0100 (CET)
Subject: [Python-ideas] Fixed point format for numbers with locale based separators
In-Reply-To: <7797ca18-ab81-7bd7-7912-1df2305fd23f@trueblade.com>
References: <1541732215.575168.1546613873762@poczta.home.pl>
 <3958640.cMbgjVjVHv@varric.chelsea.private>
 <87r2dqan45.fsf%steelman@post.pl>
 <7797ca18-ab81-7bd7-7912-1df2305fd23f@trueblade.com>
Message-ID: <502524909.630262.1546851519322@poczta.home.pl>

On 6 January 2019 at 01:48, "Eric V. Smith" wrote:
> On 1/5/2019 3:03 PM, Łukasz Stelmach wrote:
>> Barry Scott writes:
>>> On Friday, 4 January 2019 14:57:53 GMT Łukasz Stelmach wrote:
>>>>
>>>> I would like to present two pull requests[1][2] implementing fixed
>>>> point presentation of numbers and ask for comments. The first is
>>>> mine. I learnt about the second after publishing mine.
>>>>
>>>> The only format using decimal separator from locale data for
>>>> float/complex/decimal numbers at the moment is "n" which behaves
>>>> like "g". The drawback of these formats, I would like to overcome,
>>>> is the inability to print numbers ranging more than one order of
>>>> magnitude with the same number of decimal digits without "manually"
>>>> (with some additional custom code) adjusting precision. The other
>>>> option is to "manually" replace "." as printed by "f" with a local
>>>> decimal separator. Neither of these options is appealing to me.
>>>>
>>>> Formatting 1.23456789 * n (LC_ALL=pl_PL.UTF-8)
>>>>
>>>> | n | ".2f"    | ".3n"    |
>>>> |---+----------+----------|
>>>> | 1 | 1.23     | 1,23     |
>>>> | 2 | 12.35    | 12,3     |
>>>> | 3 | 123.46   | 123      |
>>>> | 4 | 1234.57  | 1,23e+03 |
>>>
>>> Can you use locale.format_string() to solve this?
>>
>> I am afraid I can't. I am using a library called pint[1] in my
>> project. It allows me to choose how its objects are formatted but it
>> uses format() internally. It adds some custom extensions to format
>> strings which, as far as I can tell, makes it hard if not impossible
>> to patch it to use locale.format_string(). But this is rather an
>> excuse.
>
> I do think that this is a compelling use case for "f" style
> locale-aware formatting. I support adding it in some format or another
> (pun intended).
>
> My only concern is how to paint the bike shed. Should we just use
> another format spec "type" character instead of "f", as the two linked
> issues propose? Or maybe use an additional "alternate form" style
> character, so that we could use different locale options, either now
> or in the future? https://bugs.python.org/issue33731 is similar to
> https://bugs.python.org/issue34311 but proposes using LC_MONETARY
> instead of LC_NUMERIC.
>
> I'm not suggesting we solve every possible problem here, but we at
> least shouldn't paint ourselves into a corner and instead allow a
> future where we could expand things, if needed, and without using up
> tons of format spec "type" characters for every permutation of "type"
> plus LC_MONETARY or LC_NUMERIC.
>
> Here's a straw man:
>
> The current specification for the format spec is:
> [[fill]align][sign][#][0][width][grouping_option][.precision][type]
>
> Let's say we change it to:
> [[fill]align][sign][#][*|$][0][width][grouping_option][.precision][type]
>
> (I think that's unambiguous, but I'd have to think it through some more)
>
> Let's call the new [*|$] character the "locale character". [...]
OK, it doesn't sound bad at all and I wonder if there is *any* other
situation that may allow/require choosing between different categories of
locale data to format the same value. If so (I need to read some more
about locale data), I think your idea can be extended even further. Let's
use 'Lx' as an even more general 'locale control sequence', where 'x' is
a locale category in general (LC_CTYPE, LC_). Should we support only
POSIX categories[1] or extensions like LC_PAPER in glibc or other
OS/library too?

BTW. Is there any scanf() equivalent in Python, that uses the same syntax
as format()? Because it might benefit from such control sequences even
more?

> Again, this is just a straw man proposal that would require fleshing
> out. I think it might also require a PEP, but it would be as simple as
> PEP 378 for adding comma grouping formatting. Somewhere to memorialize
> the decision and how we got there, including rejected alternate
> proposals, would be a good thing.

Challenge accepted (-; Where do I start?

[1] https://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap07.html

-- 
Kind regards,
Łukasz Stelmach

From rosuav at gmail.com Mon Jan 7 04:27:04 2019
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 7 Jan 2019 20:27:04 +1100
Subject: [Python-ideas] Possible PEP regarding the use of the continue keyword in try/except blocks
In-Reply-To:
References: <5c32bf53.1c69fb81.81d27.4efc@mx.google.com>
 <3398afd9-b807-25a3-89a6-8f2e9e1ff8f5@Damon-Family.org>
Message-ID:

On Mon, Jan 7, 2019 at 7:11 PM Anders Hovmöller wrote:
>
> > This proposal is basically about introducing goto to the language.
>
> A bit hyperbolic but I agree that it has the same problem as goto. But
> the specific suggested solution is not something we should be restricted
> so rigidly to in this discussion. One could for example see another
> solution to the same problem:
>
>     with supress_raise(TypeError, ValueError):
>         do_the_things()
>
> I have no idea how to actually implement this though and it's also a bad
> idea but I think we should first find the best idea to solve the
> underlying pain point then talk about rejecting or supporting that.

You mean like this?

https://docs.python.org/3/library/contextlib.html#contextlib.suppress

ChrisA

From boxed at killingar.net Mon Jan 7 04:40:02 2019
From: boxed at killingar.net (=?utf-8?Q?Anders_Hovm=C3=B6ller?=)
Date: Mon, 7 Jan 2019 10:40:02 +0100
Subject: [Python-ideas] Possible PEP regarding the use of the continue keyword in try/except blocks
In-Reply-To:
References: <5c32bf53.1c69fb81.81d27.4efc@mx.google.com>
 <3398afd9-b807-25a3-89a6-8f2e9e1ff8f5@Damon-Family.org>
Message-ID:

> You mean like this?
>
> https://docs.python.org/3/library/contextlib.html#contextlib.suppress

Hah. Exactly. Maybe that is what the OP wanted in the first place? It's
always surprising how much stuff is in the standard lib even after all
these years! Thanks for this.

/ Anders

From steve at pearwood.info Mon Jan 7 06:49:15 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 7 Jan 2019 22:49:15 +1100
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To:
References: <20190107002722.GA13616@ando.pearwood.info>
 <20190107062630.GC13616@ando.pearwood.info>
Message-ID: <20190107114914.GH13616@ando.pearwood.info>

On Mon, Jan 07, 2019 at 01:34:47AM -0500, David Mertz wrote:

> > I'm not opposed to documenting this better. Patches welcome :-)
>
> I'll provide a suggested patch on the bug. It will simply be a wholly
> different implementation of median and friends.
I ask for a documentation patch and you start talking about a whole new
implementation. Huh.

A new implementation with precisely the same behaviour is a waste of
time, so I presume you're planning to change the behaviour. How about if
you start off by explaining what the new semantics are?

-- 
Steve

From oscar.j.benjamin at gmail.com Mon Jan 7 07:10:33 2019
From: oscar.j.benjamin at gmail.com (Oscar Benjamin)
Date: Mon, 7 Jan 2019 12:10:33 +0000
Subject: [Python-ideas] Possible PEP regarding the use of the continue keyword in try/except blocks
In-Reply-To:
References: <5c32bf53.1c69fb81.81d27.4efc@mx.google.com>
 <3398afd9-b807-25a3-89a6-8f2e9e1ff8f5@Damon-Family.org>
Message-ID:

On Mon, 7 Jan 2019 at 09:27, Chris Angelico wrote:
>
> On Mon, Jan 7, 2019 at 7:11 PM Anders Hovmöller wrote:
> >
> > > This proposal is basically about introducing goto to the language.
> >
> > A bit hyperbolic but I agree that it has the same problem as goto. But
> > the specific suggested solution is not something we should be
> > restricted so rigidly to in this discussion. One could for example see
> > another solution to the same problem:
> >
> >     with supress_raise(TypeError, ValueError):
> >         do_the_things()
> >
> > I have no idea how to actually implement this though and it's also a
> > bad idea but I think we should first find the best idea to solve the
> > underlying pain point then talk about rejecting or supporting that.
>
> You mean like this?
>
> https://docs.python.org/3/library/contextlib.html#contextlib.suppress

That doesn't do what the OP requested. It suppresses errors from the
outside but doesn't resume execution in the block so e.g.:

    a = b = None
    with suppress(ValueError):
        a = float(str_a)
        b = float(str_b)

The OP wants the b= line to execute even if the a= line raises an
exception.

-- 
Oscar

From jfine2358 at gmail.com Mon Jan 7 09:01:34 2019
From: jfine2358 at gmail.com (Jonathan Fine)
Date: Mon, 7 Jan 2019 14:01:34 +0000
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To: <20190107114914.GH13616@ando.pearwood.info>
References: <20190107002722.GA13616@ando.pearwood.info>
 <20190107062630.GC13616@ando.pearwood.info>
 <20190107114914.GH13616@ando.pearwood.info>
Message-ID:

Happy New Year (off topic).

Based on a quick review of the python docs, the bug report, PEP 450 and
this thread, I suggest:

1. More carefully draw attention to the NaN feature, in the documentation
for existing Python versions.

2. Consider revising statistics.py so that it raises an exception, when
passed NaN data.

https://www.python.org/dev/peps/pep-0450/#rationale says

    The proposed statistics module is motivated by the "batteries
    included" philosophy towards the Python standard library. Raymond
    Hettinger and other senior developers have requested a quality
    statistics library that falls somewhere in between high-end
    statistics libraries and ad hoc code. Statistical functions such as
    mean, standard deviation and others are obvious and useful batteries,
    familiar to any Secondary School student.

The PEP makes no mention of NaN. Was it in error, in not stating that NaN
data is admissible? Is NaN part of the "batteries familiar to any
Secondary School student"?

https://docs.python.org/3/library/statistics.html says

    This module provides functions for calculating mathematical
    statistics of numeric (Real-valued) data.

Some people regard NaN as not being a real-valued number. (Hint: There's
a clue in the name: Not A Number.)

Note that statistics.py already raises StatisticsError, when it regards
the data as flawed.
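For example (traceback abridged):

    >>> import statistics
    >>> statistics.mean([])
    Traceback (most recent call last):
      ...
    statistics.StatisticsError: mean requires at least one data point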
Finally, I suggest that we might learn from
==
Fix some special cases in Fractions?
https://mail.python.org/pipermail/python-ideas/2018-August/053083.html
==

I'll put a brief summary of my message into the bug tracker for this
issue.

--
Jonathan

From steve at pearwood.info  Mon Jan  7 09:57:59 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 8 Jan 2019 01:57:59 +1100
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To:
References: <20190107002722.GA13616@ando.pearwood.info>
 <20190107062630.GC13616@ando.pearwood.info>
 <20190107114914.GH13616@ando.pearwood.info>
Message-ID: <20190107145759.GI13616@ando.pearwood.info>

On Mon, Jan 07, 2019 at 02:01:34PM +0000, Jonathan Fine wrote:

> Finally, I suggest that we might learn from
> ==
> Fix some special cases in Fractions?
> https://mail.python.org/pipermail/python-ideas/2018-August/053083.html
> ==

I remember that thread from August, and I've just re-read the entire
thing now, and I don't see the relevance. Can you explain why you think
it is relevant to this thread?

--
Steve

From mertz at gnosis.cx  Mon Jan  7 10:05:19 2019
From: mertz at gnosis.cx (David Mertz)
Date: Mon, 7 Jan 2019 10:05:19 -0500
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To: <20190107114914.GH13616@ando.pearwood.info>
References: <20190107002722.GA13616@ando.pearwood.info>
 <20190107062630.GC13616@ando.pearwood.info>
 <20190107114914.GH13616@ando.pearwood.info>
Message-ID:

On Mon, Jan 7, 2019 at 6:50 AM Steven D'Aprano wrote:

> > I'll provide a suggested patch on the bug. It will simply be a wholly
> > different implementation of median and friends.
>
> I ask for a documentation patch and you start talking about a whole new
> implementation. Huh.
> A new implementation with precisely the same behaviour is a waste of
> time, so I presume you're planning to change the behaviour. How about if
> you start off by explaining what the new semantics are?

I think it would be counter-productive to document the bug (as something
other than a bug). Picking what is a completely arbitrary element in the
face of a non-total order can never be "correct" behavior, and is never
worth preserving for compatibility. I think the use of statistics.median
against partially ordered elements is simply rare enough that no one
tripped against it, or at least no one reported it before.

Notice that the code itself pretty much recognizes the bug in this
comment:

# FIXME: investigate ways to calculate medians without sorting? Quickselect?

So it seems like the original author knew the implementation was wrong.

But you're right, the new behavior needs to be decided. Propagating NaNs
is reasonable. Filtering out NaNs is reasonable. Those are the default
behaviors of NumPy and Pandas, respectively:

np.median([1,2,3,nan])          # -> nan
pd.Series([1,2,3,nan]).median() # -> 2.0

(Yes, of course there are ways in each to get the other behavior). Other
non-Python tools similarly suggest one of those behaviors, but really
nothing else.

So yeah, what I was suggesting as a patch was an implementation that had
PROPAGATE and IGNORE semantics. I don't have a real opinion about which
should be the default, but the current behavior should simply not exist
at all.

As I think about it, warnings and exceptions are really too complex an
API for this module. It's not hard to manually check for NaNs and
generate those in your own code.
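For instance, a pre-check along these lines (an illustrative helper, not
a proposal for the module itself):

import math

def contains_nan(data):
    # True if any element is a float NaN; the isinstance guard
    # avoids TypeErrors on non-float elements.
    return any(isinstance(x, float) and math.isnan(x) for x in data)

if contains_nan(data):
    raise ValueError("NaN in input data")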
--
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons. Intellectual property is
to the 21st century what the slave trade was to the 16th.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rosuav at gmail.com  Mon Jan  7 10:15:19 2019
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 8 Jan 2019 02:15:19 +1100
Subject: [Python-ideas] Possible PEP regarding the use of the continue keyword in try/except blocks
In-Reply-To:
References: <5c32bf53.1c69fb81.81d27.4efc@mx.google.com>
 <3398afd9-b807-25a3-89a6-8f2e9e1ff8f5@Damon-Family.org>
Message-ID:

On Mon, Jan 7, 2019 at 11:11 PM Oscar Benjamin wrote:
>
> On Mon, 7 Jan 2019 at 09:27, Chris Angelico wrote:
> >
> > On Mon, Jan 7, 2019 at 7:11 PM Anders Hovmöller wrote:
> > >
> > > > This proposal is basically about introducing goto to the language.
> > >
> > > A bit hyperbolic but I agree that it has the same problem as goto. But the specific suggested solution is not something we should be restricted so rigidly to in this discussion. One could for example see another solution to the same problem:
> > >
> > > with supress_raise(TypeError, ValueError):
> > >     do_the_things()
> > >
> > > I have no idea how to actually implement this though and it's also a bad idea but I think we should first find the best idea to solve the underlying pain point then talk about rejecting or supporting that.
> > >
> >
> > You mean like this?
> >
> > https://docs.python.org/3/library/contextlib.html#contextlib.suppress
>
> That doesn't do what the OP requested. It suppresses errors from the
> outside but doesn't resume execution in the block so e.g.:
>
> a = b = None
> with suppress(ValueError):
>     a = float(str_a)
>     b = float(str_b)
>
> The OP wants the b= line to execute even if the a= line raises an exception.
>

True, but what the OP actually asked for is basically impossible. And
at least you can write:

with suppress(ValueError):
    a = float(str_a)
with suppress(ValueError):
    b = float(str_b)

which is a heap less noisy than the explicit try/except.

ChrisA

From eric at trueblade.com  Mon Jan  7 10:30:59 2019
From: eric at trueblade.com (Eric V. Smith)
Date: Mon, 7 Jan 2019 10:30:59 -0500
Subject: [Python-ideas] Possible PEP regarding the use of the continue keyword in try/except blocks
In-Reply-To:
References: <5c32bf53.1c69fb81.81d27.4efc@mx.google.com>
 <3398afd9-b807-25a3-89a6-8f2e9e1ff8f5@Damon-Family.org>
Message-ID:

On 1/7/2019 10:15 AM, Chris Angelico wrote:
> On Mon, Jan 7, 2019 at 11:11 PM Oscar Benjamin
> wrote:
>> On Mon, 7 Jan 2019 at 09:27, Chris Angelico wrote:
>>> On Mon, Jan 7, 2019 at 7:11 PM Anders Hovmöller wrote:
>>>>
>>>>> This proposal is basically about introducing goto to the language.
>>>> A bit hyperbolic but I agree that it has the same problem as goto. But the specific suggested solution is not something we should be restricted so rigidly to in this discussion. One could for example see another solution to the same problem:
>>>>
>>>> with supress_raise(TypeError, ValueError):
>>>>     do_the_things()
>>>>
>>>> I have no idea how to actually implement this though and it's also a bad idea but I think we should first find the best idea to solve the underlying pain point then talk about rejecting or supporting that.
>>>>
>>> You mean like this?
>>>
>>> https://docs.python.org/3/library/contextlib.html#contextlib.suppress
>> That doesn't do what the OP requested. It suppresses errors from the
>> outside but doesn't resume execution in the block so e.g.:
>>
>> a = b = None
>> with suppress(ValueError):
>>     a = float(str_a)
>>     b = float(str_b)
>>
>> The OP wants the b= line to execute even if the a= line raises an exception.
>>
> True, but what the OP actually asked for is basically impossible. And
> at least you can write:
>
> with suppress(ValueError):
>     a = float(str_a)
> with suppress(ValueError):
>     b = float(str_b)
>
> which is a heap less noisy than the explicit try/except.

I think the OP was asking for a construct to automatically wrap every
statement in a suppress context manager. Which is probably possible,
but I think a bad idea.

As a third party solution, maybe some creative AST manipulations could
do the trick if someone were so inclined.

Eric

From steve at pearwood.info  Mon Jan  7 11:34:24 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 8 Jan 2019 03:34:24 +1100
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To:
References: <20190107002722.GA13616@ando.pearwood.info>
 <20190107062630.GC13616@ando.pearwood.info>
 <20190107114914.GH13616@ando.pearwood.info>
Message-ID: <20190107163424.GJ13616@ando.pearwood.info>

On Mon, Jan 07, 2019 at 10:05:19AM -0500, David Mertz wrote:
> On Mon, Jan 7, 2019 at 6:50 AM Steven D'Aprano wrote:
>
> > > I'll provide a suggested patch on the bug. It will simply be a wholly
> > > different implementation of median and friends.
> >
> > I ask for a documentation patch and you start talking about a whole new
> > implementation. Huh.
> > A new implementation with precisely the same behaviour is a waste of
> > time, so I presume you're planning to change the behaviour. How about if
> > you start off by explaining what the new semantics are?
>
> I think it would be counter-productive to document the bug (as something
> other than a bug).

It's not a bug in median(), because median requires that the data
implement a total order. Although that isn't explicitly documented, it
is common sense: if the data cannot be sorted into smallest-to-largest
order, how can you decide which value is in the middle?

What is explicitly documented is that median requires numeric data, and
NANs aren't numbers. So the only bug here is the caller's failure to
filter out NANs. If you pass it garbage data, you get garbage results.

Nevertheless, it is a perfectly reasonable thing to want to use data
which may or may not contain NANs, and I want to enhance the statistics
module to make it easier for the caller to handle NANs in whichever way
they see fit. This is a new feature, not a bug fix.

> Picking what is a completely arbitrary element in the face
> of a non-total order can never be "correct" behavior, and is never worth
> preserving for compatibility.

If you truly believe that, then you should also believe that both
list.sort() and the bisect module are buggy, for precisely the same
reason. Perhaps you ought to raise a couple of bug reports, and see if
you can get Tim and Raymond to agree that sorting and bisect should do
something other than what they already do in the face of data that
doesn't define a total order.

> I think the use of statistics.median against
> partially ordered elements is simply rare enough that no one tripped
> against it, or at least no one reported it before.

I'm sure it is rare. Nevertheless, I still want to make it easier for
people to deal with this case.
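For anyone who hasn't seen it bite, here's what sorting NAN-contaminated
data can do. The exact output is implementation-dependent, which is
precisely the problem: every comparison against a NAN returns False, so
the sort can leave elements essentially anywhere. One possible result:

py> nan = float('nan')
py> sorted([3, nan, 1, 2])  # not sorted in any meaningful sense
[3, nan, 1, 2]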
> Notice that the code itself pretty much recognizes the bug in this comment:
>
> # FIXME: investigate ways to calculate medians without sorting? Quickselect?

I doubt Quickselect will be immune to the problem of NANs. It too
relies on comparisons, and while I don't know for sure that it requires
a total order, I'd be surprised if it doesn't. Quickselect is basically
a variant of Quicksort that only partially sorts the data.

> So it seems like the original author knew the implementation was wrong.

That's not why I put that comment in. Sorting is O(N log N) on average,
and Quickselect can be O(N) on average. In principle, Quickselect or a
similar selection algorithm could be faster than sorting.

[...]
> It's not hard to manually check for NaNs and
> generate those in your own code.

That is correct, but by that logic, we don't need to support *any* form
of NAN handling at all. It is easy (if inefficient) for the caller to
pre-filter their data. I want to make it easier and more convenient and
avoid having to iterate over the data twice if it isn't necessary.

--
Steve

From guido at python.org  Mon Jan  7 11:49:33 2019
From: guido at python.org (Guido van Rossum)
Date: Mon, 7 Jan 2019 08:49:33 -0800
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To: <20190107163424.GJ13616@ando.pearwood.info>
References: <20190107002722.GA13616@ando.pearwood.info>
 <20190107062630.GC13616@ando.pearwood.info>
 <20190107114914.GH13616@ando.pearwood.info>
 <20190107163424.GJ13616@ando.pearwood.info>
Message-ID:

On Mon, Jan 7, 2019 at 8:39 AM Steven D'Aprano wrote:

> It's not a bug in median(), because median requires that the data
> implement a total order. Although that isn't explicitly documented, it
> is common sense: if the data cannot be sorted into smallest-to-largest
> order, how can you decide which value is in the middle?
>
> What is explicitly documented is that median requires numeric data, and
> NANs aren't numbers. So the only bug here is the caller's failure to
> filter out NANs. If you pass it garbage data, you get garbage results.
>
> Nevertheless, it is a perfectly reasonable thing to want to use data
> which may or may not contain NANs, and I want to enhance the statistics
> module to make it easier for the caller to handle NANs in whichever way
> they see fit. This is a new feature, not a bug fix.
>

So then you are arguing that making reasonable treatment of NANs the
default is not breaking backwards compatibility (because previously the
data was considered wrong). This sounds like a good idea to me.

Presumably the NANs are inserted into the data explicitly in order to
signal missing data -- this seems more plausible to me (given the
typical use case for the statistics module) than that they would be the
result of a computation like Inf/Inf.

(While propagating NANs makes sense for the fundamental arithmetical
and mathematical functions, given that we have chosen not to raise an
error when encountering them, I think other stdlib libraries are not
beholden to that behavior.)

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mertz at gnosis.cx  Mon Jan  7 12:19:33 2019
From: mertz at gnosis.cx (David Mertz)
Date: Mon, 7 Jan 2019 12:19:33 -0500
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To: <20190107163424.GJ13616@ando.pearwood.info>
References: <20190107002722.GA13616@ando.pearwood.info>
 <20190107062630.GC13616@ando.pearwood.info>
 <20190107114914.GH13616@ando.pearwood.info>
 <20190107163424.GJ13616@ando.pearwood.info>
Message-ID:

On Mon, Jan 7, 2019, 11:38 AM Steven D'Aprano wrote:

> It's not a bug in median(), because median requires that the data
> implement a total order. Although that isn't explicitly documented, it
> is common sense: if the data cannot be sorted into smallest-to-largest
> order, how can you decide which value is in the middle?
>

I can see no reason that median per-se requires a total order. Yes, the
implementation chosen (and many reasonable and obvious implementations)
make that assumption. But here is a perfectly reasonable definition of
median:

* A median is an element of a collection such that 1/2 of all elements
of the collection are less than it.

Depending on how you interpret median, this element might also not be
in the original collection, but be some newly generated value that has
that property. E.g. statistics.median([1,2,3,4]) == 2.5.

Under a partial ordering, a median may not be unique. Even under a
total ordering this is true if some subset of elements form an
equivalence class. But under partial ordering, the non-uniqueness can
get much weirder.

> What is explicitly documented is that median requires numeric data, and
> NANs aren't numbers. So the only bug here is the caller's failure to
> filter out NANs. If you pass it garbage data, you get garbage results.
>

OK, then we should either raise an exception or propagate the NaN if
that is the intended meaning of the function. And obviously document
that such is the assumption. NaNs *are* explicitly in the
floating-point domain, so it's fuzzy whether they are numeric or not,
notwithstanding the name.

I'm very happy to push NaN-filtering to users (as NumPy does, although
it provides alternate functions for many reductions that incorporate
this... the basic ones always propagate NaNs though).

> Nevertheless, it is a perfectly reasonable thing to want to use data
> which may or may not contain NANs, and I want to enhance the statistics
> module to make it easier for the caller to handle NANs in whichever way
> they see fit. This is a new feature, not a bug fix.
>

I disagree about bug vs. feature. The old behavior is simply and
unambiguously wrong, but was not previously noticed. Obviously, the bug
does not affect most uses, which is why it was not noticed.

> If you truly believe that, then you should also believe that both
> list.sort() and the bisect module are buggy, for precisely the same
> reason.
>

I cannot perceive any close connection between the correct behavior of
statistics.median() and that of list.sort() or bisect. I know the
concrete implementation of the former uses the latter, but the answers
for what is RIGHT feel completely independent to me.

> I doubt Quickselect will be immune to the problem of NANs. It too
> relies on comparisons, and while I don't know for sure that it requires
> a total order, I'd be surprised if it doesn't. Quickselect is basically
> a variant of Quicksort that only partially sorts the data.
>

Yes, I was thinking of trying to tweak Quickselect to handle NaNs
during the process. I.e. probably terminate and propagate the NaN
early, as soon as one is encountered.
That might save much of the work if a NaN is encountered early and most comparisons and moves can be avoided. Of course, I'm sure there is a worst case where almost all the work is done before a NaN check is performed in some constructed example. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Mon Jan 7 12:28:30 2019 From: mertz at gnosis.cx (David Mertz) Date: Mon, 7 Jan 2019 12:28:30 -0500 Subject: [Python-ideas] NAN handling in the statistics module In-Reply-To: References: <20190107002722.GA13616@ando.pearwood.info> <20190107062630.GC13616@ando.pearwood.info> <20190107114914.GH13616@ando.pearwood.info> <20190107163424.GJ13616@ando.pearwood.info> Message-ID: On Mon, Jan 7, 2019 at 12:19 PM David Mertz wrote: > Under a partial ordering, a median may not be unique. Even under a total > ordering this is true if some subset of elements form an equivalence > class. But under partial ordering, the non-uniqueness can get much weirder. > I'm sure with more thought, weirder things can be thought of. But just as a quick example, it would be easy to write classes such that: a < b < c < a In such a case (or expand for an odd number of distinct things), it would be reasonable to call ANY element of [a, b, c] a median. That's funny, but it is not imprecise. -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Mon Jan 7 14:35:45 2019 From: python at mrabarnett.plus.com (MRAB) Date: Mon, 7 Jan 2019 19:35:45 +0000 Subject: [Python-ideas] NAN handling in the statistics module In-Reply-To: <20190107163424.GJ13616@ando.pearwood.info> References: <20190107002722.GA13616@ando.pearwood.info> <20190107062630.GC13616@ando.pearwood.info> <20190107114914.GH13616@ando.pearwood.info> <20190107163424.GJ13616@ando.pearwood.info> Message-ID: <4483976b-6a4d-52de-5198-3279669ed4ab@mrabarnett.plus.com> On 2019-01-07 16:34, Steven D'Aprano wrote: > On Mon, Jan 07, 2019 at 10:05:19AM -0500, David Mertz wrote: [snip] >> It's not hard to manually check for NaNs and >> generate those in your own code. > > That is correct, but by that logic, we don't need to support *any* form > of NAN handling at all. It is easy (if inefficent) for the caller to > pre-filter their data. I want to make it easier and more convenient and > avoid having to iterate over the data twice if it isn't necessary. > Could the functions optionally accept a callback that will be called when a NaN is first seen? If the callback returns False, NaNs are suppressed, otherwise they are retained and the function returns NaN (or whatever). The callback would give the user a chance to raise a warning or an exception, if desired. 
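A rough sketch of what I mean (every name here is a placeholder, not a
concrete API proposal):

import math
import statistics

def mean(data, nan_callback=None):
    # Sketch: call nan_callback on the first NaN seen; a False return
    # means suppress NaNs, anything else means the result is NaN.
    values = []
    suppress = None  # undecided until the first NaN is seen
    for x in data:
        if isinstance(x, float) and math.isnan(x):
            if suppress is None:
                suppress = (nan_callback is not None
                            and nan_callback(x) is False)
            if not suppress:
                return x  # retain NaNs: the result is NaN
        else:
            values.append(x)
    return statistics.mean(values)

# e.g. drop NaNs but emit a warning first:
# mean(data, nan_callback=lambda x: warnings.warn("NaN seen") or False)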
From mertz at gnosis.cx  Mon Jan  7 14:44:57 2019
From: mertz at gnosis.cx (David Mertz)
Date: Mon, 7 Jan 2019 14:44:57 -0500
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To: <4483976b-6a4d-52de-5198-3279669ed4ab@mrabarnett.plus.com>
References: <20190107002722.GA13616@ando.pearwood.info>
 <20190107062630.GC13616@ando.pearwood.info>
 <20190107114914.GH13616@ando.pearwood.info>
 <20190107163424.GJ13616@ando.pearwood.info>
 <4483976b-6a4d-52de-5198-3279669ed4ab@mrabarnett.plus.com>
Message-ID:

This callback idea feels way over-engineered for this module. It would
absolutely make sense in a more specialized numeric or statistical
library. But `statistics` feels to me like it should be only simple and
basic operations, with very few knobs attached.

On Mon, Jan 7, 2019, 2:36 PM MRAB wrote:

> On 2019-01-07 16:34, Steven D'Aprano wrote:
> > On Mon, Jan 07, 2019 at 10:05:19AM -0500, David Mertz wrote:
> [snip]
> >> It's not hard to manually check for NaNs and
> >> generate those in your own code.
> >
> > That is correct, but by that logic, we don't need to support *any* form
> > of NAN handling at all. It is easy (if inefficient) for the caller to
> > pre-filter their data. I want to make it easier and more convenient and
> > avoid having to iterate over the data twice if it isn't necessary.
> >
> Could the functions optionally accept a callback that will be called
> when a NaN is first seen?
>
> If the callback returns False, NaNs are suppressed, otherwise they are
> retained and the function returns NaN (or whatever).
>
> The callback would give the user a chance to raise a warning or an
> exception, if desired.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From barry at barrys-emacs.org  Mon Jan  7 15:38:28 2019
From: barry at barrys-emacs.org (Barry)
Date: Mon, 7 Jan 2019 20:38:28 +0000
Subject: [Python-ideas] Possible PEP regarding the use of the continue keyword in try/except blocks
In-Reply-To: <3398afd9-b807-25a3-89a6-8f2e9e1ff8f5@Damon-Family.org>
References: <5c32bf53.1c69fb81.81d27.4efc@mx.google.com>
 <3398afd9-b807-25a3-89a6-8f2e9e1ff8f5@Damon-Family.org>
Message-ID:

> On 7 Jan 2019, at 03:06, Richard Damon wrote:
>
> For something like reading options from a config file, I would use a
> call that specifies the key and a value to use if the key isn't present,
> and inside that function I might use a try to handle any exception
> caused when processing the key, and it could return the default.

Most config file APIs I have used have has_section and has_key type
functions that remove the need to catch exceptions.

What config file API are you using that is missing this?
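For example, with configparser you can test before you look, or just
supply a fallback (the section and option names here are made up):

import configparser

cfg = configparser.ConfigParser()
cfg.read('settings.ini')

if cfg.has_section('server') and cfg.has_option('server', 'port'):
    port = cfg.getint('server', 'port')
else:
    port = 8080

# or, more simply:
port = cfg.getint('server', 'port', fallback=8080)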
Barry

From steve at pearwood.info  Mon Jan  7 18:27:03 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 8 Jan 2019 10:27:03 +1100
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To: <4483976b-6a4d-52de-5198-3279669ed4ab@mrabarnett.plus.com>
References: <20190107002722.GA13616@ando.pearwood.info>
 <20190107062630.GC13616@ando.pearwood.info>
 <20190107114914.GH13616@ando.pearwood.info>
 <20190107163424.GJ13616@ando.pearwood.info>
 <4483976b-6a4d-52de-5198-3279669ed4ab@mrabarnett.plus.com>
Message-ID: <20190107232701.GK13616@ando.pearwood.info>

On Mon, Jan 07, 2019 at 07:35:45PM +0000, MRAB wrote:

> Could the functions optionally accept a callback that will be called
> when a NaN is first seen?
>
> If the callback returns False, NaNs are suppressed, otherwise they are
> retained and the function returns NaN (or whatever).

That's an interesting API which I shall have to think about.

> The callback would give the user a chance to raise a warning or an
> exception, if desired.

One practical annoyance of this API is that you cannot put a raise
statement inside a lambda, so people desiring "fail fast" semantics
can't do this:

    result = mean(data, callback=lambda: raise Exception)

They have to pre-declare the callback using def.

--
Steve

From Richard at Damon-Family.org  Mon Jan  7 23:15:34 2019
From: Richard at Damon-Family.org (Richard Damon)
Date: Mon, 7 Jan 2019 23:15:34 -0500
Subject: [Python-ideas] Possible PEP regarding the use of the continue keyword in try/except blocks
In-Reply-To:
References: <5c32bf53.1c69fb81.81d27.4efc@mx.google.com>
 <3398afd9-b807-25a3-89a6-8f2e9e1ff8f5@Damon-Family.org>
Message-ID: <66a4c39f-7d60-2b78-f119-a530b34fb09e@Damon-Family.org>

On 1/7/19 3:38 PM, Barry wrote:
>
>> On 7 Jan 2019, at 03:06, Richard Damon wrote:
>>
>> For something like reading options from a config file, I would use a
>> call that specifies the key and a value to use if the key isn't present,
>> and inside that function I might use a try to handle any exception
>> caused when processing the key, and it could return the default.
> Most config file APIs I have used have has_section and has_key type functions that remove the need to catch exceptions.
>
> What config file API are you using that is missing this?
>
> Barry
>
I was talking about rolling my own: I would start by calling a function
I was writing with the key / default value, and it might have a try
block so any error that threw an exception would cause it to fall back
to the default value.

The OP is obviously thinking of something a bit off standard, or he
would just be using a standard config reader, and not need this. Maybe
the issue is parsing the data from the config line into some internal
format, and wanting to catch bad values, like a line that said
"nfiles = 42balloons" that throws when it is expecting just a number.

--
Richard Damon

From steve at pearwood.info  Tue Jan  8 05:56:20 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 8 Jan 2019 21:56:20 +1100
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To: <23604.20573.194385.786558@turnbull.sk.tsukuba.ac.jp>
References: <20190107002722.GA13616@ando.pearwood.info>
 <20190107062630.GC13616@ando.pearwood.info>
 <23604.20573.194385.786558@turnbull.sk.tsukuba.ac.jp>
Message-ID: <20190108105619.GP13616@ando.pearwood.info>

On Tue, Jan 08, 2019 at 04:25:17PM +0900, Stephen J.
Turnbull wrote: > Steven D'Aprano writes: > > > By definition, data containing Not A Number values isn't numeric :-) > > Unfortunately, that's just a joke, because in fact numeric functions > produce NaNs. I'm not sure if you're agreeing with me or disagreeing, so I'll assume you're agreeing and move on :-) > I agree that this can easily be resolved by documenting that it is the > caller's responsibility to remove NaNs from numeric data, but I prefer > your proposed flags. > > > The only reason why I don't call it a bug is that median() makes no > > promises about NANs at all, any more than it makes promises about the > > median of a list of sets or any other values which don't define a total > > order. > > Pedantically, I would prefer that the promise that ordinal data > (vs. specifically numerical) has a median be made explicit, as there > are many cases where statistical data is ordinal. I think that is reasonable. Provided the data defines a total order, the median is well-defined when there are an odd number of data points, or you can use median_low and median_high regardless of the number of data points. > This may be a moot > point, as in most cases ordinal data is represented numerically in > computation (Likert scales, for example, are rarely coded as "hate, > "dislike", "indifferent", "like", "love", but instead as 1, 2, 3, 4, > 5), and from the point of view of UI presentation, IntEnums do the > right thing here (print as identifiers, sort as integers). > > Perhaps a better way to document this would be to suggest that ordinal > data be represented using IntEnums? (Again to be pedantic, one might > want OrderedEnums that can be compared but don't allow other > arithmetic operations.) That's a nice solution. -- Steve (the other one) From tim.peters at gmail.com Tue Jan 8 23:55:57 2019 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 8 Jan 2019 22:55:57 -0600 Subject: [Python-ideas] NAN handling in the statistics module In-Reply-To: <20190107163424.GJ13616@ando.pearwood.info> References: <20190107002722.GA13616@ando.pearwood.info> <20190107062630.GC13616@ando.pearwood.info> <20190107114914.GH13616@ando.pearwood.info> <20190107163424.GJ13616@ando.pearwood.info> Message-ID: I'd like to see internal consistency across the central-tendency statistics in the presence of NaNs. What happens now: mean: the code appears to guarantee that a NaN will be returned if a NaN is in the input. median: as recently detailed, just about anything can happen, depending on how undefined behaviors in .sort() interact. mode: while NaN != NaN at the Python level, internally dicts use an identity shortcut so that, effectively, "is" takes precedence over `__eq__`. So a given NaN object will be recognized as repeated if it appears more than once, but distinct NaN objects remain distinct: So, e.g., >>> from math import inf, nan >>> import statistics >>> statistics.mode([2, 2, nan, nan, nan]) nan That's NOT "NaN-in, NaN-out", it's "a single NaN object is the object that appeared most often". Make those 3 distinct NaN objects (inf - inf results) instead, and the mode changes: >>> statistics.mode([2, 2, inf - inf, inf - inf, inf - inf]) 2 Since the current behavior of `mean()` is the only one that's sane, that should probably become the default for all of them (NaN in -> NaN out). "NaN in -> exception" and "pretend NaNs in the input don't exist" are the other possibly useful behaviors. About median speed, I wouldn't worry. 
Long ago I tried many variations of QuickSelect, and it required very
large inputs for a Python-coded QuickSelect to run faster than a
straightforward .sort()+index. It's bound to be worse now:

- Current Python .sort() is significantly faster on one-type lists
because it figures out the single type-specific comparison routine
needed once at the start, instead of enduring N log N full-blown
PyObject_RichCompareBool calls.

- And the current .sort() can be very much faster than older ones on
data with significant order. In the limit, .sort()+index will run
faster than any QuickSelect variant on already-sorted or
already-reverse-sorted data. QuickSelect variants aren't adaptive in
any sense, except that a "fat pivot" version (3-way partition, into <
pivot, == pivot, and > pivot regions) is very effective on data with
many equal values.

In Python 3.7.2, for randomly ordered random-ish floats I find that
median() is significantly faster than mean() even on lists with
millions of elements, despite that the former sorts and the latter
doesn't.

From steve at pearwood.info  Wed Jan  9 00:19:36 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 9 Jan 2019 16:19:36 +1100
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To: <20190107002722.GA13616@ando.pearwood.info>
References: <20190107002722.GA13616@ando.pearwood.info>
Message-ID: <20190109051935.GT13616@ando.pearwood.info>

On Mon, Jan 07, 2019 at 11:27:22AM +1100, Steven D'Aprano wrote:

[...]
> I propose adding a "nan_policy" keyword-only parameter to the relevant
> statistics functions (mean, median, variance etc), and defining the
> following policies:

I asked some heavy users of statistics software (not just Python users)
what behaviour they would find useful, and as I feared, I got no
conclusive answer. So far, the answers seem to be almost evenly split
into four camps:

- don't do anything, it is the caller's responsibility to filter NANs;

- raise an immediate error;

- return a NAN;

- treat them as missing data.

(Currently it is a small sample size, so I don't expect the answers
will stay evenly split if more people answer.)

On consideration of all the views expressed (thank you to everyone who
commented), I'm now inclined to default to returning a NAN (which
happens to be the current behaviour of mean etc, but not median except
by accident) even if it impacts performance.

--
Steve
>>> statistics.mode([9, 9, 9, 9, nan1, nan2, nan3])

No matter what missing value we take those nans to maybe-possibly
represent, 9 is still the most common element. This is only true when
the most common thing occurs at least as often as the 2nd most common
thing PLUS the number of all NaNs. But in that case, 9 really is the
mode.

We have one example of non-poisoning NaN in basic operations:

>>> nan**0
1.0

So if the NaN "cannot possibly change the answer" then it's reasonable
to produce a non-NaN answer IMO. Except we don't really get that with
0**nan or 0*nan already... so a NaN-poisoning mode wouldn't actually
offend my sensibilities that much. :-).

I guess you could argue that NaN "could be inf". In that case 0*nan
being nan makes sense. But this still feels slightly odd:

>>> 0**inf
0.0
>>> 0**nan
nan

I guess it's supported by:

>>> 0**-1
ZeroDivisionError: 0.0 cannot be raised to a negative power

A *missing value* could be a negative one.

--
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons. Intellectual property is
to the 21st century what the slave trade was to the 16th.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tim.peters at gmail.com  Wed Jan  9 01:11:28 2019
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 9 Jan 2019 00:11:28 -0600
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To:
References: <20190107002722.GA13616@ando.pearwood.info>
 <20190107062630.GC13616@ando.pearwood.info>
 <20190107114914.GH13616@ando.pearwood.info>
 <20190107163424.GJ13616@ando.pearwood.info>
Message-ID:

[David Mertz ]
> I think consistent NaN-poisoning would be excellent behavior. It will
> always make sense for median (and its variants).
>
>> >>> statistics.mode([2, 2, nan, nan, nan])
>> nan
>> >>> statistics.mode([2, 2, inf - inf, inf - inf, inf - inf])
>> 2
>
> But in the mode case, I'm not sure we should ALWAYS treat a NaN as
> poisoning the result.

I am: I thought about the following but didn't write about it because
it's too strained to be of actual sane use ;-)

> If NaN means "missing value" then sometimes it could change things,
> and we shouldn't guess. But what if it cannot?
>
> >>> statistics.mode([9, 9, 9, 9, nan1, nan2, nan3])
>
> No matter what missing value we take those nans to maybe-possibly represent, 9
> is still the most common element. This is only true when the most common thing
> occurs at least as often as the 2nd most common thing PLUS the number
> of all NaNs. But in that case, 9 really is the mode.

See "too strained" above. It's equally true that, e.g., the _median_
of your list above:

    [9, 9, 9, 9, nan1, nan2, nan3]

is also 9 regardless of what values are plugged in for the nans. That
may be easier to realize at first with a simpler list, like

    [5, 5, nan]

It sounds essentially useless to me, just theoretically possible to
make a mess of implementations to cater to. "The right" (obvious,
unsurprising, useful, easy to implement, easy to understand)
non-exceptional behavior in the presence of NaNs is to pretend they
weren't in the list to begin with. But I'd rather people ask for that
_if_ that's what they want.
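Asking for it explicitly is cheap, too -- a hypothetical helper along
these lines (not something in the module):

    from math import isnan
    from statistics import median

    def median_ignoring_nans(data):
        # Pretend float NaNs weren't in the list to begin with.
        return median(x for x in data
                      if not (isinstance(x, float) and isnan(x)))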
From jfine2358 at gmail.com  Wed Jan  9 06:50:09 2019
From: jfine2358 at gmail.com (Jonathan Fine)
Date: Wed, 9 Jan 2019 11:50:09 +0000
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To:
References: <20190107002722.GA13616@ando.pearwood.info>
 <20190107062630.GC13616@ando.pearwood.info>
 <20190107114914.GH13616@ando.pearwood.info>
 <20190107163424.GJ13616@ando.pearwood.info>
Message-ID:

I've just read statistics.py, and found something that might be
usefully considered along with the NaN question.

>>> median([1])
1
>>> median([1, 1])
1.0

To record this, and associated behaviour involving Fraction, I've
added:

Division by 2 in statistics.median: https://bugs.python.org/issue35698

--
Jonathan

From oscar.j.benjamin at gmail.com  Wed Jan  9 20:21:56 2019
From: oscar.j.benjamin at gmail.com (Oscar Benjamin)
Date: Thu, 10 Jan 2019 01:21:56 +0000
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To: <20190109051935.GT13616@ando.pearwood.info>
References: <20190107002722.GA13616@ando.pearwood.info>
 <20190109051935.GT13616@ando.pearwood.info>
Message-ID:

On Wed, 9 Jan 2019 at 05:20, Steven D'Aprano wrote:
>
> On Mon, Jan 07, 2019 at 11:27:22AM +1100, Steven D'Aprano wrote:
>
> [...]
> > I propose adding a "nan_policy" keyword-only parameter to the relevant
> > statistics functions (mean, median, variance etc), and defining the
> > following policies:
>
> I asked some heavy users of statistics software (not just Python users)
> what behaviour they would find useful, and as I feared, I got no
> conclusive answer. So far, the answers seem to be almost evenly split
> into four camps:
>
> - don't do anything, it is the caller's responsibility to filter NANs;
>
> - raise an immediate error;
>
> - return a NAN;
>
> - treat them as missing data.

I would prefer to raise an exception on nan. It's much easier to debug
an exception than a nan.

Take a look at the Julia docs for their statistics module:
https://docs.julialang.org/en/v1/stdlib/Statistics/index.html

In Julia they have defined an explicit "missing" value. With that you
can explicitly distinguish between a calculation error and missing
data. The obvious Python equivalent would be None.

> On consideration of all the views expressed (thank you to everyone who
> commented), I'm now inclined to default to returning a NAN (which happens
> to be the current behaviour of mean etc, but not median except by
> accident) even if it impacts performance.

Whichever way you go with this it might make sense to provide helper
functions for users to deal with nans e.g.:

    xbar = mean(without_nans(data))
    xbar = mode(replace_nans_with_None(data))

--
Oscar

From mistersheik at gmail.com  Thu Jan 10 11:42:11 2019
From: mistersheik at gmail.com (Neil Girdhar)
Date: Thu, 10 Jan 2019 08:42:11 -0800 (PST)
Subject: [Python-ideas] NAN handling in the statistics module
In-Reply-To: <20190107080953.GE13616@ando.pearwood.info>
References: <20190107002722.GA13616@ando.pearwood.info>
 <20190107070526.GD13616@ando.pearwood.info>
 <20190107080953.GE13616@ando.pearwood.info>
Message-ID: <455d0c65-99ae-4b0f-822d-b9901983cbc0@googlegroups.com>

On Monday, January 7, 2019 at 3:16:07 AM UTC-5, Steven D'Aprano wrote:
>
> (By the way, I'm not outright disagreeing with you, I'm trying to weigh
> up the pros and cons of your position. You've given me a lot to think
> about. More below.)
> On Sun, Jan 06, 2019 at 11:31:30PM -0800, Nathaniel Smith wrote:
> > On Sun, Jan 6, 2019 at 11:06 PM Steven D'Aprano wrote:
> > > I'm not wedded to the idea that the default ought to be the current
> > > behaviour. If there is a strong argument for one of the others, I'm
> > > listening.
> >
> > "Errors should never pass silently"? Silently returning nonsensical
> > results is hard to defend as a default behavior IMO :-)
>
> If you violate the assumptions of the function, just about everything
> can in principle return nonsensical results. True, most of the time you
> have to work hard at it:
>
> class MyList(list):
>     def __len__(self):
>         return random.randint(0, sys.maxint)
>
> but it isn't unreasonable to document the assumptions of a function, and
> if the caller violates those assumptions, Garbage In Garbage Out
> applies.
>

I'm with Antoine, Nathaniel, David, and Chris: it is unreasonable to
silently return nonsensical results even if you've documented it.
Documenting it only makes it worse because it's like an "I told you so"
when people finally figure out what's wrong and go to file the bug.

> E.g. bisect requires that your list is sorted in ascending order. If it
> isn't, the results you get are nonsensical.
>
> py> data = [8, 6, 4, 2, 0]
> py> bisect.bisect(data, 1)
> 0
>
> That's not a bug in bisect, that's a bug in the caller's code, and it
> isn't bisect's responsibility to fix it.
>
> Although it could be documented better, that's the current situation
> with NANs and median(). Data with NANs don't have a total ordering, and
> total ordering is the unstated assumption behind the idea of a median or
> middle value. So all bets are off.
>
> > > How would you answer those who say that the right behaviour is not to
> > > propagate unwanted NANs, but to fail fast and raise an exception?
> >
> > Both seem defensible a priori, but every other mathematical operation
> > in Python propagates NaNs instead of raising an exception. Is there
> > something unusual about median that would justify giving it unusual
> > behavior?
>
> Well, not everything...
>
> py> NAN/0
> Traceback (most recent call last):
>   File "", line 1, in
> ZeroDivisionError: float division by zero
>
> There may be others. But I'm not sure that "everything else does it" is
> a strong justification. It is *a* justification, since consistency is
> good, but consistency does not necessarily outweigh other concerns.
>
> One possible argument for making PASS the default, even if that means
> implementation-dependent behaviour with NANs, is that in the absence of
> a clear preference for FAIL or RETURN, at least PASS is backwards
> compatible.
>
> You might shoot yourself in the foot, but at least you know it's the same
> foot you shot yourself in using the previous version *wink*
>
>
> --
> Steve
> _______________________________________________
> Python-ideas mailing list
> Python... at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mertz at gnosis.cx  Thu Jan 10 12:06:07 2019
From: mertz at gnosis.cx (David Mertz)
Date: Thu, 10 Jan 2019 12:06:07 -0500
Subject: [Python-ideas] Fwd: NAN handling in the statistics module
In-Reply-To:
References: <20190107002722.GA13616@ando.pearwood.info>
 <20190107070526.GD13616@ando.pearwood.info>
 <20190107080953.GE13616@ando.pearwood.info>
 <455d0c65-99ae-4b0f-822d-b9901983cbc0@googlegroups.com>
Message-ID:

>> One possible argument for making PASS the default, even if that means
>> implementation-dependent behaviour with NANs, is that in the absence of a
>> clear preference for FAIL or RETURN, at least PASS is backwards compatible.
>>
>> You might shoot yourself in the foot, but at least you know it's the same
>> foot you shot yourself in using the previous version *wink*
>>

I've lost attribution chain. I think this is Steven, but it doesn't
really matter. This statement is untrue, or at least only accidentally
true at most.

The behavior of sorted() against partially ordered collections is
unspecified. The author of Timsort says exactly this. If
statistics.median() keeps the same implementation -- or keeps it with a
PASS argument -- it may or may not produce the same result in later
Python versions. Timsort is great, but even that has been tweaked
slightly over time.

I guess the statement is true if "same foot" means "meaningless answer"
not some specific value. But that hardly feels like a defense of the
behavior.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jfine2358 at gmail.com  Thu Jan 10 12:21:51 2019
From: jfine2358 at gmail.com (Jonathan Fine)
Date: Thu, 10 Jan 2019 17:21:51 +0000
Subject: [Python-ideas] Fwd: NAN handling in the statistics module
In-Reply-To:
References: <20190107002722.GA13616@ando.pearwood.info>
 <20190107070526.GD13616@ando.pearwood.info>
 <20190107080953.GE13616@ando.pearwood.info>
 <455d0c65-99ae-4b0f-822d-b9901983cbc0@googlegroups.com>
Message-ID:

On Thu, Jan 10, 2019 at 5:07 PM David Mertz wrote:
>>> You might shoot yourself in the foot, but at least you know it's the same foot you shot yourself in using the previous version *wink*
> I've lost attribution chain. I think this is Steven, but it doesn't really matter.

I think it was Steve. So far as I know, he's the only person on this
list who winks at other participants.

--
Jonathan

From arj.python at gmail.com  Wed Jan 16 11:11:25 2019
From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer)
Date: Wed, 16 Jan 2019 20:11:25 +0400
Subject: [Python-ideas] tkinter: time for round buttons?
Message-ID:

without starting a should we ban tkinter discussion, i'd like to
propose that we add rounded corners buttons. that might make the
aesthetic level go up a bit more

poor me, if only py had some really nice native gui

Abdur-Rahmaan Janhangeer
http://www.pythonmembers.club
Mauritius
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From barry at barrys-emacs.org  Wed Jan 16 13:46:52 2019
From: barry at barrys-emacs.org (Barry Scott)
Date: Wed, 16 Jan 2019 18:46:52 +0000
Subject: [Python-ideas] tkinter: time for round buttons?
In-Reply-To:
References:
Message-ID: <67A80046-B494-4D27-ADEC-2A1675172376@barrys-emacs.org>

> On 16 Jan 2019, at 16:11, Abdur-Rahmaan Janhangeer wrote:
>
> without starting a should we ban tkinter discussion, i'd like to propose that we add rounded corners buttons. that might make the aesthetic level go up a bit more
>
> poor me, if only py had some really nice native gui

It has a number of GUI toolkits.
Personally I use PyQt5, which is feature-rich and not that hard to get
started with. Apps have a native OS look and feel.

pip install PyQy5

Barry

> Abdur-Rahmaan Janhangeer
> http://www.pythonmembers.club
> Mauritius
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pythonchb at gmail.com  Wed Jan 16 15:32:26 2019
From: pythonchb at gmail.com (Christopher Barker)
Date: Wed, 16 Jan 2019 15:32:26 -0500
Subject: [Python-ideas] tkinter: time for round buttons?
In-Reply-To: <67A80046-B494-4D27-ADEC-2A1675172376@barrys-emacs.org>
References:
 <67A80046-B494-4D27-ADEC-2A1675172376@barrys-emacs.org>
Message-ID:

On Wed, Jan 16, 2019 at 2:07 PM Barry Scott wrote:

>> aesthetic level go up a bit more

tkinter is a wrapper around TK -- so it's TK that would need to
"modernize"

>> poor me, if only py had some really nice native gui
> pip install PyQy5

or pip install wxpython

native support for Windows, OS-X, *nix/GTK

The install story is SO much better now that the "built in" advantage
of tkinter is no longer such a big deal.

-Chris

--
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tjreedy at udel.edu  Wed Jan 16 15:49:05 2019
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 16 Jan 2019 15:49:05 -0500
Subject: [Python-ideas] tkinter: time for round buttons?
In-Reply-To:
References:
Message-ID:

On 1/16/2019 11:11 AM, Abdur-Rahmaan Janhangeer wrote:

> without starting a should we ban tkinter discussion,

Then don't bring up such an idea.

> i'd like to propose that we add rounded corners buttons.

This is out-of-scope for python-ideas. 'Tkinter' abbreviates 'Tk
interface'. It provides a class-based but otherwise thin wrapping of
the tcl/tk widget set. The appearance of widgets depends on the
Operating System. On my Mac with Mojave, tkinter buttons *do* have
rounded corners. In Windows, they don't. Talk to Microsoft about that.
I don't know about the various unix versions and distributions.

--
Terry Jan Reedy

From barry at barrys-emacs.org  Wed Jan 16 16:24:38 2019
From: barry at barrys-emacs.org (Barry)
Date: Wed, 16 Jan 2019 21:24:38 +0000
Subject: [Python-ideas] tkinter: time for round buttons?
In-Reply-To:
References:
 <67A80046-B494-4D27-ADEC-2A1675172376@barrys-emacs.org>
Message-ID: <62A64F1D-21D4-4622-AA4C-0FDAD2692628@barrys-emacs.org>
> > -Chris > > -- > Christopher Barker, PhD > > Python Language Consulting > - Teaching > - Scientific Software Development > - Desktop GUI and Web Development > - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... URL: From arj.python at gmail.com Wed Jan 16 19:26:06 2019 From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer) Date: Thu, 17 Jan 2019 04:26:06 +0400 Subject: [Python-ideas] tkinter: time for round buttons? In-Reply-To: References: Message-ID: let us say i'm a novice user, for me py's gui is such. if on Mac it gives rounded corners but on others no, it's pretty unpredictable. and if it does have roundedness but you can't control it then it's no good Abdur-Rahmaan Janhangeer http://www.pythonmembers.club Mauritius On Thu, 17 Jan 2019, 00:49 Terry Reedy This is out-of-scope for python-ideas. 'Tkinter' abbreviates 'Tk > interface'. It provides a class-based but otherwise thin wrapping of > the tcl/tk widget set. The appearance of widgets depends on the > Operating System. On my Mac with Mohave, tkinter buttons *do* have > rounded corners. In Windows, they don't. Talk to Microsoft about that. > I don't know about the various unix versions and distributions. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From arj.python at gmail.com Wed Jan 16 19:28:54 2019 From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer) Date: Thu, 17 Jan 2019 04:28:54 +0400 Subject: [Python-ideas] tkinter: time for round buttons? In-Reply-To: <67A80046-B494-4D27-ADEC-2A1675172376@barrys-emacs.org> References: <67A80046-B494-4D27-ADEC-2A1675172376@barrys-emacs.org> Message-ID: pip imstall PyQt5? Abdur-Rahmaan Janhangeer http://www.pythonmembers.club Mauritius pip install PyQy5 > > Barry > -------------- next part -------------- An HTML attachment was scrubbed... URL: From arj.python at gmail.com Wed Jan 16 19:34:21 2019 From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer) Date: Thu, 17 Jan 2019 04:34:21 +0400 Subject: [Python-ideas] tkinter: time for round buttons? In-Reply-To: References: <67A80046-B494-4D27-ADEC-2A1675172376@barrys-emacs.org> Message-ID: maybe commercial licenses issue Abdur-Rahmaan Janhangeer http://www.pythonmembers.club Mauritius The install story is SO much better now that the "built in" advantage of > tkinter is no longer such a big deal. > > -Chris > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Wed Jan 16 22:23:55 2019 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 16 Jan 2019 22:23:55 -0500 Subject: [Python-ideas] tkinter: time for round buttons? In-Reply-To: References: Message-ID: On 1/16/2019 7:26 PM, Abdur-Rahmaan Janhangeer wrote: > let us say i'm a novice user, for me py's gui is such. if on Mac it > gives rounded corners but on others no, it's pretty unpredictable. and > if it does have roundedness but you can't control it then it's no good It is using native widgits of the platform that people on the platform are used to. This thread is off-topic and belongs on python-list, not python-ideas. -- Terry Jan Reedy From turnbull.stephen.fw at u.tsukuba.ac.jp Wed Jan 16 23:10:50 2019 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Thu, 17 Jan 2019 13:10:50 +0900 Subject: [Python-ideas] tkinter: time for round buttons? 
In-Reply-To: References: <67A80046-B494-4D27-ADEC-2A1675172376@barrys-emacs.org> Message-ID: <23616.74.29957.127391@turnbull.sk.tsukuba.ac.jp> Abdur-Rahmaan Janhangeer writes: > maybe commercial licenses issue Among others. Tastes differ on licensing, look-and-feel, and consistent UI. Not just cross-platform, but within-platform as well: people may have other software that uses specific toolkits (aside from those that have been mentioned already, there are GTK-based GUIs) with which they want a compatible look-and-feel. The "batteries included" principle of the standard library is that there should be one way to do common things. Tkinter meets that standard. Other stdlib tools (IDLE at least, and that's a big one) use Tkinter. They would need to be ported if we wanted to encourage use of a fancier toolkit. But there's no need for that: their advantages are obvious, and using them is quite easy. And, most important of all, there are very different preferences among the candidates. Regards From greg.ewing at canterbury.ac.nz Wed Jan 16 23:55:13 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 17 Jan 2019 17:55:13 +1300 Subject: [Python-ideas] tkinter: time for round buttons? In-Reply-To: References: Message-ID: <5C400AB1.6080601@canterbury.ac.nz> Terry Reedy wrote: > It is using native widgits of the platform that people on the platform > are used to. Indeed, and because of that, you *shouldn't* have any control over it. People get annoyed when a GUI follows an author's personal preferences instead of platform conventions. -- Greg From steve at pearwood.info Thu Jan 17 00:24:11 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 17 Jan 2019 16:24:11 +1100 Subject: [Python-ideas] tkinter: time for round buttons? In-Reply-To: References: Message-ID: <20190117052410.GD13616@ando.pearwood.info> On Thu, Jan 17, 2019 at 04:26:06AM +0400, Abdur-Rahmaan Janhangeer wrote: > let us say i'm a novice user, for me py's gui is such. if on Mac it gives > rounded corners but on others no, it's pretty unpredictable. Its not unpredictable at all, it is easy to predict: if I'm using a Mac, it has Mac-style buttons with round corners. If I'm using Unix, it has Unix-style rectangle buttons. If I'm using Windows, it has whatever Windows uses. If a user is using Mac OS X, and Unix, and Windows, they probably are not a novice user -- and if they are a novice, this is a good lesson that different OSes are different, look different, and behave different. -- Steve From till.varoquaux at gmail.com Thu Jan 17 17:33:57 2019 From: till.varoquaux at gmail.com (Till) Date: Thu, 17 Jan 2019 17:33:57 -0500 Subject: [Python-ideas] Add support for external annotations in the typing module Message-ID: We started a discussion in https://github.com/python/typing/issues/600 about adding support for extra annotations in the typing module. Since this is probably going to turn into a PEP I'm transferring the discussion here to have more visibility. The document below has been modified a bit from the one in GH to reflect the feedback I got: + Added a small blurb about how ``Annotated`` should support being used as an alias Things that were raised but are not reflected in this document: + The dataclass example is confusing. I kept it for now because dataclasses often come up in conversations about why we might want to support annotations in the typing module. Maybe I should rework the section. 
+ `...` as a valid parameter for the first argument (if you want to add an annotation but use the type inferred by your type checker). This is an interesting idea, it's probably worth adding support for it if and only if we decide to support in other places. (c.f.: https://github.com/python/typing/issues/276) Thanks, Add support for external annotations in the typing module ========================================================== We propose adding an ``Annotated`` type to the typing module to decorate existing types with context-specific metadata. Specifically, a type ``T`` can be annotated with metadata ``x`` via the typehint ``Annotated[T, x]``. This metadata can be used for either static analysis or at runtime. If a library (or tool) encounters a typehint ``Annotated[T, x]`` and has no special logic for metadata ``x``, it should ignore it and simply treat the type as ``T``. Unlike the `no_type_check` functionality that current exists in the ``typing`` module which completely disables typechecking annotations on a function or a class, the ``Annotated`` type allows for both static typechecking of ``T`` (e.g., via MyPy or Pyre, which can safely ignore ``x``) together with runtime access to ``x`` within a specific application. We believe that the introduction of this type would address a diverse set of use cases of interest to the broader Python community. Motivating examples: ~~~~~~~~~~~~~~~~~~~~ reading binary data +++++++++++++++++++ The ``struct`` module provides a way to read and write C structs directly from their byte representation. It currently relies on a string representation of the C type to read in values:: record = b'raymond \x32\x12\x08\x01\x08' name, serialnum, school, gradelevel = unpack('<10sHHb', record) The struct documentation [struct-examples]_ suggests using a named tuple to unpack the values and make this a bit more tractable:: from collections import namedtuple Student = namedtuple('Student', 'name serialnum school gradelevel') Student._make(unpack('<10sHHb', record)) # Student(name=b'raymond ', serialnum=4658, school=264, gradelevel=8) However, this recommendation is somewhat problematic; as we add more fields, it's going to get increasingly tedious to match the properties in the named tuple with the arguments in ``unpack``. Instead, annotations can provide better interoperability with a type checker or an IDE without adding any special logic outside of the ``struct`` module:: from typing import NamedTuple UnsignedShort = Annotated[int, struct.ctype('H')] SignedChar = Annotated[int, struct.ctype('b')] @struct.packed class Student(NamedTuple): # MyPy typechecks 'name' field as 'str' name: Annotated[str, struct.ctype("<10s")] serialnum: UnsignedShort school: SignedChar gradelevel: SignedChar # 'unpack' only uses the metadata within the type annotations Student.unpack(record)) # Student(name=b'raymond ', serialnum=4658, school=264, gradelevel=8) dataclasses ++++++++++++ Here's an example with dataclasses [dataclass]_ that is a problematic from the typechecking standpoint:: from dataclasses import dataclass, field @dataclass class C: myint: int = 0 # the field tells the @dataclass decorator that the default action in the # constructor of this class is to set "self.mylist = list()" mylist: List[int] = field(default_factory=list) Even though one might expect that ``mylist`` is a class attribute accessible via ``C.mylist`` (like ``C.myint`` is) due to the assignment syntax, that is not the case. 
Instead, the ``@dataclass`` decorator strips out the assignment to this
attribute, leading to an ``AttributeError`` upon access::

    C.myint  # Ok: 0
    C.mylist # AttributeError: type object 'C' has no attribute 'mylist'

This can lead to confusion for newcomers to the library who may not
expect this behavior. Furthermore, the typechecker needs to understand
the semantics of dataclasses and know not to treat the above example as
an assignment operation (which translates to additional complexity).

It makes more sense to move the information contained in ``field`` to an
annotation::

    @dataclass
    class C:
        myint: int = 0
        mylist: Annotated[List[int], field(default_factory=list)]

    # now, the AttributeError is more intuitive because there is no
    # assignment operator
    C.mylist # AttributeError

    # the constructor knows how to use the annotations to set the
    # 'mylist' attribute
    c = C()
    c.mylist # []

The main benefit of writing annotations like this is that it provides a
way for clients to gracefully degrade when they don't know what to do
with the extra annotations (by just ignoring them). If you used a
typechecker that didn't have any special handling for dataclasses and the
``field`` annotation, you would still be able to run checks as though the
type were simply::

    class C:
        myint: int = 0
        mylist: List[int]

lowering barriers to developing new types
+++++++++++++++++++++++++++++++++++++++++

Typically when adding a new type, we need to upstream that type to the
typing module and change MyPy [MyPy]_, PyCharm [PyCharm]_, Pyre [Pyre]_,
pytype [pytype]_, etc. This is particularly important when working on
open-source code that makes use of our new types, seeing as the code
would not be immediately transportable to other developers' tools without
additional logic. (This is a limitation of MyPy plugins [MyPy-plugins]_,
which allow for extending MyPy but require a consumer of new typehints to
be using MyPy and have the same plugin installed.) As a result, there is
a high cost to developing and trying out new types in a codebase.
Ideally, we should be able to introduce new types in a manner that allows
for graceful degradation when clients do not have a custom MyPy plugin,
which would lower the barrier to development and ensure some degree of
backward compatibility.

For example, suppose that we wanted to add support for tagged unions
[tagged-unions]_ to Python. One way to accomplish this would be to
annotate ``TypedDict`` in Python such that only one field is allowed to
be set::

    Currency = Annotated(
        TypedDict('Currency', {'dollars': float, 'pounds': float},
                  total=False),
        TaggedUnion,
    )

This is a somewhat cumbersome syntax, but it allows us to iterate on this
proof-of-concept and have people with non-patched IDEs work in a codebase
with tagged unions. We could easily test this proposal and iron out the
kinks before trying to upstream tagged union to ``typing``, MyPy, etc.
Moreover, tools that do not have support for parsing the ``TaggedUnion``
annotation would still be able to treat ``Currency`` as a ``TypedDict``,
which is still a close approximation (slightly less strict).

Details of proposed changes to ``typing``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

syntax
++++++

``Annotated`` is parameterized with a type and an arbitrary list of
Python values that represent the annotations. Here are the specific
details of the syntax:

* The first argument to ``Annotated`` must be a valid ``typing`` type or
  ``...`` (to use the inferred type).
* Multiple type annotations are supported (``Annotated`` supports
  variadic arguments): ``Annotated[int, ValueRange(3, 10), ctype("char")]``

* ``Annotated`` must be used with at least two arguments
  (``Annotated[int]`` is not valid)

* The order of the annotations is preserved and matters for equality
  checks::

    Annotated[int, ValueRange(3, 10), ctype("char")] != \
        Annotated[int, ctype("char"), ValueRange(3, 10)]

* Nested ``Annotated`` types are flattened, with metadata ordered
  starting with the innermost annotation::

    Annotated[Annotated[int, ValueRange(3, 10)], ctype("char")] == \
        Annotated[int, ValueRange(3, 10), ctype("char")]

* Duplicated annotations are not removed: ``Annotated[int, ValueRange(3,
  10)] != Annotated[int, ValueRange(3, 10), ValueRange(3, 10)]``

* ``Annotated`` can be used in higher-order aliases::

    T = TypeVar('T')
    Vec = Annotated[List[Tuple[T, T]], MaxLen(10)]
    # Vec[int] == Annotated[List[Tuple[int, int]], MaxLen(10)]

consuming annotations
++++++++++++++++++++++

Ultimately, how to interpret the annotations (if at all) is the
responsibility of the tool or library encountering the ``Annotated``
type. A tool or library encountering an ``Annotated`` type can scan
through the annotations to determine if they are of interest (e.g., using
``isinstance``).

**Unknown annotations**
When a tool or a library does not support annotations or encounters an
unknown annotation, it should just ignore it and treat the annotated type
as the underlying type. For example, if we were to add an annotation that
is not an instance of ``struct.ctype`` to the annotation for name (e.g.,
``Annotated[str, 'foo', struct.ctype("<10s")]``), the unpack method
should ignore it.

**Namespacing annotations**
We do not need namespaces for annotations since the class used by the
annotations acts as a namespace.

**Multiple annotations**
It's up to the tool consuming the annotations to decide whether the
client is allowed to have several annotations on one type and how to
merge those annotations.

Since the ``Annotated`` type allows you to put several annotations of the
same (or different) type(s) on any node, the tools or libraries consuming
those annotations are in charge of dealing with potential duplicates. For
example, if you are doing value range analysis you might allow this::

    T1 = Annotated[int, ValueRange(-10, 5)]
    T2 = Annotated[T1, ValueRange(-20, 3)]

Flattening nested annotations, this translates to::

    T2 = Annotated[int, ValueRange(-10, 5), ValueRange(-20, 3)]

An application consuming this type might choose to reduce these
annotations via an intersection of the ranges, in which case ``T2`` would
be treated equivalently to ``Annotated[int, ValueRange(-10, 3)]``.

An alternative application might reduce these via a union, in which case
``T2`` would be treated equivalently to ``Annotated[int, ValueRange(-20,
5)]``.

Other applications may decide to not support multiple annotations and
throw an exception.

References
===========

.. [struct-examples] https://docs.python.org/3/library/struct.html#examples
.. [dataclass] https://docs.python.org/3/library/dataclasses.html
.. [MyPy] https://github.com/python/mypy
.. [MyPy-plugins] https://mypy.readthedocs.io/en/latest/extending_mypy.html#extending-mypy-using-plugins
.. [PyCharm] https://www.jetbrains.com/pycharm/
.. [Pyre] https://pyre-check.org/
.. [pytype] https://github.com/google/pytype
.. [tagged-unions] https://en.wikipedia.org/wiki/Tagged_union
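(For concreteness, here is a rough sketch of the kind of consumer code
the proposal enables. The ``Annotated`` stand-in, the ``ctype`` class,
and the ``struct_format`` helper below are invented for illustration;
they are not part of the proposal's reference implementation or of the
``typing`` module. The sketch shows the key contract: a consumer picks
out the metadata it understands and ignores the rest.)

    class Annotated:
        """Toy stand-in: Annotated[T, x, ...] records T plus metadata."""
        def __class_getitem__(cls, params):
            base, *metadata = params  # at least two arguments required
            inst = object.__new__(cls)
            inst.base, inst.metadata = base, tuple(metadata)
            return inst

    class ctype:
        """Example metadata carrying a struct format character."""
        def __init__(self, fmt):
            self.fmt = fmt

    def struct_format(fields):
        """Build a struct format string from annotated field types,
        skipping any metadata that is not a ctype instance."""
        parts = []
        for name, hint in fields.items():
            for meta in getattr(hint, 'metadata', ()):
                if isinstance(meta, ctype):  # unknown metadata is ignored
                    parts.append(meta.fmt)
                    break
        return '<' + ''.join(parts)

    UnsignedShort = Annotated[int, ctype('H')]
    SignedChar = Annotated[int, 'unrelated-metadata', ctype('b')]
    print(struct_format({'serialnum': UnsignedShort,
                         'gradelevel': SignedChar}))
    # -> <Hb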
From greg at krypto.org  Thu Jan 17 19:35:18 2019
From: greg at krypto.org (Gregory P. Smith)
Date: Thu, 17 Jan 2019 16:35:18 -0800
Subject: [Python-ideas] Add support for external annotations in the typing module
In-Reply-To: References: Message-ID: 

On Thu, Jan 17, 2019 at 2:34 PM Till wrote:
> [full proposal quoted in the original; snipped]

(0) Observation / TL;DR - This PEP really seems to be more of a way to
declare multiple different arbitrary-purpose annotations all attached to
a single callable/parameter/return/variable, so that static checkers
continue to work but runtime users of annotations, for whatever purpose,
can also work at the same time.

(1a) A struct.unpack supporting this will then need to evaluate
annotations in the outer scope at runtime due to our desired long term
PEP-563 `from __future__ import annotations` behavior. But that becomes
true of anything else wanting to use annotations at runtime, so we should
really make a typing library function that does this for everyone to use.

(1b) This proposal potentially expands the burden on type checkers... but
it shouldn't. They should be free to take the first type listed in an
Annotated[] block as the type of the variable, raising an error if
someone has listed multiple types (telling them to use Union[] for that).
A static checker *could* do useful things with multiple annotations it
knows how to handle, but I think it'd be unwise to implement that in any
manner where Annotated and Union could both be used for the same purpose.

It makes me wonder if Annotated[] is meaningfully different from Union at
all.

(2a) At first glance I don't like that the `T1 = Annotated[int,
SomeOtherInfo(23)]` syntax uses [] rather than () as it really is
constructing a runtime type. It isn't clear what should use [] and what
should use (), so I'd suggest using () for everything there.

(2b) Ask yourself: why should SomeOtherInfo and ValueRange and
struct.ctype be () calls yet none of `Annotated[Union[List[bytes],
Dict[bytes, Optional[float]]]]` be calls? If you can come up with an
answer to that, why _should_ anyone need to know that?

-gps

From till.varoquaux at gmail.com  Fri Jan 18 16:59:03 2019
From: till.varoquaux at gmail.com (Till)
Date: Fri, 18 Jan 2019 16:59:03 -0500
Subject: [Python-ideas] Add support for external annotations in the typing module
In-Reply-To: References: Message-ID: 

Thanks for the feedback Gregory. You raise a lot of good points; this is
going to help me write a clearer PEP.

(0) Pretty much. They can be used as refinements for more advanced type
checkers (e.g.: for linear types).
(1a) I knew about the postponed evaluation but hadn't read PEP 563 yet.
Thanks for the heads up.

(1b) I think you meant the `Intersection` type rather than the `Union`
type. A value of type `Intersection[A, B]` is both of type `A` and of
type `B`. If we had Intersection and allowed passing arguments decorated
with NoTypeCheck, then we could do without `Annotated`. This could be a
bit messy, though, because you'd probably want to make sure that
NoTypeCheck only appears in `Intersection`.

Another advantage of `Annotated` is that there's a clear "principal"
type, so you can make calls to constructors transparent. e.g.:

    class A:
        ...

    A_with_info = Annotated[A, ...]
    A_with_info(5)  # creates the value A(5)

(2a) and (2b): I don't have any strong feelings when it comes to syntax;
I tried to be consistent with the standard library (and maybe I got it
wrong). My understanding is that [] is used to create a new type whereas
() is used to create a new value:

    >>> Deque(range(2))
    deque([0, 1])
    >>> Deque[int]
    typing.Deque[int]

On Thu, 17 Jan 2019 at 19:35 Gregory P. Smith wrote:
> [earlier messages quoted in full; snipped]

From jamtlu at gmail.com  Sun Jan 20 19:21:50 2019
From: jamtlu at gmail.com (James Lu)
Date: Sun, 20 Jan 2019 19:21:50 -0500
Subject: [Python-ideas] Backtick expression: similar to a shorter lambda syntax
Message-ID: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com>

Backtick expressions work exactly like lambdas, except that they are
bound to the instance they are created in every time that class is used
to create one. To illustrate, this 'percent' property is bound to the
instance, not to the class.

    class Example:
        percent = property(`self.v*self.v2/100`)

And a few more examples for clarity.

    def example():
        locals()['a'] = 1
        expr = `a+1`
        return expr()  # error: one variable is required

Any variable names that exist when the backtick expression is created
are bound to the expression, and the reference to the expression is
stored within the expression. Names that do not exist when the
expression is created must be passed in as parameters. Such names can
also be passed in as keyword arguments. Backtick expressions are
created when their scope is created.

Variable names that are declared but have not been assigned to will be
considered to exist for the purposes of the backtick expression.

Directly calling a backtick expression as soon as it's created is
forbidden:

    `v+1`(a)

But this is technically allowed but discouraged, like how := works:

    (`v+1`)(a)

Use Cases

This can be used anywhere a lambda would feel "heavy" or long. Here are
a few use cases where using a backtick expression would allow code to be
significantly more readable:

- If/else chains that would be switch statements.
- Creating decorators.
- Passing in logging hooks.
- Writing design-by-contract contracts. (See icontract on GitHub for an
  example of what DBC looks like in Python.)
- Tests and assertions.

Additionally, the instance binding enables:

- A shorthand way to create a class that wraps an API to a better or
  more uniform code interface. Previously you'd need to make defs and
  @property; now each wrapped property and method is a single, readable
  line of code.

Appendix

I propose syntax highlighters show a backtick expression on a
different background color, a lighter shade of black for a dark theme;
dirty white for a light theme.

I also propose the following attributes on the backtick expression.

__str__(): the string [parameter names separated by commas] => [the
string of the backtick expression]
__repr__(): the original string of the backtick expression, surrounded
by backticks.

I secondarily propose that backtick expressions are only bound to their
instances when defined within a class when the following syntax is used:

    def a = ...

Now, let's bikeshed.
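(To make the comparison concrete, here is the 'percent' example above
approximated with a plain lambda in current Python. This is only a
sketch of the call shape; it does not capture the early name resolution
or the extra instance binding the proposal asks for, and the attribute
names `v` and `v2` are assumed as in the original example.)

    class Example:
        def __init__(self, v, v2):
            self.v = v
            self.v2 = v2

        # proposed: percent = property(`self.v*self.v2/100`)
        percent = property(lambda self: self.v * self.v2 / 100)

    e = Example(v=50, v2=40)
    print(e.percent)  # -> 20.0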
From steve at pearwood.info  Mon Jan 21 01:56:17 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 21 Jan 2019 17:56:17 +1100
Subject: [Python-ideas] Backtick expression: similar to a shorter lambda syntax
In-Reply-To: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com>
References: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com>
Message-ID: <20190121065617.GV13616@ando.pearwood.info>

On Sun, Jan 20, 2019 at 07:21:50PM -0500, James Lu wrote:

> Backtick expressions work exactly like lambdas, except that they are
> bound to the instance they are created in every time that class is
> used to create one. To illustrate, this 'percent' property is bound to
> the instance, not to the class.
> class Example:
>     percent = property(`self.v*self.v2/100`)

Sorry, that example is not clear to me. What does it do? There is no
instance "self" to bind to at this point.

> And a few more examples for clarity.
>
> def example():
>     locals()['a'] = 1
>     expr = `a+1`
>     return expr()  # error: one variable is required

Still not clear to me. It might help if you showed expected input and
output, rather than expecting us to guess.

> Any variable names that exist when the backtick expression is created
> are bound to the expression,

I don't know what you mean by binding a name to an expression.

> and the reference to the expression is
> stored within the expression.

So it forms a reference loop? The expression stores a reference to
itself? Why?

> Names that do not exist when the
> expression is created must be passed in as parameters.

That's different behaviour from regular functions, where names are only
resolved when the function is called.

> Such names can
> also be passed in as keyword arguments. Backtick expressions are
> created when their scope is created.

Created when their scope is created? So not when the line containing
the expression is executed?

> Variable names that are declared but have not been assigned to will be
> considered to exist for the purposes of the backtick expression.

Python doesn't have variable declarations, so I don't know what this
means.

> Directly calling a backtick expression as soon as it's created is
> forbidden:

Why?

> `v+1`(a)
>
> But this is technically allowed but discouraged, like how := works:
> (`v+1`)(a)

How is that different? You're still directly calling the expression.

[...]

> I propose syntax highlighters show a backtick expression on a
> different background color, a lighter shade of black for a dark theme;
> dirty white for a light theme.

Nothing to do with us, or you for that matter. Editors are free to use
whatever syntax highlighting they like, including none at all, and to
allow users to customise that highlighting.

It disturbs me that you believe you get to tell everyone what syntax
highlighting they should use for this feature. That's pretty
dictatorial, and not in a good BDFL way.

-- 
Steve

From cspealma at redhat.com  Mon Jan 21 11:22:35 2019
From: cspealma at redhat.com (Calvin Spealman)
Date: Mon, 21 Jan 2019 11:22:35 -0500
Subject: [Python-ideas] Backtick expression: similar to a shorter lambda syntax
In-Reply-To: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com>
References: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com>
Message-ID: 

On Sun, Jan 20, 2019 at 9:43 PM James Lu wrote:
> [proposal quoted in full; snipped]

I don't overall hate the idea, but I do have a few negatives to list.

1) While backticks are a free syntax now, they used to be a repr()
expression in older versions of Python! I'm not keen on re-using them;
it'll look real weird to us old people who remember that being common.

2) As a syntax it is pretty lightweight and could be easy to overlook. I
like that you thought of highlighters, but you can't depend on them to
make this syntax easier to notice. Instead, it is likely that the
expressions in the backticks will just blend in with the rest of the
code around them.

The one positive I see is that because there is no open and closing pair
of backticks, like parens or brackets, you can't easily nest this syntax,
and I actually like how it inherently discourages or makes that
impossible!

I can't say if I'm -0 or +0, but it is one of those.

-- 
CALVIN SPEALMAN
SENIOR QUALITY ENGINEER
cspealma at redhat.com  M: +1.336.210.5107
TRIED. TESTED. TRUSTED.
From jfine2358 at gmail.com  Mon Jan 21 11:45:57 2019
From: jfine2358 at gmail.com (Jonathan Fine)
Date: Mon, 21 Jan 2019 16:45:57 +0000
Subject: [Python-ideas] Backtick expression: similar to a shorter lambda syntax
In-Reply-To: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com>
References: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com>
Message-ID: 

> Backtick expressions work exactly like lambdas, except that they are
> bound to the instance they are created in every time that class is used
> to create one.

I would, if possible, very much like to see some real-world examples of
Python code that would benefit from being rewritten to use the new
syntax.

I'm particularly interested in examples that were written before this
idea was posted to this list. And extra points for links with line
numbers to Python code in a public git repository.

-- 
Jonathan

From steve at pearwood.info  Mon Jan 21 17:38:00 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 22 Jan 2019 09:38:00 +1100
Subject: [Python-ideas] Backtick expression: similar to a shorter lambda syntax
In-Reply-To: <20190121065617.GV13616@ando.pearwood.info>
References: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com> <20190121065617.GV13616@ando.pearwood.info>
Message-ID: <20190121223800.GF10079@ando.pearwood.info>

On Mon, Jan 21, 2019 at 05:56:17PM +1100, Steven D'Aprano wrote:
[...]

> > And a few more examples for clarity.
> >
> > def example():
> >     locals()['a'] = 1
> >     expr = `a+1`
> >     return expr()  # error: one variable is required
>
> Still not clear to me. It might help if you showed expected input and
> output, rather than expecting us to guess.

My comment there is excessively terse and I should explain, my
apologies.

Assigning to the ``locals()`` dictionary is not guaranteed to create or
modify the equivalent local variable. Inside a function,
``locals()['a'] = 1`` is NOT the same as ``a = 1``. In CPython, such
assignments don't work, although many people don't realise that. In
other implementations, they might.

So I'm not sure if this is meant to just be a fancy way of assigning to
``a``, or a fancy way of NOT assigning to ``a``, which gives me two
possible interpretations of that example depending on whether or not
James is aware that writing to locals() may or may not create a local
variable.

    # Backtick expressions don't resolve locals.
    def example():
        a = 1
        expr = `a+1`
        return expr()  # error: one variable is required

The alternative is a bit harder to guess what it does, since we don't
know whether there is or isn't a global variable ``a``. But given that
apparently we are required to pass ``a`` as an argument to the
expression object, I guess that the second interpretation is:

    # Backtick expressions don't look for globals.
    a = 1  # or not, the behaviour doesn't change (or does it?)
    def example():
        expr = `a+1`
        return expr()  # error: one variable is required

Personally, *either* behaviour seems so awful to me that I can hardly
credit that James Lu intends either. The second one means that
expression objects can't call builtin or global level functions, unless
you pass them in as arguments:

    obj = `len([]) + 1`
    obj()  # Fails with NameError
    obj(len=len)  # Returns 1

which seems ludicrous. But that appears to be the consequence of
requiring variables that aren't in the local scope to be passed as
arguments.

Since I cannot believe that James actually intends either of these
behaviours, I can only repeat my request for clarification. What is
this example supposed to show?
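(A quick demonstration of the CPython behaviour described above, for
anyone who wants to check. The function name `demo` is made up for the
illustration, and it assumes no global ``a`` exists: the write to
``locals()`` inside a function creates no real variable, so the later
bare-name lookup falls through to the global namespace and fails.)

    def demo():
        locals()['a'] = 1
        try:
            return a  # compiled as a global lookup; raises NameError
        except NameError:
            return "no variable 'a' was created"

    print(demo())  # -> no variable 'a' was created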
-- 
Steve

From greg.ewing at canterbury.ac.nz  Tue Jan 22 02:44:55 2019
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 22 Jan 2019 20:44:55 +1300
Subject: [Python-ideas] Backtick expression: similar to a shorter lambda syntax
In-Reply-To: References: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com>
Message-ID: <5C46C9F7.7020301@canterbury.ac.nz>

Calvin Spealman wrote:
> The one positive I see is that because there is no open and closing pair
> of backticks, like parens or brackets, you can't easily nest this syntax
> and I actually like how it inherently discourages or makes that
> impossible!

Perhaps surprisingly, the backtick syntax in Python 2 actually is
nestable, despite beginning and ending with the same character.

    Python 2.7 (r27:82500, Oct 15 2010, 21:14:33)
    [GCC 4.2.1 (Apple Inc. build 5664)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> `'a'+`2+3``
    "'a5'"

-- 
Greg

From jamtlu at gmail.com  Mon Jan 21 11:15:33 2019
From: jamtlu at gmail.com (James Lu)
Date: Mon, 21 Jan 2019 11:15:33 -0500
Subject: [Python-ideas] Discussion: Duck typing with "concepts"
Message-ID: 

So here's an interesting idea, not a proposal yet.

In C++20, a Concept is a list of Boolean expressions with a name that can
be used in place of a type in a templated (i.e. type-generic) function.

    from typing import Concept
    Iterator = Concept(lambda o: hasattr(o, "__iter__"),
                       lambda o: iter(o) != NotImplemented)
    # Concept inheritance
    Iterable = Concept(lambda o: hasattr(o, "__next__"), Iterator)

You could use concepts to define many practical "real-world" duck types.

A concept is like an opt-in duck typing type assertion. Since it's a part
of type annotation syntax, assertions can be generated automatically by
looking at the annotations.

You would use a Concept like any other type in type annotations.

Marko, how do you think Concepts might integrate with icontract? (I'm
imagining overriding an import hook to automatically add contracts to
functions with concepts.) How frequently do you use duck typing at
Parquery?

From levkivskyi at gmail.com  Tue Jan 22 06:57:14 2019
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Tue, 22 Jan 2019 11:57:14 +0000
Subject: [Python-ideas] Discussion: Duck typing with "concepts"
In-Reply-To: References: Message-ID: 

I think you may be a bit late. Have you heard about PEP 544?

--
Ivan

On Tue, 22 Jan 2019 at 11:50, James Lu wrote:
> [message quoted in full; snipped]
From jamtlu at gmail.com  Tue Jan 22 08:42:21 2019
From: jamtlu at gmail.com (James Lu)
Date: Tue, 22 Jan 2019 08:42:21 -0500
Subject: [Python-ideas] Backtick expression: similar to a shorter lambda syntax
In-Reply-To: <20190121065617.GV13616@ando.pearwood.info>
References: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com> <20190121065617.GV13616@ando.pearwood.info>
Message-ID: 

> On Jan 21, 2019, at 1:56 AM, Steven D'Aprano wrote:
>
> It disturbs me that you believe you get to tell everyone what syntax
> highlighting they should use for this feature. That's pretty
> dictatorial, and not in a good BDFL way.

I don't want to tell anyone how to make their syntax highlighter -- just
to point out that there are good ways to make the code within backtick
expressions visually distinct from the surrounding code. I told the list
because it's helpful for the readability discussion.

From jamtlu at gmail.com  Tue Jan 22 08:42:40 2019
From: jamtlu at gmail.com (James Lu)
Date: Tue, 22 Jan 2019 08:42:40 -0500
Subject: [Python-ideas] Backtick expression: similar to a shorter lambda syntax
In-Reply-To: <5C46C9F7.7020301@canterbury.ac.nz>
References: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com> <5C46C9F7.7020301@canterbury.ac.nz>
Message-ID: <79BF2294-E037-4ABB-847D-DE6B7BF68527@gmail.com>

Later today I will send a working implementation of backtick expressions
as a function call.

From jamtlu at gmail.com  Tue Jan 22 08:43:16 2019
From: jamtlu at gmail.com (James Lu)
Date: Tue, 22 Jan 2019 08:43:16 -0500
Subject: [Python-ideas] Backtick expression: similar to a shorter lambda syntax
In-Reply-To: <20190121065617.GV13616@ando.pearwood.info>
References: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com> <20190121065617.GV13616@ando.pearwood.info>
Message-ID: <8B3C1E65-0D72-4F64-AB39-1A9852C901CA@gmail.com>

I'm a little busy recently, so I'll reply to as much as I can now and
reply to the rest later.

Scratch the stuff I said about scope. Backtick expressions should inherit
the scope normally, like any other nested function.

> That's different behaviour from regular functions, where names are only
> resolved when the function is called.

What benefits are there to the late name lookup of normal functions?
I'm looking to have backtick expressions raise early, not late. We can
relax the names to be optional if a ternary expression is used within the
backtick expression.

I realized that the behavior of backtick expressions can be silently
affected by global variables. Example:

    x = 1
    def increment_each(l):
        return map(`x+1`, l)

## Explicit expression, implicit usage

Explicit backtick expressions are ones that use a caret before the name
of every parameter of the created function. The other names must exist
when the backtick expression is evaluated. Example:

    parameter = 0
    is_integer = `int(^parameter) == ^parameter`  # arity: 1 despite the global definition

    self = 'James'
    str = `^self.__class__.__str__(^self)`  # arity: 1 despite the global definition
    str(type(lambda: 1))  # use our custom str function on the function type; output:

## Implicitly Created Backtick Expression

Implicit backtick expressions, ones that mention undefined parameters
without using the caret mark, are generally discouraged. However, they
create an UncastBacktickExpression, which must be cast using the
.to_safe(*names) method before use; to_safe takes a list of parameter
names and outputs a normal backtick expression. Even if the variable is
defined on a global level, it can be explicitly overridden in to_safe.

Example 1

    `x+1`.to_safe('x')(1)  # output: 2

Example 2

    x = 0
    `x+1`.to_safe('x')(1)  # output: 2

If a backtick expression has no unspecified names and has no carets, it
is an implicit backtick expression. This allows developers to safely omit
the ^ when the code that is using the resulting backtick expression is
aware of the parameters to be used, given that it's obvious to the
developer which names are parameters.

> On Jan 21, 2019, at 1:56 AM, Steven D'Aprano wrote:
>
> That's different behaviour from regular functions, where names are only
> resolved when the function is called.

From shoyer at gmail.com  Tue Jan 22 13:24:41 2019
From: shoyer at gmail.com (Stephan Hoyer)
Date: Tue, 22 Jan 2019 10:24:41 -0800
Subject: [Python-ideas] Backtick expression: similar to a shorter lambda syntax
In-Reply-To: References: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com>
Message-ID: 

On Mon, Jan 21, 2019 at 8:47 AM Jonathan Fine wrote:
> I would, if possible, very much like to see some real-world examples of
> Python code that would benefit from being rewritten to use the new
> syntax.

This has come up a few times before on Python-Ideas. Here are a few
examples from 2015 (but please read the full thread for discussion):

https://mail.python.org/pipermail/python-ideas/2015-March/032758.html
https://mail.python.org/pipermail/python-ideas/2015-March/032822.html

From pflarr at gmail.com  Tue Jan 22 15:11:10 2019
From: pflarr at gmail.com (Paul Ferrell)
Date: Tue, 22 Jan 2019 13:11:10 -0700
Subject: [Python-ideas] Potential PEP: with/except
Message-ID: 

I've found that almost any time I'm writing a 'with' block, it's doing
something that could throw an exception. As a result, each of those
'with' blocks needs to be nested within a 'try' block.
Due to the nature of 'with', it is rarely (if ever) the case that the try block contains anything other than the with block itself.

As a result, I would like to propose that the syntax for 'with' blocks be changed such that they can be accompanied by 'except', 'finally', and/or 'else' blocks as per a standard 'try' block. These would handle exceptions that occur in the 'with' block, including the execution of the applicable __enter__ and __exit__ methods.

Example:

    try:
        with open(path) as myfile:
            ...  # Do stuff with file
    except (OSError, IOError) as err:
        logger.error("Failed to read/open file {}: {}".format(path, err))

The above would turn into simply:

    with open(path) as myfile:
        ...  # Do stuff with file
    except (OSError, IOError) as err:
        logger.error(...)

I think this is rather straightforward in meaning and easy to read, and simplifies some unnecessary nesting. I see this as the natural evolution of what 'with' is all about - replacing necessary try-finally blocks with something more elegant. We just didn't include the 'except' portion.

I'm a bit hesitant to put this out there. I'm not worried about it getting shot down - that's kind of the point here. I'm just pretty strongly against unnecessary syntactical additions to the language. This though, I think I can except. It introduces no new concepts and requires no special knowledge to use. There's no question about what is going on when you read it.

-- 
Paul Ferrell
pflarr at gmail.com

From cspealma at redhat.com Tue Jan 22 15:23:42 2019
From: cspealma at redhat.com (Calvin Spealman)
Date: Tue, 22 Jan 2019 15:23:42 -0500
Subject: [Python-ideas] Potential PEP: with/except
In-Reply-To: 
References: 
Message-ID: 

On Tue, Jan 22, 2019 at 3:11 PM Paul Ferrell wrote:

> I've found that almost any time I'm writing a 'with' block, it's doing something that could throw an exception. As a result, each of those 'with' blocks needs to be nested within a 'try' block. Due to the nature of 'with', it is rarely (if ever) the case that the try block contains anything other than the with block itself.
>
> As a result, I would like to propose that the syntax for 'with' blocks be changed such that they can be accompanied by 'except', 'finally', and/or 'else' blocks as per a standard 'try' block. These would handle exceptions that occur in the 'with' block, including the execution of the applicable __enter__ and __exit__ methods.
>
> Example:
>
> try:
>     with open(path) as myfile:
>         ...  # Do stuff with file
> except (OSError, IOError) as err:
>     logger.error("Failed to read/open file {}: {}".format(path, err))
>
> The above would turn into simply:
>
> with open(path) as myfile:
>     ...  # Do stuff with file
> except (OSError, IOError) as err:
>     logger.error(...)

It definitely makes sense, both the problem and the proposed solution. The thing that concerns me is that any such problem and solution seems to apply equally to any other kind of block. Why not allow excepts on for loops, for example?

> I think this is rather straightforward in meaning and easy to read, and simplifies some unnecessary nesting. I see this as the natural evolution of what 'with' is all about - replacing necessary try-finally blocks with something more elegant. We just didn't include the 'except' portion.
>
> I'm a bit hesitant to put this out there. I'm not worried about it getting shot down - that's kind of the point here. I'm just pretty strongly against unnecessary syntactical additions to the language. This though, I think I can except.
> It introduces no new concepts and requires no special knowledge to use. There's no question about what is going on when you read it.
>
> --
> Paul Ferrell
> pflarr at gmail.com
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-- 
CALVIN SPEALMAN
SENIOR QUALITY ENGINEER
cspealma at redhat.com M: +1.336.210.5107
TRIED. TESTED. TRUSTED.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mike at selik.org Tue Jan 22 15:31:42 2019
From: mike at selik.org (Michael Selik)
Date: Tue, 22 Jan 2019 12:31:42 -0800
Subject: [Python-ideas] Potential PEP: with/except
In-Reply-To: 
References: 
Message-ID: 

On Tue, Jan 22, 2019, 12:11 PM Paul Ferrell wrote:

> I see this as the natural evolution of what 'with' is all about - replacing necessary try-finally blocks with something more elegant. We just didn't include the 'except' portion.

The time machine strikes again. In fact, you can handle exceptions with a context manager object. Whatever you're with-ing must have a dunder exit method, which receives any exceptions raised in the block as an argument. Return true and the exception is suppressed.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From barry at barrys-emacs.org Tue Jan 22 15:39:51 2019
From: barry at barrys-emacs.org (Barry Scott)
Date: Tue, 22 Jan 2019 20:39:51 +0000
Subject: [Python-ideas] Backtick expression: similar to a shorter lambda syntax
In-Reply-To: <79BF2294-E037-4ABB-847D-DE6B7BF68527@gmail.com>
References: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com> <5C46C9F7.7020301@canterbury.ac.nz> <79BF2294-E037-4ABB-847D-DE6B7BF68527@gmail.com>
Message-ID: 

The problem with using the back-tick is that it is far too easy to misread it as a single-quote. Back-tick in bash has the $( xxx ) replacement that avoids the problem.

Please find an alternative syntax that avoids the problem.

Barry

> On 22 Jan 2019, at 13:42, James Lu wrote:
>
> Later today I will send a working implementation of backtick expressions as a function call.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From barry at barrys-emacs.org Tue Jan 22 15:51:05 2019
From: barry at barrys-emacs.org (Barry Scott)
Date: Tue, 22 Jan 2019 20:51:05 +0000
Subject: [Python-ideas] Potential PEP: with/except
In-Reply-To: 
References: 
Message-ID: <98AF0133-6913-417E-9581-2E84ECAFA460@barrys-emacs.org>

> On 22 Jan 2019, at 20:31, Michael Selik wrote:
>
> On Tue, Jan 22, 2019, 12:11 PM Paul Ferrell wrote:
> I see this as the natural evolution of what 'with' is all about - replacing necessary try-finally blocks with something more elegant. We just didn't include the 'except' portion.
>
> The time machine strikes again. In fact, you can handle exceptions with a context manager object. Whatever you're with-ing must have a dunder exit method, which receives any exceptions raised in the block as an argument. Return true and the exception is suppressed.

Suppressing the exception is not the general case. And will not work for the example given.
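To see why not, a minimal sketch (Opener is a throwaway name): if the open() fails inside __enter__, __exit__ never runs, so it cannot log or suppress that error.

    class Opener:
        def __init__(self, path):
            self.path = path
        def __enter__(self):
            return open(self.path)  # an OSError raised here propagates...
        def __exit__(self, exc_type, exc, tb):
            return True             # ...because __exit__ only sees errors from the body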
Barry

> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mertz at gnosis.cx Tue Jan 22 16:26:39 2019
From: mertz at gnosis.cx (David Mertz)
Date: Tue, 22 Jan 2019 16:26:39 -0500
Subject: [Python-ideas] Potential PEP: with/except
In-Reply-To: <98AF0133-6913-417E-9581-2E84ECAFA460@barrys-emacs.org>
References: <98AF0133-6913-417E-9581-2E84ECAFA460@barrys-emacs.org>
Message-ID: 

You could write a context manager that used an arbitrary callback passed in to handle exceptions (including re-raising as needed). This doesn't require new syntax, just writing a custom CM.

On Tue, Jan 22, 2019, 4:20 PM Barry Scott wrote:

> > On 22 Jan 2019, at 20:31, Michael Selik wrote:
> > On Tue, Jan 22, 2019, 12:11 PM Paul Ferrell wrote:
> >> I see this as the natural evolution of what 'with' is all about - replacing necessary try-finally blocks with something more elegant. We just didn't include the 'except' portion.
> >
> > The time machine strikes again. In fact, you can handle exceptions with a context manager object. Whatever you're with-ing must have a dunder exit method, which receives any exceptions raised in the block as an argument. Return true and the exception is suppressed.
>
> Suppressing the exception is not the general case.
> And will not work for the example given.
>
> Barry
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rosuav at gmail.com Tue Jan 22 16:29:51 2019
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 23 Jan 2019 08:29:51 +1100
Subject: [Python-ideas] Potential PEP: with/except
In-Reply-To: 
References: 
Message-ID: 

On Wed, Jan 23, 2019 at 7:11 AM Paul Ferrell wrote:

> As a result, I would like to propose that the syntax for 'with' blocks be changed such that they can be accompanied by 'except', 'finally', and/or 'else' blocks as per a standard 'try' block. These would handle exceptions that occur in the 'with' block, including the execution of the applicable __enter__ and __exit__ methods.
>
> Example:
>
> try:
>     with open(path) as myfile:
>         ...  # Do stuff with file
> except (OSError, IOError) as err:
>     logger.error("Failed to read/open file {}: {}".format(path, err))
>
> The above would turn into simply:
>
> with open(path) as myfile:
>     ...  # Do stuff with file
> except (OSError, IOError) as err:
>     logger.error(...)

Edge case: The "try/with/except" structure includes the entire 'with' header inside the try block, including the call to open(). But if the with block itself is handling the exceptions, the expression "open(path)" is actually evaluated before the exception handling gets going. So adding an except clause is NOT the same as just adding another context manager to the stack.

I'm -0.25 on it, as I don't think overlaying in this way improves clarity.
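To make the two readings concrete (reusing the example from the proposal):

    # current syntax: open(path) is unambiguously guarded
    try:
        with open(path) as myfile:
            ...  # Do stuff with file
    except (OSError, IOError) as err:
        logger.error(...)

    # proposed syntax: open(path) runs before any implicit 'try' could begin
    with open(path) as myfile:
        ...  # Do stuff with file
    except (OSError, IOError) as err:
        logger.error(...)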
The current syntax makes it very obvious that the "open(path)" call is inside the try/except, but the proposed syntax isn't so clear (and as many people will expect it to be inside as outside).

ChrisA

From barry at barrys-emacs.org Tue Jan 22 15:47:45 2019
From: barry at barrys-emacs.org (Barry Scott)
Date: Tue, 22 Jan 2019 20:47:45 +0000
Subject: [Python-ideas] Potential PEP: with/except
In-Reply-To: 
References: 
Message-ID: <9C6C7B85-7470-4439-BF05-76865395EC13@barrys-emacs.org>

> On 22 Jan 2019, at 20:11, Paul Ferrell wrote:
>
> I've found that almost any time I'm writing a 'with' block, it's doing something that could throw an exception. As a result, each of those 'with' blocks needs to be nested within a 'try' block. Due to the nature of 'with', it is rarely (if ever) the case that the try block contains anything other than the with block itself.
>
> As a result, I would like to propose that the syntax for 'with' blocks be changed such that they can be accompanied by 'except', 'finally', and/or 'else' blocks as per a standard 'try' block. These would handle exceptions that occur in the 'with' block, including the execution of the applicable __enter__ and __exit__ methods.
>
> Example:
>
> try:
>     with open(path) as myfile:
>         ...  # Do stuff with file
> except (OSError, IOError) as err:
>     logger.error("Failed to read/open file {}: {}".format(path, err))
>
> The above would turn into simply:
>
> with open(path) as myfile:
>     ...  # Do stuff with file
> except (OSError, IOError) as err:
>     logger.error(...)

Or this, which shows that the except/finally will be present?

    try with open(path) as myfile:
        ...  # Do stuff with file
    except (OSError, IOError) as err:
        logger.error(...)

> I think this is rather straightforward in meaning and easy to read, and simplifies some unnecessary nesting. I see this as the natural evolution of what 'with' is all about - replacing necessary try-finally blocks with something more elegant. We just didn't include the 'except' portion.
>
> I'm a bit hesitant to put this out there. I'm not worried about it getting shot down - that's kind of the point here. I'm just pretty strongly against unnecessary syntactical additions to the language. This though, I think I can except. It introduces no new concepts and requires no special knowledge to use. There's no question about what is going on when you read it.
>
> --
> Paul Ferrell
> pflarr at gmail.com
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From pythonchb at gmail.com Tue Jan 22 16:53:50 2019
From: pythonchb at gmail.com (Christopher Barker)
Date: Tue, 22 Jan 2019 13:53:50 -0800
Subject: [Python-ideas] Backtick expression: similar to a shorter lambda syntax
In-Reply-To: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com>
References: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com>
Message-ID: 

Going back to the original post:

On Sun, Jan 20, 2019 at 6:43 PM James Lu wrote:

> Backtick expressions work exactly like lambdas, except that they are bound to the instance they are created in every time that class is used to create one.

?!? bound every time that instance is used to create one -- I have no idea what that means, or why it's useful.

As I understand it, Python has expressions and statements -- expressions evaluate to a value that can be referenced in various ways, statements can do other stuff.
A lambda is an expression -- it evaluates to a function object, which can then be bound to a name, stored in a container, passed to another function, etc.

Is this proposal somehow different? Does a backtick expression evaluate to a function? It seems from other parts of this thread that it does something different with namespaces and globals and locals, or ???

> To illustrate, this "percent" property is bound to the instance, not to the class.
> class Example:
>     percent = property(`self.v*self.v2/100`)

I do not understand this -- "property" is a decorator, or even more generally, a class which, when called, creates a "property" instance, that follows the descriptor protocol: https://docs.python.org/3/howto/descriptor.html

As it is currently written, property expects a "getter" method as its first argument. So the above would be written as:

    @property
    def percent(self):
        return self.v * self.v2 / 100

or, without the decoration:

    def percent(self):
        return self.v * self.v2 / 100
    percent = property(percent)

or, with a lambda:

    percent = property(lambda self: self.v * self.v2 / 100)

In any case, property expects a callable as its first argument that will take an instance as its first argument. So there is no place for "binding to an instance".

As far as I can tell, other than saving the six characters of lambda, the other thing that this does is provide some implicit argument calling -- how does that work exactly? in the above example, remember that "self" is a convention, so your backtick example could just as easily be:

    class Example:
        percent = property(`thing.v * thing.v2 / 100`)

so when the function object is created, how does it know what arguments it takes? In this case, there is only one name used in the expression, so I guess we could assume that that's an argument, but what if it were:

    class Example:
        percent = property(`thing.v * thing.v2 / x`)

Now we have both "thing" and "x" as names to be resolved -- and the function will be called with an instance as the first argument -- so is that first argument "thing" or "x", and in what namespace is the other one to be looked for?

You do try to explain this here:

> Any variable names that exist when the backtick expression is created are bound to the expression,

what is "the expression"? -- as a rule expressions can't be bound to.

> and the reference to the expression is stored within the expression.

it sure sounds like you are using the same word in two ways here...

> Names that do not exist when the expression is created must be passed in as parameters.

so if I have a block of code like:

    a = 5
    bt = `a + b`

Then bt will be a function that takes one positional argument, so this is the equivalent of:

    bt = lambda b: a + b

Which means that if I go and add a "b" above that line of code later on, suddenly the meaning of this expression changes? that sure seems fragile to me!

or is it?

    bt = lambda b, a=a: a + b

that is, the function gets the VALUE of a at the time the function is created?

> Such names can also be passed in as keyword arguments. Backtick expressions are created when their scope is created.

now it's getting really, really confusing if there are more than a couple names involved:

    a = 5
    b = 7
    bt = `c * d / a + b *e *f / d * f`

so c, d, e, and f are not defined, so the function needs 4 parameters? are they positional? in what order? if keyword, what is the default?

This seems like a LOT of magic -- and for what? just to save typing?

-CHB

> Use Cases
>
> This can be used anywhere a lambda would feel "heavy" or long.
> Here are a few use cases where using a backtick expression would allow code to be significantly more readable:
>
> If/else chains that would be switch statements.
> Creating decorators.
> Passing in logging hooks.
> Writing design-by-contract contracts. (See icontract on GitHub for an example of what DBC looks like in Python.)
> Tests and assertions.

We're really going to need to see examples for these -- I can imagine examples where the code would be shorter, but NOT more readable!

-CHB

-- 
Christopher Barker, PhD

Python Language Consulting
- Teaching
- Scientific Software Development
- Desktop GUI and Web Development
- wxPython, numpy, scipy, Cython
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pflarr at gmail.com Tue Jan 22 17:15:04 2019
From: pflarr at gmail.com (Paul Ferrell)
Date: Tue, 22 Jan 2019 15:15:04 -0700
Subject: [Python-ideas] Potential PEP: with/except
In-Reply-To: 
References: 
Message-ID: 

On the whole chain about context managers: I'm aware that '__exit__' gets the exceptions raised, which is great in certain situations that are specific to the object. However, the common case is specific to the _usage_ of the object. Even if it were all just one type of object, like opening files, there are many different situations in which I need to handle the errors. Some may just need to be logged, some ignored, some may need to attempt a retry, some should cause a complete failure. Writing custom context managers for all of the different cases is the opposite of the desired result of slightly cleaner, less redundant code.

Then consider that I'm often using many different context-managed objects in a code base, often simultaneously. This idea hit me because I keep running into the try/with/except pattern so often, and in so many different circumstances.

On Tue, Jan 22, 2019 at 1:31 PM Michael Selik wrote:

> On Tue, Jan 22, 2019, 12:11 PM Paul Ferrell wrote:
>> I see this as the natural evolution of what 'with' is all about - replacing necessary try-finally blocks with something more elegant. We just didn't include the 'except' portion.
>
> The time machine strikes again. In fact, you can handle exceptions with a context manager object. Whatever you're with-ing must have a dunder exit method, which receives any exceptions raised in the block as an argument. Return true and the exception is suppressed.

-- 
Paul Ferrell
pflarr at gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pflarr at gmail.com Tue Jan 22 17:15:50 2019
From: pflarr at gmail.com (Paul Ferrell)
Date: Tue, 22 Jan 2019 15:15:50 -0700
Subject: [Python-ideas] Fwd: Potential PEP: with/except
In-Reply-To: 
References: 
Message-ID: 

> The thing that concerns me is that any such problem and solution seems to apply equally to any other kind of block. Why not allow excepts on for loops, for example?

Very good point. I think 'with' is special in that it typically contains the entirety of the use of an object, and the type of objects one tends to use in a 'with' are prone to throwing exceptions. Other statements like it don't intrinsically encapsulate the usage of an object.

On Tue, Jan 22, 2019 at 1:23 PM Calvin Spealman wrote:

> On Tue, Jan 22, 2019 at 3:11 PM Paul Ferrell wrote:
>
>> I've found that almost any time I'm writing a 'with' block, it's doing something that could throw an exception. As a result, each of those 'with' blocks needs to be nested within a 'try' block.
>> Due to the nature of 'with', it is rarely (if ever) the case that the try block contains anything other than the with block itself.
>>
>> As a result, I would like to propose that the syntax for 'with' blocks be changed such that they can be accompanied by 'except', 'finally', and/or 'else' blocks as per a standard 'try' block. These would handle exceptions that occur in the 'with' block, including the execution of the applicable __enter__ and __exit__ methods.
>>
>> Example:
>>
>> try:
>>     with open(path) as myfile:
>>         ...  # Do stuff with file
>> except (OSError, IOError) as err:
>>     logger.error("Failed to read/open file {}: {}".format(path, err))
>>
>> The above would turn into simply:
>>
>> with open(path) as myfile:
>>     ...  # Do stuff with file
>> except (OSError, IOError) as err:
>>     logger.error(...)

> It definitely makes sense, both the problem and the proposed solution.
>
> The thing that concerns me is that any such problem and solution seems to apply equally to any other kind of block. Why not allow excepts on for loops, for example?

>> I think this is rather straightforward in meaning and easy to read, and simplifies some unnecessary nesting. I see this as the natural evolution of what 'with' is all about - replacing necessary try-finally blocks with something more elegant. We just didn't include the 'except' portion.
>>
>> I'm a bit hesitant to put this out there. I'm not worried about it getting shot down - that's kind of the point here. I'm just pretty strongly against unnecessary syntactical additions to the language. This though, I think I can except. It introduces no new concepts and requires no special knowledge to use. There's no question about what is going on when you read it.
>>
>> --
>> Paul Ferrell
>> pflarr at gmail.com
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/

> --
> CALVIN SPEALMAN
> SENIOR QUALITY ENGINEER
> cspealma at redhat.com M: +1.336.210.5107
> TRIED. TESTED. TRUSTED.

-- 
Paul Ferrell
pflarr at gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From njs at pobox.com Tue Jan 22 17:16:13 2019
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 22 Jan 2019 14:16:13 -0800
Subject: [Python-ideas] Potential PEP: with/except
In-Reply-To: 
References: 
Message-ID: 

The first concern that comes to my mind is... When I see:

    with:
        ...
    except:
        ...

Is that a shorthand for

    try:
        with:
            ...
    except:
        ...

or for

    with:
        try:
            ...
        except:
            ...

? Both are plausible, and it makes a big difference, because 'with' already has an implicit 'except' block built in.

-n

On Tue, Jan 22, 2019, 12:12 Paul Ferrell wrote:

> I've found that almost any time I'm writing a 'with' block, it's doing something that could throw an exception. As a result, each of those 'with' blocks needs to be nested within a 'try' block. Due to the nature of 'with', it is rarely (if ever) the case that the try block contains anything other than the with block itself.
>
> As a result, I would like to propose that the syntax for 'with' blocks be changed such that they can be accompanied by 'except', 'finally', and/or 'else' blocks as per a standard 'try' block.
> These would handle exceptions that occur in the 'with' block, including the execution of the applicable __enter__ and __exit__ methods.
>
> Example:
>
> try:
>     with open(path) as myfile:
>         ...  # Do stuff with file
> except (OSError, IOError) as err:
>     logger.error("Failed to read/open file {}: {}".format(path, err))
>
> The above would turn into simply:
>
> with open(path) as myfile:
>     ...  # Do stuff with file
> except (OSError, IOError) as err:
>     logger.error(...)
>
> I think this is rather straightforward in meaning and easy to read, and simplifies some unnecessary nesting. I see this as the natural evolution of what 'with' is all about - replacing necessary try-finally blocks with something more elegant. We just didn't include the 'except' portion.
>
> I'm a bit hesitant to put this out there. I'm not worried about it getting shot down - that's kind of the point here. I'm just pretty strongly against unnecessary syntactical additions to the language. This though, I think I can except. It introduces no new concepts and requires no special knowledge to use. There's no question about what is going on when you read it.
>
> --
> Paul Ferrell
> pflarr at gmail.com
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info Tue Jan 22 17:17:23 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 23 Jan 2019 09:17:23 +1100
Subject: [Python-ideas] Potential PEP: with/except
In-Reply-To: 
References: 
Message-ID: <20190122221722.GE13616@ando.pearwood.info>

On Tue, Jan 22, 2019 at 01:11:10PM -0700, Paul Ferrell wrote:

[...]
> I would like to propose that the syntax for 'with' blocks be changed such that they can be accompanied by 'except', 'finally', and/or 'else' blocks as per a standard 'try' block.

What benefit does this give apart from saving one line and one indent? If either is in short supply, the code probably needs refactoring, not new syntax.

The beauty of the current syntax is that try...except and with blocks are fully independent, composable blocks which can be learned and reasoned about separately. You're proposing to add a new special-case syntax:

    with ...
    except ...

that adds a new block structure that has to be implemented, documented, tested, maintained, taught and learned. It will inevitably lead to questions on mailing lists, IRC and Stackoverflow asking what is the difference between a separate try...with...except and a with...except, and when to choose one or the other.

And of course then there will be the inevitable requests that we generalise it to other blocks:

    for ...
    except ...

    while ...
    except ...

If this will allow us to write more expressive code, or do things we couldn't easily do before, then it might be worthwhile to add this additional complexity.

But if all it does is save one line and one indent, then I believe it is redundant and I would be against it.

-- 
Steve

From pflarr at gmail.com Tue Jan 22 17:22:27 2019
From: pflarr at gmail.com (Paul Ferrell)
Date: Tue, 22 Jan 2019 15:22:27 -0700
Subject: [Python-ideas] Potential PEP: with/except
In-Reply-To: 
References: 
Message-ID: 

That is definitely an ambiguity worth considering (whether __enter__ is within the implied 'try').
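A toy context manager makes the question concrete -- a minimal sketch (Flaky is a made-up name):

    class Flaky:
        def __enter__(self):
            raise OSError("failed during __enter__")
        def __exit__(self, exc_type, exc, tb):
            return False

    with Flaky() as x:
        ...
    except OSError as err:
        ...  # is the __enter__ failure caught here, or does it propagate?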
Anecdotally, I showed the with/except example to my student (who's relatively new to Python), to see how he interpreted it. He (correctly?) assumed the CM operations were within the 'try', and was pretty surprised when I told him that the except part of the with wasn't actually valid Python.

On Tue, Jan 22, 2019 at 2:30 PM Chris Angelico wrote:

> On Wed, Jan 23, 2019 at 7:11 AM Paul Ferrell wrote:
> > As a result, I would like to propose that the syntax for 'with' blocks be changed such that they can be accompanied by 'except', 'finally', and/or 'else' blocks as per a standard 'try' block. These would handle exceptions that occur in the 'with' block, including the execution of the applicable __enter__ and __exit__ methods.
> >
> > Example:
> >
> > try:
> >     with open(path) as myfile:
> >         ...  # Do stuff with file
> > except (OSError, IOError) as err:
> >     logger.error("Failed to read/open file {}: {}".format(path, err))
> >
> > The above would turn into simply:
> >
> > with open(path) as myfile:
> >     ...  # Do stuff with file
> > except (OSError, IOError) as err:
> >     logger.error(...)
>
> Edge case: The "try/with/except" structure includes the entire 'with' header inside the try block, including the call to open(). But if the with block itself is handling the exceptions, the expression "open(path)" is actually evaluated before the exception handling gets going. So adding an except clause is NOT the same as just adding another context manager to the stack.
>
> I'm -0.25 on it, as I don't think overlaying in this way improves clarity. The current syntax makes it very obvious that the "open(path)" call is inside the try/except, but the proposed syntax isn't so clear (and as many people will expect it to be inside as outside).
>
> ChrisA
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-- 
Paul Ferrell
pflarr at gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pflarr at gmail.com Tue Jan 22 17:48:27 2019
From: pflarr at gmail.com (Paul Ferrell)
Date: Tue, 22 Jan 2019 15:48:27 -0700
Subject: [Python-ideas] Potential PEP: with/except
In-Reply-To: <20190122221722.GE13616@ando.pearwood.info>
References: <20190122221722.GE13616@ando.pearwood.info>
Message-ID: 

I completely understand your perspective, and agree with most of it. It doesn't add new expressiveness, it adds a bit of polish (and I think completeness) to the relatively new concept of 'with' statements.

Is this so intuitive that we don't actually have to teach it? Is it such a natural extension to 'with', that it would immediately be weird to find it missing in the future? If the answer to either of those questions is 'no', then I absolutely retract my idea.

On Tue, Jan 22, 2019 at 3:22 PM Steven D'Aprano wrote:

> On Tue, Jan 22, 2019 at 01:11:10PM -0700, Paul Ferrell wrote:
>
> [...]
> > I would like to propose that the syntax for 'with' blocks be changed such that they can be accompanied by 'except', 'finally', and/or 'else' blocks as per a standard 'try' block.
>
> What benefit does this give apart from saving one line and one indent? If either is in short supply, the code probably needs refactoring, not new syntax.
>
> The beauty of the current syntax is that try...except and with blocks are fully independent, composable blocks which can be learned and reasoned about separately.
> You're proposing to add a new special-case syntax:
>
>     with ...
>     except ...
>
> that adds a new block structure that has to be implemented, documented, tested, maintained, taught and learned. It will inevitably lead to questions on mailing lists, IRC and Stackoverflow asking what is the difference between a separate try...with...except and a with...except, and when to choose one or the other.
>
> And of course then there will be the inevitable requests that we generalise it to other blocks:
>
>     for ...
>     except ...
>
>     while ...
>     except ...
>
> If this will allow us to write more expressive code, or do things we couldn't easily do before, then it might be worthwhile to add this additional complexity.
>
> But if all it does is save one line and one indent, then I believe it is redundant and I would be against it.
>
> --
> Steve

-- 
Paul Ferrell
pflarr at gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rosuav at gmail.com Tue Jan 22 17:49:57 2019
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 23 Jan 2019 09:49:57 +1100
Subject: [Python-ideas] Potential PEP: with/except
In-Reply-To: 
References: 
Message-ID: 

On Wed, Jan 23, 2019 at 9:23 AM Paul Ferrell wrote:
>
> That is definitely an ambiguity worth considering (whether __enter__ is within the implied 'try').

It's not even __enter__.

>>> import dis
>>> def f(path):
...     with open(path) as f:
...         print("Got f:", f)
...
>>> dis.dis(f)
  2           0 LOAD_GLOBAL              0 (open)
              2 LOAD_FAST                0 (path)
              4 CALL_FUNCTION            1
              6 SETUP_WITH              16 (to 24)
              8 STORE_FAST               1 (f)

  3          10 LOAD_GLOBAL              1 (print)
             12 LOAD_CONST               1 ('Got f:')
             14 LOAD_FAST                1 (f)
             16 CALL_FUNCTION            2
             18 POP_TOP
             20 POP_BLOCK
             22 BEGIN_FINALLY
        >>   24 WITH_CLEANUP_START
             26 WITH_CLEANUP_FINISH
             28 END_FINALLY
             30 LOAD_CONST               0 (None)
             32 RETURN_VALUE
>>>

At the time when "open(path)" is called, the 'with' block hasn't begun operating yet. The SETUP_WITH operation will call __enter__, but prior to that, we have to have an object to use as the context manager. Any exception thrown by opening the file will happen before __enter__ gets called. AIUI the __enter__ method of file objects doesn't actually do anything much (just validates that it's still open).

ChrisA

From turnbull.stephen.fw at u.tsukuba.ac.jp Tue Jan 22 19:30:21 2019
From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Wed, 23 Jan 2019 09:30:21 +0900
Subject: [Python-ideas] Backtick expression: similar to a shorter lambda syntax
In-Reply-To: 
References: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com>
Message-ID: <23623.46493.902289.853332@turnbull.sk.tsukuba.ac.jp>

Jonathan Fine writes:

> > Backtick expressions work exactly like lambdas, except that they are bound to the instance they are created in every time that class is used to create one.
>
> I would if possible very much like to see some real world examples of Python code, that would benefit by being rewritten to use the new syntax.

Note: the usual way of doing this is to find examples in the standard library. It's not perfect, but the stdlib is generally pretty good code to start with, and is written with a fairly consistent style. Yes, those two features make finding syntax that makes the stdlib more readable a pretty high barrier.
Steve

-- 
Associate Professor              Division of Policy and Planning Science
http://turnbull.sk.tsukuba.ac.jp/     Faculty of Systems and Information
Email: turnbull at sk.tsukuba.ac.jp                   University of Tsukuba
Tel: 029-853-5175                 Tennodai 1-1-1, Tsukuba 305-8573 JAPAN

From steve at pearwood.info Tue Jan 22 20:05:17 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 23 Jan 2019 12:05:17 +1100
Subject: [Python-ideas] Potential PEP: with/except
In-Reply-To: 
References: 
Message-ID: <20190123010517.GF13616@ando.pearwood.info>

On Tue, Jan 22, 2019 at 03:22:27PM -0700, Paul Ferrell wrote:

> Anecdotally, I showed the with/except example to my student (who's relatively new to Python), to see how he interpreted it. He (correctly?) assumed the CM operations were within the 'try', and was pretty surprised when I told him that the except part of the with wasn't actually valid Python.

One of the more pernicious myths about language design is that if something surprises a beginner, it must be a bad idea. The reality is that beginners are the worst people to judge what is good or bad or consistent, because they don't have the knowledge or experience to recognise deep consistency or flaws in a feature.

I'm just making a general observation here, not making a specific claim that this specific proposal is flawed (beyond the point I made earlier that it may be unnecessary and redundant). But anecdotes about "beginners were surprised by..." don't carry much weight with me.

(Not "zero weight", it isn't as if I *want* to surprise beginners, and sometimes newcomers to a language can spot things which experts are so used to they don't notice any longer.)

-- 
Steve

From steve at pearwood.info Tue Jan 22 20:28:26 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 23 Jan 2019 12:28:26 +1100
Subject: [Python-ideas] Potential PEP: with/except
In-Reply-To: 
References: 
Message-ID: <20190123012826.GG13616@ando.pearwood.info>

I've been thinking more about this proposal, and realised why I've been feeling a slight sense of disquiet about it. I think it encourages an anti-pattern of catching too much. (Or at least a code smell.)

Although we're all guilty of violating this principle from time to time, in general we ought to surround the minimum amount of code with a try...except that we need. Ideally (but rarely possible in practice) we want to surround a single operation which might raise at a time. Otherwise, we risk this sort of failure:

    try:
        n = len(sequence)
        result = process(n)
    except TypeError:
        # we implicitly assume that the ONLY source of
        # TypeError is calling len(sequence)
        handle_iterator(sequence)

But what if process(n) itself raises TypeError? Perhaps because it takes two mandatory arguments, not one, and we've just hidden a bug in our code. Now obviously this specific example is just a toy, but the principle applies. When I see code like:

    try:
        with spam(arg) as x:
            block
    except SomeException:
        # implicitly assume that spam(arg) is the only
        # thing which can fail
        handle_failure_in_spam()

what I see is a try block which may be too greedy, possibly hiding bugs in the code. So what we probably *actually* want is:

    try:
        tmp = spam(arg)
    except SomeException:
        handle_failure_in_spam()
    with tmp as x:
        block

but who can be bothered writing that? At least until they've been bitten by the failure to do so.

Given this:

    with spam(arg) as x:
        block
    except:
        ...

what is the scope of this with...except clause?
- just the call to spam(arg)
- the call to spam(arg) and the call to x.__enter__
- spam(arg), x.__enter__ and the entire with block
- just the call to x.__enter__
- just the block

Visually, the beauty of the try...except syntax is that there is nothing else happening on the try line. The try statement is purely a delimiter, and it is *only* the indented block below it which is guarded. But the "with..." line in this proposal acts as both delimiter and code, and so it is ambiguous whether we want the delimiter to come before or after the code:

    try with block except
    with try block except
    try with except block

Logically, I don't want this to guard the block. Doing so guards too much: it is bad enough when I'm lazy and surround the entire with statement in a single try...except, I don't want the language providing me a feature specifically to encourage me to do it. But visually, I would *never* guess that the block was not guarded by the with...except clause.

Logically, I don't want it to cover the body of the with statement, but I hate to imagine having to explain to people why it doesn't. But the alternative is to enshrine in syntax something which *by design* guards too much and is a code smell.

-- 
Steve

From steve at pearwood.info Wed Jan 23 01:43:14 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 23 Jan 2019 17:43:14 +1100
Subject: [Python-ideas] Backtick expression: similar to a shorter lambda syntax
In-Reply-To: <20190121065617.GV13616@ando.pearwood.info>
References: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com> <20190121065617.GV13616@ando.pearwood.info>
Message-ID: <20190123064314.GJ13616@ando.pearwood.info>

On Mon, Jan 21, 2019 at 05:56:17PM +1100, Steven D'Aprano wrote:

[...]
> > Variable names that are declared but have not been assigned to will be considered to exist for the purposes of the backtick expression.
>
> Python doesn't have variable declarations, so I don't know what this means.

Somebody emailed me off-list, and reminded me about type hints:

    var: int

but that's not a declaration in the usual sense of the word. It is officially an annotation. As PEP 526 states:

"Type annotations should not be confused with variable declarations in statically typed languages."

https://www.python.org/dev/peps/pep-0526/#non-goals

At a global level, such type hints have no effect beyond recording the annotation for introspection purposes. Inside a function, they don't do that, but do instruct the compiler to treat the name as a local rather than global. What they certainly don't do is create the variable.

If people want to argue that such variable annotations are declarations "but not the same as in statically typed languages", that becomes a matter of argument over definitions. In any case, whether they are declarations or annotations, at least now I think that I understand the intent of the quoted paragraph. As I understand it:

- backtick objects raise at creation-time if they refer to a variable that doesn't exist at that moment;

- but if the variable has been annotated using the "var: int" syntax, it will be deemed to exist even if it doesn't.

To be precise, by "variable" I mean specifically a name-binding.

In the interactive interpreter I frequently create functions that refer to a global variable or another function, before I've created that variable or other function. Sometimes this happens in code in modules as well.

    def spam():
        return eggs() + 1  # But eggs doesn't exist yet!
    def eggs():
        return 999

If I change ``spam`` to a backtick object, it is (I presume) an error, because ``eggs`` doesn't exist. What a drag.

Especially since (unlike statically typed languages) this inconvenience doesn't even buy me any runtime efficiency. The name lookup for eggs will still have to be done at runtime every time I call spam().

-- 
Steve

From marko.ristin at gmail.com Wed Jan 23 02:04:24 2019
From: marko.ristin at gmail.com (Marko Ristin-Kaufmann)
Date: Wed, 23 Jan 2019 08:04:24 +0100
Subject: [Python-ideas] =?utf-8?q?Discussion=3A_Duck_typing_with_?= =?utf-8?b?4oCcY29uY2VwdHPigJ0=?=
In-Reply-To: 
References: 
Message-ID: 

Hi James,

As Ivan has mentioned, Protocols already allow for static type checks: https://mypy.readthedocs.io/en/latest/protocols.html

We didn't need protocols that often at Parquery, maybe half a dozen times? While we didn't use them in Python, we had to use them intensively in Go where it is a bit of a nightmare. It gives you freedom for cases when your input arguments are fairly general (e.g., in a public library), but it made refactoring our _production_ code (i.e. specific in contrast to general) much harder since you couldn't look up easily which type implements which "interface" (as protocols are called in Go).

Cheers,
Marko
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From bruce at leban.us Wed Jan 23 03:01:28 2019
From: bruce at leban.us (Bruce Leban)
Date: Wed, 23 Jan 2019 00:01:28 -0800
Subject: [Python-ideas] Backtick expression: similar to a shorter lambda syntax
In-Reply-To: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com>
References: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com>
Message-ID: 

On Sun, Jan 20, 2019 at 6:43 PM James Lu wrote:

> Backtick expressions work exactly like lambdas, except that they are bound to the instance they are created in every time that class is used to create one. To illustrate, ...

First, if there is a useful procedure, I am strongly against using backticks because (1) it's been used in the past with an entirely different meaning and (2) it looks ugly and is not visually suggestive at all of what it does, especially not the subtle difference from other function definitions.

Second, I don't understand exactly what this difference is or why it would be useful. It would help for you to give examples comparing lambda and this variation.

Third, you mention using ^ in "explicit" expressions to refer to parameters of the "created function" and I do not know what function you are referring to or what the exact semantics of this are. Again, a comparison of two expressions with and without that ^ would help. An expression is not a function and not all expressions are written inside functions. (And as to the specific proposed syntax, there already is the ^ xor operator and the most expected meaning of ^value is ~value, just as the unary + and - operators correspond to the binary operators.)

The only thing that I can think of is that you want `foo + ^bar` to be another way of writing lambda bar: foo + bar with some under-specified behavior for evaluating foo and different under-specified behavior for evaluating bar.

Finally, if there is some other useful semantics for references inside a function definition, then I would think the best way to do that is to implement that, not add a new function difference.
For example,

    lambda foo: foo + $bar

    def sample(foo):
        return foo + $foo

where I'm arbitrarily using $ to represent the new semantics whatever they are (no point in bikeshedding syntax when semantics are yet to be defined).

--- Bruce
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jamtlu at gmail.com Wed Jan 23 16:06:38 2019
From: jamtlu at gmail.com (James Lu)
Date: Wed, 23 Jan 2019 16:06:38 -0500
Subject: [Python-ideas] Backtick expression: similar to a shorter lambda syntax
In-Reply-To: 
References: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com>
Message-ID: 

Backtick expressions (now) use the same scoping and same binding rules as other functions. The only difference is that

    class Class:
        staticmethod = `...`
        staticmethod = lambda: ...
        def instancemethod = `...`  # an instancemethod that's called with self passed in
        def property = property(`...`)  # an instancemethod that's called with self passed in

> The only thing that I can think of is that you want `foo + ^bar` to be another way of writing lambda bar: foo + bar with some under-specified behavior for evaluating foo and different under-specified behavior for evaluating bar.

That is what `lambda bar: foo + ^bar` means. A caret in a backtick expression indicates that the name after the caret is a parameter. All names with the same name must have a caret before them. Mandatory parameters can be passed in as keyword arguments or as positional ones.

As for the under-specification, I've been working on an example implementation I'll send soon for backtick expressions.

I've also been doing the "look for use cases in stdlib" thing that Jonathan and Steve mentioned.

On Wed, Jan 23, 2019 at 3:02 AM Bruce Leban wrote:

> On Sun, Jan 20, 2019 at 6:43 PM James Lu wrote:
>
>> Backtick expressions work exactly like lambdas, except that they are bound to the instance they are created in every time that class is used to create one. To illustrate, ...
>
> First, if there is a useful procedure, I am strongly against using backticks because (1) it's been used in the past with an entirely different meaning and (2) it looks ugly and is not visually suggestive at all of what it does, especially not the subtle difference from other function definitions.
>
> Second, I don't understand exactly what this difference is or why it would be useful. It would help for you to give examples comparing lambda and this variation.
>
> Third, you mention using ^ in "explicit" expressions to refer to parameters of the "created function" and I do not know what function you are referring to or what the exact semantics of this are. Again, a comparison of two expressions with and without that ^ would help. An expression is not a function and not all expressions are written inside functions. (And as to the specific proposed syntax, there already is the ^ xor operator and the most expected meaning of ^value is ~value, just as the unary + and - operators correspond to the binary operators.)
>
> The only thing that I can think of is that you want `foo + ^bar` to be another way of writing lambda bar: foo + bar with some under-specified behavior for evaluating foo and different under-specified behavior for evaluating bar.
>
> Finally, if there is some other useful semantics for references inside a function definition, then I would think the best way to do that is to implement that, not add a new function difference.
> For example,
>
>     lambda foo: foo + $bar
>
>     def sample(foo):
>         return foo + $foo
>
> where I'm arbitrarily using $ to represent the new semantics whatever they are (no point in bikeshedding syntax when semantics are yet to be defined).
>
> --- Bruce
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pythonchb at gmail.com Wed Jan 23 16:16:18 2019
From: pythonchb at gmail.com (Christopher Barker)
Date: Wed, 23 Jan 2019 13:16:18 -0800
Subject: [Python-ideas] Backtick expression: similar to a shorter lambda syntax
In-Reply-To: 
References: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com>
Message-ID: 

> > The only thing that I can think of is that you want `foo + ^bar` to be another way of writing lambda bar: foo + bar with some under-specified behavior for evaluating foo and different under-specified behavior for evaluating bar.
>
> That is what `lambda bar: foo + ^bar` means.

Now you have a lambda inside a backtick expression?!?

> A caret in a backtick expression indicates that the name after the caret is a parameter. All names with the same name must have a caret before them. Mandatory parameters can be passed in as keyword arguments or as positional ones.

So are all the ^ names positional parameters? And only them? And are they in the order they first appear in the expression? How about when there are parentheses, so the order evaluated may not be the same as the order written?

Anyway, IIUC, this is a way to write a lambda with less typing, and harder to read. And fewer features? Any way to do keyword (with default) parameters?

"Explicit is better than implicit."

So -1 from me.

-CHB

> As for the under-specification, I've been working on an example implementation I'll send soon for backtick expressions.
>
> I've also been doing the "look for use cases in stdlib" thing that Jonathan and Steve mentioned.
>
> On Wed, Jan 23, 2019 at 3:02 AM Bruce Leban wrote:
>
>> On Sun, Jan 20, 2019 at 6:43 PM James Lu wrote:
>>
>>> Backtick expressions work exactly like lambdas, except that they are bound to the instance they are created in every time that class is used to create one. To illustrate, ...
>>
>> First, if there is a useful procedure, I am strongly against using backticks because (1) it's been used in the past with an entirely different meaning and (2) it looks ugly and is not visually suggestive at all of what it does, especially not the subtle difference from other function definitions.
>>
>> Second, I don't understand exactly what this difference is or why it would be useful. It would help for you to give examples comparing lambda and this variation.
>>
>> Third, you mention using ^ in "explicit" expressions to refer to parameters of the "created function" and I do not know what function you are referring to or what the exact semantics of this are. Again, a comparison of two expressions with and without that ^ would help. An expression is not a function and not all expressions are written inside functions. (And as to the specific proposed syntax, there already is the ^ xor operator and the most expected meaning of ^value is ~value, just as the unary + and - operators correspond to the binary operators.)
>>
>> The only thing that I can think of is that you want `foo + ^bar` to be another way of writing lambda bar: foo + bar with some under-specified behavior for evaluating foo and different under-specified behavior for evaluating bar.
>> Finally, if there is some other useful semantics for references inside a function definition, then I would think the best way to do that is to implement that, not add a new function difference. For example,
>>
>>     lambda foo: foo + $bar
>>
>>     def sample(foo):
>>         return foo + $foo
>>
>> where I'm arbitrarily using $ to represent the new semantics whatever they are (no point in bikeshedding syntax when semantics are yet to be defined).
>>
>> --- Bruce
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-- 
Christopher Barker, PhD

Python Language Consulting
- Teaching
- Scientific Software Development
- Desktop GUI and Web Development
- wxPython, numpy, scipy, Cython
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ricocotam at gmail.com Wed Jan 23 16:20:02 2019
From: ricocotam at gmail.com (Adrien Ricocotam)
Date: Wed, 23 Jan 2019 22:20:02 +0100
Subject: [Python-ideas] Potential PEP: with/except
In-Reply-To: <20190123012826.GG13616@ando.pearwood.info>
References: <20190123012826.GG13616@ando.pearwood.info>
Message-ID: 

I have a neutral feeling about the proposal, but I'd like to suggest something. We can extend the try/with to other blocks, as suggested. What could be done to prevent any ambiguity is:

    try with blabla as blabla2:
        ...
    except:
        ...

Which is equivalent to:

    try:
        with blabla as blabla2:
            ...
    except:
        ...

Any other combination should be explicit, except if there's a nice syntax, but I didn't find it. Extending to other blocks would give:

    try for ... in ...:
        ...
    except:
        ...

But if we use the "else" here, what's going on? It's a one-line saver (which is useless), but having many indented blocks doesn't produce readable code in my opinion. Saving one indented block while keeping things clear is a good thing in my opinion. As long as it stays clear (which is not the case in the for block). But adding this feature to the with block without using it for other blocks is a bit strange, imho.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From bruce at leban.us Wed Jan 23 18:11:07 2019
From: bruce at leban.us (Bruce Leban)
Date: Wed, 23 Jan 2019 15:11:07 -0800
Subject: [Python-ideas] Backtick expression: similar to a shorter lambda syntax
In-Reply-To: 
References: <9E015565-9F0C-41D9-B171-275D2FDC654D@gmail.com>
Message-ID: 

On Wed, Jan 23, 2019 at 1:07 PM James Lu wrote:

> Backtick expressions (now) use the same scoping and same binding rules as other functions.

What do you mean by "now"?? There are no backtick expressions in Python anymore and they were never functions.

> The only difference is that
> class Class:
>     staticmethod = `...`
>     staticmethod = lambda: ...
>     def instancemethod = `...`  # an instancemethod that's called with self passed in
>     def property = property(`...`)  # an instancemethod that's called with self passed in

You seem to be inventing new syntax as you go. And you haven't told us how the first two above differ.

> > The only thing that I can think of is that you want `foo + ^bar` to be another way of writing lambda bar: foo + bar with some under-specified behavior for evaluating foo and different under-specified behavior for evaluating bar.
>
> That is what `lambda bar: foo + ^bar` means.

I have no idea what this means. You're giving syntax without semantics.
The word "that" in your sentence is an unbound reference. > > A caret in a backtick expression indicates that the name after the caret > is a parameter. All names with the same name must have a caret before them. > Mandatory parameters can be passed in as keyword arguments or as positional > ones. > In a word, ick. So to find the parameters for a function, I need to scan through the entire text of the function looking for ^? And you think that's an improvement? You've given no explanation of how this is better and saving typing the word lambda isn't enough. All in all this sounds like "but these go to 11" justification and that really is insufficient. I'm -11. --- Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From daveshawley at gmail.com Thu Jan 24 08:08:29 2019 From: daveshawley at gmail.com (David Shawley) Date: Thu, 24 Jan 2019 08:08:29 -0500 Subject: [Python-ideas] Potential PEP: with/except In-Reply-To: <20190123012826.GG13616@ando.pearwood.info> References: <20190123012826.GG13616@ando.pearwood.info> Message-ID: On Jan 22, 2019, at 8:28 PM, Steven D'Aprano wrote: > > I've been thinking more about this proposal, and realised why I've been > feeling a slight sense of disquiet about it. I think it encourages an > anti-pattern of catching too much. (Or at least a code smell.) > > Although we're all guilty of violating this principle from time to time, > in general we ought to surround the minimum amount of code with a > try...except that we need. Ideally (but rarely possible in practice) we > want to surround a single operation which might raise at a time. > I ended up at a similar conclusion this morning. I'm wary of enclosing too much code in the try block since it makes handling exceptions very difficult unless you have a rich hierarchy of exceptions -- something that I also warn against. I view the problem of enclosing too much in the try block as an extension of an overly broad except block. It makes it nearly impossible to know where the exception came from so you cannot handle it safely unless you have an exception hierarchy where each exception is raised from a single place in the code. A similar anti-pattern that I see regularly is to *catch too often* where there is a try-catch block every time that a method from another module or library is called and the catch portion translates the exception instead of handling it. It's interesting that the "exception translator" pattern leads to an overly rich exception hierarchy. It is interesting that you mention whether the try-catch would wrap __enter__ or not. The main reason that I am -1 on this proposal is that it introduces more ambiguity in what is happening. I was about to write pretty much what you have written with regards to the syntax muddying the waters about the scope of exception handling especially with regards to the meaning of the return value of `context_manager.__exit__` > But the alternative is to enshrine in syntax something which *by design* > guards too much and is a code smell. > Very well put ;) - dave -- Safe wiring is not something to be learned after the fire trucks have left. -------------- next part -------------- An HTML attachment was scrubbed... 
From guettliml at thomas-guettler.de  Sat Jan 26 08:04:12 2019
From: guettliml at thomas-guettler.de (Thomas Güttler Lists)
Date: Sat, 26 Jan 2019 14:04:12 +0100
Subject: [Python-ideas] kwargs for return
Message-ID: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de>

I know this is going to get rejected, but I want to put the idea out
there nevertheless. I have loved kwargs and named arguments from the
beginning (roughly 18 years now).

I guess you have come across this several times before: in the first
version of an API, one variable gets returned.

Example:

    status = backend.transmit_data()

But later you want to add something to the API.

For the input part of a method this is solved: you can add an optional
kwarg. But for the output part of a method, you cannot change the
interface so easily.

Use case: you want to add an optional list of messages which could get
returned. You want to change to:

    status, messages = backend.transmit_data()

If you have 10 different backend implementations, then you need to
change all of them. This is difficult if the backends live in different
repos, and maybe you don't even own some of those repos. The drawback:
you need to change all of them at once.

Of course you could work around it by using:

    status_messages = backend.transmit_data()

and then doing some fancy guessing about whether the variable contains
only the status or a tuple of status and messages.

Some days ago I had the idea that kwargs for return would help here. It
should handle both cases:

Case 1: The old backend returns only the status, and the caller wants
both status and messages. Somehow the default for messages needs to be
defined; in my case it would be the empty list.

Case 2: The new backend returns the status and messages. The old caller
just gets the status; the messages get discarded.

Above is the use case. What could kwargs for return look like? Maybe
like this:

    .....

Sorry, I could not find a nice, clean and simple syntax for this up to
now. Maybe someone else is more creative than I am.

What do you think about this?

Regards,
  Thomas Güttler

--
Thomas Guettler http://www.thomas-guettler.de/
I am looking for feedback:
https://github.com/guettli/programming-guidelines

From steve at pearwood.info  Sat Jan 26 08:51:06 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 27 Jan 2019 00:51:06 +1100
Subject: [Python-ideas] kwargs for return
In-Reply-To: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de>
References: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de>
Message-ID: <20190126135105.GG26605@ando.pearwood.info>

On Sat, Jan 26, 2019 at 02:04:12PM +0100, Thomas Güttler Lists wrote:

> Example:
>
>     status = backend.transmit_data()
>
> But later you want to add something to the API.
[...]
> What could kwargs for return look like?

    return {'status': True, 'messages': []}

Or perhaps better:

    return ResultObject(status=True, messages=[])

I don't see anything here that can't be done by returning a dict, a
namedtuple (possibly with optional fields), or some other object with
named fields. They can be optional, they can have defaults, and you can
extend the object by adding new fields without breaking backwards
compatibility.
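For instance, a minimal sketch of such a result type, assuming
hypothetical field names and that the API had returned an object like
this from the start:

    from typing import NamedTuple

    class Result(NamedTuple):
        # New optional fields can be appended later without breaking
        # callers that were written against the older shape.
        status: bool
        messages: tuple = ()

    def transmit_data():
        return Result(status=True)  # a backend that knows nothing of messages

    r = transmit_data()
    print(r.status)    # old-style callers read just the status
    print(r.messages)  # newer callers see () when a backend supplies nothing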
--
Steve

From boxed at killingar.net  Sat Jan 26 09:29:59 2019
From: boxed at killingar.net (Anders Hovmöller)
Date: Sat, 26 Jan 2019 15:29:59 +0100
Subject: [Python-ideas] kwargs for return
In-Reply-To: <20190126135105.GG26605@ando.pearwood.info>
References: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de>
 <20190126135105.GG26605@ando.pearwood.info>
Message-ID: 

> I don't see anything here that can't be done by returning a dict, a
> namedtuple (possibly with optional fields), or some other object with
> named fields. They can be optional, they can have defaults, and you can
> extend the object by adding new fields without breaking backwards
> compatibility.

That assumes you knew beforehand to do that. The question is about the
normal situation where you didn't.

Also, you totally disregarded the call site, where there is no way to do
a nice dict unpacking in Python. The tuple case is super special and
convenient, but strictly worse than having properly named fields.

To me this question sounds like it's about dict unpacking, with one
special case to keep backwards compatibility. This should be possible
with a simple dict subclass in some cases...

/ Anders

From mike at selik.org  Sat Jan 26 10:14:06 2019
From: mike at selik.org (Michael Selik)
Date: Sat, 26 Jan 2019 07:14:06 -0800
Subject: [Python-ideas] kwargs for return
In-Reply-To: 
References: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de>
 <20190126135105.GG26605@ando.pearwood.info>
Message-ID: 

On Sat, Jan 26, 2019, 6:30 AM Anders Hovmöller wrote:

> To me this question sounds like it's about dict unpacking, with one
> special case to keep backwards compatibility.

My "destructure" module might help. I was playing around with the idea
of dict unpacking and extended it to a kind of case matching.
https://github.com/selik/destructure

Grant Jenks independently came to almost the same idea and
implementation.

From steve at pearwood.info  Sat Jan 26 10:30:11 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 27 Jan 2019 02:30:11 +1100
Subject: [Python-ideas] kwargs for return
In-Reply-To: 
References: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de>
 <20190126135105.GG26605@ando.pearwood.info>
Message-ID: <20190126153010.GH26605@ando.pearwood.info>

On Sat, Jan 26, 2019 at 03:29:59PM +0100, Anders Hovmöller wrote:

> > I don't see anything here that can't be done by returning a dict, a
> > namedtuple (possibly with optional fields), or some other object with
> > named fields. They can be optional, they can have defaults, and you
> > can extend the object by adding new fields without breaking backwards
> > compatibility.
>
> That assumes you knew beforehand to do that. The question is about the
> normal situation where you didn't.
Exactly the same can be said about the given scenario with or without
this hypothetical "kwargs for return".

Thomas talks about having to change a bunch of backends. Okay, but he
still has to change them to use "kwargs for return", because they're not
using it yet. So there is no difference here.

The point is, you can future-proof your API *right now*, today, without
waiting for "kwargs for return" to be added to Python 3.8 or 3.9 or
5000. Return a dict or some object with named fields. Then you can add
new fields to the object in the future without breaking backwards
compatibility again, since callers that don't expect the new fields will
simply ignore them.

Of course, if we aren't doing that *yet*, then doing so for the first
time will be a breaking change. But that can be said about any time we
change our mind about what we're doing and do something different.

> Also, you totally disregarded the call site, where there is no way to
> do a nice dict unpacking in Python.

It wasn't clear to me that Thomas is talking about dict unpacking. It
still isn't. He makes the analogy with passing keyword arguments to a
function, where they are collected in a **kwargs dict. That parameter
isn't automatically unpacked; you get a dict. So I expect that "kwargs
for return" should work the same way: it returns a dict. If you want to
unpack it, you can unpack it yourself in any way you see fit.

But perhaps you are correct, and Thomas actually is talking about dict
unpacking and not "kwargs for return". Perhaps if he had spent more time
demonstrating what he wanted to do with some pseudo-code, and less
explaining why he wanted to do it, I might have found his intention more
understandable.

> The tuple case is super special and
> convenient, but strictly worse than having properly named fields.

In what way is it worse, given that returning a namedtuple with named
fields is backwards compatible with returning a regular tuple? We can
have our cake and eat it too. Unless the caller does a type-check, there
is no difference. Sequence unpacking will still work, and namedtuples,
unlike regular tuples, can support optional attributes.

> To me this question sounds like it's about dict unpacking, with one
> special case to keep backwards compatibility. This should be possible
> with a simple dict subclass in some cases...

This is hardly the first time dict unpacking has been proposed. Each
time, the proposals flounder and go nowhere. What is this "simple dict
subclass" that is going to solve the problem?

Perhaps you should start by telling us *precisely* what the problem is
that your subclass will solve. Because I don't know what your idea of
dict unpacking is, and how it compares or differs from the previous
times it has been proposed.

Are there any other languages which support dict unpacking? How does it
work there?

--
Steve

From mertz at gnosis.cx  Sat Jan 26 11:49:50 2019
From: mertz at gnosis.cx (David Mertz)
Date: Sat, 26 Jan 2019 11:49:50 -0500
Subject: [Python-ideas] kwargs for return
In-Reply-To: <20190126135105.GG26605@ando.pearwood.info>
References: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de>
 <20190126135105.GG26605@ando.pearwood.info>
Message-ID: 

I was going to write exactly the same thing Steven did.

Right now you can simply design APIs to return dictionaries or, maybe
better, namedtuples. Namedtuples are really nice since you can define
new attributes when you upgrade an API without breaking any old code
that used the prior attributes...
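For example, a sketch of that upgrade path, using the `defaults`
parameter that namedtuple grew in Python 3.7 (the field names here are
hypothetical):

    from collections import namedtuple

    # Version 1 of the API:
    Result = namedtuple("Result", ["status"])

    # Version 2 appends an optional field; defaults apply to the
    # rightmost fields, so old constructor calls keep working:
    Result = namedtuple("Result", ["status", "messages"], defaults=[()])

    r = Result(True)
    print(r.status)    # code written against version 1 is unaffected
    print(r.messages)  # new code sees () unless more is supplied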
Of course, you can only add more, not remove old ones, to assure
compatibility. Unlike dictionaries, namedtuples cannot contain arbitrary
"keywords" at runtime, which is either good or bad depending on your
purposes.

Recently, dataclasses are also an option. They are cool, but I haven't
yet had a reason to use them. They feel heavier than namedtuples, though
(as a programming construct -- not talking about memory usage or speed
or whatever).

On Sat, Jan 26, 2019, 8:52 AM Steven D'Aprano wrote:

> On Sat, Jan 26, 2019 at 02:04:12PM +0100, Thomas Güttler Lists wrote:
>
> > Example:
> >
> >     status = backend.transmit_data()
> >
> > But later you want to add something to the API.
> [...]
> > What could kwargs for return look like?
>
>     return {'status': True, 'messages': []}
>
> Or perhaps better:
>
>     return ResultObject(status=True, messages=[])
>
> I don't see anything here that can't be done by returning a dict, a
> namedtuple (possibly with optional fields), or some other object with
> named fields. They can be optional, they can have defaults, and you can
> extend the object by adding new fields without breaking backwards
> compatibility.

From wes.turner at gmail.com  Sat Jan 26 11:59:36 2019
From: wes.turner at gmail.com (Wes Turner)
Date: Sat, 26 Jan 2019 11:59:36 -0500
Subject: [Python-ideas] kwargs for return
In-Reply-To: <20190126153010.GH26605@ando.pearwood.info>
References: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de>
 <20190126135105.GG26605@ando.pearwood.info>
 <20190126153010.GH26605@ando.pearwood.info>
Message-ID: 

On Saturday, January 26, 2019, Steven D'Aprano wrote:

> Perhaps you should start by telling us *precisely* what the problem is
> that your subclass will solve. Because I don't know what your idea of
> dict unpacking is, and how it compares or differs from the previous
> times it has been proposed.

Dataclass initialization may currently be the most useful syntactic
sugar for a dict return-value contract that specifies variable names
(and datatypes). Is there a better way than dataclasses to specify a
return-object interface with type annotations that throws exceptions at
runtime?

> Are there any other languages which support dict unpacking? How does it
> work there?

This piece about object destructuring in JS is worth a read:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Destructuring_assignment#Object_destructuring

Here are two simple cases:

"""
var o = {p: 42, q: true};
var {p: foo, q: bar} = o;

console.log(foo); // 42
console.log(bar); // true
"""

Does it throw an exception when a value is undefined? You can specify
defaults:

"""
var {a: aa = 10, b: bb = 5} = {a: 3};

console.log(aa); // 3
console.log(bb); // 5
"""
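(The closest current Python spelling of that defaults example is
manual: merge the incoming mapping over a dict of defaults, then pull
out the names you want. A sketch:

    o = {"a": 3}
    defaults = {"a": 10, "b": 5}
    merged = {**defaults, **o}
    aa, bb = merged["a"], merged["b"]
    print(aa, bb)  # 3 5

There is no assignment-target syntax for it, which is the "no nice dict
unpacking" gap Anders mentioned above.)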
From wes.turner at gmail.com  Sat Jan 26 12:01:55 2019
From: wes.turner at gmail.com (Wes Turner)
Date: Sat, 26 Jan 2019 12:01:55 -0500
Subject: [Python-ideas] kwargs for return
In-Reply-To: 
References: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de>
 <20190126135105.GG26605@ando.pearwood.info>
Message-ID: 

On Saturday, January 26, 2019, David Mertz wrote:

> I was going to write exactly the same thing Steven did.
>
> Right now you can simply design APIs to return dictionaries or, maybe
> better, namedtuples. Namedtuples are really nice since you can define
> new attributes when you upgrade an API without breaking any old code
> that used the prior attributes... Of course, you can only add more, not
> remove old ones, to assure compatibility. Unlike dictionaries,
> namedtuples cannot contain arbitrary "keywords" at runtime, which is
> either good or bad depending on your purposes.

Tuples are a dangerous (and classic, legacy) interface contract.
Namedtuples must be modified to allow additional (i.e. irrelevant to the
FUT) data through the return interface.

> Recently, dataclasses are also an option. They are cool, but I haven't
> yet had a reason to use them. They feel heavier than namedtuples,
> though (as a programming construct -- not talking about memory usage or
> speed or whatever).

From mertz at gnosis.cx  Sat Jan 26 12:30:24 2019
From: mertz at gnosis.cx (David Mertz)
Date: Sat, 26 Jan 2019 12:30:24 -0500
Subject: [Python-ideas] kwargs for return
In-Reply-To: <20190126153010.GH26605@ando.pearwood.info>
References: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de>
 <20190126135105.GG26605@ando.pearwood.info>
 <20190126153010.GH26605@ando.pearwood.info>
Message-ID: 

On Sat, Jan 26, 2019 at 10:31 AM Steven D'Aprano wrote:

> In what way is it worse, given that returning a namedtuple with named
> fields is backwards compatible with returning a regular tuple? We can
> have our cake and eat it too.
> Unless the caller does a type-check, there is no difference. Sequence
> unpacking will still work, and namedtuples, unlike regular tuples, can
> support optional attributes.

I suppose the one difference is where someone improperly relies on tuple
unpacking.

Old version:

    def myfun():
        # ...
        return a, b, c

    # Call site
    val1, val2, val3 = myfun()

New version:

    def myfun():
        # ...
        return a, b, c, d

Now the call site will get "ValueError: too many values to unpack".
Namedtuples don't solve this problem, of course.
But they don't make anything worse either.

The better approach, of course, is to document the API as only using
attribute access, not positional. I reckon dataclasses from the start
could address that concern... but so can documentation alone. E.g.:

Old version (improved):

    def myfun():
        mydata = namedtuple("mydata", "a b c")
        # ...
        return mydata(a, b, c)

    # Call site
    ret = myfun()
    val1, val2, val3 = ret.a, ret.b, ret.c

New version (improved):

    def myfun():
        mydata = namedtuple("mydata", "a b c d e")
        # ...
        return mydata(a, b, c, d, e)

Now the call site is completely happy with no changes (assuming it
doesn't need to care about what values 'ret.d' or 'ret.e' contain...
presumably those extra values are optional in some way).

Moreover, we are even perfectly fine if we had created
namedtuple("mydata", "e d c b a") for some reason, completely changing
the positions of all the named attributes in the improved namedtuple.

--
Keeping medicines from the bloodstreams of the sick; food from the
bellies of the hungry; books from the hands of the uneducated;
technology from the underdeveloped; and putting advocates of freedom in
prisons. Intellectual property is to the 21st century what the slave
trade was to the 16th.

From steve at pearwood.info  Sat Jan 26 12:47:45 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 27 Jan 2019 04:47:45 +1100
Subject: [Python-ideas] kwargs for return
In-Reply-To: 
References: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de>
 <20190126135105.GG26605@ando.pearwood.info>
Message-ID: <20190126174745.GI26605@ando.pearwood.info>

On Sat, Jan 26, 2019 at 12:01:55PM -0500, Wes Turner wrote:

> Tuples are a dangerous (and classic, legacy) interface contract.

What?

--
Steve

From steve at pearwood.info  Sat Jan 26 12:55:52 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 27 Jan 2019 04:55:52 +1100
Subject: [Python-ideas] kwargs for return
In-Reply-To: 
References: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de>
 <20190126135105.GG26605@ando.pearwood.info>
 <20190126153010.GH26605@ando.pearwood.info>
Message-ID: <20190126175551.GJ26605@ando.pearwood.info>

On Sat, Jan 26, 2019 at 11:59:36AM -0500, Wes Turner wrote:

> This piece about object destructuring in JS is worth a read:
>
> https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Destructuring_assignment#Object_destructuring

Thanks.

--
Steve

From eric at trueblade.com  Sat Jan 26 13:09:18 2019
From: eric at trueblade.com (Eric V. Smith)
Date: Sat, 26 Jan 2019 13:09:18 -0500
Subject: [Python-ideas] kwargs for return
In-Reply-To: 
References: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de>
 <20190126135105.GG26605@ando.pearwood.info>
 <20190126153010.GH26605@ando.pearwood.info>
Message-ID: <0507ba35-a340-26c7-07d6-a1bd1359638c@trueblade.com>

On 1/26/2019 12:30 PM, David Mertz wrote:
> I suppose the one difference is where someone improperly relies on
> tuple unpacking. [...] Now the call site will get "ValueError: too
> many values to unpack". Namedtuples don't solve this problem, of
> course. But they don't make anything worse either.
>
> The better approach, of course, is to document the API as only using
> attribute access, not positional. I reckon dataclasses from the start
> could address that concern... but so can documentation alone.

Preventing this automatic unpacking (and preventing iteration in
general) was one of the motivating factors for dataclasses:
https://www.python.org/dev/peps/pep-0557/#id47

Eric

From mertz at gnosis.cx  Sat Jan 26 13:12:36 2019
From: mertz at gnosis.cx (David Mertz)
Date: Sat, 26 Jan 2019 13:12:36 -0500
Subject: [Python-ideas] kwargs for return
In-Reply-To: <0507ba35-a340-26c7-07d6-a1bd1359638c@trueblade.com>
References: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de>
 <20190126135105.GG26605@ando.pearwood.info>
 <20190126153010.GH26605@ando.pearwood.info>
 <0507ba35-a340-26c7-07d6-a1bd1359638c@trueblade.com>
Message-ID: 

Indeed! I promise to use dataclass next time I find myself about to use
namedtuple. :-) I'm pretty sure that virtually all my uses will allow
that.

On Sat, Jan 26, 2019, 1:09 PM Eric V. Smith wrote:

> Preventing this automatic unpacking (and preventing iteration in
> general) was one of the motivating factors for dataclasses:
> https://www.python.org/dev/peps/pep-0557/#id47

From pythonchb at gmail.com  Sat Jan 26 13:20:11 2019
From: pythonchb at gmail.com (Christopher Barker)
Date: Sat, 26 Jan 2019 10:20:11 -0800
Subject: [Python-ideas] kwargs for return
In-Reply-To: <20190126174745.GI26605@ando.pearwood.info>
References: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de>
 <20190126135105.GG26605@ando.pearwood.info>
 <20190126174745.GI26605@ando.pearwood.info>
Message-ID: 

I think I "get" what Thomas is talking about here:

Starting with the simplest example, when defining a function, you can
have it take a single positional parameter:

    def fun(x):
        ...

and you can have code all over the place that calls it:

    fun(something)

Later on, if you want to expand the API, you can add a keyword
parameter:
So: IIUC, Thomas's idea is that there be some way to have"optional" return values, stabbing at a possible syntax to make the case: Original: def fun(): return 5 called as: x = fun() Updated: def fun() return 5, *, 6 Now it can still be called as: x = fun() and result in x == 5 or: x, y = fun() and result in x == 5, y == 6 So: syntax asside, I'm not sure how this could be implemented -- As I understand it, functions return either a single value, or a tuple of values -- there is nothing special about how assignment is happening when a function is called. That is: result = fun() x = result is exactly the same as: x = fun() So to impliment this idea, functions would have to return an object that would act like a single object when assigned to a single name: x = fun() but an unpackable object when assigned to multiple names: x, y = fun() but then, if you had this function: def fun() return x, *, y and you called it like so: result = fun() x, y = result either x, y = result would fail, or result would be this "function return object", rather than the single value. I can't think of any way to resolve that problem without really breaking the language. -CHB On Sat, Jan 26, 2019 at 9:48 AM Steven D'Aprano wrote: > On Sat, Jan 26, 2019 at 12:01:55PM -0500, Wes Turner wrote: > > > Tuples are a dangerous (and classic, legacy) interface contract. > > What? > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... URL: From pythonchb at gmail.com Sat Jan 26 13:26:16 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Sat, 26 Jan 2019 10:26:16 -0800 Subject: [Python-ideas] kwargs for return In-Reply-To: References: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de> <20190126135105.GG26605@ando.pearwood.info> <20190126153010.GH26605@ando.pearwood.info> <0507ba35-a340-26c7-07d6-a1bd1359638c@trueblade.com> Message-ID: On Sat, Jan 26, 2019 at 10:13 AM David Mertz wrote: > Indeed! I promise to use dataclass next time I find myself about to use > namedtuple. :-) > Indeed IIUC, namedtuple was purposely designed to be able to replace tuples as well as adding the named access. But that does indeed cause potential issues. However, dataclasses see kind of heavyweight to me -- am I imagining that, or could one make a named_not_tuple that was appreciably lighter weight? (in creation time, memory use, that sort of thing) -CHB > > I'm pretty sure that virtually all my uses will allow that. > > On Sat, Jan 26, 2019, 1:09 PM Eric V. Smith >> >> >> On 1/26/2019 12:30 PM, David Mertz wrote: >> > On Sat, Jan 26, 2019 at 10:31 AM Steven D'Aprano > > > wrote: >> > >> > In what way is it worse, given that returning a namedtuple with >> named >> > >> > fields is backwards compatible with returning a regular tuple? We >> can >> > have our cake and eat it too. >> > Unless the caller does a type-check, there is no difference. >> Sequence >> > unpacking will still work, and namedtuples unlike regular tuples can >> > support optional attributes. >> > >> > >> > I suppose the one difference is where someone improperly relies on >> tuple >> > unpacking. 
>> > >> > Old version: >> > >> > def myfun(): >> > # ... >> > return a, b, c >> > >> > # Call site >> > val1, val2, val3 = myfun() >> > >> > >> > New version: >> > >> > def myfun(): >> > # ... >> > return a, b, c, d >> > >> > >> > Now the call site will get "ValueError: too many values to unpack". >> > Namedtuples don't solve this problem, of course. But they don't make >> > anything worse either. >> > >> > The better approach, of course, is to document the API as only using >> > attribute access, not positional. I reckon dataclasses from the start >> > could address that concern... but so can documentation alone. E.g.: >> > >> > Old version (improved): >> > >> > def myfun(): >> > >> > mydata = namedtuple("mydata", "a b c") >> > >> > # ... >> > return mydata(a, b, c) >> > >> > # Call site >> > ret = myfun() >> > >> > val1, val2, val3 = ret.a, ret.b, ret.c >> > >> > >> > New version (improved) >> > >> > def myfun(): >> > >> > mydata = namedtuple("mydata", "a b c d e") >> > >> > # ... >> > return mydata(a, b, c, d, e) >> > >> > Now the call site is completely happy with no changes (assuming it >> > doesn't need to care about what values 'ret.d' or 'ret.e' contain... >> but >> > presumably those extra values are optional in some way. >> > >> > Moreover, we are even perfectly fine if we had created >> > namedtuple("mydata", "e d c b a") for some reason, completely changing >> > the positions of all the named attributes in the improved namedtuple. >> >> Preventing this automatic unpacking (and preventing iteration in >> general) was one of the motivating factors for dataclasses: >> https://www.python.org/dev/peps/pep-0557/#id47 >> >> Eric >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Sat Jan 26 13:43:44 2019 From: mertz at gnosis.cx (David Mertz) Date: Sat, 26 Jan 2019 13:43:44 -0500 Subject: [Python-ideas] kwargs for return In-Reply-To: References: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de> <20190126135105.GG26605@ando.pearwood.info> <20190126174745.GI26605@ando.pearwood.info> Message-ID: On Sat, Jan 26, 2019, 1:21 PM Christopher Barker > As I understand it, functions return either a single value, or a tuple of > values -- there is nothing special about how assignment is happening when a > function is called. > No. It's simpler than that! Functions return a single value, period. That single value might happen to be a tuple or something else unpackable. This makes it feel like we have multiple return values, but we never actually do. The fact that "tuples are spelled by commas not by parentheses" makes this distinction easy to ignore most of the time. -------------- next part -------------- An HTML attachment was scrubbed... 
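(One way to hand-roll something in that "lighter than a dataclass"
direction is a bare __slots__ class: named attribute access, no
per-instance __dict__, and none of the tuple behaviours. A sketch, with
hypothetical fields:

    class Person:
        # No per-instance __dict__, and no iteration or indexing.
        __slots__ = ("name", "age", "address")

        def __init__(self, name, age, address):
            self.name = name
            self.age = age
            self.address = address

    p = Person("Ada", 36, "London")
    print(p.name, p.age, p.address)

Whether it is appreciably lighter in creation time than a dataclass is
not something this sketch settles.)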
From mertz at gnosis.cx  Sat Jan 26 13:59:32 2019
From: mertz at gnosis.cx (David Mertz)
Date: Sat, 26 Jan 2019 13:59:32 -0500
Subject: [Python-ideas] kwargs for return
In-Reply-To: 
References: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de>
 <20190126135105.GG26605@ando.pearwood.info>
 <20190126153010.GH26605@ando.pearwood.info>
 <0507ba35-a340-26c7-07d6-a1bd1359638c@trueblade.com>
Message-ID: 

I'm not certain of memory usage, but using 'make_dataclass' makes the
"noise" pretty much no worse than namedtuple:

    Person = namedtuple("Person", "name age address")
    Person = make_dataclass("Person", "name age address".split())

Unless you have millions of these objects, memory probably isn't that
important. But I guess you might... and namedtuple did sell itself as
"less memory than small dictionaries".

On Sat, Jan 26, 2019, 1:26 PM Christopher Barker wrote:

> Indeed IIUC, namedtuple was purposely designed to be able to replace
> tuples as well as adding the named access.
>
> But that does indeed cause potential issues. However, dataclasses seem
> kind of heavyweight to me -- am I imagining that, or could one make a
> named_not_tuple that was appreciably lighter weight? (in creation time,
> memory use, that sort of thing)
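(The shallow per-instance sizes are easy to check directly; note this is
CPython-specific, and sys.getsizeof does not count the contents of the
dataclass instance's separate __dict__:

    import sys
    from collections import namedtuple
    from dataclasses import make_dataclass

    PersonNT = namedtuple("Person", "name age address")
    PersonDC = make_dataclass("Person", "name age address".split())

    nt = PersonNT("Ada", 36, "London")
    dc = PersonDC("Ada", 36, "London")

    print(sys.getsizeof(nt))
    print(sys.getsizeof(dc), sys.getsizeof(dc.__dict__))

The namedtuple instance is a single tuple-like object, while the default
dataclass instance also carries a dict.)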
URL: From steve at pearwood.info Sat Jan 26 19:01:05 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 27 Jan 2019 11:01:05 +1100 Subject: [Python-ideas] kwargs for return In-Reply-To: References: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de> <20190126135105.GG26605@ando.pearwood.info> <20190126174745.GI26605@ando.pearwood.info> Message-ID: <20190127000104.GK26605@ando.pearwood.info> On Sat, Jan 26, 2019 at 10:20:11AM -0800, Christopher Barker wrote: > My first thought was that function return tuples, so you could document > that your function should be called as such: > > x = fun()[0] > > but, alas, tuple unpacking is apparently automatically disabled for single > value tuples (how do you distinguish a tuple with a single value and the > value itself??) The time machine strikes again. We have not one but THREE ways of doing so (although two are alternate ways of spelling the same thing): py> def func(): ... return [1] ... py> (spam,) = func() # use a 1-element tuple on the left py> [spam] = func() # or a list py> spam 1 py> spam, *ignore = func() py> spam 1 py> ignore [] But if you're extracting a single value using subscripting on the right hand side, you don't need anything so fancy: py> eggs = func()[0] # doesn't matter how many items func returns py> eggs 1 -- Steve From pythonchb at gmail.com Sat Jan 26 23:24:24 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Sat, 26 Jan 2019 20:24:24 -0800 Subject: [Python-ideas] kwargs for return In-Reply-To: <20190127000104.GK26605@ando.pearwood.info> References: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de> <20190126135105.GG26605@ando.pearwood.info> <20190126174745.GI26605@ando.pearwood.info> <20190127000104.GK26605@ando.pearwood.info> Message-ID: On Sat, Jan 26, 2019 at 4:01 PM Steven D'Aprano wrote: > On Sat, Jan 26, 2019 at 10:20:11AM -0800, Christopher Barker wrote: ... > but, alas, tuple unpacking is apparently automatically disabled for single > > value tuples (how do you distinguish a tuple with a single value and the > > value itself??) > > The time machine strikes again. We have not one but THREE ways of doing > so (although two are alternate ways of spelling the same thing): > py> def func(): > ... return [1] > Sure, but this requires that you actually return something "unpackable" from the function. As David Mertz pointed out, functions always return a single value, but that value may or may not be unpackable. So the OP's desire, that you could extend a function that was originally written returning a single scalar value to instead return multiple values, and have code that expected a single value still work the same simply isn't possible (without other major changes to Python). -CHB -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... 
From cs at cskk.id.au  Sat Jan 26 23:33:15 2019
From: cs at cskk.id.au (Cameron Simpson)
Date: Sun, 27 Jan 2019 15:33:15 +1100
Subject: [Python-ideas] kwargs for return
In-Reply-To: <20190126153010.GH26605@ando.pearwood.info>
References: <20190126153010.GH26605@ando.pearwood.info>
Message-ID: <20190127043315.GA40699@cskk.homeip.net>

On 27Jan2019 02:30, Steven D'Aprano wrote:
>On Sat, Jan 26, 2019 at 03:29:59PM +0100, Anders Hovmöller wrote:
>>> I don't see anything here that can't be done by returning a dict, a
>>> namedtuple (possibly with optional fields), or some other object with
>>> named fields. They can be optional, they can have defaults, and you
>>> can extend the object by adding new fields without breaking backwards
>>> compatibility.
>>
>> That assumes you knew beforehand to do that. The question is about
>> the normal situation where you didn't.
>
>Exactly the same can be said about the given scenario with or
>without this hypothetical "kwargs for return".
>
>Thomas talks about having to change a bunch of backends. Okay, but he
>still has to change them to use "kwargs for return", because they're not
>using it yet. So there is no difference here.

I don't think so. It looks to me like Thomas' idea is to offer a
facility a little like **kw in a function definition, but for
assignment.

So in his case, he wants to have one backend start returning a richer
result _without_ bringing all the other backends up to that level. This
is particularly salient when "the other backends" includes third party
plugin facilities, where Thomas (or you or I) cannot update their
source.

So, he wants the converse of changing a function which previously was
like:

  def f(a, b):

into:

  def f(a, b, **kw):

In Python you can freely do this without changing _any_ of the places
calling your function.

So, for assignment he's got:

  result = backend.foo()

and he'd like to go to something like:

  result, **kw = richer_backend.foo()

while still letting the older, less rich backends be used in the same
assignment.

>The point is, you can future-proof your API *right now*, today, without
>waiting for "kwargs for return" to be added to Python 3.8 or 3.9 or
>5000. Return a dict or some object with named fields. [...]

Sure, but Thomas' scenario is one where the non-future-proof API is
already in the wild.

>> Also, you totally disregarded the call site, where there is no way to
>> do a nice dict unpacking in Python.
>
>It wasn't clear to me that Thomas is talking about dict unpacking. It
>still isn't. He makes the analogy with passing keyword arguments to a
>function, where they are collected in a **kwargs dict. That parameter
>isn't automatically unpacked; you get a dict.

Yeah, but with a function call, not only do you not need to unpack it at
the function-receiving end, you don't even need to _supply_ it at the
calling end, and you can still use **kw at the receiving end; it will
simply be empty.

>So I expect that "kwargs
>for return" should work the same way: it returns a dict. If you want to
>unpack it, you can unpack it yourself in any way you see fit.

Yeah. Or even *a. Like this:

Python 3.6.8 (default, Dec 30 2018, 12:58:01)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> def f(): return 3
...
>>> def f2(): return 3, 4
...
>>> *x = f()
  File "", line 1
SyntaxError: starred assignment target must be in a list or tuple
>>> a, *x = f()
Traceback (most recent call last):
  File "", line 1, in 
TypeError: 'int' object is not iterable
>>> a, *x = f2()

Of course this can't work out of the box with current Python, because
assignment unpacking expects the right hand side to be a single
unpackable entity, and:

  a, *x = ...

is just choosing to unpack only the first of the values, dropping the
rest into x.

Of course a **kw analogue is better, because it lets one associate names
with values. In terms of syntax, we can't go with:

  a, *x = ...

because precedence lets us write "bare" tuples:

  a, *x = 1, 2, 3

so the right hand side isn't 3 distinct expressions; it is one tuple
(yes, made of 3 expressions), and it is the left side choosing to unpack
it directly. However:

  a, **kw = ...

is an outright syntax error, leaving a convenient syntactic hole in
which to provide Thomas' notion. In current syntax, the right hand side
remains a single expression, and kw would always be an empty dict.

The tricky bit isn't the left side; it is what to provide on the right.

Idea: what if **kw means to unpack RHS.__dict__ (for ordinary objects),
i.e. to be filled in with the attributes of the RHS expression value?

So, Thomas' old API:

  def foo():
      return 3

and:

  a, **kw = foo()

get a=3 and kw={}. But the richer API:

  class Richness(int):

      def __init__(self, value):
          super().__init__(value)
          self.x = 'x!'
          self.y = 4

  def foo_rich():
      return Richness(3)

  a, **kw = foo_rich()

gets a=3 and kw={'x': 'x!', 'y': 4}.

I've got mixed feelings about this, but it does supply the kind of
mechanism he seems to be thinking about.

Cheers,
Cameron Simpson

From steve at pearwood.info  Sun Jan 27 01:24:32 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 27 Jan 2019 17:24:32 +1100
Subject: [Python-ideas] kwargs for return
In-Reply-To: 
References: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de>
 <20190126135105.GG26605@ando.pearwood.info>
 <20190126174745.GI26605@ando.pearwood.info>
Message-ID: <20190127062432.GL26605@ando.pearwood.info>

On Sat, Jan 26, 2019 at 10:20:11AM -0800, Christopher Barker wrote:
[...]
> Starting with the simplest example, when defining a function, you
> can have it take a single positional parameter:
[...]
> Later on, if you want to expand the API, you can add a keyword
> parameter:
>
>     def fun(x, y=None):
>         ...
>
> And all the old code that already calls that function with one argument
> still works, and newer code can optionally specify the keyword argument
> -- this is a really nice feature that makes Python very refactorable.

In the above example, the caller doesn't need to specify ``y`` as a
keyword argument; they can call fun(obj) with a single positional
argument too. Keyword arguments are a red herring here. What makes this
work is not *keyword arguments* but default values. See below.

> But for return values, there is no such flexibility

With no default value, your keyword arguments MUST be supplied, and
backwards compatibility is broken. Given a default value, the called
function merely sees the default value if no other value is given. (You
all know how this works; I trust I don't need to demonstrate.)

The symmetric equivalent for return values would be if we could supply
defaults on the assignment targets, something like this syntax (for
illustration purposes only):

    spam, eggs, cheese="cheddar", aardvark=42 = func()

Now func() must return between two and four values.
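Those hypothetical semantics can at least be emulated today with a small
helper (a hypothetical helper, sketched here for illustration): targets
without a default are mandatory, the rest fall back to their defaults,
and surplus values are rejected.

    def unpack_defaults(values, n_required, defaults):
        # n_required targets have no default; defaults supplies the
        # fallbacks for the optional targets, left to right.
        values = list(values)
        total = n_required + len(defaults)
        if len(values) < n_required:
            raise ValueError("not enough values returned")
        if len(values) > total:
            raise ValueError("too many values returned")
        return values + list(defaults[len(values) - n_required:])

    def func():
        return "spam", "eggs", "brie"

    spam, eggs, cheese, aardvark = unpack_defaults(func(), 2, ("cheddar", 42))
    print(cheese, aardvark)  # brie 42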
I trust the analogy with parameters with default values is obvious:

Calling a function:

- parameters without a default are mandatory;
- parameters with a default are optional;
- supplied arguments are bound to parameters from left to right;
- any parameters which don't get an argument have the default bound;
- if they don't have a default, it is an error.

Returning from a function (hypothetical):

- assignment targets without a default are mandatory;
- assignment targets with a default are optional;
- returned items are bound to targets from left to right;
- any targets which don't get a result have the default bound;
- if they don't have a default, it is an error.

Note that all of this is based on positional arguments, so presumably it
would use sequence unpacking and allow the equivalent of *args to
collect additional positional arguments (if any):

    spam, eggs, cheese="cheddar", aardvark=42, *extra = func()

Javascript already kind of works this way, because it has a default
value of undefined, and destructuring assignment (sequence unpacking)
assigns undefined to any variable that otherwise wouldn't get a value:

    js> var a, b = 1
    js> a === undefined
    true
    js> b
    1

But none of this has anything to do with *keyword arguments*, let alone
collecting kwargs as in the subject line. The keyword argument analogy
might suggest using some form of dict unpacking, but the complexity
ramps up even higher:

1. At the callee's end, the function returns some sort of mapping
between keys and items. For the sake of the argument, let's assume keys
must be identifiers, and invent syntax to make it easier:

    def func():
        return spam=True, eggs=42, messages=[], cheese=(1, 2)

(syntax for illustration purposes only).

2. At the caller's end, we need to supply the key, the binding target
(which may not be the same!), a possible default value, and somewhere to
stash any unexpected key:value pairs. Let's say:

    spam, eggs->foo.bar, aardvark=None, **extra = func()

might bind:

    spam = True
    foo.bar = 42
    aardvark = None
    extra = {'messages': [], 'cheese': (1, 2)}

At this point somebody will say "Why can't we make the analogy between
calling a function and returning from a function complete, and allow
*both* positional arguments / sequence unpacking *and* keyword arguments
/ dict unpacking at the same time?".

[...]
> Sure, if you had had the foresight, then you _could_ have written your
> original function to return a more flexible data structure (dict,
> NamedTuple, etc.), but, well, we usually don't have that foresight :-).

*shrug* That can apply to any part of the API. I now want to return an
arbitrary float, but I documented that I only return positive ints... if
only I had the foresight...

[...]
> So: IIUC, Thomas's idea is that there be some way to have "optional"
> return values. Stabbing at a possible syntax to make the case:
[...]
> Now it can still be called as:
>
> x = fun()
>
> and result in x == 5, or as:
>
> x, y = fun()
>
> and result in x == 5, y == 6.

How do you distinguish between these three situations?
# I don't care if you return other values, I only care about the first
x = fun()

# Don't bother unpacking the result, just give it to me as a tuple
x = fun()

# Oops, I forgot that fun() returns two values
x = fun()

--
Steve

From steve at pearwood.info  Sun Jan 27 02:41:15 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 27 Jan 2019 18:41:15 +1100
Subject: [Python-ideas] kwargs for return
In-Reply-To: <20190127043315.GA40699@cskk.homeip.net>
References: <20190126153010.GH26605@ando.pearwood.info>
 <20190127043315.GA40699@cskk.homeip.net>
Message-ID: <20190127074111.GM26605@ando.pearwood.info>

On Sun, Jan 27, 2019 at 03:33:15PM +1100, Cameron Simpson wrote:

> I don't think so. It looks to me like Thomas' idea is to offer a
> facility a little like **kw in a function definition, but for
> assignment.

Why **keyword** arguments rather than **positional** arguments? Aside
from the subject line, what part of Thomas' post hints at the analogy
with keyword arguments?

    function(spam=1, eggs=2, cheese=3)

Aside from the subject line, I'm not seeing the analogy with keyword
parameters here. If he wants some sort of dict unpacking, I don't think
he's said so. Did I miss something?

But in any case, regardless of whether he wants dict unpacking or not,
Thomas doesn't want the caller to be forced to update their calls.

Okay, let's consider the analogy carefully:

Functions that collect extra keyword args need to explicitly include a
**kwargs in their parameter list. If we write this:

    def spam(x): ...

    spam(123, foo=1, bar=2, baz=3)

we get a TypeError. We don't get foo, bar, baz silently ignored.

So if we follow this analogy, then dict unpacking needs some sort of
"collect all remaining keyword arguments" target, analogous to what we
can already do with sequences:

    foo, bar, baz, *extras = [1, 2, 3, 4, 5, 6, 7, 8]

Javascript ignores extra values:

    js> var [x, y] = [1, 2, 3, 4]
    js> x
    1
    js> y
    2

but in Python, this is an error:

    foo, bar, baz = [1, 2, 3, 4]

So given some sort of "return a mapping of keys to values":

    def spam():
        # For now, assume we simply return a dict
        return dict(messages=[], success=True)

let's gloss over the dict-unpacking syntax, whatever it is, and assume
that if a function returns a *single* key:value, and the assignment
target matches that key, it Just Works:

    success = spam()

But by analogy with **kwargs that has to be an error, since there is
nothing to collect the unused key 'messages'. It needs to be:

    success, **extras = spam()

which gives us success=True and extras={'messages': []}.

But Thomas doesn't want the caller to have to update their code either.
To do so would be analogous to having function calls start ignoring
unexpected keyword arguments:

    assert len([], foo=1, bar=2) == 0

so *not* like **kwargs at all. And it would require ignoring the Zen:

    Errors should never pass silently.
    Unless explicitly silenced.
    In the face of ambiguity, refuse the temptation to guess.

> So in his case, he wants to have one backend start returning a richer
> result _without_ bringing all the other backends up to that level. This
> is particularly salient when "the other backends" includes third party
> plugin facilities, where Thomas (or you or I) cannot update their
> source.

I've pointed out that we can solve his use-case by planning ahead and
returning an object that can hold additional, optional fields. Callers
that don't know about those fields can just ignore them. Backends that
don't know to supply optional fields can just leave them out.
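A sketch of that plan-ahead object, with hypothetical fields, where old
and new backends coexist:

    from dataclasses import dataclass, field

    @dataclass
    class TransmitResult:
        # Everything beyond status is optional and defaulted.
        status: bool
        messages: list = field(default_factory=list)

    def old_backend():
        return TransmitResult(status=True)  # leaves messages out

    def new_backend():
        return TransmitResult(status=True, messages=["retried once"])

    for backend in (old_backend, new_backend):
        result = backend()
        print(result.status, result.messages)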
Or he can wrap the backend functions in one of at least three Design Patterns made for this sort of scenario: Adaptor, Bridge or Facade, whichever is more appropriate. By decoupling the backend from the frontend, he can easily adapt the result even if he cannot change the backends directly.
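For instance, a minimal sketch of the Adaptor idea (invented names; it assumes the richer backends return a (status, messages) tuple):

    def adapt(backend):
        # Wrap any backend so the frontend always sees (status, messages).
        def transmit_data():
            result = backend.transmit_data()
            if isinstance(result, tuple):   # a newer, richer backend
                return result
            return result, []               # an older backend: status only
        return transmit_data

    # transmit = adapt(some_backend)
    # status, messages = transmit()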
So Thomas' use-case already has good solutions. But as people have repeatedly pointed out, they all require some foresight and planning.

To which my response is, yes they do. Just like you have to plan ahead and include *extra in your sequence packing assignments, or **kwargs in your function parameter list.

> So, he wants the converse of changing a function which previously was
> like:
>
>     def f(a, b):
>
> into:
>
>     def f(a, b, **kw):

Yes, but to take this analogy further, he wants to do so without having to actually add that **kw to the parameter list. So he apparently wants errors to pass silently.

Since Thomas apparently feels that neither the caller nor the callee should be expected to plan ahead, while still expecting backwards compatibility to hold even in the event of backwards incompatible changes, I can only conclude that he wants the interpreter to guess the intention of the caller AND the callee and Do The Right Thing no matter what:

http://www.catb.org/jargon/html/D/DWIM.html

(Half tongue in cheek here.)

> In Python you can freely do this without changing _any_ of the places
> calling your function.

But only because the function author has included **kw in their parameter list. If they haven't, it remains an error.

> So, for assignment he's got:
>
>     result = backend.foo()
>
> and he'd like to go to something like:
>
>     result, **kw = richer_backend.foo()
>
> while still letting the older less rich backends be used in the same
> assignment.

That would be equivalent to having unused keyword arguments (or positional arguments for that matter) just disappear into the aether, silently with no error or notice. Like in Javascript.

And what about the opposite situation, where the caller is expecting two results, but the backend only returns one? Javascript packs the extra variable with ``undefined``, but Python doesn't do that. Does Thomas actually want errors to pass silently? I don't wish to guess his intentions.

[...]
> Idea: what if **kw meant to unpack RHS.__dict__ (for ordinary objects),
> i.e. to be filled in with the attributes of the RHS expression value.
>
> So, Thomas' old API:
>
>     def foo():
>         return 3
>
> and:
>
>     a, **kw = foo()
>
> get a=3 and kw={}.

Um, no, it wouldn't do that -- it would fail, because ints don't have a __dict__. And don't forget __slots__. What about properties and other descriptors, private attributes, etc.? Is *every* attribute of an object supposed to be a separate part of the return result?

If a caller knows about the new API, how do they directly access the newer fields? You might say:

    a, x, y, **kwargs = foo()

to automatically extract a.x and a.y (as in the example class you gave below), but what if I want to give names which are meaningful at the caller's end, instead of using the names foo() supplies?

    a, counter, description, **kwargs = foo()

Now my meaningful names don't match the attributes. Nor does the order I give them. Now what happens?

> But the richer API:
>
>     class Richness(int):
>
>         def __init__(self, value):
>             super().__init__(value)
>             self.x = 'x!'
>             self.y = 4
[...]
> I've got mixed feelings about this

I don't.

-- 
Steve

From turnbull.stephen.fw at u.tsukuba.ac.jp  Sun Jan 27 06:28:00 2019
From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Sun, 27 Jan 2019 20:28:00 +0900
Subject: [Python-ideas] kwargs for return
In-Reply-To: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de>
References: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de>
Message-ID: <23629.38336.664543.488502@turnbull.sk.tsukuba.ac.jp>

Thomas Güttler Lists writes:

> Example:
>
>     status = backend.transmit_data()
>
> But later you want to add something to the [API's return value].

Common Lisp has this feature. There, it's called "multiple values". The called function uses a special form that creates a multiple value. The primary value is always available, whether the return is a normal value or a multiple value. Another special form is used to access the secondary values (and if it's not used, any secondary values are immediately discarded). In Python-like pseudo-code:

    def divide(dividend, divisor):
        # A special form, represented as syntax.
        return_multiple_values (dividend // divisor, dividend % divisor)

    def greatest_smaller_multiple(dividend, divisor):
        result = divisor * divide(dividend, divisor)
        # Secondary values are not accessible from result.
        return result

    def recreate_dividend(dividend, divisor):
        # The ``multiple_values`` builtin is magic: there is no other way
        # to "see" multiple values.
        # It returns a tuple.
        # For keyword returns, have a convention that the second value
        # is a dict or use an appropriate "UnboxedMultipleValues" class.
        vs = multiple_values(divide(dividend, divisor))
        return vs[0] * divisor + vs[1]

But Common Lisp experience suggests (1) if you want to use this feature to change the API, you generally want to do a full refactoring anyway, because (2) the called function can (and often enough to be problematic, *does*) creep in the direction of *requiring* attention to the secondary values, which can (and sometimes *does*) lead to subtle bugs at call sites that only use the primary value.

It's unclear to me that this feature can be safely used in the way your example suggests.

Steve

From ashafer at pm.me  Sun Jan 27 06:52:34 2019
From: ashafer at pm.me (Alex Shafer)
Date: Sun, 27 Jan 2019 11:52:34 +0000
Subject: [Python-ideas] Single line single expression try/except syntax
Message-ID: <2OWYwltSss6YO1wCe9OucKzE56cpC72WO_V2kLSG4rplW7epItPwJ0KNMxgi9UsDeAvpAw0_4U9OGmqXNOs4FQ==@pm.me>

Hello,

I'd like to discuss an idea I had to shorten the syntax for the common case of having a try/except/finally/else block where all of the following conditions are met:

* There is only one except block, no finally or else
* The exception is not captured in the except block, i.e. `except KeyError:` not `except KeyError as e:`
* The contents of the except block is only a single expression
* Perhaps, the expression starts with a control word such as pass, break, continue, return, raise. As it happens, everything useful I can think to do with this right now currently uses these. Unclear to me if this should be a requirement.

The syntax I envisioned would be something like the following:

    try on ValueError pass:
        some_list.remove('value')

I'm not at all attached to the `on` token specifically, but I think something is necessary there.

Other examples:

    def func():
        try on KeyError, ValueError return False:
            dict_of_lists['key'].remove('value')

    key = 'foo'
    try on KeyError raise MyApplicationError(f'{key} not found'):
        a_dict[key]

    for i in range(100):
        try on TypeError continue:
            a_list[i] += 1
            etc()

I realize this could be accomplished with context managers, but that seems like overkill to simply throw away the exception, and it would increase the overall required code length.

Thanks for your input!
Alex

From mike at selik.org  Sun Jan 27 16:03:08 2019
From: mike at selik.org (Michael Selik)
Date: Sun, 27 Jan 2019 13:03:08 -0800
Subject: [Python-ideas] Single line single expression try/except syntax
In-Reply-To: <2OWYwltSss6YO1wCe9OucKzE56cpC72WO_V2kLSG4rplW7epItPwJ0KNMxgi9UsDeAvpAw0_4U9OGmqXNOs4FQ==@pm.me>
References: <2OWYwltSss6YO1wCe9OucKzE56cpC72WO_V2kLSG4rplW7epItPwJ0KNMxgi9UsDeAvpAw0_4U9OGmqXNOs4FQ==@pm.me>

Any discussion of except expressions should reference PEP 463 and respond to the arguments there.

https://www.python.org/dev/peps/pep-0463/

On Sun, Jan 27, 2019, 3:52 AM Alex Shafer via Python-ideas <python-ideas at python.org> wrote:
> I'd like to discuss an idea I had to shorten the syntax for the common
> case of having a try/except/finally/else block [...]

From greg.ewing at canterbury.ac.nz  Sun Jan 27 18:13:01 2019
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 28 Jan 2019 12:13:01 +1300
Subject: [Python-ideas] kwargs for return
In-Reply-To: <23629.38336.664543.488502@turnbull.sk.tsukuba.ac.jp>
References: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de>
 <23629.38336.664543.488502@turnbull.sk.tsukuba.ac.jp>
Message-ID: <5C4E3AFD.3080804@canterbury.ac.nz>

Stephen J. Turnbull wrote:
> Common Lisp has this feature.

Lua also has something like this, but it's more built-in. The syntax looks deceptively similar to tuple packing and unpacking in Python, but it's magical. I wouldn't like to see that kind of magic in Python.
-- 
Greg

From benrudiak at gmail.com  Sun Jan 27 22:21:36 2019
From: benrudiak at gmail.com (Ben Rudiak-Gould)
Date: Sun, 27 Jan 2019 19:21:36 -0800
Subject: [Python-ideas] Single line single expression try/except syntax
References: <2OWYwltSss6YO1wCe9OucKzE56cpC72WO_V2kLSG4rplW7epItPwJ0KNMxgi9UsDeAvpAw0_4U9OGmqXNOs4FQ==@pm.me>

Aren't the arguments for accepting PEP 463 basically the same as the arguments for accepting assignment expressions? The current syntax is painfully verbose and people use inefficient and ad hoc constructions to get around it. Better to have a language feature to support the way that people actually want to write code.

If Guido wrote that rejection of PEP 463 then I can't help thinking that he changed his perspective between then and PEP 572 and might have accepted PEP 463 if it had been proposed more recently. (I'm aware of the drama surrounding PEP 572, but still.)

On Sun, Jan 27, 2019 at 1:04 PM Michael Selik <mike at selik.org> wrote:
> Any discussion of except expressions should reference PEP 463 and
> respond to the arguments there.
>
> https://www.python.org/dev/peps/pep-0463/

From steve at pearwood.info  Sun Jan 27 22:52:54 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 28 Jan 2019 14:52:54 +1100
Subject: [Python-ideas] Single line single expression try/except syntax
References: <2OWYwltSss6YO1wCe9OucKzE56cpC72WO_V2kLSG4rplW7epItPwJ0KNMxgi9UsDeAvpAw0_4U9OGmqXNOs4FQ==@pm.me>
Message-ID: <20190128035254.GN26605@ando.pearwood.info>

On Sun, Jan 27, 2019 at 07:21:36PM -0800, Ben Rudiak-Gould wrote:
> If Guido wrote that rejection of PEP 463 then I can't help thinking
> that he changed his perspective between then and PEP 572 and might
> have accepted PEP 463 if it had been proposed more recently.

Maybe, maybe not, but either way Michael's advice that any discussion about try/except expressions should respond to the points raised in PEP 463.

(By the way, since Guido's retirement as BDFL, any major proposal doesn't have to just convince him, but a committee of people (as yet unselected).)

-- 
Steve

From steve at pearwood.info  Sun Jan 27 23:53:28 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 28 Jan 2019 15:53:28 +1100
Subject: [Python-ideas] Single line single expression try/except syntax
In-Reply-To: <20190128035254.GN26605@ando.pearwood.info>
References: <2OWYwltSss6YO1wCe9OucKzE56cpC72WO_V2kLSG4rplW7epItPwJ0KNMxgi9UsDeAvpAw0_4U9OGmqXNOs4FQ==@pm.me>
 <20190128035254.GN26605@ando.pearwood.info>
Message-ID: <20190128045328.GP26605@ando.pearwood.info>

On Mon, Jan 28, 2019 at 02:52:54PM +1100, Steven D'Aprano wrote:
> Maybe, maybe not, but either way Michael's advice that any discussion
> about try/except expressions should respond to the points raised in
> PEP 463.

Oops, incomplete sentence... I meant that Michael's advice to respond to the PEP still applies.

-- 
Steve
From turnbull.stephen.fw at u.tsukuba.ac.jp  Mon Jan 28 02:46:23 2019
From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Mon, 28 Jan 2019 16:46:23 +0900
Subject: [Python-ideas] Single line single expression try/except syntax
In-Reply-To: <20190128035254.GN26605@ando.pearwood.info>
References: <2OWYwltSss6YO1wCe9OucKzE56cpC72WO_V2kLSG4rplW7epItPwJ0KNMxgi9UsDeAvpAw0_4U9OGmqXNOs4FQ==@pm.me>
 <20190128035254.GN26605@ando.pearwood.info>
Message-ID: <23630.45903.513708.987591@turnbull.sk.tsukuba.ac.jp>

Steven D'Aprano writes:

> (By the way, since Guido's retirement as BDFL, any major proposal
> doesn't have to just convince him, but a committee of people (as
> yet unselected).

I don't read the governance PEPs that way, FWIW. A major proposal needs to convince "Python Dev", as summarized, interpreted, and guided by either such a committee or by that committee's BDFL-Delegate. This has been the custom for quite a while; I don't think Guido has been a micro-manager in this millennium ;-), although the BDFL-Delegate custom is "only" about a decade old.

Steve

From guettliml at thomas-guettler.de  Mon Jan 28 04:03:30 2019
From: guettliml at thomas-guettler.de (Thomas Güttler)
Date: Mon, 28 Jan 2019 10:03:30 +0100
Subject: [Python-ideas] kwargs for return
In-Reply-To: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de>
References: <54454f78-73d8-1899-a93f-509b5c39d125@thomas-guettler.de>
Message-ID: <66b9e674-f92a-a467-19f0-880db29385f5@thomas-guettler.de>

Wow, thank you very much for all those answers and hints to my message.

David opened my eyes with this: Functions return a single value, period.

Yes, this means my question is not about a function, it is about assignment. Dictionary unpacking could be used for my use case. Since it does not exist, I will look at dataclasses.

Thank you very much for your feedback.

Thomas Güttler

On 26.01.19 at 14:04, Thomas Güttler Lists wrote:
> I know this is going to get rejected, but I want to speak out the idea nevertheless:
>
> I love kwargs and named arguments from the beginning (roughly 18 years now).
>
> I guess you came across this several times before:
>
> In the first version of the API one variable gets returned:
>
> Example:
>
>     status = backend.transmit_data()
>
> But later you want to add something to the API.
>
> For the input part of a method this is solved. You can add an optional kwarg.
>
> But for the output part of a method, you can't change the interface easily up to now.
>
> Use case: you want to add an optional list of messages which could get returned.
>
> You want to change to:
>
>     status, messages = backend.transmit_data()
>
> If you have 10 different backend implementations, then you need to change all of them.
>
> This is difficult, if the backends reside in different repos and maybe you even don't own some of these repos.
>
> Current draw-back: you need to change all of them at once.
>
> Of course you could work around it by using this:
>
>     status_messages = backend.transmit_data()
>
> And then do some fancy guessing if the variable contains only the status or a tuple containing status and messages.
>
> Some days ago I had the idea that kwargs for return would help here.
>
> This should handle both cases:
>
> Case1: The old backend returns only the status, and the caller
> wants both status and messages. Somehow the default
> for messages needs to be defined. In my case it would
> be the empty list.
>
> Case2: The new backends returning the status
> and messages. The old caller just gets the
> status. The messages get discarded.
>
> Above is the use case.
>
> How could kwargs for return look like?
>
> Maybe like this:
>
> .....
>
> Sorry, I could not find a nice, clean and simple syntax for this up to now.
>
> Maybe someone else is more creative than I am.
>
> What do you think about this?
>
> Regards,
>   Thomas Güttler

-- 
Thomas Guettler http://www.thomas-guettler.de/
I am looking for feedback: https://github.com/guettli/programming-guidelines

From jpic at yourlabs.org  Mon Jan 28 20:40:42 2019
From: jpic at yourlabs.org (Jamesie Pic)
Date: Tue, 29 Jan 2019 02:40:42 +0100
Subject: [Python-ideas] Add list.join() please

Hello,

During the last 10 years, Python has made steady progress in convenience for assembling strings. However, it seems to me that joining is still, when possible, the cleanest way to code string assembly.

However, I'm still sometimes confused between the different syntaxes used by join methods:

0. os.path.join takes *args
1. str.join takes a list argument; this inconsistency makes it easy to mistake for the os.path.join signature

Also, I still think that:

    '_'.join(['cancel', name])

would be more readable as such:

    ['cancel', name].join('_')

Not only would this fix both of my issues with the current status quo, but it would also be completely backward compatible, and probably not very hard to implement: just add a join method to list.

Thanks in advance for your reply

Have a great day

-- 
?

From jpic at yourlabs.org  Mon Jan 28 20:49:50 2019
From: jpic at yourlabs.org (Jamesie Pic)
Date: Tue, 29 Jan 2019 02:49:50 +0100
Subject: [Python-ideas] Add list.join() please

PS: sorry for my silly example, I know that example could also be written f'cancel_{name}', which is awesome, thank you for that! But for more complex strings I'm trying to avoid:

    def foo():
        return textwrap.dedent(f'''
            some
            {more(complex)}
            {st.ri("ng")}
        ''').strip()

For some reason, I prefer:

    def foo():
        return '\n'.join(['some', more(complex), st.ri('ng')])

But this would be even more readable (less nesting of statements):

    def foo():
        return ['some', more(complex), st.ri('ng')].join('\n')

Hope this makes sense

Have a great day

From mertz at gnosis.cx  Mon Jan 28 21:22:37 2019
From: mertz at gnosis.cx (David Mertz)
Date: Mon, 28 Jan 2019 21:22:37 -0500
Subject: [Python-ideas] Add list.join() please

On Mon, Jan 28, 2019 at 8:44 PM Jamesie Pic <jpic at yourlabs.org> wrote:

>     ['cancel', name].join('_')

This is a frequent suggestion. It is also one that makes no sense whatsoever if you think about Python's semantics. What would you expect to happen with this line:

    ['foo', b'foo', 37, re.compile('foo')].join('_')

Lists are not restricted to containing only strings (or things that are string-like enough that they might play well with joining). Growing a method that pertains only to that specialized sort of list breaks the mental model of Python. Moreover, there is no way to TELL if a particular list is a "list of strings" other than checking each item inside it (unlike in many languages).
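A quick sketch of that last point (illustrative code only):

    import re

    items = ['foo', b'foo', 37, re.compile('foo')]

    # The only way to know whether this is a "list of strings" is to
    # inspect every element:
    print(all(isinstance(item, str) for item in items))   # False

    # '_'.join(items) would raise TypeError at the first non-string item.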
-- 
Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

From brenbarn at brenbarn.net  Tue Jan 29 00:05:00 2019
From: brenbarn at brenbarn.net (Brendan Barnwell)
Date: Mon, 28 Jan 2019 21:05:00 -0800
Subject: [Python-ideas] Add list.join() please
Message-ID: <5C4FDEFC.6070902@brenbarn.net>

On 2019-01-28 18:22, David Mertz wrote:
> On Mon, Jan 28, 2019 at 8:44 PM Jamesie Pic wrote:
>
>     ['cancel', name].join('_')
>
> This is a frequent suggestion. It is also one that makes no sense
> whatsoever if you think about Python's semantics. [...]

That problem already exists with str.join though. It's just currently spelled this way:

    ','.join(['foo', b'foo', 37, re.compile('foo')])

. . . and the result is an error. I don't see how it's semantically any less sensible to call list.join on a list of non-string things than it is to pass a list of non-string things to str.join.

Personally, what I find perverse is that .join is a method of strings but does NOT call str() on the items to be joined. The cases where I would have been surprised or bitten by something accidentally being converted to a string are massively outweighed by the cases where I want everything to be converted into a string, because, dangit, I'm joining them into a bigger string.

I agree that a list method would be nice, but we then have to think about whether we should add similar methods to all iterable types, since str.join can take any iterable (not just a list).

-- 
Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail."
--author unknown

From ronmarti18 at gmail.com  Tue Jan 29 00:36:45 2019
From: ronmarti18 at gmail.com (Ronie Martinez)
Date: Tue, 29 Jan 2019 13:36:45 +0800
Subject: [Python-ideas] Add list.join() please
In-Reply-To: <5C4FDEFC.6070902@brenbarn.net>
References: <5C4FDEFC.6070902@brenbarn.net>

If there is a more Pythonic way of joining lists, tuples, sets, etc., it is by using a keyword and not a method. For example, using a keyword, say *joins*:

    '-' joins ['list', 'of', 'strings']

This is more readable than using the method join(), since you can read it as "dash joins a list of strings". Although the current method of joining lists is almost similar to this, it is somewhat "confusing" for beginners or for people who came from other languages.

BTW, this is just what comes to my mind and is not supported by Python.

On Tue, Jan 29, 2019 at 1:22 PM Brendan Barnwell <brenbarn at brenbarn.net> wrote:
> On 2019-01-28 18:22, David Mertz wrote:
> [...]
From mertz at gnosis.cx  Tue Jan 29 00:47:34 2019
From: mertz at gnosis.cx (David Mertz)
Date: Tue, 29 Jan 2019 00:47:34 -0500
Subject: [Python-ideas] Add list.join() please
In-Reply-To: <5C4FDEFC.6070902@brenbarn.net>
References: <5C4FDEFC.6070902@brenbarn.net>

On Tue, Jan 29, 2019, 12:22 AM Brendan Barnwell wrote:

>> What would you expect to happen with this line:
>>
>>     ['foo', b'foo', 37, re.compile('foo')].join('_')
>
> That problem already exists with str.join though. It's just currently
> spelled this way:
>
>     ','.join(['foo', b'foo', 37, re.compile('foo')])
>
> . . . and the result is an error. I don't see how it's semantically
> any less sensible to call list.join on a list of non-string things
> than it is to pass a list of non-string things to str.join.

This feels like an important asymmetry to me. There is a difference between the object itself being the wrong kind of thing and the arguments to a method being wrong.

In the first case, the object (a heterogeneous list) can NEVER support a .join() method. It's simply the wrong kind of object. Of course, it's right as far as the basic type system goes, but its deeper (maybe "structural") type cannot support that method.

On the other hand, sure, almost any function, including methods, will choke on bad arguments. But no string *object* rules out joining if good arguments can be found.

I am sure readers will immediately reply, "what about list.sort()?" Unfortunately, that really will simply fail on lists of the wrong "type."
After all these years, I still think that change in Python 2.3 or so was the wrong choice (for those with fewer gray hairs: when the hills were young, Python objects were arbitrarily comparable under inequality, even when the answer didn't "mean" anything).

I actually agree that a 'cast_to_string_and_join()' function sounds useful. Of course, you can write one easily enough; it doesn't need to be a method. For that matter, I think I'd probably rather that str.join() was simply a function in the string module or somewhere similar, with a signature like 'join(delim, iter_of_strings)'.

From tahafut at gmail.com  Tue Jan 29 00:49:34 2019
From: tahafut at gmail.com (Henry Chen)
Date: Mon, 28 Jan 2019 21:49:34 -0800
Subject: [Python-ideas] Add list.join() please
References: <5C4FDEFC.6070902@brenbarn.net>

One could always write

    str.join('_', ['list', 'of', 'strings'])

I'm not advocating for this syntax, but perhaps it is clarifying. Also, a quick search finds this thread from 20 years ago on this very issue: https://mail.python.org/pipermail/python-dev/1999-June/095366.html

On Mon, Jan 28, 2019 at 9:37 PM Ronie Martinez <ronmarti18 at gmail.com> wrote:
> If there is a more Pythonic way of joining lists, tuples, sets, etc.,
> it is by using a keyword and not a method. [...]
From rosuav at gmail.com  Tue Jan 29 01:04:23 2019
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 29 Jan 2019 17:04:23 +1100
Subject: [Python-ideas] Add list.join() please
References: <5C4FDEFC.6070902@brenbarn.net>

On Tue, Jan 29, 2019 at 4:48 PM David Mertz <mertz at gnosis.cx> wrote:
> In the first case, the object (a heterogeneous list) can NEVER support
> a .join() method. It's simply the wrong kind of object. [...]
> I am sure readers will immediately reply, "what about list.sort()?"
> Unfortunately, that really will simply fail on lists of the wrong
> "type."

Considering that you can provide a key function to sort(), there is by definition no list of objects which utterly cannot be sorted.

That said, though, I don't think this is an overly strong argument. The main reason lists don't have a join method is that str.join() can take *any iterable*, so it's perfectly legal to join tuples or generators without needing to listify them. Consider:

    # Join the parts, ignoring empty ones
    "_".join(filter(None, parts))

    c = collections.Counter(...)
    "_".join(item for item, count in c.most_common())

    # solving Brendan's complaint of perversity
    "_".join(map(str, stuff))

If these were flipped around, you'd have to explicitly call list() on them just to get a join method.

BTW, Ronie: I would disagree. Python uses syntactic elements only where functions are incapable of providing equivalent functionality. That's why print became a function in 3.0 -- it didn't need to be magical syntax any more.

ChrisA

From boxed at killingar.net  Tue Jan 29 01:04:51 2019
From: boxed at killingar.net (Anders Hovmöller)
Date: Tue, 29 Jan 2019 07:04:51 +0100
Subject: [Python-ideas] Add list.join() please
Message-ID: <5997EFCF-FF8C-4EC1-9E05-4AB4A8762C51@killingar.net>

Just by the title I thought you meant

    >>> [1].join([2, 3, 4])
    [2, 1, 3, 1, 4]

This is what I'd expect on the list class.
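A rough sketch of that interleaving behaviour as a plain function (an invented helper, just to pin down the semantics):

    def interleave(sep_items, items):
        # [2, 3, 4] with separator [1] -> [2, 1, 3, 1, 4]
        result = []
        for i, item in enumerate(items):
            if i:
                result.extend(sep_items)
            result.append(item)
        return result

    print(interleave([1], [2, 3, 4]))   # prints: [2, 1, 3, 1, 4]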
So -1 for your suggestion, but +1 for what I thought you meant before I read the complete mail :)

> On 29 Jan 2019, at 02:40, Jamesie Pic <jpic at yourlabs.org> wrote:
>
> Hello,
>
> During the last 10 years, Python has made steady progress in convenience for assembling strings. [...]

From robertve92 at gmail.com  Tue Jan 29 04:13:10 2019
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Tue, 29 Jan 2019 10:13:10 +0100
Subject: [Python-ideas] Add list.join() please
In-Reply-To: <5C4FDEFC.6070902@brenbarn.net>
References: <5C4FDEFC.6070902@brenbarn.net>

> Personally, what I find perverse is that .join is a method of
> strings but does NOT call str() on the items to be joined.

Yeah, that's a good reason to use .format when you have a fixed number of arguments:

    "{}, {}, {}, {}".format(some, random, stuff, here)

And then there is map. Otherwise .join is very common on iterables, like:

    '\n'.join(make_string(object) for object in something)
    '\n'.join(map(make_string, something))
    '\n'.join(map(str, nonstr))
    '\n'.join('{}: {}'.format(x, y) for x, y in blabla)
    '\n'.join(map('[{}]'.format, stuff))

A "join format" construct is very typical in code producing strings from iterables.

I agree on the part "a list doesn't always contain strings, so why would it have a join method".

From jpic at yourlabs.org  Tue Jan 29 06:27:45 2019
From: jpic at yourlabs.org (Jamesie Pic)
Date: Tue, 29 Jan 2019 12:27:45 +0100
Subject: [Python-ideas] Add list.join() please
References: <5C4FDEFC.6070902@brenbarn.net>

Thanks for your feedback!

So, do you think anything can be done to make assembling strings less confusing / fix the inconsistency between the syntax of os.path.join and str.join?

Have a great day

From jfine2358 at gmail.com  Tue Jan 29 06:54:35 2019
From: jfine2358 at gmail.com (Jonathan Fine)
Date: Tue, 29 Jan 2019 11:54:35 +0000
Subject: [Python-ideas] Add list.join() please
References: <5C4FDEFC.6070902@brenbarn.net>

> So, do you think anything can be done to make assembling strings less
> confusing / fix the inconsistency between the syntax of os.path.join
> and str.join?

How about improving the documentation? Providing additional examples might be a good starting point. That could make a good blog post. In fact, there might already be one out there. I've not looked.
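As a sketch of the kind of examples such documentation might collect (purely illustrative, not from any existing doc):

    import os

    parts = ['usr', 'local', 'bin']

    # str.join: one separator, one iterable of strings
    print('/'.join(parts))                    # usr/local/bin

    # os.path.join: separate arguments, platform-aware separator
    print(os.path.join('usr', 'local', 'bin'))

    # non-strings must be converted explicitly before joining
    print(', '.join(map(str, [1, 2, 3])))     # 1, 2, 3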
-- 
Jonathan

From jpic at yourlabs.org  Tue Jan 29 08:27:16 2019
From: jpic at yourlabs.org (Jamesie Pic)
Date: Tue, 29 Jan 2019 14:27:16 +0100
Subject: [Python-ideas] Add list.join() please
References: <5C4FDEFC.6070902@brenbarn.net>

Thanks for the advice, Jonathan. Can you clarify the documentation topic you think should be improved or created? "Assembling strings" or "inconsistencies between os.path.join and str.join"?

I've written an article to summarize, but I don't want to publish it because my blog serves my lobbying for Python and not against it. Also, I don't feel confident about it because I never had the luck to work closely with core devs or other people with a lot more experience than me, like I can so easily find on the internet (thank you all, I love you!). So, I deliver it here under the WTFPL license.

# The mistake I'm still making after 10 years of Python

I love Python, really, but there's a mistake I've been making over and over again while assembling strings of all sorts in Python, and that I have unconsciously ignored until now.

Love it or hate it, but when you start with Python it's hard to be completely indifferent to:

    '\n'.join(['some', 'thing'])

But then you read the kilometers of justifications that the Python devs have already written over the past 20 years about it and, well, grow indifferent about it: "that's the way it's gonna be if I want to use Python".

But recently, I started to tackle one of the dissatisfactions I have with my own code: I think how I assemble strings doesn't make me feel great compared to the rest of what I'm doing with Python. However, it strikes me that assembling strings in Python is something I have done many times a day for 10 years, so taking some time to question my own practice could prove helpful in the long run. The little story of a little obsession...

## `os.path.join(*args)` vs. `str.join(arg)`

I'm living a dream with os.path.join:

    >>> os.path.join('some', 'path')
    'some/path'

But then I decide that cross-platform is going to be too much work, so why not join with slashes directly and only support free operating systems:

    >>> '/'.join('some', 'path')
    TypeError: join() takes exactly one argument (2 given)

"Well! I forgot about this for a minute; let's "fix" it and move on":

    >>> '/'.join(['some', 'path'])
    'some/path'

Ohhh, I'm not really sure in this case -- isn't my code going to look more readable with the os.path.join notation after all?

Ten years later, I still make the same mistake, because 2 seconds before doing a str join I was doing a path join. The fix is easy because the error message is clear, so it's easier to ignore the inconsistency and just fix it and move on. But what if this was an elephant in the room that was just easy to look away from?

## Long f-strings vs. join

The new Python format syntax with f-strings is pretty awesome; let's see how we can assemble a triple-quoted f-string:

    foo = f'''
    some
    {more(complex)}
    {st.ri("ng")}
    '''.strip()

Pretty cool, right? In a function it would look like this:

    def foo():
        return f'''
        some
        {more(complex)}
        {st.ri("ng")}
        '''.strip()

Ok, that would also work, but we're going to have to import a module from the standard library to restore visual indentation on that code:

    import textwrap

    def foo():
        return textwrap.dedent(f'''
            some
            {more(complex)}
            {st.ri("ng")}
        ''').strip()

Let's compare this to the join notation:

    def foo():
        return '\n'.join('some', more(complex), st.ri('ng'))

Needless to say, I prefer the join notation for this use case.
Not only does it fit in a single line, but it doesn't require dedenting the text with an imported function, nor juggling with quotes, and it also sort of looks like it would be more performant. All in all, I prefer the join notation for assembling longer strings.

Note that in practice, I use f-strings for the "pieces" that I want to assemble, and that works great:

    def foo():
        return '\n'.join('some', more(complex), f'_{other}_')

Anyway, ok, good-enough looking code! Let's see what you have to say:

    TypeError: join() takes exactly one argument (2 given)

Oh, that again. Ok, got the fix:

    def foo():
        return '\n'.join(['some', more(complex), f'_{other}_'])

I should take metrics about the number of times I make this mistake during a day, because it looks like it would be a lot (I switch between os.path.join and str.join a lot).

## The 20-yr-old jurisprudence

So, which looks more ergonomic between these two syntaxes:

    [
        'some',
        more(complex),
        f'_{other}_'
    ].join('\n')

    '\n'.join([
        'some',
        more(complex),
        f'_{other}_'
    ])

It seems there is a lot of friction when proposing to add a convenience join method to the list type. I won't go over the reasons for this here; there's already a lot to read about it on the internet, written during the last 20 years.

## Conclusion

I have absolutely no idea what should be done about this; the purpose of this article was just to share a bit of one of my obsessions with string assembling. Maybe it is striking that I assemble strings multiple times a day with a language I have 10 years of full-time experience with, and I still repeat the same mistakes. Not because I don't understand the jurisprudence, not because I don't understand the documentation, or because the documentation is wrong, but probably just because I switch between os.path.join and str.join, which take different syntaxes, I think.

Perhaps the most relevant proposal here would be to extend the str.join signature, which currently supports this notation:

    str.join(iterable)

to also support this notation:

    str.join(arg1, ...argN)

So at least people won't make mistakes when switching over from os.path.join to str.join. Perhaps, something else?

Have a great day

From jpic at yourlabs.org  Tue Jan 29 09:44:35 2019
From: jpic at yourlabs.org (Jamesie Pic)
Date: Tue, 29 Jan 2019 15:44:35 +0100
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS

On Fri, Jan 4, 2019 at 10:07 PM Bernardo Sulzbach wrote:
>
> I'd suggest violating PEP-8 instead of trying to change it.

TBH even my bash global environment variables tend to become more and more lowercase ...

From robertve92 at gmail.com  Tue Jan 29 10:20:17 2019
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Tue, 29 Jan 2019 16:20:17 +0100
Subject: [Python-ideas] Add list.join() please

So you'd propose to add some kind of

    def Join(sep, *args):
        return sep.join(map(str, args))

to the standard lib? Or to add another method to the str class that does that?

    class str:
        ...
        def Join(self, *args):
            return self.join(map(str, args))

I agree such a function is super convenient, but does it need to be added to the standard lib? I have it in my custom utils.py and my PYTHONSTARTUP file so that I can use it everywhere. Call it Join, superjoin, joinargs...

On Tue, 29 Jan 2019, 02:43 Jamesie Pic <jpic at yourlabs.org> wrote:
> Hello,
>
> During the last 10 years, Python has made steady progress in convenience for assembling strings. However, it seems to me that joining is still, when possible, the cleanest way to code string assembly. [...]
From robertve92 at gmail.com  Tue Jan 29 10:24:31 2019
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Tue, 29 Jan 2019 16:24:31 +0100
Subject: [Python-ideas] Add list.join() please

Oh, and if you want to write

    ['a', 'b', 'c'].join('.')

check out `pip install funcoperators` and you can write:

    ['a', 'b', 'c'] |join('.')

given you defined the function below:

    from funcoperators import postfix

    def join(sep):
        return postfix(lambda it: sep.join(map(str, it)))

You can even choose the operator:

    ['a', 'b', 'c'] -join('.')
    ['a', 'b', 'c'] /join('.')
    ['a', 'b', 'c'] @join('.')
    ...

Disclaimer: I'm the creator of funcoperators.

On Tue, 29 Jan 2019, 02:43 Jamesie Pic <jpic at yourlabs.org> wrote:
> Hello,
>
> During the last 10 years, Python has made steady progress in convenience for assembling strings. [...]

From elazarg at gmail.com  Tue Jan 29 10:39:25 2019
From: elazarg at gmail.com (Elazar)
Date: Tue, 29 Jan 2019 17:39:25 +0200
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS

On Sat, Jan 5, 2019, 2:10 AM Abe Dillon wrote:

>> The whole point of the all-caps globals is to tell you a lot about
>> what they are.
>
> A lot? The only thing it canonically tells you is to not modify it

Wrong. It also tells you it's unlikely to be modified by the code that does declare it, which can really help when reasoning about the code, or debugging it, as it makes many potential sources of bugs less likely.

Elazar
From jpic at yourlabs.org  Tue Jan 29 12:34:31 2019
From: jpic at yourlabs.org (Jamesie Pic)
Date: Tue, 29 Jan 2019 18:34:31 +0100
Subject: [Python-ideas] Add list.join() please

funcoperators is pretty neat! But at this stage of the discussion I would also try to get automatic string casting, since the purpose is to assemble a string.

It would be great in the stdlib, because switching between os.path.join and str.join is so error-prone, and assembling strings seems like a pretty common task. It's not uncommon to find str.join in arguments against Python.

Monkey-patching str in PYTHONSTARTUP would work, but then that would require users pulling my package to also hack their startup script. Or even worse: we could patch the startup script upon package installation. It seems like it would make redistribution a lot harder than it should be.

Another approach would be to add a stringify(delim='\n') method to iterables; it would accept a delimiter argument and would return a string of all items cast to string and separated by the delimiter. That would certainly be more backward-compatible than supporting an alternate str.join(1, 'b') call.

Meanwhile I've opened a PR on boltons, but, well, it looks a lot like php.net/implode, and I'm not really sure we want that :D

https://github.com/mahmoud/boltons/pull/197/commits/2b4059855ab4ceae54032bff55da0a6622f1ff01#diff-51cb56be573adcc71320e6953926bc52R430

-- 
?

From chris.barker at noaa.gov  Tue Jan 29 15:49:19 2019
From: chris.barker at noaa.gov (Chris Barker)
Date: Tue, 29 Jan 2019 12:49:19 -0800
Subject: [Python-ideas] Add list.join() please
References: <5C4FDEFC.6070902@brenbarn.net>

A couple notes:

On Tue, Jan 29, 2019 at 5:31 AM Jamesie Pic <jpic at yourlabs.org> wrote:

> can you clarify the documentation topic you think should be improved
> or created? "Assembling strings"

I would think "assembling strings", though there is a lot out there already.

> or "inconsistencies between os.path.join and str.join"?

Well, if we're talking about moving forward, then the Path object is probably the "right" way to join paths anyway :-)

    a_path / "a_dir" / "a_filename"

But to the core language issue -- I started using Python with 1.5.*, and back then join() was in the string module (and it is there in 2.7 still). And yes, I did expect it to be a list method...

Then it was added as a method of the string object. And I thought THAT was odd -- but I really appreciated that I didn't need to import a module to do something fundamental.

But the fact is that joining strings is fundamentally a string operation, so it makes sense for it to be there.

In earlier py2, I would have thought, maybe it should be a list method -- it's pretty darn common to join lists of strings, yes? But what about tuples? Python was kind of all about sequences -- so maybe all sequences could have that method -- i.e. part of the sequence ABC.

But with py3k, Python is more about iterables than sequences -- and join (and many other methods and functions) operate on any iterable -- and this is a really good thing.

So add join to ALL iterables? That makes little sense, and really isn't possible -- an iterable is something that conforms to the iterator protocol -- it's not a type, or even an ABC.

So in the end, join really does only make sense as a string method. Or maybe as a built-in -- but we really don't need any more of those.
If you want to argue that str.join() should take multiple arguments, like os.path.join does, then, well, we could do that -- it currently takes one and only one argument, so it could be extended to join multiple arguments -- but I have hardly ever seen a use case for that.

> The mistake I'm still making after 10 years of Python

Hmm -- I've seen a lot of newbies struggle with this, but I haven't had an issue with it for years myself.

> >>> '/'.join('some', 'path')
> TypeError: join() takes exactly one argument (2 given)

pathlib aside, that really isn't the right way to join paths ..... os.path.join exists for (good) reasons. One of which is this:

    In [22]: os.path.join("this/", "that")
    Out[22]: 'this/that'

-CHB

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From jfine2358 at gmail.com  Tue Jan 29 16:21:48 2019
From: jfine2358 at gmail.com (Jonathan Fine)
Date: Tue, 29 Jan 2019 21:21:48 +0000
Subject: [Python-ideas] Add list.join() please
References: <5C4FDEFC.6070902@brenbarn.net>

I've not been following closely, so please forgive me if I'm repeating something already said in this thread.

Summary: str.join allows us to easily avoid, when assembling strings,
1. Quadratic running time.
2. Redundant trailing comma syntax error.

The inbuilt help(str.join) gives:

    S.join(iterable) -> str
    Return a string which is the concatenation of the strings in the
    iterable. The separator between elements is S.

This is different from sum in two ways. The first is the separator S. The second is performance related. Consider

    s = 0
    for i in range(100):
        s += 1

and

    s = ''
    for i in range(100):
        s += 'a'

The first has linear running time (in the parameter represented by 100). The second has quadratic running time (unless string addition is doing something clever, like being lazy in evaluation).
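A rough way to see this for yourself (an illustrative timing sketch, not a rigorous benchmark; absolute numbers vary, and as a reply below notes, CPython can sometimes optimise += in place):

    import timeit

    def concat(n):
        s = ''
        for i in range(n):
            s += 'a'       # may copy the whole string on each iteration
        return s

    def join(n):
        return ''.join('a' for i in range(n))

    for n in (1000, 10000, 100000):
        t_concat = timeit.timeit(lambda: concat(n), number=10)
        t_join = timeit.timeit(lambda: join(n), number=10)
        print(n, round(t_concat, 4), round(t_join, 4))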
The separator S is important. In Python a redundant trailing comma, like so,

    val = [0, 1, 2, 3,]

is both allowed and useful. (For example, when the entries are each on a simple line, it reduces the noise that arises when an entry is added at the end. And when the entries are reordered.) For some languages, the redundant trailing comma is a syntax error. To serialise data for such languages, you can do this:

    >>> '[{}]'.format(', '.join(map(str, v)))
    '[0, 1, 2, 3]'

From here, by all means repackage for your own convenience in your own library, or use a third party library that already has what you want. (A widely used PyPI package has, I think, a head start for adoption into the standard library.)

By the way, a search for "python strtools" gives me
https://pypi.org/project/extratools/
https://www.chuancong.site/extratools/functions/strtools/
https://pypi.org/project/str-tools/  # This seems to be an empty stub.

-- 
Jonathan

From steve at pearwood.info  Tue Jan 29 16:44:39 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 30 Jan 2019 08:44:39 +1100
Subject: [Python-ideas] Add list.join() please
References: <5C4FDEFC.6070902@brenbarn.net>
Message-ID: <20190129214439.GB2613@ando.pearwood.info>

On Tue, Jan 29, 2019 at 09:21:48PM +0000, Jonathan Fine wrote:
> I've not been following closely, so please forgive me if I'm repeating
> something already said in this thread.
>
> Summary: str.join allows us to easily avoid, when assembling strings,
> 1. Quadratic running time.
> 2. Redundant trailing comma syntax error.

The lack of a syntax error for trailing commas is a language-wide feature that has nothing to do with str.join.

> The inbuilt help(str.join) gives:
>     S.join(iterable) -> str
>     Return a string which is the concatenation of the strings in the
>     iterable. The separator between elements is S.
>
> This is different from sum in two ways.

Three ways. sum() intentionally doesn't support strings at all:

    py> sum(['a', 'b', 'c'], '')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: sum() can't sum strings [use ''.join(seq) instead]

unless you cleverly defeat this intentional limitation. (How to do this is left as an exercise for the reader.)

> The first is the separator S.
> The second is performance related. Consider
> [...]
> The first has linear running time (in the parameter represented by
> 100). The second has quadratic running time (unless string addition
> is doing something clever, like being lazy in evaluation).

In CPython, string addition does often do something clever. But not by being lazy -- it optimizes the string concatenation by appending to the strings in place if and only if it is safe to do so.

-- 
Steve

From jpic at yourlabs.org  Tue Jan 29 16:45:43 2019
From: jpic at yourlabs.org (Jamesie Pic)
Date: Tue, 29 Jan 2019 22:45:43 +0100
Subject: [Python-ideas] Add list.join() please
References: <5C4FDEFC.6070902@brenbarn.net>

Thank you Jonathan. Performance is one of the various reasons I prefer join for assembling strings rather than, say, triple-quoted dedented f-strings or concatenation. It also plays well syntactically, even though there is still a little room for improvement.

For example, in PHP implode('-', array(2, 'a')) returns '2-a', and now that I think of it, it's the only thing I regret from PHP's stdlib... And assembling a string like that really looks like a common problem programmers face every day of their journey...

The chuancong.site design for the extratools documentation is really beautiful! I found the smartsplit function but no smartjoin. On my side I have requested comments on a PR in the boltons repo already; let's see if they find a refutation before proposing a smartjoin implementation to extratools.

https://github.com/mahmoud/boltons/pull/197

Would you recommend releasing it on its own? I.e. from implode import implode?

Thanks

From jpic at yourlabs.org  Tue Jan 29 16:51:26 2019
From: jpic at yourlabs.org (Jamesie Pic)
Date: Tue, 29 Jan 2019 22:51:26 +0100
Subject: [Python-ideas] Add list.join() please
References: <5C4FDEFC.6070902@brenbarn.net>

On Tue, Jan 29, 2019 at 9:50 PM Chris Barker via Python-ideas wrote:
> I would think "assembling strings", though there is a lot out there
> already.

Which one do you prefer?

> So in the end, join really does only make sense as a string method.

What do you think of list.stringify(delim)?
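For concreteness, a sketch of what I mean as a plain helper (hypothetical name and signature, not an existing API):

    def stringify(iterable, delim='\n'):
        # str() every item, then join: what a hypothetical
        # list.stringify(delim) would do.
        return delim.join(map(str, iterable))

    print(stringify(['some', 3, 4.5], '-'))   # prints: some-3-4.5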
Have a great day ;) From cs at cskk.id.au Tue Jan 29 17:30:01 2019 From: cs at cskk.id.au (Cameron Simpson) Date: Wed, 30 Jan 2019 09:30:01 +1100 Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS In-Reply-To: References: Message-ID: <20190129223001.GA92125@cskk.homeip.net> On 29Jan2019 15:44, Jamesie Pic wrote: >On Fri, Jan 4, 2019 at 10:07 PM Bernardo Sulzbach > wrote: >> I'd suggest violating PEP-8 instead of trying to change it. > >TBH even my bash global environment variables tend to become more and >more lowercase ... If you mean _exported_ variables, then this is actually a really bad idea. The shell (sh, bash, ksh etc) makes no enforcement about naming for exported vs unexported variables. And the exported namespace ($PATH etc) is totally open ended, because any programme might expect arbitrary optional exported names for easy tuning of defaults. So, you think, since I only use variables I intend and only export variables I plan to, I can do what I like. Example script: a=1 b=2 export b So $b is now exported to subcommands, but not $a. However: the "exported set" is initially the environment you inherit. Which means: Any variable that _arrives_ in the environment is _already_ in the exported set. So, another script: a=1 b=2 # not exporting either If that gets called from the environment where you'd exported $b (eg from the first script, which could easily be your ~/.profile or ~/.bashrc), then $b gets _modified_ and _exported_ to subcommands, even though you hadn't asked. Because it came in initially from the environment. This means that you don't directly control what is local to the script and what is exported (and thus can affect other scripts). The _only_ way to maintain sanity is the existing convention: local script variables use lowercase names and exported variables use UPPERCASE names. With that in hand, and cooperation from everyone else, you have predictable and reliable behaviour. And you have a nice visual distinction in your code because you know immediately (by convention) whether a variable is exported or not. By exporting lowercase variables you violate this convention, and make your script environment unsafe for others to use. Do many many example scripts on the web do the reverse: using UPPERCASE names for local script variables? Yes they do, and they do a disservice to everyone. Cheers, Cameron Simpson From steve at pearwood.info Tue Jan 29 18:04:57 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 30 Jan 2019 10:04:57 +1100 Subject: [Python-ideas] Add list.join() please In-Reply-To: References: <5C4FDEFC.6070902@brenbarn.net> Message-ID: <20190129230457.GG1834@ando.pearwood.info> On Tue, Jan 29, 2019 at 10:51:26PM +0100, Jamesie Pic wrote: > What do you think of list.stringify(delim) ? What's so special about lists? What do you think of: tuple.stringify deque.stringify iterator.stringify dict.keys.stringify etc. And what's so special about strings that lists have to support a stringify method and not every other type? list.floatify list.intify list.tuplify list.setify list.iteratorify Programming languages should be more about composable, re-usable general purpose components more than special cases. -- Steve From tjreedy at udel.edu Tue Jan 29 18:30:46 2019 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 29 Jan 2019 18:30:46 -0500 Subject: [Python-ideas] Add list.join() please In-Reply-To: References: Message-ID: On 1/28/2019 8:40 PM, Jamesie Pic wrote: > 0. 
os.path.join takes *args

Since at least 1 path is required, the signature is join(path, *paths).
I presume that this is the Python version of the Unix version of the
system call that it wraps.  The hidden argument is os.sep.  It is
equivalent to os.sep.join((path,)+paths) (though one would not write it
this way).

> 1. str.join takes a list argument,

This premise behind the (repeated) request is false.  str.join's
arguments are a string (the joiner) and an *iterable of strings*, which
is an abstract subclass of the abstract concept 'iterable'.  And only a
small fraction of lists are lists of strings and therefore iterables of
strings.

--
Terry Jan Reedy

From greg.ewing at canterbury.ac.nz  Tue Jan 29 18:38:13 2019
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 30 Jan 2019 12:38:13 +1300
Subject: [Python-ideas] Add list.join() please
In-Reply-To: <5C4FDEFC.6070902@brenbarn.net>
References: <5C4FDEFC.6070902@brenbarn.net>
Message-ID: <5C50E3E5.2060900@canterbury.ac.nz>

Brendan Barnwell wrote:
> Personally what I find is perverse is that .join is a method of
> strings but does NOT call str() on the items to be joined.

Neither do most other string methods:

>>> s = "hovercraft"
>>> s.count(42)
Traceback (most recent call last):
  File "", line 1, in 
TypeError: Can't convert 'int' object to str implicitly

Why should join() be any different?

--
Greg

From marcos.eliziario at gmail.com  Tue Jan 29 19:08:21 2019
From: marcos.eliziario at gmail.com (Marcos Eliziario)
Date: Tue, 29 Jan 2019 22:08:21 -0200
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: <20190129223001.GA92125@cskk.homeip.net>
References: <20190129223001.GA92125@cskk.homeip.net>
Message-ID: 

Is it really that obnoxious? Does using upper case for constants
measurably slow down coders? Can you cite the actual papers describing
the experiments that led to this conclusion ?
Because, from my experience, having a visual cue that a value is a
constant or an enum is something pretty useful.
Surely, I'd hate reading a newspaper article where the editor
generously sprinkled upper case words everywhere, but analogies only go
so far; reading code has some similarities with reading prose, but it
still is not the same activity.

Best,
Marcos Eliziario

On Tue, Jan 29, 2019 at 8:30 PM Cameron Simpson wrote:

> On 29Jan2019 15:44, Jamesie Pic wrote:
> >On Fri, Jan 4, 2019 at 10:07 PM Bernardo Sulzbach
> > wrote:
> >> I'd suggest violating PEP-8 instead of trying to change it.
> >
> >TBH even my bash global environment variables tend to become more and
> >more lowercase ...
>
> If you mean _exported_ variables, then this is actually a really bad
> idea.
>
> The shell (sh, bash, ksh etc) makes no enforcement about naming for
> exported vs unexported variables. And the exported namespace ($PATH etc)
> is totally open ended, because any programme might expect arbitrary
> optional exported names for easy tuning of defaults.
>
> So, you think, since I only use variables I intend and only export
> variables I plan to, I can do what I like. Example script:
>
> a=1
> b=2
> export b
>
> So $b is now exported to subcommands, but not $a.
>
> However: the "exported set" is initially the environment you inherit.
> Which means:
>
> Any variable that _arrives_ in the environment is _already_ in the
> exported set.
So, another script: > > a=1 > b=2 > # not exporting either > > If that gets called from the environment where you'd exported $b (eg > from the first script, which could easily be your ~/.profile or > ~/.bashrc), then $b gets _modified_ and _exported_ to subcommands, even > though you hadn't asked. Because it came in initially from the > environment. > > This means that you don't directly control what is local to the script > and what is exported (and thus can affect other scripts). > > The _only_ way to maintain sanity is the existing convention: local > script variables use lowercase names and exported variables use > UPPERCASE names. With that in hand, and cooperation from everyone else, > you have predictable and reliable behaviour. And you have a nice visual > distinction in your code because you know immediately (by convention) > whether a variable is exported or not. > > By exporting lowercase variables you violate this convention, and make > your script environment unsafe for others to use. > > Do many many example scripts on the web do the reverse: using UPPERCASE > names for local script variables? Yes they do, and they do a disservice > to everyone. > > Cheers, > Cameron Simpson > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Marcos Elizi?rio Santos mobile/whatsapp/telegram: +55(21) 9-8027-0156 skype: marcos.eliziario at gmail.com linked-in : https://www.linkedin.com/in/eliziario/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ashafer at pm.me Tue Jan 29 19:09:55 2019 From: ashafer at pm.me (Alex Shafer) Date: Wed, 30 Jan 2019 00:09:55 +0000 Subject: [Python-ideas] Add list.join() please In-Reply-To: <20190129230457.GG1834@ando.pearwood.info> References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> Message-ID: <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> 1) I'm in favor of adding a stringify method to all collections 2) strings are special and worthy of a "special case" because strings tend to be human readable and are used in all kinds of user interface. -------- Original Message -------- On Jan 29, 2019, 16:04, Steven D'Aprano wrote: > On Tue, Jan 29, 2019 at 10:51:26PM +0100, Jamesie Pic wrote: > >> What do you think of list.stringify(delim) ? > > What's so special about lists? What do you think of: > > tuple.stringify > deque.stringify > iterator.stringify > dict.keys.stringify > > etc. And what's so special about strings that lists have to support a > stringify method and not every other type? > > list.floatify > list.intify > list.tuplify > list.setify > list.iteratorify > > Programming languages should be more about composable, re-usable general > purpose components more than special cases. > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From python at mrabarnett.plus.com  Tue Jan 29 19:12:05 2019
From: python at mrabarnett.plus.com (MRAB)
Date: Wed, 30 Jan 2019 00:12:05 +0000
Subject: [Python-ideas] Add list.join() please
In-Reply-To: 
References: 
Message-ID: <23e20d26-e3e0-7f41-795b-c4b6b0b39d67@mrabarnett.plus.com>

On 2019-01-29 23:30, Terry Reedy wrote:
> On 1/28/2019 8:40 PM, Jamesie Pic wrote:
>
> > 0. os.path.join takes *args
>
> Since at least 1 path is required, the signature is join(path, *paths).
> I presume that this is the Python version of the Unix version of the
> system call that it wraps.  The hidden argument is os.sep.  It is
> equivalent to os.sep.join((path,)+paths) (though one would not write it
> this way).
>
>> 1. str.join takes a list argument,
>
> This premise behind the (repeated) request is false.  str.join's
> arguments are a string (the joiner) and an *iterable of strings*, which
> is an abstract subclass of the abstract concept 'iterable'.  And only a
> small fraction of lists are lists of strings and therefore iterables of
> strings.
>
One of the examples given was writing:

>>> '/'.join('some', 'path')

To me, this suggests that what the OP _really_ wants is for str.join to
accept multiple arguments, much as os.path.join does.

I thought that there would be a problem with that because currently the
single argument is an iterable, and you wouldn't want to iterate the
first argument of '/'.join('some', 'path').
However, both min and max will accept either a single argument that's
iterated over or multiple arguments that are not, so there's a precedent
there.

From python at mrabarnett.plus.com  Tue Jan 29 19:14:51 2019
From: python at mrabarnett.plus.com (MRAB)
Date: Wed, 30 Jan 2019 00:14:51 +0000
Subject: [Python-ideas] Add list.join() please
In-Reply-To: <5C50E3E5.2060900@canterbury.ac.nz>
References: <5C4FDEFC.6070902@brenbarn.net> <5C50E3E5.2060900@canterbury.ac.nz>
Message-ID: <848c7a34-7495-7213-118a-9b9d0b3c94b9@mrabarnett.plus.com>

On 2019-01-29 23:38, Greg Ewing wrote:
> Brendan Barnwell wrote:
>> Personally what I find is perverse is that .join is a method of
>> strings but does NOT call str() on the items to be joined.
>
> Neither do most other string methods:
>
> >>> s = "hovercraft"
> >>> s.count(42)
> Traceback (most recent call last):
>   File "", line 1, in 
> TypeError: Can't convert 'int' object to str implicitly
>
> Why should join() be any different?
>
And what if you don't want str, but instead repr, or ascii?

(An optional stringifying function, maybe? :-))

From robertve92 at gmail.com  Tue Jan 29 20:15:27 2019
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Wed, 30 Jan 2019 02:15:27 +0100
Subject: [Python-ideas] Add list.join() please
In-Reply-To: 
References: <5C4FDEFC.6070902@brenbarn.net>
Message-ID: 

+1 to that email !! (how to do that in email haha)

On Tue, 29 Jan 2019, 21:50 Chris Barker via Python-ideas <
python-ideas at python.org wrote:

> A couple notes:
>
> On Tue, Jan 29, 2019 at 5:31 AM Jamesie Pic wrote:
>
>> can you clarify the documentation
>> topic you think should be improved or created ? "Assembling strings"
>>
>
> I would think "assembling strings", though there is a lot out there
> already.
>
>
>> or "inconsistencies between os.path.join and str.join" ?
>> > > well, if we're talking about moving forward, then the Path object is > probably the "right" way to join paths anyway :-) > > a_path / "a_dir" / "a_filename" > > But to the core language issue -- I started using Python with 1.5.* and > back then join() was in the string module (and is there in 2.7 still) > > And yes, I did expect it to be a list method... > > Then it was added as a method of the string object. > > And I thought THAT was odd -- be really appreciated that I didn't need to > import a module to do something fundamental. > > But the fact is, that joining strings is fundamentally a string operation, > so it makes sense for it to be there. > > In earlier py2, I would have thought, maybe it should be a list method -- > it's pretty darn common to join lists of strings, yes? But what about > tuples? Python was kind of all about sequences -- so maybe all sequences > could have that method -- i.e part of the sequence ABC. > > But with > py3k, Python is more about iterables than sequences -- and join > (and many other methods and functions) operate on any iterable -- and this > is a really good thing. > > So add join to ALL iterables? That makes little sense, and really isn't > possible -- an iterable is something that conforms to the iterator protocol > -- it's not a type, or even an ABC. > > So in the end, join really does only make sense as string method. > > Or Maybe as a built in -- but we really don't need any more of those. > > If you want to argue that str.join() should take multiple arguments, like > os.path.join does, then, well we could do that -- it currently takes one > and only one argument, so it could be extended to join multiple arguments > -- but I have hardly ever seem a use case for that. > > The mistake I'm still doing after 10 years of Python >> > > hmm -- I've seen a lot of newbies struggle with this, but haven't had an > issue with it for years myself. > > >> >>> '/'.join('some', 'path') >> TypeError: join() takes exactly one argument (2 given) >> > > pathlib aside, that really isn't the right way to join paths ..... > os.path.jon exists for a (good) reasons. One of which is this: > > In [22]: os.path.join("this/", "that") > Out[22]: 'this/that' > > -CHB > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robertve92 at gmail.com Tue Jan 29 20:19:07 2019 From: robertve92 at gmail.com (Robert Vanden Eynde) Date: Wed, 30 Jan 2019 02:19:07 +0100 Subject: [Python-ideas] Add list.join() please In-Reply-To: References: <5C4FDEFC.6070902@brenbarn.net> Message-ID: +1 Good performance analysis IMHO :) On Tue, 29 Jan 2019, 22:24 Jonathan Fine I've not been following closely, so please forgive me if I'm repeating > something already said in this thread. > > Summary: str.join allows us to easily avoid, when assembling strings, > 1. Quadratic running time. > 2. Redundant trailing comma syntax error. > > The inbuilt help(str.join) gives: > S.join(iterable) -> str > Return a string which is the concatenation of the strings in the > iterable. The separator between elements is S. 
> > This is different from sum in two ways. The first is the separator S. > The second is performance related. Consider > s = 0 > for i in range(100): > s += 1 > and > s = '' > for i in range(100): > s += 'a' > > The first has linear running time (in the parameter represented by > 100). The second has quadratic running time (unless string addition is > doing something clever, like being lazy in evaluation). > > The separator S is important. In Python a redundant trailing comma, like > so, > val = [0, 1, 2, 3,] > is both allowed and useful. (For example, when the entries are each on > a simple line, it reduces the noise that arises when an entry is added > at the end. And when the entries are reordered.) > > For some languages, the redundant trailing comma is a syntax error. To > serialise data for such languages, you can do this: > >>> '[{}]'.format(', '.join(map(str, v))) > '[0, 1, 2, 3]' > > From here, by all means repackage for your own convenience in your own > library, or use a third party library that already has what you want. > (A widely used pypi package has, I think, a head start for adoption > into the standard library.) > > By the way, as search for "python strtools" gives me > https://pypi.org/project/extratools/ > https://www.chuancong.site/extratools/functions/strtools/ > > https://pypi.org/project/str-tools/. # This seems to be an empty stub. > > -- > Jonathan > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Tue Jan 29 20:44:21 2019 From: mertz at gnosis.cx (David Mertz) Date: Tue, 29 Jan 2019 20:44:21 -0500 Subject: [Python-ideas] Add list.join() please In-Reply-To: <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> Message-ID: stringify = lambda it: type(it)(map(str, it)) Done! Does that really need to be in the STDLIB? On Tue, Jan 29, 2019, 7:11 PM Alex Shafer via Python-ideas < python-ideas at python.org wrote: > 1) I'm in favor of adding a stringify method to all collections > > 2) strings are special and worthy of a "special case" because strings tend > to be human readable and are used in all kinds of user interface. > > > > -------- Original Message -------- > On Jan 29, 2019, 16:04, Steven D'Aprano < steve at pearwood.info> wrote: > > > On Tue, Jan 29, 2019 at 10:51:26PM +0100, Jamesie Pic wrote: > > > What do you think of list.stringify(delim) ? > > What's so special about lists? What do you think of: > > tuple.stringify > deque.stringify > iterator.stringify > dict.keys.stringify > > etc. And what's so special about strings that lists have to support a > stringify method and not every other type? > > list.floatify > list.intify > list.tuplify > list.setify > list.iteratorify > > Programming languages should be more about composable, re-usable general > purpose components more than special cases. 
> > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ashafer at pm.me Tue Jan 29 20:52:37 2019 From: ashafer at pm.me (Alex Shafer) Date: Wed, 30 Jan 2019 01:52:37 +0000 Subject: [Python-ideas] Add list.join() please In-Reply-To: References: <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> Message-ID: That would be strongly preferred to duplication across hundreds of use cases and thousands (millions?) of users. Not all of them are likely to come up with the most efficient implementation either. -------- Original Message -------- On Jan 29, 2019, 18:44, David Mertz wrote: > stringify = lambda it: type(it)(map(str, it)) > > Done! Does that really need to be in the STDLIB? > > On Tue, Jan 29, 2019, 7:11 PM Alex Shafer via Python-ideas >> 1) I'm in favor of adding a stringify method to all collections >> >> 2) strings are special and worthy of a "special case" because strings tend to be human readable and are used in all kinds of user interface. >> >> -------- Original Message -------- >> On Jan 29, 2019, 16:04, Steven D'Aprano < steve at pearwood.info> wrote: >> >>> On Tue, Jan 29, 2019 at 10:51:26PM +0100, Jamesie Pic wrote: >>> >>>> What do you think of list.stringify(delim) ? >>> >>> What's so special about lists? What do you think of: >>> >>> tuple.stringify >>> deque.stringify >>> iterator.stringify >>> dict.keys.stringify >>> >>> etc. And what's so special about strings that lists have to support a >>> stringify method and not every other type? >>> >>> list.floatify >>> list.intify >>> list.tuplify >>> list.setify >>> list.iteratorify >>> >>> Programming languages should be more about composable, re-usable general >>> purpose components more than special cases. >>> >>> -- >>> Steve >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Tue Jan 29 20:57:07 2019 From: mertz at gnosis.cx (David Mertz) Date: Tue, 29 Jan 2019 20:57:07 -0500 Subject: [Python-ideas] Add list.join() please In-Reply-To: References: <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> Message-ID: "Not every five line function needs to be in the standard library" ... even more true for every one line function. I can think of a few dozen variations of similar but not quite identical behavior to my little stringify() that "could be useful." Python gives us easy composition to create each of them. It's not PHP, after all. 
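Just to be concrete, here are a few of the dozen -- hypothetical
sketches with made-up names, and none of them obviously the one true
default:

stringify_repr = lambda it: type(it)(map(repr, it))    # repr() rather than str()
stringify_ascii = lambda it: type(it)(map(ascii, it))  # ascii(), per MRAB's question

def stringify_deep(it):
    # recurse into nested lists/tuples/sets instead of str()-ing them whole
    return type(it)(stringify_deep(x) if isinstance(x, (list, tuple, set)) else str(x)
                    for x in it)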
On Tue, Jan 29, 2019 at 8:52 PM Alex Shafer wrote: > That would be strongly preferred to duplication across hundreds of use > cases and thousands (millions?) of users. Not all of them are likely to > come up with the most efficient implementation either. > -------- Original Message -------- > On Jan 29, 2019, 18:44, David Mertz < mertz at gnosis.cx> wrote: > > > stringify = lambda it: type(it)(map(str, it)) > > Done! Does that really need to be in the STDLIB? > > On Tue, Jan 29, 2019, 7:11 PM Alex Shafer via Python-ideas < > python-ideas at python.org wrote: > >> 1) I'm in favor of adding a stringify method to all collections >> >> 2) strings are special and worthy of a "special case" because strings >> tend to be human readable and are used in all kinds of user interface. >> >> >> >> -------- Original Message -------- >> On Jan 29, 2019, 16:04, Steven D'Aprano < steve at pearwood.info> wrote: >> >> >> On Tue, Jan 29, 2019 at 10:51:26PM +0100, Jamesie Pic wrote: >> >> > What do you think of list.stringify(delim) ? >> >> What's so special about lists? What do you think of: >> >> tuple.stringify >> deque.stringify >> iterator.stringify >> dict.keys.stringify >> >> etc. And what's so special about strings that lists have to support a >> stringify method and not every other type? >> >> list.floatify >> list.intify >> list.tuplify >> list.setify >> list.iteratorify >> >> Programming languages should be more about composable, re-usable general >> purpose components more than special cases. >> >> -- >> Steve >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ashafer at pm.me Tue Jan 29 21:03:33 2019 From: ashafer at pm.me (Alex Shafer) Date: Wed, 30 Jan 2019 02:03:33 +0000 Subject: [Python-ideas] Add list.join() please In-Reply-To: References: <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> Message-ID: <-W5nDPMBFWNDd2mI62kotcdiuSEZYS4Uf4M2gBEz-4VifS8vK3DhKdSUSIhRlXJeIgdkblhJK-7YUjjZehZs_w==@pm.me> Frankly this sounds like resistance to adaptation and evolution. How long ago was that adage written? Or perhaps this is a pathological instance of the snowball fallacy? Adding one widely requested feature does not imply that all requested features will be added. -------- Original Message -------- On Jan 29, 2019, 18:57, David Mertz wrote: > "Not every five line function needs to be in the standard library" > > ... even more true for every one line function. I can think of a few dozen variations of similar but not quite identical behavior to my little stringify() that "could be useful." Python gives us easy composition to create each of them. It's not PHP, after all. 
>
> On Tue, Jan 29, 2019 at 8:52 PM Alex Shafer wrote:
>
>> That would be strongly preferred to duplication across hundreds of use
>> cases and thousands (millions?) of users. Not all of them are likely to
>> come up with the most efficient implementation either.
>> -------- Original Message --------
>> On Jan 29, 2019, 18:44, David Mertz < mertz at gnosis.cx> wrote:
>>
>> stringify = lambda it: type(it)(map(str, it))
>>
>> Done! Does that really need to be in the STDLIB?
>>
>> On Tue, Jan 29, 2019, 7:11 PM Alex Shafer via Python-ideas <
>> python-ideas at python.org wrote:
>>
>>> 1) I'm in favor of adding a stringify method to all collections
>>>
>>> 2) strings are special and worthy of a "special case" because strings
>>> tend to be human readable and are used in all kinds of user interface.
>>>
>>> -------- Original Message --------
>>> On Jan 29, 2019, 16:04, Steven D'Aprano < steve at pearwood.info> wrote:
>>>
>>> On Tue, Jan 29, 2019 at 10:51:26PM +0100, Jamesie Pic wrote:
>>>
>>> > What do you think of list.stringify(delim) ?
>>>
>>> What's so special about lists? What do you think of:
>>>
>>> tuple.stringify
>>> deque.stringify
>>> iterator.stringify
>>> dict.keys.stringify
>>>
>>> etc. And what's so special about strings that lists have to support a
>>> stringify method and not every other type?
>>>
>>> list.floatify
>>> list.intify
>>> list.tuplify
>>> list.setify
>>> list.iteratorify
>>>
>>> Programming languages should be more about composable, re-usable general
>>> purpose components more than special cases.
>>>
>>> --
>>> Steve
>>> _______________________________________________
>>> Python-ideas mailing list
>>> Python-ideas at python.org
>>> https://mail.python.org/mailman/listinfo/python-ideas
>>> Code of Conduct: http://python.org/psf/codeofconduct/
>>> _______________________________________________
>>> Python-ideas mailing list
>>> Python-ideas at python.org
>>> https://mail.python.org/mailman/listinfo/python-ideas
>>> Code of Conduct: http://python.org/psf/codeofconduct/
>>>
>
> --
> Keeping medicines from the bloodstreams of the sick; food
> from the bellies of the hungry; books from the hands of the
> uneducated; technology from the underdeveloped; and putting
> advocates of freedom in prisons. Intellectual property is
> to the 21st century what the slave trade was to the 16th.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ja.py at farowl.co.uk  Tue Jan 29 15:36:28 2019
From: ja.py at farowl.co.uk (Jeff Allen)
Date: Tue, 29 Jan 2019 20:36:28 +0000
Subject: [Python-ideas] Add list.join() please
In-Reply-To: 
References: 
Message-ID: 

On 29/01/2019 01:40, Jamesie Pic wrote:
> ... I'm still sometimes confused between the different syntaxes used
> by join methods:
>
> 0. os.path.join takes *args
> 1. str.join takes a list argument, this inconsistency makes it easy to
> mistake with the os.path.join signature

It seems fairly consistent to make:

    os.path.join('a', 'b', 'c')

short for:

    os.path.sep.join(['a', 'b', 'c'])

> Also, I still think that:
>
> '_'.join(['cancel', name])
>
> Would be more readable as such:
>
> ['cancel', name].join('_')

Please, no. This would be un-Pythonic in my view. It makes so much more
sense that str should have a method that takes an iterable, returning
str, than that every iterable should have a join(str) returning str.
Consider you get this kind of thing for free:
"-".join(str(i) for i in range(10)) I learned enough Groovy last year to use Gradle and was so disappointed to find myself having to write: ?????????? excludes: exclusions.join(',')??? // Yes, it's that way round :o Even Java agrees (since 1.8) with Python. Jeff Allen -------------- next part -------------- An HTML attachment was scrubbed... URL: From brenbarn at brenbarn.net Tue Jan 29 21:58:54 2019 From: brenbarn at brenbarn.net (Brendan Barnwell) Date: Tue, 29 Jan 2019 18:58:54 -0800 Subject: [Python-ideas] Add list.join() please In-Reply-To: <848c7a34-7495-7213-118a-9b9d0b3c94b9@mrabarnett.plus.com> References: <5C4FDEFC.6070902@brenbarn.net> <5C50E3E5.2060900@canterbury.ac.nz> <848c7a34-7495-7213-118a-9b9d0b3c94b9@mrabarnett.plus.com> Message-ID: <5C5112EE.600@brenbarn.net> On 2019-01-29 16:14, MRAB wrote: > On 2019-01-29 23:38, Greg Ewing wrote: >> Brendan Barnwell wrote: >>> Personally what I find is perverse is that .join is a method of >>> strings but does NOT call str() on the items to be joined. >> >> Neither do most other string methods: >> >> >>> s = "hovercraft" >> >>> s.count(42) >> Traceback (most recent call last): >> File "", line 1, in >> TypeError: Can't convert 'int' object to str implicitly >> >> Why should join() be any different? >> > And what if you don't want str, but instead repr, or ascii? Then you can still convert them yourself beforehand, and any stringifying that .join did would be a no-op. If you want to call repr on all your stuff beforehand, great, then you'll get strings and you can join them just like anything else. But you'll ADDITIONALLY be able to not pre-stringify them in a custom way, in which case they'll be stringified in the default way. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown From robertve92 at gmail.com Tue Jan 29 22:36:56 2019 From: robertve92 at gmail.com (Robert Vanden Eynde) Date: Wed, 30 Jan 2019 04:36:56 +0100 Subject: [Python-ideas] Add list.join() please In-Reply-To: References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> Message-ID: > stringify = lambda it: type(it)(map(str, it)) > stringify(range(5)) doesn't work ^^ One advantage or having a standard function is that it has been designed by a lot of persons for all possible use cases :) -------------- next part -------------- An HTML attachment was scrubbed... URL: From robertve92 at gmail.com Tue Jan 29 22:38:38 2019 From: robertve92 at gmail.com (Robert Vanden Eynde) Date: Wed, 30 Jan 2019 04:38:38 +0100 Subject: [Python-ideas] Add list.join() please In-Reply-To: References: <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> Message-ID: +1 On Wed, 30 Jan 2019, 02:57 David Mertz "Not every five line function needs to be in the standard library" > > ... even more true for every one line function. I can think of a few > dozen variations of similar but not quite identical behavior to my little > stringify() that "could be useful." Python gives us easy composition to > create each of them. It's not PHP, after all. > > On Tue, Jan 29, 2019 at 8:52 PM Alex Shafer wrote: > >> That would be strongly preferred to duplication across hundreds of use >> cases and thousands (millions?) of users. Not all of them are likely to >> come up with the most efficient implementation either. 
>> -------- Original Message -------- >> On Jan 29, 2019, 18:44, David Mertz < mertz at gnosis.cx> wrote: >> >> >> stringify = lambda it: type(it)(map(str, it)) >> >> Done! Does that really need to be in the STDLIB? >> >> On Tue, Jan 29, 2019, 7:11 PM Alex Shafer via Python-ideas < >> python-ideas at python.org wrote: >> >>> 1) I'm in favor of adding a stringify method to all collections >>> >>> 2) strings are special and worthy of a "special case" because strings >>> tend to be human readable and are used in all kinds of user interface. >>> >>> >>> >>> -------- Original Message -------- >>> On Jan 29, 2019, 16:04, Steven D'Aprano < steve at pearwood.info> wrote: >>> >>> >>> On Tue, Jan 29, 2019 at 10:51:26PM +0100, Jamesie Pic wrote: >>> >>> > What do you think of list.stringify(delim) ? >>> >>> What's so special about lists? What do you think of: >>> >>> tuple.stringify >>> deque.stringify >>> iterator.stringify >>> dict.keys.stringify >>> >>> etc. And what's so special about strings that lists have to support a >>> stringify method and not every other type? >>> >>> list.floatify >>> list.intify >>> list.tuplify >>> list.setify >>> list.iteratorify >>> >>> Programming languages should be more about composable, re-usable general >>> purpose components more than special cases. >>> >>> -- >>> Steve >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >>> > > -- > Keeping medicines from the bloodstreams of the sick; food > from the bellies of the hungry; books from the hands of the > uneducated; technology from the underdeveloped; and putting > advocates of freedom in prisons. Intellectual property is > to the 21st century what the slave trade was to the 16th. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Tue Jan 29 22:46:11 2019 From: mertz at gnosis.cx (David Mertz) Date: Tue, 29 Jan 2019 22:46:11 -0500 Subject: [Python-ideas] Add list.join() please In-Reply-To: References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> Message-ID: Of course not! The request was for something that worked on Python *collections*. If the OP wanted something that worked on iterables in general, we'd need a different function with different behavior. Of course, it also doesn't work on dictionaries. I don't really have any ideas what the desired behavior might be for dicts. Various things are conceivable, none obvious. But it's fine on lists, sets, tuples, deques, and some other things that are roughly sequence-like. 
On Tue, Jan 29, 2019, 10:38 PM Robert Vanden Eynde > stringify = lambda it: type(it)(map(str, it)) >> > > stringify(range(5)) doesn't work ^^ > > One advantage or having a standard function is that it has been designed > by a lot of persons for all possible use cases :) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Tue Jan 29 22:51:01 2019 From: mertz at gnosis.cx (David Mertz) Date: Tue, 29 Jan 2019 22:51:01 -0500 Subject: [Python-ideas] Add list.join() please In-Reply-To: References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> Message-ID: The point really is that something called 'stringify()' could do a lot of different reasonable and useful things. None of them are universally what users would want. Unless you have to function scads if optional keyword arguments, is behavior would surprise many users and not for their purpose. On Tue, Jan 29, 2019, 10:46 PM David Mertz Of course not! The request was for something that worked on Python > *collections*. If the OP wanted something that worked on iterables in > general, we'd need a different function with different behavior. > > Of course, it also doesn't work on dictionaries. I don't really have any > ideas what the desired behavior might be for dicts. Various things are > conceivable, none obvious. But it's fine on lists, sets, tuples, deques, > and some other things that are roughly sequence-like. > > > > On Tue, Jan 29, 2019, 10:38 PM Robert Vanden Eynde wrote: > >> >> stringify = lambda it: type(it)(map(str, it)) >>> >> >> stringify(range(5)) doesn't work ^^ >> >> One advantage or having a standard function is that it has been designed >> by a lot of persons for all possible use cases :) >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brenbarn at brenbarn.net Tue Jan 29 21:56:54 2019 From: brenbarn at brenbarn.net (Brendan Barnwell) Date: Tue, 29 Jan 2019 18:56:54 -0800 Subject: [Python-ideas] Add list.join() please In-Reply-To: <5C50E3E5.2060900@canterbury.ac.nz> References: <5C4FDEFC.6070902@brenbarn.net> <5C50E3E5.2060900@canterbury.ac.nz> Message-ID: <5C511276.2000505@brenbarn.net> On 2019-01-29 15:38, Greg Ewing wrote: > Brendan Barnwell wrote: >> Personally what I find is perverse is that .join is a method of >> strings but does NOT call str() on the items to be joined. > > Neither do most other string methods: > > >>> s = "hovercraft" > >>> s.count(42) > Traceback (most recent call last): > File "", line 1, in > TypeError: Can't convert 'int' object to str implicitly > > Why should join() be any different? Oh please. Because it also RETURNS a string. Of course count won't return a string, it returns a count. But str.join is for "I want to join these items into a single string separated by this delimiter". If the output is to a be a string obtained by combining other items, there is nothing lost by converting them to strings. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." 
--author unknown

From greg.ewing at canterbury.ac.nz  Tue Jan 29 23:48:13 2019
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 30 Jan 2019 17:48:13 +1300
Subject: [Python-ideas] Add list.join() please
In-Reply-To: <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me>
References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me>
Message-ID: <5C512C8D.3000803@canterbury.ac.nz>

Alex Shafer via Python-ideas wrote:
> 1) I'm in favor of adding a stringify method to all collections

Are you volunteering to update all the collection
classes in the world written in Python? :-)

--
Greg

From robertve92 at gmail.com  Tue Jan 29 23:57:01 2019
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Wed, 30 Jan 2019 05:57:01 +0100
Subject: [Python-ideas] Add list.join() please
In-Reply-To: 
References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me>
Message-ID: 

On Wed, 30 Jan 2019, 04:46 David Mertz wrote:

> Of course not! [...]
>
I agree

> Of course, it also doesn't work on dictionaries. [...]
>
I agree
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robertve92 at gmail.com  Wed Jan 30 00:01:20 2019
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Wed, 30 Jan 2019 06:01:20 +0100
Subject: [Python-ideas] Add list.join() please
In-Reply-To: 
References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me>
Message-ID: 

I love it when the discussion goes fast like here! :D
The messages are short or long-structured-and-explaining, I love it :)

--

Sorry if I may look like a troll sometimes, I truly like the
conversation and I want to share the excitement :)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From chris.barker at noaa.gov  Wed Jan 30 00:43:56 2019
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Tue, 29 Jan 2019 21:43:56 -0800
Subject: [Python-ideas] Add list.join() please
In-Reply-To: <5C512C8D.3000803@canterbury.ac.nz>
References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> <5C512C8D.3000803@canterbury.ac.nz>
Message-ID: 

> Alex Shafer via Python-ideas wrote:
>> 1) I'm in favor of adding a stringify method to all collections
>
> Are you volunteering to update all the collection
> classes in the world written in Python? :-)

To be fair, we could add an implementation to the sequence ABC, and get
pretty far.

Not that I'm suggesting that -- as I said earlier, Python is all about
iterables, not sequences, anyway.

Also, despite some folks' insistence that this "stringify" method is
something many folks want -- I'm not even sure what it is.

I was thinking it was:

def stringify(self, sep):
    return sep.join(str(i) for i in self)

Which, by the way would work for any iterable :-)

If you want a language designed specifically for text processing, use Perl.

Python is deliberately strongly typed, so that:

2 + "2"

Raises an error. Why should:

"".join([2, "2"]) not raise an error as well?
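For the record, both of those do fail today, and the error messages are
nicely parallel:

>>> 2 + "2"
Traceback (most recent call last):
  ...
TypeError: unsupported operand type(s) for +: 'int' and 'str'
>>> "".join([2, "2"])
Traceback (most recent call last):
  ...
TypeError: sequence item 0: expected str instance, int found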
And aside from repr or ascii, when I turn numbers into text, I usually
want to control the formatting anyway:

" ".join(f"{n:.2f}" for n in seq)

So having str() called automatically for a join wouldn't be that useful.

-CHB

From robertve92 at gmail.com  Wed Jan 30 01:02:35 2019
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Wed, 30 Jan 2019 07:02:35 +0100
Subject: [Python-ideas] Add list.join() please
In-Reply-To: 
References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> <5C512C8D.3000803@canterbury.ac.nz>
Message-ID: 

> def stringify(self, sep):
>     return sep.join(str(i) for i in self)

= sep.join(map(str, self))

However some folks want:

def stringify(*args, *, sep:str=SomeDefault):
    return sep.join(map(str, args))

In order to have:

>>> stringify(1, 2, "3", sep="-")
1-2-3

And I agree about the formatting, we know that str(x) and format(x) are
synonyms so I'd suggest:

def stringify(*args, *, sep:str=SomeDefault, fmt=''):
    return sep.join(format(x, fmt) for x in args)

And the implicit call to str is really not surprising for a function
called stringify IMO

> If you want a language designed specifically for text processing, use Perl.

True ! However typing python -cp "1+1" is really tempting...

> Python is deliberately strongly typed, so that:
>
> 2 + "2"
>
> Raises an error. Why should:
>
> "".join([2, "2"]) not raise an error as well?

I agree
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robertve92 at gmail.com  Wed Jan 30 01:13:33 2019
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Wed, 30 Jan 2019 07:13:33 +0100
Subject: [Python-ideas] Add list.join() please
In-Reply-To: 
References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> <5C512C8D.3000803@canterbury.ac.nz>
Message-ID: 

> def stringify(*args, *, sep:str=SomeDefault):

I meant

def stringify(*args, sep:str=SomeDefault)

So an idea would use duck typing to find out if we have 1 iterable or
multiple things:

def stringify(*args, sep:str=SomeDefault, fmt=''):
    it = args[0] if len(args) == 1 and hasattr(args[0], '__iter__') else args
    return sep.join(format(x, fmt) for x in it)

But ... duck typing is nasty... I don't want that in the stdlib (but in
a pip package, sure!)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tjreedy at udel.edu  Wed Jan 30 02:17:29 2019
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 30 Jan 2019 02:17:29 -0500
Subject: [Python-ideas] Add list.join() please
In-Reply-To: <23e20d26-e3e0-7f41-795b-c4b6b0b39d67@mrabarnett.plus.com>
References: <23e20d26-e3e0-7f41-795b-c4b6b0b39d67@mrabarnett.plus.com>
Message-ID: 

On 1/29/2019 7:12 PM, MRAB wrote:
> On 2019-01-29 23:30, Terry Reedy wrote:
>> On 1/28/2019 8:40 PM, Jamesie Pic wrote:
>>
>> > 0. os.path.join takes *args
>>
>> Since at least 1 path is required, the signature is join(path, *paths).
>> I presume that this is the Python version of the Unix version of the
>> system call that it wraps.  The hidden argument is os.sep.  It is
>> equivalent to os.sep.join((path,)+paths) (though one would not write it
>> this way).
>>
>>> 1. str.join takes a list argument,
>>
>> This premise behind the (repeated) request is false.  str.join's
>> arguments are a string (the joiner) and an *iterable of strings*, which
>> is an abstract subclass of the abstract concept 'iterable'.  And only a
>> small fraction of lists are lists of strings and therefore iterables of
>> strings.
>>
> One of the examples given was writing:
>
> >>> '/'.join('some', 'path')
>
> To me, this suggests that what the OP _really_ wants is for str.join to
> accept multiple arguments, much as os.path.join does.
>
> I thought that there would be a problem with that because currently the
> single argument is an iterable, and you wouldn't want to iterate the
> first argument of '/'.join('some', 'path').
> However, both min and max will accept either a single argument that's
> iterated over or multiple arguments that are not, so there's a precedent
> there.

I have done things like this in private code, but it makes for messy
signatures.  The doc pretends that min has two signatures, given in the
docstring:

min(iterable, *[, default=obj, key=func]) -> value
min(arg1, arg2, *args, *[, key=func]) -> value

I believe that the actual signature is the uninformative min(*args,
**kwargs).  The arg form, without key, is the original.  If min were
being written today, I don't think it would be included.

--
Terry Jan Reedy

From steve at pearwood.info  Wed Jan 30 03:20:46 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 30 Jan 2019 19:20:46 +1100
Subject: [Python-ideas] Add list.join() please
In-Reply-To: <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me>
References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me>
Message-ID: <20190130082045.GH1834@ando.pearwood.info>

On Wed, Jan 30, 2019 at 12:09:55AM +0000, Alex Shafer wrote:

> 2) strings are special and worthy of a "special case" because strings
> tend to be human readable and are used in all kinds of user interface.

So are ints, floats, bools, lists, tuples, sets, dicts, etc.

We already have a "stringify" function that applies to one object at a
time. It's spelled str(), or if you prefer a slightly different format,
repr().

To apply the stringify function of your choice to more than one object,
you can use a for-loop, or a list comprehension, or a set comprehension,
or map(). This is called composition of re-usable components, and it is
a Good Thing.

If you don't like the two built-in stringify functions, you can write
your own, and they still work with for-loops, comprehensions and map().

Best of all, we're not even limited to strings. Change your mind and
want floats instead of strings? Because these are re-usable, composable
components, you don't have to wait for Python 4.3 to get a list
floatify() method, you can just unplug the str() component and replace
it with the float() component.

--
Steve

From jpic at yourlabs.org  Wed Jan 30 04:46:56 2019
From: jpic at yourlabs.org (Jamesie Pic)
Date: Wed, 30 Jan 2019 10:46:56 +0100
Subject: [Python-ideas] Add list.join() please
In-Reply-To: <20190129230457.GG1834@ando.pearwood.info>
References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info>
Message-ID: 

Thanks Steven for your reply. For me, assembling a string from various
variables is a much more common programming task, because that's how
users expect software to communicate with them, be it on CLI, GUI, or
through the Web.

It doesn't matter if your software works and the user doesn't
understand it.
It doesn't matter if your software doesn't work, as long as the user understands it. I wonder what makes my use case so special, perhaps because when I make software it's always on the purpose to serve an actual human being need ? From jpic at yourlabs.org Wed Jan 30 04:58:54 2019 From: jpic at yourlabs.org (Jamesie Pic) Date: Wed, 30 Jan 2019 10:58:54 +0100 Subject: [Python-ideas] Add list.join() please In-Reply-To: References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> Message-ID: On Wed, Jan 30, 2019 at 2:45 AM David Mertz wrote: > Done! Does that really need to be in the STDLIB? Well, Robert suggested to define it in the python startup script. The issue I'm having with that is that it will make my software harder to distribute: it will require the user to hack their startup script, or even worse : do it ourself in setup.py ! Jonathan suggested to add it to an external package like strtools that has a smartsplit() function, but not smartjoin(). So far I have a PR in boltons, I've requested their comments, so, I'll let you know if they have a refutation to provide. Otherwise, I will try to submit it to the strtools package. Otherwise, I can make a custom package for that one-liner, like it's fairly common to do in NPM packages. Do you have any suggestions on the API ? I see that the implode name is available on PyPi, do you think this would be nice to import the one-liner ? from implode import implode Thanks for your reply -- ? From jpic at yourlabs.org Wed Jan 30 05:01:11 2019 From: jpic at yourlabs.org (Jamesie Pic) Date: Wed, 30 Jan 2019 11:01:11 +0100 Subject: [Python-ideas] Add list.join() please In-Reply-To: References: Message-ID: I'm not disagreeing by any mean. I'm just saying assembling strings is a common programing task and that we have two different methods with the same name and inconsistent signatures and that it's error-prone. I'm most certainly *not* advocating for breaking compatibility or whatnot. From rosuav at gmail.com Wed Jan 30 05:05:25 2019 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 30 Jan 2019 21:05:25 +1100 Subject: [Python-ideas] Add list.join() please In-Reply-To: References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> Message-ID: On Wed, Jan 30, 2019 at 8:50 PM Jamesie Pic wrote: > > For me, assembling a string from various variables is a much more > common programing task, because that's how users except software to > communicate with them, be it on CLI, GUI, or through Web. > > It doesn't matter if your software works and the user doesn't > understand it. It doesn't matter if your software doesn't work, as > long as the user understands it. > > I wonder what makes my use case so special, perhaps because when I > make software it's always on the purpose to serve an actual human > being need ? Most places where you need to talk to humans, you'll end up either interpolating the values into a template of some sort (see: percent formatting, the format method, and f-strings), or plug individual values straight into method calls (eg when building a GUI). I'm not sure why or how your use-case is somehow different here. It's generally best to provide simple low-level functionality, and then let people build it into whatever they like. 
For example, VLC Media Player and Counter-Strike: Global Offensive don't have any means of interacting, but with some simple Python programming in between, it's possible to arrange it so that the music automatically pauses while you're in a match. But there does NOT need to be a game feature "automatically pause VLC while in a match". Joining a collection of strings is possible. Stringifying a collection of arbitrary objects is possible. There doesn't need to be a single feature that does both at once. ChrisA From jpic at yourlabs.org Wed Jan 30 05:07:52 2019 From: jpic at yourlabs.org (Jamesie Pic) Date: Wed, 30 Jan 2019 11:07:52 +0100 Subject: [Python-ideas] Add list.join() please In-Reply-To: References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> <5C512C8D.3000803@canterbury.ac.nz> Message-ID: On Wed, Jan 30, 2019 at 7:03 AM Robert Vanden Eynde wrote: >> >> Raises an error. Why should: >> >> ??.join([2, ?2?]) not raise an error as well? > > I agree What do you think could be the developer intent when they do ",".join([2, "2']) ? If the intent is clearly to assemble a string, as it looks like, then I don't find any disadvantage to automate this task for them. -- ? From jpic at yourlabs.org Wed Jan 30 05:08:44 2019 From: jpic at yourlabs.org (Jamesie Pic) Date: Wed, 30 Jan 2019 11:08:44 +0100 Subject: [Python-ideas] Add list.join() please In-Reply-To: References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> <5C512C8D.3000803@canterbury.ac.nz> Message-ID: On Wed, Jan 30, 2019 at 7:14 AM Robert Vanden Eynde wrote: > But ? duck typing is nasty... I don't want that in the stdlib (but in a pip package, sure!) Not only do I actually like your implementation, but I also love duck typing. For me duck typing means freedom, not barrier. -- ? From steve at pearwood.info Wed Jan 30 05:17:34 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 30 Jan 2019 21:17:34 +1100 Subject: [Python-ideas] Stack traces ought to flag when a module has been changed on disk Message-ID: <20190130101733.GI1834@ando.pearwood.info> This thought is motivated by this bug report: https://bugs.python.org/issue35857 If you import a module, then edit the .py file that goes with it, and then an exception occurs, the stack trace can show the wrong line. It doesn't happen very often, but when it does happen, it can be very perplexing. Here's a proposal: When a stack trace is printed, before printing each line, the interpreter checks whether the file's modification time is later than the time recorded in the .pyc file. If the times are different, the stack trace can flag the line and print an addition line stating that the file may have changed and the stack trace may not be accurate. Something like this perhaps? Traceback (most recent call last): File "spam.py", line 99, in eggs.foo() File "eggs.py", line 123, in ? for obj in sequence: File "cheese.py", line 456, in n = len(x) *** one or more files may have changed *** lines starting with ? may be inaccurate TypeError: object of type 'NoneType' has no len() I don't think performance will matter. Generating stack traces are rarely performance critical, so I don't think that a few extra file system checks will make any meaningful difference. Thoughts? 
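To show the check really is cheap, here is a minimal sketch of it. It
reads the source-mtime field that timestamp-based .pyc files record in
their header (the PEP 552 layout: magic, flags, mtime, size, four bytes
each); treat the offsets and the hash-based-pyc handling as assumptions
to verify, not a finished design:

import importlib.util
import os
import struct

def source_may_have_changed(source_path):
    """True if source_path looks newer than its cached .pyc."""
    try:
        cache_path = importlib.util.cache_from_source(source_path)
        with open(cache_path, "rb") as f:
            header = f.read(16)
        magic, flags, pyc_mtime, size = struct.unpack("<4sLLL", header)
        if flags & 0b11:
            # hash-based pyc: no mtime recorded; would compare source hashes
            return False
        return (int(os.stat(source_path).st_mtime) & 0xFFFFFFFF) != pyc_mtime
    except (OSError, struct.error):
        return False   # no cached file, or unreadable: nothing to flag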
From jpic at yourlabs.org  Wed Jan 30 05:17:34 2019
From: jpic at yourlabs.org (Jamesie Pic)
Date: Wed, 30 Jan 2019 11:17:34 +0100
Subject: [Python-ideas] Add list.join() please
In-Reply-To: <20190130082045.GH1834@ando.pearwood.info>
References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> <20190130082045.GH1834@ando.pearwood.info>
Message-ID: 

On Wed, Jan 30, 2019 at 9:21 AM Steven D'Aprano wrote:
> If you don't like the two built-in stringify functions, you can write your own, and they still work with for-loops, comprehensions and map().

I don't disagree; after all, there are many NPM packages that contain really short functions, so we could package the function on its own. I see that the "implode" namespace is not taken on PyPI, so what do you suggest it would look like?

from implode import implode

Or can you suggest better names?

> Best of all, we're not even limited to strings. Change your mind and want floats instead of strings?

To be user-friendly, software needs to build proper text output, and most of the time joining a sequence is the best way to go. But I often make mistakes because I'm switching back and forth between os.path.join and str.join.

-- ?

From jpic at yourlabs.org  Wed Jan 30 05:19:32 2019
From: jpic at yourlabs.org (Jamesie Pic)
Date: Wed, 30 Jan 2019 11:19:32 +0100
Subject: [Python-ideas] Add list.join() please
In-Reply-To: 
References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> <20190130082045.GH1834@ando.pearwood.info>
Message-ID: 

On Wed, Jan 30, 2019 at 11:17 AM Jamesie Pic wrote:
> I often make mistakes because I'm switching back and forth between os.path.join and str.join.

I didn't mean "replacing an os.path.join call with a str.join call"; I meant that I'm calling str.join 2 seconds after os.path.join, and forgot about the inconsistency we have between the two. Does this make any sense?

Thanks for your great replies

-- ?

From eric at trueblade.com  Wed Jan 30 05:24:08 2019
From: eric at trueblade.com (Eric V. Smith)
Date: Wed, 30 Jan 2019 05:24:08 -0500
Subject: [Python-ideas] Add list.join() please
In-Reply-To: 
References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> <5C512C8D.3000803@canterbury.ac.nz>
Message-ID: <77b056de-91d9-5493-95c0-0af90105f9e2@trueblade.com>

On 1/30/2019 5:07 AM, Jamesie Pic wrote:
> On Wed, Jan 30, 2019 at 7:03 AM Robert Vanden Eynde wrote:
>>> Raises an error. Why should:
>>>
>>> ",".join([2, "2"]) not raise an error as well?
>>
>> I agree
>
> What do you think the developer intent could be when they write ",".join([2, "2"])?
>
> If the intent is clearly to assemble a string, as it looks like it is, then I don't see any disadvantage to automating this task for them.

Your examples show literals, but I literally (heh) never use str.join this way. I always pass it some variable. And 100% of the time, if that variable (say it's a list) contains something that's not a string, I want it to raise an exception. I do not want this to succeed:

lst = ['hello', None]
', '.join(lst)

lst is usually computed a long way from where the join happens.

So, I do not want this task automated for me.

Eric
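To make the two behaviours Eric contrasts concrete, a quick illustrative snippet (not from the original mail):

    lst = ['hello', None]
    ', '.join(lst)            # TypeError: sequence item 1: expected str instance, NoneType found
    ', '.join(map(str, lst))  # 'hello, None' -- coercion, but explicit at the call site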
From jpic at yourlabs.org  Wed Jan 30 05:26:17 2019
From: jpic at yourlabs.org (Jamesie Pic)
Date: Wed, 30 Jan 2019 11:26:17 +0100
Subject: [Python-ideas] Add list.join() please
In-Reply-To: 
References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info>
Message-ID: 

On Wed, Jan 30, 2019 at 11:06 AM Chris Angelico wrote:
> Most places where you need to talk to humans, you'll end up either interpolating the values into a template of some sort (see: percent formatting, the format method, and f-strings), or plugging individual values straight into method calls (eg when building a GUI). I'm not sure why or how your use-case is somehow different here.

The new Python format syntax with f-strings is pretty awesome; let's see how we can assemble a triple-quoted f-string:

foo = f'''
some {more(complex)}
{st.ri("ng")}
'''.strip()

Pretty cool, right? In a function it would look like this:

def foo():
    return f'''
    some {more(complex)}
    {st.ri("ng")}
    '''.strip()

Ok, so that would also work, but we're going to have to import a module from the standard library to restore visual indentation on that code:

import textwrap

def foo():
    return textwrap.dedent(f'''
        some {more(complex)}
        {st.ri("ng")}
    ''').strip()

Let's compare this to the join notation:

def foo():
    return '\n'.join('some', more(complex), st.ri('ng'))

Needless to say, I prefer the join notation for this use case. Not only does it fit on a single line, but it doesn't require dedenting the text with an imported function, nor does it require juggling with quotes, and it also sort of looks like it would be more performant.

All in all, I prefer the join notation for assembling longer strings. Note that in practice I use f-strings for the "pieces" that I want to assemble, and that works great:

def foo():
    return '\n'.join('some', more(complex), f'_{other}_')

Anyway, ok, good-enough-looking code! Let's see what you have to say:

TypeError: join() takes exactly one argument (2 given)

Oh, that again, kk, got a fix:

def foo():
    return '\n'.join(['some', more(complex), f'_{other}_'])

I should take metrics about the number of times a day I make this mistake, cause it looks like it would be a lot (I switch between os.path.join and str.join a lot).

It seems there is a lot of friction when proposing to add a convenience join method to the list type. I won't go over the reasons for this here; there's already a lot to read about it on the internet, written during the last 20 years.

## Conclusion

I have absolutely no idea what should be done about this; the purpose of this article was just to share a bit of one of my obsessions with string assembling. Maybe it's striking that I assemble strings multiple times a day, with a language I've got 10 years of full-time experience in, and still repeat the same mistakes because I coded an os.path.join call 3 seconds before assembling a string with str.join, silly me ^^ Not because I don't understand the jurisprudence, not because I don't understand the documentation, or because the documentation is wrong, but probably just because I switch between os.path.join and str.join, which take different syntaxes, I think.

> It's generally best to provide simple low-level functionality, and then let people build it into whatever they like. For example, VLC Media Player and Counter-Strike: Global Offensive don't have any means of interacting, but with some simple Python programming in between, it's possible to arrange it so that the music automatically pauses while you're in a match. But there does NOT need to be a game feature "automatically pause VLC while in a match". Joining a collection of strings is possible. Stringifying a collection of arbitrary objects is possible. There doesn't need to be a single feature that does both at once.
Even for a program without a user interface, you still want proper logs in case your software crashes, for example. So even if you're not building a user interface, you still want to assemble human-readable strings. If it's such a common task, why not automate what's obvious to automate?

-- ?

From jpic at yourlabs.org  Wed Jan 30 05:31:46 2019
From: jpic at yourlabs.org (Jamesie Pic)
Date: Wed, 30 Jan 2019 11:31:46 +0100
Subject: [Python-ideas] Add list.join() please
In-Reply-To: <77b056de-91d9-5493-95c0-0af90105f9e2@trueblade.com>
References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> <5C512C8D.3000803@canterbury.ac.nz> <77b056de-91d9-5493-95c0-0af90105f9e2@trueblade.com>
Message-ID: 

On Wed, Jan 30, 2019 at 11:24 AM Eric V. Smith wrote:
> Your examples show literals, but I literally (heh) never use str.join this way. I always pass it some variable. And 100% of the time, if that variable (say it's a list) contains something that's not a string, I want it to raise an exception. I do not want this to succeed:
>
> lst = ['hello', None]
> ', '.join(lst)
>
> lst is usually computed a long way from where the join happens.
>
> So, I do not want this task automated for me.

That's a really good point! So maybe we could have a parameter for that ...

from implode import implode

assert implode('-', [3, None, 2], none_str='') == '3-2'

Even that still seems pretty fuzzy to me. Please, can you share an idea for improvement?

-- ?

From pradyunsg at gmail.com  Wed Jan 30 05:41:11 2019
From: pradyunsg at gmail.com (Pradyun Gedam)
Date: Wed, 30 Jan 2019 16:11:11 +0530
Subject: [Python-ideas] Stack traces ought to flag when a module has been changed on disk
In-Reply-To: <20190130101733.GI1834@ando.pearwood.info>
References: <20190130101733.GI1834@ando.pearwood.info>
Message-ID: 

On Wed, 30 Jan 2019 at 3:48 PM, Steven D'Aprano wrote:
> This thought is motivated by this bug report:
>
> https://bugs.python.org/issue35857
>
> If you import a module, then edit the .py file that goes with it, and then an exception occurs, the stack trace can show the wrong line.
>
> It doesn't happen very often, but when it does happen, it can be very perplexing. Here's a proposal:
>
> When a stack trace is printed, before printing each line, the interpreter checks whether the file's modification time is later than the time recorded in the .pyc file. If the times are different, the stack trace can flag the line and print an additional line stating that the file may have changed and the stack trace may not be accurate.
>
> Something like this perhaps?
>
> Traceback (most recent call last):
>   File "spam.py", line 99, in <module>
>     eggs.foo()
>   File "eggs.py", line 123, in <module>
> ?   for obj in sequence:
>   File "cheese.py", line 456, in <module>
>     n = len(x)
> *** one or more files may have changed
> *** lines starting with ? may be inaccurate
> TypeError: object of type 'NoneType' has no len()
>
> I don't think performance will matter. Generating stack traces is rarely performance critical, so I don't think that a few extra file system checks will make any meaningful difference.
>
> Thoughts?

I like this idea.
A variation would be to just add " (modified)" to the file-name/line-number bits of the stack trace, for the appropriate files.

-- Pradyun

> -- Steve
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info  Wed Jan 30 05:57:52 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 30 Jan 2019 21:57:52 +1100
Subject: [Python-ideas] Add list.join() please
In-Reply-To: 
References: <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> <5C512C8D.3000803@canterbury.ac.nz>
Message-ID: <20190130105752.GJ1834@ando.pearwood.info>

On Wed, Jan 30, 2019 at 11:07:52AM +0100, Jamesie Pic wrote:
> What do you think the developer intent could be when they write ",".join([2, "2"])?

I don't know what your intent was, although I can guess, but I do know that I sure as hell don't want a dumb piece of software like the interpreter running code that I didn't write because it tried to guess what I possibly may have meant.

http://www.catb.org/jargon/html/D/DWIM.html

And from the Zen:

    Errors should never pass silently.
    Unless explicitly silenced.
    In the face of ambiguity, refuse the temptation to guess.

Don't think about toy examples where you put literals in the code. Sure, we want a string, but that's trivial. What sort of string and what should it look like? Think about non-trivial code like this:

    header = generate_header()
    body = template.format(','.join(strings))
    document = make(header, body)

and imagine that somehow, a non-string slips into something which is supposed to be a string. Now what do you think my intent is?

It isn't enough to just say "I want a string dammit, and I don't care what's in it!". If a non-string slips in there, I sure as hell want to know how and why, because that's a bug, not a minor inconvenience. The most junior developer in the team could easily paper over the bug by adding in a call to map(str, strings) but that doesn't fix the bug, it just hides it and all but guarantees the document generated is corrupt, or worse, wrong.

"I find it amusing when novice programmers believe their main job is preventing programs from crashing. ... More experienced programmers realize that correct code is great, code that crashes could use improvement, but incorrect code that doesn't crash is a horrible nightmare." -- Chris Smith

If we look at where the strings come from:

    strings = [format_record(obj) for obj in datasource if condition(obj)]

we're now two steps away from the simplistic "we want a string" guess of your example. When we look at format_record and find this:

    def format_record(record):
        if record.count < 2:
            ...
        elif record.type in ('spam', 'eggs'):
            ...
        elif record.next() is None:
            ...
        # and so on for half a page

we're just getting further and further away from the trivial cases of "just give me a string dammit!".

Going back to your example (correcting the syntax error):

    ",".join([2, "2"])

To save you about a quarter of a second by avoiding having to type quote marks around the first item, you would cost me potentially hours or days of hair-tearing debugging, trying to work out why the document I'm generating is occasionally invalid or corrupt in hard-to-find ways.

That's not a trade-off I have any interest in making.

-- Steve
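A tiny illustration of the failure mode described here, with hypothetical values; the map(str, ...) "fix" trades a loud error for quietly corrupt output:

    strings = ['alice@example.com', None, 'bob@example.com']  # None slipped in upstream
    ','.join(strings)            # TypeError -- loud, and points near the bug
    ','.join(map(str, strings))  # 'alice@example.com,None,bob@example.com' -- silent corruption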
From jpic at yourlabs.org  Wed Jan 30 06:13:32 2019
From: jpic at yourlabs.org (Jamesie Pic)
Date: Wed, 30 Jan 2019 12:13:32 +0100
Subject: [Python-ideas] Add list.join() please
In-Reply-To: 
References: <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> <5C512C8D.3000803@canterbury.ac.nz> <20190130105752.GJ1834@ando.pearwood.info>
Message-ID: 

Wow, thanks for your great reply Steven! It really helps me get a better understanding of what I'm trying to do and move forward in my research!

Some values are not going to be nice as strings, so I think I'm instead going to try to make a convenience shortcut for str/map/join, for when I want to generate a human-readable string, i.e.:

mapjoin(*args, sep='\n', key=str)

Then I could replace:

readable = '\n'.join(map(str, [
    'hello',
    f'__{name}__',
    # etc...
]))

OR

def foo():
    readable = textwrap.dedent(f'''
        hello
        __{name}__
    ''').strip()

With:

readable = mapjoin(
    'hello',
    f'__{name}__'
    sep='\n',
    # map=format_record could be used
)

That removes the "fuzzy" feeling I get from my previous proposals. So if, after a while, people are using that mapjoin that we could have on PyPI, we could perhaps consider it to improve str.join.

Or do you think adding such a feature to str.join is still discussable?

From jpic at yourlabs.org  Wed Jan 30 06:17:56 2019
From: jpic at yourlabs.org (Jamesie Pic)
Date: Wed, 30 Jan 2019 12:17:56 +0100
Subject: [Python-ideas] Add list.join() please
In-Reply-To: 
References: <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> <5C512C8D.3000803@canterbury.ac.nz> <20190130105752.GJ1834@ando.pearwood.info>
Message-ID: 

Oops, fixing my last example:

readable = mapjoin(
    'hello',
    f'__{name}__',
    sep='\n',
    # key=format_record could be used here
)

Signature would be like (illustrating defaults):

mapjoin(*args, sep='\n', key=str)

From ijkl at netc.fr  Wed Jan 30 06:18:08 2019
From: ijkl at netc.fr (Jimmy Girardet)
Date: Wed, 30 Jan 2019 12:18:08 +0100
Subject: [Python-ideas] Add list.join() please
In-Reply-To: 
References: 
Message-ID: 

Hi,

In the end, this long thread exists because two functions that do quite similar things have the same name but not the same signature, and that's confusing for some people (I'm one of them):

str.join(iterable)
os.path.join(path, *paths)

There are strong arguments about why it's implemented like that and why it's very difficult to change it.

Maybe one change could be letting str.join take either one iterable or many args. About str.join:

a - if 0 args: error
b - if 1 arg: process, or raise an error if it's not iterable
c - if > 1 args: do b using all the args as one iterable

Maybe some performance issues could go against it.

I agree with the fact that this is a minor need and it should not allow a major change.

Le 30/01/2019 à 11:01, Jamesie Pic a écrit :
> I'm not disagreeing by any means. I'm just saying that assembling strings is a common programming task, that we have two different methods with the same name and inconsistent signatures, and that it's error-prone. I'm most certainly *not* advocating for breaking compatibility or whatnot.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
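For reference, a minimal standalone sketch of the mapjoin helper floated above, which also happens to cover Jimmy's one-iterable-or-many-args variant; none of this is settled API, and the iterable detection shown is just one possible choice:

    def mapjoin(*args, sep='\n', key=str):
        # Accept either several positional items or one iterable of items,
        # apply key to each, then join with sep.
        if len(args) == 1 and not isinstance(args[0], str):
            try:
                args = tuple(args[0])   # a single iterable was passed
            except TypeError:
                pass                    # a single non-iterable item
        return sep.join(key(item) for item in args)

    mapjoin('a', 2, sep=' ')     # 'a 2'
    mapjoin([1, 2, 3], sep='-')  # '1-2-3'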
From jpic at yourlabs.org  Wed Jan 30 06:30:20 2019
From: jpic at yourlabs.org (Jamesie Pic)
Date: Wed, 30 Jan 2019 12:30:20 +0100
Subject: [Python-ideas] Add list.join() please
In-Reply-To: 
References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info>
Message-ID: 

On Wed, Jan 30, 2019 at 11:06 AM Chris Angelico wrote:
> Most places where you need to talk to humans, you'll end up either interpolating the values into a template of some sort (see: percent formatting, the format method, and f-strings), or plugging individual values straight into method calls (eg when building a GUI). I'm not sure why or how your use-case is somehow different here.

Actually we're moving away from templates, in favor of a functional, decorating, component-based pattern pretty much like React, in some R&D open-source project. Not only do we get much better performance than with a template rendering engine, but we also get all the power of a good programming language: Python :)

-- ?

From jpic at yourlabs.org  Wed Jan 30 06:37:45 2019
From: jpic at yourlabs.org (Jamesie Pic)
Date: Wed, 30 Jan 2019 12:37:45 +0100
Subject: [Python-ideas] Add list.join() please
In-Reply-To: 
References: 
Message-ID: 

Thanks for your reply, Jimmy!

As suggested by Chris and Steven, we might also want to throw in a "key" kwarg that could be None by default to keep backward compatibility, but would also allow typecasting:

' '.join('a', 2, key=str)

-- ?

From jfine2358 at gmail.com  Wed Jan 30 06:39:40 2019
From: jfine2358 at gmail.com (Jonathan Fine)
Date: Wed, 30 Jan 2019 11:39:40 +0000
Subject: [Python-ideas] Stack traces ought to flag when a module has been changed on disk
In-Reply-To: 
References: <20190130101733.GI1834@ando.pearwood.info>
Message-ID: 

I think Steve's suggestion fails in this situation. Suppose wibble.py contains a function fn. Now do

import wibble
fn = wibble.fn
# Modify and save wibble.py
reload(wibble)
fn()

I've posted a message to this effect in the original bug
https://bugs.python.org/msg334553

Please note that the original poster, after the cause has been explained, is happy for the bug to be closed.
https://bugs.python.org/msg334551

Perhaps move discussion back to https://bugs.python.org/issue35857.

-- Jonathan

From p.f.moore at gmail.com  Wed Jan 30 06:42:43 2019
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 30 Jan 2019 11:42:43 +0000
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To: 
References: 
Message-ID: 

On Fri, 4 Jan 2019 at 19:03, Abe Dillon wrote:
> Currently PEP-8 prescribes all caps for constants and uses the all-caps variable "FILES" as an example in a different section. It also appears to be the de facto standard for enums (based on the documentation).
>
> I don't think it's necessary to make any breaking changes. Just PEP-8 and (of less importance) spurious documentation examples.

If you don't like the recommendation, just don't follow it. It's not like it's set in stone or anything. Personally, I like it and I'm glad it's used on the projects I work on. But you do what suits you.
As it's unlikely that the stdlib will stop using caps for constants, changing PEP 8 isn't appropriate (see the first line of the PEP: "This document gives coding conventions for the Python code comprising the standard library in the main Python distribution").

Paul

From mertz at gnosis.cx  Wed Jan 30 08:09:20 2019
From: mertz at gnosis.cx (David Mertz)
Date: Wed, 30 Jan 2019 08:09:20 -0500
Subject: [Python-ideas] Add list.join() please
In-Reply-To: 
References: 
Message-ID: 

I really don't get the "two different signatures" concern. The two functions do different things; why would we expect them to automatically share a signature?

There are a zillion different open() functions or methods in the standard library, and far more in third-party software. They each have various different signatures and functionality because they "open" different things. So what? Use the interface to the function you are using, not to something else that happens to share a name (in a different namespace).

On Wed, Jan 30, 2019, 5:06 AM Jamesie Pic wrote:
> I'm not disagreeing by any means. I'm just saying that assembling strings is a common programming task, that we have two different methods with the same name and inconsistent signatures, and that it's error-prone. I'm most certainly *not* advocating for breaking compatibility or whatnot.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From boxed at killingar.net  Wed Jan 30 08:24:13 2019
From: boxed at killingar.net (=?utf-8?Q?Anders_Hovm=C3=B6ller?=)
Date: Wed, 30 Jan 2019 14:24:13 +0100
Subject: [Python-ideas] Stack traces ought to flag when a module has been changed on disk
In-Reply-To: <20190130101733.GI1834@ando.pearwood.info>
References: <20190130101733.GI1834@ando.pearwood.info>
Message-ID: <5308F88F-38EE-4D9D-8C01-13F26440F877@killingar.net>

I've been bitten by this and it always costs me several minutes of confusion. +1

> On 30 Jan 2019, at 11:17, Steven D'Aprano wrote:
>
> This thought is motivated by this bug report:
>
> https://bugs.python.org/issue35857
>
> If you import a module, then edit the .py file that goes with it, and then an exception occurs, the stack trace can show the wrong line.
>
> It doesn't happen very often, but when it does happen, it can be very perplexing. Here's a proposal:
>
> When a stack trace is printed, before printing each line, the interpreter checks whether the file's modification time is later than the time recorded in the .pyc file. If the times are different, the stack trace can flag the line and print an additional line stating that the file may have changed and the stack trace may not be accurate.
>
> Something like this perhaps?
>
> Traceback (most recent call last):
>   File "spam.py", line 99, in <module>
>     eggs.foo()
>   File "eggs.py", line 123, in <module>
> ?   for obj in sequence:
>   File "cheese.py", line 456, in <module>
>     n = len(x)
> *** one or more files may have changed
> *** lines starting with ? may be inaccurate
> TypeError: object of type 'NoneType' has no len()
>
> I don't think performance will matter. Generating stack traces is rarely performance critical, so I don't think that a few extra file system checks will make any meaningful difference.
>
> Thoughts?
> > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From rosuav at gmail.com Wed Jan 30 08:30:05 2019 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 31 Jan 2019 00:30:05 +1100 Subject: [Python-ideas] Add list.join() please In-Reply-To: References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> Message-ID: On Wed, Jan 30, 2019 at 10:33 PM Jamesie Pic wrote: > > On Wed, Jan 30, 2019 at 11:06 AM Chris Angelico wrote: > > > Most places where you need to talk to humans, you'll end up either > > interpolating the values into a template of some sort (see: percent > > formatting, the format method, and f-strings), or plug individual > > values straight into method calls (eg when building a GUI). I'm not > > sure why or how your use-case is somehow different here. > > Actually we're moving away from templates, in favor of functional > decorating component-based pattern pretty much like React, in some R&D > open source project. Not only do we get much better performance than > with a template rendering engine, but we also get all the power of a > good programing language: Python :) > Well, I've no idea how your component-based system works, but in React itself, under the covers, the values end up going straight into function calls, which was the other common suggestion I gave :) There's a reason that those two styles, rather than join() itself, will tend to handle most situations. ChrisA From tritium-list at sdamon.com Wed Jan 30 11:34:33 2019 From: tritium-list at sdamon.com (Alex Walters) Date: Wed, 30 Jan 2019 11:34:33 -0500 Subject: [Python-ideas] Stack traces ought to flag when a module has been changed on disk In-Reply-To: References: <20190130101733.GI1834@ando.pearwood.info> Message-ID: <08de01d4b8b9$af275cc0$0d761640$@sdamon.com> > -----Original Message----- > From: Python-ideas list=sdamon.com at python.org> On Behalf Of Jonathan Fine > Sent: Wednesday, January 30, 2019 6:40 AM > To: python-ideas > Subject: Re: [Python-ideas] Stack traces ought to flag when a module has > been changed on disk > > I think Steve's suggestion fails in this situation. Suppose wibble.py > contains a function fn. Now do > import wibble > fn = wibble.fn > # Modify and save wibble.py > reload(wibble) > fn() > I think using reload should raise warnings, since it doesn't work, and the reload case shouldn't be the killer of this really good idea. > I've posted a message to this effect in the original bug > https://bugs.python.org/msg334553 > > Please note that the original poster, after the cause has been > explained, is happy for the bug to be closed. > https://bugs.python.org/msg334551 > > Perhaps move discussion back to https://bugs.python.org/issue35857. 
> -- > Jonathan > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From fuzzyman at gmail.com Wed Jan 30 12:34:32 2019 From: fuzzyman at gmail.com (Michael Foord) Date: Wed, 30 Jan 2019 17:34:32 +0000 Subject: [Python-ideas] Stack traces ought to flag when a module has been changed on disk In-Reply-To: <08de01d4b8b9$af275cc0$0d761640$@sdamon.com> References: <20190130101733.GI1834@ando.pearwood.info> <08de01d4b8b9$af275cc0$0d761640$@sdamon.com> Message-ID: On Wed, 30 Jan 2019 at 16:34, Alex Walters wrote: > > > > -----Original Message----- > > From: Python-ideas > list=sdamon.com at python.org> On Behalf Of Jonathan Fine > > Sent: Wednesday, January 30, 2019 6:40 AM > > To: python-ideas > > Subject: Re: [Python-ideas] Stack traces ought to flag when a module has > > been changed on disk > > > > I think Steve's suggestion fails in this situation. Suppose wibble.py > > contains a function fn. Now do > > import wibble > > fn = wibble.fn > > # Modify and save wibble.py > > reload(wibble) > > fn() > > > > I think using reload should raise warnings, since it doesn't work, and the > reload case shouldn't be the killer of this really good idea. > > Reload isn't the issue here. Even without the reload the call to `fun()` will no longer match the file on disk. reload was moved to the imp module for exactly that reason. Michael > > I've posted a message to this effect in the original bug > > https://bugs.python.org/msg334553 > > > > Please note that the original poster, after the cause has been > > explained, is happy for the bug to be closed. > > https://bugs.python.org/msg334551 > > > > Perhaps move discussion back to https://bugs.python.org/issue35857. > > -- > > Jonathan > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- http://www.michaelfoord.co.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfine2358 at gmail.com Wed Jan 30 12:35:40 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Wed, 30 Jan 2019 17:35:40 +0000 Subject: [Python-ideas] Stack traces ought to flag when a module has been changed on disk In-Reply-To: <08de01d4b8b9$af275cc0$0d761640$@sdamon.com> References: <20190130101733.GI1834@ando.pearwood.info> <08de01d4b8b9$af275cc0$0d761640$@sdamon.com> Message-ID: > I think using reload should raise warnings, since it doesn't work, and the > reload case shouldn't be the killer of this really good idea. In Python2, reload is in __builtin__ module (and so available without import at the Python console). Since Python 3.4 this functionality is in the importlib module. https://docs.python.org/3.7/library/importlib.html#importlib.reload reload is a 44 line pure Python function. No need to use it if you don't want to. 
https://github.com/python/cpython/blob/3.7/Lib/importlib/__init__.py#L133-L176 I was writing this as Michael Foord's similar (but concise) post came in. -- Jonathan From barry at barrys-emacs.org Wed Jan 30 13:31:09 2019 From: barry at barrys-emacs.org (Barry Scott) Date: Wed, 30 Jan 2019 18:31:09 +0000 Subject: [Python-ideas] Add list.join() please In-Reply-To: References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> <5C512C8D.3000803@canterbury.ac.nz> Message-ID: <0E539D40-A0AE-4DCD-A8D1-92C4CBA447BB@barrys-emacs.org> > On 30 Jan 2019, at 10:07, Jamesie Pic wrote: > > On Wed, Jan 30, 2019 at 7:03 AM Robert Vanden Eynde > wrote: >>> >>> Raises an error. Why should: >>> >>> ??.join([2, ?2?]) not raise an error as well? >> >> I agree > > What do you think could be the developer intent when they do > ",".join([2, "2']) ? > > If the intent is clearly to assemble a string, as it looks like, then > I don't find any disadvantage to automate this task for them. The intent is not clear. How is the 2 to be formatted? I fixed a nasty bug recently where a join of a list of strings contained a non-string in some cases. If the str(bad_value) had been the default I would not have been able to track this down from the traceback in a few minutes. I'm -1 on this idea as it would hide bugs in my experience. Barry > > -- > ? > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From jpic at yourlabs.org Wed Jan 30 14:38:33 2019 From: jpic at yourlabs.org (Jamesie Pic) Date: Wed, 30 Jan 2019 20:38:33 +0100 Subject: [Python-ideas] Add list.join() please In-Reply-To: <0E539D40-A0AE-4DCD-A8D1-92C4CBA447BB@barrys-emacs.org> References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> <5C512C8D.3000803@canterbury.ac.nz> <0E539D40-A0AE-4DCD-A8D1-92C4CBA447BB@barrys-emacs.org> Message-ID: Thanks for your email Barry. This is indeed a good point and the proposal has changed a bit since then. It's more "add a key kwarg to str.join where you can set key=str yourself if you want". From abedillon at gmail.com Wed Jan 30 14:47:56 2019 From: abedillon at gmail.com (Abe Dillon) Date: Wed, 30 Jan 2019 13:47:56 -0600 Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS In-Reply-To: References: <20190129223001.GA92125@cskk.homeip.net> Message-ID: > Is it that really obnoxious? EXTREMELY! > Does using upper case for constants measurably slows down coders? Can you cite the actual papers describing such experiments that lead to this conclusion ? https://www.mity.com.au/blog/writing-readable-content-and-why-all-caps-is-so-hard-to-read https://en.wikipedia.org/wiki/All_caps#Readability https://uxmovement.com/content/all-caps-hard-for-users-to-read/ https://practicaltypography.com/all-caps.html > from my experience having a visual clue that a value is a constant or an enum is something pretty useful. Do you have any proof that it's useful? Have you ever been tempted to modify math.pi or math.e simply because they're lower case? Have you ever stopped to wonder if those values change? If the socket library used packet_host, packet_broadcast, etc. instead of PACKET_HOST, PACKET_BROADCAST, ETC. 
would you be confused about whether it's a good idea to rebind those variables? Would you be tempted to write the line of code: socket.packet_host = x? It seems to me that nobody is actually considering what I'm actually talking about very carefully. They just assume that because all caps is used to convey information that information is actually important. Not just important, but important enough that it should be in PEP-8. They say I should just violate PEP-8 because it's not strictly enforced. It is strictly enforced in workplaces. I don't see why it can't be the other way around: PEP-8 doesn't say to use all caps, but if you want to it's OK. > Surely, I'd hate reading a newspaper article where the editor generously sprinkled upper case words everywhere Exactly. If it's an eye-sore in every other medium, then it seems likely to me, the only reason programmers don't consider it an eye-sore is they've become inured to it. > but analogies only go so far, reading code have some similarities with reading prose, but still is not the same activity. CAN you articulate what is DIFFERENT about READING code that makes the ALL CAPS STYLE less offensive? On Tue, Jan 29, 2019 at 6:09 PM Marcos Eliziario wrote: > Is it that really obnoxious? Does using upper case for constants > measurably slows down coders? Can you cite the actual papers describing > such experiments that lead to this conclusion ? > Because, from my experience having a visual clue that a value is a > constant or an enum is something pretty useful. > Surely, I'd hate reading a newspaper article where the editor generously > sprinkled upper case words everywhere, but analogies only go so far, > reading code have some similarities with reading prose, but still is not > the same activity. > > Best, > Marcos Eliziario > > > > Em ter, 29 de jan de 2019 às 20:30, Cameron Simpson > escreveu: > >> On 29Jan2019 15:44, Jamesie Pic wrote: >> >On Fri, Jan 4, 2019 at 10:07 PM Bernardo Sulzbach >> > wrote: >> >> I'd suggest violating PEP-8 instead of trying to change it. >> > >> >TBH even my bash global environment variables tend to become more and >> >more lowercase ... >> >> If you mean _exported_ variables, then this is actually a really bad >> idea. >> >> The shell (sh, bash, ksh etc) makes no enforcement about naming for >> exported vs unexported variables. And the exported namespace ($PATH etc) >> is totally open ended, because any programme might expect arbitrary >> optional exported names for easy tuning of defaults. >> >> So, you think, since I only use variables I intend and only export >> variables I plan to, I can do what I like. Example script: >> >> a=1 >> b=2 >> export b >> >> So $b is now exported to subcommands, but not $a. >> >> However: the "exported set" is initially the environment you inherit. >> Which means: >> >> Any variable that _arrives_ in the environment is _already_ in the >> exported set. So, another script: >> >> a=1 >> b=2 >> # not exporting either >> >> If that gets called from the environment where you'd exported $b (eg >> from the first script, which could easily be your ~/.profile or >> ~/.bashrc), then $b gets _modified_ and _exported_ to subcommands, even >> though you hadn't asked. Because it came in initially from the >> environment. >> >> This means that you don't directly control what is local to the script >> and what is exported (and thus can affect other scripts).
>> >> The _only_ way to maintain sanity is the existing convention: local >> script variables use lowercase names and exported variables use >> UPPERCASE names. With that in hand, and cooperation from everyone else, >> you have predictable and reliable behaviour. And you have a nice visual >> distinction in your code because you know immediately (by convention) >> whether a variable is exported or not. >> >> By exporting lowercase variables you violate this convention, and make >> your script environment unsafe for others to use. >> >> Do many many example scripts on the web do the reverse: using UPPERCASE >> names for local script variables? Yes they do, and they do a disservice >> to everyone. >> >> Cheers, >> Cameron Simpson >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > -- > Marcos Eliziário Santos > mobile/whatsapp/telegram: +55(21) 9-8027-0156 > skype: marcos.eliziario at gmail.com > linked-in : https://www.linkedin.com/in/eliziario/ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bruce at leban.us Wed Jan 30 15:01:17 2019 From: bruce at leban.us (Bruce Leban) Date: Wed, 30 Jan 2019 12:01:17 -0800 Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS In-Reply-To: References: <20190129223001.GA92125@cskk.homeip.net> Message-ID: Text in color or against black backgrounds is harder to read than black on white. See for example: https://trevellyan.biz/graphic-design-discussion-how-color-and-contrast-affect-readability-2/ Text where different words in the same sentence are in different colors is even harder to read. And I think we should totally ban anyone on the web from putting light gray text on a lighter gray background (see https://www.wired.com/2016/10/how-the-web-became-unreadable/ for a good discussion). Or to say that another way: I think we should totally ban anyone on the web from putting light gray text on a lighter gray background!! But many of us use editors that use color for syntax highlighting and we do that because projecting semantics onto the color axis works for us. So we don't ban colorizing text and we shouldn't. Capitalizing constants may be slightly harder to read but constants in code are the minority and emphasizing them is precisely the point. I'm MINUS_ONE on changing PEP 8. Make your own styleguide if you don't want to follow PEP 8 in your code. --- Bruce On Wed, Jan 30, 2019 at 11:48 AM Abe Dillon wrote: > > Is it that really obnoxious? > > EXTREMELY! > > > Does using upper case for constants measurably slows down coders? Can > you cite the actual papers describing such experiments that lead to this > conclusion ? > > > https://www.mity.com.au/blog/writing-readable-content-and-why-all-caps-is-so-hard-to-read > https://en.wikipedia.org/wiki/All_caps#Readability > https://uxmovement.com/content/all-caps-hard-for-users-to-read/ > https://practicaltypography.com/all-caps.html > > > from my experience having a visual clue that a value is a constant or an > enum is something pretty useful. > > Do you have any proof that it's useful? Have you ever been tempted to > modify math.pi or math.e simply because they're lower case?
Have you ever > stopped to wonder if those values change? > > If the socket library used packet_host, packet_broadcast, etc. instead of > PACKET_HOST, PACKET_BROADCAST, ETC. would you be confused about whether > it's a good idea to rebind those variables? Would you be tempted to write > the line of code: socket.packet_host = x? > > It seems to me that nobody is actually considering what I'm actually > talking about very carefully. They just assume that because all caps is > used to convey information that information is actually important. Not just > important, but important enough that it should be in PEP-8. They say I > should just violate PEP-8 because it's not strictly enforced. It is > strictly enforced in workplaces. I don't see why it can't be the other way > around: PEP-8 doesn't say to use all caps, but if you want to it's OK. > > > Surely, I'd hate reading a newspaper article where the editor generously > sprinkled upper case words everywhere > > Exactly. If it's an eye-sore in every other medium, then it seems likely > to me, the only reason programmers don't consider it an eye-sore is they've > become inured to it. > > > but analogies only go so far, reading code have some similarities with > reading prose, but still is not the same activity. > > CAN you articulate what is DIFFERENT about READING code that makes the ALL > CAPS STYLE less offensive? > > On Tue, Jan 29, 2019 at 6:09 PM Marcos Eliziario < > marcos.eliziario at gmail.com> wrote: > >> Is it that really obnoxious? Does using upper case for constants >> measurably slows down coders? Can you cite the actual papers describing >> such experiments that lead to this conclusion ? >> Because, from my experience having a visual clue that a value is a >> constant or an enum is something pretty useful. >> Surely, I'd hate reading a newspaper article where the editor generously >> sprinkled upper case words everywhere, but analogies only go so far, >> reading code have some similarities with reading prose, but still is not >> the same activity. >> >> Best, >> Marcos Eliziario >> >> >> >> Em ter, 29 de jan de 2019 ?s 20:30, Cameron Simpson >> escreveu: >> >>> On 29Jan2019 15:44, Jamesie Pic wrote: >>> >On Fri, Jan 4, 2019 at 10:07 PM Bernardo Sulzbach >>> > wrote: >>> >> I'd suggest violating PEP-8 instead of trying to change it. >>> > >>> >TBH even my bash global environment variables tend to become more and >>> >more lowercase ... >>> >>> If you mean _exported_ variables, then this is actually a really bad >>> idea. >>> >>> The shell (sh, bash, ksh etc) makes no enforcement about naming for >>> exported vs unexported variables. And the exported namespace ($PATH etc) >>> is totally open ended, because any programme might expect arbitrary >>> optional exported names for easy tuning of defaults. >>> >>> So, you think, since I only use variables I intend and only export >>> variables I plan to, I can do what I like. Example script: >>> >>> a=1 >>> b=2 >>> export b >>> >>> So $b is now exported to subcommands, but not $a. >>> >>> However: the "exported set" is initially the environment you inherit. >>> Which means: >>> >>> Any variable that _arrives_ in the environment is _already_ in the >>> exported set. So, another script: >>> >>> a=1 >>> b=2 >>> # not exporting either >>> >>> If that gets called from the environment where you'd exported $b (eg >>> from the first script, which could easily be your ~/.profile or >>> ~/.bashrc), then $b gets _modified_ and _exported_ to subcommands, even >>> though you hadn't asked. 
Because it came in initially from the >>> environment. >>> >>> This means that you don't directly control what is local to the script >>> and what is exported (and thus can affect other scripts). >>> >>> The _only_ way to maintain sanity is the existing convention: local >>> script variables use lowercase names and exported variables use >>> UPPERCASE names. With that in hand, and cooperation from everyone else, >>> you have predictable and reliable behaviour. And you have a nice visual >>> distinction in your code because you know immediately (by convention) >>> whether a variable is exported or not. >>> >>> By exporting lowercase variables you violate this convention, and make >>> your script environment unsafe for others to use. >>> >>> Do many many example scripts on the web do the reverse: using UPPERCASE >>> names for local script variables? Yes they do, and they do a disservice >>> to everyone. >>> >>> Cheers, >>> Cameron Simpson >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >> >> >> -- >> Marcos Elizi?rio Santos >> mobile/whatsapp/telegram: +55(21) 9-8027-0156 >> skype: marcos.eliziario at gmail.com >> linked-in : https://www.linkedin.com/in/eliziario/ >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Jan 30 15:03:25 2019 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 30 Jan 2019 12:03:25 -0800 Subject: [Python-ideas] Stack traces ought to flag when a module has been changed on disk In-Reply-To: References: <20190130101733.GI1834@ando.pearwood.info> Message-ID: On Wed, Jan 30, 2019 at 3:43 AM Jonathan Fine wrote: > I think Steve's suggestion fails in this situation. Suppose wibble.py > contains a function fn. Now do > import wibble > fn = wibble.fn > # Modify and save wibble.py > reload(wibble) > fn() > Sure -- but this is just a warning that there *might* be an issue -- if a user doesn't get the warning in some cases (particularly cases in which they have used an "advanced" feature like reload) we aren't any worse off. +1 on the idea -- I don't see a downside. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From abedillon at gmail.com Wed Jan 30 16:22:27 2019 From: abedillon at gmail.com (Abe Dillon) Date: Wed, 30 Jan 2019 15:22:27 -0600 Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS In-Reply-To: References: <20190129223001.GA92125@cskk.homeip.net> Message-ID: > Capitalizing constants may be slightly harder to read but constants in code are the minority and emphasizing them is precisely the point. 
The question I'm trying to get everyone to actually think about:

Is the communication of constancy via ALL CAPS so important that it must be in PEP-8 despite the documented harm that all caps does to readability?

https://www.mity.com.au/blog/writing-readable-content-and-why-all-caps-is-so-hard-to-read
https://en.wikipedia.org/wiki/All_caps#Readability
https://uxmovement.com/content/all-caps-hard-for-users-to-read/
https://practicaltypography.com/all-caps.html

I've gotten many responses that seem like a knee-jerk reaction in favor of the status quo. I get the sense people "like" all caps because they've been conditioned to believe it conveys important information, but they haven't taken the time to really consider how valid that belief is.

Consider that math.pi and math.e are constants that are not all caps; have you ever been tempted to re-bind those variables? Do you seriously worry that those variables are going to be re-bound by other code? Functions and classes are essentially constants that aren't all caps, yet nobody gets confused about whether or not to re-bind those, or if other code will rebind them.

If socket.AF_INET6 were socket.af_inet6, would you consider re-binding that variable? Would you be worried that other code will re-bind it? Can you measure the value of the information conveyed by all-caps? Are you so sure that it's as important as you think?

I've gotten a lot of responses like, "If you don't like it just ignore PEP-8, it's not mandatory".
A) It is mandatory in many cases.
B) We could just as easily NOT prescribe all caps in PEP-8 but still allow it. In other words: you can use all caps if you want to, but it's not mandatory or in PEP-8. I would like to discourage its use, but we don't have to go so far. That way nobody has to violate PEP-8.

On Wed, Jan 30, 2019 at 2:01 PM Bruce Leban wrote:
> Text in color or against black backgrounds is harder to read than black on white. See for example: https://trevellyan.biz/graphic-design-discussion-how-color-and-contrast-affect-readability-2/
>
> Text where different words in the same sentence are in different colors is even harder to read.
>
> And I think we should totally ban anyone on the web from putting light gray text on a lighter gray background (see https://www.wired.com/2016/10/how-the-web-became-unreadable/ for a good discussion).
>
> Or to say that another way: I think we should totally ban anyone on the web from putting light gray text on a lighter gray background!!
>
> But many of us use editors that use color for syntax highlighting and we do that because projecting semantics onto the color axis works for us. So we don't ban colorizing text and we shouldn't.
>
> Capitalizing constants may be slightly harder to read but constants in code are the minority and emphasizing them is precisely the point.
>
> I'm MINUS_ONE on changing PEP 8. Make your own styleguide if you don't want to follow PEP 8 in your code.
>
> --- Bruce
>
> On Wed, Jan 30, 2019 at 11:48 AM Abe Dillon wrote:
>> > Is it that really obnoxious?
>>
>> EXTREMELY!
>>
>> > Does using upper case for constants measurably slows down coders? Can >> you cite the actual papers describing such experiments that lead to this >> conclusion ?
>> >> >> https://www.mity.com.au/blog/writing-readable-content-and-why-all-caps-is-so-hard-to-read >> https://en.wikipedia.org/wiki/All_caps#Readability >> https://uxmovement.com/content/all-caps-hard-for-users-to-read/ >> https://practicaltypography.com/all-caps.html >> >> > from my experience having a visual clue that a value is a constant or >> an enum is something pretty useful. >> >> Do you have any proof that it's useful? Have you ever been tempted to >> modify math.pi or math.e simply because they're lower case? Have you ever >> stopped to wonder if those values change? >> >> If the socket library used packet_host, packet_broadcast, etc. instead >> of PACKET_HOST, PACKET_BROADCAST, ETC. would you be confused about >> whether it's a good idea to rebind those variables? Would you be tempted to >> write the line of code: socket.packet_host = x? >> >> It seems to me that nobody is actually considering what I'm actually >> talking about very carefully. They just assume that because all caps is >> used to convey information that information is actually important. Not just >> important, but important enough that it should be in PEP-8. They say I >> should just violate PEP-8 because it's not strictly enforced. It is >> strictly enforced in workplaces. I don't see why it can't be the other way >> around: PEP-8 doesn't say to use all caps, but if you want to it's OK. >> >> > Surely, I'd hate reading a newspaper article where the editor >> generously sprinkled upper case words everywhere >> >> Exactly. If it's an eye-sore in every other medium, then it seems likely >> to me, the only reason programmers don't consider it an eye-sore is they've >> become inured to it. >> >> > but analogies only go so far, reading code have some similarities with >> reading prose, but still is not the same activity. >> >> CAN you articulate what is DIFFERENT about READING code that makes the >> ALL CAPS STYLE less offensive? >> >> On Tue, Jan 29, 2019 at 6:09 PM Marcos Eliziario < >> marcos.eliziario at gmail.com> wrote: >> >>> Is it that really obnoxious? Does using upper case for constants >>> measurably slows down coders? Can you cite the actual papers describing >>> such experiments that lead to this conclusion ? >>> Because, from my experience having a visual clue that a value is a >>> constant or an enum is something pretty useful. >>> Surely, I'd hate reading a newspaper article where the editor generously >>> sprinkled upper case words everywhere, but analogies only go so far, >>> reading code have some similarities with reading prose, but still is not >>> the same activity. >>> >>> Best, >>> Marcos Eliziario >>> >>> >>> >>> Em ter, 29 de jan de 2019 ?s 20:30, Cameron Simpson >>> escreveu: >>> >>>> On 29Jan2019 15:44, Jamesie Pic wrote: >>>> >On Fri, Jan 4, 2019 at 10:07 PM Bernardo Sulzbach >>>> > wrote: >>>> >> I'd suggest violating PEP-8 instead of trying to change it. >>>> > >>>> >TBH even my bash global environment variables tend to become more and >>>> >more lowercase ... >>>> >>>> If you mean _exported_ variables, then this is actually a really bad >>>> idea. >>>> >>>> The shell (sh, bash, ksh etc) makes no enforcement about naming for >>>> exported vs unexported variables. And the exported namespace ($PATH >>>> etc) >>>> is totally open ended, because any programme might expect arbitrary >>>> optional exported names for easy tuning of defaults. >>>> >>>> So, you think, since I only use variables I intend and only export >>>> variables I plan to, I can do what I like. 
Example script: >>>> >>>> a=1 >>>> b=2 >>>> export b >>>> >>>> So $b is now exported to subcommands, but not $a. >>>> >>>> However: the "exported set" is initially the environment you inherit. >>>> Which means: >>>> >>>> Any variable that _arrives_ in the environment is _already_ in the >>>> exported set. So, another script: >>>> >>>> a=1 >>>> b=2 >>>> # not exporting either >>>> >>>> If that gets called from the environment where you'd exported $b (eg >>>> from the first script, which could easily be your ~/.profile or >>>> ~/.bashrc), then $b gets _modified_ and _exported_ to subcommands, even >>>> though you hadn't asked. Because it came in initially from the >>>> environment. >>>> >>>> This means that you don't directly control what is local to the script >>>> and what is exported (and thus can affect other scripts). >>>> >>>> The _only_ way to maintain sanity is the existing convention: local >>>> script variables use lowercase names and exported variables use >>>> UPPERCASE names. With that in hand, and cooperation from everyone else, >>>> you have predictable and reliable behaviour. And you have a nice visual >>>> distinction in your code because you know immediately (by convention) >>>> whether a variable is exported or not. >>>> >>>> By exporting lowercase variables you violate this convention, and make >>>> your script environment unsafe for others to use. >>>> >>>> Do many many example scripts on the web do the reverse: using UPPERCASE >>>> names for local script variables? Yes they do, and they do a disservice >>>> to everyone. >>>> >>>> Cheers, >>>> Cameron Simpson >>>> _______________________________________________ >>>> Python-ideas mailing list >>>> Python-ideas at python.org >>>> https://mail.python.org/mailman/listinfo/python-ideas >>>> Code of Conduct: http://python.org/psf/codeofconduct/ >>>> >>> >>> >>> -- >>> Marcos Elizi?rio Santos >>> mobile/whatsapp/telegram: +55(21) 9-8027-0156 >>> skype: marcos.eliziario at gmail.com >>> linked-in : https://www.linkedin.com/in/eliziario/ >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Wed Jan 30 16:40:23 2019 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 31 Jan 2019 08:40:23 +1100 Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS In-Reply-To: References: <20190129223001.GA92125@cskk.homeip.net> Message-ID: On Thu, Jan 31, 2019 at 8:23 AM Abe Dillon wrote: > > > Capitalizing constants may be slightly harder to read but constants in code are the minority and emphasizing them is precisely the point. > > The question I'm trying to get everyone to actually think about: > > Is the communication of constancy via ALL CAPS so important that it must be in PEP-8 despite the documented harm that all caps does to readability? 
> > https://www.mity.com.au/blog/writing-readable-content-and-why-all-caps-is-so-hard-to-read > https://en.wikipedia.org/wiki/All_caps#Readability > https://uxmovement.com/content/all-caps-hard-for-users-to-read/ > https://practicaltypography.com/all-caps.html Nobody is saying that the *entire document* should be in all caps. This is a specific token, a specific identifier. Are English paragraphs hard to read because tokens like "HTML" and "IBM" are in all-caps? > If socket.AF_INET6 were socket.af_inet6 would you consider re-binding that variable? Would you be worried that other code will re-bind it? Can you measure the value of the information conveyed by all-caps? Are you so sure that it's as important as you think? > > With constants that are taken directly from C, consistency is extremely helpful. Why is it called AF_INET6? Because it has exactly the same name as AF_INET6 in C, or any other language that also has derived its socket handling from BSD sockets. (Which, for reference, is basically every language that has any sort of socket support.) > I've gotten a lot of responses like, "If you don't like it just ignore PEP-8, it's not mandatory". > A) It is mandatory in many cases. That is not PEP 8's problem. The document stipulates that it is the standard for the Python standard library, nothing else. Are you going to appeal to Google to have *their* style guide changed too? A lot of people have adopted Google's style guide, and it explicitly says that module level constants *must* be in all caps. > B) We could just as easily NOT prescribe all caps in PEP-8 but still allow it. In other words: you can use all caps if you want to but it's not mandatory or in PEP-8. I would like to discourage its use, but we don't have to go so far. That way nobody has to violate PEP-8. > I don't think PEP 8 actually mandates that *all* constants be written in all caps. It says "usually". But you have many options here - ignore PEP 8, treat PEP 8 as a guideline, or just accept that ALL_CAPS_CONSTANTS actually do carry useful information in their names. ChrisA From abedillon at gmail.com Wed Jan 30 16:51:22 2019 From: abedillon at gmail.com (Abe Dillon) Date: Wed, 30 Jan 2019 15:51:22 -0600 Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS In-Reply-To: <20190105034143.GS13616@ando.pearwood.info> References: <20190105034143.GS13616@ando.pearwood.info> Message-ID: [Steven D'Aprano] > > > ALL_CAPS_IS_OBNOXIOUS > > > > It destroys the visual flow of code > Does it? This claim doesn't ring true to me. To me, "visual flow of > code" is the way it flows down and across the page, not the shape of the > individual words. It does. Your field of vision is two-dimensional and multi-scale. Your visual system uses lots of cues to determine what to focus on and how to interpret it. So both the way code flows down the page and the shape of individual words matter to readability: https://www.mity.com.au/blog/writing-readable-content-and-why-all-caps-is-so-hard-to-read https://en.wikipedia.org/wiki/All_caps#Readability https://uxmovement.com/content/all-caps-hard-for-users-to-read/ https://practicaltypography.com/all-caps.html [Steven D'Aprano] > << DUP2 DIV SWAP OVER * ROT SWAP - >> > The flow is fine if you know how to read reverse Polish notation ("Yoda > speak"). It flows from left to right, and top down, same as English. > Only the word order is different. The flow would be precisely the same > if it were written like this: > << dup2 div swap over * rot swap - >> It's not precisely the same to me.
My eye is drawn heavily to "dup2" in the latter. In the former my eye isn't drawn strongly to anything in particular. It's slightly drawn to the asterisk. I suppose I should clarify that when I talk about "visual flow" I mean how my eye is drawn around the media. [Steven D'Aprano] > I can immediately tell that unlike spam and eggs, FILENAME ought to be a > global constant, which is a valuable hint that I can probably find the > value of FILENAME by looking at the top of the module, and not worry > about it being rebound anywhere else. Ctrl+F "filename =" You can tell if it's rebound anywhere by the number of matches. [Steven D'Aprano] > What naming convention would you suggest for distinguishing between > constants and variables? None. You don't need one. [Steven D'Aprano] > We can (usually) accurately > recognise modules, classes and functions from context, but we can't do > the same for constants. What are you basing that claim on? I can tell that math.pi, string.digits, and timedelta.resolution are constants just fine. On Fri, Jan 4, 2019 at 9:42 PM Steven D'Aprano wrote: > On Fri, Jan 04, 2019 at 01:01:51PM -0600, Abe Dillon wrote: > > > I keep coming back to this great video > about > > coding style, and one point in particular rings true to me: > > ALL_CAPS_IS_OBNOXIOUS > > > > It destroys the visual flow of code > > Does it? This claim doesn't ring true to me. To me, "visual flow of > code" is the way it flows down and across the page, not the shape of the > individual words. > > To me, long lines spoil the visual flow of code (especially if they are > long enough that I have to scroll horizontally to see the end). > > To me, sudden blocks of unexpected indentation spoil the visual flow of > code. > (Fortunately, this is rare in Python.) > > I've looked over code in the standard library, my own code, and > third-party libraries, and I don't see that the choice of name disrupts > the flow of code, whether it is written in CamelCase or lowercase or > ALLCAPS or even rAnSOmenOTecAse. (Although I admit that last one is > quite hard to read...) > > I have a bunch of code written in RPL for the HP-48GX calculator, and > the convention there is that nearly everything is written in allcaps. > Here's an equivalent function to Python's divmod(): > > << DUP2 DIV SWAP OVER * ROT SWAP - >> > > The flow is fine if you know how to read reverse Polish notation ("Yoda > speak"). It flows from left to right, and top down, same as English. > Only the word order is different. The flow would be precisely the same > if it were written like this: > > << dup2 div swap over * rot swap - >> > > Where RPL does suffer from the lack of visual flow is the lack of > structure to the code. In Python terms, it would be as if we wrote: > > def function(): if condition: for x in sequence: do_this() > do_that() endfor else: do_something_else() endif > > Ouch. > > The bottom line is, I don't agree that the visual flow of code is > negatively affected, or affected at all, by the shape of individual > words in the code. > > > > and for what? To signify a global, > > constant, or Enum? Is that really so important? I don't think so. > > I think the convention is useful, of moderate importance, and I think > Python code would be ever-so-slightly harder to understand without it.
> > I rarely, if ever, use allcaps for constants defined and used in a > single function, but that's because my functions are typically short > enough that you can fit the entire function on screen at once and tell > that the name is defined once and never re-bound, hence a constant. > > Where the naming convention really makes sense is for module-level > constants, where the initial binding is typically separated from the > eventual use by a lot of time and space, I think it is useful to have a > simple naming convention to distinguish between variables and constants. > When I see this in the middle of a function: > > def foo(): > ... > process(spam, FILENAME, eggs, ...) > ... > > I can immediately tell that unlike spam and eggs, FILENAME ought to be a > global constant, which is a valuable hint that I can probably find the > value of FILENAME by looking at the top of the module, and not worry > about it being rebound anywhere else. So yes, having a naming convention > for constants is useful. > > And FILENAME is much better than cfilename or kfilename or > constant_filename_please_dont_rebind_ok_thx *wink* > > What naming convention would you suggest for distinguishing between > constants and variables? > > I suppose one might argue that we don't need to care about the semantics > of which names are variables and which are constants. In fairness, we > cope quite well with modules, classes and functions being effectively > constants and yet written in non-allcaps. > > But on the other hand, we generally can recognise modules, classes and > functions by name and usage. We rarely say "process(module)", but we > might say "process(module.something)". Classes have their own naming > convention. So the analogy between global constants which don't use the > allcaps convention (namely modules, classes and functions) and global > constants which do is fairly weak. We can (usually) accurately > recognise modules, classes and functions from context, but we can't do > the same for constants. > > > > Currently PEP-8 prescribes all caps for constants > > and uses the all > cap > > variable "FILES" as an example in a different section. > > > It > > also appears to be the defacto-standard for enums (based on the > > documentation < > https://docs.python.org/3/library/enum.html#creating-an-enum> > > ) > > That's because the typical use for enums is as constants. If I had a > *variable* which merely held an enum, I wouldn't use allcaps: > > # No! Don't do this! > for MYENUM in list_of_enums: > if condition(MYENUM): > MYENUM = something_else() > process(MYENUM) > > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cs at cskk.id.au Wed Jan 30 17:07:09 2019 From: cs at cskk.id.au (Cameron Simpson) Date: Thu, 31 Jan 2019 09:07:09 +1100 Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS In-Reply-To: References: Message-ID: <20190130220709.GA54892@cskk.homeip.net> On 30Jan2019 15:22, Abe Dillon wrote: >> Capitalizing constants may be slightly harder to read but constants in >code are the minority and emphasizing them is precisely the point. 
> >The question I'm trying to get everyone to actually think about: > >Is the communication of constancy via ALL CAPS so important that it must be >in PEP-8 despite the documented harm that all caps does to readability? Lots of caps is an issue. Using all caps for specific low frequency items is a much smaller issue, and having "constants" visually distinctive is a handy thing. And constants are not just "things that should not be changed", such as a variable from an outer scope. They tend to be default values and genuine constants (such as equivalence values from another API, as in the socket examples). Your cited URLs are almost all about prose in all caps - whole sentences and paragraphs in all caps. Wikipedia aside, they _all_ have carve outs for places where all caps can be valuable: >https://www.mity.com.au/blog/writing-readable-content-and-why-all-caps-is-so-hard-to-read "From a design perspective, All Caps can be useful for labels, logos and menus where your message doesn't involve reading large sections of text." >https://en.wikipedia.org/wiki/All_caps#Readability >https://uxmovement.com/content/all-caps-hard-for-users-to-read/ "When is it okay to use all caps? All caps are fine in contexts that don't involve much reading, such as logos, headings, acronyms, and abbreviations." >https://practicaltypography.com/all-caps.html "That doesn't mean you shouldn't use caps. Just use them judiciously. Caps are suitable for headings shorter than one line (e.g., "Table of Contents"), headers, footers, captions, or other labels. Caps work at small point sizes. Caps work well on letterhead and business cards." >I've gotten many responses that seem like a knee-jerk reaction in favor of >the status quo. I get the sense people "like" all caps because they've been >conditioned to believe it conveys important information, but they haven't >taken the time to really consider how valid that belief is. You may find that (a) plenty of us have been using this convention for a very long time and (b) we find it useful and (c) it doesn't cause us any trouble. Also, cross language conventions have additional value. Don't think we've never thought about its value. >Consider that math.pi and math.e are constants that are not all caps, have >you ever been tempted to re-bind those variables? No, but "e" and "pi" are _conventionally_ spelt in lowercase in prose. Which is why they are lower case in the math module. As with the socket example below, here they match their casing in their source context (mathematics). And regarding temptation to change these, there's a very old humorous anecdote about FORTRAN, in that like Python it has no constants: you can change "PI", for example to 3. Badness ensues. Nobody is arguing that we should consider math.e or math.pi sane things to modify because the Python module uses lower case for their names. >Do you seriously worry >that those variables are going to be re-bound by other code? Functions and >classes are essentially constants that aren't all caps, yet nobody gets >confused about whether or not to re-bind those, or if other code will >rebind them. You'll note that in Python, assigning to a name in a function _causes_ that name to be locally scoped. This avoids a whole suite of accidents. And we rebind names bound to functions all the time, not just for monkey patching but also when we pass functions as callbacks. Python function names are _not_ functions, they're _references_ to functions.
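A minimal sketch of both behaviours (the names here are invented for illustration, nothing from the modules under discussion):

counter = 0

def bump():
    # Assigning to "counter" anywhere in this function makes the name
    # local to the function, so the read on the right hand side happens
    # before any local binding exists:
    counter = counter + 1   # raises UnboundLocalError when called
    return counter

def shout(text):
    return text.upper()

handler = shout              # a function name is just a reference...
shout = lambda text: text    # ...and the name can be freely rebound
print(handler("quiet"))      # prints QUIET: the original function lives on

That is, the scoping rules keep accidental rebinding contained, but nothing stops deliberate rebinding of a lowercase name.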
>If socket.AF_INET6 were socket.af_inet6 would you consider re-binding that >variable? Would you be worried that other code will re-bind it? Can you >measure the value of the information conveyed by all-caps? Are you so sure >that it's as important as you think? It is important for 2 reasons: it adheres to the well-established convention of a constant being in all caps, which has value in itself (adherence to the convention) _and_ it is spelled exactly like the AF_INET6 constant from the C socket API, of which Python's socket module is essentially a shim with some additional utility facilities. >I've gotten a lot of responses like, "If you don't like it just ignore >PEP-8, it's not mandatory". >A) It is mandatory in many cases. >B) We could just as easily NOT prescribe all caps in PEP-8 but still allow >it. In other words: you can use all caps if you want to but it's not >mandatory or in PEP-8. I would like to discourage its use, but we don't >have to go so far. That way nobody has to violate PEP-8. Arguing that "nobody has to violate PEP-8" by the mechanism of just dropping recommendations from it isn't very sound, IMO. If your in-house style eschews caps for constants, go right ahead. Nobody will stop you. PEP-8 is primarily for the stdlib, where like any shared body of code it is useful to have common style. Plenty of places use PEP-8 as a basis - it is reasonable, and it gets you a fairly complete style guide from day 1. By all means diverge from it on a reasoned basis in your own code or within your own organisation. I do in my personal code: primarily 2 space indentation instead of 4, and my docstring formatting differs as well. And correspondingly, feel free to document your reasons for diverging. They may be universally valid or domain specific; provide such context. But I suspect you'll not get much traction on changing PEP-8 itself in this aspect because the convention is widely liked. Finally, I've worked in ALL CAPS programming environments. BASIC and assembler were traditionally written in all caps. Also FORTRAN. Also Modula-3, at least for its keywords (IF, etc). On the whole I think the world is a better place for being mostly lower case. To such an extent that I wrote a preprocessor for all my Modula-3 to transcribe my personal lower case programmes to upper case for the compiler. But the flip side of being in a mostly lower case world is that having occasional upper case as a standout formatting for particular entities suddenly has more value. And that value has been used for constants for a long time with, IMO, some benefit. Cheers, Cameron Simpson From abedillon at gmail.com Wed Jan 30 17:41:08 2019 From: abedillon at gmail.com (Abe Dillon) Date: Wed, 30 Jan 2019 16:41:08 -0600 Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS In-Reply-To: References: <20190129223001.GA92125@cskk.homeip.net> Message-ID: [ChrisA] > Nobody is saying that the *entire document* should be in all caps. I've never claimed as much. Most of the reasons all caps harm readability hold true whether you're talking about a single word or entire document. [ChrisA] > Are English paragraphs hard to read because tokens like "HTML" and "IBM" > are in all-caps? Acronyms are a different use case for all caps. We can discuss the value proposition for those in another thread if you'd like. I will say that I read my share of research papers where acronyms tend to see heavy use and, yes; it can harm readability. [ChrisA] > With constants that are taken directly from C, consistency is > extremely helpful.
That's fair. I can live with it. The socket module was just an example. There are several examples of all caps that aren't inherited from other sources. If datetime.MINYEAR and datetime.MAXYEAR were datetime.minyear and datetime.maxyear what do you think the consequences would be? Do you really think it's less clear? Do you think timedelta.min, timedelta.max, and timedelta.resolution aren't clear enough? How about string.digits? Should those shout at you that they're constants? Why or why not? [ChrisA] > Are you going to appeal to Google to have *their* style guide changed too? That's comparing apples and oranges. Python is an open language with an ideas forum about how to improve things. Google generally isn't open to my suggestions. Any given company I work with is much more likely to enforce PEP-8 than Google's style guides. As far as I know, getting Google to adopt a given idea isn't a precondition for the Python community accepting said idea. [ChrisA] > treat PEP 8 as a guideline Again, that's not always an option. On Wed, Jan 30, 2019 at 3:41 PM Chris Angelico wrote: > On Thu, Jan 31, 2019 at 8:23 AM Abe Dillon wrote: > > > > > Capitalizing constants may be slightly harder to read but constants in > code are the minority and emphasizing them is precisely the point. > > > > The question I'm trying to get everyone to actually think about: > > > > Is the communication of constancy via ALL CAPS so important that it must > be in PEP-8 despite the documented harm that all caps does to readability? > > > > ttps:// > www.mity.com.au/blog/writing-readable-content-and-why-all-caps-is-so-hard-to-read > > https://en.wikipedia.org/wiki/All_caps#Readability > > https://uxmovement.com/content/all-caps-hard-for-users-to-read/ > > https://practicaltypography.com/all-caps.html > > Nobody is saying that the *entire document* should be in all caps. > This is a specific token, a specific identifier. Are English > paragraphs hard to read because tokens like "HTML" and "IBM" are in > all-caps? > > > If socket.AF_INET6 were socket.af_inet6 would you consider re-binding > that variable? Would you be worried that other code will re-bind it? Can > you measure the value of the information conveyed by all-caps? Are you so > sure that it's as important as you think? > > > > With constants that are taken directly from C, consistency is > extremely helpful. Why is it called AF_INET6? Because it has exactly > the same as AF_INET6 in C, or any other language that also has derived > its socket handling from BSD sockets. (Which, for reference, is > basically every language that has any sort of socket support.) > > > I've gotten a lot of responses like, "If you don't like it just ignore > PEP-8, it's not mandatory". > > A) It is mandatory in many cases. > > That is not PEP 8's problem. The document stipulates that it is the > standard for the Python standard library, nothing else. Are you going > to appeal to Google to have *their* style guide changed too? A lot of > people have adopted Google's style guide, and it explicitly says that > module level constants *must* be in all caps. > > > B) We could just as easily NOT prescribe all caps in PEP-8 but still > allow it. In other words: you can use all caps if you want to but it's not > mandatory or in PEP-8. I would like to discourage its use, but we don't > have to go so far. That way nobody has to violate PEP-8. > > > > I don't think PEP 8 actually mandates that *all* constants be written > in all caps. It says "usually". 
But you have many options here - > ignore PEP 8, treat PEP 8 as a guideline, or just accept that > ALL_CAPS_CONSTANTS actually do carry useful information in their > names. > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Wed Jan 30 18:24:06 2019 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 31 Jan 2019 10:24:06 +1100 Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS In-Reply-To: References: <20190129223001.GA92125@cskk.homeip.net> Message-ID: On Thu, Jan 31, 2019 at 9:41 AM Abe Dillon wrote: > > [ChrisA] >> >> Nobody is saying that the *entire document* should be in all caps. > > > I've never claimed as much. Most of the reasons all caps harm readability hold true whether you're talking about a single word or entire document. You have implied it by making the claim about single words, and then citing references that talk about entire documents. > [ChrisA] >> >> Are English paragraphs hard to read because tokens like "HTML" and "IBM" are in all-caps? > > > Acronyms are a different use case for all caps. We can discuss the value proposition for those in another thread if you'd like. > I will say that I read my share of research papers where acronyms tend to see heavy use and, yes; it can harm readability. Initialisms (that aren't acronyms) carry information: you read them out letter by letter rather than as a word (ignoring phonograms etc). Constants carry information by being in all caps also. > [ChrisA] >> >> Are you going to appeal to Google to have *their* style guide changed too? > > > That's comparing apples and oranges. Python is an open language with an ideas forum about how to improve things. Google generally isn't open to my suggestions. Any given company I work with is much more likely to enforce PEP-8 than Google's style guides. As far as I know, getting Google to adopt a given idea isn't a precondition for the Python community accepting said idea. > Both documents are specific to an original context, but have been adopted elsewhere. If your company has adopted PEP 8, that's your company's decision. It would equally be your company's decision to use the Google style guide, or a modified version of either, or a hybrid of both. If PEP 8 changes, will your company instantly change its policy to be "use this new version"? They don't have to. > [ChrisA] >> >> treat PEP 8 as a guideline > > Again, that's not always an option. > Again, that's not PEP 8's problem. ChrisA From abedillon at gmail.com Wed Jan 30 18:24:38 2019 From: abedillon at gmail.com (Abe Dillon) Date: Wed, 30 Jan 2019 17:24:38 -0600 Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS In-Reply-To: <20190105042053.GU13616@ando.pearwood.info> References: <20190105042053.GU13616@ando.pearwood.info> Message-ID: [Steven D'Aprano] > I've just watched it, ... it is a great video. I'm really glad you did. I don't think many have, which I can't really blame anyone for because it's rather long, but if all this thread accomplishes is a few more people seeing that video, it'll have been worth it. [Steven D'Aprano] > I don't agree with everything he says, but even where I disagree it is > great food for thought. Yes, I disagree about his anti-dependency-injection stance. [Steven D'Aprano] > IDEs rot the brain. 
He advocates for the use of "extract method" only 12 minutes into the talk. This comes off as Socrates' famous warning that written language would "create forgetfulness in the learners' souls, because they will not use their memories." No, IDEs are tools. Tools are generally meant to solve problems. He laments that IDEs can encourage pointless boilerplate, but tools don't have to include "the ghost of Clippy". I know this is a jab at my suggestion that an IDE could help communicate what is global (which, yes; is not synonymous w/ constant, though it usually should be) via syntax highlighting. I wonder if you believe that syntax highlighting "rots the brain"? [Steven D'Aprano] > the speaker makes an excellent point that in Java, you don't need a naming > convention for constants > because the compiler will give an error if you try to write to a constant. That's not the only argument he makes against all caps constants, though I agree that it's the strongest argument against all caps in Java. I largely agree with the rest of your post. I just don't think we need a naming convention for constants. On Fri, Jan 4, 2019 at 10:21 PM Steven D'Aprano wrote: > On Fri, Jan 04, 2019 at 01:01:51PM -0600, Abe Dillon wrote: > > > I keep coming back to this great video > > I've just watched it, it's a bit slow to start but I agree with Abe that > it is a great video. (And not just because the speaker agrees with me > about 80 columns :-) > > I don't agree with everything he says, but even where I disagree it is > great food for thought. I *strongly* suggest people watch the video, > although you might find (as I did) that the main lessons of it are > that many common Java idioms exist to work around poor language design, > and that IDEs rot the brain. > > *semi-wink* > > Coming back to the ALLCAPS question, the speaker makes an excellent > point that in Java, you don't need a naming convention for constants > because the compiler will give an error if you try to write to a > constant. > > But we don't have that in Python. Even if you run a linter that will > warn on rebinding of constants, you still need a way to tell the linter > that it is a constant. > > The speaker also points out that in programming, we only have a very few > mechanisms for communicating the meaning of our code: > > - names; > - code structure; > - spacing (indentation, grouping). > > Code structure is set by the language and there's not much we can do > about it (unless you're using a language like FORTH where you can create > your own flow control structures). So in practice we only have naming > and spacing. > > That's an excellent point, but he has missed one more: > > * naming conventions. > > In Python, we use leading and trailing underscores to give strong hints > about usage: > > _spam # private implementation detail > > __spam # same, but with name mangling > > __spam__ # overload an operator or other special meaning > > spam_ # avoid name clashes with builtins > > We typically use CamelCase for classes, making it easy to distinguish > classes from instances, modules and functions. > > And we use ALLCAPS for constants. If that's not needed in Java (I have > my doubts...) we should also remember the speaker's very good advice > that just because something is needed (or not needed) in language X, > doesn't mean that language Y should copy it.
> > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abedillon at gmail.com Wed Jan 30 18:41:56 2019 From: abedillon at gmail.com (Abe Dillon) Date: Wed, 30 Jan 2019 17:41:56 -0600 Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS In-Reply-To: References: <20190129223001.GA92125@cskk.homeip.net> Message-ID: > > > [ChrisA] > >> > >> Nobody is saying that the *entire document* should be in all caps. > > > > > > I've never claimed as much. Most of the reasons all caps harm > readability hold true whether you're talking about a single word or entire > document. > You have implied it by making the claim about single words, and then > citing references that talk about entire documents. I don't know how to make myself more clear. I haven't been talking about entire documents. Some of the links I provided discuss tests conducted on entire documents. That is not relevant to this discussion. Please cut out the pedantry. It's aggravating and doesn't contribute to the discussion. It comes off as you trying to score "points" in a discussion for being more "technically correct". Example B: > [ChrisA] > >> > >> Are English paragraphs hard to read because tokens like "HTML" and > "IBM" are in all-caps? > > > > > > Acronyms are a different use case for all caps. We can discuss the value > proposition for those in another thread if you'd like. > > I will say that I read my share of research papers where acronyms tend > to see heavy use and, yes; it can harm readability. > Initialisms (that aren't acronyms) carry information: you read them > out letter by letter rather than as a word (ignoring phonograms etc). > Constants carry information by being in all caps also. I'm not talking about acronyms OR initialisms **rolls eyes extremely hard**. If you want to discuss whether HttpClient or HTTPClient is more acceptable, go start another thread. I've already responded to this. > [ChrisA] > >> > >> Are you going to appeal to Google to have *their* style guide changed > too? > > > > > > That's comparing apples and oranges. Python is an open language with an > ideas forum about how to improve things. Google generally isn't open to my > suggestions. Any given company I work with is much more likely to enforce > PEP-8 than Google's style guides. As far as I know, getting Google to adopt > a given idea isn't a precondition for the Python community accepting said > idea. > > > Both documents are specific to an original context, but have been > adopted elsewhere. If your company has adopted PEP 8, that's your > company's decision. It would equally be your company's decision to use > the Google style guide, or a modified version of either, or a hybrid > of both. If PEP 8 changes, will your company instantly change its > policy to be "use this new version"? They don't have to. This conversation isn't going to go anywhere if you ignore half of what I write. > [ChrisA] > >> > >> treat PEP 8 as a guideline > > > > Again, that's not always an option. > > > Again, that's not PEP 8's problem. It's my problem. It's a problem I have to deal with. It's a problem that doesn't need to be a problem. It's a problem that can be solved by modifying PEP-8. I don't even know what you mean by something being "PEP 8's problem".
If you can't contribute to the discussion in a meaningful way, then why even respond? On Wed, Jan 30, 2019 at 5:24 PM Chris Angelico wrote: > On Thu, Jan 31, 2019 at 9:41 AM Abe Dillon wrote: > > > > [ChrisA] > >> > >> Nobody is saying that the *entire document* should be in all caps. > > > > > > I've never claimed as much. Most of the reasons all caps harm > readability hold true whether you're talking about a single word or entire > document. > > You have implied it by making the claim about single words, and then > citing references that talk about entire documents. > > > [ChrisA] > >> > >> Are English paragraphs hard to read because tokens like "HTML" and > "IBM" are in all-caps? > > > > > > Acronyms are a different use case for all caps. We can discuss the value > proposition for those in another thread if you'd like. > > I will say that I read my share of research papers where acronyms tend > to see heavy use and, yes; it can harm readability. > > Initialisms (that aren't acronyms) carry information: you read them > out letter by letter rather than as a word (ignoring phonograms etc). > Constants carry information by being in all caps also. > > > [ChrisA] > >> > >> Are you going to appeal to Google to have *their* style guide changed > too? > > > > > > That's comparing apples and oranges. Python is an open language with an > ideas forum about how to improve things. Google generally isn't open to my > suggestions. Any given company I work with is much more likely to enforce > PEP-8 than Google's style guides. As far as I know, getting Google to adopt > a given idea isn't a precondition for the Python community accepting said > idea. > > > > Both documents are specific to an original context, but have been > adopted elsewhere. If your company has adopted PEP 8, that's your > company's decision. It would equally be your company's decision to use > the Google style guide, or a modified version of either, or a hybrid > of both. If PEP 8 changes, will your company instantly change its > policy to be "use this new version"? They don't have to. > > > [ChrisA] > >> > >> treat PEP 8 as a guideline > > > > Again, that's not always an option. > > > > Again, that's not PEP 8's problem. > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Wed Jan 30 18:53:58 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 31 Jan 2019 10:53:58 +1100 Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS In-Reply-To: References: <20190105034143.GS13616@ando.pearwood.info> Message-ID: <20190130235357.GL1834@ando.pearwood.info> On Wed, Jan 30, 2019 at 03:51:22PM -0600, Abe Dillon wrote: > [Steven D'Aprano] > > > > ALL_CAPS_IS_OBNOXIOUS > > > > > > It destroys the visual flow of code > > Does it? This claim doesn't ring true to me. To me, "visual flow of > > code" is the way it flows down and across the page, not the shape of the > > individual words. > > > It does. Your field of vision is two-dimensional and multi-scale. > Your visual system uses lots of queues to determine what to focus on and > how to interpret it. 
> So both the way code flows down the page and the shape of individual words > matter to readability: I'm not disputing that, I'm disputing your claim that the presence of a all-caps CONSTANT somewhere on a page full of lower and mixed case code "destroys the visual flow of code". The rest of your comment is a distraction. You have made one strong claim (all-caps constants destroy the flow of code), I've explained why I consider it dubious, and you've introduced a completely different, milder, uncontroversial fact, that the shape of individual words slightly influences how that word is read. Yes they do. How does that support your claim that a handful of all-caps names scattered over a page of code "destroys the flow of text"? Does the same apply to a page of prose that happens to mention NASA, the USSR, the USA, FAQ or some other TLA? > https://www.mity.com.au/blog/writing-readable-content-and-why-all-caps-is-so-hard-to-read Let's start with the first paragraph: "There's nothing worse than browsing the web and being hit with a huge slab of text in All Caps - that is, all in CAPITAL LETTERS." Yes there is: websites (like this one) which use low-contrast light grey text on a white or slightly-lighter grey background, especially if (like this one) they use a sans serif font. (It could have been even worse though: at least the page doesn't use a tiny font size.) What does the shape of the letters matter if the reader has problems distinguishing them from the background due to lack of contrast? https://www.contrastrebellion.com/ In any case, we're not talking about "a huge slab" of all-caps. If you write your Python code like this: # Don't do this! import MYMODULE SOME_VARIABLE = 1234 for MYLOOPVARIABLE in MYMODULE.SEQUENCE(SOME_VARIABLE): PROCESS(MYLOOPVARIABLE) if SOME_CONDITION(MYLOOPVARIABLE) or FLAG: with SOMETHING as ANOTHER: DO_ANOTHER_THING_WITH(ANOTHER) then this argument about large blocks of all-caps is relevant. Nobody here is advocating for great slabs of all-caps, and neither does PEP 8. For individual words occasionally scattered around the code, the argument against using nothing but all-caps is irrelevant. When we read, we don't actually look at every letter in a sentence, but actually the shapes of the words. That's an over-simplification, i.e. inaccurate. But certainly looking at the overall shape of words is *part* of what we do. However, if it was *all* we do when reading, then we couldn't tell these words apart: case more core mean even user then when this that else than If I remember correctly, didn't you make the claim earlier that all-caps names draw the eye and emphasize that word? (If you did, I agree with it, and acknowledge that this in and of itself is not a desirable thing. It is a cost worth paying for the other benefits of having a convention for all-caps which doesn't depend on using a smart IDE and syntax highlighting.) It strikes me as a bit strange that one moment you are (?) claiming that all-caps names draw the eye, and the next you are giving as evidence for your position a source which claims the opposite: "...the monotonous rectangular shape of the All Caps text reducing the emphasis on the All Caps word." Seems like you are cherry-picking arguments that support you and hoping we don't read all the way through the article to find those that go against you. Speaking of which: "From a design perspective, All Caps can be useful for labels, logos and menus where your message doesn't involve reading large sections of text." 
We can add to that, from a coding perspective, all-caps can be useful for constants, environment variables, and other uses which don't involve reading large blocks of all-caps. [...] > > I can immediately tell that unlike spam and eggs, FILENAME ought to be a > > global constant, which is a valuable hint that I can probably find the > > value of FILENAME by looking at the top of the module, and not worry > > about it being rebound anywhere else. > > > Ctrl+F "filename =" > You can tell if it's rebound anywhere by the number of matches. Can I? You seem to know a lot about the editor I am using. What if it doesn't show the number of matches but only one match at a time? You are assuming that I only have one global variable filename and no local variables using the same name. That's an unsafe assumption. But even if it were safe, it seems strange that you are so worried about the microsecond or so extra reading time it takes to recognise an all-caps word, based on the "shape of the word" model, yet are prepared to pay this enormous cost probably counted in multiple seconds: - change the focus of my attention from the code I'm reading - remember this unreliable trick (see above) - move my hands to the position to type Ctrl-F - which for touch-typists involves the hardest key on the keyboard to press (Ctrl) using the weakest finger on the hand - depending on the editor, I may have to pause a moment or two while the search occurs - or possibly I have to de-focus and ignore intermediate results if the search occurs while I'm typing - refocus on where the number of results are displayed - correctly interpret this number in terms of the semantics "one match means only one binding" - draw the correct conclusion "hence a constant" - worry about whether I missed some other way the variable might have been re-bound e.g. ``for filename in list_of_files`` - and finally refocus back to where I'm reading the code. And this is supposed to be an improvement over a clean convention for constants? I don't think so. > [Steven D'Aprano] > > > What naming convention would you suggest for distinguishing between > > constants and variables? > > None. You don't need one. You are correct, having a good naming convention for constants is not strictly necessary. Especially for those who don't care about the readability of their code. No naming convention is *necessary*: so long as the variable names are distinguishable by the interpreter, we don't need conventions to distinguish functions from variables from classes from constants. We could just name everything using consecutive x1, x2, x3 ... names and the code would run just as well. Having good naming conventions is very valuable, but not *necessary*. Using all-caps for constants is very valuable, but you are right, it isn't *necessary*. > [Steven D'Aprano] > > > We can (usually) accurately > > recognise modules, classes and functions from context, but we can't do > > the same for constants. > > > What are you basing that claim on? I can tell that math.pi, string.digits, > and timedelta.resolution are constants just fine. Sure, but only because you know the semantics that pi is a numeric constant, digits refers only to the Hindu-Arabic numerals 0...9, etc. I wouldn't have guessed that timedelta.resolution is a constant, because I don't know that module so well. But how about filename pattern location person sigma characters Which of those are constants? All of those are taken from real code I've written, except "characters" which I just made up.
All of them have been constants in some modules and variables in others, except for sigma, but I'm not telling you which it was. Since it is so easy for you to tell a constant from a variable, you ought to be able to tell which it was. Right? Remember, the person reading your code is not necessarily an expert in the domain of your code. It might be trivial for you to say that spam.aardvark cannot possibly be anything but a constant, but to those who aren't experts in the domain, they might as well be metasyntactic variables. -- Steve From jpic at yourlabs.org Wed Jan 30 19:45:23 2019 From: jpic at yourlabs.org (Jamesie Pic) Date: Thu, 31 Jan 2019 01:45:23 +0100 Subject: [Python-ideas] Add list.join() please In-Reply-To: References: <5C4FDEFC.6070902@brenbarn.net> <20190129230457.GG1834@ando.pearwood.info> <6LixVJw1B3DaVyo9VmS5uteSplxoqeP_tKUDSderrQ9TpM-UhvDUgY518yazShmNFt-KbMMPENtvTGWwF6kv0g==@pm.me> <5C512C8D.3000803@canterbury.ac.nz> <0E539D40-A0AE-4DCD-A8D1-92C4CBA447BB@barrys-emacs.org> Message-ID: Let's see if this gets any downloads at all: https://pypi.org/project/mapjoin/ Sorry for this obscenity xD Thank you all for your replies! Have a great day Best regards From steve at pearwood.info Wed Jan 30 20:24:07 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 31 Jan 2019 12:24:07 +1100 Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS In-Reply-To: References: <20190129223001.GA92125@cskk.homeip.net> Message-ID: <20190131012407.GM1834@ando.pearwood.info> On Wed, Jan 30, 2019 at 01:47:56PM -0600, Abe Dillon wrote: > > Is it really that obnoxious? > > EXTREMELY! I don't agree with you that the use of one or two words in a sentence for EMPHASIS is obnoxious, and I don't think writing a one-word sentence is a good test of that. In any case, even if it is the case that all-caps has the downside that it draws the eye ("shouty") more than necessary, I believe that the benefits of the convention outweigh that cost. > > Does using upper case for constants measurably slow down coders? Can > you cite the actual papers describing such experiments that led to this > conclusion? > > https://www.mity.com.au/blog/writing-readable-content-and-why-all-caps-is-so-hard-to-read > https://en.wikipedia.org/wiki/All_caps#Readability > https://uxmovement.com/content/all-caps-hard-for-users-to-read/ > https://practicaltypography.com/all-caps.html None of those "cite the actual papers" as requested, and only the Wikipedia page cites secondary sources. But that's okay, since we don't dispute that wall-to-wall paragraphs of all-caps are hard to read, we only dispute that the judicious and occasional use of all-caps to communicate metadata about what would otherwise appear to be a variable hurts readability in any meaningful sense. It isn't very interesting to say that it takes the average programmer (let's say) 250ms to read this line: with open(filename, 'w') as f: versus (let's say) 280ms to read this one: with open(FILENAME, 'w') as f: I don't know if that is true or not, but even if it is, I'm not particularly interested in optimizing the time it takes to read individual words. When it comes to coding, the time used scanning over a line of text is typically insignificant compared to the time needed to understand the semantics and context.
It isn't that I particularly want to slow down reading of individual words, but I'm willing to lose (let's say) 10 to 20 milliseconds to read FILENAME versus filename, so that I don't have to spend (let's say) 30 to 120 seconds trying to determine whether or not the value of filename has been rebound somewhere I didn't expect and that's why my script is writing to the wrong file. > > from my experience having a visual clue that a value is a constant or an > enum is something pretty useful. > > Do you have any proof that it's useful? Not peer-reviewed, no, but from my own experience I know that generally speaking when I'm trying to understand the semantics of unfamiliar code (even if that is code I wrote myself!) if I see an ALLCAPS name, I generally know that on a first pass I can treat it as a given and ignore it without caring about its actual value. There are exceptions: for example, if there's a bug in my regex, then I do have to care about the value of PATTERN. If I'm writing to the wrong file, then I do have to care about the name of FILENAME. But even then, I can guess that the value is only set in one place. If there's a bug in my regex PATTERN, I can fix it once at the top of the module, and not worry about other parts of the script or library re-binding the value. > Have you ever been tempted to > modify math.pi or math.e simply because they're lower case? Have you ever > stopped to wonder if those values change? This is a bit of a red herring (but only a bit) because the major value for signalling constantness is not so much as a warning to the *writer* not to change them, but to the *reader* that in well-written code, they haven't been changed. Obviously you can't have the second without the first, but since code is read more than it is written, the second occurs much more often. When it comes to math.pi and math.e, why would you want to change them? What is your use-case for changing them? I actually do have one. Many years ago, I wondered whether changing math.pi would change how math.sin, cos and tan work. So I tried this: py> import math py> math.cos(2*math.pi) # Period of cosine is 2π. 1.0 py> math.pi = 3 # Change the universe! π is now three exactly. py> math.cos(2*math.pi) # Should still be 1, right? 0.960170286650366 I learned that the trig functions don't work that way. Ever since then, I've never had any reason to want to change the value of math.pi. What would be the point? > If the socket library used packet_host, packet_broadcast, etc. instead of > PACKET_HOST, PACKET_BROADCAST, ETC. would you be confused about whether > it's a good idea to rebind those variables? If they are variables, then by definition they must vary. If they vary, there must be reasons to rebind them. Since these are not flagged as "private" with a leading underscore, presumably they are part of the public API, so I would expect that, yes, rebinding those variables was a good idea. Since I'm not a domain expert when it comes to sockets, I would probably spend many minutes, maybe hours, trying to work out what part of the socket API requires me to set these global variables, and why. But in reality, since they are clearly flagged as constants, I can assume that they are intended as read-only constants, not global variables. I don't need to be a domain expert on sockets to know that rebinding PACKET_HOST is a bad idea, I just need to know that it isn't supported by the socket module. (The danger in asking rhetorical questions is that sometimes the answer isn't the one you expected.)
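And that assumption is cheap to verify mechanically, precisely because the convention exists. A toy sketch of such a check (this is not how any real linter is implemented, and the module source in the string is invented for illustration):

import ast

source = """
MAX_RETRIES = 3

def retry():
    global MAX_RETRIES
    MAX_RETRIES = 5      # the kind of rebinding a checker could flag
"""

seen = set()
for node in ast.walk(ast.parse(source)):
    if isinstance(node, (ast.Assign, ast.AugAssign)):
        targets = node.targets if isinstance(node, ast.Assign) else [node.target]
        for target in targets:
            if isinstance(target, ast.Name) and target.id.isupper():
                if target.id in seen:
                    print("line %d: rebinding of constant %s"
                          % (node.lineno, target.id))
                seen.add(target.id)

A real tool would have to track scopes and aliasing, but even this crude version only works because the all-caps convention marks which names are meant to be bound exactly once.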
> It seems to me that nobody is actually considering what I'm actually > talking about very carefully. They just assume that because all caps is > used to convey information, that information is actually important. Or maybe some of us have thought carefully about what you have said, and concluded that you are making an unjustified micro-optimization for reading time as measured by eye-tracking time over individual words, at the cost of total editing and debugging time. > Not just > important, but important enough that it should be in PEP-8. They say I > should just violate PEP-8 because it's not strictly enforced. It is > strictly enforced in workplaces. Clearly it isn't "strictly enforced". Because the single most important rule of PEP 8 is the one mentioned right at the start about knowing when to break all the other rules. If your workplace forbids any exceptions to the other PEP 8 rules, then they are violating the most important PEP 8 rule of all. > > Surely, I'd hate reading a newspaper article where the editor generously > sprinkled upper case words everywhere > > Exactly. If it's an eye-sore in every other medium, then it seems likely to > me that the only reason programmers don't consider it an eye-sore is they've > become inured to it. It isn't an eyesore in every other context. You are making a logical error in assuming that since wall-to-wall all-caps significantly hurt readability, so must individual all-caps words. This is simply not correct. Drinking six litres of water in a single sitting will likely kill an adult; therefore (according to your reasoning) we shouldn't drink even a single sip of water. https://www.medicaldaily.com/taste-death-ld50-3-popular-drinks-can-kill-you-298918 Even if eye-tracking experiments show that it takes a fraction of a second longer to read that one word, that doesn't correspond to "hurting readability" in any meaningful sense. Optimizing for eye-tracking time for its own sake is not a useful thing for programmers to worry about. > > but analogies only go so far, reading code has some similarities with > reading prose, but it is still not the same activity. > > CAN you articulate what is DIFFERENT about READING code that makes the ALL > CAPS STYLE less offensive? You are misusing the word "offensive" there. Please don't use it to mean "hard to read" or "annoying". Again, to answer your rhetorical question in a way you probably hoped it wouldn't be answered: because we don't write wall-to-wall all-caps code (at least not in Python). We're not likely to have six all-caps names in a single expression; we're probably not likely to have six all-caps names in an entire function. Consequently the cost of reading the all-caps word is tiny, and the benefit is greater. Just as it is for using all-caps for initialisms and acronyms like AFL, USA, FAQs, LD50, LED, PRC, PC, etc. Or for the occasional use for emphasis. Etc. -- Steve From jpic at yourlabs.org Wed Jan 30 20:50:50 2019 From: jpic at yourlabs.org (Jamesie Pic) Date: Thu, 31 Jan 2019 02:50:50 +0100 Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS In-Reply-To: <20190131012407.GM1834@ando.pearwood.info> References: <20190129223001.GA92125@cskk.homeip.net> <20190131012407.GM1834@ando.pearwood.info> Message-ID: On Thu, Jan 31, 2019 at 2:24 AM Steven D'Aprano wrote: > > Consequently the cost of reading the all-caps word is tiny, and the > benefit is greater. > What do you think about the cost of typing caps? Surely, shift aggravates repetitive strain injury. --
From pythonchb at gmail.com Wed Jan 30 20:57:00 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Wed, 30 Jan 2019 17:57:00 -0800 Subject: [Python-ideas] Add list.join() please In-Reply-To: References: Message-ID: > I'm just saying assembling strings is > a common programming task and that we have two different methods with > the same name and inconsistent signatures No, we don't - one is for assembling paths, one for generic strings. And in recent versions, there is a new totally different way to assemble paths. Also: the primary use cases are different - when I use os.path.join(), I usually have the components in variables or literals, so the *args convention is most natural. When I am assembling text with str.join() I usually have the parts in an iterable, so that is the most natural. And besides, Python (necessarily) has some inconsistencies - we don't need to correct them all. There have been multiple changes to str.join() discussed in this thread. Mostly orthogonal to each other. If anyone wants to move them forward, I suggest you be clear about which you are advocating for. 1) that there be a join() method on lists (or sequences) - frankly, I think that's a non-starter; I wouldn't waste any more time on it. 2) that str.join() take multiple positional arguments to join (similar to os.path.join) - This could probably be added without much disruption, so if you really want it, make your case. I personally don't think it's worth it - it would make the API more confusing, with little gain. 3) that str.join() (or some new method/function) "stringify" (probably by calling str()) the items so that non-strings could be joined in one call - we've had a fair bit of discussion on this one, and given Python's strong typing and the many ways one might want to convert an arbitrary type to a string, this seems like a bad idea. Particularly bad to add to str.join() (Or was "stringify" supposed to only do the string conversion, not the joining? If so, even more pointless) Any others? -CHB -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... URL: From abedillon at gmail.com Wed Jan 30 20:57:35 2019 From: abedillon at gmail.com (Abe Dillon) Date: Wed, 30 Jan 2019 19:57:35 -0600 Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS In-Reply-To: <20190130235357.GL1834@ando.pearwood.info> References: <20190105034143.GS13616@ando.pearwood.info> <20190130235357.GL1834@ando.pearwood.info> Message-ID: [Steven D'Aprano] > I'm not disputing that, I mean, you literally wrote: [Steven D'Aprano] > To me, "visual flow of code" is the way it flows down and across the > page, *not* the shape of the > individual words. So it sounded a lot like you were. [Steven D'Aprano] > I'm disputing your claim that the presence of a > all-caps CONSTANT somewhere on a page full of lower and mixed case code > "destroys the visual flow of code". Maybe I was being a little hyperbolic, but it depends on degree. If every other line of code has LOGGER.debug(...) or STATSD_EMITER.count(...) in it, then it tends to out-shout the code you're trying to read. [Steven D'Aprano] > The rest of your comment is a distraction. Like going on a rant about one of my sources' contrast ratios and font choices? [Steven D'Aprano] > When we read, we don't actually look at every letter in a sentence, > but actually the shapes of the words.
That's an over-simplification, i.e. inaccurate. I'm sure you've heard of "Typoglycemia" before. It would be interesting to see how readability degrades as more and more of the scrambled words are converted to all caps. [Steven D'Aprano] > But certainly looking at the overall shape of words is *part* of what we > do. However, if it was > *all* we do when reading, then we couldn't tell these words apart: > case more core mean even user > then when this that else than I guess I might have some sort of disability that you don't, but I find those two lines much more difficult to read or even focus on than normal text. It's very hard to describe the sensation, but it's very unpleasant. It's like my eye doesn't know where to start so I keep scanning back and forth like a Cylon. [Steven D'Aprano] > If I remember correctly, didn't you make the claim earlier that all-caps > names draw the eye and emphasize that word? Yes. It was I who said that. I know it seemingly contradicts statements in some of the sources I cited, but I think those are referring to "slabs" of all caps. When it's lots of normal text with a few all caps, my eye is drawn to the all caps; when it's a block of all caps, everything is a wash and perhaps the few lower-case words stand out. I'm sorry that's confusing. I might go look for better sources that pertain more exclusively to code, but honestly, it doesn't look like anyone else cares or will agree with me any time soon. [Steven D'Aprano] > It seems strange that you are so worried about > the microsecond or so extra reading time it takes to recognize an > all-caps word, based on the "shape of the word" model, yet are prepared > to pay this enormous cost probably counted in multiple seconds: ... It seems like the fundamental problem you have is trying to find where and when a variable was last bound. I don't understand why your go-to solution to that problem is in the style-guide of a language. It doesn't seem at all related to style. It seems like the kind of problem that's completely in the wheel-house of an IDE. Does it not feel to you like you're relying on a style-kludge to make up for inadequate tools? Why perpetuate that? Why not demand more from your tools? [Steven D'Aprano] > You are correct, having a good naming convention for constants is not > strictly necessary. Especially for those who don't care about the > readability of their code. I've pointed this out several times before, but Python itself violates the all caps constant naming convention all over the place and yet hardly anybody notices. The fear you seem to have about not communicating constancy clearly seems to be entirely hypothetical. The only person that's tried to show me a case where using all caps was crucial completely defeated his own argument by presenting a non-constant that had to be documented because its usage was so non-obvious. I haven't heard an explanation yet for why it's so important that pickle.DEFAULT_PROTOCOL be all caps while sys.copyright is not. If it's as important as you claim, then shouldn't there be mass hysteria? Cats and dogs getting along together, etc.? [Steven D'Aprano] > Sure, but only because you know the semantics that pi is a numeric > constant, digits refers only to the Hindu-Arabic numerals 0...9, etc. I > wouldn't have guessed that timedelta.resolution is a constant, because I > don't know that module so well. Be honest: what would your first guess be if you saw code using timedelta.resolution? Where and when would you guess it was last bound?
Would you guess that it's a variable that changes on a whim or is often rebound? How often do you deal with interfaces where module-level variables are intended to be re-bound? Would you say that's good practice? [Steven D'Aprano] > how about > filename > pattern > location > person > sigma > characters > Which of those are constants? > All of those are taken from real code I've written, except "characters" > which I just made up. All of them have been constants in some modules > and variables in others, except for sigma, but I'm not telling you which > it was. Since it is so easy for you to tell a constant from a variable, > you ought to be able to tell which it was. Right? I would be able to tell very quickly if I saw those in my IDE whether they were local or global variables. I tend to only import up to the module level (as per Google's style guidelines) specifically so that others know where various variables (like math.pi) come from. In my experience, most constants are configuration that people haven't decided to make configurable yet. I worked at a computer vision lab where the camera had a resolution of 640 x 480 which were originally represented as constants in a lot of our code VERTICAL_RESOLUTION and HORIZONTAL_RESOLUTION , eventually; they became self.horizontal_resolution and self.vertical_resolution. So, my guess is that sigma is either a variable or will become a variable at some point in the future. On Wed, Jan 30, 2019 at 5:59 PM Steven D'Aprano wrote: > On Wed, Jan 30, 2019 at 03:51:22PM -0600, Abe Dillon wrote: > > [Steven D'Aprano] > > > > > > ALL_CAPS_IS_OBNOXIOUS > > > > > > > > It destroys the visual flow of code > > > Does it? This claim doesn't ring true to me. To me, "visual flow of > > > code" is the way it flows down and across the page, not the shape of > the > > > individual words. > > > > > > It does. Your field of vision is two-dimensional and multi-scale. > > Your visual system uses lots of queues to determine what to focus on and > > how to interpret it. > > So both the way code flows down the page and the shape of individual > words > > matter to readability: > > I'm not disputing that, I'm disputing your claim that the presence of a > all-caps CONSTANT somewhere on a page full of lower and mixed case code > "destroys the visual flow of code". > > The rest of your comment is a distraction. You have made one strong > claim (all-caps constants destroy the flow of code), I've explained why > I consider it dubious, and you've introduced a completely different, > milder, uncontroversial fact, that the shape of individual words > slightly influences how that word is read. > > Yes they do. How does that support your claim that a handful of all-caps > names scattered over a page of code "destroys the flow of text"? Does > the same apply to a page of prose that happens to mention NASA, the > USSR, the USA, FAQ or some other TLA? > > > > > https://www.mity.com.au/blog/writing-readable-content-and-why-all-caps-is-so-hard-to-read > > Let's start with the first paragraph: > > "There's nothing worse than browsing the web and being > hit with a huge slab of text in All Caps - that is, all > in CAPITAL LETTERS." > > Yes there is: websites (like this one) which use low-contrast light > grey text on a white or slightly-lighter grey background, especially if > (like this one) they use a sans serif font. > > (It could have been even worse though: at least the page doesn't use a > tiny font size.) 
> > What does the shape of the letters matter if the reader has problems > distinguishing them from the background due to lack of contrast? > > https://www.contrastrebellion.com/ > > > In any case, we're not talking about "a huge slab" of all-caps. If you > write your Python code like this: > > # Don't do this! > import MYMODULE > SOME_VARIABLE = 1234 > for MYLOOPVARIABLE in MYMODULE.SEQUENCE(SOME_VARIABLE): > PROCESS(MYLOOPVARIABLE) > if SOME_CONDITION(MYLOOPVARIABLE) or FLAG: > with SOMETHING as ANOTHER: > DO_ANOTHER_THING_WITH(ANOTHER) > > then this argument about large blocks of all-caps is relevant. Nobody > here is advocating for great slabs of all-caps, and neither does PEP 8. > For individual words occasionally scattered around the code, the > argument against using nothing but all-caps is irrelevant. > > When we read, we don't actually look at every letter in a sentence, > but actually the shapes of the words. > > That's an over-simplification, i.e. inaccurate. But certainly looking at > the overall shape of words is *part* of what we do. However, if it was > *all* we do when reading, then we couldn't tell these words apart: > > case more core mean even user > > then when this that else than > > If I remember correctly, didn't you make the claim earlier that all-caps > names draw the eye and emphasize that word? > > (If you did, I agree with it, and acknowledge that this in and of itself > is not a desirable thing. It is a cost worth paying for the other > benefits of having a convention for all-caps which doesn't depend on > using a smart IDE and syntax highlighting.) > > It strikes me as a bit strange that one moment you are (?) claiming that > all-caps names draw the eye, and the next you are giving as evidence for > your position a source which claims the opposite: > > "...the monotonous rectangular shape of the All Caps text reducing > the emphasis on the All Caps word." > > Seems like you are cherry-picking arguments that support you and hoping > we don't read all the way through the article to find those that go > against you. Speaking of which: > > "From a design perspective, All Caps can be useful for labels, > logos and menus where your message doesn't involve reading large > sections of text." > > We can add to that, from a coding perspective, all-caps can be useful > for constants, environment variables, and other uses which don't involve > reading large blocks of all-caps. > > > [...] > > > I can immediately tell that unlike spam and eggs, FILENAME ought to be > a > > > global constant, which is a valuable hint that I can probably find the > > > value of FILENAME by looking at the top of the module, and not worry > > > about it being rebound anywhere else. > > > > > > + f "filename =" > > You can tell if its rebound anywhere by the number of matches. > > Can I? You seem to know a lot about the editor I am using. What if it > doesn't show the number of matches but only one match at a time? > > You are assuming that I only have one global variable filename and no > local variables using the same name. That's an unsafe assumption. 
> > But even if it were safe, it seems strange that you are so worried about > the microsecond or so extra reading time it takes to recognise an > all-caps word, based on the "shape of the word" model, yet are prepared > to pay this enormous cost probably counted in multiple seconds: > > - change the focus of my attention from the code I'm reading > > - remember this unreliable trick (see above) > > - move my hands to the position to type Ctrl-F > > - which for touch-typists involves the hardest key on the keyboard > to press (Ctrl) using the weakest finger on the hand > > - depending on the editor, I may have to pause a moment or two > while the search occurs > > - or possibly I have to de-focus and ignore intermediate results > if the search occurs while I'm typing > > - refocus on where the number of results are displayed > > - correctly interpret this number in terms of the semantics > "one match means only one binding" > > - draw the correct conclusion "hence a constant" > > - worry about whether I missed some other way the variable might > have been re-bound e.g. ``for filename in list_of_files`` > > - and finally refocus back to where I'm reading the code. > > And this is supposed to be an improvement over a clean convention for > constants? I don't think so. > > > > > [Steven D'Aprano] > > > > > What naming convention would you suggest for distinguishing between > > > constants and variables? > > > > None. You don't need one. > > You are correct, having a good naming convention for constants is not > strictly necessary. Especially for those who don't care about the > readability of their code. > > No naming convention is *necessary*, so long as the variable names are > distinguishable by the interpreter we don't need conventions to > distinguish functions from variables from classes from constants. We > could just name everything using consecutive x1, x2, x3 ... names and > the code would run just as well. > > Having good naming conventions is very valuable, but not *necessary*. > Using all-caps for constants is very valuable, but you are right, it > isn't *necessary*. > > > > [Steven D'Aprano] > > > > > We can (usually) accurately > > > recognise modules, classes and functions from context, but we can't do > > > the same for constants. > > > > > > What are you basing that claim on? I can tell that math.pi, > string.digits, > > and timedelta.resolution are constants just fine. > > Sure, but only because you know the semantics that pi is a numeric > constant, digits refers only to the Hindi-Arabic numerals 0...9, etc. I > wouldn't have guessed that timedelta.resolution is a constant, because I > don't know that module so well. > > But how about > > filename > pattern > location > person > sigma > characters > > Which of those are constants? > > All of those are taken from real code I've written, except "characters" > which I just made up. All of them have been constants in some modules > and variables in others, except for sigma, but I'm not telling you which > it was. Since it is so easy for you to tell a constant from a variable, > you ought to be able to tell which it was. Right? > > Remember, the person reading your code is not necessarily an expert in > the domain of your code. It might be trivial for you to say that > spam.aardvark cannot possibly be anything but a constant, but to those > who aren't experts in the domain, they might as well be metasyntactic > variables. 
>
> --
> Steve
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pythonchb at gmail.com Wed Jan 30 20:59:42 2019
From: pythonchb at gmail.com (Christopher Barker)
Date: Wed, 30 Jan 2019 17:59:42 -0800
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To:
References: <20190129223001.GA92125@cskk.homeip.net>
 <20190131012407.GM1834@ando.pearwood.info>
Message-ID:

On Wed, Jan 30, 2019 at 5:54 PM Jamesie Pic wrote:

>
> What do you think about the cost of typing caps? Surely, shift
> aggravates repetitive strain injury.

We spend more time reading code than typing it -- even more so with code
completion.

It's a style *guide* folks -- let it go!

- CHB

> --
> ?
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
--
Christopher Barker, PhD

Python Language Consulting
- Teaching
- Scientific Software Development
- Desktop GUI and Web Development
- wxPython, numpy, scipy, Cython
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From abedillon at gmail.com Wed Jan 30 21:03:40 2019
From: abedillon at gmail.com (Abe Dillon)
Date: Wed, 30 Jan 2019 20:03:40 -0600
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To:
References: <20190129223001.GA92125@cskk.homeip.net>
 <20190131012407.GM1834@ando.pearwood.info>
Message-ID:

[Christopher Barker]
> It's a style *guide* folks -- let it go!

While I don't harbor any delusions that this is going anywhere (given the
feedback so far), that's a double-edged sword. It's also an extremely
benign request, why fight so hard?

On Wed, Jan 30, 2019 at 8:00 PM Christopher Barker wrote:

> On Wed, Jan 30, 2019 at 5:54 PM Jamesie Pic wrote:
>
>> What do you think about the cost of typing caps? Surely, shift
>> aggravates repetitive strain injury.
>
> We spend more time reading code than typing it -- even more so with code
> completion.
>
> It's a style *guide* folks -- let it go!
>
> - CHB
>
>> --
>> ?
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>
> --
> Christopher Barker, PhD
>
> Python Language Consulting
> - Teaching
> - Scientific Software Development
> - Desktop GUI and Web Development
> - wxPython, numpy, scipy, Cython
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mertz at gnosis.cx Wed Jan 30 21:07:11 2019
From: mertz at gnosis.cx (David Mertz)
Date: Wed, 30 Jan 2019 21:07:11 -0500
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To:
References: <20190129223001.GA92125@cskk.homeip.net>
Message-ID:

On Wed, Jan 30, 2019, 4:23 PM Abe Dillon wrote:

> Consider that math.pi and math.e are constants that are not all caps,
> have you ever been tempted to re-bind those variables?
>

I generally use 'from math import pi as PI' because the lower case is
confusing and misnamed.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rosuav at gmail.com Wed Jan 30 21:08:31 2019
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 31 Jan 2019 13:08:31 +1100
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To:
References: <20190129223001.GA92125@cskk.homeip.net>
 <20190131012407.GM1834@ando.pearwood.info>
Message-ID:

On Thu, Jan 31, 2019 at 1:05 PM Abe Dillon wrote:
>
> [Christopher Barker]
>>
>> It's a style *guide* folks -- let it go!
>
> While I don't harbor any delusions that this is going anywhere (given
> the feedback so far), that's a double-edged sword. It's also an
> extremely benign request, why fight so hard?
>

I think you should be able to tell from the backlash that this is NOT a
benign request. You're asking for the standard library's style guide to
be changed, purely for the convenience of some company that is
pigheadedly refusing to read the first few paragraphs of a document that
it is blindly adopting.

Time to take this to python-list?

ChrisA

From jpic at yourlabs.org Wed Jan 30 21:09:50 2019
From: jpic at yourlabs.org (Jamesie Pic)
Date: Thu, 31 Jan 2019 03:09:50 +0100
Subject: [Python-ideas] Add list.join() please
In-Reply-To:
References:
Message-ID:

Thank you Christopher for the heads up. Using paths as an example was a
really poor choice: it distracts readers from the actual problem, which
is assembling a human-readable string.

Maybe pointless, but consider the cost: 30 seconds to run my test suite,
see my failure, correct it, and run it again to be at the same point as
if I hadn't made the mistake. At 300 workdays a year, that adds up to 25
hours over 10 years, and I have already strategized my R&D to capitalize
on Python for another 10 years. So spending less than 25 hours on this
would seem profitable, despite how pointless it is to actual programmers.

Anyway, at this point the proposal could also look like
str.joinmap(*args, key=str). But I don't know, I can iterate on mapjoin
for a while and open a new topic when I stabilize it.

Meanwhile, I'm always happy to read y'all so feel free to keep posting :P

Have a great day

From abedillon at gmail.com Wed Jan 30 21:27:41 2019
From: abedillon at gmail.com (Abe Dillon)
Date: Wed, 30 Jan 2019 20:27:41 -0600
Subject: [Python-ideas] Potential PEP: with/except
In-Reply-To:
References:
Message-ID:

[Calvin Spealman]
> Why not allow excepts on for loops, for example?

Why not, indeed... I've heard there's a non-negligible performance
penalty for setting up a try statement, so it might be important to only
set a for-loop up as a guarded for-loop upon reading the "except"
statement (if the compiler can handle such behavior).

On Tue, Jan 22, 2019 at 2:24 PM Calvin Spealman wrote:

> On Tue, Jan 22, 2019 at 3:11 PM Paul Ferrell wrote:
>
>> I've found that almost any time I'm writing a 'with' block, it's doing
>> something that could throw an exception. As a result, each of those
>> 'with' blocks needs to be nested within a 'try' block. Due to the
>> nature of 'with', it is rarely (if ever) the case that the try block
>> contains anything other than the with block itself.
>>
>> As a result, I would like to propose that the syntax for 'with' blocks
>> be changed such that they can be accompanied by 'except', 'finally',
>> and/or 'else' blocks as per a standard 'try' block.
>> These would handle
>> exceptions that occur in the 'with' block, including the execution of
>> the applicable __enter__ and __exit__ methods.
>>
>> Example:
>>
>> try:
>>     with open(path) as myfile:
>>         ...  # Do stuff with file
>> except (OSError, IOError) as err:
>>     logger.error("Failed to read/open file {}: {}".format(path, err))
>>
>> The above would turn into simply:
>>
>> with open(path) as myfile:
>>     ...  # Do stuff with file
>> except (OSError, IOError) as err:
>>     logger.error(...)
>>
>
> It definitely makes sense, both the problem and the proposed solution.
>
> The thing that concerns me is that any such problem and solution seems
> to apply equally to any other kind of block. Why not allow excepts on
> for loops, for example?
>
>> I think this is rather straightforward in meaning and easy to read,
>> and simplifies some unnecessary nesting. I see this as the natural
>> evolution of what 'with'
>> is all about - replacing necessary try-finally blocks with something
>> more elegant. We just didn't include the 'except' portion.
>>
>> I'm a bit hesitant to put this out there. I'm not worried about it
>> getting shot down - that's kind of the point here. I'm just pretty
>> strongly against unnecessary syntactical additions to the language.
>> This though, I think I can except. It introduces no new concepts and
>> requires no special knowledge to use. There's no question about what
>> is going on when you read it.
>>
>> --
>> Paul Ferrell
>> pflarr at gmail.com
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>
> --
>
> CALVIN SPEALMAN
>
> SENIOR QUALITY ENGINEER
>
> cspealma at redhat.com  M: +1.336.210.5107
>
> TRIED. TESTED. TRUSTED.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From steve at pearwood.info Thu Jan 31 01:08:59 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 31 Jan 2019 17:08:59 +1100
Subject: [Python-ideas] Add list.join() please
In-Reply-To:
References:
Message-ID: <20190131060858.GN1834@ando.pearwood.info>

On Wed, Jan 30, 2019 at 12:37:45PM +0100, Jamesie Pic wrote:
> Thanks for your reply Jimmy !
As suggested by Chris and Steven, we > > might also want to throw in a "key" kwarg, that could be none by > > default to keep BC, but also allow typecasting: > > > > ' '.join('a', 2, key=str) > > Please don't misrepresent what I said, I never suggested a key function > like that. And I don't recall Chris suggesting it either, but perhaps > I missed one of his posts. I didn't, but I don't know if Chris Barker did. (Can't swing a cat without hitting someone named Steve or Chris, in some spelling or another!) ChrisA From steve at pearwood.info Thu Jan 31 01:39:10 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 31 Jan 2019 17:39:10 +1100 Subject: [Python-ideas] Potential PEP: with/except In-Reply-To: References: Message-ID: <20190131063910.GP1834@ando.pearwood.info> On Wed, Jan 30, 2019 at 08:27:41PM -0600, Abe Dillon wrote: > I've heard there's a non-insignificant performance penalty for setting up a > try statement, so it might be important to only set a for-loop up as a > guarded for-loop upon reading the "except" statement (if the compiler can > handle such behavior). I believe you have been misinformed. I admit I haven't tried it recently, but back in Python 2.5 days or so I ran some benchmarks which satisfied me that: - the cost of setting up a try block was effectively zero; - the cost of catching an exception is quite steep. I'd be surprised if Python 3 reverses that. -- Steven From steve at pearwood.info Thu Jan 31 02:05:58 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 31 Jan 2019 18:05:58 +1100 Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS In-Reply-To: References: <20190129223001.GA92125@cskk.homeip.net> Message-ID: <20190131070558.GQ1834@ando.pearwood.info> On Wed, Jan 30, 2019 at 05:41:56PM -0600, Abe Dillon wrote: > I don't know how to make myself more clear. I haven't been talking about > entire documents. Some of the links I provided discuss tests conducted on > entire documents. That is not relevant to this discussion. Then why on earth are you providing those links as support for your assertion that the use of a few all-caps identifiers anywhere on the page "destroys the visual flow of code" (your words) and reduces readability? Don't blame us for the fact that the links you provided don't actually support your assertions. You chose them, you posted them at least three times, if they don't support your position it is cheeky of you to tell us now that they are irrelevant. The links you have provided definitively support the idea that large paragraphs of all-caps text hurt readability, reducing reading speed by about 10%. But they say nothing about the cost of using the odd all-caps word here or there. Ten percent decrease in reading speed is nothing to sneer at, but it is a relatively minor cost. In any case, those readability tests were performed on ordinary non-programmers, reading prose text. For programmers reading code, I would expect that the physical cost of reading words is likely to be only a very small proportion of the total cost of comprehending the code. I've spent *hours* trying to understand the semantics of code that took seconds to read. Compared to that, what's plus or minus ten percent? But never mind, let's go with the 10% figure. That applies to an entire paragraph of all-caps, versus mixed case. It says nothing about the cost of using one or two, or even ten or twelve, all-caps identifiers in pages of code which is otherwise 95% or more mixed case. 
If 100% all-caps is 10% more costly to read, then 5% all-caps is probably
no more than 0.5% more costly to read. Which puts it firmly in "margin of
error" territory.

--
Steven

From tjreedy at udel.edu Thu Jan 31 02:53:47 2019
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 31 Jan 2019 02:53:47 -0500
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To:
References: <20190129223001.GA92125@cskk.homeip.net>
Message-ID:

On 1/30/2019 6:41 PM, Abe Dillon wrote:

I think continued discussion is pretty useless. 1. Most everything
relevant has been said, likely more than once, and 2. The required
core-developer support does not exist.

PEP 8 was written by Guido with input from other core developers. As
Chris said, it defines itself in the opening paragraphs as a "guideline"
for the "code in the stdlib". It also disclaims a straitjacket approach.
We seldom revise the PEP. Doing so would require initial support from a
core developer who garnered more support. I am not fond of caps either,
but not displeased enough to promote the proposal. (What I would like is
clarification (from Guido or co-authors) of what a 'constant' means in
this context.)

Abe, if an employer imposes PEP 8 on employees, especially in an overly
rigid manner that the PEP discourages, that is an issue between the
employer and employees. It is not the problem of Guido and other core
developers.

--
Terry Jan Reedy

From p.f.moore at gmail.com Thu Jan 31 04:12:51 2019
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 31 Jan 2019 09:12:51 +0000
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To:
References: <20190129223001.GA92125@cskk.homeip.net>
 <20190131012407.GM1834@ando.pearwood.info>
Message-ID:

On Thu, 31 Jan 2019 at 02:09, Chris Angelico wrote:
> Time to take this to python-list?

Better still, let's just drop it.
Paul

From marcos.eliziario at gmail.com Thu Jan 31 08:55:49 2019
From: marcos.eliziario at gmail.com (Marcos Eliziario)
Date: Thu, 31 Jan 2019 11:55:49 -0200
Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS
In-Reply-To:
References: <20190129223001.GA92125@cskk.homeip.net>
Message-ID:

You cited articles, but all of them in the context of reading prose, not
code. What I was asking for were papers describing controlled experiments
with a good cross-section of programmers with different skill levels
against a good cross-section of different project and team sizes.

Reading code has some similarity to reading prose, for sure, but let's
not forget that code is actually a symbolic language more akin to math
and logic that "tries" to read like prose. Also, there is the issue of
long-embedded traditions for naming constants like these that make such a
change violate the time-honoured goal of "least astonishment".
Professional programmers usually work/worked previously in several
languages, and most of the usual ones follow this style:

*Ruby:* The Ruby style guide as per rubocop says to use
SCREAMING_SNAKE_CASE for constants.

*Java:* The names of variables declared class constants and of ANSI
constants should be all uppercase with words separated by underscores
("_"). (ANSI constants should be avoided, for ease of debugging.)

*C:* Multiple style guides, mostly derived from K&R. For example, the
linux kernel style guide is clear: Names of macros defining constants and
labels in enums are capitalized.
*Javascript:* Also multiple competing style guides, but in general the
consensus seems to be that *exported* consts must be capitalized.

*Go:* The exception here (as usual in Go). Constants should be CamelCase,
with a lower-case initial for symbols that are not exported outside the
current package, and an upper-case initial for exported symbols.

*R:* Chaos: do whatever style the code base you're working with seems to
follow.

Those are the languages that most Python programmers are likely to have
to work with alongside Python, and as you can see, with the exception of
Go and R, most of them follow exactly the same convention.

Let's also not forget the humongous number of POSIX constants that are
defined as ALL_CAPS, and that we definitely shouldn't be rewriting in
another style.

Best,
Marcos Eliziario

On Wed, Jan 30, 2019 at 7:22 PM Abe Dillon wrote:

> > Capitalizing constants may be slightly harder to read but constants in
> > code are the minority and emphasizing them is precisely the point.
>
> The question I'm trying to get everyone to actually think about:
>
> Is the communication of constancy via ALL CAPS so important that it must
> be in PEP-8 despite the documented harm that all caps does to
> readability?
>
> https://www.mity.com.au/blog/writing-readable-content-and-why-all-caps-is-so-hard-to-read
> https://en.wikipedia.org/wiki/All_caps#Readability
> https://uxmovement.com/content/all-caps-hard-for-users-to-read/
> https://practicaltypography.com/all-caps.html
>
> I've gotten many responses that seem like a knee-jerk reaction in favor
> of the status quo. I get the sense people "like" all caps because
> they've been conditioned to believe it conveys important information,
> but they haven't taken the time to really consider how valid that belief
> is.
>
> Consider that math.pi and math.e are constants that are not all caps,
> have you ever been tempted to re-bind those variables? Do you seriously
> worry that those variables are going to be re-bound by other code?
> Functions and classes are essentially constants that aren't all caps,
> yet nobody gets confused about whether or not to re-bind those, or if
> other code will rebind them.
>
> If socket.AF_INET6 were socket.af_inet6 would you consider re-binding
> that variable? Would you be worried that other code will re-bind it? Can
> you measure the value of the information conveyed by all-caps? Are you
> so sure that it's as important as you think?
>
> I've gotten a lot of responses like, "If you don't like it just ignore
> PEP-8, it's not mandatory".
> A) It is mandatory in many cases.
> B) We could just as easily NOT prescribe all caps in PEP-8 but still
> allow it. In other words: you can use all caps if you want to but it's
> not mandatory or in PEP-8. I would like to discourage its use, but we
> don't have to go so far. That way nobody has to violate PEP-8.
>
> On Wed, Jan 30, 2019 at 2:01 PM Bruce Leban wrote:
>
>> Text in color or against black backgrounds is harder to read than black
>> on white.
>> See for example:
>> https://trevellyan.biz/graphic-design-discussion-how-color-and-contrast-affect-readability-2/
>>
>> Text where different words in the same sentence are in different colors
>> is even harder to read.
>>
>> And I think we should totally ban anyone on the web from putting light
>> gray text on a lighter gray background
>> (see https://www.wired.com/2016/10/how-the-web-became-unreadable/ for a
>> good discussion).
>> >> Or to say that another way: >> I think we should totally ban anyone on the web from putting light gray >> text on a lighter gray background!! >> >> But many of us use editors that use color for syntax highlighting and we >> do that because projecting semantics onto the color axis works for us. So >> we don't ban colorizing text and we shouldn't. >> >> Capitalizing constants may be slightly harder to read but constants in >> code are the minority and emphasizing them is precisely the point. >> >> I'm MINUS_ONE on changing PEP 8. Make your own styleguide if you don't >> want to follow PEP 8 in your code. >> >> --- Bruce >> >> >> >> >> On Wed, Jan 30, 2019 at 11:48 AM Abe Dillon wrote: >> >>> > Is it that really obnoxious? >>> >>> EXTREMELY! >>> >>> > Does using upper case for constants measurably slows down coders? Can >>> you cite the actual papers describing such experiments that lead to this >>> conclusion ? >>> >>> >>> https://www.mity.com.au/blog/writing-readable-content-and-why-all-caps-is-so-hard-to-read >>> https://en.wikipedia.org/wiki/All_caps#Readability >>> https://uxmovement.com/content/all-caps-hard-for-users-to-read/ >>> https://practicaltypography.com/all-caps.html >>> >>> > from my experience having a visual clue that a value is a constant or >>> an enum is something pretty useful. >>> >>> Do you have any proof that it's useful? Have you ever been tempted to >>> modify math.pi or math.e simply because they're lower case? Have you ever >>> stopped to wonder if those values change? >>> >>> If the socket library used packet_host, packet_broadcast, etc. instead >>> of PACKET_HOST, PACKET_BROADCAST, ETC. would you be confused about >>> whether it's a good idea to rebind those variables? Would you be tempted to >>> write the line of code: socket.packet_host = x? >>> >>> It seems to me that nobody is actually considering what I'm actually >>> talking about very carefully. They just assume that because all caps is >>> used to convey information that information is actually important. Not just >>> important, but important enough that it should be in PEP-8. They say I >>> should just violate PEP-8 because it's not strictly enforced. It is >>> strictly enforced in workplaces. I don't see why it can't be the other way >>> around: PEP-8 doesn't say to use all caps, but if you want to it's OK. >>> >>> > Surely, I'd hate reading a newspaper article where the editor >>> generously sprinkled upper case words everywhere >>> >>> Exactly. If it's an eye-sore in every other medium, then it seems likely >>> to me, the only reason programmers don't consider it an eye-sore is they've >>> become inured to it. >>> >>> > but analogies only go so far, reading code have some similarities with >>> reading prose, but still is not the same activity. >>> >>> CAN you articulate what is DIFFERENT about READING code that makes the >>> ALL CAPS STYLE less offensive? >>> >>> On Tue, Jan 29, 2019 at 6:09 PM Marcos Eliziario < >>> marcos.eliziario at gmail.com> wrote: >>> >>>> Is it that really obnoxious? Does using upper case for constants >>>> measurably slows down coders? Can you cite the actual papers describing >>>> such experiments that lead to this conclusion ? >>>> Because, from my experience having a visual clue that a value is a >>>> constant or an enum is something pretty useful. 
>>>> Surely, I'd hate reading a newspaper article where the editor >>>> generously sprinkled upper case words everywhere, but analogies only go so >>>> far, reading code have some similarities with reading prose, but still is >>>> not the same activity. >>>> >>>> Best, >>>> Marcos Eliziario >>>> >>>> >>>> >>>> Em ter, 29 de jan de 2019 ?s 20:30, Cameron Simpson >>>> escreveu: >>>> >>>>> On 29Jan2019 15:44, Jamesie Pic wrote: >>>>> >On Fri, Jan 4, 2019 at 10:07 PM Bernardo Sulzbach >>>>> > wrote: >>>>> >> I'd suggest violating PEP-8 instead of trying to change it. >>>>> > >>>>> >TBH even my bash global environment variables tend to become more and >>>>> >more lowercase ... >>>>> >>>>> If you mean _exported_ variables, then this is actually a really bad >>>>> idea. >>>>> >>>>> The shell (sh, bash, ksh etc) makes no enforcement about naming for >>>>> exported vs unexported variables. And the exported namespace ($PATH >>>>> etc) >>>>> is totally open ended, because any programme might expect arbitrary >>>>> optional exported names for easy tuning of defaults. >>>>> >>>>> So, you think, since I only use variables I intend and only export >>>>> variables I plan to, I can do what I like. Example script: >>>>> >>>>> a=1 >>>>> b=2 >>>>> export b >>>>> >>>>> So $b is now exported to subcommands, but not $a. >>>>> >>>>> However: the "exported set" is initially the environment you inherit. >>>>> Which means: >>>>> >>>>> Any variable that _arrives_ in the environment is _already_ in the >>>>> exported set. So, another script: >>>>> >>>>> a=1 >>>>> b=2 >>>>> # not exporting either >>>>> >>>>> If that gets called from the environment where you'd exported $b (eg >>>>> from the first script, which could easily be your ~/.profile or >>>>> ~/.bashrc), then $b gets _modified_ and _exported_ to subcommands, >>>>> even >>>>> though you hadn't asked. Because it came in initially from the >>>>> environment. >>>>> >>>>> This means that you don't directly control what is local to the script >>>>> and what is exported (and thus can affect other scripts). >>>>> >>>>> The _only_ way to maintain sanity is the existing convention: local >>>>> script variables use lowercase names and exported variables use >>>>> UPPERCASE names. With that in hand, and cooperation from everyone >>>>> else, >>>>> you have predictable and reliable behaviour. And you have a nice >>>>> visual >>>>> distinction in your code because you know immediately (by convention) >>>>> whether a variable is exported or not. >>>>> >>>>> By exporting lowercase variables you violate this convention, and make >>>>> your script environment unsafe for others to use. >>>>> >>>>> Do many many example scripts on the web do the reverse: using >>>>> UPPERCASE >>>>> names for local script variables? Yes they do, and they do a >>>>> disservice >>>>> to everyone. 
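>>>>>
>>>>> To make that concrete, here is a tiny hypothetical pair of scripts
>>>>> (names made up) showing how an inherited lowercase name silently
>>>>> stays exported:
>>>>>
>>>>>     # parent.sh -- exports a lowercase name, then runs a child
>>>>>     b=2
>>>>>     export b
>>>>>     ./child.sh
>>>>>
>>>>>     # child.sh -- contains no "export" at all
>>>>>     b=3                # $b arrived already exported, so this
>>>>>                        # rebinding is exported too
>>>>>     env | grep '^b='   # prints: b=3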
>>>>>
>>>>> Cheers,
>>>>> Cameron Simpson
>>>>> _______________________________________________
>>>>> Python-ideas mailing list
>>>>> Python-ideas at python.org
>>>>> https://mail.python.org/mailman/listinfo/python-ideas
>>>>> Code of Conduct: http://python.org/psf/codeofconduct/
>>>>>
>>>>
>>>>
>>>> --
>>>> Marcos Eliziário Santos
>>>> mobile/whatsapp/telegram: +55(21) 9-8027-0156
>>>> skype: marcos.eliziario at gmail.com
>>>> linked-in : https://www.linkedin.com/in/eliziario/
>>>>
>>>> _______________________________________________
>>>> Python-ideas mailing list
>>>> Python-ideas at python.org
>>>> https://mail.python.org/mailman/listinfo/python-ideas
>>>> Code of Conduct: http://python.org/psf/codeofconduct/
>>>>
>>> _______________________________________________
>>> Python-ideas mailing list
>>> Python-ideas at python.org
>>> https://mail.python.org/mailman/listinfo/python-ideas
>>> Code of Conduct: http://python.org/psf/codeofconduct/
>>>
>>
--
Marcos Eliziário Santos
mobile/whatsapp/telegram: +55(21) 9-8027-0156
skype: marcos.eliziario at gmail.com
linked-in : https://www.linkedin.com/in/eliziario/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From chris.barker at noaa.gov Thu Jan 31 12:51:20 2019
From: chris.barker at noaa.gov (Chris Barker)
Date: Thu, 31 Jan 2019 09:51:20 -0800
Subject: [Python-ideas] Add list.join() please
In-Reply-To:
References: <20190131060858.GN1834@ando.pearwood.info>
Message-ID:

On Wed, Jan 30, 2019 at 10:14 PM Chris Angelico wrote:

> I didn't, but I don't know if Chris Barker did.

nope -- not me either :-)

> (Can't swing a cat without hitting someone named Steve or Chris, in
> some spelling or another!)

good thing there aren't a lot of cats being swung around, then.

One more note about this whole thread:

I do a lot of numerical programming, and used to use MATLAB and now numpy
a lot. So I am very used to "vectorization" -- i.e. having operations that
work on a whole collection of items at once.

Example:

a_numpy_array * 5

multiplies every item in the array by 5

In pure Python, you would do something like:

[i * 5 for i in a_regular_list]

You can imagine that for more complex expressions the "vectorized"
approach can make for much clearer and easier to parse code. Also much
faster, which is what is usually talked about, but I think the readability
is the bigger deal.

So what does this have to do with the topic at hand?

I know that when I'm used to working with numpy and then need to do some
string processing or some such, I find myself missing this
"vectorization" -- if I want to do the same operation on a whole bunch of
strings, why do I need to write a loop or comprehension or map? That is:

[s.lower() for s in a_list_of_strings]

rather than:

a_list_of_strings.lower()

(NOTE: I prefer comprehension syntax to map, but map would work fine
here, too)

It strikes me that that is the direction some folks want to go. If so,
then I think the way to do it is not to add a bunch of stuff to Python's
str or sequence types, but rather to make a new library that provides
quick and easy manipulation of sequences of strings -- kind of a stringpy
-- analogous to numpy.

At the core of numpy is the ndarray: "a multidimensional, homogeneous
array of fixed-size items"

a strarray could be simpler -- I don't see any reason for more than 1-D,
nor more than one datatype. But it could be a "vector" of strings that
was guaranteed to be all strings, and provide operations that acted on
the entire collection in one fell swoop.
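Something along these lines -- a toy sketch only, with invented names,
not any real library -- is what I have in mind:

class StrArray:
    """A toy 1-D "array" of strings with vectorized methods."""

    def __init__(self, items):
        self._items = [str(s) for s in items]

    def __getattr__(self, name):
        # Look the method up on str once; unknown names raise
        # AttributeError, just as they would on a single string.
        method = getattr(str, name)

        def vectorized(*args, **kwargs):
            results = [method(s, *args, **kwargs) for s in self._items]
            # Stay an "array" when the results are strings; otherwise
            # (e.g. .count()) return a plain list.
            if all(isinstance(r, str) for r in results):
                return StrArray(results)
            return results

        return vectorized

    def __iter__(self):
        return iter(self._items)

    def __repr__(self):
        return "StrArray({!r})".format(self._items)

a = StrArray(["Spam", "Eggs"])
print(a.lower())     # StrArray(['spam', 'eggs'])
print(a.count("g"))  # [0, 2]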
If it turned out to be useful, you could even make a version in C or Cython that might give significant performance benefits. I don't have a use case for this -- but if someone does, it's an idea. -CHB Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Thu Jan 31 13:04:23 2019 From: mertz at gnosis.cx (David Mertz) Date: Thu, 31 Jan 2019 13:04:23 -0500 Subject: [Python-ideas] Add list.join() please In-Reply-To: References: <20190131060858.GN1834@ando.pearwood.info> Message-ID: On Thu, Jan 31, 2019 at 12:52 PM Chris Barker via Python-ideas < python-ideas at python.org> wrote: > I know that when I'm used to working with numpy and then need to do some > string processing or some such, I find myself missing this "vectorization" > -- if I want to do the same operation on a whole bunch of strings, why do I > need to write a loop or comprehension or map? > Isn't what you want called "Pandas"? E.g.: >>> type(strs) >>> strs 0 Jan 1 Feb 2 Mar 3 Apr 4 May 5 Jun 6 Jul 7 Aug 8 Sep 9 Oct 10 Nov 11 Dec >>> strs.str.upper() 0 JAN 1 FEB 2 MAR 3 APR 4 MAY 5 JUN 6 JUL 7 AUG 8 SEP 9 OCT 10 NOV 11 DEC >>> strs.str.upper().str.count('A') 0 1 1 0 2 1 3 1 4 1 5 0 6 0 7 1 8 0 9 0 10 0 11 0 >>> strs.str.replace('[aA]','X') 0 JXn 1 Feb 2 MXr 3 Xpr 4 MXy 5 Jun 6 Jul 7 Xug 8 Sep 9 Oct 10 Nov 11 Dec -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Thu Jan 31 13:11:48 2019 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 1 Feb 2019 05:11:48 +1100 Subject: [Python-ideas] Add list.join() please In-Reply-To: References: <20190131060858.GN1834@ando.pearwood.info> Message-ID: On Fri, Feb 1, 2019 at 4:51 AM Chris Barker wrote: > I know that when I'm used to working with numpy and then need to do some string processing or some such, I find myself missing this "vectorization" -- if I want to do the same operation on a whole bunch of strings, why do I need to write a loop or comprehension or map? that is: > > [s.lower() for s in a_list_of_strings] > > rather than: > > a_list_of_strings.lower() > > (NOTE: I prefer comprehension syntax to map, but map would work fine here, too) > > It strikes me that that is the direction some folks want to go. > > If so, then I think the way to do it is not to add a bunch of stuff to Python's str or sequence types, but rather to make a new library that provides quick and easy manipulation of sequences of strings. -- kind of a stringpy -- analogous to numpy. > > At the core of numpy is the ndarray: a "a multidimensional, homogeneous array > of fixed-size items" > > a strarray could be simpler -- I don't see any reason for more than 1-D, nor more than one datatype. But it could be a "vector" of strings that was guaranteed to be all strings, and provide operations that acted on the entire collection in one fell swoop. > Here's a simpler and more general approach: a "vector" type. 
Any time you attempt to look up any attribute, it returns a vector of that attribute for each of its elements. When you call a vector, it calls each element (with the same args) and returns a vector of the results. So the vector would, in effect, have a .lower() method that returns .lower() of all its elements. (David, your mail came in as I was typing mine, so it looks fairly similar, except that this proposed vector type wouldn't require you to put ".str" in the middle of it, so it would work with any type.) ChrisA From mikhailwas at gmail.com Thu Jan 31 13:28:14 2019 From: mikhailwas at gmail.com (Mikhail V) Date: Thu, 31 Jan 2019 21:28:14 +0300 Subject: [Python-ideas] AMEND PEP-8 TO DISCOURAGE ALL CAPS In-Reply-To: References: <20190105034143.GS13616@ando.pearwood.info> <20190130235357.GL1834@ando.pearwood.info> Message-ID: On Thu, Jan 31, 2019 at 4:59 AM Abe Dillon wrote: > [Steven D'Aprano] >> >> It seems strange that you are so worried about >> the microsecond or so extra reading time it takes to recognize an >> all-caps word, based on the "shape of the word" model, yet are prepared >> to pay this enormous cost probably counted in multiple seconds: ... > > > It seems like the fundamental problem you have is trying to find where and when a variable was last bound. > I don't understand why your go-to solution to that problem is in the style-guide of a language. It doesn't seem at all > related to style. It seems like the kind of problem that's completely in the wheel-house of an IDE. Does it not feel to you > like you're relying on a style-kludge to make up for inadequate tools? > > Why perpetuate that? Why not demand more from your tools? BTW if your IDE or editor supports rich text format styling you can maybe tweak capitalized words to use other font or size. I use UDL in Notepad++ a lot and it can do this - e.g. I use this feature with C++ code to change type names to smaller compact font. Though it requires the list with names of course. So that should be possible in some other Scintilla-based editors i think. AND YES all caps are annoying, especially long ones makes it hard to read, it distracts attention all the time and breaks optical line flow. I need to use enums in Pythonscript plugin quite often and thats how it looks: http://npppythonscript.sourceforge.net/docs/latest/enums.html really annoying. good at least that method wrappers are lowercase. Mikhail From steve at pearwood.info Thu Jan 31 18:23:08 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 1 Feb 2019 10:23:08 +1100 Subject: [Python-ideas] Vectorization [was Re: Add list.join() please] In-Reply-To: References: <20190131060858.GN1834@ando.pearwood.info> Message-ID: <20190131232307.GR1834@ando.pearwood.info> On Thu, Jan 31, 2019 at 09:51:20AM -0800, Chris Barker via Python-ideas wrote: > I do a lot of numerical programming, and used to use MATLAB and now numpy a > lot. So I am very used to "vectorization" -- i.e. having operations that > work on a whole collection of items at once. [...] > You can imagine that for more complex expressions the "vectorized" approach > can make for much clearer and easier to parse code. Also much faster, which > is what is usually talked about, but I think the readability is the bigger > deal. 
Julia has special "dot" vectorize operator that looks like this: L .+ 1 # adds 1 to each item in L func.(L) # calls f on each item in L https://julialang.org/blog/2017/01/moredots The beauty of this is that you can apply it to any function or operator and the compiler will automatically vectorize it. The function doesn't have to be written to specifically support vectorization. > So what does this have to do with the topic at hand? > > I know that when I'm used to working with numpy and then need to do some > string processing or some such, I find myself missing this "vectorization" > -- if I want to do the same operation on a whole bunch of strings, why do I > need to write a loop or comprehension or map? that is: > > [s.lower() for s in a_list_of_strings] > > rather than: > > a_list_of_strings.lower() Using Julia syntax, that might become a_list_of_strings..lower(). If you don't like the double dot, perhaps str.lower.(a_list_of_strings) would be less ugly. -- Steven From tjreedy at udel.edu Thu Jan 31 18:43:53 2019 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 31 Jan 2019 18:43:53 -0500 Subject: [Python-ideas] Add list.join() please In-Reply-To: References: <20190131060858.GN1834@ando.pearwood.info> Message-ID: On 1/31/2019 12:51 PM, Chris Barker via Python-ideas wrote: > I do a lot of numerical programming, and used to use MATLAB and now > numpy a lot. So I am very used to "vectorization" -- i.e. having > operations that work on a whole collection of items at once. > > Example: > > a_numpy_array * 5 > > multiplies every item in the array by 5 > > In pure Python, you would do something like: > > [ i * 5 for i in a_regular_list] > > You can imagine that for more complex expressions the "vectorized" > approach can make for much clearer and easier to parse code. Also much > faster, which is what is usually talked about, but I think the > readability is the bigger deal. > > So what does this have to do with the topic at hand? > > I know that when I'm used to working with numpy and then need to do some > string processing or some such, I find myself missing this > "vectorization" -- if I want to do the same operation on a whole bunch > of strings, why do I need to write a loop or comprehension or map? that is: > > [s.lower() for s in a_list_of_strings] > > rather than: > > a_list_of_strings.lower() > > (NOTE: I prefer comprehension syntax to map, but map would work fine > here, too) > > It strikes me that that is the direction some folks want to go. To me, thinking of strings as being in lists is Python 1 thinking. Interactive applications work with *streams* of input strings. > If so, then I think the way to do it is not to add a bunch of stuff to > Python's str or sequence types, but rather to make a new library that > provides quick and easy manipulation of sequences of strings.? -- kind > of a stringpy -- analogous to numpy. > > At the core of numpy is the ndarray: a "a multidimensional, homogeneous > array > of fixed-size items" > > a strarray could be simpler -- I don't see any reason for more than 1-D, > nor more than one datatype. But it could be a "vector" of strings that > was guaranteed to be all strings, and provide operations that acted on > the entire collection in one fell swoop. I think an iterator (stream) of strings would be better. Here is a start. class StringIt: """Iterator of strings. A StringIt wraps an iterator of strings to provide methods that apply the corresponding string method to each string in the iterator. 
    StringIt methods do not enforce the positional-only restrictions of
    some string methods. The join method reverses the order of the
    arguments. Except for join(joiner), which returns a single string,
    the return values are iterators of the return values of the string
    methods. An iterator of strings is returned as a StringIt so that
    further methods can be applied.
    """
    def __init__(self, objects, nogen=False):
        """Return a wrapped iterator of strings.

        Objects must be an iterator of strings or an iterable of
        objects with good __str__ methods. All builtin objects have
        good __str__ methods and all non-buggy user-defined objects
        should.

        When *objects* is an iterator of strings, passing nogen=True
        avoids a layer of wrapping by claiming that str calls are not
        needed. StringIt methods that return a StringIt do this. An
        iterable of strings, such as ['a', 'b', 'c'], can be turned
        into an iterator with iter(iterable). Users who pass
        nogen=True do so at their own risk because checking the claim
        would empty the iterable.
        """
        if not hasattr(objects, '__iter__'):
            raise ValueError('objects is not an iterable')
        if nogen and not hasattr(objects, '__next__'):
            raise ValueError('objects is not an iterator')
        if nogen:
            self.it = objects
        else:
            self.it = (str(ob) for ob in objects)

    def __iter__(self):
        return self.it.__iter__()

    def __next__(self):
        return self.it.__next__()

    def upper(self):
        return StringIt((s.upper() for s in self.it), nogen=True)

    def count(self, sub, start=0, end=None):
        # str.count accepts None for *end*, so pass it through
        # directly; 'end or len(s)' would mishandle end=0.
        return (s.count(sub, start, end) for s in self.it)

    def join(self, joiner):
        return joiner.join(self.it)

for si, out in (
        (StringIt(iter(('a', 'b', 'c')), nogen=True), ['a', 'b', 'c']),
        (StringIt((1, 2, 3)), ['1', '2', '3']),
        (StringIt((1, 2, 3)).count('1'), [1, 0, 0]),
        (StringIt(('a', 'b', 'c')).upper(), ['A', 'B', 'C']),
        ):
    assert list(si) == out
assert StringIt(('a', 'b', 'c')).upper().join('-') == 'A-B-C'
# asserts all pass

--
Terry Jan Reedy

From allemang.d at gmail.com Thu Jan 31 20:24:25 2019
From: allemang.d at gmail.com (David Allemang)
Date: Thu, 31 Jan 2019 20:24:25 -0500
Subject: [Python-ideas] Vectorization [was Re: Add list.join() please]
In-Reply-To: <20190131232307.GR1834@ando.pearwood.info>
References: <20190131060858.GN1834@ando.pearwood.info>
 <20190131232307.GR1834@ando.pearwood.info>
Message-ID:

I accidentally replied only to Steven - sorry! - this is what I said,
with a typo corrected:

> a_list_of_strings..lower()
>
> str.lower.(a_list_of_strings)

I much prefer this solution to any of the other things discussed so far.

I wonder, though, would it be general enough to simply have this new '.'
operator interact with __iter__, or would there have to be new magic
methods like __veccall__, __vecgetattr__, etc? Would a single
__vectorize__ magic method be enough?

For example, I would expect (1, 2, 3) .** 2 to evaluate as a tuple and
[1, 2, 3] .** 2 to evaluate as a list, and some_generator() .** 2 to
still be a generator.
If there were a __vectorize__(self, func) which returned the iterable
result of applying func on each element of self:

class list:
    def __vectorize__(self, func):
        return [func(e) for e in self]

some_list .* other becomes some_list.__vectorize__(lambda e: e * other)
some_string..lower() becomes some_string.__vectorize__(str.lower)
some_list..attr becomes some_list.__vectorize__(operator.attrgetter('attr'))

Perhaps there would be a better name for such a magic method, but I
believe it would allow existing sequences to behave as one might expect,
without requiring each operator to have its own definition. I might also
be over-complicating this, but I'm not sure how else to allow different
sequences to give results of their same type.

On Thu, Jan 31, 2019 at 6:24 PM Steven D'Aprano wrote:

> On Thu, Jan 31, 2019 at 09:51:20AM -0800, Chris Barker via Python-ideas
> wrote:
>
> > I do a lot of numerical programming, and used to use MATLAB and now
> numpy a
> > lot. So I am very used to "vectorization" -- i.e. having operations that
> > work on a whole collection of items at once.
> [...]
> > You can imagine that for more complex expressions the "vectorized"
> approach
> > can make for much clearer and easier to parse code. Also much faster,
> which
> > is what is usually talked about, but I think the readability is the
> bigger
> > deal.
>
> Julia has special "dot" vectorize operator that looks like this:
>
> L .+ 1 # adds 1 to each item in L
>
> func.(L) # calls f on each item in L
>
> https://julialang.org/blog/2017/01/moredots
>
> The beauty of this is that you can apply it to any function or operator
> and the compiler will automatically vectorize it. The function doesn't
> have to be written to specifically support vectorization.
>
> > So what does this have to do with the topic at hand?
> >
> > I know that when I'm used to working with numpy and then need to do some
> > string processing or some such, I find myself missing this
> "vectorization"
> > -- if I want to do the same operation on a whole bunch of strings, why
> do I
> > need to write a loop or comprehension or map? that is:
> >
> > [s.lower() for s in a_list_of_strings]
> >
> > rather than:
> >
> > a_list_of_strings.lower()
>
> Using Julia syntax, that might become a_list_of_strings..lower(). If you
> don't like the double dot, perhaps str.lower.(a_list_of_strings) would
> be less ugly.
>
> --
> Steven
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From robertve92 at gmail.com Thu Jan 31 23:58:51 2019
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Fri, 1 Feb 2019 05:58:51 +0100
Subject: [Python-ideas] Vectorization [was Re: Add list.join() please]
In-Reply-To: <20190131232307.GR1834@ando.pearwood.info>
References: <20190131060858.GN1834@ando.pearwood.info>
 <20190131232307.GR1834@ando.pearwood.info>
Message-ID:

I love moredots! With pip install funcoperators, one can implement
*dotmul* iff dotmul can be implemented as a function:

L *dotmul* 1 would work.

Or even a simple tweak to the library would allow L *dot* s to be
[x*s for x in L] and L /dot/ s to be [x/s for x in L].

I'd implement something like "if left is iterable and right is not,
apply [x*y for x in left]; else if both are iterable, apply
[x*y for x, y in zip(left, right)]; etc."
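A from-scratch sketch of that broadcasting rule (to be clear: this is NOT
the funcoperators implementation, just an illustration of the semantics,
with all names invented):

import operator

class ElementwiseInfix:
    """Wrap a binary function f so that `a *f* b` applies it
    elementwise: `a *f* b` parses as `(a * f) * b`."""

    def __init__(self, func, left=None):
        self.func = func
        self.left = left

    def __rmul__(self, left):
        # `a * f` captures the left operand.
        return ElementwiseInfix(self.func, left)

    def __mul__(self, right):
        # `(a * f) * b` performs the broadcast call. Note that strings
        # count as iterable here; a real implementation would probably
        # special-case them.
        a, b = self.left, right
        a_it = hasattr(a, '__iter__')
        b_it = hasattr(b, '__iter__')
        if a_it and b_it:
            return [self.func(x, y) for x, y in zip(a, b)]
        if a_it:
            return [self.func(x, b) for x in a]
        if b_it:
            return [self.func(a, y) for y in b]
        return self.func(a, b)

dot = ElementwiseInfix(operator.mul)
print([1, 2, 3] *dot* 2)          # [2, 4, 6]
print([1, 2, 3] *dot* [4, 5, 6])  # [4, 10, 18]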
Disclaimer: I'm the creator of funcoperators.

On Fri, 1 Feb 2019, 00:23 Steven D'Aprano wrote:

> On Thu, Jan 31, 2019 at 09:51:20AM -0800, Chris Barker via Python-ideas
> wrote:
>
> > I do a lot of numerical programming, and used to use MATLAB and now
> numpy a
> > lot. So I am very used to "vectorization" -- i.e. having operations that
> > work on a whole collection of items at once.
> [...]
> > You can imagine that for more complex expressions the "vectorized"
> approach
> > can make for much clearer and easier to parse code. Also much faster,
> which
> > is what is usually talked about, but I think the readability is the
> bigger
> > deal.
>
> Julia has special "dot" vectorize operator that looks like this:
>
> L .+ 1 # adds 1 to each item in L
>
> func.(L) # calls f on each item in L
>
> https://julialang.org/blog/2017/01/moredots
>
> The beauty of this is that you can apply it to any function or operator
> and the compiler will automatically vectorize it. The function doesn't
> have to be written to specifically support vectorization.
>
> > So what does this have to do with the topic at hand?
> >
> > I know that when I'm used to working with numpy and then need to do some
> > string processing or some such, I find myself missing this
> "vectorization"
> > -- if I want to do the same operation on a whole bunch of strings, why
> do I
> > need to write a loop or comprehension or map? that is:
> >
> > [s.lower() for s in a_list_of_strings]
> >
> > rather than:
> >
> > a_list_of_strings.lower()
>
> Using Julia syntax, that might become a_list_of_strings..lower(). If you
> don't like the double dot, perhaps str.lower.(a_list_of_strings) would
> be less ugly.
>
> --
> Steven
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: